statistics
This module contains functions to perform spatial statistics calculations.
P_VALUE
Description
This function computes the two-tailed p-value of a given z-score, assuming the population follows a standard normal distribution (mean 0, standard deviation 1). The z-score measures how many standard deviations a value lies above or below the population mean. The p-value is the probability that a randomly sampled point has a value at least as extreme as the point whose z-score is being tested.
z_score: FLOAT
Return type
FLOAT
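The calculation described above can be sketched in a few lines of Python. This is only an illustration of the two-tailed test under a standard normal distribution, not the function's actual implementation; the name `p_value` is reused here for readability.

```python
import math


def p_value(z_score: float) -> float:
    """Two-tailed p-value of a z-score under a standard normal distribution."""
    # P(|Z| >= |z|) = 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    return math.erfc(abs(z_score) / math.sqrt(2.0))


print(p_value(0.0))              # 1.0
print(round(p_value(1.96), 4))   # 0.05
```

Because the test is two-tailed, a z-score and its negative give the same p-value.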
Example
CREATE_SPATIAL_COMPOSITE_SUPERVISED
Description
This procedure derives a spatial composite score as the residuals of a regression model, which is used to detect areas of under- and over-prediction. The response variable should be measurable and correlated with the set of variables defining the score. For each data point, the residual is defined as the observed value minus the predicted value. Rows with a NULL value in any of the individual variables are dropped.
Input parameters
input_query: STRING. The query to the data used to compute the spatial composite. It must contain all the individual variables that should be included in the computation of the composite, as well as a unique geographic id for each row. A qualified table name can be given as well, e.g. <database>.<schema>.<table>.
index_column: STRING. The name of the column with the unique geographic identifier.
output_table: STRING. The prefix for the output table. It should include database and schema, e.g. <database>.<schema>.<output_table>.
options: STRING. A string containing a valid JSON with the different options. Valid options are described below.
  model_options: JSON. A JSON string containing all the settings to be passed to the ML model function. These settings are:
    input_label: STRING. Name of the column to be used as a target to train the model and evaluate the predictions.
    encoder: JSON. The name and parameters of the Snowpark ML class to be used as an encoder, which will be applied to all the categorical features in your input query. It can be NULL or omitted, but the function will return an error if there are categorical columns present. It can contain two different values:
      class: a STRING containing the fully qualified name of the Snowpark ML modeling class to be used in the step.
      options: an optional JSON dictionary containing the keyword arguments to be passed to the class during initialization. Check the Snowpark ML API reference to see which parameters can be passed to the class.
    scaler: JSON. The name and parameters of the Snowpark ML class to be used as a scaler, which will be applied to all the input features (numerical or encoded categories) in your query. It can be NULL or omitted to skip this step altogether. It can contain two different values:
      class: a STRING containing the fully qualified name of the Snowpark ML modeling class to be used in the step.
      options: an optional JSON dictionary containing the keyword arguments to be passed to the class during initialization. Check the Snowpark ML API reference to see which parameters can be passed to the class.
    regressor: JSON. The name and parameters of the Snowpark ML class to be used as a regressor, whose predictions will be used to generate the score index. It can contain two different values:
      class: a STRING containing the fully qualified name of the Snowpark ML modeling class to be used in the step.
      options: an optional JSON dictionary containing the keyword arguments to be passed to the class during initialization. Check the Snowpark ML API reference to see which parameters can be passed to the class.
  bucketize_method: STRING. The method used to discretize the spatial composite score. The default value is NULL, which will return a continuous variable. Possible options are:
    EQUAL_INTERVALS_ZERO_CENTERED: the values of the spatial composite score are discretized into buckets of equal width centered at zero. The lower and upper limits are derived from the outlier-removed maximum of the absolute values of the score.
  nbuckets: INTEGER. The number of buckets used when a bucketization method is specified. The default number of buckets is selected using Freedman and Diaconis’s (1981) rule. Ignored if bucketize_method is not specified.
  remove_outliers: BOOL. When bucketize_method is specified, if remove_outliers is set to true the buckets are derived from the outlier-removed data. The outliers are computed using Tukey’s fences k parameter for outlier detection. The default value is true. Ignored if bucketize_method is not specified.
  r2_thr: FLOAT. The minimum allowed value for the R2 model score. If the R2 of the regression model is lower than this threshold, this implies a poor fit and an error is raised. If it is NULL, a default value of 0.5 is used instead.
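To make the residual-based score and the `r2_thr` check concrete, here is a minimal Python sketch. It assumes a single numeric predictor and an ordinary least squares fit, whereas the procedure itself delegates to the configured Snowpark ML regressor; `residual_scores` is a hypothetical helper name.

```python
from statistics import mean


def residual_scores(x, y, r2_thr=0.5):
    """Fit y ~ a + b*x by ordinary least squares and return (residuals, r2)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    predicted = [intercept + slope * xi for xi in x]
    # Residual = observed value minus predicted value.
    residuals = [yi - pi for yi, pi in zip(y, predicted)]
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    # A poor fit is treated as an error, mirroring the r2_thr option.
    if r2 < r2_thr:
        raise ValueError(f"R2 {r2:.3f} is below the threshold {r2_thr}")
    return residuals, r2


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
residuals, r2 = residual_scores(x, y)
print(residuals, r2)
```

Positive residuals mark rows where the model under-predicts the response, negative residuals mark over-prediction.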
Return type
The results are stored in the table named <output_table>, which contains the following columns:
index_column: the unique geographic identifier. The type of this column depends on the type of index_column in input_query.
spatial_score: the value of the composite score. The type of this column is FLOAT if the score is not discretized and INTEGER otherwise.
When the score is discretized by specifying the bucketize_method parameter, the procedure also returns a lookup table named <output_table>_lookup_table with the following columns:
lower_bound: FLOAT. The lower bound of the bin.
upper_bound: FLOAT. The upper bound of the bin.
spatial_score: INTEGER. The value of the (discretized) composite score.
Example
CREATE_SPATIAL_COMPOSITE_UNSUPERVISED
Description
This procedure combines (spatial) variables into a meaningful composite score. The composite score can be derived using different methods, scaling and aggregation functions and weights. Rows with a NULL value in any of the model predictors are dropped.
Input parameters
input: STRING. The query to the data used to compute the spatial composite. It must contain all the individual variables that should be included in the computation of the composite, as well as a unique geographic id for each row. A qualified table name can be given as well, e.g. <database>.<schema>.<table>.
index_column: STRING. The name of the column with the unique geographic identifier.
output_table: STRING. The name for the output table. It should include database and schema, e.g. <database>.<schema>.<output_table>_SCORE.
options: STRING. A string containing a valid JSON with the different options. Valid options are described below. If options is set to NULL then all options are set to default values, as specified in the table below.
  scoring_method: STRING. Possible options are ENTROPY, CUSTOM_WEIGHTS and FIRST_PC.
    ENTROPY: the spatial composite is derived as the weighted sum of the proportion of the min-max scaled individual variables, where the weights are based on the entropy of the proportion of each variable. Only numerical variables are allowed.
    CUSTOM_WEIGHTS: the spatial composite is computed by first scaling each individual variable and then aggregating them according to user-defined scaling and aggregation methods and individual weights. Depending on the scaling parameter, both numerical and ordinal variables are allowed (categorical and boolean variables need to be transformed to ordinal).
    FIRST_PC: the spatial composite is derived from a Principal Component Analysis as the first principal component score. Only numerical variables are allowed.
  weights: ARRAY. The (optional) weights for each variable used to compute the spatial composite when scoring_method is set to CUSTOM_WEIGHTS. If a different scoring method is selected, then this input parameter is ignored. If specified, the sum of the weights must be lower than 1. If no weights are specified, equal weights are assumed. If weights are specified only for some variables and the sum of weights is less than 1, the remainder is distributed equally between the remaining variables. If weights are specified for all the variables and the sum of weights is less than 1, the remainder is distributed equally between all the variables.
  scaling: STRING. The user-defined scaling when the scoring_method is set to CUSTOM_WEIGHTS. Possible options are:
    MIN_MAX_SCALER: data is rescaled into the range [0,1] based on minimum and maximum values. Only numerical variables are allowed.
    STANDARD_SCALER: data is rescaled by subtracting the mean value and dividing the result by the standard deviation. Only numerical variables are allowed.
    RANKING: data is replaced by its percent rank, that is, by values ranging from 0 (lowest) to 1 (highest). Both numerical and ordinal variables are allowed (categorical and boolean variables need to be transformed to ordinal).
    DISTANCE_TO_TARGET_MIN(_MAX,_AVG): data is rescaled by dividing by the minimum, maximum, or mean of all the values. Only numerical variables are allowed.
    PROPORTION: data is rescaled by dividing by the sum total of all the values. Only numerical variables are allowed.
  aggregation: STRING. The aggregation function used when the scoring_method is set to CUSTOM_WEIGHTS. Possible options are:
    LINEAR: the spatial composite is derived as the weighted sum of the scaled individual variables.
    GEOMETRIC: the spatial composite is given by the product of the scaled individual variables, each to the power of its weight.
  correlation_var: STRING. When scoring_method is set to FIRST_PC, the spatial score will be positively correlated with the selected variable (i.e. the sign of the spatial score is set such that the correlation between the selected variable and the first principal component score is positive).
  correlation_thr: FLOAT. The minimum absolute value of the correlation between each individual variable and the first principal component score when scoring_method is set to FIRST_PC.
  return_range: ARRAY. The user-defined normalization range of the spatial composite score, e.g. [0.0,1.0]. Ignored if bucketize_method is specified.
  bucketize_method: STRING. The method used to discretize the spatial composite score. Possible options are:
    EQUAL_INTERVALS: the values of the spatial composite score are discretized into buckets of equal width.
    QUANTILES: the values of the spatial composite score are discretized into buckets based on quantiles.
    JENKS: the values of the spatial composite score are discretized into buckets obtained using k-means clustering.
  nbuckets: INTEGER. The number of buckets used when a bucketization method is specified. When bucketize_method is set to EQUAL_INTERVALS, if nbuckets is NULL, the default number of buckets is selected using Freedman and Diaconis's (1981) rule. When bucketize_method is set to JENKS or QUANTILES, nbuckets cannot be NULL. When bucketize_method is set to JENKS the maximum value is 100, i.e. the maximum number of clusters allowed by BigQuery with k-means clustering.
  bucketize_random_state: INTEGER. The random state used to run the discretization when bucketize_method is set to JENKS. If a different bucketization method is selected, then this input parameter is ignored. A non-negative value must be specified. It defaults to 42.
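| Option | ENTROPY | CUSTOM_WEIGHTS | FIRST_PC | Valid options | Default value |
|---|---|---|---|---|---|
| scoring_method | Optional | Optional | Optional | ENTROPY, CUSTOM_WEIGHTS, FIRST_PC | ENTROPY |
| weights | Ignored | Optional | Ignored | | NULL |
| scaling | Ignored | Optional | Ignored | MIN_MAX_SCALER, STANDARD_SCALER, RANKING, DISTANCE_TO_TARGET_MIN, DISTANCE_TO_TARGET_MAX, DISTANCE_TO_TARGET_AVG, PROPORTION | MIN_MAX_SCALER |
| aggregation | Ignored | Optional | Ignored | LINEAR, GEOMETRIC | LINEAR |
| correlation_var | Ignored | Ignored | Mandatory | | NULL |
| correlation_thr | Ignored | Ignored | Optional | | NULL |
| return_range | Optional | Optional | Optional | | NULL |
| bucketize_method | Optional | Optional | Optional | EQUAL_INTERVALS, QUANTILES, JENKS | NULL |
| nbuckets | Optional | Optional | Optional | | When bucketize_method is set to EQUAL_INTERVALS, selected using Freedman and Diaconis's (1981) rule |
| bucketize_random_state | Optional | Optional | Optional | | When bucketize_method is set to JENKS, 42 |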
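The CUSTOM_WEIGHTS options described above can be illustrated with a short Python sketch. It covers only MIN_MAX_SCALER scaling with LINEAR or GEOMETRIC aggregation, and it represents an unspecified weight as `None`, which is an assumption of this example and not the procedure's input format. The helper names are hypothetical.

```python
import math


def complete_weights(weights):
    """Fill unspecified (None) weights so that all weights sum to 1."""
    given = [w for w in weights if w is not None]
    remainder = 1.0 - sum(given)
    missing = len(weights) - len(given)
    if missing:
        # The remainder is shared equally between the variables without a weight.
        return [remainder / missing if w is None else w for w in weights]
    # All weights given: the remainder is shared equally between all variables.
    return [w + remainder / len(weights) for w in weights]


def min_max_scale(values):
    """Rescale values into [0, 1] using their minimum and maximum."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def composite(columns, weights, aggregation="LINEAR"):
    """Scale each variable, then aggregate row by row with the given weights."""
    scaled = [min_max_scale(col) for col in columns]
    w = complete_weights(weights)
    rows = list(zip(*scaled))
    if aggregation == "LINEAR":
        return [sum(wi * v for wi, v in zip(w, row)) for row in rows]
    if aggregation == "GEOMETRIC":
        return [math.prod(v ** wi for wi, v in zip(w, row)) for row in rows]
    raise ValueError(f"unknown aggregation: {aggregation}")


columns = [[10.0, 20.0, 30.0], [1.0, 3.0, 2.0]]
linear = composite(columns, [0.6, None], "LINEAR")
geometric = composite(columns, [0.6, None], "GEOMETRIC")
print(linear, geometric)
```

Note that with GEOMETRIC aggregation any variable scaled to 0 drives the whole score for that row to 0, which is one reason to consider the choice of scaling and aggregation together.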
Return type
The results are stored in the table named <output_table>, which contains the following columns:
index_column: the unique geographic identifier. The type of this column depends on the type of index_column in input.
spatial_score: the value of the composite score. The type of this column is FLOAT if the score is not discretized and INTEGER otherwise.
When the score is discretized by specifying the bucketize_method parameter, the procedure also returns a lookup table named <output_table>_LOOKUP_TABLE with the following columns:
lower_bound: FLOAT. The lower bound of the bin.
upper_bound: FLOAT. The upper bound of the bin.
spatial_score: INTEGER. The value of the (discretized) composite score.
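As an illustration of how an EQUAL_INTERVALS lookup table with a Freedman and Diaconis default could be built, here is a hedged Python sketch. The quantile method used for the interquartile range and the choice to label buckets from 1 are assumptions of this example; the procedure may differ on both. It also assumes the interquartile range is not zero.

```python
import math
from statistics import quantiles


def freedman_diaconis_nbuckets(scores):
    """Number of equal-width buckets from the Freedman-Diaconis bin width."""
    q1, _, q3 = quantiles(scores, n=4)
    # Bin width = 2 * IQR / n^(1/3)
    width = 2.0 * (q3 - q1) / len(scores) ** (1.0 / 3.0)
    return max(1, math.ceil((max(scores) - min(scores)) / width))


def equal_interval_lookup(scores, nbuckets):
    """Rows of (lower_bound, upper_bound, spatial_score) for equal-width bins."""
    lo, hi = min(scores), max(scores)
    step = (hi - lo) / nbuckets
    return [(lo + i * step, lo + (i + 1) * step, i + 1) for i in range(nbuckets)]


scores = [float(v) for v in range(1, 101)]
nbuckets = freedman_diaconis_nbuckets(scores)
lookup_table = equal_interval_lookup(scores, nbuckets)
for row in lookup_table:
    print(row)
```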
Examples
With the ENTROPY method:
With the CUSTOM_WEIGHTS method:
With the FIRST_PC method:
CRONBACH_ALPHA_COEFFICIENT
Description
This procedure computes Cronbach’s alpha coefficient for a set of (spatial) variables. This coefficient can be used as a measure of internal consistency or reliability of the data, based on the strength of correlations between individual variables. Cronbach’s alpha normally ranges between 0 and 1, but there is actually no lower limit to the coefficient. A higher alpha (closer to 1) means higher consistency and a lower alpha (closer to 0) means lower consistency, with 0.65 usually taken as the minimum acceptable value of internal consistency. Rows with a NULL value in any of the individual variables are dropped.
Input parameters
input: STRING. The query to the data used to compute the coefficient. It must contain all the individual variables that should be included in the computation of the coefficient. A qualified table name can be given as well, e.g. <database>.<schema>.<table>.
output_table: STRING. The name for the output table. It should include database and schema, e.g. <database>.<schema>.<output_table>.
Return type
The output table with the following columns:
cronbach_alpha_coef: FLOAT. The computed Cronbach's alpha coefficient.
k: INTEGER. The number of individual variables used to compute the composite.
mean_var: FLOAT. The mean variance of all individual variables.
mean_cov: FLOAT. The mean inter-item covariance among all variable pairs.
Example
GWR_GRID
Description
Geographically weighted regression (GWR) models local relationships between spatially varying predictors and an outcome of interest using a local least squares regression.
This procedure performs a local least squares regression for every input cell. This approach was selected to improve computation time and efficiency. The number of models is controlled by the selected cell resolution, so the user can increase or decrease the resolution of the cell index to perform more or fewer regressions. Note that you need to provide the cell ID (spatial index) for every location as input (see the index_column parameter), i.e., the cell type and resolution are not passed explicitly, but rather the index has to be computed beforehand. Hence, if you want to increase or decrease the resolution, you need to precompute the corresponding cell ID of every location (see the H3 or Quadbin module).
In each regression, the data of the locations in each cell and those of the neighboring cells, defined by the kring_distance parameter, will be taken into account. The data of the neighboring cells will be assigned a lower weight the further they are from the origin cell, following the function specified in kernel_function. For example, consider cell i and a kring_distance of 1. With n locations inside cell i, and n_1, n_2, ..., n_k locations in the neighboring cells, the regression of cell i will use a total of n + n_1 + n_2 + ... + n_k points.
input: STRING. The query to the input data. A qualified table name can be given as well: <databaseid>.<schemaid>.<tablename>.
features_columns: ARRAY. Array of column names from input to be used as features in the GWR.
label_column: STRING. Name of the target variable column.
index_column: STRING. Name of the column containing the cell ids.
kring_distance: INT. Distance of the neighboring cells whose data will be included in the local regression of each cell.
kernel_function: STRING. Kernel function to compute the spatial weights across the k-ring. Available functions are: 'uniform', 'triangular', 'quadratic', 'quartic' and 'gaussian'.
fit_intercept: BOOLEAN. Whether to calculate the intercept of the model or to force it to zero if, for example, the input data is already supposed to be centered. If NULL, fit_intercept will be considered as TRUE.
output_table: STRING. Name of the output table. It should be a qualified table name with database and schema: <databaseid>.<schemaid>.<tablename>.
Output
The output table will contain a column named either H3 (STRING) or QUADBIN (BIGINT), depending on the grid type, storing the unique geographic identifier of each grid cell, a column for each feature column containing its corresponding coefficient estimate, and one extra column for the intercept if fit_intercept is set to TRUE.
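The local regression for a single cell can be sketched in Python as a kernel-weighted least squares fit. Everything specific here is an assumption made for illustration: the kernel formulas are common textbook forms, distances are normalized by kring_distance + 1 so the outermost ring keeps a positive weight, and only one feature is used. The procedure's exact kernels and normalization are not documented in this section.

```python
import math

# Common kernel shapes on a normalized distance u in [0, 1) (assumed forms).
KERNELS = {
    "uniform": lambda u: 1.0,
    "triangular": lambda u: 1.0 - abs(u),
    "quadratic": lambda u: 1.0 - u ** 2,
    "quartic": lambda u: (1.0 - u ** 2) ** 2,
    "gaussian": lambda u: math.exp(-0.5 * u ** 2),
}


def weighted_fit(points, kring_distance, kernel_function="gaussian"):
    """Weighted least squares of y on x for one origin cell.

    points: (ring_distance, x, y) tuples for the origin cell (distance 0)
    and the cells in its k-ring. Returns (intercept, slope).
    """
    kernel = KERNELS[kernel_function]
    w = [kernel(d / (kring_distance + 1)) for d, _, _ in points]
    sw = sum(w)
    x_bar = sum(wi * x for wi, (_, x, _) in zip(w, points)) / sw
    y_bar = sum(wi * y for wi, (_, _, y) in zip(w, points)) / sw
    sxx = sum(wi * (x - x_bar) ** 2 for wi, (_, x, _) in zip(w, points))
    sxy = sum(wi * (x - x_bar) * (y - y_bar) for wi, (_, x, y) in zip(w, points))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Two locations in the origin cell and two in the first ring, all on y = 2 + 3x.
points = [(0, 1.0, 5.0), (0, 2.0, 8.0), (1, 3.0, 11.0), (1, 4.0, 14.0)]
intercept, slope = weighted_fit(points, kring_distance=1, kernel_function="triangular")
print(intercept, slope)
```

With noisy data, locations in the origin cell pull the fitted line more strongly than those in outer rings, which is what makes the coefficients local.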
Examples
GETIS_ORD_H3
Description
This procedure computes the Getis-Ord Gi* statistic for each row in the input table.
input: STRING. The query to the data used to compute the coefficient. A qualified table name can be given as well: <databaseid>.<schemaid>.<tablename>.
output_table: STRING. Qualified name of the output table: <databaseid>.<schemaid>.<tablename>.
index_column: STRING. Name of the column with the H3 indexes.
value_column: STRING. Name of the column with the values for each H3 cell.
size: INT. Size of the H3 k-ring (distance from the origin). This defines the area around each index cell that will be taken into account to compute its Gi* statistic.
kernel: STRING. Kernel function to compute the spatial weights across the k-ring. Available functions are: uniform, triangular, quadratic, quartic and gaussian.
Output
The results are stored in the table named <output_table>, which contains the following columns:
h3: STRING. The H3 index.
gi: FLOAT. Computed Gi* value.
p_value: FLOAT. Computed p-value.
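For readers who want to see how gi and p_value relate, the following Python sketch applies the standard Getis-Ord Gi* formula and a two-tailed normal p-value. It is an illustration under assumptions, not the procedure's code: cells are placed on a one-dimensional line instead of an H3 grid, the kernel is uniform, and the neighborhood of each cell includes the cell itself.

```python
import math


def getis_ord_gi_star(values, neighbors):
    """values: {cell: value}; neighbors: {cell: {neighbor: weight}}, self included.

    Returns {cell: (gi, p_value)} using the standard Gi* formula.
    """
    n = len(values)
    x_bar = sum(values.values()) / n
    s = math.sqrt(sum(v ** 2 for v in values.values()) / n - x_bar ** 2)
    result = {}
    for cell, weights in neighbors.items():
        sw = sum(weights.values())
        sw2 = sum(w ** 2 for w in weights.values())
        numerator = sum(w * values[j] for j, w in weights.items()) - x_bar * sw
        denominator = s * math.sqrt((n * sw2 - sw ** 2) / (n - 1))
        gi = numerator / denominator
        # Two-tailed p-value under a standard normal approximation.
        result[cell] = (gi, math.erfc(abs(gi) / math.sqrt(2.0)))
    return result


cells = [1.0, 1.0, 1.0, 1.0, 9.0, 10.0, 9.0]
values = dict(enumerate(cells))
size = 1  # neighborhood radius, playing the role of the k-ring size
neighbors = {
    i: {j: 1.0 for j in range(max(0, i - size), min(len(cells), i + size + 1))}
    for i in range(len(cells))
}
stats = getis_ord_gi_star(values, neighbors)
for cell, (gi, p) in stats.items():
    print(cell, round(gi, 3), round(p, 4))
```

Large positive gi values indicate hot spots (high values surrounded by high values) and large negative values indicate cold spots.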
Example
GETIS_ORD_QUADBIN
Description
This procedure computes the Getis-Ord Gi* statistic for each row in the input table.
input: STRING. The query to the data used to compute the coefficient. A qualified table name can be given as well: <databaseid>.<schemaid>.<tablename>.
output_table: STRING. Qualified name of the output table: <databaseid>.<schemaid>.<tablename>.
index_column: STRING. Name of the column with the Quadbin indexes.
value_column: STRING. Name of the column with the values for each Quadbin cell.
size: INT. Size of the Quadbin k-ring (distance from the origin). This defines the area around each index cell that will be taken into account to compute its Gi* statistic.
kernel: STRING. Kernel function to compute the spatial weights across the k-ring. Available functions are: uniform, triangular, quadratic, quartic and gaussian.
Output
The results are stored in the table named <output_table>, which contains the following columns:
quadbin: BIGINT. The Quadbin index.
gi: FLOAT. Computed Gi* value.
p_value: FLOAT. Computed p-value.
Example
GETIS_ORD_SPACETIME_H3
Description
This procedure computes the spatio-temporal Getis-Ord Gi* statistic for each H3 index and each datetime timestamp according to the method described in this paper. It extends the Getis-Ord Gi* function by including the time domain. The Getis-Ord Gi* statistic is a measure of spatial autocorrelation, that is, the degree to which data values are clustered together in space and time. The statistic is computed as the sum of the values of the cells in the k-ring (distance from the origin, in space and time) weighted by the kernel functions, minus the value of the origin cell, divided by the standard deviation of the values of the cells in the k-ring. The statistic is calculated from the minimum to the maximum datetime, with the step defined by the user through time_freq. The datetime timestamp is truncated to the provided level, for example day, hour or week. For each spatial index, the missing datetime timestamps between the minimum and the maximum are filled with the default value of 0. Any other imputation of the values should take place before passing the input to the procedure. The p-value is computed as the probability of observing a value at least as extreme as the observed one under the null hypothesis that the values are randomly distributed in space and time, using a normal distribution approximation.
input: STRING. The query to the data used to compute the coefficient. A qualified table name can be given as well: <databaseid>.<schemaid>.<tablename>.
output_table: STRING. Qualified name of the output table: <databaseid>.<schemaid>.<tablename>.
index_column: STRING. Name of the column with the H3 indexes.
date_column: STRING. Name of the column with the date.
value_column: STRING. Name of the column with the values for each H3 cell.
size: INTEGER. Size of the H3 k-ring (distance from the origin). This defines the area around each index cell that will be taken into account to compute its Gi* statistic.
time_freq: STRING. The time interval (step) to use for the time series. Available values are: year, quarter, month, week, day, hour, minute, second. It is the equivalent of the spatial index in the time domain.
time_bw: INTEGER. The bandwidth to use for the time series. This defines the number of adjacent observations in the time domain to be considered. It is the equivalent of the H3 k-ring in the time domain.
kernel: STRING. Kernel function to compute the spatial weights across the k-ring. Available functions are: uniform, triangular, quadratic, quartic and gaussian.
kernel_time: STRING. Kernel function to compute the temporal weights within the time bandwidth. Available functions are: uniform, triangular, quadratic, quartic and gaussian.
Output
The results are stored in the table named <output_table>, which contains the following columns:
h3: STRING. The H3 index.
date: DATETIME. The datetime timestamp.
gi: FLOAT. Computed Gi* value.
p_value: FLOAT. Computed p-value.
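The preprocessing described for the time domain (truncating timestamps to the chosen level and filling missing steps with 0) can be sketched in Python as follows. The example assumes a time_freq of day, sums values that fall on the same day, and takes the minimum and maximum from the series passed in; how the procedure aggregates duplicates is not stated in this section, so that part is an assumption. `fill_daily_series` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def fill_daily_series(observations):
    """observations: (timestamp, value) pairs for one spatial index.

    Truncates timestamps to the day, sums values per day, and fills every
    missing day between the first and last day with 0.
    """
    daily = {}
    for ts, value in observations:
        day = ts.replace(hour=0, minute=0, second=0, microsecond=0)
        daily[day] = daily.get(day, 0.0) + value
    day, last = min(daily), max(daily)
    series = []
    while day <= last:
        series.append((day, daily.get(day, 0.0)))
        day += timedelta(days=1)
    return series


observations = [
    (datetime(2024, 1, 1, 10, 30), 2.0),
    (datetime(2024, 1, 1, 15, 0), 1.0),
    (datetime(2024, 1, 4, 8, 0), 5.0),
]
series = fill_daily_series(observations)
for day, value in series:
    print(day.date(), value)
```

If zeros are not an appropriate stand-in for missing observations in your data, impute the series yourself before calling the procedure, as the description recommends.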
Example
GETIS_ORD_SPACETIME_QUADBIN
Description
This procedure computes the spatio-temporal Getis-Ord Gi* statistic for each Quadbin index and each datetime timestamp according to the method described in this paper. It extends the Getis-Ord Gi* function by including the time domain. The Getis-Ord Gi* statistic is a measure of spatial autocorrelation, that is, the degree to which data values are clustered together in space and time. The statistic is computed as the sum of the values of the cells in the k-ring (distance from the origin, in space and time) weighted by the kernel functions, minus the value of the origin cell, divided by the standard deviation of the values of the cells in the k-ring. The statistic is calculated from the minimum to the maximum datetime, with the step defined by the user through time_freq. The datetime timestamp is truncated to the provided level, for example day, hour or week. For each spatial index, the missing datetime timestamps between the minimum and the maximum are filled with the default value of 0. Any other imputation of the values should take place before passing the input to the procedure. The p-value is computed as the probability of observing a value at least as extreme as the observed one under the null hypothesis that the values are randomly distributed in space and time, using a normal distribution approximation.
input: STRING. The query to the data used to compute the coefficient. A qualified table name can be given as well: <databaseid>.<schemaid>.<tablename>.
output_table: STRING. Qualified name of the output table: <databaseid>.<schemaid>.<tablename>.
index_column: STRING. Name of the column with the Quadbin indexes.
date_column: STRING. Name of the column with the date.
value_column: STRING. Name of the column with the values for each Quadbin cell.
size: INTEGER. Size of the Quadbin k-ring (distance from the origin). This defines the area around each index cell that will be taken into account to compute its Gi* statistic.
time_freq: STRING. The time interval (step) to use for the time series. Available values are: year, quarter, month, week, day, hour, minute, second. It is the equivalent of the spatial index in the time domain.
time_bw: INTEGER. The bandwidth to use for the time series. This defines the number of adjacent observations in the time domain to be considered. It is the equivalent of the Quadbin k-ring in the time domain.
kernel: STRING. Kernel function to compute the spatial weights across the k-ring. Available functions are: uniform, triangular, quadratic, quartic and gaussian.
kernel_time: STRING. Kernel function to compute the temporal weights within the time bandwidth. Available functions are: uniform, triangular, quadratic, quartic and gaussian.
Output
The results are stored in the table named <output_table>, which contains the following columns:
quadbin: BIGINT. The Quadbin index.
date: DATETIME. The datetime timestamp.
gi: FLOAT. Computed Gi* value.
p_value: FLOAT. Computed p-value.
Example
MORANS_I_H3
Description
This procedure computes the Moran's I spatial autocorrelation from the input table with H3 indexes.
input: STRING. The query to the data used to compute the coefficient. A qualified table name can be given as well: <databaseid>.<schemaid>.<tablename>.
output_table: STRING. Qualified name of the output table: <databaseid>.<schemaid>.<tablename>.
index_column: STRING. Name of the column with the H3 indexes.
value_column: STRING. Name of the column with the values for each H3 cell.
size: INT. Size of the H3 k-ring (distance from the origin). This defines the area around each index cell where the distance decay will be applied. If no neighboring cells are found, the weight of the corresponding index cell is set to zero.
decay: STRING. Decay function to compute the distance decay. Available functions are: uniform, inverse, inverse_square and exponential.
Output
The results are stored in the table named <output_table>, which contains the following column:
morans_i: FLOAT. Moran's I spatial autocorrelation.
If all cells have no neighbours, then the procedure will fail.
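To show how the size and decay parameters enter the statistic, here is a Python sketch of global Moran's I. It is a simplified illustration: cells lie on a one-dimensional line instead of an H3 grid, and the decay formulas are common definitions assumed for this example, since the section does not spell them out.

```python
import math

# Assumed decay definitions as a function of the grid distance d >= 1.
DECAYS = {
    "uniform": lambda d: 1.0,
    "inverse": lambda d: 1.0 / d,
    "inverse_square": lambda d: 1.0 / d ** 2,
    "exponential": lambda d: math.exp(-d),
}


def morans_i(values, size, decay="uniform"):
    """Global Moran's I for cells on a line; neighbors lie within `size` steps."""
    n = len(values)
    x_bar = sum(values) / n
    z = [v - x_bar for v in values]
    weight = DECAYS[decay]
    numerator = 0.0
    weight_sum = 0.0
    for i in range(n):
        for j in range(max(0, i - size), min(n, i + size + 1)):
            if i != j:
                w = weight(abs(i - j))
                numerator += w * z[i] * z[j]
                weight_sum += w
    # I = (n / sum of weights) * sum_ij w_ij z_i z_j / sum_i z_i^2
    return (n / weight_sum) * numerator / sum(zi ** 2 for zi in z)


clustered = morans_i([1.0, 1.0, 1.0, 9.0, 9.0, 9.0], size=1)
alternating = morans_i([1.0, 9.0, 1.0, 9.0, 1.0, 9.0], size=1)
print(clustered, alternating)
```

Clustered values give a positive result and alternating values a negative one. When no cell has any neighbor the sum of weights is zero and the division fails, which matches the failure case noted above.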
Example
MORANS_I_QUADBIN
Description
This procedure computes the Moran's I spatial autocorrelation from the input table with Quadbin indexes.
input: STRING. The query to the data used to compute the coefficient. A qualified table name can be given as well: <databaseid>.<schemaid>.<tablename>.
output_table: STRING. Qualified name of the output table: <databaseid>.<schemaid>.<tablename>.
index_column: STRING. Name of the column with the Quadbin indexes.
value_column: STRING. Name of the column with the values for each Quadbin cell.
size: INT. Size of the Quadbin k-ring (distance from the origin). This defines the area around each index cell where the distance decay will be applied. If no neighboring cells are found, the weight of the corresponding index cell is set to zero.
decay: STRING. Decay function to compute the distance decay. Available functions are: uniform, inverse, inverse_square and exponential.
Output
The results are stored in the table named <output_table>, which contains the following column:
morans_i: FLOAT. Moran's I spatial autocorrelation.
If all cells have no neighbours, then the procedure will fail.
Example
LOCAL_MORANS_I_H3
Description
This procedure computes the local Moran's I spatial autocorrelation from the input table with H3 indexes. It outputs the H3 index, the local Moran's I spatial autocorrelation value, the simulated p-value psim, the conditional randomization null expectation EIc and variance VIc, the total randomization null expectation EI and variance VI, and the quad (HH=1, LL=2, LH=3, HL=4).
input: STRING. The query to the data used to compute the coefficient. A qualified table name can be given as well: <databaseid>.<schemaid>.<tablename>.
output_table: STRING. Qualified name of the output table: <databaseid>.<schemaid>.<tablename>.
index_column: STRING. Name of the column with the H3 indexes.
value_column: STRING. Name of the column with the values for each H3 cell.
size: INTEGER. Size of the H3 k-ring (distance from the origin). This defines the area around each index cell where the distance decay will be applied. If no neighboring cells are found, the weight of the corresponding index cell is set to zero.
decay: STRING. Decay function to compute the distance decay. Available functions are: uniform, inverse, inverse_square and exponential.
permutations: INTEGER. Number of permutations for the estimation of the p-value.
Output
The results are stored in the table named <output_table>, which contains the following columns:
h3: STRING. The H3 index.
value: FLOAT. Local Moran's I spatial autocorrelation.
psim: FLOAT. Simulated p-value.
EIc: FLOAT. Conditional randomization null expectation.
VIc: FLOAT. Conditional randomization null variance.
EI: FLOAT. Total randomization null expectation.
VI: FLOAT. Total randomization null variance.
quad: INTEGER. HH=1, LL=2, LH=3, HL=4.
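The value, psim and quad columns can be illustrated with a compact Python sketch. Several choices here are assumptions made only for the example: cells sit on a one-dimensional line with uniform weights on adjacent cells, the spatial lag is row-standardized, ties at zero count as "high", and the simulated p-value compares absolute values under conditional permutations. The analytical columns EIc, VIc, EI and VI are not reproduced.

```python
import random
from statistics import mean, pstdev


def local_morans_i(values, permutations=99, seed=42):
    """Local Moran's I per cell with a permutation-based pseudo p-value."""
    n = len(values)
    mu, sd = mean(values), pstdev(values)
    z = [(v - mu) / sd for v in values]
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        nbrs = [j for j in (i - 1, i + 1) if 0 <= j < n]
        lag = mean(z[j] for j in nbrs)  # row-standardized spatial lag
        value = z[i] * lag
        # Quadrant encoding from the documentation: HH=1, LL=2, LH=3, HL=4.
        if z[i] >= 0:
            quad = 1 if lag >= 0 else 4
        else:
            quad = 3 if lag >= 0 else 2
        # Conditional permutation: keep cell i fixed, resample its neighbors.
        others = z[:i] + z[i + 1:]
        extreme = 0
        for _ in range(permutations):
            simulated = z[i] * mean(rng.sample(others, len(nbrs)))
            if abs(simulated) >= abs(value):
                extreme += 1
        psim = (extreme + 1) / (permutations + 1)
        rows.append({"value": value, "psim": psim, "quad": quad})
    return rows


rows = local_morans_i([1.0, 1.0, 1.0, 1.0, 9.0, 9.0, 9.0, 9.0])
for row in rows:
    print(row)
```

More permutations give a finer-grained psim, since the smallest attainable value is 1 / (permutations + 1).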
Example
LOCAL_MORANS_I_QUADBIN
Description
This procedure computes the local Moran's I spatial autocorrelation from the input table with Quadbin indexes. It outputs the Quadbin index, the local Moran's I spatial autocorrelation value, the simulated p-value psim, the conditional randomization null expectation EIc and variance VIc, the total randomization null expectation EI and variance VI, and the quad (HH=1, LL=2, LH=3, HL=4).
input: STRING. The query to the data used to compute the coefficient. A qualified table name can be given as well: <databaseid>.<schemaid>.<tablename>.
output_table: STRING. Qualified name of the output table: <databaseid>.<schemaid>.<tablename>.
index_column: STRING. Name of the column with the Quadbin indexes.
value_column: STRING. Name of the column with the values for each Quadbin cell.
size: INTEGER. Size of the Quadbin k-ring (distance from the origin). This defines the area around each index cell where the distance decay will be applied. If no neighboring cells are found, the weight of the corresponding index cell is set to zero.
decay: STRING. Decay function to compute the distance decay. Available functions are: uniform, inverse, inverse_square and exponential.
permutations: INTEGER. Number of permutations for the estimation of the p-value.
Output
The results are stored in the table named <output_table>, which contains the following columns:
quadbin: BIGINT. The Quadbin index.
value: FLOAT. Local Moran's I spatial autocorrelation.
psim: FLOAT. Simulated p-value.
EIc: FLOAT. Conditional randomization null expectation.
VIc: FLOAT. Conditional randomization null variance.
EI: FLOAT. Total randomization null expectation.
VI: FLOAT. Total randomization null variance.
quad: INTEGER. HH=1, LL=2, LH=3, HL=4.
Example