# retail

This module contains procedures to perform analysis to solve specific retail analytics use cases, such as revenue prediction.

## COMMERCIAL\_HOTSPOTS <a href="#commercial_hotspots" id="commercial_hotspots"></a>

```sql
COMMERCIAL_HOTSPOTS(input, index_column, variable_columns, output, options)
```

**Description**

This procedure is used to locate hotspot areas by calculating a combined [Getis-Ord Gi\*](https://docs.carto.com/data-and-analysis/analytics-toolbox-for-snowflake/statistics#getis_ord_h3) statistic over several variables. The input data should be in either an H3 or Quadbin grid. The individual Gi\* statistics are combined using [Stouffer's Z-score method](https://en.wikipedia.org/wiki/Fisher's_method), which also allows to introduce individual weights, with the combined statistics following a standard normal distribution. The hotspots are identified as those cells with a positive combined the Gi\* statistics which is significant at the specified significance level, i.e. whose p-value is below the p-value threshold (`pvalue_thresh`) set by the user.

**Input parameters**

* `input`: `STRING` the query to the input data. A qualified table name can be given as well, e.g. `<my-database>.<my-schema>.<my-table>`.
* `index_column`: `STRING` name of the column containing the H3 or Quadbin indexes.
* `variable_columns`: `ARRAY` names of the columns containing the variables to take into account when computing the combined Gi\* statistic.
* `output`: `STRING` name of the output table, e.g. `<my-database>.<my-schema>.<my-output-table>`.
* `options`: `STRING` containing a valid JSON with the different options. Valid options are described in the table below.

  | Option             | Description                                                                                                                                                                                                                                                       |
  | ------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
  | `variable_weights` | `ARRAY` An array containing the weights associated with each of the variables. These weights can take any value but will be normalized to sum up to 1. If NULL, uniform weights will be considered.                                                               |
  | `kring`            | `BIGINT` The size of the *k-ring* (distance from the origin). This defines the area around each cell that will be taken into account to compute its Gi\* statistic. The default value is 1 (i.e. only the first-order neighbouring cells are taken into account). |
  | `kernel`           | `STRING` The kernel type used to smooth the individual Gi\* statistic over the area around each cell defined by the `kring` parameter. If not specified, a 'uniform' kernel will be used by default.                                                              |
  | `pvalue_thresh`    | `FLOAT` Threshold for the Gi\* value significance, ranging from 0 (most significant) to 1 (least significant). It defaults to 0.05. Cells with a p-value above this threshold won't be returned.                                                                  |

**Output**

The output will contain the following columns:

* a column named either `H3` (`STRING`) OR `QUADBIN` (`BIGINT`) depending on the grid type storing the unique geographic identifier of each grid cell.
* `combined_gi`: `FLOAT64` with the resulting combined Gi\*.
* `p_value`: `FLOAT64` with the p-value associated with the combined Gi\* statistic.

**Examples**

```sql
CALL CARTO.CARTO.COMMERCIAL_HOTSPOTS(
    '<my-database>.<my-schema>.<my-table>',
    'H3',
    ARRAY_CONSTRUCT('feature_1','feature_2'),
    '<my-database>.<my-schema>.<my-output-table>',
    NULL
);
-- The table `<my-database>.<my-schema>.<my-output-table>` will be created
-- with columns: h3, combined_gi, p_value
```

```sql
CALL CARTO.CARTO.COMMERCIAL_HOTSPOTS(
    '<my-database>.<my-schema>.<my-table>',
    'H3',
    ARRAY_CONSTRUCT('feature_1','feature_2','feature_3'),
    '<my-database>.<my-schema>.<my-output-table>',
    '{
        "variable_weights":[0.4,0.3,0.3],
        "kring":1,
        "kernel":"gaussian",
        "pvalue_thresh":0.01
    }'
);
-- The table `<my-database>.<my-schema>.<my-output-table>` will be created
-- with columns: h3, combined_gi, p_value
```
