# data

This module contains functions and procedures that make use of data (user-provided data) for their computations.

## ENRICH\_GRID <a href="#enrich_grid" id="enrich_grid"></a>

```sql
ENRICH_GRID(grid_type, input_query, input_index_column, data_query, data_geography_column, variables, output)
```

**Description**

This procedure enriches a set of grid cells of one of the supported types (quadbin) with data from another enrichment query. The cells are identified by their indices.

As a result of this process, each input grid cell will be enriched with the data of the enrichment query that spatially intersects it. When the input cell intersects with more than one feature of the enrichment query, the data is aggregated using the aggregation methods specified.

Valid aggregation methods are:

* `SUM`: It assumes the aggregated variable is an [*extensive property*](https://en.wikipedia.org/wiki/Intensive_and_extensive_properties) (e.g. population). Accordingly, the value corresponding to the enrichment feature intersected is weighted by the fraction of area or length intersected. If the enrichment features are points, then a simple sum is performed.
* `MIN`: It assumes the aggregated variable is an [*intensive property*](https://en.wikipedia.org/wiki/Intensive_and_extensive_properties) (e.g. temperature, population density). Thus, the value is not altered by the intersected area/length as it's the case for `SUM`.
* `MAX`: It assumes the aggregated variable is an [*intensive property*](https://en.wikipedia.org/wiki/Intensive_and_extensive_properties) (e.g. temperature, population density). Thus, the value is not altered by the intersected area/length as it's the case for `SUM`.
* `AVG`: It assumes the aggregated variable is an [*intensive property*](https://en.wikipedia.org/wiki/Intensive_and_extensive_properties) (e.g. temperature, population density). Thus, the value is not altered by the intersected area/length as it's the case for `SUM`. However, a [weighted average](https://en.wikipedia.org/wiki/Weighted_arithmetic_mean) is computed, using the intersection areas or lengths as the weight. When the enrichment features are points, a simple average is computed.
* `COUNT` It computes the number of enrichment features that contain the enrichment variable and are intersected by the input geography.

**Input parameters**

* `grid_type`: Type of grid: "quadbin".
* `input_query`: `VARCHAR` query to be enriched; this query must produce valid grid indices for the selected grid type in a column of the proper type (INT for quadbin). It can include additional columns with data associated with the grid cells that will be preserved. A qualified table name can be given as well, e.g. `'<my-schema>.<my-table>'`.
* `input_index_column`: `VARCHAR` name of a column in the query that contains the grid indices.
* `data_query`: `VARCHAR` query that contains both a geography column and the columns with the data that will be used to enrich the polygons provided in the input query.
* `data_geography_column`: `VARCHAR` name of the GEOGRAPHY column provided in the `data_query`.
* `variables`: `VARCHAR` a JSON array of pairs. The column that will be used to enrich the input polygons and their corresponding aggregation method. e.g. `[["var1","sum"],["var2","count"]]`.
* `output`: `VARCHAR` containing the name of an output table to store the results e.g. `'<schema>.<table>'`. The resulting table cannot exist before running the procedure.

**Output**

The output table will contain all the input columns provided in the `input_query` and one extra column for each variable in `variables`, named after its corresponding enrichment column and including a suffix indicating the aggregation method used.

**Examples**

{% code overflow="wrap" lineNumbers="true" %}

```sql
CALL carto.ENRICH_GRID(
   'quadbin',
   'SELECT 5256779493799886847 AS index
   UNION ALL SELECT 5256779493812469759
   UNION ALL SELECT 5256779493816664063
   UNION ALL SELECT 5256779493808275455
   UNION ALL SELECT 5256779493820858367
   UNION ALL SELECT 5256779493825052671
   UNION ALL SELECT 5256779493900550143
   UNION ALL SELECT 5256779493913133055
   UNION ALL SELECT 5256779493917327359',
   'index',
   'SELECT geom, var1, var2 FROM <my-schema>.<my-table>',
   'geom',
   '[["var1", "sum"],["var2", "sum"],["var2", "max"]]',
   '<my-schema>.<my-enriched-table>'
);
-- The table `<my-schema>.<my-enriched-table>` will be created
-- with columns: index, var1_sum, var2_sum, var2_max
```

{% endcode %}

{% code overflow="wrap" lineNumbers="true" %}

```sql
CALL carto.ENRICH_GRID(
   'quadbin',
   '<my-schema>.<my-table>',
   'index',
   'SELECT geom, var1, var2 FROM <my-schema>.<my-table>',
   'geom',
   '[["var1", "sum"],["var2", "sum"],["var2", "max"]]',
   '<my-schema>.<my-table>'
);
-- The table `<my-schema>.<my-table>` will be augmented
-- with columns: var1_sum, var2_sum, var2_max
```

{% endcode %}

## ENRICH\_POINTS <a href="#enrich_points" id="enrich_points"></a>

```sql
ENRICH_POINTS(input_query, input_geography_column, data_query, data_geography_column, variables, output)
```

**Description**

This procedure enriches a query containing geographic points with data from another query, spatially matching both and aggregating the result.

As a result of this process, each input point will be enriched with the data from the enrichment query that spatially intersects it. When an input point intersects with more than one enrichment polygon, point, or line, the data is aggregated using the aggregation methods specified.

Valid aggregation methods are: `SUM`, `MIN`, `MAX`, `AVG`, and `COUNT`.

**Input parameters**

* `input_query`: `VARCHAR` query to be enriched. A qualified table name can be given as well, e.g. `'<my-schema>.<my-table>'`.
* `input_geography_column`: `VARCHAR` name of the GEOGRAPHY column in the query containing the points to be enriched.
* `data_query`: `VARCHAR` query that contains both a geography column and the columns with the data that will be used to enrich the points provided in the input query.
* `data_geography_column`: `VARCHAR` name of the GEOGRAPHY column provided in the `data_query`.
* `variables`: `VARCHAR` a JSON array of pairs. The column that will be used to enrich the input points and their corresponding aggregation method. e.g. `[["var1","sum"],["var2","count"]]`.
* `output`: `VARCHAR` containing the name of an output table to store the results e.g. `'<schema>.<table>'`. The resulting table cannot exist before running the procedure.

**Output**

The output table will contain all the input columns provided in the `input_query` and one extra column for each variable in `variables`, named after its corresponding enrichment column and including a suffix indicating the aggregation method used.

**Examples**

{% code overflow="wrap" lineNumbers="true" %}

```sql
CALL carto.ENRICH_POINTS(
   'SELECT id, geom FROM <my-schema>.<my-table>',
   'geom',
   'SELECT geom, var1, var2 FROM <my-schema>.<my-table>',
   'geom',
   '[["var1", "sum"],["var2", "count"]]',
   '<my-schema>.<my-enriched-table>'
);
-- The table '<my-schema>.<my-enriched-table>' will be created
-- with columns: id, geom, var1_sum, var2_count
```

{% endcode %}

{% code overflow="wrap" lineNumbers="true" %}

```sql
CALL carto.ENRICH_POINTS(
   '<my-schema>.<my-table>',
   'geom',
   'SELECT geom, var1, var2 FROM <my-schema>.<my-table>',
   'geom',
   '[["var1", "sum"],["var2", "count"]]',
   '<my-schema>.<my-enriched-table>'
);
-- The table '<my-schema>.<my-enriched-table>' will be created
-- with columns: id, geom, var1_sum, var2_count
```

{% endcode %}

## ENRICH\_POLYGONS <a href="#enrich_polygons" id="enrich_polygons"></a>

```sql
ENRICH_POLYGONS(input_query, input_geography_column, data_query, data_geography_column, variables, output)
```

**Description**

This procedure enriches a query containing geographic polygons with data from another query, spatially matching both and aggregating the result.

As a result of this process, each input polygons will be enriched with the data from the enrichment query that spatially intersects it. When the input polygons intersects with more than one enrichment polygon, point, or line, the data is aggregated using the aggregation methods specified.

Valid aggregation methods are:

* `SUM`: It assumes the aggregated variable is an [*extensive property*](https://en.wikipedia.org/wiki/Intensive_and_extensive_properties) (e.g. population). Accordingly, the value corresponding to the enrichment feature intersected is weighted by the fraction of area or length intersected. If the enrichment features are points, then a simple sum is performed.
* `MIN`: It assumes the aggregated variable is an [*intensive property*](https://en.wikipedia.org/wiki/Intensive_and_extensive_properties) (e.g. temperature, population density). Thus, the value is not altered by the intersected area/length as it's the case for `SUM`.
* `MAX`: It assumes the aggregated variable is an [*intensive property*](https://en.wikipedia.org/wiki/Intensive_and_extensive_properties) (e.g. temperature, population density). Thus, the value is not altered by the intersected area/length as it's the case for `SUM`.
* `AVG`: It assumes the aggregated variable is an [*intensive property*](https://en.wikipedia.org/wiki/Intensive_and_extensive_properties) (e.g. temperature, population density). Thus, the value is not altered by the intersected area/length as it's the case for `SUM`. However, a [weighted average](https://en.wikipedia.org/wiki/Weighted_arithmetic_mean) is computed, using the intersection areas or lengths as the weight. When the enrichment features are points, a simple average is computed.
* `COUNT` It computes the number of enrichment features that contain the enrichment variable and are intersected by the input geography.

**Input parameters**

* `input_query`: `STRING` query to be enriched. A qualified table name can be given as well, e.g. `'<my-schema>.<my-table>'`.
* `input_geography_column`: `STRING` name of the GEOMETRY/GEOGRAPHY column in the query containing the points to be enriched. The input cannot be a GeometryCollection.
* `data_query`: `STRING` query that contains both a geography column and the columns with the data that will be used to enrich the points provided in the input query.
* `data_geography_column`: `STRING` name of the GEOMETRY/GEOGRAPHY column provided in the `data_query`.
* `variables`: `STRING` a JSON with a property `variables` containing an array of pairs. The column that will be used to enrich the input polygons and their corresponding aggregation method. e.g. `[["var1","sum"],["var2","count"]]`.
* `output`: `STRING` containing the name of an output table to store the results e.g. `'<schema>.<table>'`. The resulting table cannot exist before running the procedure.

The input and data geography columns need to be in the same spatial reference system. If they are not, you will need to convert them.

**Output**

The output table will contain all the input columns provided in the `input_query` and one extra column for each variable in `variables`, named after its corresponding enrichment column and including a suffix indicating the aggregation method used.

**Examples**

{% code overflow="wrap" lineNumbers="true" %}

```sql
CALL carto.ENRICH_POLYGONS(
   'SELECT id, geom FROM <my-schema>.<my-table>',
   'geom',
   'SELECT geom, var1, var2 FROM <my-schema>.<my-table>',
   'geom',
   '[["var1", "sum"],["var2", "count"]]',
   '<my-schema>.<my-enriched-table>'
);
-- The table '<my-schema>.<my-enriched-table>' will be created
-- with columns: id, geom, var1_sum, var2_count
```

{% endcode %}

{% code overflow="wrap" lineNumbers="true" %}

```sql
CALL carto.ENRICH_POLYGONS(
   '<my-schema>.<my-table>',
   'geom',
   'SELECT geom, var1, var2 FROM <my-schema>.<my-table>',
   'geom',
   '[["var1", "sum"],["var2", "count"]]',
   '<my-schema>.<my-enriched-table>'
);
-- The table '<my-schema>.<my-enriched-table>' will be created
-- with columns: id, geom, var1_sum, var2_count
```

{% endcode %}


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.carto.com/data-and-analysis/analytics-toolbox-for-redshift/sql-reference/data.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
