Data Enrichment

Components to enrich your data with variables from other data sources. These components work with simple features and spatial index grids.

Enrich H3 Grid

Description

This component enriches a target table with data from a source. Enriching here means adding columns with aggregated data from the source that matches the target geographies.

The target (the upper input connection of this component) must have a column that contains H3 indices, which will be used to join with the source.
The source (the lower input connection) can be either a CARTO Data Observatory subscription or a table (or the result of another component) with a geography column.

For the enrichment operation the CARTO Analytics Toolbox is required, and one of the following procedures will be called:

DATAOBS_ENRICH_GRID if the source is a Data Observatory subscription
ENRICH_GRID otherwise

Inputs

Target geo column: the column of the target that will be used to join with the source and to select the rows that will be aggregated for each target row.
Source geo column: (only necessary for non-DO sources) the column of the source that will be joined with the target.
Variables: this allows selecting the data from the source that will be aggregated and added to the target.
- For Data Observatory subscriptions, the variables can be selected from the DO variables of the subscription, identified by their variable slug;
- for other sources they are the columns in the source table.
Each variable added must be assigned an aggregation method. You can add the same variable with different aggregation methods. At the moment only numeric variables are supported.
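The join-and-aggregate behaviour described above can be sketched in plain Python. This is an illustration only, not the Analytics Toolbox implementation: it assumes each source row has already been assigned to a single cell (the hypothetical `cell` key), and the function name, the set of aggregation methods and the output column names are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical aggregation methods; the set offered by the component may differ.
AGGREGATIONS = {"sum": sum, "avg": mean, "min": min, "max": max, "count": len}


def enrich_by_index(target_cells, source_rows, variables):
    """Add one aggregated column per (column, method) pair to each target cell.

    target_cells: list of cell index strings (the target geo column).
    source_rows:  list of dicts with a 'cell' key plus numeric columns.
    variables:    list of (column, method) pairs; a column may appear
                  several times with different methods.
    """
    # Group the source rows by the cell they fall in.
    by_cell = defaultdict(list)
    for row in source_rows:
        by_cell[row["cell"]].append(row)

    enriched = []
    for cell in target_cells:
        out = {"cell": cell}
        matches = by_cell.get(cell, [])
        for column, method in variables:
            values = [r[column] for r in matches]
            # Cells without matching source rows get None in this sketch.
            out[f"{column}_{method}"] = AGGREGATIONS[method](values) if values else None
        enriched.append(out)
    return enriched
```

Calling `enrich_by_index(cells, rows, [("population", "sum"), ("population", "avg")])` adds the same variable twice, once per aggregation method.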

For spatially smoothed enrichments that take into account the surrounding cells, use the following input parameters:

Kring size: size of the k-ring where the decay function will be applied. This value can be 0, in which case no k-ring will be computed and the decay function won't be applied.
Decay function: decay function to aggregate and smooth the data. Supported values are uniform, inverse, inverse_square and exponential.
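As a rough illustration of how a decay function can smooth values over a k-ring, the sketch below weights each neighbouring value by its ring distance from the target cell. The formulas attached to each decay name are assumptions chosen for the example, and the sketch returns a plain weighted sum; the exact formulas and any normalization used by the Analytics Toolbox should be checked in its reference documentation.

```python
import math

# Assumed decay formulas as a function of the ring distance d
# (d = 0 is the target cell itself). Illustrative only.
DECAY = {
    "uniform": lambda d: 1.0,
    "inverse": lambda d: 1.0 / (1.0 + d),
    "inverse_square": lambda d: 1.0 / (1.0 + d * d),
    "exponential": lambda d: math.exp(-d),
}


def smooth(values_by_distance, kring_size, decay):
    """Return a decay-weighted sum over rings 0..kring_size.

    values_by_distance maps a ring distance to the list of source values
    found in the cells at that distance from the target cell.
    """
    if kring_size == 0:
        # No k-ring is computed and no decay is applied.
        return sum(values_by_distance.get(0, []))
    weight = DECAY[decay]
    return sum(
        weight(d) * value
        for d in range(kring_size + 1)
        for value in values_by_distance.get(d, [])
    )
```

With `uniform`, every cell in the k-ring contributes equally; the other functions give less influence to cells further from the target cell.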

Outputs

Result table [Table]

Enrich Points

Description

This component enriches a target table with data from a source. Enriching here means adding columns with aggregated data from the source that matches (intersects) the target geographies.

The target (the upper input connection of this component) must have a geo column, which will be used to intersect with the source.
The source (the lower input connection) can be either a CARTO Data Observatory subscription or a table (or the result of another component) with a geo column.

For the enrichment operation the CARTO Analytics Toolbox is required, and one of the following procedures will be called:

DATAOBS_ENRICH_POINTS if the source is a Data Observatory subscription
ENRICH_POINTS otherwise

Inputs

Target geo column: the column of the target that will be used to intersect with the source and to select the rows that will be aggregated for each target row.
Source geo column: (only necessary for non-DO sources) the column of the source that will be intersected with the target.
Variables: this allows selecting the data from the source that will be aggregated and added to the target.
- For Data Observatory subscriptions, the variables can be selected from the DO variables of the subscription, identified by their variable slug;
- for other sources they are the columns in the source table.
Each variable added must be assigned an aggregation method. You can add the same variable with different aggregation methods. At the moment only numeric variables are supported.
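To make the "intersects" matching concrete, here is a deliberately simplified sketch in which source geographies are axis-aligned boxes and targets are points. Real geographies are arbitrary shapes and the component relies on the Analytics Toolbox for the spatial predicate; the `box` key, the function names and the output column names are hypothetical.

```python
def contains(box, point):
    """Return True if the point lies inside the box (borders included)."""
    xmin, ymin, xmax, ymax = box
    x, y = point
    return xmin <= x <= xmax and ymin <= y <= ymax


def enrich_points(points, source_rows, column):
    """For each target point, aggregate `column` over the intersecting source rows."""
    enriched = []
    for point in points:
        values = [row[column] for row in source_rows if contains(row["box"], point)]
        enriched.append({
            "point": point,
            f"{column}_sum": sum(values),
            f"{column}_count": len(values),
        })
    return enriched
```

A point that falls inside two overlapping source boxes aggregates the values of both; a point outside every box gets a sum of 0 and a count of 0 in this sketch.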

Outputs

Result table [Table]

Enrich Polygons

Description

This component enriches a target table with data from a source. Enriching here means adding columns with aggregated data from the source that matches (intersects) the target geographies.

The target (the upper input connection of this component) must have a geo column, which will be used to intersect with the source.
The source (the lower input connection) can be either a CARTO Data Observatory subscription or a table (or the result of another component) with a geo column.

For the enrichment operation the CARTO Analytics Toolbox is required, and one of the following procedures will be called:

DATAOBS_ENRICH_POLYGONS if the source is a Data Observatory subscription
ENRICH_POLYGONS otherwise
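The procedure selection described for the grid, points and polygons components follows a single rule, which can be summarized in a small helper. The helper itself and its argument names are hypothetical; only the procedure names come from this page.

```python
# Base procedure per component kind, as listed on this page.
PROCEDURES = {
    "grid": "ENRICH_GRID",
    "points": "ENRICH_POINTS",
    "polygons": "ENRICH_POLYGONS",
}


def procedure_name(kind, source_is_do_subscription):
    """Return the Analytics Toolbox procedure called for a given component kind."""
    base = PROCEDURES[kind]
    # Data Observatory subscriptions use the DATAOBS_ variant of the procedure.
    return f"DATAOBS_{base}" if source_is_do_subscription else base
```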

Inputs

Target geo column: the column of the target that will be used to intersect with the source and to select the rows that will be aggregated for each target row.
Source geo column: (only necessary for non-DO sources) the column of the source that will be intersected with the target.
Variables: this allows selecting the data from the source that will be aggregated and added to the target.
- For Data Observatory subscriptions, the variables can be selected from the DO variables of the subscription, identified by their variable slug;
- for other sources they are the columns in the source table.
Each variable added must be assigned an aggregation method. You can add the same variable with different aggregation methods. At the moment only numeric variables are supported.

Outputs

Result table [Table]

Enrich Polygons with Weights

Description

This component uses a data source (either a table or a DO subscription) to enrich another target table using weights to control the enrichment.

Inputs

Target table to be enriched
Source table with the data for the enrichment
Weights table with data to weight the enrichment

Settings

Target polygons geo column: Select the column from the target table that contains a valid geography.
Source table geo column: Select the column from the source table that contains a valid geography.
Variables: Select a list of variables and aggregation method from the source table to be used to enrich the target table. Valid aggregation methods are:
- SUM: It assumes the aggregated variable is an extensive property (e.g. population). Accordingly, the value corresponding to the feature intersected is weighted by the fraction of the intersected weight variable.
- MIN: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). Thus, the value is not altered by the weight variable.
- MAX: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). Thus, the value is not altered by the weight variable.
- AVG: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). A weighted average is computed, using the value of the intersected weight variable as weights.
- COUNT: It computes the number of features that contain the enrichment variable and are intersected by the input geography.

💡 The component will return an error if all variables selected are aggregated as MIN or MAX, since the result wouldn't actually be weighted.

Weights geo column: Select the column from the weights table that contains a valid geography.
Weights variable: Select one variable and aggregation operation to be used as weight for the enrichment.

If your weight variables are included in the same table as the source variables, you can connect the same node to both inputs in this component.

The source and the weights source must be of the same kind. When the source for the enrichment is a standard table, the weights source can't be a DO subscription; likewise, when the source for the enrichment is a DO subscription, the weights source can't be a standard table.
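The aggregation methods listed under Variables can be illustrated with a small sketch. It reflects one reading of the descriptions above, not the actual implementation: for SUM, the "fraction of the intersected weight variable" is taken to be the weight that falls inside the intersection divided by the total weight of the source feature. The dictionary keys and the function name are hypothetical.

```python
def weighted_aggregate(features, method):
    """Aggregate one variable over the source features intersecting a target polygon.

    Each feature is a dict with:
      value        - the source variable for that feature (may be None)
      weight_in    - weight variable inside the intersection with the target
      weight_total - weight variable over the whole source feature
    """
    valid = [f for f in features if f["value"] is not None]
    if method == "COUNT":
        return len(valid)
    if not valid:
        return None
    if method == "MIN":
        return min(f["value"] for f in valid)  # intensive: weights are ignored
    if method == "MAX":
        return max(f["value"] for f in valid)  # intensive: weights are ignored
    if method == "SUM":
        # Extensive: scale each value by the share of the weight that is intersected.
        return sum(
            f["value"] * f["weight_in"] / f["weight_total"]
            for f in valid
            if f["weight_total"]
        )
    if method == "AVG":
        # Intensive: average weighted by the intersected weight.
        total = sum(f["weight_in"] for f in valid)
        if not total:
            return None
        return sum(f["value"] * f["weight_in"] for f in valid) / total
    raise ValueError(f"Unsupported aggregation method: {method}")
```

For example, a source feature with a value of 100 whose weight is only 25% inside the target contributes 25 to a SUM, while MIN and MAX would still see the full value of 100.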

Outputs

Output table with the following schema:
- All columns from Target
- A column for each variable in 'Variables', named like 'name_sum', 'name_avg', 'name_max' depending on the original column name and the aggregation method.
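The naming pattern for the output columns can be sketched as follows. The function is hypothetical and assumes the suffix is simply the lower-cased aggregation method, as the examples 'name_sum', 'name_avg' and 'name_max' suggest.

```python
def output_columns(target_columns, variables):
    """Return the expected output schema: target columns plus one column per variable.

    variables is a list of (column_name, aggregation_method) pairs.
    """
    enriched = [f"{name}_{method.lower()}" for name, method in variables]
    return list(target_columns) + enriched
```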

Enrich Quadbin Grid

Description

This component enriches a target table with data from a source. Enriching here means adding columns with aggregated data from the source that matches the target geographies.

The target (the upper input connection of this component) must have a column that contains Quadbin indices, which will be used to join with the source.
The source (the lower input connection) can be either a CARTO Data Observatory subscription or a table (or the result of another component) with a geography column.

For the enrichment operation the CARTO Analytics Toolbox is required, and one of the following procedures will be called:

DATAOBS_ENRICH_GRID if the source is a Data Observatory subscription
ENRICH_GRID otherwise

Inputs

Target geo column: the column of the target that will be used to join with the source and to select the rows that will be aggregated for each target row.
Source geo column: (only necessary for non-DO sources) the column of the source that will be joined with the target.
Variables: this allows selecting the data from the source that will be aggregated and added to the target.
- For Data Observatory subscriptions, the variables can be selected from the DO variables of the subscription, identified by their variable slug;
- for other sources they are the columns in the source table.
Each variable added must be assigned an aggregation method. You can add the same variable with different aggregation methods. At the moment only numeric variables are supported.

For spatially smoothed enrichments that take into account the surrounding cells, use the following input parameters:

Kring size: size of the k-ring where the decay function will be applied. This value can be 0, in which case no k-ring will be computed and the decay function won't be applied.
Decay function: decay function to aggregate and smooth the data. Supported values are uniform, inverse, inverse_square and exponential.
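On a square grid, the neighbourhood covered by a k-ring can be pictured as a filled square around the target cell. The sketch below assumes cells are addressed by hypothetical (x, y) tile coordinates at a fixed zoom level and that ring distance is the Chebyshev distance; it ignores tile bounds and wrap-around, and it does not use the actual Quadbin encoding.

```python
def square_kring(x, y, k):
    """Return {(nx, ny): distance} for the filled square k-ring around tile (x, y)."""
    return {
        (x + dx, y + dy): max(abs(dx), abs(dy))  # Chebyshev distance to the centre
        for dx in range(-k, k + 1)
        for dy in range(-k, k + 1)
    }
```

Under these assumptions, a k-ring of size 0 contains only the target cell, size 1 covers 9 cells and size 2 covers 25 cells; the distance of each cell is what a decay function would take as input.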

Outputs

Result table [Table]

