CARTO Workflows provides a visual language to design and execute multi-step spatial analytics procedures. With Workflows, our mission is to bring spatial analytics to a broader audience of data analysts and business users, and to democratize access to advanced Location Intelligence.
This tool, like the rest of the platform, is fully cloud native: it runs Spatial SQL in your own data warehouse and leverages CARTO's Analytics Toolbox and the other components of our technology stack.
CARTO Workflows reduces complexity and the heavy dependence on specialist knowledge that spatial analysis usually requires. Users can leverage the scalability of cloud data warehouses through the use of Spatial SQL without needing to write SQL code themselves. It opens up analytical modelling to all roles and skill levels through a simple, familiar user interface.
A workflow runs as a directed graph, meaning that it executes from left to right, step by step, but the output of a node can be used in the immediate next step or later in the workflow. For example, if you have 5 nodes, the output of node 2 can connect to the immediate next node (node 3), or to nodes 4 or 5, or to all of them if needed.
The CARTO team has designed this collection of Workflows examples with a hands-on approach to empower users and ease the Workflows learning curve.
These examples showcase a wide range of scenarios and applications: from simple building blocks for your geospatial analysis to more complex, industry-specific workflows tailored to facilitate running specific geospatial use-cases.
Making use of these examples is very easy. Just click on "New Workflow" and then "From template" in your CARTO Workspace to access the collection of templates. Once the workflow is re-created, you will be able to modify it as with any other workflow, replacing the data sources and re-configuring the different nodes so it fits your specific use case.
Before we jump into Workflows, let’s take a quick tour of the Workflows UI so you know your way around before getting started.
First is the Canvas, where you will design your Workflow. This is a free-form Canvas, meaning you can drag nodes onto any part of it. You can zoom in and out to see different parts of your workflow, and see its layout in the mini viewer in the lower right corner. As you add nodes to the canvas they will snap to a grid to align.
On the left side, you will find a menu where you can add data sources from the connection used to create the Workflow. You can add any data source that exists in your connection. You also have all the components, or nodes, that you can add to the canvas. We will go over the various components later. You can search for components or scroll to find the one you want.
The bottom panel is the results space, where you will see four different tabs:
Messages: Messages about the status of your Workflow, including success and error messages.
Data: After clicking on a node, you can see the tabular data output of that specific workflow step.
Map: After clicking on a node, if that step returns a valid geometry, it will show up on the map. If there is more than one geometry column, you will have an option to pick which one to show.
SQL: The SQL code that is executed in the data warehouse when the workflow runs.

The Results panel in Workflows provides information about each node in your workflow. Click on a node to open the panel and see its different sections.
The header of the panel contains general information about the workflow, like the time it was last executed or the connection being used. It also allows expanding to full screen or hiding the panel to leave more space for the canvas.
This tab contains information about the execution of the workflow. Error messages and confirmation for successful runs will appear here.
This tab contains a table visualization of the result of the node's execution.
Next to each column name, you will find an icon. Click on it to show some column statistics on the right side of the panel:
For numeric columns, max, min, average and sum are displayed.
For string and date or timestamp columns, the frequency of the top 20 categories is calculated and shown.
The Data tab also allows exploring the result in pages of configurable length, which can be set in the bottom right corner of the panel.
The copy button in the top right corner will copy the content of the current page to the clipboard, using tabs as delimiters, which makes it easy to paste directly into a spreadsheet.
The export button will start the export process of the data on the current node. The export will happen asynchronously, and the status will be reported in the activity panel in the top bar. Once finished, the resulting link(s) will be available.
The Map tab shows a preview of the result on a map. It also allows creating a map in Builder using the selected node as a data source. When there is more than one column that could be rendered (several geometry/geography columns, spatial indexes, etc.), you can use a selector to make sure the correct column is used in the Builder map.
This tab contains the SQL code that is executed in the data warehouse when the workflow runs from the UI, by clicking on the Run button.
Part of this code handles the creation of temporary tables and other control strategies. This portion of the code is collapsed and hidden by default, but it can be expanded by clicking on the two small dots inline with the code.
This is the area of the interface where you will be adding the different components and data sources that are required to run your analysis, and connecting them in order to define each step of the workflow in order of execution.
In this central panel you will always find the main diagram of your workflow, with all its components and data sources connected according to what you have established.
In the workflow canvas you have the possibility to add visual assets and annotations to provide explanations on your workflow.
You can do it by clicking on the corresponding icon in the toolbar and dragging on the canvas. This will help you make very complex workflows more understandable.
The notes support Markdown syntax, allowing you to add different levels of headers, text formatting, code blocks, inline monospace code, bulleted and ordered lists, links, images, etc.
This feature has been deprecated in favor of a dedicated component. Take a look at the announcement in the corresponding section of this documentation.
On the toolbar on top of the canvas you will find the Import file into workflow button.
Clicking this button will let you select a location in the canvas to place a datasource. After that, select a local file from your computer or specify a URL that will be imported as a data node in your workflow.
On the toolbar on top of the canvas you will find the Generate workflow description button.
Clicking this button will let you select a location in the canvas that will contain an automatic description of the workflow.
This description is obtained using Generative AI on the workflow's SQL code.
Double click on a node's name to add a custom name to it. By using more descriptive names, your workflow will be much easier to understand.








An extension for accessing open source Geospatial Foundation Models: large-scale AI models trained on diverse geospatial data like satellite imagery, topographical maps, and other location-specific datasets, that encapsulate spatial information and relationships into dense representations in a multi-dimensional space (embeddings).
Description
This component loads Google's Population Dynamics Foundation Model (PDFM) Embeddings at the county level. PDFM Embeddings are condensed vector representations designed to encapsulate the complex, multidimensional interactions among human behaviors, environmental factors, and local contexts at specific locations.
These embeddings capture patterns in aggregated data such as search trends, busyness trends, and environmental conditions (maps, air quality, temperature), providing a rich, location-specific snapshot of how populations engage with their surroundings.
Output
Output table: This component generates an output table with the PDFM embeddings at the county level.
Description
This component loads Google's Population Dynamics Foundation Model (PDFM) Embeddings at the ZIP Code Tabulation Area (ZCTA) level. PDFM Embeddings are condensed vector representations designed to encapsulate the complex, multidimensional interactions among human behaviors, environmental factors, and local contexts at specific locations.
These embeddings capture patterns in aggregated data such as search trends, busyness trends, and environmental conditions (maps, air quality, temperature), providing a rich, location-specific snapshot of how populations engage with their surroundings.
Output
Output table: This component generates an output table with the PDFM embeddings at the ZCTA level.
This component loads Google's Satellite Embeddings, summarized for the input geometries.
The Satellite Embeddings dataset provides global, analysis-ready 64-dimensional vectors for each 10-meter land pixel, capturing annual temporal patterns of surface conditions using data from multiple Earth observation sources. Unlike traditional spectral bands, these vectors summarize complex, multi-source relationships in a more abstract but powerful format.
Input
Input table: This component expects an input table with a geography column
Settings
Geography column: A column that contains the geography data.
From: The starting year of data to extract.
To: The end year of data to extract.
Reducer function: Function to summarize the raster data over each polygon (e.g., mean, min, max).
Scale in meters: The scale of the source data in meters.
Output
Output table: This component generates an output table with the Satellite Embeddings.
In order to start using variables, click on the three dots in the upper-right corner and find 'Variables'.
Order: The order of the variables is relevant when setting them as parameters, as they will be used as inputs for the generated stored procedure in that order, defining the procedure's signature.
Name: Give a name to the variable. This name will be referenced as @name within an expression.
Using a variable name that conflicts with a column name can produce errors on the Workflow's SQL code. Please make sure that your variable's name is not the same as any of the column names in the data sources involved in the Workflow.
Type: Number and String variables are supported.
Default value: The default value for the variable. When executing a workflow from the UI, this is the value that will be given to the variable.
Parameter: Mark a variable as parameter if you need to allow external access to it. This is helpful when running a workflow via API call.
{% hint style="info" %} For Databricks connections, executing Workflows via API is currently a work in progress. For this reason, the use of parameters is not supported on this type of execution. The functionality will be available soon; please get in touch with our support team if you have questions. {% endhint %}
Learn more about executing parameterized workflows in the Workflows API documentation.
Click on the '+' button to confirm the creation of the variable. A new entry will appear where you can keep adding variables.
Once variables have been defined, you can use them to control the value of a component's settings. As global variables, they can be used in one or many places within the same workflow, and they will apply the same value to every setting where they're used.
To use a variable in a setting, just start typing {{@variable_name}} and the available matching variables will appear. Make sure that you close the variable name with double curly braces.
Not only variables, but expressions including SQL operations and column names are also allowed, so something like this could be used as a setting on a node: {{@buffer_radius / distance * 10}} which combines a parameter (buffer_radius) with a column (distance) in a mathematical operation.
Components to work with raster data sources
Description
This component allows obtaining values from a raster table. The values included in the output can be determined by a vector data source input.
Inputs
Raster table [Table]: raster data source that will be used as source for the values.
Vector table [Table]: vector data source that will be used to determine which pixel values are going to be obtained
Geo column: Column in the Vector table that contains a geography to determine which pixel values are going to be obtained.
Raster properties: List of band values that are going to be obtained from the raster, in the areas where it intersects with the vector table features.
Custom expression: (optional) Add a custom SQL expression containing an operation that uses values from the different bands in the raster, for example band_1+band_2 as sum
Advanced options
Include pixel [boolean]: Default false. If true, the output table will include a pixel column containing the quadbin id of the corresponding pixel in the raster.
Outputs
Result table [Table]: A table that contains a row for each pixel in the raster source that intersects with the vector source. It will contain an optional pixel column and a column for each selected property, plus any additional column generated by a custom expression.
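For example, assuming the raster exposes two hypothetical bands named band_3 (red) and band_4 (near infrared), the Custom expression setting described above could derive a per-pixel NDVI value in addition to the selected properties:
(band_4 - band_3) / NULLIF(band_4 + band_3, 0) AS ndvi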
Description
This component intersects a raster table with a table containing polygons and aggregates the values extracted from the intersection areas.
Inputs
Raster table [Table]: raster data source that will be used as source for the aggregated values.
Vector table [Table]: vector data source that contains the geographies to be enriched with aggregated data from the raster pixel values.
Geo column: Geography column in the Vector table that will be intersected with the raster.
Raster aggregated properties: List of aggregated band values that are going to be obtained from the raster, in the areas where it intersects with each of the vector table rows.
Custom expression: (optional) Add a custom SQL expression containing an operation that aggregates values from the different bands in the raster, for example AVG(band_1+band_2) as avg_band_sum
Outputs
Result table [Table]: A table that contains the same rows as the input Vector table, with an additional column for each raster aggregated property and any additional property defined as a custom expression. The values for those columns are the computed aggregation from all pixels in the raster that intersect with the vector feature.
Components to leverage native Generative AI capabilities on Data Warehouses.
Description
This component calls the ML.GENERATE_TEXT function in BigQuery for each row of the input table.
Inputs
Model [FQN]: The path for the model to be used in the format project_id.dataset.model
Prompt column: The column in the input source used to generate the prompt.
Max Output Tokens: an INT64 value in the range [1,1024] that sets the maximum number of tokens that can be generated in the response. The default is 50.
Temperature: a FLOAT64 value in the range [0.0,1.0] that is used for sampling during the response generation, which occurs when top_k and top_p are applied. It controls the degree of randomness in token selection. Lower temperature values are good for prompts that require a more deterministic and less open-ended or creative response, while higher temperature values can lead to more diverse or creative results. A temperature value of 0 is deterministic, meaning that the highest probability response is always selected. The default is 1.0.
Top K: an INT64 value in the range [1,40] that changes how the model selects tokens for output. Specify a lower value for less random responses and a higher value for more random responses. The default is 40.
Top P: a FLOAT64 value in the range [0.0,1.0] that changes how the model selects tokens for output. Specify a lower value for less random responses and a higher value for more random responses. The default is 1.0.
Tokens are selected from the most (based on the top_k value) to least probable until the sum of their probabilities equals the top_p value. For example, if tokens A, B, and C have a probability of 0.3, 0.2, and 0.1 and the top_p value is 0.5, then the model selects either A or B as the next token by using the temperature value and doesn't consider C.
Outputs
Result table [Table]
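As a rough illustration (not the exact SQL generated by the component), a direct call to ML.GENERATE_TEXT in BigQuery looks like the following; the model and table names are hypothetical, and the prompt column must be exposed as prompt:
SELECT *
FROM ML.GENERATE_TEXT(
  MODEL `my_project.my_dataset.my_gemini_model`,  -- remote model registered in BigQuery
  (SELECT description AS prompt FROM `my_project.my_dataset.places`),  -- one prompt per input row
  STRUCT(50 AS max_output_tokens, 1.0 AS temperature, 40 AS top_k, 1.0 AS top_p)
);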
External links
You will be able to manage workflows from the Workflows section of the workspace. From there, you will be able to:
Edit workflow names
Create and apply tags to workflows
Share workflows
Duplicate workflows
Delete workflows
These options may be restricted depending on the user's role and permissions.
You can tag workflows from the Workspace by editing their properties:
Click on the three-dotted menu in the workflow card.
Select "Edit workflow properties".
Create, apply or remove existing tags in the tag input.
Additionally, you will be able to filter workflows by tags using the tag filter in the workspace. Once a tag filter is applied, you will be able to copy the URL and share it with other users within your organization.
Tags are automatically deleted when they are no longer applied to any map or workflow.
To duplicate an existing workflow from the Workspace:
Click on the three-dotted menu in the workflow card.
Select "Duplicate Workflow".
To delete a workflow from the Workspace:
Click on the three-dotted menu in the workflow card.
Select "Delete workflow".
Confirm that you want to delete the selected workflow.
This guide describes how you can share a workflow with the other users within your CARTO organization.
When you create a workflow, it’s always set to private by default. To share a workflow, you just need to make sure it has a name, and then click on the Share button in the top-right corner to open the sharing options.
A new dialog screen will open, allowing you to select who you want to share your workflow with:
Private Preview
Components to control the execution flow of the workflow.
Description
The Conditional Split component evaluates an aggregated measure or custom expression from the input data against a logical condition, directing execution into one of two branches: If and Else. This allows workflows to follow different processing paths depending on whether the condition evaluates to true or false.
Inputs
Source table [Table]
Settings
The condition can be defined in two ways:
Simple condition – built through the UI:
Column: The column from the input table to aggregate.
Aggregation method: Function applied to the column (e.g., MIN, MAX, AVG, SUM).
Operator: Comparison operator (e.g., Is equal to, Is greater than, Is less than).
Value: The value to compare the aggregated result against.
Custom expression – a manually written SQL expression that can include multiple columns, functions, and logical operators (e.g., AVG(population) > 3000 AND (category LIKE 'capital_city' OR capital IS TRUE)).
Outputs
If path [Table]: Passes all rows from the input table when the condition evaluates to TRUE.
Else path [Table]: Passes all rows from the input table when the condition evaluates to FALSE.
Description
The Success/Error Split component determines the outcome of the previous component's execution and routes the workflow accordingly. If the upstream component completes successfully, the workflow continues through the Success branch; if it fails, execution follows the Error branch. This enables workflows to handle errors gracefully, trigger alternative actions, or send notifications when issues occur.
Inputs
Source table [Table] (only available if the upstream component executed successfully)
Settings
None
Outputs
Success path [Table]: Passes the table produced by the previous component if no errors occurred.
Error path [No table]: Produces no table output (since the previous step failed) but can connect to components such as HTTP Request, Send by Email, or logging for error handling.





Private: Only you can view and edit the workflow.
Organization: The workflow will be visible to all users in your organization. Editor users will be able to duplicate the workflow so they can create a copy and start working on it.
If you have Single Sign-On (SSO) and Groups enabled at your organization, you will be able to share workflows only with specific groups, instead of the entire organization. Read more about Managing user groups.
You can also share only with specific users in your organization.
Enable the "Allow editing this workflow" option to allow collaboration from other editor users with access to the workflow.
After you enable this setting, all editors with access to this workflow will also be able to edit it. A few things to take into consideration:
To enable editor collaboration, the connection used to create the workflow also needs to be shared.
When using the CARTO Data Warehouse, or a connection with Viewer Credentials, each user will use their own temporary storage for Workflows, meaning that the result of each node and their execution status will not be shared.
Collaboration is asynchronous. If two users try to open the workflow at the same time, a pop-up will appear to decide who will take control of it at that moment. If the current editor is away from the keyboard, the new editor can take control of the workflow after 30 seconds.
The current editor can also choose to deny the takeover, remaining in control of the workflow until it is closed.
Click on the three dots in the upper-left corner and find 'Export'.
This action will download a file that contains the SQL code needed to create a stored procedure based on the contents of your workflow. It takes into account existing variables and parameters, the latter being exposed as input parameters for the stored procedure.
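As an illustration (names and values are hypothetical), a workflow exported with two parameters, @buffer_radius (Number) and @category (String) defined in that order, could be executed with a call like:
CALL `my_project.my_dataset.my_exported_workflow`(500, 'retail');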
A .sql file that was obtained by exporting a workflow can be imported to reproduce the same workflow. There are two different ways to import a workflow:
Drag and drop a compatible .sql file into your Workflows page in Workspace.
Click the import button next to + New workflow and select a file from your computer.

To explore the available extension packages provided and maintained by CARTO, install new packages from a ZIP file, or manage already installed ones, simply click on Manage Extension Packages at the bottom of the component's panel in Workflows.
You will find three different tabs:
Explore: Here you will be able to see the catalog of available extensions. These extensions have been published by CARTO. By clicking on an extension's card you will access the details page with a deeper description of the extension and its components and settings.
Installed: This tab shows a list of installed extensions in the current connection. When a new version of the extension is available you will see an 'Outdated' tag. To update, simply click on the three dots and select "Update extension package".
Upload: This tab allows uploading a ZIP file for a custom extension. See the Creating an Extension Package section to find more information. After the upload, a details page will appear. Click on "Install extension" to install it. After that, the extension will be available in the Installed tab.
After installation, the custom components included in the extension can be found in the components panel, under the category defined for each by the extension's creator.
Each component included in an extension is installed as a stored procedure in the Workflows working location for each connection. This implies that they're only accessible by workflows that are created with the same connection, or with connections that share the same working location.
Admins can prevent users in their organization from installing extensions from the Governance settings.
Creating an extension package for Workflows involves some development work, as each custom component in the package needs to be programmed as a SQL Stored Procedure.
We have created a GitHub template that can be used as a starting point, as it contains the basic folder structure, configuration files, examples and documentation to create your first package.
Take a look at it in https://github.com/CartoDB/workflows-extension-template.
The repository also includes some utilities to test your custom components, and deploy directly to your data warehouse to ease development.
Additionally, the repository includes a script to package your extension, resulting in a ZIP file that can be used in Workflows to install the package.
Components that allow creating custom user-defined procedures and SQL queries in your workflows.
Description
This component executes an arbitrary CALL SQL statement.
Use $a and $b as placeholders for the input table names (you can use one, both, or none of them), and $output as placeholder for the output table name.
Extension Package provided by CARTO
This extension provides tools for telecommunications analysis, including path profile determination, signal propagation modelling, and obstacle identification.
It can utilize raster data for terrain and clutter profiles, enabling accurate assessment of transmitter-receiver paths and signal characteristics in various environments.
The components in this extension package leverage the functionality provided by the Telco module in the CARTO Analytics Toolbox for BigQuery. Check the module's documentation for more detailed information about the analysis.
Description
This component identifies the Path Profile between the transmitter (Tx) and receiver (Rx), including any obstacles along the propagation path. It utilizes tables containing information about buildings, vegetation, water (for clutter data), and the terrain elevation profile.
Inputs
Tx locations: A table that contains points with the location of the transmitters.
Rx locations: A table that contains points with the location of the receivers.
Terrain profile: A table that contains geographies with the terrain profile.
Settings
Buildings table FQN: (Optional) The fully qualified name of the table that contains buildings to be considered in the propagation path calculation.
Vegetation table FQN: (Optional) The fully qualified name of the table that contains vegetation to be considered in the propagation path calculation.
Water table FQN: (Optional) The fully qualified name of the table that contains water bodies to be considered in the propagation path calculation.
Frequency: The value of the operating frequency in GHz.
Intersect Fresnel zone: A flag indicating whether to consider the line or the Fresnel zone for the calculation.
Terrain point sampling: A flag indicating whether to calculate terrain points along the path.
Resolution of sampling for the terrain points: Resolution of sampling for the terrain points to be considered in the path profile.
Street Width calculation: A flag indicating whether to include street width in the path profile calculation.
Outputs
Output table: A table containing the path profiles.
Output table details: A table with the details of the obstacles of the path profiles.
External links
Description
This component creates a new table with aggregated values.
It is required to convert spatial types to an allowed type by using 'ST_ASBINARY' before grouping by them.
Inputs
Source table [Table]
Columns and methods for aggregation [String]
Columns to group by [String]
Outputs
Result table [Table]
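As a sketch of the equivalent SQL (table and column names are hypothetical), grouping by a geography column after converting it with ST_ASBINARY looks like:
SELECT
  ST_ASBINARY(geom) AS geom_wkb,   -- spatial column converted to a groupable type
  COUNT(*) AS total_rows,
  SUM(revenue) AS revenue_sum
FROM `my_project.my_dataset.stores`
GROUP BY geom_wkb;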
Description
This component adds new columns with statistics computed from the points of a second table that fall within each polygon.
Inputs
Points table [Table]
Polygons table [Table]
Geo column in points table [Column]
Geo column in polygons table [Column]
Column to compute stats for in points table [Column]
Column with identifier in polygons table [Column]
Outputs
Result table [Table]
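A minimal sketch of the logic this component applies, assuming hypothetical table and column names:
SELECT
  poly.id,
  COUNT(pt.value) AS points_count,
  AVG(pt.value) AS value_avg,
  SUM(pt.value) AS value_sum
FROM `my_project.my_dataset.polygons` poly
LEFT JOIN `my_project.my_dataset.points` pt
  ON ST_CONTAINS(poly.geom, pt.geom)   -- points that fall within each polygon
GROUP BY poly.id;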
Description
This component explodes multi-part geographies into multiple single-part ones.
The output table has the same structure as the input one, but the geo column only contains single-part geographies. Rows with multi-part geographies are replaced by several rows with single-part geographies and identical values for the rest of the columns.
Inputs
Source table [Table]
Geo column [Column]
Outputs
Result table [Table]
External links
This component creates a new table with aggregated values for the whole input table.
Inputs
Source table [Table]
Aggregation: Use the UI of this component to select a list of aggregated properties that will be added to the result table.
Outputs
Result table [Table]
Description
This component unnests the arrays contained in a column.
Inputs
Source table [Table]
Column [Column]
Outputs
Result table [Table]
External links
Source table [Table]
Geo column [Column]
Outputs
Result table [Table]
External links
Description
This component creates a new table with an additional column with the number of parts in each input geography.
Inputs
Source table [Table]
Geo column [Column]
Outputs
Result table [Table]
External links
Description
This component creates a new table with an additional column with the number of points in each input geography.
Inputs
Source table [Table]
Geo column [Column]
Outputs
Result table [Table]
External links
Description
This component adds a new column with the point at the n-th position in the original input geography.
The geography must be of type linestring.
It will set the column value to null if the original geography is not a linestring, if it is empty, or if there is no vertex at the given index.
Inputs
Source table [Table]
Geo column [Column]
Point index [Number]
Outputs
Result table [Table]
External links
Description
This component creates a new table with additional columns containing X and Y coordinates of the geographies in the input table. This only applies to the case of point geographies.
Inputs
Source table [Table]
Geo column [Column]
Outputs
Result table [Table]
External links
Description
This component determines the Path Profile between the transmitter (Tx) and receiver (Rx), identifying obstacles along the propagation path by using raster tables for clutter and terrain profiles.
Inputs
Tx locations: A table that contains points with the location of the transmitters.
Rx locations: A table that contains points with the location of the receivers.
Clutter: A raster table that contains data about the clutter.
Terrain profile: A raster table with terrain profile data.
Settings
Frequency: The value of the operating frequency in GHz.
Intersect Fresnel zone: A flag indicating whether to consider the line or the Fresnel zone for the calculation.
Adjust the Rx height: A flag indicating whether to adjust the Rx height based on the clutter.
Intersect center: A flag indicating whether to extract the pixel values from the Clutter raster by intersecting the pixel center or the pixel boundary.
Clutter raster band for height: (Optional) The band to be extracted from the raster as the height of the clutter.
Clutter raster band for type: (Optional) The band to be extracted from the raster as the type of the clutter. If empty then the type constant is used.
Terrain profile raster band for height: (Optional) The band to be extracted from the raster as the height of the terrain.
Outputs
Output table: A table containing the path profiles.
Output table details: A table with the details of the obstacles of the path profiles.
Description
This component implements the Close In empirical propagation model.
Inputs
Path profiles: This input should be a table that contains path profiles obtained with either Path Profile or Path Profile Raster components.
Settings
Frequency: The operating frequency in GHz.
Scenario: The environment scenario.
Number of data points: The number of data points used to compute $X_\sigma$.
Path Loss Exponent for LOS: The $PathLossExponent$ for the LOS case.
Path Loss Exponent for NLOS: The $PathLossExponent$ for the NLOS case.
Sigma LOS: The standard deviation of the zero-mean Gaussian random variable for the LOS case.
Sigma NLOS: The standard deviation of the zero-mean Gaussian random variable for the NLOS case.
Outputs
Output table: A table with the same schema as the input table plus a column that contains the result of the analysis.
Description
This component implements the Extended HATA empirical propagation model.
Inputs
Tx Locations: A table that contains points with the location of the transmitters.
Rx Locations: A table that contains points with the location of the receivers.
Path profiles: This input should be a table that contains path profiles obtained with either Path Profile or Path Profile Raster components.
Settings
Frequency: The operating frequency in GHz.
Scenario: The environment scenario
Outputs
Output table: A table with the path losses (dB).


The following is a valid example of code to use in this component:
CALL MY_PROCEDURE($a, 10, `$output`);
$isStoredProcedure
When using the Workflows API or running workflows as an MCP Tool, your workflow is executed as a stored procedure within your data warehouse. This changes how tables are handled.
To manage this behavior, we provide the $isStoredProcedure placeholder. This will be replaced with true when the workflow is run as a stored procedure. This happens when running workflows via API.
In stored procedures:
Table names are not fully qualified
Tables are expected to be temporary and session-scoped
Your SQL code should account for this logic to avoid errors when running workflows in API mode.
Here is a sample code for that (notice that no CALL is made in this case, since the component can run any arbitrary SQL code):
Example:
IF ($isStoredProcedure) THEN
  -- $output will be a simple table name (i.e. mytable)
  CREATE TEMPORARY TABLE $output AS SELECT * FROM $a LIMIT 10;
ELSE
  -- $output will be a fully qualified name (i.e. database.schema.mytable)
  CREATE TABLE $output AS SELECT * FROM $a LIMIT 10;
END IF;
$dryrun
Workflows use a dry run mode to infer and propagate table schemas across components. When executing queries in this mode, you may want to differentiate logic from a standard run.
To handle this, use the $dryrun placeholder, which is replaced by true when the workflow is being dry-run (e.g., when clicking "Validate schema").
Example:
IF ($dryrun) THEN
  -- During dry run: propagate schema so other components can recognize the output
  CREATE TABLE `$output` AS SELECT * FROM `$a`;
ELSE
  -- During normal run: actually create the output table
  CREATE TABLE `$output` AS SELECT * FROM `$a`;
  -- During normal run: also write to a specific target table for post-processing
  CREATE OR REPLACE TABLE `project.schema.specific_table` AS SELECT * FROM `$a`;
END IF;
Inputs
Source table [Table]
Source table [Table]
SQL CALL statement [String]
Outputs
Result table [Table]
Description
This component runs arbitrary Python code.
Inputs
Source table: Use df_a in your code to reference a dataframe that contains the data in the input. Use $a to reference the FQN of the input table in your code.
Source table: Use df_b in your code to reference a dataframe that contains the data in the input. Use $b to reference the FQN of the input table in your code.
Source table: Use df_c in your code to reference a dataframe that contains the data in the input. Use $c to reference the FQN of the input table in your code.
Settings
Python code: Define your custom python code that will be executed by the component.
Outputs
Output table: Use df_output in your code to reference a dataframe that contains the data in the output. Use $output to reference the FQN of the output table in your code.
Description
This component executes an arbitrary SELECT SQL statement.
A new table will be created from the result of the entered statement
Use $a, $b and $c as placeholders for the input table names (you can use one, all, or none of them).
The following is a valid example of code to use in this component, which will generate a new table with an added area column (based on a column named 'geom'), with only the rows corresponding to the five largest geometries:
SELECT *, ST_AREA(geom) AS area FROM `$a` ORDER BY area LIMIT 5
Please avoid a trailing semicolon (;) in your Custom SQL code as it might cause unexpected behavior.
Inputs
Source table [Table]
Source table [Table]
SQL SELECT statement [String]
Outputs
Result table [Table]


Extension Package provided by CARTO
This extension provides powerful Google Earth Engine integration for spatial analysis.
It includes components for summarizing raster data over polygons and efficiently processing large-scale vector data with Earth Engine imagery.
These tools enable users to perform advanced geospatial analysis by combining BigQuery's data processing capabilities with Earth Engine's vast satellite imagery and geospatial datasets.
Description
Summarizes Google Earth Engine raster data over a polygon. Users specify an image, band, and reducer function for efficient spatial analysis.
Inputs
Source table: This component is limited to an input of 10 rows with polygons. It's typically used along with an input that defines the regions that will be summarized.
Settings
Geometry column: Select a column from the input table that contains polygons.
GEE Image ID: (Optional) Provide a Google Earth Engine image ID. Provide either this or a GEE Image expression.
GEE Image expression: (Optional) Provide a Google Earth Engine expression that returns a single image. Provide either this or a GEE Image ID.
Band: Specify which band from the image will be used for the reduce operation.
Reducer: Select a reducer function.
Scale in meters: Specify a size in meters to define the scale of your operation.
Outputs
Output table: This component returns a table with the same columns as the Source table, with an additional one named ee_summary_<reducer> that contains the value of the reduce operation.
Description
Summarizes Google Earth Engine raster data over polygons on a BigQuery table. Users specify an image, band, and reducer function for efficient spatial analysis.
Inputs
Source table: A table that contains polygons.
Settings
Geometry column: Select a column from the input table that contains polygons.
GEE Image ID: (Optional) Provide a Google Earth Engine image ID. Provide either this or a GEE Image expression.
GEE Image expression: (Optional) Provide a Google Earth Engine expression that returns a single image. Provide either this or a GEE Image ID.
Band: Specify which band from the image will be used for the reduce operation.
Reducer: Select a reducer function.
Scale in meters: Specify a size in meters to define the scale of your operation.
Outputs
Output table: This component returns a table with the same columns as the Source table, with an additional one named ee_<reducer> that contains the value of the reduce operation.
Description
Retrieves and summarizes elevation data for specified polygons.
Inputs
Source table: This component is limited to an input of 10 rows with polygons. It's typically used along with an input that defines the regions that will be summarized.
Settings
Geometry column: Select a column from the input table that contains polygons.
Reducer: Select a function
Outputs
Output table: This component returns a table with the same columns as the Source table, with an additional one named ee_<reducer>_elevation that contains the value of the reduce operation.
Description
Retrieves and summarizes the Normalized Difference Vegetation Index (NDVI) obtained with NIR (B8) and RED (B4) bands for specified polygons and date ranges.
Inputs
Source table: This component is limited to an input of 10 rows with polygons. It's typically used along with an input that defines the regions that will be summarized.
Settings
Geometry column: Select a column from the input table that contains polygons.
Start date: Type a YYYY-MM-DD that defines the start date to consider for your analysis.
End date: Type a YYYY-MM-DD that defines the end date to consider for your analysis.
Outputs
Output table: This component returns a table with the same columns as the Source table, with an additional one named ee_ndvi that contains the mean NDVI value of the polygon.
Description
Retrieves and summarizes precipitation data for specified polygons and date ranges.
Inputs
Source table: This component is limited to an input of 10 rows with polygons. It's typically used along with an input that defines the regions that will be summarized.
Settings
Geometry column: Select a column from the input table that contains polygons.
Reducer: Select a function.
Start date: Type a YYYY-MM-DD that defines the start date to consider for your analysis.
End date: Type a YYYY-MM-DD that defines the end date to consider for your analysis.
Outputs
Output table: This component returns a table with the same columns as the Source table, with an additional one named ee_<reducer>_precipitation that contains the value of the reduce operation.
Components to perform more advanced geospatial analytics leveraging the spatial properties and patterns in your data.
Description
This component performs a K-nearest neighbors analysis.
Inputs
Points table [Table]
Id column [Column]
Geo column [Column]
Number of nearest neighbors [Number]
Maximum distance: Defines the maximum distance where neighbors will be considered.
Outputs
Result table [Table]
External links
Description
This component performs DBSCAN clustering on the geographical coordinates. Given an input table and a geo column, it creates a new table with the same content as the input one and an additional column named 'cluster' containing the cluster id corresponding to each geography.
Inputs
Source table [Table]
Geo column [Column]
Search radius (m) [Number]
Minimum number of geographies [Number]
Outputs
Result table [Table]
External links
Description
This component performs K-means clustering on the geographical coordinates. Given an input table and a geo column, it creates a new table with the same content as the input one and an additional column named 'cluster' containing the cluster id corresponding to each geography.
Inputs
Source table [Table]
Geo column [Column]
Number of clusters [Number]
Outputs
Result table [Table]
External links
Description
This component takes a points table and a polygons table. It creates a new table with the same content as the polygons table, with an additional column containing the number of points from the points table that fall within each polygon.
Inputs
Points table [Table]
Polygons table [Table]
Geo column in points table [Column]
Geo column in polygons table [Column]
Column with identifier in polygons table [Column]
Outputs
Result table [Table]
Description
This component computes a Delaunay triangulation based on an input table with points.
It creates a new table with a 'geom' column containing the triangulation geographies (lines or polygons).
Inputs
Source table [Table]
Geo column [Column]
Return lines [Boolean]
Outputs
Result table [Table]
External links
Description
This component computes a Voronoi tessellation based on an input table with points.
It creates a new table with a 'geom' column containing the tessellation geographies (lines or polygons).
The maximum number of points used to compute Voronoi diagrams is 300,000. This limit ensures efficient computation while maintaining accuracy in delineating regions based on proximity to specified points.
Inputs
Source table [Table]
Geo column [Column]
Return lines [Boolean]
Outputs
Result table [Table]
External links
Extension Package provided by CARTO
This extension provides powerful Google Environment APIs integration for spatial analysis.
It includes components for integrating usage of Google Solar API, Air Quality API and Pollen API into geospatial analytics pipelines created with Workflows.
Description
This component retrieves data from the Building Insights endpoint in Google Solar API for a set of points in a table. It will return data from buildings that intersect with the points in the input table. Find more information about Solar API here.
Inputs
Source table: A table that contains points that will be used to get data from Solar API
Settings
Geography column: A column in the input table that contains the points to be used to get data from Solar API.
Google Maps Platform API Key: A Google Maps Platform API Key with access to Solar API
Required quality: The minimum quality level allowed in the results. No result with lower quality than this will be returned. Not specifying this is equivalent to restricting to HIGH quality only.
Fields from response: A list of fields to be included in the JSON response. Each field should follow the JSON path of the response object, and the result will preserve the same structure. Find a complete reference of available values in the Solar API documentation. Example: imageryDate, solarPotential.panelCapacityWatts, solarPotential.buildingStats.areaMeters2
Output
Result table: A table with the same columns as the source table plus a column that contains a JSON object with the requested properties.
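Since the requested fields are returned inside a single JSON column, a follow-up step (for example a Custom SQL Select component) can extract individual values; the column name solar_response below is hypothetical:
SELECT
  *,
  CAST(JSON_VALUE(solar_response, '$.solarPotential.panelCapacityWatts') AS FLOAT64) AS panel_capacity_watts,
  CAST(JSON_VALUE(solar_response, '$.solarPotential.buildingStats.areaMeters2') AS FLOAT64) AS building_area_m2
FROM `$a`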
Description
This component retrieves data from the Current Conditions endpoint in Google Air Quality API for a set of points in a table. Find more information about Air Quality API here.
Inputs
Source table: A table that contains points that will be used to get data from Air Quality API
Settings
Geography column: A column in the input table that contains the points to be used to get data from the Air Quality API.
Google Maps Platform API Key: A Google Maps Platform API Key with access to Air Quality API
Universal Air Quality Index: Optional. Check this option to include the Universal Air Quality Index in the response.
Local Air Quality Index: Optional. Check this option to include a Local Air Quality Index in the response, based on each point location. Checking this option automatically adds the LOCAL_AQI extra computation.
Additional computations: Optional. Add a list of extra computations to be added to the result. Example: HEALTH_RECOMMENDATIONS, DOMINANT_POLLUTANT_CONCENTRATION
Language Code: Optional. Type a language code to be used in the response. Example: es for Spanish.
Universal AQI color palette: Optional. Select a color palette for the Universal Air Quality Index. Only applies if Universal AQI is requested.
Custom Local AQI: Optional. Type a JSON object that defines a custom local AQI. Only applies if Local AQI is requested. Find a complete reference of custom local AQIs in the Air Quality API documentation. Example: {"regionCode": "us", "aqi": "usa_epa_nowcast"}
Output
Result table: A table with the same columns as the source table plus a column that contains a JSON object with the response. Find a reference of the expected response here.
Description
This component retrieves data from the Forecast endpoint in Google Air Quality API for a set of points in a table. Find more information about Air Quality API here.
Inputs
Source table: A table that contains points that will be used to get data from Air Quality API
Settings
Geography column: A column in the input table that contains the points to be used to get data from the Air Quality API.
Google Maps Platform API Key: A Google Maps Platform API Key with access to Air Quality API.
Range Type: Select 'Period' or 'Date time' to define the type of forecast to be obtained from the API.
Start Time: Defines a start time. Only applies if 'Period' is selected.
End Time: Defines an end time. Only applies if 'Period' is selected.
Date Time: Defines a specific timestamp to be used for the forecast. Only applies if 'Date time' is selected.
Universal Air Quality Index: Optional. Check this option to include the Universal Air Quality Index in the response.
Local Air Quality Index: Optional. Check this option to include a Local Air Quality Index in the response, based on each point location. Checking this option automatically adds the LOCAL_AQI extra computation.
Additional computations: Optional. Add a list of extra computations to be added to the result. Example: HEALTH_RECOMMENDATIONS, DOMINANT_POLLUTANT_CONCENTRATION
Language Code: Optional. Type a language code to be used in the response. Example: es for Spanish.
Universal AQI color palette: Optional. Select a color palette for the Universal Air Quality Index. Only applies if Universal AQI is requested.
Custom Local AQI: Optional. Type a JSON object that defines a custom local AQI. Only applies if Local AQI is requested. Find a complete reference of custom local AQIs in the Air Quality API documentation. Example: {"regionCode": "us", "aqi": "usa_epa_nowcast"}
Output
Result table: A table with the same columns as the source table plus a column that contains a JSON object with the response. Find a reference of the expected response here.
Description
This component retrieves data from the History endpoint in Google Air Quality API for a set of points in a table. Find more information about Air Quality API here.
Inputs
Source table: A table that contains points that will be used to get data from Air Quality API
Settings
Geography column: A column in the input table that contains the points to be used to get data from the Air Quality API.
Google Maps Platform API Key: A Google Maps Platform API Key with access to Air Quality API
Range Type: Select 'Period', 'Date time', or 'Hours' to define the type of historical data to be obtained from the API.
Start Time: Defines a start time. Only applies if 'Period' is selected.
End Time: Defines an end time. Only applies if 'Period' is selected.
Date Time: Defines a specific timestamp to be used for the historical data. Only applies if 'Date time' is selected.
Hours: Defines a number of hours before the current time for which to get the historical data. Only applies if 'Hours' is selected.
Universal Air Quality Index: Optional. Check this option to include the Universal Air Quality Index in the response.
Local Air Quality Index: Optional. Check this option to include a Local Air Quality Index in the response, based on each point location. Checking this option automatically adds the LOCAL_AQI extra computation.
Additional computations: Optional. Add a list of extra computations to be added to the result. Example: HEALTH_RECOMMENDATIONS, DOMINANT_POLLUTANT_CONCENTRATION
Language Code: Optional. Type a language code to be used in the response. Example: es for Spanish.
Universal AQI color palette: Optional. Select a color palette for the Universal Air Quality Index. Only applies if Universal AQI is requested.
Custom Local AQI: Optional. Type a JSON object that defines a custom local AQI. Only applies if Local AQI is requested. Find a complete reference of custom local AQIs in the Air Quality API documentation. Example: {"regionCode": "us", "aqi": "usa_epa_nowcast"}
Output
Result table: A table with the same columns as the source table plus a column that contains a JSON object with the response. Find a reference of the expected response here.
Description
This component retrieves data from the Forecast endpoint in Google Pollen API for a set of points in a table.
Inputs
Source table: A table that contains points that will be used to get data from Pollen API
Settings
Geography column: A column in the input table that contains the points to be used to get data from the Pollen API.
Google Maps Platform API Key: A Google Maps Platform API Key with access to Pollen API
Days: Set a number of days (up to 4) to get a Pollen forecast.
Language Code: Optional. Type a language code to be used in the response. Example: es for Spanish.
Include Plants Description: Check this option to include a description of the plants in the response.
Output
Result table: A table with the same columns as the source table plus a column that contains a JSON object with the response. Find a reference of the expected response here.
Components to process data from an external format, such as WKT or GeoJSON, into a table format including the processed geographies/geometries in a column when applicable.
Description
This component allows drawing new custom geospatial features.
By clicking on "Draw features" you will be able to use a point and click UI to define custom features (points, lines or polygons) that will be used to generate a GeoJSON string.
Inputs
GeoJSON string [String]
Outputs
Result table [Table]
Description
This component generates a new column with the GeoJSON representation of geographies in another column.
Inputs
Source table [Table]
Column with geographies definition [Column]
Outputs
Result table [Table]
External links
Description
This component generates a new column with the WKT representation of geographies in another column.
Source table [Table]
Column with geographies definition [Column]
Outputs
Result table [Table]
External links
Description
This component creates a new column with geometries created from the values in another column. This column can contain geometries defined in either WKT or GeoJSON format.
Inputs
Source table [Table]
Text column with geography definition [Column]
Outputs
Result table [Table]
External links
Description
This component creates a new table with an additional column with point geographies. Geographies are created based on columns containing latitude and longitude values.
Inputs
Source table [Table]
Latitude column [Column]
Longitude column [Column]
Outputs
Result table [Table]
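The underlying operation is equivalent to the following sketch (table and column names are hypothetical); note that ST_GEOGPOINT takes the longitude first:
SELECT
  *,
  ST_GEOGPOINT(longitude, latitude) AS geom   -- longitude first, then latitude
FROM `my_project.my_dataset.locations`;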
External links
Description
This creates a new table from a GeoJSON string.
By clicking on "Draw features" you will be able to use a point and click UI to define custom features (points, lines or polygons) that will be used to generate the GeoJSON string.
Inputs
GeoJSON string [String]
Outputs
Result table [Table]
Description
This component creates a new column with geometries in WKB format created from the values in another column.
Inputs
Source table
Settings
Column with geometries definition: Expects a String column that contains a GeoJSON definition of geometries.
Outputs
Output table: This component returns a table with the same columns as the input, plus an additional binary column that contains WKB geometries.
Description
This component creates a new column with geometries in WKB format created from the values in another column.
Inputs
Source table
Settings
Column with geometries definition: Expects a String column that contains WKT definition of geometries.
Outputs
Output table: This component returns a table with the same columns as the input, plus an additional binary column that contains WKB geometries.
Components to join two or more streams of data.
Description
This component performs a cross join.
Inputs
Main table [Table]
Secondary table [Table]
Outputs
Result table [Table]
External links
Description
This component combines rows from two separate tables based on a related column between them.
Inputs
Main table [Table]
Secondary table [Table]
Column on main table [Column]
Outputs
Result table [Table]
External links
Description
This component combines two input tables based on common fields between the two tables. You can also join two tables based on record position.
Inputs
Left table [Table]
Right table [Table]
Join By [Selection]
Outputs
Table with unjoined rows from left table [Table]
Table with unjoined rows from right table [Table]
Joined table [Table]
Description
This component performs a spatial join between two tables, based on a spatial predicate.
Inputs
Main table [Table]
Secondary table [Table]
Geo column on main table [Column]
Outputs
Result table [Table]
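Conceptually, the result is similar to this sketch (the tables, columns and the intersects predicate are hypothetical choices):
SELECT
  main.*,
  sec.category,      -- hypothetical columns brought in from the secondary table
  sec.population
FROM `my_project.my_dataset.main_table` AS main
JOIN `my_project.my_dataset.secondary_table` AS sec
  ON ST_INTERSECTS(main.geom, sec.geom);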
Description
This component establishes the spatial relationship (contains, intersects, touches, etc.) between two sets of spatial objects. The component accepts a set of spatial objects from the Target Input and a set of spatial objects from the Universe Input. At least one input stream should contain Polygon type spatial objects.
Inputs
Targets [Table]
Universe [Table]
Target Spatial Object Field [Column]
Outputs
Result table [Table]
Description
This component creates a new table containing the difference between the geographies of the input table and the combined boundary of the geographies in another table. That is, the part of the original geography that doesn't fall within the boundary of the ancillary table.
Geographies in the secondary table must be of type polygon.
Inputs
Source table [Table]
Polygons table [Table]
Geo column in source table [Column]
Outputs
Result table [Table]
External links
Description
This component creates a new table containing the intersection between the geographies of the input table and the geographies in another table.
Columns from both tables are copied to the output one, and a new one with the intersection is added.
Inputs
Source table [Table]
Polygons table [Table]
Geo column in source table [Column]
Outputs
Result table [Table]
External links
Description
This component generates a new table that is the union of several input ones.
The input tables must have the same schema.
Inputs
Tables [Table]
Outputs
Result table [Table]
External links
As you will soon see, Workflows can range from simple, with only a few steps, to large and complex, with many steps, sections, and outputs. So what are the best ways to manage your workflows and ensure that they are useful, readable, and manageable? Below are our tips for making Workflows work for you.
Before you get started, and at each step of the process, make sure you explore your data to see what is in the dataset, including its columns and data types. This is important because:
Extension Package provided by CARTO
The Snowflake ML extension package for CARTO Workflows includes a variety of components that enable users to integrate machine learning workflows with geospatial data. These components allow for creating, evaluating, explaining, forecasting, and managing ML models directly within CARTO Workflows, utilizing Snowflake ML’s capabilities.
Description
This component imports a pre-trained model into the current workflow. If the name provided is not fully qualified, it will default to the connection's default database and the PUBLIC schema. The component assumes that the provided FQN points to an existing Snowflake ML model.
Settings
Model's FQN: Fully qualified name for the model to be imported.
Outputs
Output table: This component generates a single-row table with the FQN of the imported model.
Description
This component trains a classification model on the provided input data. If the name provided is not fully qualified, it will default to the connection's default database and the PUBLIC schema.
For more details, please refer to the SNOWFLAKE.ML.CLASSIFICATION official documentation in Snowflake.
Inputs
Input table: A data table that is used as input for the model creation.
Settings
Model's FQN: Fully qualified name for the model to be saved as.
ID Column: Column containing a unique identifier per sample.
Target Column: Column to be used as label in the training data.
Data Split: Whether or not to perform a train/evaluation split on the data. Choosing SPLIT is required to use the Evaluate component with the resulting model.
Test Fraction: Fraction of the data to reserve for evaluation. A value of 0.2 will reserve 20% of the data for evaluation. Only applies if Data Split is set to SPLIT.
Outputs
Output table: This component generates a single-row table with the FQN of the trained model.
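For reference, the Snowflake SQL generated by this component is roughly equivalent to the following sketch (hypothetical model and table names; additional options such as the evaluation split are omitted):
CREATE OR REPLACE SNOWFLAKE.ML.CLASSIFICATION MY_DB.PUBLIC.MY_CLASSIFIER(
  INPUT_DATA => SYSTEM$REFERENCE('TABLE', 'MY_DB.PUBLIC.TRAINING_DATA'),
  TARGET_COLNAME => 'LABEL'
);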
Description
This component trains a forecasting model on the provided input data. If the name provided is not fully qualified, it will default to the connection's default database and the PUBLIC schema.
For more details, please refer to the SNOWFLAKE.ML.FORECAST official documentation in Snowflake.
Inputs
Input table: A data table that is used as input for the model creation.
Settings
Model's FQN: Fully qualified name for the model to be saved as.
Time Series ID Column: Column containing a unique identifier per time series.
Timestamp Column: Column containing the series' timestamp in DATE or DATETIME format.
Target Column: Column to be used as target in the training data.
Consider exogenous variables: whether to consider exogenous variables for forecasting. If checked, the future values for the variables must be provided when forecasting. All variables in the input will be considered except the specified time series ID, timestamp and target column.
Method: which method to use when fitting the model. It can be best or fast.
Sample Frequency: the frequency of the time series. It can be auto or manual.
Period: number of units to define the sampling frequency. Only applies when the Sample Frequency has been set to manual.
Time Unit: time unit used to define the frequency. It can be seconds, minutes, hours, days, weeks, months, quarters, or years. Only applies when Sampling Frequency has been set to manual.
Aggregation (categorical): aggregation function used for categorical columns if needed due to the sampling frequency. It can be mode, first, or last.
Aggregation (numeric): aggregation function used for numeric columns if needed due to the sampling frequency. It can be mean, median, mode, min, max, sum, first, or last.
Aggregation (target): aggregation function used for the target column if needed due to the sampling frequency. It can be mean, median, mode, min, max, sum, first, or last.
Outputs
Output table: This component generates a single-row table with the FQN of the trained model.
Description
This component uses a pre-trained classification model (using Get Model by Name or Create Classification Model components) to perform predictions on some given input data.
For more details, please refer to the !PREDICT function official documentation in Snowflake.
Inputs
Model table: the pre-trained classification model.
Input table: A data table that is used as input for inference.
Settings
Keep input columns: Whether to keep all the input columns in the output table.
ID Column: Column containing a unique identifier per sample. Only applies when Keep input columns is set to false.
Outputs
Output table: The model's predictions.
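Roughly, this step maps to Snowflake's !PREDICT method; a minimal sketch with hypothetical names, where the prediction is returned as a JSON object containing the predicted class and class probabilities:
SELECT
  input.*,
  -- Run inference on every row, passing all columns as the model's input
  MY_DB.PUBLIC.MY_CLASSIFIER!PREDICT(INPUT_DATA => OBJECT_CONSTRUCT(*)) AS prediction
FROM MY_DB.PUBLIC.PREDICTION_DATA AS input;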
Description
This component uses a pre-trained forecast model (using Get Model by Name or Create Forecasting Model components) to perform predictions on some given input data.
For more details, please refer to the !FORECAST function official documentation in Snowflake.
Inputs
Model table: the pre-trained forecast model.
Input table: A data table that is used as input for inference. Only needed if the model has been trained using exogenous variables.
Settings
Consider exogenous variables: whether the model was trained to use exogenous variables or not.
Number of periods: number of periods to forecast per time series. Only applies if Consider exogenous variables is false.
Time Series ID Column: Column containing a unique identifier per time series. Only applies if Consider exogenous variables is true.
Timestamp Column: Column containing the series' timestamp in DATE or DATETIME format. Only applies if Consider exogenous variables is true.
Prediction Interval: Expected confidence of the prediction interval.
Keep input columns: Whether to keep all the input columns in the output table.
Outputs
Output table: The model's predictions.
Description
This component returns some evaluation metrics for a pre-trained classification model using Get Model by Name or Create Classification Model components.
For more details, please refer to the !SHOW_EVALUATION_METRICS and !SHOW_GLOBAL_EVALUATION_METRICS functions official documentation in Snowflake.
Inputs
Model table: the pre-trained classification model.
Settings
Class level metrics: Whether to obtain the per-class evaluation metrics or the overall evaluation metrics.
Outputs
Output table: The model's evaluation metrics.
Description
This component returns some evaluation metrics for a pre-trained forecast model using Get Model by Name or Create Forecasting Model components.
For more details, please refer to the !SHOW_EVALUATION_METRICS function official documentation in Snowflake.
Inputs
Model table: the pre-trained forecast model.
Input table (optional): additional out-of-sample data to compute the metrics on.
Settings
Compute metrics on additional out-of-sample data: When checked, the component will compute cross-validation metrics on additional out-of-sample data. Otherwise, the component will return the metrics generated at training time.
Time Series ID Column: Column containing a unique identifier per time series. Only applies when the metrics are being computed on additional out-of-sample data.
Timestamp Column: Column containing the series' timestamp in DATE or DATETIME format. Only applies when the metrics are being computed on additional out-of-sample data.
Target Column: Column to use as label in the input data. Only applies when the metrics are being computed on additional out-of-sample data.
Prediction Interval: Expected confidence of the prediction interval. Only applies when the metrics are being computed on additional out-of-sample data.
Outputs
Output table: The model's evaluation metrics.
Description
This component displays the feature importances per variable of a pre-trained classification model.
For more details, please refer to the Snowflake's !SHOW_FEATURE_IMPORTANCE function documentation.
Inputs
Model table: the pre-trained classification model.
Outputs
Output table: a table with the feature importance per variable.
Description
This component displays the feature importances per variable of a pre-trained forecast model.
For more details, please refer to the Snowflake's !SHOW_FEATURE_IMPORTANCE function documentation.
Inputs
Model table: the pre-trained forecast model.
Outputs
Output table: a table with the feature importance per variable.
Column on secondary table [Column]
Join type [Selection]
Column in Left Table [Column]
Column in Right Table [Column]
Geo column on secondary table [Column]
Spatial predicate [Selection]
Universe Spatial Object Field [Column]
Type of Spatial Match [Selection]
Output Intersection Object (Intersects Only) [Boolean]
Geo column in polygons table [Column]
Geo column in secondary table [Column]
You may be missing a column
You might have too many columns that you don’t need
You might have data nested in arrays or JSON
You might have null or missing data that could cause issues later
There are certainly other instances that you might encounter in your data but understanding the structure and limits of your data early on will help prevent issues later in your workflow.
Similar to the point above, we highly suggest removing any columns you do not need in your final analysis, and doing the same throughout your workflow. Most components add new columns rather than modifying the data from the previous step, since we cannot presume which columns you do or do not want to keep. As the Workflow progresses, more and more columns will therefore accumulate. Because of this, we recommend using the SELECT component at various stages to pick only the specific columns you need, keeping your data clean and easy to read.
Many times you may need to cast your data to other data types. For example, a postal code could be stored as a numeric value rather than a text value. Make sure these are appropriately cast into:
The data format you want to keep
The data format required by the components you need to use
You can use the Cast component in Workflows to do this. Alternatively, you can cast within the Select component by selecting all columns using '*' and then adding the SQL CAST syntax, as follows: *, CAST(zipcode AS TEXT)
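For instance, a minimal full query for this pattern (zipcode and my_table are hypothetical names; the exact type name varies by warehouse, e.g. STRING in BigQuery or TEXT in PostgreSQL):
SELECT
  *,
  -- Re-create the postal code as text so that leading zeros are preserved
  CAST(zipcode AS STRING) AS zipcode_text
FROM my_table;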
Similar to the point above, it is important to prepare dates or timestamps in the source data warehouse prior to running Workflows. Some functions only accept dates and others only accept timestamps, so ensuring you have the right data type will help make things run smoothly.
Using the labeling tools in Workflows can help you keep your Workflow organized and help others understand what is happening in each step. There are two different approaches to labeling your Workflows:
Label small sections individually across the Workflow canvas
Label larger parts of the Workflow and then add smaller sections as necessary
Doing this while you are constructing your Workflow will help when you revisit your work or share work with another user.
As the size of your workflow increases, there may come a point when you need to break your Workflow into smaller sections. While it is not impossible to build large Workflows, depending on the complexity it may in fact be easier to split a Workflow into separate steps. This helps in cases such as:
If the source data changes and is inconsistent this may break a part of your Workflow. Having smaller Workflows can help you identify the issue faster.
Smaller Workflows are likely to run faster
Once you have designed a workflow, it is common to want to optimize the result table by creating a cluster over one or more columns.
Cluster and index optimization physically and/or logically co-locates records with similar values (the cluster column) to reduce the number of fetches or the access time through prefetching techniques.
These advanced options depend on the destination data warehouse and are available for any component that generates one or more output tables.
Users can optimize their data by following these simple steps:
Create a model with one or more components generating an output table and select one of those components, e.g. as in the model below.
Fill in all component parameters. This is necessary to generate the final output schema, which populates the dropdown menu for the optimization column.
Select the column to be used for the optimization. The mechanism will depend on the data warehouse:
For BigQuery, Cluster by
For Snowflake, two settings: Search optimization column and Cluster by
For Redshift, Cluster by
For PostgreSQL, Index column and Cluster by index
BigQuery allows selecting a Cluster by column with any of the following types:
Geography
Geometry
Boolean
Number
String
Date
Datetime
Time
Timestamp
More info about BigQuery clustering.
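For context, the SQL that would create a clustered output table in BigQuery is conceptually similar to this sketch (hypothetical project, dataset, table and column names):
CREATE TABLE my_project.my_dataset.output_table
CLUSTER BY geom
AS
SELECT *
FROM my_project.my_dataset.input_table;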
Snowflake allows selecting a Search optimization column with any of the following types:
Geography
Geometry
This restriction is because ST_GEOHASH-based ordering is used. More info about Snowflake ST_GEOHASH and the applied general search optimization.
Whilst the Cluster by column can have the following types:
Boolean
Number
String
More info about Snowflake Cluster optimization here.
Redshift allows selecting a Cluster by column as SORTKEY with any of the following types:
Boolean
Number
String
More info about Redshift SORTKEY optimization.
Postgres allows selecting an Index column, on which a GIST index is created, with any of the following types:
Geography
Geometry
Boolean
Number
String
More about Indexes here.
Whilst the Cluster by index column can be of any type. More info about Postgres Cluster optimization here.
Components to enrich your data with variables from other data sources. These components work for both simple features and spatial index grids.
Description
This component enriches a target table with data from a source. Enriching here means adding columns with aggregated data from the source that matches the target geographies.
The target (which is the upper input connection of this component) must have a column that contains H3 indices, which will be used to join with the source.
The source (lower input connection) can be either a CARTO Data Observatory subscription or a table (or the result from another component) with a geography column.
For the enrichment operation the CARTO Analytics Toolbox is required, and one of the following procedures will be called:
DATAOBS_ENRICH_GRID if the source is a Data Observatory subscription
ENRICH_GRID otherwise
Inputs
Target geo column: it's the column of the target that will be used to join with the source and select the rows that will be aggregated for each target row.
Source geo column: (only necessary for non-DO sources) is the column of the source that will join with the target.
Variables: this allows selecting the data from the source that will be aggregated and added to the target.
For spatially smoothed enrichments that take into account the surrounding cells, use the following input parameters:
Kring size: size of the k-ring where the decay function will be applied. This value can be 0, in which case no k-ring will be computed and the decay function won't be applied.
Decay function: decay function to aggregate and smooth the data. Supported values are uniform, inverse, inverse_square and exponential.
Outputs
Result table [Table]
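Ignoring the optional k-ring smoothing, the enrichment is conceptually equivalent to aggregating the source per H3 cell and joining it to the target grid; a simplified BigQuery sketch with hypothetical tables and variables (the actual component calls the Analytics Toolbox procedures listed above):
SELECT
  t.*,
  s.population_sum,
  s.income_avg
FROM target_grid AS t
LEFT JOIN (
  -- Aggregate the selected variables per H3 cell
  SELECT h3, SUM(population) AS population_sum, AVG(income) AS income_avg
  FROM source_table
  GROUP BY h3
) AS s
  ON t.h3 = s.h3;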
Description
This component enriches a target table with data from a source. Enriching here means adding columns with aggregated data from the source that matches (intersects) the target geographies.
The target (which is the upper input connection of this component) must have a geo column, which will be used to intersect with the source.
The source (lower input connection) can be either a CARTO Data Observatory subscription or a table (or the result from another component) with a geo column.
For the enrichment operation the CARTO Analytics Toolbox is required, and one of the following procedures will be called:
DATAOBS_ENRICH_POINTS if the source is a Data Observatory subscription
ENRICH_POINTS otherwise
Inputs
Target geo column: it's the column of the target that will be used to intersect with the source and select the rows that will be aggregated for each target row.
Source geo column: (only necessary for non-DO sources) is the column of the source that will intersect with the target.
Variables: this allows selecting the data from the source that will be aggregated and added to the target.
Outputs
Result table [Table]
Description
This component enriches a target table with data from a source. Enriching here means adding columns with aggregated data from the source that matches (intersects) the target geographies.
The target (which is the upper input connection of this component) must have a geo column, which will be used to intersect with the source.
The source (lower input connection) can be either a CARTO Data Observatory subscription or a table (or the result from another component) with a geo column.
For the enrichment operation the CARTO Analytics Toolbox is required, and one of the following procedures will be called:
DATAOBS_ENRICH_POLYGONS if the source is a Data Observatory subscription
ENRICH_POLYGONS otherwise
Inputs
Target geo column: it's the column of the target that will be used to intersect with the source and select the rows that will be aggregated for each target row.
Source geo column: (only necessary for non-DO sources) is the column of the source that will intersect with the target.
Variables: this allows selecting the data from the source that will be aggregated and added to the target.
Outputs
Result table [Table]
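Conceptually (and ignoring the area-weighted interpolation that the Analytics Toolbox procedures apply), the operation resembles an intersects join followed by an aggregation; a simplified BigQuery sketch with hypothetical names:
SELECT
  t.*,
  e.population_sum
FROM target_polygons AS t
LEFT JOIN (
  -- Aggregate the source variable over every target polygon it intersects
  SELECT tp.target_id, SUM(sp.population) AS population_sum
  FROM target_polygons AS tp
  JOIN source_polygons AS sp
    ON ST_INTERSECTS(tp.geom, sp.geom)
  GROUP BY tp.target_id
) AS e
  ON t.target_id = e.target_id;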
Description
This component uses a data source (either a table or a DO subscription) to enrich another target table using weights to control the enrichment.
Inputs
Target table to be enriched
Source table with data for the enrichment
Weights table with data to weight the enrichment
Settings
Target polygons geo column: Select the column from the target table that contains a valid geography.
Source table geo column: Select the column from the source table that contains a valid geography.
Variables: Select a list of variables and aggregation method from the source table to be used to enrich the target table. Valid aggregation methods are:
Weights geo column: Select the column from the weights table that contains a valid geography.
Weights variable: Select one variable and aggregation operation to be used as weight for the enrichment.
When the source for the enrichment is a standard table, the weights source can't be a DO subscription.
The same limitation applies when the source for the enrichment is a DO subscription; the weights source can't be a standard table.
Outputs
Output table with the following schema
All columns from Target
A column from each variable in 'Variables', named like 'name_sum', 'name_avg', 'name_max' depending on the original column name and the aggregation method.
Description
This component enriches a target table with data from a source. Enriching here means adding columns with aggregated data from the source that matches the target geographies.
The target (which is the upper input connection of this component) must have a column that contains Quadbin indices, which will be used to join with the source.
The source (lower input connection) can be either a CARTO Data Observatory subscription or a table (or the result from another component) with a geography column.
For the enrichment operation the CARTO Analytics Toolbox is required, and one of the following procedures will be called:
DATAOBS_ENRICH_GRID if the source is a Data Observatory subscription
ENRICH_GRID otherwise
Inputs
Target geo column: it's the column of the target that will be used to join with the source and select the rows that will be aggregated for each target row.
Source geo column: (only necessary for non-DO sources) is the column of the source that will join with the target.
Variables: this allows selecting the data from the source that will be aggregated and added to the target.
For spatially smoothed enrichments that take into account the surrounding cells, use the following input parameters:
Kring size: size of the k-ring where the decay function will be applied. This value can be 0, in which case no k-ring will be computed and the decay function won't be applied.
Decay function: decay function to aggregate and smooth the data. Supported values are uniform, inverse, inverse_square and exponential.
Outputs
Result table [Table]
Components to create tilesets out of vector and spatial index data.
Valid table names for the tileset's destination:
Provided FQNs need to follow the naming convention of each data warehouse. Invalid names (according to each data warehouse's constraints) will make the component fail. Check the documentation for each.
Description
This component creates an aggregation tileset from an input containing H3 cell indexes. On each level of the tileset, cells will be aggregated into their parents, along with their associated properties.
Inputs
Input table: This component expects a table that has at least one column containing H3 indexes.
Settings
Tileset destination: select a destination that is available on your connection to store the resulting tileset table.
H3 column: The column in the input table that contains the H3 index.
H3 input data resolution: It is required to manually enter the actual resolution of the indexes contained in your input.
Minimum H3 tileset resolution: Your tileset will be generated at a range of different H3 resolutions. This setting specifies the minimum resolution in that range. Take into account that H3 resolutions don't match exactly the zoom levels in a web map.
Properties: Select a property from the input columns and an aggregation method to add aggregated properties to the tileset. The aggregation will be computed for every cell on each resolution level, considering the child cells in the input's data.
Add custom expression: Check this option to define additional custom properties. The custom properties are specified on a JSON object that needs to follow this format:
{
"new_prop_name": {
"formula": "COUNT(*)",
"type": "Number"
},
"most_common_category": {
"formula": "APPROX_TOP_COUNT(category, 1)[OFFSET(0)].value",
"type": "String"
},
...
}
The expression in formula needs to follow your data warehouse's SQL syntax.
Advanced options
Metadata: Use this setting (as a JSON input) to define "name", "description" and "legend" metadata objects to be included in the TileJSON. Other custom objects with arbitrary names can also be added, resulting in an "extra_metadata" object in the TileJSON.
Output
This component doesn't have an output that can be connected to other nodes downstream. The generated tileset will be stored on the specified destination table.
Description
This component creates a tileset from an input containing Points. It will aggregate points that are close to each other, along with their properties.
Inputs
Input table: This component expects a table that has at least one geospatial column containing points, of type GEOMETRY or GEOGRAPHY depending on the connection provider.
Settings
Tileset destination: select a destination that is available on your connection to store the resulting tileset table.
Geo column: A column in the input table that contains the Points.
Minimum zoom level: Your tileset will be generated at a range of different zoom levels. This setting specifies the lower limit in that range.
Maximum zoom level: Your tileset will be generated at a range of different zoom levels. This setting specifies the higher limit in that range.
Advanced options
Minimum zoom column: A column that controls the minimum zoom level where that specific point should be considered to be included in the tile.
Maximum zoom column: A column that controls the maximum zoom level where that specific point should be considered to be included in the tile.
Maximum features per tile: The maximum number of features that can be included in the tileset. This setting is applied after the aggregation is computed. When the limit is reached, points are going to be dropped from the tile.
Output
This component doesn't have an output that can be connected to other nodes downstream. The generated tileset will be stored on the specified destination table.
Description
This component creates an aggregation tileset from an input containing Quadbin cell indexes. On each level of the tileset, cells will be aggregated into their parents, along with their associated properties.
Inputs
Input table: This component expects a table that has at least one column containing Quadbin indexes.
Settings
Tileset destination: select a destination that is available on your connection to store the resulting tileset table.
Quadbin column: The column in the input table that contains the Quadbin index.
Quadbin input data resolution: It is required to manually enter the actual resolution of the indexes contained in your input.
Minimum Quadbin tileset resolution: Your tileset will be generated at a range of different Quadbin resolutions. This setting specifies the minimum resolution in that range. Take into account that Quadbin resolutions levels are equivalent to zoom levels on a CARTO map.
Properties: Select a property from the input columns and an aggregation method to add aggregated properties to the tileset. The aggregation will be computed for every cell on each resolution level, considering the child cells in the input's data.
Add custom expression: Check this option to define additional custom properties. The custom properties are specified on a JSON object that needs to follow this format:
{
"new_prop_name": {
"formula": "COUNT(*)",
"type": "Number"
},
"most_common_category": {
"formula": "APPROX_TOP_COUNT(category, 1)[OFFSET(0)].value",
"type": "String"
},
...
}
The expression in formula needs to follow your data warehouse's SQL syntax.
Advanced options
Metadata: Use this setting (as a JSON input) to define "name", "description" and "legend" metadata objects to be included in the TileJSON. Other custom objects with arbitrary names can also be added, resulting in an "extra_metadata" object in the TileJSON.
Output
This component doesn't have an output that can be connected to other nodes downstream. The generated tileset will be stored on the specified destination table.
Description
This component creates a tileset from an input containing Points, Lines, or Polygons.
Inputs
Input table: This component expects a table that has at least one geospatial column of type GEOMETRY or GEOGRAPHY depending on the connection provider.
Settings
Tileset destination: select a destination that is available on your connection to store the resulting tileset table.
Geo column: The column in the input table that contains the geometries.
Minimum zoom level: Your tileset will be generated at a range of different zoom levels. This setting specifies the lower limit in that range.
Maximum zoom level: Your tileset will be generated at a range of different zoom levels. This setting specifies the higher limit in that range.
Advanced options
Minimum zoom column: A column that controls the minimum zoom level at which that specific feature should be considered for inclusion in the tile.
Maximum zoom column: A column that controls the maximum zoom level at which that specific feature should be considered for inclusion in the tile.
Maximum features per tile: The maximum number of features that can be included in the tileset. This setting is applied after the aggregation is computed. When the limit is reached, features are going to be dropped from the tile.
Output
This component doesn't have an output that can be connected to other nodes downstream. The generated tileset will be stored on the specified destination table.
In the Sources explorer you can find all the data sources available via the data warehouse connection that you have selected when creating the workflow. Here you can learn more about how to create your data warehouse connections in CARTO.
There are three different types of data sources: Connection data, Data Observatory, and Files.
It is important to note that each workflow can currently only be associated with a single data warehouse connection, which is defined at the time of creating the workflow and cannot be modified afterwards.
Through "Connection data" you will have access to tables in schemas/datasets in your data warehouse. These sources can be used in workflows by just adding them to the canvas.
While it is supported to insert or update data in your table sources, modifying the schema of these tables (such as adding or removing columns, or changing data types) may cause unexpected behavior or errors during execution, as the schema won't be automatically updated in the workflow.
If you expect schema changes in your tables, it is recommended to use a and explicitly specify the column names to reflect those changes throughout the workflow.
If your CARTO organization has been provisioned in a cloud region outside of the US (Europe-West, Asia-Northeast or Australia-Southeast), you will be able to select between two different options in your Sources panel in Workflows:
Your organization's region. You will only have access to your own 'shared' and 'private' datasets within 'organization data'.
CARTO's region (US). This one will grant access to a different 'organization data' provisioned in the US region and also to CARTO's 'demo data'.
If you want to use data that is not available via the data warehouse connection used in the workflow, you always have the possibility to drag and drop files into your workflow at any time, which will get imported into your data warehouse via the established connection. Please find the list of supported file formats and size limits.
The files added by drag and drop, or selected from your browser will be stored and remain accessible from the same Workflow.
You can star connections, database/projects/schemas and tables to quickly access them later. For more information, see .
An extension package for CARTO Workflows that provides comprehensive embedding analytics capabilities. This package enables users to analyze, cluster, compare, and visualize high-dimensional vector embeddings derived from spatial data, satellite imagery, or any other geospatial data source.
All components in this extension package work with embeddings resulting from using the components in the extension package. They also work with other embeddings as inputs, as long as each embedding feature is stored in a column (as opposed to an array of vectors).
Maximum H3 tileset resolution: Your tileset will be generated at a range of different H3 resolutions. This setting specifies the maximum resolution in that range. Take into account that H3 resolutions don't match exactly the zoom levels in a web map.
Aggregation resolution: On each resolution level in the range set by the minimum and maximum tileset resolution, cells will be aggregated into their parent cells, along with their properties. This setting controls the granularity of that aggregation. For example, let's take a dataset that contains H3 cells of resolution 14. When generating the lower resolution levels of the tileset (bigger cells), the original cells are smaller than a pixel on your screen and can't be visualized. In this case, assuming that 'Aggregation resolution' is set to 5:
at resolution 4, the original cells are going to be aggregated into their level 9 parents (4+5);
at resolution 5, original cells will be aggregated into their level 10 parents (5+5);
at resolution 6, original cells will be aggregated into their level 11 parents (6+5).
In general, cells will be aggregated into a resolution level that is the result of: current resolution level + aggregation resolution.
"formula": "APPROX_TOP_COUNT(category, 1)[OFFSET(0)].value","type": "String"},...}formulaAggregation resolution: On each zoom level in the range set with the options above, points will be aggregated into a grid, along with their properties. This setting controls the granularity of that aggregation. In other words, it controls the the size of the grid cells used to compute the aggregation. For example, assuming the Aggregation resolution is set to 8:
at zoom level 6, the points will be aggregated into a grid where cells are equivalent to tiles of zoom level 14 (6+8).
At zoom level 7, the points will be aggregated into a grid where cells are equivalent to tiles of zoom level 15 (7+8).
At zoom level 8, the points will be aggregated into a grid where cells are equivalent to tiles of zoom level 16 (8+8).
In general, the size of the grid's cells is going to be determined by current zoom level + aggregation resolution.
Aggregation placement: Since points are aggregated into a grid, there are different options for the resulting placement:
cell-centroid: The resulting feature in the tileset is a point at the cell's centroid.
cell: The resulting feature in the tileset is the cell's boundary.
features-any: The resulting feature in the tileset is a random point from the ones that are being aggregated on that cell. This helps avoid a gridded effect on the visualization.
features-centroid: The resulting feature in the tileset is a point that represents the centroid of the points aggregated on that cell. This helps avoid a gridded effect while keeping a sense of the distribution of points within the cell.
Properties: Select a property from the input columns and an aggregation method to add aggregated properties to the tileset. The aggregation will be computed for every cell on each resolution level, considering the child cells in the input's data.
Add custom expression: Check this option to define additional custom properties. The custom properties are specified on a JSON object that needs to follow this format:
{
"new_prop_name": {
"formula": "COUNT(*)",
"type": "Number"
},
"most_common_category": {
"formula": "APPROX_TOP_COUNT(category, 1)[OFFSET(0)].value",
"type": "String"
},
...
}
The expression in formula needs to follow your data warehouse's SQL syntax.
This setting is not available in all data warehouses.
Maximum kilobytes per tile: The maximum resulting size in kilobytes. This limit is based on an estimation that only considers the size of the geometries and not the included properties. When the limit is reached, points are going to be dropped from the tile.
This setting is not available in all data warehouses.
Tile feature order: When any of the limits above is reached, points are dropped from the tile. This setting expects a SQL expression to define the order of priority on the resulting points. The higher in the ordering, the less likely to be dropped.
Metadata: Use this setting (as a JSON input) to define "name", "description" and "legend" metadata objects to be included in the TileJSON. Other custom objects with arbitrary names can also be added, resulting in an "extra_metadata" object in the TileJSON.
Maximum Quadbin tileset resolution: Your tileset will be generated at a range of different Quadbin resolutions. This setting specifies the maximum resolution in that range.
Aggregation resolution: On each resolution level in the range set by the minimum and maximum tileset resolution, cells will be aggregated into their parent cells, along with their properties. This setting controls the granularity of that aggregation. For example, let's take a dataset that contains Quadbin cells of resolution 14. When generating the lower resolution levels of the tileset (bigger cells), the original cells are smaller than a pixel on your screen and can't be visualized. In this case, assuming that 'Aggregation resolution' is set to 5:
at resolution 4, the original cells are going to be aggregated into their level 9 parents (4+5);
at resolution 5, original cells will be aggregated into their level 10 parents (5+5);
at resolution 6, original cells will be aggregated into their level 11 parents (6+5).
In general, cells will be aggregated into a resolution level that is the result of: current resolution level + aggregation resolution.
"formula": "APPROX_TOP_COUNT(category, 1)[OFFSET(0)].value","type": "String"},...}formulaProperties: Select a list of properties from the input columns.
This setting is not available in all data warehouses.
Maximum kilobytes per tile: The maximum resulting size in kilobytes. This limit is based on an estimation that only considers the size of the geometries and not the included properties. When the limit is reached, points are going to be dropped from the tile.
This setting is not available in all data warehouses.
Tile feature order: When any of the limits above is reached, points are dropped from the tile. This setting expects a SQL expression to define the order of priority on the resulting points. The higher in the ordering, the less likely to be dropped.
Generate this tileset as a custom boundary for CARTO + deck.gl: Check this option if your tileset is going to be used as a boundary. Learn more about Boundaries.
Metadata: Use this setting (as a JSON input) to define "name", "description" and "legend" metadata objects to be included in the TileJSON. Other custom objects with arbitrary names can also be added, resulting in an "extra_metadata" object in the TileJSON.


For Data Observatory subscriptions, the variables can be selected from the DO variables of the subscription, identified by their variable slug;
for other sources they are the columns in the source table.
Each variable added must be assigned an aggregation method. You can add the same variable with different aggregation methods. At the moment only numeric variables are supported.
For Data Observatory subscriptions, the variables can be selected from the DO variables of the subscription, identified by their variable slug;
for other sources they are the columns in the source table.
Each variable added must be assigned an aggregation method. You can add the same variable with different aggregation methods. At the moment only numeric variables are supported.
For Data Observatory subscriptions, the variables can be selected from the DO variables of the subscription, identified by their variable slug;
for other sources they are the columns in the source table.
Each variable added must be assigned an aggregation method. You can add the same variable with different aggregation methods. At the moment only numeric variables are supported.
MIN: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). Thus, the value is not altered by the weight variable.
MAX: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). Thus, the value is not altered by the weight variable.
AVG: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). A weighted average is computed, using the value of the intersected weight variable as weights.
COUNT: It computes the number of features that contain the enrichment variable and are intersected by the input geography.
For Data Observatory subscriptions, the variables can be selected from the DO variables of the subscription, identified by their variable slug;
for other sources they are the columns in the source table.
Each variable added must be assigned an aggregation method. You can add the same variable with different aggregation methods. At the moment only numeric variables are supported.
Description
Analyzes temporal changes in embeddings by comparing vector representations across different time periods. It calculates distance metrics between consecutive time points for each location, enabling the identification of significant changes in spatial patterns, environmental conditions, or other time-varying characteristics encoded in the embeddings.
This component is particularly useful for monitoring temporal evolution, detecting anomalies, tracking environmental changes, and identifying locations that have undergone significant transformations over time.
Inputs
Embedding table: Table containing the embeddings with associated timestamps and location identifiers.
Settings
ID column: Column containing unique identifiers for each embedding. This column is used to track changes for the same entity across different time periods.
Date column: Column containing the timestamps or dates associated with each embedding. This column is used to order the data chronologically and identify consecutive time periods for comparison.
Distance metric: Distance function used to compare vectors across time (default: Cosine).
Cosine: Measures the alignment between two vectors based on the cosine of the angle between them. Values closer to 0 indicate high similarity in direction, while values closer to 2 indicate larger change. This metric is ideal when you want to focus on the direction of change rather than magnitude.
Euclidean: Measures the straight-line distance between vectors in the embedding space. Values closer to 0 indicate high similarity, while larger values indicate greater changes. This metric considers both direction and magnitude of change, making it suitable when the absolute values of the embeddings are meaningful.
Comparison method: Determines how embeddings are compared across time (default: Rolling pairs).
Rolling pairs: Compares the embeddings of each timestamp with those of the previous time period, allowing you to track incremental changes over time. This mode is ideal for detecting gradual changes and trends.
Cross pairs: Compares the embeddings of each timestamp with all remaining time periods, providing a broader view of how a location evolves relative to any other time point. This mode is useful for identifying significant transformations and outliers.
Reduce embedding dimensionality: Whether to use Principal Component Analysis (PCA) to reduce the dimensionality of the vector embeddings before change detection (default: false). Enabling this can improve performance and reduce noise by focusing on the most significant patterns in your data.
PCA explained variance: Variance threshold for PCA dimensionality reduction (0.01-0.99, default: 0.9). This determines how much of the original variance should be preserved. Higher values (closer to 1.0) retain more information but may include more noise, while lower values focus on the most important patterns, with loss of information.
Outputs
Output table: The change detection table with the following columns:
<geoid>: Original identifier for each location/entity.
<date>_from: Starting timestamp for the change period.
<date>_to: Ending timestamp for the change period.
distance: Calculated distance metric between the embeddings at the two time points.
change_score: Measure that quantifies the degree of change between two embedding vectors for the same location across different time periods. 0 means no change.
Metrics table: The metrics table with the following columns:
total_explained_variance_ratio: The proportion of variance in the original data that is explained by the PCA dimensionality reduction. This metric is particularly useful when PCA dimensionality reduction is applied, showing how much of the original information is preserved in the reduced dimensions.
number_of_principal_components: The number of principal components extracted from the PCA.
The component requires complete temporal coverage for each location - every geoid must have embedding values for each date in the dataset. This ensures accurate change detection by maintaining consistent temporal baselines for comparison.
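As a rough illustration of the Rolling pairs mode with the Euclidean metric, consecutive embeddings can be compared with a window function; a simplified BigQuery sketch that assumes only two embedding columns (emb_1, emb_2) and hypothetical table and column names:
SELECT
  geoid,
  LAG(date) OVER w AS date_from,
  date AS date_to,
  -- Euclidean distance between the current embedding and the previous one;
  -- the first row of each geoid has no previous period and returns NULL
  SQRT(POW(emb_1 - LAG(emb_1) OVER w, 2) + POW(emb_2 - LAG(emb_2) OVER w, 2)) AS distance
FROM embeddings
WINDOW w AS (PARTITION BY geoid ORDER BY date);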
Description
Uses BigQuery ML's K-means algorithm to group embeddings into k clusters, identifying locations with similar spatial patterns or features. It optionally runs dimensionality reduction via Principal Component Analysis to improve clustering performance and reduce computational cost by focusing on the most significant patterns.
This component is particularly useful for discovering natural groupings in spatial data, segmenting areas based on their characteristics, and identifying outliers or distinct regions.
Inputs
Embedding table: Table containing the embeddings to cluster.
Settings
ID column: Column containing unique identifiers for each embedding. This column is used to maintain the relationship between input embeddings and their assigned clusters in the output.
Analysis type: Determines whether the analysis is spatial or spatio-temporal (default: Spatial).
Spatial: Analyzes embeddings at a single point in time or without temporal considerations. Use this for static spatial pattern analysis.
Spatio-temporal: Analyzes embeddings across multiple time periods, enabling detection of temporal changes in spatial patterns. Requires a date column to be specified.
Date column: Column containing temporal information for spatio-temporal analysis (required when Analysis type is Spatio-temporal). This column should contain date, datetime, timestamp, or numeric time values that identify when each embedding was captured.
Number of clusters: Number of clusters to create (2-100, default: 5). This parameter determines how many groups the algorithm will divide your embeddings into.
Distance type: Distance metric used for clustering calculations (default: EUCLIDEAN).
EUCLIDEAN: Measures straight-line distance between points in the embedding space. Best for embeddings where all dimensions have similar scales and importance.
COSINE: Measures the angle between vectors, ignoring magnitude. Ideal for embeddings where the direction matters more than the absolute values, such as normalized feature vectors.
Reduce embedding dimensionality: Whether to apply Principal Component Analysis (PCA) before clustering (default: false). Enabling this can improve performance and reduce noise by focusing on the most significant patterns in your data.
PCA explained variance: Variance threshold for PCA dimensionality reduction (0.01-0.99, default: 0.9). This determines how much of the original variance should be preserved. Higher values (closer to 1.0) retain more information but may include more noise, while lower values focus on the most important patterns, with loss of information.
Maximum iterations: Maximum number of iterations for the K-means algorithm (1-1000, default: 20). The algorithm will stop after this many iterations even if it hasn't converged. Increase this value for complex datasets that may need more iterations to find optimal clusters.
Early stopping: Whether to stop the algorithm early if it converges before reaching the maximum iterations (default: true). Enabling this improves performance by stopping as soon as cluster assignments stabilize, but may miss optimal solutions in some cases.
Minimum relative progress: Threshold for early stopping convergence detection (0.001-0.1, default: 0.01). The algorithm stops when the improvement between iterations falls below this threshold. Lower values allow more precise convergence but may require more iterations.
Outputs
Output table: The clustering table with the following columns:
<geoid>: Original identifier for each embedding.
cluster_id: Assigned cluster ID (1 to k).
distance_to_centroid: Distance from the embedding to its cluster centroid.
Metrics table: The metrics table with the following columns:
davies_bouldin_index: A clustering quality metric that measures the ratio of within-cluster scatter to between-cluster separation. Lower values indicate better clustering quality, with values closer to 0 representing well-separated, compact clusters.
mean_squared_distance: The average squared distance from each point to its assigned cluster centroid. This metric helps assess how tightly packed the clusters are - lower values indicate more compact clusters.
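For reference, the underlying BigQuery ML statements look roughly like the following sketch (hypothetical dataset, table and column names; the options mirror the settings above):
CREATE OR REPLACE MODEL my_dataset.embedding_clusters
OPTIONS (
  model_type = 'KMEANS',
  num_clusters = 5,
  distance_type = 'EUCLIDEAN',
  max_iterations = 20,
  early_stop = TRUE,
  min_rel_progress = 0.01
) AS
SELECT * EXCEPT (geoid)
FROM my_dataset.embeddings;

-- Assign each embedding to its nearest cluster
SELECT geoid, centroid_id AS cluster_id
FROM ML.PREDICT(MODEL my_dataset.embedding_clusters,
                (SELECT * FROM my_dataset.embeddings));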
Description
Identifies regions with comparable spatial or contextual characteristics by leveraging spatially aware embeddings. It compares the embeddings of one or more reference locations against a set of search location embeddings to identify areas that share similar spatial patterns or features. This component supports any embedding representation that encodes spatial relationships and can be applied across a wide range of domains, including remote sensing, environmental monitoring, and spatial data analysis.
This component is particularly useful for finding locations with similar characteristics, identifying patterns across different regions, discovering areas that match specific reference conditions, and conducting spatial similarity analysis at scale.
Inputs
Reference location(s): Table containing vector(s) representing the baseline data or items to compare against.
Search locations: Table containing vectors for which similarity to the reference set will be computed.
Settings
Reference location(s) ID column: Column containing the IDs of the reference vector(s). This column is used to maintain the relationship between input embeddings and their similarity scores in the output.
Search location(s) ID column: Column containing the IDs of the search vector(s). This column is used to maintain the relationship between input embeddings and their similarity scores in the output.
Distance metric: Distance function used to compare vectors (default: Cosine).
Cosine: Measures the alignment between two vectors based on the cosine of the angle between them. Values closer to 0 indicate high similarity in direction, while values closer to 2 indicate high dissimilarity. This metric is ideal when you want to focus on the direction of change rather than magnitude.
Euclidean: Measures the straight-line distance between vectors in the embedding space. Values closer to 0 indicate high similarity, while larger values indicate greater dissimilarity. This metric considers both direction and magnitude, making it suitable when the absolute values of the embeddings are meaningful.
Aggregate across reference locations: Whether the similarity results from multiple reference locations should be aggregated into a single output (default: true). When enabled, the component computes similarity maps for each reference location and returns their mean or maximum, producing one consolidated similarity result.
Aggregate function: The function to use when aggregating similarity results from multiple reference locations (default: AVG).
AVG: Returns the average similarity score across all reference locations.
MAX: Returns the maximum similarity score across all reference locations.
Return top-k similar locations: Whether to return only the top-k most similar locations (default: true). When enabled, only the most similar locations are returned, improving performance and focusing on the most relevant results.
Top-k metric: Number of most similar locations to return for each reference vector (1-infinity, default: 10). This parameter determines how many of the most similar locations will be included in the output.
Reduce embedding dimensionality: Whether to use Principal Component Analysis (PCA) to reduce the dimensionality of the vector embeddings (default: false). Enabling this can improve performance and reduce noise by focusing on the most significant patterns in your data.
PCA explained variance: Variance threshold for PCA dimensionality reduction (0.01-0.99, default: 0.9). This determines how much of the original variance should be preserved. Higher values (closer to 1.0) retain more information but may include more noise, while lower values focus on the most important patterns, with loss of information.
Create vector index: Whether to create a vector index to optimize the similarity search at scale (default: false, advanced option). When enabled, uses Approximate Nearest Neighbor search to improve performance with the trade-off of returning more approximate results. Without a vector index, uses brute force search to measure distance for every record.
Outputs
Output table: The similarity table with the following columns:
reference_<id>: Identifier of the reference location.
search_<id>: Identifier of the search location.
distance: Distance between the reference and search embeddings.
similarity_score: Calculated similarity metric between the reference and search embeddings. When Cosine distance is computed, the similarity score is based on the cosine similarity between the two vectors - ranging from 0 (opposite direction) to 1 (identical direction), where higher values indicate greater similarity. When Euclidean distance is computed, the similarity score compares the distance for each search location to the distance from the mean vector data. The score will be positive if and only if the search location is more similar to the reference than the mean vector data. The larger the score, the greater the similarity.
Metrics table: The metrics table with the following columns:
total_explained_variance_ratio: The proportion of variance in the original data that is explained by the PCA dimensionality reduction. This metric is particularly useful when PCA dimensionality reduction is applied, showing how much of the original information is preserved in the reduced dimensions.
number_of_principal_components: The number of principal components extracted from the PCA.
Description
Transforms high-dimensional embeddings into RGB color representations for intuitive visualization. It supports two dimensionality reduction approaches: Principal Component Analysis (PCA) for automatic feature extraction, or manual selection of specific embedding dimensions to map directly to red, green, and blue color channels.
This component is particularly useful for creating visual representations of spatial patterns, identifying clusters through color similarity, and generating color-coded maps that reveal underlying data structures in your embeddings.
Inputs
Embedding table: Table containing the embeddings to visualize.
Settings
ID column: Column containing unique identifiers for each embedding. This column is used to maintain the relationship between input embeddings and their color assignments in the output.
Analysis type: Determines whether the analysis is spatial or spatio-temporal (default: Spatial).
Spatial: Analyzes embeddings at a single point in time or without temporal considerations. Use this for static spatial pattern visualization.
Spatio-temporal: Analyzes embeddings across multiple time periods, enabling visualization of temporal changes in spatial patterns. Requires a date column to be specified.
Date column: Column containing temporal information for spatio-temporal analysis (required when Analysis type is Spatio-temporal). This column should contain date, datetime, timestamp, or numeric time values that identify when each embedding was captured.
Dimensionality reduction technique: Method for reducing embeddings to 3 dimensions for RGB visualization (default: PCA).
PCA: Automatically extracts the three most significant principal components from your embeddings and maps them to RGB channels. This approach preserves the maximum variance in your data and is ideal when you want to discover the most important patterns automatically.
Manual: Allows you to manually select three specific embedding dimensions to map directly to red, green, and blue channels. Use this when you have domain knowledge about which features are most important for your visualization.
Red channel: Column containing values for the red color channel (required when using Manual reduction mode). Select the embedding dimension that should control the red intensity in your visualization.
Green channel: Column containing values for the green color channel (required when using Manual reduction mode). Select the embedding dimension that should control the green intensity in your visualization.
Blue channel: Column containing values for the blue color channel (required when using Manual reduction mode). Select the embedding dimension that should control the blue intensity in your visualization.
Outputs
Output table: The image table with the following columns:
<geoid>: Original identifier for each embedding.
r_channel: Normalized red channel value (0-255).
g_channel: Normalized green channel value (0-255).
b_channel: Normalized blue channel value (0-255).
hex_color: Hexadecimal color representation (e.g., #FF5733) for easy use in mapping applications.
Metrics table: The metrics table with the following columns:
total_explained_variance_ratio: The proportion of variance in the original data that is explained by the dimensionality reduction. When using PCA, this indicates how much information is preserved in the visualization. Values closer to 1.0 indicate that the visualization captures most of the original data's structure.
luminance_variance: The variance of the luminance values across all visualized embeddings. Luminance is calculated using the standard RGB-to-luminance conversion formula (0.2989xR + 0.5870xG + 0.1140xB) and represents the perceived brightness of each color. Higher variance indicates greater diversity in brightness levels across your visualization, while lower variance suggests more uniform brightness distribution.
An extension package for CARTO Workflows that provides comprehensive territory planning and location allocation optimization capabilities. This package enables users to solve complex spatial optimization problems by creating balanced territorial boundaries, or finding optimal facility locations, allocating demand points to facilities, and managing territorial constraints while minimizing costs or maximizing coverage.
Description
Processes and formats facilities data for use in a Location Allocation analysis.
Inputs
Candidate Facilities input table: The source table containing candidate facility data. These represent facilities that could be opened or activated based on the allocation strategy.
Required Facilities input table (optional): The source table containing required facility data. These are pre-selected locations that are always active, regardless of optimization results.
Competitor Facilities input table (optional): The source table containing competitors data. These are used to model competition and influence demand allocation or market share in the optimization process.
Settings
Candidate facilities ID column: The column containing the candidate facility unique IDs.
Candidate facilities geometry column: The column containing the candidate facility point locations.
Use required facilities: Whether to consider mandatory or pre-selected facilities. Requires a Required Facilities input table to be connected. If enabled:
Outputs
Output table: The table with the prepared facilities data, which contains the following columns:
facility_id: The selected ID column.
geom: The selected Geometry column.
Description
Processes and formats demand points (customers, delivery addresses, clinics...) data for use in a Location Allocation analysis.
Inputs
Demand points input table: The source table containing demand points data.
Settings
ID column: The column containing the unique identifier for each demand point.
Geometry column: The column containing each demand point location.
Use demand: Whether each demand point should have an associated demand value. If enabled:
Demand column
Outputs
Output table: The table with the prepared demand points data, which contains the following columns:
dpoint_id: The selected ID column.
geom: The selected Geometry column.
Description
Processes and formats cost data for use in a Location Allocation analysis.
Inputs
Costs input table: The source table containing the cost of assigning each facility to each demand point (such as travel distance, travel time, etc.).
Settings
Facilities ID column: The column containing the facility IDs.
Demand points ID column: The column containing the demand point IDs.
Costs column: The column containing the cost of assigning each facility to each demand point.
Apply transformation to cost: Whether to apply a transformation to the cost. If enabled:
You can use native Workflows components to compute costs based on distance or time.
Outputs
Output table: The table with the prepared costs data, which contains the following columns:
facility_id: The ID of the facilities.
dpoint_id: The ID of the demand points.
Description
Processes and formats constraint-specific additional data for use in a Location Allocation analysis.
Inputs
Required input table (optional): The source table containing required facility-demand point pairs (for every row k, facility k must serve demand point k).
Forbidden input table (optional): The source table containing banned facility-demand point pairs (for every row k, facility k cannot serve demand point k).
Settings
Consider required facility-demand point pairs: Whether to consider mandatory relationships between facilities and demand points. Requires a Required input table to be connected. If enabled:
Facilities ID column: The column containing the facility IDs.
Demand points ID column: The column containing the demand point IDs.
Outputs
Output table: The table with the metadata of the prepared constraints data, which contains the following columns:
constraint_id: The ID of the prepared data.
constraint_description: The description of the prepared data.
Description
Performs Location Allocation optimization, enabling users to determine the best locations for facilities (e.g., warehouses, stores, hospitals, service centers) and assign demand points to those facilities. The goal is to minimize total/maximum cost or maximize coverage, while respecting a variety of customizable constraints.
This component uses an optimization solver under the hood to solve the problem.
It is mandatory to use the Prepare components in advance, to ensure correct formatting of input sources.
Inputs
Facilities input table: A table with facilities data, as produced by the corresponding Prepare component.
Demand points input table: A table with demand points data, as produced by the corresponding Prepare component.
Costs matrix input table: A table with costs data, as produced by the corresponding Prepare component.
Constraints input table (optional): A table with constraints data, as produced by the corresponding Prepare component.
Settings
Optimization strategy: The strategy of the solver to solve optimization. It can be one of the following:
Minimize total cost: Minimizes the total sum of costs between assigned demand points and open facilities.
Minimize maximum cost: Minimizes the highest individual cost between any assigned facility-demand point pairs.
Outputs
Output table: The table with the optimized assignments:
facility_id: The ID of the open facilities.
dpoint_id: The ID of the demand point.
Description
A component that splits a gridified area into a set of optimal, contiguous territories, ensuring balance according to a specified metric while maintaining internal similarity within each territory.
If the input grid contains disconnected elements—such as isolated cells or cell clusters—the algorithm applies a proxy mechanism to establish connectivity across all regions. However, the presence of numerous unconnected components may lead to unexpected behavior or suboptimal results.
For more details, refer to the official documentation.
Inputs
Input table: The table with the gridify-enriched Area of Interest (AOI).
Settings
Index column: Unique identifier for each spatial cell. Must be an H3 or Quadbin index.
Demand column: The business KPI used for balancing territories. This must be an extensive variable (i.e., one that adds up across space, such as number of points of sale, total population, etc.).
Similarity feature(s): Optional variable(s) used to measure similarity between neighboring cells. The similarity score influences the grouping of the cells by penalizing large differences between them:
Outputs
Output table: The table with the specified territories that will contain the index column, the territory ID (cluster) and, if specified, the remaining input columns.
total_explained_variance_ratio: The proportion of variance in the original data that is explained by the clustering solution. This metric is particularly useful when PCA dimensionality reduction is applied, showing how much of the original information is preserved in the reduced dimensions.
number_of_principal_components: The number of principal components extracted from the PCA.
Required facilities geometry column: The column containing the required facility point locations.
Use competitor facilities: Whether to consider competitors. Requires a Competitor Facilities input table to be connected. If enabled:
Competitor facilities ID column: The column containing the competitor facility unique IDs.
Competitor facilities geometry column: The column containing the competitor facility point locations.
Use facility groups: Whether to use facility groups (e.g., region, brand) to limit the number of facilities to be opened.
Candidate facilities group ID column: The column containing the ID representing the group each candidate facility belongs to.
Required facilities group ID column: The column containing the ID representing the group each required facility belongs to. Requires a Required Facilities input table to be connected.
Enforce minimum usage of facilities: Whether to use the minimum usage of a facility.
Candidate facilities minimum usage column: The column containing the minimum usage of each candidate facility.
Required facilities minimum usage column: The column containing the minimum usage of each required facility. Requires a Required Facilities input table to be connected.
Use facility maximum capacities: Whether to use the maximum capacity of a facility.
Candidate facilities maximum capacity column: The column containing the maximum capacity of each candidate facility.
Required facilities maximum capacity column: The column containing the maximum capacity of each required facility. Requires a Required Facilities input table to be connected.
Use facility costs of opening: Whether to use the fixed costs of opening a specific facility.
Candidate facilities cost of open column: The column containing the cost of opening each candidate facility.
Required facilities cost of open column: The column containing the cost of opening each required facility. Requires a Required Facilities input table to be connected.
facility_type: The type of facility: candidate (0), required (1) or competitor (2).
group_id: The selected Group ID column. If no group ID column is provided, it will be filled up with NULL values.
min_usage: The selected Minimum usage column. If no minimum usage column is provided, it will be filled up with NULL values.
max_capacity: The selected Maximum capacity column. If no maximum capacity column is provided, it will be filled up with NULL values.
cost_of_open: The selected Cost of open column. If no cost of open column is provided, it will be filled up with NULL values.
demand: The selected Demand column. If no demand column is provided, it will be filled up with NULL values.
Transformation function: The function to apply: linear, power, exponential.
Transformation parameter: The parameter value to use. Only applicable for power and exponential transformations.
cost: The resulting cost.
dpoint_id: The column containing the demand point IDs.
compatibility: The column containing the status of the relationship between each specified facility and demand point: required (1) or banned (0). If the facility-demand point pair is not specified in this table, the Location Allocation algorithm will find the optimal assignment itself.
Facilities ID column: The column containing the facility IDs.
Demand points ID column: The column containing the demand point IDs.
table_name: The name of the prepared table containing the necessary data to use constraint_id in the Location Allocation problem.
Maximize coverage: Maximizes the number of demand points within the specified coverage radius of each open facility.
Facilities coverage radius: Maximum distance or time (e.g., in kilometers or minutes) that a facility can cover. Used for the "Maximize coverage" strategy.
Use required facilities: Whether to consider mandatory facilities. If enabled, these facilities are guaranteed to be opened.
Use competitor facilities: Whether to account for competitor locations that may influence demand point assignment.
Cost threshold: The distance under which a demand point is considered fully influenced by a competitor, and thus removed from consideration.
Use facility costs of opening: If enabled, the fixed cost of opening each facility is added to the objective function.
Use maximum budget: Whether to consider a maximum budget for opening facilities. When enabled, the sum of opening costs of selected facilities must not exceed the specified budget limit.
Budget limit: Maximum budget allowed for opening facilities (in cost of open units). This parameter is only available when "Use maximum budget" is enabled.
Limit the total number of facilities: If enabled, restricts the number of facilities that can be opened.
Maximum number of facilities: The maximum number of facilities allowed to open if the above option is enabled.
Limit the total number of facilities per group: If enabled, restricts the number of facilities that can be opened per defined group (e.g., region, brand).
Maximum number of facilities per group: The maximum number of facilities allowed to open per group if the above option is enabled.
Use demand: If enabled, ensures that the full demand of each demand point is met across assigned facilities.
Enforce minimum usage of facilities: If enabled, restricts facilities from being opened unless they meet a minimum utilization level.
Use facility maximum capacities: If enabled, restricts the total amount of demand a facility can serve.
Use required/forbidden assignments: Allows enforcing or excluding specific demand point-facility assignments (e.g., a facility must serve a specific demand point). Requires the Constraints input table to be connected.
Relative Gap Tolerance (%): The maximum allowed relative optimality gap between the best known solution and the theoretical optimum.
Maximum Solving Time (sec): The maximum allowed time taken for the solver to solve the problem (in seconds).
demand: The amount of demand allocated from the facility to the demand point.
geom: The line geometry connecting the facility with the assigned demand point.
Metrics table: The table with the solver's metrics and statistics:
objective_value: Final value of the objective function (e.g., total cost or coverage).
gap: The relative optimality gap between the best known solution and the theoretical optimum.
solving_time: Time taken for the solver to solve the problem (in seconds).
termination_reason: Reason why the solver stopped (e.g., optimal, infeasible...).
stats: Additional statistics such as open facilities ID, total demand satisfied or percentage of demand covered.
If multiple variables are selected, their values are normalized and averaged.
If a single variable is selected, it is normalized to the [0, 1] range.
If no similarity features are provided, only the demand column is used for partitioning, and similarity is not considered.
Number of territories: The desired number of partitions (i.e., territories) to divide the Area of Interest (AOI) into.
Keep input columns: Whether to include all input columns in the output table.
Inputs
Source table
Settings
File format: Select between CSV and JSON formats for the exported file(s).
Export compressed: Choose between a compressed file (.gz) or an uncompressed one.
GCS Bucket name: A bucket location provided by the user will be used by entering its name. Optionally, a path can be added to the bucket name, separated by slashes (/). The name of the bucket should follow the GCS specification and the name of the folder(s), in addition to the GCS requirements, should not contain single quotes (') or backslashes (\). The connection (the email that appears on the Connections page) should at least have permission to create and delete objects. To download the URLs in a browser, permissions should be granted to the email of an account the user can log in with. This custom bucket won't work with CARTO Data Warehouse connections unless the bucket is public.
Using this component requires that the connection has BigQuery User permission in the billing project.
Outputs
Result table: A table containing links to the result of the export.
Description
This component creates a Builder map on its first execution. On subsequent executions, it can update, overwrite or create a copy of the existing map.
Inputs
This component can receive any number of source table inputs. Each connected node will result in a datasource (with its corresponding layer) in the created Builder map. These datasources will be tagged as "Temporal" in Builder, since they might be subject to an expiration date or live in a temporal dataset/schema (more info).
If you need to create permanent data sources in your map, use the "Save as Table" component and connect its output to the "Create Builder Map" component. The result will be a standard, persistent data source in your Builder map.
Settings
Map name: A string that defines the name of the created Builder map. Usage of expressions and variables is allowed to generate a name by concatenating strings and variables. For example: {{'Map of the state of ' || @state_name}}
Re-running mode: This setting controls the behaviour of the component after the first execution. It has three different modes:
Update (default): In this mode, executing the workflow updates data sources in the map, syncing them with the currently connected input nodes while keeping map styles, widgets and other map settings.
Overwrite: In this mode, executing the workflow deletes the current map and creates a new one from scratch with default settings. The new map will have a different ID and URL.
Create a copy: In this mode, executing the workflow creates a new copy of the map with the same styles and other settings. The copy will reflect the currently connected input nodes.
Output
This component generates a single-row table with three columns that contain the map ID, the map URL and a timestamp of the last execution.
Description
Loads a table into the workflow, given its name as a string, e.g. db.schema.table.
Inputs
Source table reference: A string that contains the FQN of the table that will be loaded to the canvas.
Output
Result table
Description
This component imports a table from a URL.
This component needs to be executed (by running the workflow) before its output can be connected to other components. This process may take several minutes.
Inputs
Source URL: The public URL to be used to get the file that will be imported.
Check the "Automatically guess column data types in the imported table" checkbox to automatically guess data types on the input file. If disabled, all properties will be imported as a String.
Outputs
Output table: The component generates a table that contains the result of importing the file from "Source URL".
Description
This component performs requests to external endpoints.
Inputs
Source table (Optional): If the component has an input table, columns from this input can be used within expressions in other settings.
Settings
URL: The URL that will be used to perform the HTTP request.
Options (Optional): JSON object with options for the HTTP Request. You can use this to set a different method than GET, specify headers or a body for your request. Follows the fetch() API options specification.
Example:
{
"method":"POST",
"headers":{ "Content-Type":"application/json" },
"mode":"cors",
"cache":"default",
"body": "{ ...}"
}
Incorrect method values (like "method":"GETS" or "method":"FOO") will be ignored and the request will default to using GET.
Both URL and Options settings allow the usage of expressions and variables:
This is an example of using values from an input's column in the URL, concatenating strings with values from columns, all within an expression enclosed with double curly braces:
{{'https://your-api-domain.com/coordinates_endpoint?lat=' || latitude || '&lon=' || longitude || '&otherparams=...'}}
Using expressions in the Options setting is a bit different, since the content of it is stringified on the data warehouse, and we can use column names between braces directly instead of concatenating strings, for example:
{
"method":"POST",
"headers":{ "Content-Type":"application/json" },
"mode":"cors",
"cache":"default",
"body": "{\"lng\":`` ``{{longitude_column}}, \"lat\":{{latitude_column}}`` ``}"
}
In the examples above, values from the latitude and longitude columns have been used with expressions to generate a different URL/options for each row in the input table.
Allowed hosts (Optional): When expressions or variables are used in the URL, you need to set a comma-separated list of allowed hosts for your requests. If expressions/variables are not used in URL, this setting is ignored.
Warning: API Key Security Including an API key or an authorization header in the URL or options object may expose them in data warehouse logs. Handle keys responsibly to prevent unauthorized access.
Outputs
Output table: This component produces a table that contains all columns from the input table (if any was connected), plus a response_data column containing a string with the response's data.
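As an illustration, the response_data string can be parsed in a downstream SQL step. A minimal BigQuery sketch, assuming a hypothetical JSON payload and table name:
SELECT
  *,
  JSON_VALUE(response_data, '$.result.score') AS score  -- hypothetical JSON path
FROM http_request_output;  -- hypothetical name for this component's output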
External links
This component makes use of the functions in the http_request module of the CARTO Analytics Toolbox. Check the links below for more specific documentation:
Description
This component defines the output of the Workflow when executed via API.
Inputs
Source table
Settings
Execution method: This setting controls the type of return value of the API call. Sync returns the result immediately, while Async returns a job ID.
Sync: When using this method, the API request is made to the /query endpoint in SQL API to call the procedure, and the response will include the resulting data. Bear in mind that the workflow needs to run within the standard timeout limit. This mode is only intended for very short-running workflows.
Async (default): When using this method, the API request is made to the /job endpoint in SQL API to call the procedure. The result of the workflow is stored in a temporary table whose FQN is returned in the API response as: "workflowOutputTableName": "workflows-api-demo.workflows_temp.wfproc_f2f8df5df4ddf279_out_33afd785675f081d", along with the job ID and other job metadata.
This component can only be used once on a workflow.
Find more information about running workflows via API here.
Description
This component defines the output of the Workflow when executed as an MCP Tool.
Inputs
Source table
Settings
Execution method: This setting controls the type of return value of the MCP Tool invocation. Sync returns the result immediately, while Async returns a job ID.
Sync: When using this method, the MCP Tool response will include the resulting data. Bear in mind that the workflow needs to run within the standard timeout limit. This mode is only intended for very short-running workflows.
Async (default): When using this method, the MCP Tool response will include a job ID. Agents can make use of internal tools in the CARTO MCP Server to poll the status of the execution and fetch the results when finished.
This component can only be used once on a workflow.
Find more information about running workflows as MCP Tools here.
Description
This component writes an input table to a non-temporary location.
If a table exists under that location, it will be overwritten.
Inputs
Source table
Settings
FQN of table to create/overwrite. There are two different options:
Use the UI to select a destination for the target table in the current connection.
Type a FQN or an expression that uses variables or concatenates strings and variables, like: {{@variable_fqn}} or {{'database.schema.result_table_' || @variable_name}}
Append [true/false]: Determines whether the result will be appended to an existing table or the table will be overwritten.
Outputs
The output of this component references the table created in the defined location.
Description
This component sends an email with a custom subject and body. Optionally, it can export data to a bucket and add the resulting link(s) to the email's body.
Inputs
Source table [Table]
Settings
Email address: The email address that will receive the email. This field supports multiple addresses as separate recipients.
Subject: The subject of the email.
Body: The content of the email.
Include data: If enabled, data from the input table will be exported to a bucket and resulting link(s) will be added to the email's body.
File format:
CSV
JSON
GCS Bucket name (Optional): If provided, the data will be exported to a custom GCS bucket. Optionally, a path can be added to the bucket name, separated by slashes (/). The name of the bucket should follow the GCS specification and the name of the folder(s), in addition to the GCS requirements, should not contain single quotes (') or backslashes (\). The connection (the email that appears on the Connections page) should at least have permission to create and delete objects. To download the URLs in a browser, permissions should be granted to the email of an account the user can log in with. This custom bucket won't work with CARTO Data Warehouse connections unless the bucket is public.
Outputs
This component doesn't have an output that can be connected to another node.
Extension Package provided by CARTO
The BigQuery ML extension package for CARTO Workflows includes a variety of components that enable users to integrate machine learning workflows with geospatial data. These components allow for creating, evaluating, explaining, forecasting, and managing ML models directly within CARTO Workflows, utilizing BigQuery ML’s capabilities.
The following table summarises available components, and explains how different components can be connected to each other:
Description
This component trains a classification model using a table of input data.
This component computes a clusterization based on the values of a feature over time (i.e. sales of different stores, temperature in different H3 cells, etc) and assigns a cluster number to each, using different methods to evaluate the values and their distribution over time.
Inputs
Input table: This component expects a table containing at least a timestamp column, a value column and an index column to be used as partition (usually a spatial index or a feature ID).
Settings
Timestamp column: Select a column in the input that contains timestamps/dates.
Partition column: Each unique value on this column will be assigned a cluster.
Value column: Clusters will be calculated based on the values of this column over time.
Number of clusters: Define the number of clusters to be generated.
Method:
Value
Profile
Outputs
Output table: This component will generate a table with the following columns:
Partitioning column with the same name as the given column from inputs.
Cluster (String): Contains the assigned cluster number.
Description
This component derives a spatial composite score as the residuals of a regression model which is used to detect areas of under- and over-prediction. The response variable should be measurable and correlated with the set of variables defining the score. For each data point, the residual is defined as the observed value minus the predicted value. Rows with a NULL value in any of the input variables are dropped.
Inputs
Input table
Settings
Column with unique geographic identifier (spatial index)
Input variables
Response variables
Model regressor
Model options
Output formatting
R-squared threshold
Outputs
Output table
External links
Description
This component combines (spatial) input variables into a meaningful composite score. The composite score can be derived using different methods, scaling and aggregation functions and weights. Rows with a NULL value in any of the model predictors are dropped.
Inputs
Input table
Settings
Column with unique geographic identifier (spatial index)
Input variables
Scoring method
Entropy
Custom Weights:
Variables and weights: When defining custom weights, you will find a button that controls the direction of the weight:
By default, it is set as "Direct", which means that the variable's value will contribute to making the final score higher.
Output formatting
None
Equal Intervals
Jenks
Outputs
Result table
Lookup table: only available if Output formatting is Equal Intervals, Jenks, or Quantiles.
External links
Description
This component computes the Cronbach Alpha Coefficient, which can be used to measure internal consistency of different variables.
This component can be used to determine whether a combination of variables is consistent enough to be used to create a Composite Score using one of the components above.
Inputs
Input table
Settings
Input variables
Outputs
Output table
External links
Description
This component can be used to detect space-time regions where the variable of interest is higher (or lower) than its baseline value. It implements the scan statistics framework developed in this R package.
Inputs
Source table [Table]: The source table. It must contain the index_column, the date_column, the value_column, a column named value_column_baseline with the values that should be used as a baseline to detect the anomalies and, when the distributional model parameter is set to ‘GAUSSIAN’ also its variance, value_column_baseline_sigma2.
Settings
Index column [Column]: The name of the column with the unique geographic identifier of the spatial index, either ‘H3’ or ‘QUADBIN’.
Date column [Column]: The name of the column with the date identifier.
Value column [Column]: The name of the column with the variable for which the space-time anomalies should be detected.
Time frequency [Selection]: The temporal frequency of the data selected from one of the following: ‘SECOND’, ‘MINUTE’, ‘HOUR’, ‘DAY’, ‘WEEK’, ‘MONTH’, ‘QUARTER’, ‘YEAR’.
Distributional model [Selection]: The distributional model of the data, either ‘POISSON’ or ‘GAUSSIAN’.
Estimation method [Selection]: The estimation method, either ‘EXPECTATION’ or ‘POPULATION’ for the expectation- and population-based methods respectively.
Prospective analysis [Boolean]: Option to specify if the analysis is retrospective or prospective. In a prospective analysis, only temporal zones that end with the last timestamp are considered and the interest lies in detecting new emerging anomalies. In a retrospective analysis instead, the space-time anomalies can happen at any point in time over all the past data (a temporal zone can end at any timestamp).
High-mean anomalies [Boolean]: Option to specify if the analysis is for detecting space-time zones higher or lower than the baseline.
K-ring size (advanced, optional) [Range]: The minimum and maximum k-ring size used to define the spatial zones.
Temporal window (advanced) [Range]: The minimum and maximum temporal window used to define the temporal zones.
Number of permutations (advanced) [Number]: The number of permutations used to derive the random replicas to test the anomaly statistical significance.
Max results (advanced) [Number]: The maximum number of space-time zones returned.
Outputs
Output table [Table]: The output table
External links
Description
This component can be used to detect spatial regions where the variable of interest is higher (or lower) than its baseline value. It implements the scan statistics framework developed in this R package.
Inputs
Source table [Table]: The source table. It must contain the index_column, the date_column, the value_column, a column named value_column_baseline with the values that should be used as a baseline to detect the anomalies and, when the distributional model parameter is set to ‘GAUSSIAN’ also its variance, value_column_baseline_sigma2.
Settings
Index column [Column]: The name of the column with the unique geographic identifier of the spatial index, either ‘H3’ or ‘QUADBIN’.
Value column [Column]: The name of the column with the variable for which the spatial anomalies should be detected.
Distributional model [Selection]: The distributional model of the data, either ‘POISSON’ or ‘GAUSSIAN’.
Estimation method [Selection]: The estimation method, either ‘EXPECTATION’ or ‘POPULATION’ for the expectation- and population-based methods respectively.
High-mean anomalies [Boolean]: Option to specify if the analysis is for detecting space-time zones higher or lower than the baseline.
K-ring size (advanced, optional) [Range]: The minimum and maximum k-ring size used to define the spatial zones.
Number of permutations (advanced) [Number]: The number of permutations used to derive the random replicas to test the anomaly statistical significance.
Max results (advanced) [Number]: The maximum number of space-time zones returned.
Outputs
Output table [Table]: The output table
External links
Description
This component computes the Getis-Ord Gi* statistic for each spatial index in the source table.
Inputs
Source table [Table]
Settings
Index column [Column]
Value column [Column]
Kernel function for spatial weights [Selection]
Size [Number]
Outputs
Result table [Table]
External links
Description
This component computes the spacetime Getis-Ord Gi* statistic for each spatial index and datetime timestamp in the source table.
Inputs
Source table [Table]
Index column [Column]
Date column [Column]
Value column [Column]
Kernel function for spatial weights [Selection]
Kernel function for temporal weights [Selection]
Size [Number]
Temporal bandwidth [Number]
Time interval [Selection]
Outputs
Result table [Table]
External links
Description
This component runs a spatial-index-based Geographically Weighted Regression (GWR) model.
Inputs
Input table [Table]
Index column [Column]
Feature variables [Column]
Target variable [Column]
K-ring size [Number]
Kernel function [Selection]
Fit intercept [Boolean]
Outputs
Result table [Table]
External links
Description
This component is used to locate hotspot areas by calculating a combined Getis-Ord Gi* statistic using a uniform kernel over one or several variables.
Inputs
Source table [Table]
Index column [Column]
Input variables [Column]
Variable weights [String]
K-ring size [Number]
Significance level [Number]
Outputs
Result table [Table]
External links
Description
This component computes the local Moran's I statistic for each spatial index in the source table.
Inputs
Source table [Table]
Index column [Column]
Value column [Column]
Size [Number]
Decay function [Selection]
Permutations [Number]
Outputs
Result table [Table]
External links
Description
This component computes the Moran's I statistic for each spatial index in the source table.
Inputs
Source table [Table]
Index column [Column]
Value column [Column]
Size [Number]
Decay function [Selection]
Outputs
Result table [Table]
External links
Description
This component takes the output of Getis Ord Spacetime and classifies each location into specific types of hotspots or coldspots, based on patterns of spatial clustering and intensity trends over time.
Inputs
Input table: This component expects to be connected to the output of a successfully executed Getis Ord Spacetime node, or a table that contains that same result.
Settings
Index column: Select a column in the input that contains a spatial index (Quadbin or H3).
Date column: Select a column in the input that contains date/timestamp.
Gi Value: Select a column in the input that contains a Gi value generated by a Getis Ord Spacetime component.
P Value: Select a column in the input that contains a P value generated by a Getis Ord Spacetime component.
Threshold: Select the threshold of the P value for a location to be considered as hotspot/coldspot.
Algorithm: Select the algorithm to be used for the monotonic trend test:
Mann-Kendall (default)
Modified Mann-Kendall
Output
Output table: The component will return a table with the column that contains the index and the resulting columns from the procedure's execution.
For more details, refer to the official ML.CREATE_MODEL documentation.
Inputs
Input table: A data table that is used as input for the model creation.
Settings
Model's FQN: Fully qualified name for the model created by this component.
Unique identifier column: A column from the input table to be used as unique identifier for the model.
Input label column: A column from the input table to be used as source of labels for the model.
Model type: Select the type of model to be created. Options are:
LOGISTIC_REG
BOOSTED_TREE_CLASSIFIER
RANDOM_FOREST_CLASSIFIER
Fit intercept: Determines whether to fit an intercept term in the model. Only applies if Model type is "LOGISTIC_REG".
Max tree depth: Determines the maximum depth of the individual trees. Only applies if Model type is "BOOSTED_TREE_CLASSIFIER".
Number of parallel trees: Determines the number of parallel trees constructed on each iteration. Only applies if Model type is "BOOSTED_TREE_CLASSIFIER".
Minimum tree child weight: Determines the minimum sum of instance weight needed in a child. Only applies if Model type is "BOOSTED_TREE_CLASSIFIER".
Subsample: Determines whether to subsample. Only applies if Model type is "BOOSTED_TREE_CLASSIFIER".
Column sample by tree: Subsample ratio of columns when constructing each tree. A fraction between 0 and 1 that controls the number of columns used by each tree. Only applies if Model type is "BOOSTED_TREE_CLASSIFIER".
Column sample by node: Subsample ratio of columns for each node (split). A fraction between 0 and 1 that controls the number of columns used at each split. Only applies if Model type is "BOOSTED_TREE_CLASSIFIER".
Data split method: The method used to split the input data into training, evaluation and test data. Options are:
AUTO_SPLIT automatically splits the data
RANDOM splits randomly based on specified fractions
CUSTOM uses a specified column
Data split evaluation fraction: Fraction of data to use for evaluation. Only applies if Data split method is RANDOM or SEQ.
Data split test fraction: Fraction of data to use for testing. Only applies if Data split method is RANDOM or SEQ.
Data split column: Column to use for splitting the data. Only applies if Data split method is CUSTOM.
Outputs
Output table: This component generates a single-row table with the FQN of the created model.
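For context, this is roughly the kind of BigQuery ML statement the component issues under the hood; a minimal sketch with hypothetical names (the actual generated SQL may differ):
CREATE OR REPLACE MODEL `project.dataset.my_classifier`
OPTIONS (
  model_type = 'BOOSTED_TREE_CLASSIFIER',
  input_label_cols = ['label'],        -- the selected Input label column
  max_tree_depth = 6,
  data_split_method = 'AUTO_SPLIT'
) AS
SELECT * EXCEPT (customer_id)          -- hypothetical unique identifier column
FROM `project.dataset.training_data`;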
Description
This component trains a forecast model using a table of input data.
For more details, refer to the official ML.CREATE_MODEL documentation.
Inputs
Input table: A data table that is used as input for the model creation.
Holidays table: A table containing custom holidays to use during model training.
Settings
Model's FQN: Fully qualified name for the model created by this component.
Model type: Select a type of model to be created. Options are:
ARIMA_PLUS
ARIMA_PLUS_XREG
Time-series ID column: Column from Input table that uniquely identifies each individual time series in the input data. Only applies if Model type is ARIMA_PLUS and Auto ARIMA is set to true.
Time-series timestamp column: Column from Input table containing timestamps for each data point in the time series.
Time-series data column: Column from Input table containing the target values to forecast for each data point in the time series.
Auto ARIMA: Automatically determine ARIMA hyperparameters.
P value: Number of autoregressive terms. Only applies if Auto ARIMA is set to false.
D value: Number of non-seasonal differences. Only applies if Auto ARIMA is set to false.
Q value: Number of lagged forecast errors. Only applies if Auto ARIMA is set to false.
Data frequency: Frequency of data points in the time series. Used by BigQuery ML to properly interpret the time intervals between data points for forecasting. AUTO_FREQUENCY will attempt to detect the frequency from the data. Options are:
AUTO_FREQUENCY (default)
PER_MINUTE
HOURLY
Holiday region: Region for which the holidays will be applied.
Clean spikes and dips: Determines whether to remove spikes and dips from the time series data.
Outputs
Output table: This component generates a single-row table with the FQN of the created model.
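For context, a minimal sketch (hypothetical names) of the BigQuery ML statement behind this component; the actual generated SQL may differ:
CREATE OR REPLACE MODEL `project.dataset.my_forecast`
OPTIONS (
  model_type = 'ARIMA_PLUS',
  time_series_timestamp_col = 'ts',
  time_series_data_col = 'sales',
  time_series_id_col = 'store_id',
  auto_arima = TRUE,
  data_frequency = 'AUTO_FREQUENCY'
) AS
SELECT ts, sales, store_id
FROM `project.dataset.sales_history`;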
Description
This component trains a regression model using a table of input data.
For more details, refer to the official ML.CREATE_MODEL documentation.
Inputs
Input table: A data table that is used as input for the model creation.
Settings
Model's FQN: Fully qualified name for the model created by this component.
Unique identifier column: A column from the input table to be used as unique identifier for the model.
Input label column: A column from the input table to be used as source of labels for the model.
Model type: Select the type of model to be created. Options are:
LINEAR_REG
BOOSTED_TREE_REGRESSOR
RANDOM_FOREST_REGRESSOR
Fit intercept: Determines whether to fit an intercept term in the model. Only applies if Model type is "LINEAR_REG".
Max tree depth: Determines the maximum depth of the individual trees. Only applies if Model type is "BOOSTED_TREE_REGRESSOR".
Number of parallel trees: Determines the number of parallel trees constructed on each iteration. Only applies if Model type is "BOOSTED_TREE_REGRESSOR".
Minimum tree child weight: Determines the minimum sum of instance weight needed in a child. Only applies if Model type is "BOOSTED_TREE_REGRESSOR".
Subsample: Determines whether to subsample. Only applies if Model type is "BOOSTED_TREE_REGRESSOR".
Column sample by tree: Subsample ratio of columns when constructing each tree. A fraction between 0 and 1 that controls the number of columns used by each tree. Only applies if Model type is "BOOSTED_TREE_REGRESSOR".
Column sample by node: Subsample ratio of columns for each node (split). A fraction between 0 and 1 that controls the number of columns used at each split. Only applies if Model type is "BOOSTED_TREE_REGRESSOR".
Data split method: The method used to split the input data into training, evaluation and test data. Options are:
AUTO_SPLIT automatically splits the data
RANDOM splits randomly based on specified fractions
CUSTOM uses a specified column
Data split evaluation fraction: Fraction of data to use for evaluation. Only applies if Data split method is RANDOM or SEQ.
Data split test fraction: Fraction of data to use for testing. Only applies if Data split method is RANDOM or SEQ.
Data split column: Column to use for splitting the data. Only applies if Data split method is CUSTOM.
Outputs
Output table: This component generates a single-row table with the FQN of the created model.
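Similarly, a compact sketch (hypothetical names) of the equivalent BigQuery ML statement for a linear regression; the actual generated SQL may differ:
CREATE OR REPLACE MODEL `project.dataset.my_regressor`
OPTIONS (
  model_type = 'LINEAR_REG',
  input_label_cols = ['target'],       -- the selected Input label column
  data_split_method = 'AUTO_SPLIT'
) AS
SELECT * EXCEPT (id)                   -- hypothetical unique identifier column
FROM `project.dataset.training_data`;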
Description
Given a pre-trained ML model and an input table, this component evaluates its performance against some input data provided. The result will contain some metrics regarding the model performance. The user is offered some extra options when the model is a forecasting model.
For more details, refer to the official ML.EVALUATE function documentation.
Inputs
Model: This component receives a trained model table as input.
Input table: This component receives a data table to be used as data input.
Settings
Forecast: Determines whether to evaluate the model as a forecast.
Perform aggregation: Determines whether to evaluate on the time series level or the timestep level. Only applies if Forecast is set to true.
Horizon: Number of forecasted time points to evaluate against. Only applies if Forecast is set to true.
Confidence level: Percentage of the future values that fall in the prediction interval. Only applies if Forecast is set to true.
Outputs
Output table: This component produces a table with the evaluation metrics.
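For reference, a minimal sketch of the underlying BigQuery ML call, with hypothetical model and table names:
-- Standard evaluation:
SELECT * FROM ML.EVALUATE(MODEL `project.dataset.my_model`, TABLE `project.dataset.eval_data`);
-- Forecast evaluation, aggregated over a 30-step horizon:
SELECT * FROM ML.EVALUATE(MODEL `project.dataset.my_forecast`, TABLE `project.dataset.actuals`,
                          STRUCT(TRUE AS perform_aggregation, 30 AS horizon, 0.9 AS confidence_level));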
Description
Given a pre-trained ML forecast model, this component evaluates its performance using the ARIMA_EVALUATE function in BigQuery.
Inputs
Model: This component receives a trained model table as input.
Settings
Show all candidate models: Determines whether to show evaluation metrics for all candidate models or only for the best model.
Outputs
Output table: This component produces a table with the evaluation metrics.
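A minimal sketch of the underlying call, with a hypothetical model name:
SELECT * FROM ML.ARIMA_EVALUATE(MODEL `project.dataset.my_forecast`,
                                STRUCT(FALSE AS show_all_candidate_models));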
Description
Given a pre-trained ML model and an input table, this component runs an explainability analysis invoking the EXPLAIN_FORECAST function in BigQuery.
Inputs
Model: This component receives a trained model table as input.
Input table: This component receives a data table to be used as data input. It can only be used with ARIMA models. The execution will fail if a different model is selected.
Settings
Model type: Select the type of model to be used with the input. Options are:
ARIMA_PLUS
ARIMA_PLUS_XREG
Horizon: The number of time units to forecast into the future.
Confidence level: The confidence level to use for the prediction intervals.
Outputs
Output table: This component produces a table with the explainability metrics.
Description
Given a pre-trained ML model, this component generates a predicted value and a set of feature attributions for each instance of the input data. Feature attributions indicate how much each feature in your model contributed to the final prediction for each given instance.
For more details, refer to the official ML.EXPLAIN_PREDICT function documentation.
Inputs
Model: This component receives a trained model table as input.
Input table: This component receives a data table to be used as data input.
Settings
Number of top features: The number of the top features to be returned.
Outputs
Output table: This component produces a table with the attribution per feature for the input data.
Description
Given a pre-trained ML model and an optional input table, this component infers the predictions for each of the input samples. Take into account that the actual forecasting happens when creating the model; this component only retrieves the desired results.
For more details, refer to the ML.FORECAST function documentation.
Inputs
Model: This component receives a trained model table as input.
Input table: This component receives a data table to be used as data input. It can only be used with ARIMA models. The execution will fail if a different model is selected.
Settings
Model type: Select the type of model to be used with the input. Options are:
ARIMA_PLUS
ARIMA_PLUS_XREG
Horizon: The number of time units to forecast into the future.
Confidence level: The confidence level to use for the prediction intervals.
Outputs
Output table: This component produces a table with the predictions column.
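For reference, a minimal sketch of the underlying call for an ARIMA_PLUS model, with a hypothetical model name:
SELECT * FROM ML.FORECAST(MODEL `project.dataset.my_forecast`,
                          STRUCT(30 AS horizon, 0.9 AS confidence_level));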
Description
This component loads a pre-existing model in BigQuery into the expected Workflows format to be used with the rest of the BigQuery ML components.
Inputs
Model FQN: Fully-qualified name to get the model from.
Outputs
Output: This component returns a model that can be connected to other BigQuery ML components that expect a model as input.
Description
Given a pre-trained ML model, this component lets you provide explanations for the entire model by aggregating the local explanations of the evaluation data. It returns the attribution of each feature. In the case of a classification model, an option can be set to provide an explanation for each class of the model.
For more details, refer to the official ML.GLOBAL_EXPLAIN function documentation.
Inputs
Model: This component receives a trained model table as input.
Settings
Class level explain: Determines whether global feature importances are returned for each class in the case of classification.
Outputs
Output table: This component produces a table with the attributions per row of the input data.
Description
This component imports an ONNX model from Google Cloud Storage. The model will be loaded into BigQuery ML using the provided FQN and it will be ready to use in Workflows with the rest of ML components.
Settings
Model path: Google Cloud Storage URI (gs://) of the pre-trained ONNX file.
Model FQN: A fully-qualified name to save the model to.
Overwrite model: Determines whether to overwrite the model if it already exists.
Outputs
Output table: This component returns a model that can be connected to other BigQuery ML components that expect a model as input.
Description
Given a pre-trained ML model and an input table, this component infers the predictions for each of the input samples. A new variable prediction will be returned. All columns in the input table will be returned by default; an option can be unmarked to select a single ID column that will be returned with the prediction.
For more details, check out the ML.PREDICT function documentation.
Inputs
Model: This component receives a trained model table as input.
Input table: This component receives a data table to be used as data input.
Settings
Keep input columns: Determines whether to keep all input columns in the output or not.
ID column: Select a column from the input table to be used as the unique identifier for the model. Only applies if Keep input columns is set to false.
Outputs
Output table: This component produces a table with the predictions column.
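For reference, a minimal sketch of the underlying call, with hypothetical model and table names:
SELECT * FROM ML.PREDICT(MODEL `project.dataset.my_classifier`,
                         TABLE `project.dataset.new_data`);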

Workflows are the foundation for creating MCP Tools in CARTO. Each tool exposed through the CARTO MCP Server is powered by a workflow that defines how spatial problems are solved, the data required as input, and the results returned to the AI Agent.
This page explains how to configure workflows for MCP, including how to define outputs, write clear tool descriptions, and keep tools in sync as you make updates. By following these steps, you ensure that your AI Agents can reliably use the tools you create to answer geospatial questions.
Each MCP Tool is backed by a Workflow. For every type of question or task you want the agent to answer, you will need to create a dedicated Workflow.
Use the MCP Tool Output component in Workflows to define the expected output of the tool.
Select Sync mode if the workflow returns results immediately.
Select Async mode if the workflow requires longer processing time and results are retrieved after execution.
Click on the three-dot options button in the top-right corner to access the MCP Tool dialog.
Once in the dialog:
Write a clear description that explains the purpose of the tool.
Define all input parameters with explicit names and descriptions.
Enable the tool so it can be exposed through the MCP Server.
When you make updates to a workflow, ensure you sync the workflow. This propagates the changes to the MCP Tool so the AI Agent always uses the latest version.
Once Workflows is enabled as an MCP Tool, you can give your Agents in CARTO or outside CARTO access to it.
To learn how to connect to the CARTO MCP Server and use your MCP Tools in other MCP-compliant agentic applications, check the CARTO MCP Server documentation.
If you want to provide access to the Workflows MCP Tool to an AI Agent in CARTO, check the documentation on AI Agents.
SEQ splits sequentially
NO_SPLIT uses all data for training.
DAILY
WEEKLY
MONTHLY
QUARTERLY
YEARLY
SEQ splits sequentially
NO_SPLIT uses all data for training.




If set to "Inverse" (), the variable's value will weight in to make the final score lower.
When specifying custom weights, the component will internally apply a normalization so that the sum of all weights is equal to 1
Scaling function:
MIN_MAX_SCALER: data is rescaled into the range [0,1] based on minimum and maximum values. Only numerical variables are allowed.
STANDARD_SCALER: data is rescaled by subtracting the mean value and dividing the result by the standard deviation. Only numerical variables are allowed.
RANKING: data is replaced by its percent rank, that is, by values ranging from 0 (lowest) to 1 (highest). Both numerical and ordinal variables are allowed (categorical and boolean variables need to be transformed to ordinal).
DISTANCE_TO_TARGET_MIN(_MAX,_AVG): data is rescaled by dividing by the minimum, maximum, or mean of all the values. Only numerical variables are allowed.
PROPORTION: data is rescaled by dividing by the sum total of all the values. Only numerical variables are allowed.
Aggregation Function:
LINEAR: the spatial composite is derived as the weighted sum of the scaled individual variables.
GEOMETRIC: the spatial composite is given by the product of the scaled individual variables, each to the power of its weight.
First PC
Correlation variable: the spatial score will be positively correlated with the selected variable (i.e. the sign of the spatial score is set such that the correlation between the selected variable and the first principal component score is positive).
Correlation threshold: the minimum absolute value of the correlation between each individual variable and the first principal component score.
Quantiles
Return Range
Description
This component takes a table with H3 cell indices and generates a new one with an additional column containing the boundaries of those H3 cells.
Inputs
Source table [Table]
H3 column [Column]
Outputs
Result table [Table]
External links
Description
This component adds a new column containing the centers of H3 cells.
Inputs
Source table [Table]
H3 column [Column]
Outputs
Result table [Table]:
External links
Description
This component computes the H3 grid distance between two H3 index columns, creating a new table with the same columns as the source one and adding a new 'h3_distance' column.
The H3 distance is defined as the length of the shortest path between the cells in the graph formed by connecting adjacent cells.
This function will return 'null' if the cells are too far apart to compute the distance.
Inputs
Source table [Table]
First H3 index column [Column]
Second H3 index column [Column]
Outputs
Result table [Table]:
External links
Description
This component adds a new column containing the indices of H3 cells corresponding to points.
Inputs
Source table [Table]
Points column [Column]
Resolution [Number]
Outputs
Result table [Table]
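For reference, a minimal sketch of the kind of Analytics Toolbox call involved, assuming the toolbox is installed under `carto-un.carto` (the location varies per installation) and a hypothetical points table:
SELECT
  *,
  `carto-un.carto.H3_FROMGEOGPOINT`(geom, 9) AS h3  -- resolution 9, as an example
FROM `project.dataset.my_points`;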
External links
Description
Returns the neighboring indexes in all directions within the given K distance (k-ring size).
Inputs
Source table [Table]
H3 index [Column]
Size [Number]
Include distance [Boolean]: whether to add a column with the H3 distance value.
Include columns [Boolean]: whether to copy the source columns into the result table.
Outputs
Result table [Table]:
External links
Description
This generates a table with indices of all H3 cells included within a given extent for a given resolution.
The extent is specified using a table with polygons.
Inputs
Source table [Table]
Geo column [Column]
Resolution [Number]
Keep input table columns: If checked, all the original columns of the source table will be added to the result table. Otherwise, only columns with the geometry and the H3 index will be added.
Mode:
Center: Returns only the cells whose centroid intersects with the input features.
Intersects: Returns all cells that intersect with the input features.
Contains: Returns only the cells that are entirely contained within the input features.
Outputs
Result table [Table]:
External links
Description
Adds a new column named h3_parent with the value of the parent H3 cell at a specific resolution. If the required resolution is higher than that of the H3 column, a NULL value will be set.
Inputs
Source table [Table]
H3 index [Column]
Resolution [Number]
Outputs
Result table [Table]:
External links
Description
This component takes a table with Quadbin cell indices and generates a new one with an additional column containing the boundaries of those Quadbin cells.
Inputs
Source table [Table]
Quadbin column [Column]
Outputs
Result table [Table]
External links
Description
This component adds a new column containing the centers of Quadbin cells.
Inputs
Source table [Table]
Quadbin column [Column]
Outputs
Result table [Table]
External links
Description
This component adds a new column containing the indices of Quadbin cells corresponding to points.
Inputs
Source table [Table]
Points column [Column]
Resolution [Number]
Outputs
Result table [Table]
External links
Description
Returns all Quadbin cell indexes and their distances in a filled square k-ring of a user-defined size centered at the origin, in no particular order.
Inputs
Source table [Table]
Quadbin index [Column]
Size [Number]
Include distance [Boolean]: whether to add a column with the Quadbin distance value.
Include columns [Boolean]: whether to copy the source columns into the result table.
Outputs
Result table [Table]
External links
Description
This generates a table with indices of all Quadbin cells included within a given extent for a given resolution.
The extent is specified using a table with polygons.
Inputs
Source table [Table]
Geo column [Column]
Resolution [Number]
Mode:
Center: Returns only the cells whose centroid intersects with the input features.
Intersects: Returns all cells that intersect with the input features.
Contains: Returns only the cells that are entirely contained within the input features.
Outputs
Result table [Table]
External links
Description
It adds a new column named quadbin_parent with the value of the parent Quadbin cell at a specific resolution. If the required resolution is higher than that of the Quadbin column, an "Invalid resolution" query error will be returned.
Inputs
Source table [Table]
Quadbin index [Column]
Resolution [Number]
Outputs
Result table [Table]
External links
A workflow can be executed via an API call, which allows merging workflows with bigger processes, running analytics pipelines from external applications and further integrations that enable a wide range of use cases.
CARTO Workflows translates the analytical pipelines designed in the canvas to cloud native SQL queries that are executed directly on the data warehouse. This generated SQL code has different forms:
The code displayed in the Workflows UI contains all the control structures and the creation of intermediate tables for each node in the workflow.
A stored procedure that is created in the data warehouse and executed when a workflow execution is triggered via API call. This stored procedure might have input parameters that are used later in the code, created from the workflow variables marked as 'Parameters'.
The ability of workflows to generate SQL code that can be executed via an API call represents a significant advancement in integrating complex analytical processes with external systems. This capability not only streamlines the process of executing workflows in response to external triggers but also opens up a myriad of possibilities for automation and scalability within data analysis projects.
To enable API access for an existing workflow, click on the three dots in the upper-right corner and find 'API'. Click on 'Enable API access' and you will see a dialog screen that looks like this:
The endpoint displayed in this dialog depends on the configuration of the API Output component.
When using Sync, the /query endpoint will be used. The response will contain the data.
When using Async (default), the /job endpoint will be used. The response will contain job metadata that can be used to poll the status and find the table that stores the result.
This is an example of an API call that would trigger the execution of a workflow:
Synchronous GET example:
Asynchronous POST example:
There are a few things that are worth taking into consideration:
It is actually a GET or POST request to CARTO's SQL API, which can be used to get the result of a query directly in the response, or to create asynchronous jobs that are executed in the data warehouse.
The query is a CALL statement for a stored procedure. Workflows will generate a stored procedure in your schema/dataset. The generated SQL is the same that you would obtain from the SQL tab in the Workflows UI.
The queryParameters object contains the values for the parameters defined in the workflow, matching the named parameters of the CALL statement.
The provided API call is authorized with an API Access Token that has the specific grants necessary to run that query through the same connection that was used to create the workflow.
The provided example uses curl to illustrate the POST request that triggers the execution of the workflow. This example call can easily be adapted to other languages and HTTP clients, such as the requests library in Python.
The response to that call will look like the following:
You can use that externalId to make a request to check the status of the execution, like:
The above is just an example that uses a specific API URL (gcp-us-east1.api.carto.com) and connection name. Please make sure that you use the same API endpoint that you obtained from the UI and that you used to create the job in the first place.
The response to that API call is a JSON that contains metadata about the query execution, including a status object with information about the execution status.
The following is an example of a failed execution due to incorrect parameter types:
While this would be an example of a successful execution:
Get more information about using the job endpoint in the CARTO SQL API documentation.
In order to define which node of your workflow will be used as the output of the API call, you need to use the API Output component. This component ensures that the content of the node connected to it is stored in a temporary table, whose location is returned in the "workflowOutputTableName" object of the API call's response.
The location of the temporary table is returned when the job is created, but the content of the table won't be available until the execution has finished.
API executions of the same workflow with different queryParameters will produce different output tables. On the other hand, API executions with the same queryParameters will yield the same result, as the same output table will be reused.
This behavior can be controlled with the Cache settings in your Workflows UI. Turning it off will always force re-execution of the workflow, even when consecutive API calls with the same queryParameters are performed.
Whenever you want to propagate changes in your workflow to the corresponding stored procedure that is executed with an API call, you need to update it.
For that, just click on the chip in the top header that says 'API enabled', which will open the API endpoint modal. Wait for a couple of seconds while CARTO checks for changes in the workflow and you will see this:
Click on 'Update' to sync the workflow that is executed via API with the current state of the workflow in the UI.
If you need to prevent API access for a workflow, just click on 'Disable API access' in the API endpoint modal and confirm:




https://gcp-us-east1.api.carto.com/v3/sql/your_connection/query?q=CALL `workflows-api-demo.workflows_temp.wfproc_f2f8df5df4ddf279`(@number_of_clusters,@buffer_radius,@store_type)&queryParameters={"number_of_clusters":2,"buffer_radius":500,"store_type":"Supermarket"}&access_token=eyJhbGc(...)GUH-Lw

curl --location 'https://gcp-us-east1.api.carto.com/v3/sql/your_connection/job' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer eyJhbGc(...)GUH-Lw' \
--data '{
"query": "CALL `workflows-api-demo.workflows_temp.wfproc_f2f8df5df4ddf279`(@number_of_clusters,@buffer_radius,@store_type)",
"queryParameters": {"number_of_clusters":2,"buffer_radius":500,"store_type":"Supermarket"}
}'

{
"externalId": "job_h8UWzpdzX0s2XAAXQd3rdWCdPUT9",
"accountId": "ac_jfakef5m",
"userId": "auth0|61164b7xxx77c006a259f53",
"connectionId": "5a32a0ea-555a-48dd-aeb1-8768aae8ef1c",
"metadata": {},
"createdAt": "2023-10-04T16:10:35.732Z",
"query": "CALL `workflows-api-demo.workflows_temp.wfproc_f2f8df5df4ddf279`(@number_of_clusters,@buffer_radius,@store_type)",
"jobMetadata": {
"location": "US",
"workflowOutputTableName": "workflows-api-demo.workflows_temp.wfproc_f2f8df5df4ddf279_out_33afd785675f081d"
},
"token": "eyJhbGc(...)GUH-Lw"
}

curl --location 'https://gcp-us-east1.api.carto.com/v3/sql/your_connection/job/job_h8UWzpdzX0s2XAAXQd3rdWCdPUT9' \
--header 'Authorization: Bearer eyJhbGc(...)GUH-Lw'

"status": {
"errorResult": {
"reason": "invalidQuery",
"location": "query",
"message": "Query error: Cannot coerce expression @number_of_clusters to type INT64 at [1:158]"
},
"state": "DONE"
} "status": {
"state": "DONE"
}

To improve performance and to be able to inspect results in intermediate steps, Workflows makes use of temporary data objects that are stored, by default, under a so-called carto_temp schema/dataset in your data warehouse.
Note that at the time of creating the connection to your data warehouse, you can also provide a custom location for storing the temporary data generated by Workflows. Check the Connections documentation for each provider for more details.
This system of intermediate tables functions as a cache, avoiding re-executing all the steps in a workflow when they have not been modified; when you re-run a workflow, only the steps that have changed, or whose inputs have changed, are executed again.
In order for CARTO to be able to execute workflows in your data warehouse and temporarily store the data generated in intermediate steps of your analysis, there is a specific set of permissions required on the connection selected when creating your workflow.
Since CARTO Workflows can be used with different providers (Google BigQuery, Snowflake, Amazon Redshift, and PostgreSQL), adapting the different terminology for each data warehouse, the recommended setup is:
The service account or user account used for creating the connection must have the following required roles:
BigQuery Data Editor
BigQuery User
To learn more about data warehouse connection permissions, please check the Connections documentation for your provider.
When your workflow runs, it stores the output results in intermediate tables. Depending on the data warehouse platform, a different strategy is applied to clean up the intermediate tables generated by CARTO Workflows. The default configuration removes the intermediate tables after 30 days.
Below, you can find code and instructions for automating the process of cleaning up the intermediate tables in each data warehouse:
When your workflow runs in BigQuery or CARTO Data Warehouse, these tables are temporary and will be automatically removed after 30 days.
You can find more information about temporary tables in the BigQuery documentation.
When your workflow runs in Snowflake, a scheduled task is automatically created in the WORKFLOWS_TEMP schema to remove the intermediate tables after 30 days. This task is executed every day at 0h UTC.
These are the required permissions to create the task:
Note that the task affects only the tables in that schema. If the name of the schema is changed in the connection, for example to CARTO_TEMP_CUSTOM, a new task will be created for the new schema.
When executing workflows from the UI, intermediate tables are created for each node in the workflow. Depending on the connection used for your workflow, these intermediate tables will be reused, avoiding recreation when the workflow structure or settings have not changed.
In order to control this behavior, you will find the Cache Settings menu, next to the 'Run' button:
When enabled, which is the default option, it will try and reuse previous successful executions of the workflow and its intermediate steps, with some differences depending on the data warehouse:
CARTO Data Warehouse, BigQuery and Snowflake: Intermediate and result tables will be reused as long as the workflow structure has not changed.
This setting applies to executions from the UI, triggered by clicking on the 'Run' button.
This setting also applies to (beta).
When disabled, all intermediate and result tables will always be recomputed in all execution modes (UI, Schedule and API call), regardless of updates to source tables or parameter values.
Components to compute operations and measures on geometries/geographies, or to alter geometry/geography size or shape.
Description
This component creates a new table with a distance matrix computed for the geometries contained in an input table.
The output table contains three columns: 'id1', 'id2' and 'distance'.
Inputs
Components that, given the specifications in the input, generate new sets of spatial data: points, lines, polygons and even tilesets.
Description
This component creates a new table containing a 'geom' column and a single row containing a geography that represents a bounding box as defined by the input min/max values.
Inputs
Source table [Table]
Geo column [Column]
Id column [Column]
Outputs
Result table [Table]
Description
This component calculates the ellipsoidal direct point-to-point, point-to-edge, or the drive distance between two sets of spatial objects.
There are three types of distance calculation relationships available:
Point-to-point: The Distance tool calculates the distance between two sets of point type spatial objects. An example is calculating the distance from records in a customer data table (point type) to records in a store data table (point type).
Point-to-edge (line): The Distance tool calculates the distance between a set of point type spatial objects and the closest edge of a set of line type spatial objects. An example is calculating the distance from the records in a customer data table (point type) to the nearest interstate highway (edge of line spatial type).
Point-to-edge (polygon): The Distance tool calculates the distance between a set of point type spatial objects and the closest edge of a set of polygon type spatial objects.
When the points are contained within the polygons, you can choose one of two options: return a distance of zero, or return the distance to the nearest edge. If a point is located inside a polygon, the DISTANCE component returns a negative number for the distance to the edge of the containing polygon.
When the points fall outside the polygons, you can also choose to have the DISTANCE component calculate the nearest interior point within the nearest edge of a polygon.
While this method of calculating distance takes the curvature of the Earth into consideration, it is not a routing routine and does not follow street patterns or honor physical barriers such as bodies of water.
Inputs
Source table [Table]
Point or centroid source [Column]
Point, line or polygon destination [Column]
When a point is inside a polygon [Selection]
Create nearest interior point inside of the polygon [Boolean]
Output direction in degrees [Boolean]
Output cardinal direction [Boolean]
Units [Selection]
Outputs
Result table [Table]
Description
This component identifies the shortest distance between spatial objects in one table and the objects in a second table.
There are many use cases for this component. For example, to find the closest store locations to consumers in the customer file (both point tables), identify the closest cell towers (point tables) to LATAs (polygon tables), or select congressional districts (polygon tables) within 50 miles of a major thoroughfare (line table).
Inputs
Targets [Table]
Universe [Table]
Target Spatial Object Field [Column]
Universe Spatial Object Field [Column]
Id column in target table [Column]
Id column in universe table [Column]
Maximum Distance [Number]
Units [Selection]
Outputs
Result table [Table]
Description
This component generates a new table that is ready to be visualized with the CARTO Maps API. Learn more about this in the Performance Considerations section.
Inputs
Source table
Settings
Geo column: Select a binary column that contains WKB geometries.
Outputs
Output table: This component returns a table with the same schema as the input one, plus additional columns describing the feature bounding box; the geometry column is stored as WKB binaries and the table is clustered by the bounding box columns. Learn more.
Description
This component extracts tabular information about the spatial object. Attributes like area, spatial object, number of parts, number of points, and centroid Latitude /Longitude coordinates can be revealed.
Inputs
Source table [Table]
Spatial Object Field [Column]
Area (Square Kilometers) [Boolean]
Area (Square Miles) [Boolean]
Bounding Rectangle as Points [Boolean]
Bounding Rectangle as Polygon [Boolean]
Bounding Rectangle as X and Y Fields [Boolean]
Centroid as Spatial Object [Boolean]
Centroid as X and Y Fields [Boolean]
End Points as Spatial Object [Boolean]
End Points as X and Y Fields [Boolean]
Length (Kilometers) [Boolean]
Length (Miles) [Boolean]
Number of Parts [Boolean]
Number of Points [Boolean]
Spatial Object Type [Boolean]
Outputs
Result table [Table]
Description
This component creates a new table with an additional column containing the area of the geographies in the input table. This only applies to the case of polygon geographies. For other geography types, the value will be zero.
Inputs
Source polygons table [Table]
Geo column [Column]
Outputs
Result table [Table]
External links
Description
This component adds a new column with a buffer of the original input geography.
Inputs
Source table [Table]
Geo column [Column]
Distance [Number]
Outputs
Result table [Table]
External links
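For reference, the following is a minimal sketch of roughly equivalent Spatial SQL in BigQuery syntax; the table and column names and the 500-meter distance are assumptions for the example only.

-- Illustrative only: add a buffer column around each input geography
SELECT
  t.*,
  ST_BUFFER(t.geom, 500) AS buffer_geom  -- buffer distance in meters (assumed value)
FROM `my_project.my_dataset.my_table` AS t;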
Description
This component creates a new table with a new column containing points representing the centroids of the original input geographies.
Inputs
Source table [Table]
Geo column [Column]
Outputs
Result table [Table]
External links
Description
This component creates a new table containing a single row with the concave hull of point geographies in the input table. The concave hull is stored in a column named 'geo'.
Inputs
Source table [Table]
Geo column [Column]
Max edge (km) [Number]
Outputs
Result table [Table]
External links
Description
This component creates a new table containing a single row with the convex hull of geographies in the input table. The convex hull is stored in a column named 'geo'.
Inputs
Source table [Table]
Geo column [Column]
Outputs
Result table [Table]
External links
Description
This component calculates the minimum distance between geographies of an input table and those of a second table.
It adds two new columns: 'nearest_id' and 'nearest_distance'.
Inputs
Main table [Table]
Secondary table [Table]
Geo column in main table [Column]
Geo column in secondary table [Column]
Id column in main table [Column]
Id column in secondary table [Column]
Radius [Number]
Outputs
Result table [Table]
External links
Description
This component checks whether geographies from an input table and those from a second table that match a given join criterion are within a given distance of each other.
Inputs
Source table
Source table
Settings
Geo column in main table
Geo column in second table
Column in main table for joining
Column in second table for joining
Max distance
Outputs
Output table
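A minimal sketch of roughly equivalent SQL (BigQuery syntax); the table names, the join key and the 1000-meter threshold are assumptions for the example only.

-- Illustrative only: keep pairs of rows that match on a key and whose
-- geographies are within 1000 meters of each other
SELECT
  a.*,
  b.geom AS geom_b
FROM `my_project.my_dataset.table_a` AS a
JOIN `my_project.my_dataset.table_b` AS b
  ON a.join_key = b.join_key
WHERE ST_DWITHIN(a.geom, b.geom, 1000);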
Description
This component creates a new table with an additional 'length' column containing the length of the lines in the input table.
Inputs
Source lines table [Table]
Geo column [Column]
Outputs
Result table [Table]
External links
Description
This component takes a table and columns containing lines and distances. It adds a new column with points along the lines, at a specified distance from the start of each line.
Inputs
Source table [Table]
Lines column [Column]
Distance [Number]
Outputs
Result table [Table]
External links
Description
This component adds a new column containing the perimeter of the geographies in the input table. This only applies to the case of polygon geographies. For other geography types, the value will be zero.
Inputs
Source polygons table [Table]
Geo column [Column]
Outputs
Result table [Table]
External links
Description
This component adds a new column containing a simplified version of the original input geography.
Inputs
Source table [Table]
Geo column [Column]
Tolerance [Number]
Outputs
Result table [Table]
External links
Description
This component takes an input table and creates a new one, replacing its geo column with a new column containing the original input geography snapped to a grid of a specified size (in degrees).
Arbitrary grid sizes are not supported. The grid_size parameter is rounded so that it is of the form 10^n, where -10 < n < 0.
Inputs
Source table [Table]
Geo column [Column]
Grid size (degrees) [Number]
Outputs
Result table [Table]
External links
Description
This component subdivides large features into smaller ones, using a quadgrid. It returns multiple rows, each with a smaller portion of the input feature and the rest of the columns.
Inputs
Source table [Table]
Geo column [column]
Size: The size is the approximate length, in meters, of the cell's side used for the subdivision.
Description
This component creates regions around specified point objects in the input table.
Inputs
Source table [Table]
Spatial Object Field of Point Source [Column]
Radius, Doughnuts [String]
Units [Selection]
Include Point in Output [Boolean]
Eliminate Overlap [Boolean]
Outputs
Result table [Table]
Minimum X [Number]
Minimum Y [Number]
Maximum X [Number]
Maximum Y [Number]
Outputs
Result table [Table]
Description
This component creates a new table containing the geographies of the input table clipped to the combined boundary of the geographies in another table. Geographies in this second table must be of type polygon.
Inputs
Table to clip [Table]
Polygons table [Table]
Geo column in table to clip [Column]
Geo column in polygons table [Column]
Outputs
Result table [Table]
Description
This component creates a new table containing points distributed in a regular grid, according to a given interval. Point geographies are stored in a column named 'geo'. The extent to cover is defined by an ancillary table with geographies.
Inputs
Reference table (used for extent) [Table]
Geo column [Column]
Interval (degrees) [Number]
Outputs
Result table [Table]
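A minimal sketch of a roughly equivalent query (BigQuery syntax). The fixed bounding box values and the 0.1-degree interval are assumptions for the example; the component itself derives the extent from the reference table.

-- Illustrative only: generate a regular grid of points over a fixed extent
SELECT ST_GEOGPOINT(lon, lat) AS geo
FROM UNNEST(GENERATE_ARRAY(-3.9, -3.5, 0.1)) AS lon,
     UNNEST(GENERATE_ARRAY(40.3, 40.6, 0.1)) AS lat;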
Description
This component computes isolines.
Inputs
Source table
Settings
Geo column: select a column that contains points that will be the origin point of the isoline.
Mode: Select a mode of transport for the isoline calculation.
Type of range: Select between time or distance for the isoline calculation.
Range value (seconds/meters): Input a numeric value that represents seconds if time was selected, or meters if distance was selected.
Provider-specific settings:
TravelTime:
Departure time: Take into account historical traffic conditions to generate the isoline. Accepted values are now or a timestamp in this format: YYYY-MM-DDThh:mm:ss, for example: 2024-01-31T09:30:00
Single shape (default true): Force the resulting isoline to be a single polygon.
TomTom
Departure time: Accepted values are any, now or a timestamp in this format: YYYY-MM-DDThh:mm:ss, for example: 2024-01-31T09:30:00
Use traffic information if available: Check this option to take into account historical traffic conditions to generate the isoline.
HERE
Arrival Time: any, now or timestamp (YYYY-MM-DDThh:mm:ss). Not compatible with Departure Time
Departure Time: any, now or timestamp (YYYY-MM-DDThh:mm:ss). Not compatible with Arrival time
Optimize for: balanced (default), quality, performance.
Routing mode: fast (default), short
Origin radius (m): Consider any position in the radius of the origin point as origin.
Advanced Options:
Isoline Options: A JSON object that can contain other provider specific options, for example: {"option1":"value", "option2": 123}. Check each provider documentation for a complete reference of available options in the External links section below.
Outputs
Result table: The component generates a table that contains the following columns:
All columns from input table.
An orig_geom column that contains the original point geometry in the selected geo column.
A geom column that contains the generated isoline.
A carto_isoline_metadata column that can contain additional information, usually details about why the isoline failed. In most scenarios it will be null if the isoline was generated correctly.
External links
Description
This component creates routes between origin and destination points stored in a table.
Inputs
Source table [Table]
Settings
Origin column [Column]
Destination column [Column]
Name for column with routes [String]
Mode [Selection]
Outputs
Result table [Table]
External links
Description
This component computes the routing matrix between a set of origins and destinations, and returns their distances and durations.
Inputs
Origin table: A table that contains points with the origin points for the routes.
Destination table: A table that contains points with the destination points for the routes.
Settings
Type of transport: Select the type of transport for the route computation. The available options will depend on the provider. See the 'External links' reference for a full description of the options.
Departure time: (Optional) Set the departure time for the route computation. Date-time as "<YYYY-MM-DD>T<hh:mm:ss>". Default is "now", or "any" if your provider is TomTom.
Travel time: (Not available with TomTom) The maximum time allowed for the route calculation. If an origin/destination pair produces a route that is longer than this setting (in seconds), it won't be computed.
Advanced settings: a free-text input that supports a JSON formatted string with advanced options. See 'External links' section below for a list of additional options supported on each provider.
Outputs
Result table: This component generates a table that is the cartesian product of both input tables with additional columns for the route computation.
External links
On July 17th, 2024, this component was deprecated in favor of the components in the Tileset Creation category.
Description
This component creates a tileset.
Inputs
Source table [Table]
Geo column [Column]
Tileset name [String]
Tileset description [String]
Minimum zoom [Number]
Maximum zoom [Number]
Minimum zoom column [Column]
Maximum zoom column [Column]
Maximum tile size (KB) [Number]
Tile feature order [String]
Drop duplicates [Boolean]
Outputs
External links
Description
This component creates a new table containing a single row with a single point geography. The point is stored in a column named 'geo'.
Inputs
Longitude [Number]
Latitude [Number]
Outputs
Result table [Table]
Description
This component generates a new geo column with polygons without holes.
Inputs
Source polygons table [Table]
Geo column [Column]
Outputs
Result table [Table]
Description
This component performs high-level spatial object editing from a simple, single component. You can combine multiple objects or cut the spatial objects of the input table.
With this component you can combine objects, cut 1st from 2nd, cut 2nd from 1st, create the intersection object or create the inverse intersection object. Define the action that you want to do using the “Action” field.
Inputs
source table [Table]
1st Spatial Field [Column]
2nd Spatial Field [Column]
Action [Selection]
Outputs
Result table [Table]
Description
This component adds a new column with the boundary of the input geography.
Inputs
Source table [Table]
Geo column [Column]
Outputs
Result table [Table]
External links
Description
This component creates a new column containing rectangles that represents the bounding box of each geometry.
Inputs
Source table [Table]
Geo column [Column]
Outputs
Result table [Table]
External links
Description
This component creates a new table containing a 'geom' column and a single row containing a geography that represents the bounding box of the input table.
Inputs
Source table [Table]
Geo column [Column]
Outputs
Result table [Table]
External links
Description
This component generates a table with a specified number of random points within an extent.
The extent is specified using a table with polygons.
Inputs
Source table [Table]
Geo column [Column]
Number of points [Number]
Outputs
Result table [Table]
External links
Description
This component geocodes an input table.
It generates a new table with an additional column named 'geo' containing the geocoded addresses.
Inputs
Source table [Table]
Column with addresses [Column]
Country [Selection]
Outputs
Table with geocoded rows [Table]
Table with rows that could not be geocoded [Table]
External links
Description
This component creates a new table with an additional column with line geographies. Geographies are created based on another column containing arrays of points.
Inputs
Source table [Table]
Points column [Column]
Outputs
Result table [Table]
External links
Description
This component creates a new table with an additional column with polygon geographies. Geographies are created based on another column containing polylines.
Inputs
Source table [Table]
Polylines column [Column]
Outputs
Result table [Table]
External links
Description
This component adds a new column with polygons, created from a column containing arrays of lines representing polygon edges.
Inputs
Source table [Table]
Geo column [Column]
Outputs
Result table [Table]
External links
-- If the workflows_temp schema already exists:
-- Grant permissions to manage the schema
GRANT ALL PRIVILEGES ON SCHEMA DATABASE_NAME.WORKFLOWS_TEMP TO ROLE ROLE_NAME;
-- If the workflows_temp schema does not exist yet:
-- Grant permissions to create the schema
GRANT USAGE ON DATABASE DATABASE_NAME TO ROLE ROLE_NAME;
Your Databricks user will need to have USE CATALOG and CREATE SCHEMA permission.
GRANT USE CATALOG ON CATALOG my_catalog TO [email protected];
GRANT CREATE SCHEMA ON CATALOG my_catalog TO [email protected];

-- If the workflows_temp schema already exists:
-- Grant permissions to manage the schema
GRANT ALL ON SCHEMA workflows_temp TO user_name;
-- If the workflows_temp schema does not exist yet:
-- Grant permissions to create the schema
GRANT CREATE ON DATABASE database_name TO user_name;

The task is created once per schema, and will be removed when the schema is removed. It can also be removed manually:

DROP TASK IF EXISTS DATABASE_NAME.CARTO_TEMP."carto_scheduler.workflow_clear_cache.workflows_temp"

In Databricks, a user with enough privileges will need to set up a workflow or notebook that runs periodically to clean up the content of the carto_temp schema created in the selected catalog.
When your workflow runs in Redshift, in order to remove the intermediate tables, you should create a scheduled query in Redshift to remove datasets older than 30 days.
Scheduled queries are only available for provisioned Redshift clusters. If you are using Redshift Serverless, you'll have to execute the clean-up manually, e.g. by calling the procedure defined below, or find other external means of executing it periodically.
The following procedure should be created in your Redshift cluster to delete tables older than 30 days (you will have to replace all the occurrences of the $database$ placeholder with the name of your database):
CREATE OR REPLACE PROCEDURE $database$.carto_temp._clear_cache_fn() as
$$
DECLARE
statement RECORD;
query VARCHAR(MAX);
BEGIN
query := 'SELECT
\'DROP TABLE $database$.workflows_temp.\' || relname || \';\' as statement
After that, you should define a Redshift scheduled query with the following CALL to execute the previous procedure once per day:
When your workflow runs in PostgreSQL, a pg_cron task is automatically created in the carto_temp schema to remove the intermediate tables after 30 days. This task is executed periodically every day at 0h UTC.
Here are some resources to learn how to use pg_cron on different managed services based on PostgreSQL:
Amazon RDS. Usage guide available here.
Amazon Aurora PostgreSQL supports pg_cron since version 12.
Google Cloud SQL. Setup instructions are available in the Google Cloud documentation.
Azure Database for PostgreSQL supports pg_cron and provides setup instructions in its documentation.
This is the command to install the pg_cron extension:
These are the required permissions to create the pg_cron task:
Note that the task affects only the tables in that schema. If the name of the schema is changed in the connection, for example to carto_temp_custom, a new task will be created for the new schema.
Redshift and PostgreSQL: Intermediate tables are never reused. This means that all nodes in a workflow are always computed completely.
This setting doesn't have an effect on executions from the UI, triggered by clicking on the 'Run' button.
This setting also doesn't have an effect on Scheduled executions (only available for PostgreSQL connections).
For workflows executed via API call, the table will be reused between API calls that have the exact same parameter values. If parameters are changed, the output table will be recomputed.

-- Grant permissions to manage the schema
GRANT ALL PRIVILEGES ON SCHEMA DATABASE_NAME.CARTO_TEMP TO ROLE ROLE_NAME;
-- Grant permissions to execute the task
GRANT EXECUTE TASK ON ACCOUNT TO ROLE ROLE_NAME;

-- If the workflows_temp schema already exists:
-- Grant permissions to manage the schema
GRANT ALL ON SCHEMA workflows_temp TO user_name;
-- If the workflows_temp schema does not exist yet:
-- Grant permissions to create the schema
GRANT CREATE ON DATABASE database_name TO user_name;

CALL $database$.carto_temp._clear_cache_fn();

CREATE EXTENSION pg_cron;

-- Grant permissions to manage the schema
GRANT ALL ON SCHEMA carto_temp TO user_name;
-- Grant permissions to manage the task
GRANT USAGE ON SCHEMA cron TO user_name;
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA cron TO user_name;

SELECT cron.unschedule('carto_scheduler.workflow_clear_cache.carto_temp');

Once you have designed a workflow, you might want it to be executed periodically. This allows the result of a workflow to be automatically updated if its input data sources have been modified between one execution and the next. For this purpose, CARTO offers a UI that helps you create a scheduled query in different data warehouses.
Keep reading this page to learn more about scheduling the execution of a workflow:
Once a workflow has been executed successfully, you can click on the schedule button that you will find next to 'Run'.
Clicking on that button will open a new modal window where you can select the frequency and other settings for the schedule:
Hours: The workflow will be executed every X hours, at o'clock times.
Days: The workflow will be executed every day at a specific time.
Weeks: The workflow will be executed weekly, on a specific day, at a specific time.
Once a workflow has been scheduled, there are different actions that can be performed:
If your workflow has been modified, after a successful execution you will find this small notification.
Click on the schedule icon to be able to sync the latest changes to the scheduled version of your workflow.
By clicking on the schedule icon at any time, you will be able to modify the frequency at which the workflow will be executed. Select a different setting and click on 'Save changes'.
Same as above, by clicking on the schedule icon at any time, you will find a 'Delete schedule' button that will permanently disable the schedule.
In BigQuery, scheduled tasks are stored in the project. It allows the creation of tasks in different regions.
The connection requires the following roles:
BigQuery User
BigQuery Data Editor
The project used in the connection will need to have this API enabled:
BigQuery Data Transfer API:
For connections created using 'Sign in with Google', a Data Transfer Version Info code needs to be provided.
Once there, select your Google account and Allow access:
After allowing, you will be provided with a code. Copy it and paste it into the Data Transfer Version Info setting in the connection's settings.
The scheduled queries can be monitored from the BigQuery console, in the "Scheduled queries" section. Notice that the name of the scheduled query will be "carto_scheduler.workflows.<id>". The id of the workflow is a UUID that can be found in the URL.
In Snowflake, scheduled tasks are created in a schema. In this case, it uses the workflows_temp schema. If the schema is removed, the task will be removed too.
Snowflake makes use of tasks to allow scheduled execution of SQL code. Tasks can be created via SQL statements using the CREATE TASK syntax.
It requires access to the workflows_temp schema, and also a global grant "EXECUTE TASK" to be able to start a task.
The created tasks can be monitored from "Data > Databases > MYDATABASE > MYSCHEMA > Tasks". Notice that the name of the scheduled query will be "carto_scheduler.workflows.<id>". The id of the workflow is a UUID that can be found in the URL.
When scheduling a CARTO Workflow on a Databricks connection, a Databricks job is created with a specific run schedule. In order to do so, your user will need the "Can Create" and "Can Manage" permissions for Jobs.
The "Can Create" permission allows a user to create new jobs. Specifically, users with this permission can:
Define new jobs/workflows with tasks and configurations.
Attach tasks to clusters they have access to.
Set parameters like schedules, libraries, and task dependencies during job creation.
The "Can Manage" permission provides full control over a job and is the highest level of access. Users with "Can Manage" permissions can:
Create, edit, delete, and clone jobs.
Modify schedules, task configurations, and parameters.
Manage job runs (e.g., start, stop, or restart runs).
Assign or modify permissions for the job, delegating "Can Edit" or "Can View" permissions to others.
For PostgreSQL, the pg_cron extension provides the means to schedule the execution of a SQL query. Here are some resources to learn how to leverage pg_cron on different managed services based on PostgreSQL:
Amazon RDS. Usage guide available in the AWS documentation.
Amazon Aurora PostgreSQL supports pg_cron since version 12.
Google Cloud SQL. Extension setup instructions are available in the Google Cloud documentation.
It requires the extension pg_cron to be installed and available. Also, access to the cron schema.
The created jobs can be monitored by running SQL queries in the database:
Notice that the name of the scheduled query will be "carto_scheduler.workflows.<id>". The id of the workflow is a UUID that can be found in the URL.
Redshift doesn't offer an API to handle scheduled queries, which prevents CARTO from being able to schedule the execution of a workflow directly from the UI. However, users can schedule workflows on Redshift by following these simple steps:
Go to the "SQL" tab of the results panel of your workflow.
Click on the "Copy" button at the top right of the SQL tab to copy the SQL code corresponding to your entire workflow.
Redshift allows creating a schedule to run SQL statements using the query editor. There are more details about how to use this feature in the Amazon Redshift documentation.
Custom: Use a custom expression to define the schedule. See below for more information.





GRANT ALL PRIVILEGES ON SCHEMA <schema> TO ROLE <role>;
GRANT EXECUTE TASK ON ACCOUNT TO ROLE <role>;

REVOKE ALL PRIVILEGES ON SCHEMA <schema> FROM ROLE <role>;
REVOKE EXECUTE TASK ON ACCOUNT FROM ROLE <role>;

CREATE EXTENSION pg_cron;
GRANT USAGE ON SCHEMA cron TO postgres;
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA cron TO postgres;

REVOKE USAGE ON SCHEMA cron FROM postgres;
REVOKE EXECUTE ON ALL FUNCTIONS IN SCHEMA cron FROM postgres;
DROP EXTENSION pg_cron;

-- Existing jobs
select * from cron.job;
-- Executed jobs
select * from cron.job_run_details;

Source table [Table]
Conditional expressions: The UI of this component helps you create a conditional expression involving multiple columns and SQL operators. Each expression will produce a different result, as configured in the component.
Result Column [Column]: Select a column that will contain the specified resulting value.
Description
This component casts the content of a column to a given type
Inputs
Source table [Table]
Column [Column]
New type [Selection]
Outputs
Result table [Table]
Description
This component adds a new column with an array containing the values in a set of selected columns.
Inputs
Source table [Table]
Columns
Array column name
Outputs
Result table [Table]
Description
This component creates a new table with an additional column computed using an expression.
Inputs
Source table [Table]
Name for new column [String]
Expression [String]
Outputs
Result table [Table]
External links
Description
This component generates a new table with the same content as the input one, except one of its columns.
The component will fail if the column to remove is the only one in the input table.
Inputs
Source table [Table]
Column [Column]
Outputs
Result table [Table]
External links
Description
This component simplifies the process of modifying table schemas. It allows you to select specific columns, with the option to adjust their names and data types as required.
Inputs
Source table [Table]
Columns: The component's UI allows selecting a column, giving a new name and selecting a data type to cast the column.
Outputs
Result table [Table]
Description
This component creates a new column with values extracted from the JSON strings in another column. It uses the Data Warehouse syntax to specify the path to the key that needs to be extracted. See the documentation links below for more information.
Inputs
Source table [Table]
JSON column [Column]
JSON path [expression]
New column [Column]
Output
Result table
External links
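For example, in BigQuery the equivalent extraction could look like the sketch below; the table name, column name and JSON path are assumptions, and other data warehouses use their own JSON path syntax.

-- Illustrative only: extract the value of the "city" key from a JSON string column
SELECT
  t.*,
  JSON_VALUE(t.properties_json, '$.city') AS city
FROM `my_project.my_dataset.my_table` AS t;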
Description
This component finds a string in one column of a table and replaces it with the specified value from another table.
As an alternative, columns from the lookup table can be added to the original table in those rows where the searched string is found. This is regulated by the Mode parameter.
Inputs
Source table [Table]
Find within column [Column]
Lookup table [Table]
Find value column [Column]
Replacement column [Column]
Find mode [Selection]
Case insensitive [Boolean]
Match whole word [Boolean]
Mode [Selection]
Columns to append [Column] [Multiple]: only used if Append field(s) to record mode is selected
Outputs
Result table [Table]
Description
This component creates a new table with an additional UUID column named id.
Inputs
Source table [Table]
Outputs
Result table [Table]
Description
This component converts a column from geography to geometry data type.
Inputs
Source table [Table]
geography column [Column]
Outputs
Result table [Table]
Description
This component converts a column from geometry to geography data type.
Inputs
Source table [Table]
Geometry column [Column]
Outputs
Result table [Table]
Description
This component creates a hex color for each distinct value of an input string column. NULL values will be associated with a grey color. The component generates a copy of the source table with a new string column named [name_of_input_col] + '_hex_color'.
Inputs
Source table [Table]
Column with category values
Outputs
Result table [Table]
Description
This component filters an input table using the presence or absence of null values in a given column.
Inputs
Source table [Table]
Column [Column]
Outputs
Not null values table [Table]
Null values table [Table]
Description
This component creates a new table with only the N first rows of the input table.
Inputs
Source table [Table]
Number of rows [Number]
Outputs
Result table [Table]
External links
Description
This component computes new values based on a given expression and a set of fields to apply the expression to. Use $a to refer to the value of the current column.
Inputs
Source table[Table]
Expression [String]. The expression to apply
Mode [Selection]. The mode used to put new values in the table
Prefix [String]. Only for the case of mode='Create new columns'
Columns [Column][Multiple]. The columns to apply the formula to
Outputs
Result table [Table]
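For instance, an expression such as the one below could be applied to one or several selected numeric columns, with $a replaced by the value of each column as described above (the transformation chosen here is only an illustrative assumption):

-- Illustrative expression for this component:
-- round each selected column to 2 decimals and replace NULLs with 0
ROUND(COALESCE($a, 0), 2)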
Description
This component creates a new table containing a new column computed using a multi-row formula based on one or several input columns.
To refer to a value in the previous row, use {colname - 1} and to refer to a value in the next row, use {colname + 1}.
Inputs
Table[Table]
New column name [String]
New column type [Selection]
Expression [String]
Value for missing row values [Selection]
Column to sort by [Column]
Column to group by [Column]
Outputs
Result table [Table]
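For example, assuming a numeric column named sales and rows sorted by date (both assumptions for the example), the following expression uses the previous-row/next-row syntax described above to compute a simple three-row moving average:

-- Illustrative multi-row expression: 3-row moving average of the "sales" column
({sales - 1} + sales + {sales + 1}) / 3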
Description
This component normalizes the values of a given column.
It adds a new column named '[column_name]_norm'.
Normalization can be computed as 0-1 values or as z-scores
Inputs
Source table [Table]
Column to normalize [Column]
Use z-scores [Boolean].
Disabled (default): the resulting normalized values will range between 0 and 1.
Enabled: the normalized value will be calculated as a z-score (standard score): the number of standard deviations that the value is above or below the mean of the whole column.
Outputs
Result table [Table]
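A minimal sketch of roughly equivalent SQL using window functions (BigQuery syntax; the table and column names are assumptions):

-- Illustrative only: 0-1 (min-max) normalization and z-score of a numeric column
SELECT
  t.*,
  SAFE_DIVIDE(t.value - MIN(t.value) OVER (),
              MAX(t.value) OVER () - MIN(t.value) OVER ()) AS value_norm,
  SAFE_DIVIDE(t.value - AVG(t.value) OVER (),
              STDDEV_SAMP(t.value) OVER ()) AS value_zscore
FROM `my_project.my_dataset.my_table` AS t;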
Description
This component generates a new table containing the rows of an input one sorted according to the values in one of its columns, and an optional second column.
Columns to use cannot be of type geometry.
Inputs
Table to order [Table]
Column to order by [Column]
Use descending order [Boolean]
Optional secondary column to order by [Column]
Use descending order in secondary column [Boolean]
Outputs
Result table [Table]
External links
Description
This component creates new columns with values extracted from the JSON string in other columns. It uses the data warehouse syntax to specify the path to the key that needs to be extracted.
Input
Source table. This component expects an input table containing at least one column with a JSON object.
Settings
Select JSON column: This allows selecting a string column that contains a JSON object
Add column name: Type a name for a new column that will be added with the content of a specific JSON key.
Add JSON path: Type a JSON path expression, in the syntax of your data warehouse to identify the key to be extracted
Output
Output table: this component generates a table with the same schema as the input source table, plus an additional column per JSON key specified in the Settings.
External links
Description
This component takes a group of spatial point objects and draws a polygon or polyline in a specific sort order to represent that group of points.
This component can also be used for spatial layer development by translating a collection of GPS data into polygon or polyline objects, where a polygon is a simple bounded region, such as a state boundary, and a polyline contains multiple line segments with any number of points between its start and endpoints, such as a river or road.
Inputs
Build Method [Selection]
Source table [Table]
Source Field [Column]
Source Field [Column]
Sequence Field [Column]
Outputs
Result table [Table]
Description
This component splits polygon or polyline objects into their component point, line, or region objects.
This is a very specialized component used for spatial layer development. A typical use of this component is to disaggregate complex regions that may contain more than one polygon or to separate a polyline into its individual nodes.
Inputs
Source table [Table]
Spatial Field [Column]
Split To [Selection]
Outputs
Result table [Table]
Description
This component takes an input table and generates a new one in which duplicate rows from the input table have been removed.
Inputs
Source table [Table]
Outputs
Result table [Table]
Description
This component generates a new table with the same content as the input one, renaming one or multiple of its columns.
Inputs
Source table [Table]
Column to rename [Column]
New column name [String]
Outputs
Result table [Table]
External links
Description
This component creates a new table with an additional column containing row numbers.
Inputs
Source table [Table]
Outputs
Result table [Table]
Description
This component generates a new table with a random sample of N rows from an input table.
Inputs
Source table [Table]
Number of rows to sample [Number]
Outputs
Result table [Table]
Description
This component runs an arbitrary SELECT statement.
It generates a new table from a SELECT statement applied on the input table.
Inputs
Source table [Table]
SELECT statement [String]
Outputs
Result table [Table]
External links
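For example, a SELECT statement like the sketch below could be used. Here we assume, for illustration only, that the component exposes the input table through a $a placeholder (check the component's own placeholder syntax in your version) and that the referenced columns exist in the input table.

-- Illustrative only: aggregate the component's input table
SELECT
  category,
  COUNT(*) AS num_rows,
  AVG(revenue) AS avg_revenue
FROM $a
GROUP BY category;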
Description
This component generates a new table with the unique values that appear in a given column of an input table.
Inputs
Source table [Table]
Column [Column]
Outputs
Result table [Table]
External links
Description
This component filters an input table according to a filter expression based on a single column.
It generates a new table with only the rows of the input table that meet the filter criteria and another one with those that do not meet it.
Inputs
Source table [Table]
Column [Column]
Operator [Selection]
Value [String]
Outputs
Table with rows that pass the filter [Table]
Table with rows that do not pass the filter [Table]
Description
This component filters an input table using a spatial predicate and a filter table.
It generates a new table with only the rows of the input table that meet the filter criteria and another one with those that do not meet it.
Inputs
Source table [Table]
Filter table [Table]
Geo column in source table [Column]
Geo column in filter table [Column]
Spatial predicate [Selection]
Outputs
Table with rows that pass the filter [Table]
Table with rows that do not pass the filter [Table]
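A minimal sketch of roughly equivalent SQL for the 'intersects' predicate, covering only the first output table (BigQuery syntax; table and column names are assumptions). The component additionally produces the complementary table with the rows that do not pass the filter.

-- Illustrative only: rows of the source table that intersect the combined filter geography
SELECT s.*
FROM `my_project.my_dataset.source_table` AS s
JOIN (
  SELECT ST_UNION_AGG(geom) AS geom
  FROM `my_project.my_dataset.filter_table`
) AS f
  ON ST_INTERSECTS(s.geom, f.geom);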
Description
This component sets the SRID of a geo column
Inputs
Source table [Table]
Geo column [Column]
SRID [String]
Outputs
Result table [Table]
Description
This component adds new columns based on splitting the text string in a text column.
Inputs
Table [Table]
Column to split [Column]
Delimiters [String]
Mode [Selection]. Whether to add new columns or new rows with the split strings
Number of new columns: Only used if mode = 'Split to columns'
Prefix for new column names [String]: Only used if mode = 'Split to columns'
Extra characters: What to do with extra characters if there are more tokens after dividing the string according to the delimiters than the ones defined in the 'Number of new columns' parameter
Outputs
Result table [Table]
Description
This component rotates table columns into rows.
Inputs
Table to unpivot [Table]
Key columns [Column][Multiple]: The columns to use for identifying rows
Data columns [Column][Multiple]: The columns to use for key-value pairs
Outputs
Result table [Table]
Description
This component separates unique rows and duplicated rows. Unique rows are defined using the values in one or several columns of the input table, as defined by the user.
Inputs
Tables [Table]
Columns to find unique values [Column][Multiple]
Outputs
Table with unique rows [Table]
Table with duplicated rows [Table]
Description
This component filters an input table according to a filter expression.
It generates a new table with only the rows of the input table that meet the filter criteria and another one with those that do not meet it.
Inputs
Source table [Table]
Filter expression [String]
Outputs
Table with rows that pass the filter [Table]
Table with rows that do not pass the filter [Table]
External links
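For example, a filter expression like the one below could be used (the column names are assumptions); rows for which it evaluates to true go to the first output table and the rest go to the second one.

-- Illustrative filter expression
population > 10000 AND country = 'ES'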
Result table [Table]
Components in CARTO Workflows are, together with the data sources, the available resources to build your analytical workflows. To use these elements, drag and drop them from the Components Explorer to the Workflow canvas.
In this section you can find the list of all components in CARTO Workflows and their availability in the different cloud data warehouse platforms. Please check this page regularly, as we are continuously incorporating new components and completing their availability across platforms.
This section also contains information about Extension Packages, which extend Workflows' functionality for specific use cases.
Extension Packages can be created for workflows that use BigQuery and Snowflake connections.
In addition to the extensions created by external developers, partners and other third-parties, CARTO provides a collection of extension packages developed and maintained internally. These are described below, and are distributed via .zip files that can be installed from the Workflows UI.
Workflows' functionality can be extended and tailored to specific use cases by creating custom components, which can be grouped and distributed together as Extension Packages.