
data


This module contains functions and procedures that make use of data (either Data Observatory or user-provided data) for their computations.

For the DATAOBS_ procedures, you can find the location of the subscription tables (and their fully qualified names) in the corresponding data warehouse by opening the subscription page in the Data Explorer, selecting the "Access in" option, and picking the data warehouse where the data has been made available.

DATAOBS_ENRICH_GRID

DATAOBS_ENRICH_GRID(grid_type, input_query, input_index_column, variables, filters, output, source)

Description

This procedure enriches a query containing grid cell indexes of one of the supported types (H3, Quadbin) with data from the Data Observatory. The user must be subscribed to all the Data Observatory datasets involved in the enrichment. The cells must all have the same resolution.

If the enrichment data is indexed by an H3 or Quadbin grid compatible with the input (same grid type and equal or greater resolution), then the enrichment will be performed much more efficiently by matching the index values rather than intersecting associated GEOGRAPHY elements.

As a result of this process, each input grid cell will be enriched with the data of the Data Observatory datasets that spatially intersect it. When the input cell intersects with more than one polygon, point, or line of the Data Observatory datasets, the data is aggregated using the aggregation methods specified.

Valid aggregation methods are:

  • COUNT: computes the number of Data Observatory features that contain the enrichment variable and intersect the input geography.

If the enrichment of an input table needs to be repeated, note that dropping the previously added columns will cause problems in subsequent enrichments, since BigQuery retains those columns for 7 days for time-travel purposes. We recommend storing the original table columns in a temporary table, dropping the input table, and then recreating the input table from the temporary table.
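The recommended workaround can be sketched as follows. All table and column names here are illustrative; substitute your own:

```sql
-- 1. Save the original (pre-enrichment) columns in a backup table.
CREATE TABLE `my-project.my-dataset.my-table-backup` AS
SELECT index, original_column
FROM `my-project.my-dataset.my-table`;

-- 2. Drop the previously enriched table entirely, so the enrichment
--    columns are not retained by BigQuery's time-travel window.
DROP TABLE `my-project.my-dataset.my-table`;

-- 3. Recreate the input table from the backup and run the enrichment again.
CREATE TABLE `my-project.my-dataset.my-table` AS
SELECT * FROM `my-project.my-dataset.my-table-backup`;
```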

Input parameters

  • grid_type: STRING Type of grid: "h3" or "quadbin".

  • input_query: STRING query to be enriched (Standard SQL); this query must produce valid grid indexes for the selected grid type in a column of the proper type (STRING for H3, and INT64 for Quadbin). It can include additional columns with data associated with the grid cells that will be preserved. A qualified table name can be given as well, e.g. 'project-id.dataset-id.table-name'.

  • input_index_column: STRING name of a column in the query that contains the grid indexes.

  • output: ARRAY<STRING>|NULL containing the name of an output table to store the results and optionally an SQL clause that can be used to partition it. The name of the output table should include project and dataset, e.g. ['project-id.dataset-id.table-name'] or ['project-id.dataset-id.table-name', 'PARTITION BY number']. If NULL, the enrichment result is returned. When the output table is the same as the input, the input table will be enriched in place.

  • source: STRING name of the location where the Data Observatory subscriptions of the user are stored, in 'project-id.dataset-id' format. If only the 'dataset-id' is included, it uses the project 'carto-data' by default.

Output

The output table will contain all the input columns provided in the input_query and one extra column for each variable in variables, named after its corresponding slug and including a suffix indicating the aggregation method used.

If a new output table is created, it will be clustered by the spatial index to optimize performance when filtering by it or joining it to other grid tables. This is important for visualizing the results on a map efficiently. If an SQL clause is included in the output parameter, this optimization will not be performed.

Examples

CALL `carto-un`.carto.DATAOBS_ENRICH_GRID(
  'h3',
  R'''
  SELECT * FROM UNNEST(['8718496d8ffffff','873974865ffffff','87397486cffffff','8718496daffffff','873974861ffffff','8718496dbffffff','87397494bffffff','8718496ddffffff','873974864ffffff']) AS index
  ''',
  'index',
  [('population_14d9cf55', 'sum')],
  NULL,
  ['my-project.my-dataset.my-enriched-table'],
  'my-dataobs-project.my-dataobs-dataset'
)
-- The table 'my-project.my-dataset.my-enriched-table' will be created
-- with columns: index, population_14d9cf55_sum
CALL `carto-un-eu`.carto.DATAOBS_ENRICH_GRID(
  'h3',
  R'''
  SELECT * FROM UNNEST(['8718496d8ffffff','873974865ffffff','87397486cffffff','8718496daffffff','873974861ffffff','8718496dbffffff','87397494bffffff','8718496ddffffff','873974864ffffff']) AS index
  ''',
  'index',
  [('population_14d9cf55', 'sum')],
  NULL,
  ['my-project.my-dataset.my-enriched-table'],
  'my-dataobs-project.my-dataobs-dataset'
)
-- The table 'my-project.my-dataset.my-enriched-table' will be created
-- with columns: index, population_14d9cf55_sum
CALL carto.DATAOBS_ENRICH_GRID(
  'h3',
  R'''
  SELECT * FROM UNNEST(['8718496d8ffffff','873974865ffffff','87397486cffffff','8718496daffffff','873974861ffffff','8718496dbffffff','87397494bffffff','8718496ddffffff','873974864ffffff']) AS index
  ''',
  'index',
  [('population_14d9cf55', 'sum')],
  NULL,
  ['my-project.my-dataset.my-enriched-table'],
  'my-dataobs-project.my-dataobs-dataset'
)
-- The table 'my-project.my-dataset.my-enriched-table' will be created
-- with columns: index, population_14d9cf55_sum
CALL `carto-un`.carto.DATAOBS_ENRICH_GRID(
  'h3',
  'my-project.my-dataset.my-table',
  'index',
  [('population_14d9cf55', 'sum')],
  NULL,
  ['my-project.my-dataset.my-table'],
  'my-dataobs-project.my-dataobs-dataset'
)
-- The column population_14d9cf55_sum will be added to the table
-- 'my-project.my-dataset.my-table'.
CALL `carto-un-eu`.carto.DATAOBS_ENRICH_GRID(
  'h3',
  'my-project.my-dataset.my-table',
  'index',
  [('population_14d9cf55', 'sum')],
  NULL,
  ['my-project.my-dataset.my-table'],
  'my-dataobs-project.my-dataobs-dataset'
)
-- The column population_14d9cf55_sum will be added to the table
-- 'my-project.my-dataset.my-table'.
CALL carto.DATAOBS_ENRICH_GRID(
  'h3',
  'my-project.my-dataset.my-table',
  'index',
  [('population_14d9cf55', 'sum')],
  NULL,
  ['my-project.my-dataset.my-table'],
  'my-dataobs-project.my-dataobs-dataset'
)
-- The column population_14d9cf55_sum will be added to the table
-- 'my-project.my-dataset.my-table'.


DATAOBS_ENRICH_GRID_RAW

DATAOBS_ENRICH_GRID_RAW(grid_type, input_query, input_index_column, variables, filters, output, source)

Description

This procedure enriches a query containing grid cell indexes of one of the supported types (H3, Quadbin) with data from the Data Observatory. The user must be subscribed to all the Data Observatory datasets involved in the enrichment. The cells must all have the same resolution.

As a result of this process, each input grid cell will be enriched with the data from the Data Observatory datasets that spatially intersect it. The variable values corresponding to all intersecting Data Observatory features for a given input cell will be returned in an ARRAY column. When variables come from multiple Data Observatory geographies, one ARRAY column will be included for each source geography table. Data Observatory geography slugs are used for the names of these columns. Each array contains STRUCTs with one field for each variable (named after the variable slug) and additional measure fields __carto_intersection, __carto_total, __carto_dimension. See the output information for more details.

If the enrichment data is indexed by an H3 or Quadbin grid compatible with the input (same grid type and equal or greater resolution), then the enrichment will be performed much more efficiently by matching the index values rather than intersecting associated GEOGRAPHY elements. In this case the additional measure fields are omitted.

If the enrichment of an input table needs to be repeated, note that dropping the previously added columns will cause problems in subsequent enrichments, since BigQuery retains those columns for 7 days for time-travel purposes. We recommend storing the original table columns in a temporary table, dropping the input table, and then recreating the input table from the temporary table.

Input parameters

  • grid_type: STRING Type of grid: "h3" or "quadbin".

  • input_query: STRING query to be enriched (Standard SQL); this query must produce valid grid indexes for the selected grid type in a column of the proper type (STRING for H3, and INT64 for Quadbin). It can include additional columns with data associated with the grid cells that will be preserved. A qualified table name can be given as well, e.g. 'project-id.dataset-id.table-name'.

  • input_index_column: STRING name of a column in the query that contains the grid indexes.

  • output: ARRAY<STRING>|NULL containing the name of an output table to store the results and optionally an SQL clause that can be used to partition it. The name of the output table should include project and dataset, e.g. ['project-id.dataset-id.table-name'] or ['project-id.dataset-id.table-name', 'PARTITION BY number']. If NULL, the enrichment result is returned. When the output table is the same as the input, the input table will be enriched in place.

  • source: STRING name of the location where the Data Observatory subscriptions of the user are stored, in 'project-id.dataset-id' format. If only the 'dataset-id' is included, it uses the project 'carto-data' by default.

Output

The output table will contain all the input columns provided in the input_query and one extra ARRAY column for each Data Observatory geography containing enrichment variables, named after their corresponding geography slug. The array contains STRUCTs with one field for each variable, using the variable slug as the field name. Additional fields will be included with information about the intersection of the grid cell and the Data Observatory geographies unless the grid matching described above is performed.

  • __carto_dimension dimension of the Data Observatory geography: 2 for areas (polygons), 1 for lines, and 0 for points.

  • __carto_intersection area in square meters (for dimension = 2) or length in meters (for dimension = 1) of the intersection.

  • __carto_total area in square meters (for dimension = 2) or length in meters (for dimension = 1) of the Data Observatory feature.

If a new output table is created, it will be clustered by the spatial index to optimize performance when filtering by it or joining it to other grid tables. This is important for visualizing the results on a map efficiently. If an SQL clause is included in the output parameter, this optimization will not be performed.
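As an illustration of how the RAW output can be consumed, a query like the following computes an area-weighted sum from the returned ARRAY of STRUCTs. The table, column, and slug names match the examples below and are otherwise hypothetical; note that the __carto_* fields are only present when the grid-matching shortcut was not used:

```sql
-- Each element of wp_grid100m_10955184 is a STRUCT for one intersecting
-- Data Observatory feature; weight its variable by the intersected fraction.
SELECT
  index,
  (SELECT SUM(f.population_93405ad7 * f.__carto_intersection / f.__carto_total)
   FROM UNNEST(wp_grid100m_10955184) AS f) AS population_weighted_sum
FROM `my-project.my-dataset.my-enriched-table`;
```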

Examples

CALL `carto-un`.carto.DATAOBS_ENRICH_GRID_RAW(
  'h3',
  R'''
  SELECT * FROM UNNEST(['8718496d8ffffff','873974865ffffff','87397486cffffff','8718496daffffff','873974861ffffff','8718496dbffffff','87397494bffffff','8718496ddffffff','873974864ffffff']) AS index
  ''',
  'index',
   ['population_93405ad7'],
   NULL,
   ['my-project.my-dataset.my-enriched-table'],
   'my-dataobs-project.my-dataobs-dataset'
);
-- The table 'my-project.my-dataset.my-enriched-table' will be created
-- with columns: index, and wp_grid100m_10955184.
-- Column wp_grid100m_10955184 will have the fields population_93405ad7,
-- __carto_dimension, __carto_intersection and __carto_total.
CALL `carto-un-eu`.carto.DATAOBS_ENRICH_GRID_RAW(
  'h3',
  R'''
  SELECT * FROM UNNEST(['8718496d8ffffff','873974865ffffff','87397486cffffff','8718496daffffff','873974861ffffff','8718496dbffffff','87397494bffffff','8718496ddffffff','873974864ffffff']) AS index
  ''',
  'index',
   ['population_93405ad7'],
   NULL,
   ['my-project.my-dataset.my-enriched-table'],
   'my-dataobs-project.my-dataobs-dataset'
);
-- The table 'my-project.my-dataset.my-enriched-table' will be created
-- with columns: index, and wp_grid100m_10955184.
-- Column wp_grid100m_10955184 will have the fields population_93405ad7,
-- __carto_dimension, __carto_intersection and __carto_total.
CALL carto.DATAOBS_ENRICH_GRID_RAW(
  'h3',
  R'''
  SELECT * FROM UNNEST(['8718496d8ffffff','873974865ffffff','87397486cffffff','8718496daffffff','873974861ffffff','8718496dbffffff','87397494bffffff','8718496ddffffff','873974864ffffff']) AS index
  ''',
  'index',
   ['population_93405ad7'],
   NULL,
   ['my-project.my-dataset.my-enriched-table'],
   'my-dataobs-project.my-dataobs-dataset'
);
-- The table 'my-project.my-dataset.my-enriched-table' will be created
-- with columns: index, and wp_grid100m_10955184.
-- Column wp_grid100m_10955184 will have the fields population_93405ad7,
-- __carto_dimension, __carto_intersection and __carto_total.
CALL `carto-un`.carto.DATAOBS_ENRICH_GRID_RAW(
  'h3',
  'my-project.my-dataset.my-table',
  'index',
   ['population_93405ad7'],
   NULL,
   ['my-project.my-dataset.my-table'],
   'my-dataobs-project.my-dataset'
);
-- The column wp_grid100m_10955184 will be added to the table
-- 'my-project.my-dataset.my-table'.
-- Column wp_grid100m_10955184 will have the fields population_93405ad7,
-- __carto_dimension, __carto_intersection and __carto_total.
CALL `carto-un-eu`.carto.DATAOBS_ENRICH_GRID_RAW(
  'h3',
  'my-project.my-dataset.my-table',
  'index',
   ['population_93405ad7'],
   NULL,
   ['my-project.my-dataset.my-table'],
   'my-dataobs-project.my-dataset'
);
-- The column wp_grid100m_10955184 will be added to the table
-- 'my-project.my-dataset.my-table'.
-- Column wp_grid100m_10955184 will have the fields population_93405ad7,
-- __carto_dimension, __carto_intersection and __carto_total.
CALL carto.DATAOBS_ENRICH_GRID_RAW(
  'h3',
  'my-project.my-dataset.my-table',
  'index',
   ['population_93405ad7'],
   NULL,
   ['my-project.my-dataset.my-table'],
   'my-dataobs-project.my-dataset'
);
-- The column wp_grid100m_10955184 will be added to the table
-- 'my-project.my-dataset.my-table'.
-- Column wp_grid100m_10955184 will have the fields population_93405ad7,
-- __carto_dimension, __carto_intersection and __carto_total.

DATAOBS_ENRICH_GRID_WEIGHTED

DATAOBS_ENRICH_GRID_WEIGHTED(grid_type, input_query, input_index_column, variables, filters, weight_variable, custom_weight_query, do_weight_filters, output, source)

Description

This procedure enriches a query containing grid cell indexes of one of the supported types (H3, Quadbin) with data from the Data Observatory, using a specified feature as the weight for the enrichment. The weight is applied proportionally to each intersection segment relative to the original feature: the attribution of a variable to each intersected segment is given by the value of the weight feature in the intersection over its value in the total original feature. The user must be subscribed to all the Data Observatory datasets involved in the enrichment. The cells must all have the same resolution. The feature used for the weights can either come from the Data Observatory or be provided by the user.

If the enrichment data is indexed by an H3 or Quadbin grid compatible with the input (same grid type and equal or greater resolution), then the enrichment will be performed much more efficiently by matching the index values rather than intersecting associated GEOGRAPHY elements.

As a result of this process, each input grid cell will be enriched with the data of the Data Observatory datasets that spatially intersect it, weighted by the specified feature. When the input cell intersects more than one polygon, point, or line of the Data Observatory datasets, the data is aggregated using the specified aggregation methods, with weights derived from the specified feature.

Valid aggregation methods are:

  • COUNT: computes the number of Data Observatory features that contain the enrichment variable and intersect the input geography.

If the enrichment of an input table needs to be repeated, note that dropping the previously added columns will cause problems in subsequent enrichments, since BigQuery retains those columns for 7 days for time-travel purposes. We recommend storing the original table columns in a temporary table, dropping the input table, and then recreating the input table from the temporary table.

Input parameters

  • grid_type: STRING Type of grid: "h3" or "quadbin".

  • input_query: STRING query to be enriched (Standard SQL); this query must produce valid grid indexes for the selected grid type in a column of the proper type (STRING for H3, and INT64 for Quadbin). It can include additional columns with data associated with the grid cells that will be preserved. A qualified table name can be given as well, e.g. 'project-id.dataset-id.table-name'.

  • input_index_column: STRING name of a column in the query that contains the grid indexes.

  • variables: ARRAY<STRUCT<variable STRING, aggregation STRING>> variables of the Data Observatory that will be used to enrich the input query. For each variable, its slug and the aggregation method to be used must be provided. Valid aggregation methods are: SUM, AVG, MAX, MIN, COUNT.

  • filters: ARRAY<STRUCT<dataset STRING, expression STRING>>|NULL filters to be applied to the Data Observatory datasets used in the enrichment. Each filter applies an SQL expression to the dataset identified by its slug. Pass NULL to apply no filters.

  • weight_variable: STRUCT<variable STRING, aggregation STRING> variable that will be used to weight the intersections of the input cells with the Data Observatory polygons, lines, or points. Its slug and the aggregation method must be provided. Valid aggregation methods are: SUM, AVG, MAX, MIN, COUNT. This variable can either originate from the Data Observatory or be provided by the user, and it is mandatory; if NULL, an error is raised.

  • custom_weight_query: STRING query that contains the custom variable to be used as weight, together with a geography column named geom. This field is mandatory when a custom variable is passed in weight_variable; otherwise it must be set to NULL.

  • do_weight_filters: ARRAY<STRUCT<dataset STRING, expression STRING>>|NULL filters to be applied to the Data Observatory dataset providing the weight variable. Pass NULL to apply no filters.

  • output: ARRAY<STRING>|NULL containing the name of an output table to store the results and optionally an SQL clause that can be used to partition it. The name of the output table should include project and dataset, e.g. ['project-id.dataset-id.table-name'] or ['project-id.dataset-id.table-name', 'PARTITION BY number']. If NULL the enrichment result is returned. When the output table is the same as the input, the input table will be enriched in place.

  • source: STRING name of the location where the Data Observatory subscriptions of the user are stored, in 'project-id.dataset-id' format. If only the 'dataset-id' is included, it uses the project 'carto-data' by default.

Output

The output table will contain all the input columns provided in the input_query and one extra column for each variable in variables, named after its corresponding slug and including a suffix indicating the aggregation method used.

If a new output table is created, it will be clustered by the spatial index to optimize its performance when filtering data by it or using it to join to other grid tables. This is important to visualize the results in a map efficiently. If an SQL clause is included in the output parameter this optimization will not be performed.

Examples

CALL `carto-un`.carto.DATAOBS_ENRICH_GRID_WEIGHTED(
  'h3',
  R'''
  SELECT * FROM UNNEST(['8718496d8ffffff','873974865ffffff','87397486cffffff','8718496daffffff','873974861ffffff','8718496dbffffff','87397494bffffff','8718496ddffffff','873974864ffffff']) AS index
  ''',
  'index',
  [('DI_PC_f7d891ba','avg'),('c02_05_gasto_m__23eb3ae8','avg')],
  NULL,
  ('population_1ff8e0d','sum'),
  NULL,
  NULL,
  ['my-project.my-dataset.my-enriched-table'],
  'my-dataobs-project.my-dataobs-dataset'
)
-- The table 'my-project.my-dataset.my-enriched-table' will be created
-- with columns: index, DI_PC_f7d891ba_avg, c02_05_gasto_m__23eb3ae8_avg
CALL `carto-un`.carto.DATAOBS_ENRICH_GRID_WEIGHTED(
  'h3',
  'my-project.my-dataset.my-table',
  'index',
  [('DI_PC_f7d891ba','avg'),('c02_05_gasto_m__23eb3ae8','avg')],
  NULL,
  ('population_1ff8e0d','sum'),
  NULL,
  NULL,
  ['my-project.my-dataset.my-table'],
  'my-dataobs-project.my-dataobs-dataset'
)
-- The columns DI_PC_f7d891ba_avg and c02_05_gasto_m__23eb3ae8_avg
-- will be added to the table 'my-project.my-dataset.my-table'.
CALL `carto-un`.carto.DATAOBS_ENRICH_GRID_WEIGHTED(
  'h3',
  'my-project.my-dataset.my-table',
  'index',
  [('DI_PC_f7d891ba','avg'),('c02_05_gasto_m__23eb3ae8','avg')],
  NULL,
  ('population','sum'),
  '''
  SELECT geoid, geom, population FROM `my-project.my-dataset.custom_weight_query`
  ''',
  NULL,
  ['my-project.my-dataset.my-table'],
  'my-dataobs-project.my-dataobs-dataset'
)

DATAOBS_ENRICH_POINTS

DATAOBS_ENRICH_POINTS(input_query, input_geography_column, variables, filters, output, source)

Description

This procedure enriches a query containing geographic points with data from the Data Observatory. The user must be subscribed to all the Data Observatory datasets involved in the enrichment.

As a result of this process, each input point will be enriched with the data of the Data Observatory datasets that spatially intersect it. When the input point intersects with more than one polygon, point or line of the Data Observatory datasets, the data is aggregated using the aggregation methods specified.

Valid aggregation methods are: SUM, MIN, MAX, AVG, and COUNT.

If the enrichment of an input table needs to be repeated, note that dropping the previously added columns can cause problems in subsequent enrichments, because BigQuery keeps dropped columns for 7 days for time travel purposes. We recommend storing the original table columns in a temporary table, dropping the input table, and then recreating the input table from the temporary table.

Input parameters

  • input_query: STRING query to be enriched (Standard SQL). A qualified table name can be given as well, e.g. 'project-id.dataset-id.table-name'.

  • input_geography_column: STRING name of the GEOGRAPHY column in the query containing the points to be enriched.

  • variables: ARRAY<STRUCT<variable STRING, aggregation STRING>> variables of the Data Observatory that will be used to enrich the input points. For each variable, its slug and the aggregation method to be used must be provided. Valid aggregation methods are: SUM, MIN, MAX, AVG, and COUNT.

  • filters: ARRAY<STRUCT<dataset STRING, expression STRING>>|NULL filters to be applied to the Data Observatory datasets used in the enrichment. Each filter applies an SQL expression to the dataset identified by its slug. Pass NULL to apply no filters.

  • output: ARRAY<STRING>|NULL containing the name of an output table to store the results and optionally an SQL clause that can be used to partition it. The name of the output table should include project and dataset, e.g. ['project-id.dataset-id.table-name'] or ['project-id.dataset-id.table-name', 'PARTITION BY number']. If NULL the enrichment result is returned. When the output table is the same as the input, the input table will be enriched in place.

  • source: STRING name of the location where the Data Observatory subscriptions of the user are stored, in 'project-id.dataset-id' format. If only the 'dataset-id' is included, it uses the project 'carto-data' by default.

Output

The output table will contain all the input columns provided in the input_query and one extra column for each variable in variables, named after its corresponding slug and including a suffix indicating the aggregation method used.

If a new output table is created, it will be clustered by the geography column to optimize the performance of spatial filters and joins. This is important to visualize the results in a map efficiently. If an SQL clause is included in the output parameter this optimization will not be performed.

Examples

CALL `carto-un`.carto.DATAOBS_ENRICH_POINTS(
   R'''
   SELECT id, geom FROM `my-project.my-dataset.my-table`
   ''',
   'geom',
   [('population_93405ad7', 'sum')],
   NULL,
   ['my-project.my-dataset.my-enriched-table'],
   'my-dataobs-project.my-dataobs-dataset'
);
-- The table 'my-project.my-dataset.my-enriched-table' will be created
-- with columns: id, geom, population_93405ad7_sum
CALL `carto-un`.carto.DATAOBS_ENRICH_POINTS(
   'my-project.my-dataset.my-table',
   'geom',
   [('population_93405ad7', 'sum')],
   NULL,
   ['my-project.my-dataset.my-table'],
   'my-dataobs-project.my-dataobs-dataset'
);
-- The column population_93405ad7_sum will be added to the table
-- 'my-project.my-dataset.my-table'.

DATAOBS_ENRICH_POINTS_RAW

DATAOBS_ENRICH_POINTS_RAW(input_query, input_geography_column, variables, filters, output, source)

Description

This procedure enriches a query containing geographic points with data from the Data Observatory. The user must be subscribed to all the Data Observatory datasets involved in the enrichment.

As a result of this process, each input point will be enriched with the data of the Data Observatory datasets that spatially intersect it. The variable values corresponding to all intersecting Data Observatory features for a given input point will be returned in an ARRAY column. When variables come from multiple Data Observatory geographies, one ARRAY column will be included for each source geography table. Data Observatory geography slugs are used for the names of these columns. Each array contains STRUCTs with one field for each variable (named after the variable slug) and additional measure fields __carto_total, __carto_dimension. See the output information for more details.

If the enrichment of an input table needs to be repeated, note that dropping the previously added columns can cause problems in subsequent enrichments, because BigQuery keeps dropped columns for 7 days for time travel purposes. We recommend storing the original table columns in a temporary table, dropping the input table, and then recreating the input table from the temporary table.

Input parameters

  • input_query: STRING query to be enriched (Standard SQL). A qualified table name can be given as well, e.g. 'project-id.dataset-id.table-name'.

  • input_geography_column: STRING name of the GEOGRAPHY column in the query containing the points to be enriched.

  • variables: ARRAY<STRING> slugs of the Data Observatory variables that will be used to enrich the input points.

  • filters: ARRAY<STRUCT<dataset STRING, expression STRING>>|NULL filters to be applied to the Data Observatory datasets used in the enrichment. Each filter applies an SQL expression to the dataset identified by its slug. Pass NULL to apply no filters.

  • output: ARRAY<STRING>|NULL containing the name of an output table to store the results and optionally an SQL clause that can be used to partition it. The name of the output table should include project and dataset, e.g. ['project-id.dataset-id.table-name'] or ['project-id.dataset-id.table-name', 'PARTITION BY number']. If NULL the enrichment result is returned. When the output table is the same as the input, the input table will be enriched in place.

  • source: STRING name of the location where the Data Observatory subscriptions of the user are stored, in 'project-id.dataset-id' format. If only the 'dataset-id' is included, it uses the project 'carto-data' by default.

Output

The output table will contain all the input columns provided in the input_query and one extra ARRAY column for each Data Observatory geography containing enrichment variables, named after their corresponding geography slug. The array contains STRUCTs with one field for each variable, using the variable slug as the field name. Additional fields will be included with information about the intersected enrichment geographies:

  • __carto_dimension dimension of the Data Observatory geography: 2 for areas (polygons), 1 for lines, and 0 for points.

  • __carto_total area in square meters (for dimension = 2) or length in meters (for dimension = 1) of the Data Observatory feature.

Moreover, another column named __carto_input_area will be added containing the area of the input geography in square meters.

If a new output table is created, it will be clustered by the geography column to optimize the performance of spatial filters and joins. This is important to visualize the results in a map efficiently. If an SQL clause is included in the output parameter this optimization will not be performed.

Examples

CALL `carto-un`.carto.DATAOBS_ENRICH_POINTS_RAW(
   R'''
   SELECT id, geom FROM `my-project.my-dataset.my-table`
   ''',
   'geom',
   ['population_93405ad7'],
   NULL,
   ['`my-project.my-dataset.my-enriched-table`'],
   'my-dataobs-project.my-dataobs-dataset'
);
-- The table `my-project.my-dataset.my-enriched-table` will be created
-- with columns: id, geom, __carto_input_area and wp_grid100m_10955184.
-- Column wp_grid100m_10955184 will have the fields population_93405ad7, __carto_dimension and __carto_total.

Imagine that you need some information about your points of interest. We'll get population information from the Data Observatory at those points:

CALL `carto-un`.carto.DATAOBS_ENRICH_POINTS_RAW(
  R'''
    SELECT
      'Point1' AS name, ST_GEOGPOINT(1,42) AS geom
    UNION ALL
    SELECT
      'Point2', ST_GEOGPOINT(-2,40) AS geom
  ''',
  'geom',
  ['population_f9004c56'],
  NULL,
  ['my-project.my-dataset.enriched_points'],
  'my-dataobs-project.my-dataobs-dataset'
)

Now let's compute the average density of population at each location:

SELECT
  name,
  AVG(enrichment.population_f9004c56 / NULLIF(enrichment.__carto_total, 0)) AS popdens_avg
FROM
  `my-project.my-dataset.enriched_points`, UNNEST(wp_grid1km_b16138c1) enrichment
GROUP BY name

Instead of creating a new table we could have modified the source table like this:

CALL `carto-un`.carto.DATAOBS_ENRICH_POINTS_RAW(
   'my-project.my-dataset.my-table',
   'geom',
   ['population_93405ad7'],
   NULL,
   ['my-project.my-dataset.my-table'],
   'my-dataobs-project.my-dataobs-dataset'
);
-- The columns __carto_input_area and wp_grid100m_10955184
-- will be added to the table 'my-project.my-dataset.my-table'.

DATAOBS_ENRICH_POLYGONS

DATAOBS_ENRICH_POLYGONS(input_query, input_geography_column, variables, filters, output, source)

Description

This procedure enriches a query containing geographic polygons with data from the Data Observatory. The user must be subscribed to all the Data Observatory datasets involved in the enrichment.

As a result of this process, each input polygon will be enriched with the data of the Data Observatory datasets that spatially intersect it. When the input polygon intersects with more than one polygon, point, or line of the Data Observatory datasets, the data is aggregated using the aggregation methods specified.

Valid aggregation methods are: SUM, MIN, MAX, AVG, and COUNT.

  • COUNT computes the number of Data Observatory features that contain the enrichment variable and are intersected by the input geography.

If the enrichment of an input table needs to be repeated, note that dropping the previously added columns can cause problems in subsequent enrichments, because BigQuery keeps dropped columns for 7 days for time travel purposes. We recommend storing the original table columns in a temporary table, dropping the input table, and then recreating the input table from the temporary table.

Input parameters

  • input_query: STRING query to be enriched (Standard SQL). A qualified table name can be given as well, e.g. 'project-id.dataset-id.table-name'.

  • input_geography_column: STRING name of the GEOGRAPHY column in the query containing the polygons to be enriched.

  • variables: ARRAY<STRUCT<variable STRING, aggregation STRING>> variables of the Data Observatory that will be used to enrich the input polygons. For each variable, its slug and the aggregation method to be used must be provided. Valid aggregation methods are: SUM, MIN, MAX, AVG, and COUNT.

  • filters: ARRAY<STRUCT<dataset STRING, expression STRING>>|NULL filters to be applied to the Data Observatory datasets used in the enrichment. Each filter applies an SQL expression to the dataset identified by its slug. Pass NULL to apply no filters.

  • output: ARRAY<STRING>|NULL containing the name of an output table to store the results and optionally an SQL clause that can be used to partition it. The name of the output table should include project and dataset, e.g. ['project-id.dataset-id.table-name'] or ['project-id.dataset-id.table-name', 'PARTITION BY number']. If NULL the enrichment result is returned. When the output table is the same as the input, the input table will be enriched in place.

  • source: STRING name of the location where the Data Observatory subscriptions of the user are stored, in 'project-id.dataset-id' format. If only the 'dataset-id' is included, it uses the project 'carto-data' by default.

Output

The output table will contain all the input columns provided in the input_query and one extra column for each variable in variables, named after its corresponding slug and including a suffix indicating the aggregation method used.

If a new output table is created, it will be clustered by the geography column to optimize the performance of spatial filters and joins. This is important to visualize the results in a map efficiently. If an SQL clause is included in the output parameter this optimization will not be performed.

Examples

CALL `carto-un`.carto.DATAOBS_ENRICH_POLYGONS(
   R'''
   SELECT id, geom FROM `my-project.my-dataset.my-table`
   ''',
   'geom',
   [('population_93405ad7', 'SUM')],
   NULL,
   ['`my-project.my-dataset.my-enriched-table`'],
   'my-dataobs-project.my-dataobs-dataset'
);
-- The table `my-project.my-dataset.my-enriched-table` will be created
-- with columns: id, geom, population_93405ad7_sum
CALL `carto-un`.carto.DATAOBS_ENRICH_POLYGONS(
   'my-project.my-dataset.my-table',
   'geom',
   [('population_93405ad7', 'SUM')],
   NULL,
   ['my-project.my-dataset.my-table'],
   'my-dataobs-project.my-dataobs-dataset'
);
-- The column population_93405ad7_sum will be added to the table
-- 'my-project.my-dataset.my-table'.

DATAOBS_ENRICH_POLYGONS_RAW

DATAOBS_ENRICH_POLYGONS_RAW(input_query, input_geography_column, variables, filters, output, source)

Description

This procedure enriches a query containing geographic polygons with data from the Data Observatory. The user must be subscribed to all the Data Observatory datasets involved in the enrichment.

As a result of this process, each input polygon will be enriched with the data of the Data Observatory datasets that spatially intersect it. The variable values corresponding to all intersecting Data Observatory features for a given input polygon will be returned in an ARRAY column. When variables come from multiple Data Observatory geographies, one ARRAY column will be included for each source geography table. Data Observatory geography slugs are used for the names of these columns. Each array contains STRUCTs with one field for each variable (named after the variable slug) and additional measure fields __carto_intersection, __carto_total, __carto_dimension. See the output information for more details.

If the enrichment of an input table needs to be repeated, note that dropping the previously added columns can cause problems in subsequent enrichments, because BigQuery keeps dropped columns for 7 days for time travel purposes. We recommend storing the original table columns in a temporary table, dropping the input table, and then recreating the input table from the temporary table.

Input parameters

  • input_query: STRING query to be enriched (Standard SQL). A qualified table name can be given as well, e.g. 'project-id.dataset-id.table-name'.

  • input_geography_column: STRING name of the GEOGRAPHY column in the query containing the polygons to be enriched.

  • variables: ARRAY<STRING> slugs of the Data Observatory variables that will be used to enrich the input polygons.

  • filters: ARRAY<STRUCT<dataset STRING, expression STRING>>|NULL filters to be applied to the Data Observatory datasets used in the enrichment. Each filter applies an SQL expression to the dataset identified by its slug. Pass NULL to apply no filters.

  • output: ARRAY<STRING>|NULL containing the name of an output table to store the results and optionally an SQL clause that can be used to partition it. The name of the output table should include project and dataset, e.g. ['project-id.dataset-id.table-name'] or ['project-id.dataset-id.table-name', 'PARTITION BY number']. If NULL the enrichment result is returned. When the output table is the same as the input, the input table will be enriched in place.

  • source: STRING name of the location where the Data Observatory subscriptions of the user are stored, in 'project-id.dataset-id' format. If only the 'dataset-id' is included, it uses the project 'carto-data' by default.

Output

If a new output table is created, it will contain all the input columns provided in the input_query and one extra ARRAY column for each Data Observatory geography containing enrichment variables, named after their corresponding geography slug. The array contains STRUCTs with one field for each variable, using the variable slug as the field name. Additional fields will be included with information about the intersection of the geographies:

  • __carto_dimension dimension of the Data Observatory geography: 2 for areas (polygons), 1 for lines, and 0 for points.

  • __carto_intersection area in square meters (for dimension = 2) or length in meters (for dimension = 1) of the intersection.

  • __carto_total area in square meters (for dimension = 2) or length in meters (for dimension = 1) of the Data Observatory feature.

Moreover, another column named __carto_input_area will be added containing the area of the input polygon in square meters.

The output table will be clustered by the geography column to optimize the performance of spatial filters and joins. This is important to visualize the results in a map efficiently. If an SQL clause is included in the output parameter this optimization will not be performed.

Examples

CALL `carto-un`.carto.DATAOBS_ENRICH_POLYGONS_RAW(
   R'''
   SELECT id, geom FROM `my-project.my-dataset.my-table`
   ''',
   'geom',
   ['population_93405ad7'],
   NULL,
   ['`my-project.my-dataset.my-enriched-table`'],
   'my-dataobs-project.my-dataobs-dataset'
);
-- The table `my-project.my-dataset.my-enriched-table` will be created
-- with columns: id, geom, __carto_input_area and wp_grid100m_10955184.
-- Column wp_grid100m_10955184 will have the fields population_93405ad7,
-- __carto_dimension, __carto_intersection and __carto_total.

Imagine that you need some information about the population in two areas of interest defined as a 100-meter buffer around two given points.

CALL `carto-un`.carto.DATAOBS_ENRICH_POLYGONS_RAW(
  R'''
    SELECT
      'Area1' AS name,
      `carto-un`.carto.ST_BUFFER(ST_GEOGPOINT(1,42),100,'meters',6) AS geom
    UNION ALL
    SELECT
      'Area2',
      `carto-un`.carto.ST_BUFFER(ST_GEOGPOINT(-2,40),100,'meters',6)
  ''',
  'geom',
  ['population_f9004c56'],
  NULL,
  ['my-project.my-dataset.enriched_area'],
  'my-dataobs-project.my-dataobs-dataset'
)

Now let's compute some aggregated statistics for our area:

  • The sum of the population, adjusted by the fraction of intersected enrichment areas

  • The average density of population, weighted by the intersection areas

SELECT
  name,
  SUM(enrichment.population_f9004c56 * enrichment.__carto_intersection / NULLIF(enrichment.__carto_total, 0)) AS population_sum,
  SUM(enrichment.population_f9004c56 / NULLIF(enrichment.__carto_total, 0) * enrichment.__carto_intersection) / NULLIF(SUM(enrichment.__carto_intersection), 0) AS popdens_avg
FROM
  `my-project.my-dataset.enriched_area`, UNNEST(wp_grid1km_b16138c1) enrichment
GROUP BY name

Instead of creating a new table we could have modified the source table like this:

CALL `carto-un`.carto.DATAOBS_ENRICH_POLYGONS_RAW(
   'my-project.my-dataset.my-table',
   'geom',
   ['population_93405ad7'],
   NULL,
   ['my-project.my-dataset.my-table'],
   'my-dataobs-project.my-dataobs-dataset'
);
-- Columns __carto_input_area and wp_grid100m_10955184 will be added
-- to the table 'my-project.my-dataset.my-table'.

DATAOBS_ENRICH_POLYGONS_WEIGHTED

DATAOBS_ENRICH_POLYGONS_WEIGHTED(input_query, input_geography_column, variables, filters, weight_variable, custom_weight_query, do_weight_filters, output, source)

Description

This procedure enriches a query containing geographic polygons with data from the Data Observatory. The user must be subscribed to all the Data Observatory datasets involved in the enrichment.

As a result of this process, each input polygon will be enriched with the data of the Data Observatory datasets that spatially intersect it, weighted by the specified feature. When the input polygon intersects with more than one polygon, point, or line of the Data Observatory datasets, the data is aggregated using the specified aggregation methods. The specified weight feature is used to weight each intersection segment relative to the total original segment: the share of a feature value attributed to an intersection segment is the value of the weight feature in that segment divided by its value over the total original segment.
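
The weighted attribution can be illustrated with plain arithmetic. In this sketch (not a toolbox call; the numbers are hypothetical), a Data Observatory feature carries a value of 500, and the weight variable totals 1000 over the whole feature, of which 250 falls inside the intersection:

```sql
-- 25% of the weight lies in the intersection, so 25% of the value
-- is attributed to the input polygon.
SELECT 500 * (250 / NULLIF(1000, 0)) AS attributed_value;  -- 125.0
```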

Valid aggregation methods are: SUM, AVG, MAX, MIN, and COUNT.

  • COUNT computes the number of Data Observatory features that contain the enrichment variable and are intersected by the input geography.

If the enrichment of an input table needs to be repeated, please note that dropping the added columns will cause problems in subsequent enrichments, since BigQuery keeps those columns for 7 days for time-travel purposes. We recommend storing the original table columns in a temporary table, dropping the input table, and then recreating the input table from the temporary table.
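
The recommended workaround can be sketched as follows (the table name and the original column list are hypothetical):

```sql
-- Preserve the original columns in a temporary table...
CREATE TABLE `my-project.my-dataset.my-table-original` AS
SELECT id, geom  -- the columns the table had before enrichment
FROM `my-project.my-dataset.my-table`;

-- ...drop the previously enriched table...
DROP TABLE `my-project.my-dataset.my-table`;

-- ...and recreate the input table from the preserved copy.
CREATE TABLE `my-project.my-dataset.my-table` AS
SELECT * FROM `my-project.my-dataset.my-table-original`;

DROP TABLE `my-project.my-dataset.my-table-original`;
```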

Input parameters

  • input_query: STRING query to be enriched (Standard SQL). A qualified table name can be given as well, e.g. 'project-id.dataset-id.table-name'.

  • input_geography_column: STRING name of the GEOGRAPHY column in the query containing the polygons to be enriched.

  • variables: ARRAY<STRUCT<variable STRING, aggregation STRING>> variables of the Data Observatory that will be used to enrich the input polygons. For each variable, its slug and the aggregation method to be used must be provided. Valid aggregation methods are: SUM, AVG, MAX, MIN, COUNT.

  • filters: ARRAY<STRUCT<dataset STRING, expression STRING>> filters to be applied to the Data Observatory datasets used in the enrichment. Can be set to NULL to apply no filtering.

  • weight_variable: STRUCT<variable STRING, aggregation STRING> variable that will be used to weight the intersections of the input polygons with the Data Observatory polygons, lines, or points. Its slug and the aggregation method must be provided. Valid aggregation methods are: SUM, AVG, MAX, MIN, COUNT. This variable can either originate from the Data Observatory or be provided by the user, and it is mandatory; if NULL, an error is raised.

  • custom_weight_query: STRING query that contains the custom variable to be used as weight, together with a geography column named geom. This field is compulsory when a custom variable is passed in weight_variable; otherwise it must be set to NULL.

  • do_weight_filters: ARRAY<STRUCT<dataset STRING, expression STRING>> filters to be applied to the dataset of the weight variable. Can be set to NULL to apply no filtering.

  • output: ARRAY<STRING>|NULL containing the name of an output table to store the results and optionally an SQL clause that can be used to partition it. The name of the output table should include project and dataset, e.g. ['project-id.dataset-id.table-name'] or ['project-id.dataset-id.table-name', 'PARTITION BY number']. If NULL the enrichment result is returned. When the output table is the same as the input, the input table will be enriched in place.

  • source: STRING name of the location where the Data Observatory subscriptions of the user are stored, in 'project-id.dataset-id' format. If only the 'dataset-id' is included, it uses the project 'carto-data' by default.

Output

The output table will contain all the input columns provided in the input_query and one extra column for each variable in variables, named after its corresponding slug and including a suffix indicating the aggregation method used.

If a new output table is created, it will be clustered by the geography column to optimize the performance of spatial filters and joins. This is important to visualize the results in a map efficiently. If an SQL clause is included in the output parameter this optimization will not be performed.

Examples

CALL `carto-un`.carto.DATAOBS_ENRICH_POLYGONS_WEIGHTED(
  R'''
  SELECT id, geom FROM `my-project.my-dataset.my-table`
  ''',
  'geom',
  [('DI_PC_f7d891ba','avg'),('c02_05_gasto_m__23eb3ae8','avg')],
  NULL,
  ('population_1ff8e0d','sum'),
  NULL,
  NULL,
  ['my-project.my-dataset.my-table'],
  'my-dataobs-project.my-dataobs-dataset'
);
-- The table `my-project.my-dataset.my-table` will be created
-- with columns: id, geom, DI_PC_f7d891ba_avg, c02_05_gasto_m__23eb3ae8_avg
CALL `carto-un`.carto.DATAOBS_ENRICH_POLYGONS_WEIGHTED(
  'my-project.my-dataset.my-table',
  'geom',
  [('DI_PC_f7d891ba','avg'),('c02_05_gasto_m__23eb3ae8','avg')],
  NULL,
  ('population_1ff8e0d','sum'),
  NULL,
  NULL,
  ['my-project.my-dataset.my-table'],
  'my-dataobs-project.my-dataobs-dataset'
);
-- The columns DI_PC_f7d891ba_avg and c02_05_gasto_m__23eb3ae8_avg will be
-- added to the table 'my-project.my-dataset.my-table'.
CALL `carto-un`.carto.DATAOBS_ENRICH_POLYGONS_WEIGHTED(
  'my-project.my-dataset.my-table',
  'geom',
  [('DI_PC_f7d891ba','avg'),('c02_05_gasto_m__23eb3ae8','avg')],
  NULL,
  ('population','sum'),
  '''
  SELECT geoid, geom, population FROM `my-project.my-dataset.custom_weight_query`
  ''',
  NULL,
  ['my-project.my-dataset.my-table'],
  'my-dataobs-project.my-dataobs-dataset'
);

DATAOBS_SAMPLES

DATAOBS_SAMPLES(source, filters)

Description

The result of calling this procedure is a list of the DO samples available to the user.

  • source: STRING name of the location where the Data Observatory samples of the user are stored, in project_id.dataset_id format. If only the dataset_id is included, it uses the project carto-data by default.

  • filters: STRING SQL expression to filter the results, e.g. 'dataset_category="Housing"'. An empty string '' or NULL can be used to omit the filtering.

Output

The result is a table with these columns:

  • dataset_slug Internal identifier of the DO dataset.

  • dataset_name name of the DO dataset.

  • dataset_country name of the country the dataset belongs to.

  • dataset_category name of the dataset category.

  • dataset_license type of license, either "Public data" or "Premium data".

  • dataset_provider name of the dataset provider.

  • dataset_version version of the dataset.

  • dataset_geo_type type of geometry used by the geography: "POINT"/"MULTIPOINT"/"LINESTRING"/"MULTILINESTRING"/"POLYGON"/"MULTIPOLYGON"/"GEOMETRYCOLLECTION".

  • dataset_sample_table name of the user BigQuery sample table.

Example

CALL `carto-un`.carto.DATAOBS_SAMPLES('myproject.mydataset', '');
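
The filters argument accepts any SQL expression over the output columns; for instance, using the example filter from the parameter description above:

```sql
CALL `carto-un`.carto.DATAOBS_SAMPLES('myproject.mydataset', 'dataset_category="Housing"');
```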

DATAOBS_SUBSCRIPTIONS

DATAOBS_SUBSCRIPTIONS(source, filters)

Description

The result of calling this procedure is a list of the DO subscriptions available to the user.

  • source: STRING name of the location where the Data Observatory subscriptions of the user are stored, in project_id.dataset_id format. If only the dataset_id is included, it uses the project carto-data by default.

  • filters: STRING SQL expression to filter the results, e.g. 'dataset_category="Housing"'. An empty string '' or NULL can be used to omit the filtering.

Output

The result is a table with these columns:

  • dataset_slug Internal identifier of the DO dataset.

  • dataset_name name of the DO dataset.

  • dataset_country name of the country the dataset belongs to.

  • dataset_category name of the dataset category.

  • dataset_license type of license, either "Public data" or "Premium data".

  • dataset_provider name of the dataset provider.

  • dataset_version version of the dataset.

  • dataset_geo_type type of geometry used by the geography: "POINT"/"MULTIPOINT"/"LINESTRING"/"MULTILINESTRING"/"POLYGON"/"MULTIPOLYGON"/"GEOMETRYCOLLECTION".

  • dataset_table name of the user BigQuery subscription table to access the dataset.

  • associated_geography_table geography associated with the dataset (NULL if the dataset category is Geography, meaning the dataset itself is a geography). Contains a subscription table/view if one is available for the geography, or the qualified name of the original (public) BigQuery dataset otherwise.

  • associated_geography_slug internal identifier of the geography associated with the dataset (NULL if category is Geography).

Example

CALL `carto-un`.carto.DATAOBS_SUBSCRIPTIONS('myproject.mydataset', '');
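
A non-empty filters argument restricts the subscriptions returned; for instance, to list only premium datasets (the filter value matches the dataset_license column described above):

```sql
CALL `carto-un`.carto.DATAOBS_SUBSCRIPTIONS('myproject.mydataset', 'dataset_license="Premium data"');
```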

DATAOBS_SUBSCRIPTION_VARIABLES

DATAOBS_SUBSCRIPTION_VARIABLES(source, filters)

Description

The result of calling this procedure is a list of the DO subscriptions and variables available to the user.

  • source: STRING name of the location where the Data Observatory subscriptions of the user are stored, in project_id.dataset_id format. If only the dataset_id is included, it uses the project carto-data by default.

  • filters: STRING SQL expression to filter the results, e.g. 'variable_type="STRING"'. An empty string '' or NULL can be used to omit the filtering.

Output

The result is a table with one row per variable and these columns:

  • variable_slug unique identifier of the variable. This can be used for enrichment.

  • variable_name column name of the variable.

  • variable_description description of the variable.

  • variable_type type of the variable column.

  • variable_aggregation default aggregation method for the variable.

  • dataset_slug identifier of the dataset the variable belongs to.

  • associated_geography_slug identifier of the corresponding geography. Note that this is NULL if the dataset itself is a geography.

Example

CALL `carto-un`.carto.DATAOBS_SUBSCRIPTION_VARIABLES('myproject.mydataset','');
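
For instance, using the example filter from the parameter description above to list only STRING-typed variables:

```sql
CALL `carto-un`.carto.DATAOBS_SUBSCRIPTION_VARIABLES('myproject.mydataset', 'variable_type="STRING"');
```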

ENRICH_GRID

ENRICH_GRID(grid_type, input_query, input_index_column, data_query, data_geography_column, variables, output)

Description

This procedure enriches a query containing grid cell indexes of one of the supported types (H3, Quadbin) with data from another enrichment query that contains geographies, thus effectively transferring geography-based data to a spatial grid.

As a result of this process, each input grid cell will be enriched with the data of the enrichment query that spatially intersects it. When the input cell intersects with more than one feature of the enrichment query, the data is aggregated using the aggregation methods specified.

Valid aggregation methods are: SUM, AVG, MAX, MIN, and COUNT.

  • COUNT computes the number of enrichment features that contain the enrichment variable and are intersected by the input geography.

If the enrichment of an input table needs to be repeated, please note that dropping the added columns will cause problems in subsequent enrichments, since BigQuery keeps those columns for 7 days for time-travel purposes. We recommend storing the original table columns in a temporary table, dropping the input table, and then recreating the input table from the temporary table.

Input parameters

  • grid_type: STRING type of grid: "h3" or "quadbin".

  • input_query: STRING query to be enriched (Standard SQL); this query must produce valid grid indexes for the selected grid type in a column of the proper type (STRING for H3, and INT64 for Quadbin). It can include additional columns with data associated with the grid cells that will be preserved. A qualified table name can be given as well, e.g. 'project-id.dataset-id.table-name'.

  • input_index_column: STRING name of a column in the query that contains the grid indexes.

  • data_query: STRING query that contains both a geography column and the columns with the data that will be used to enrich the cells provided in the input query.

  • data_geography_column: STRING name of the GEOGRAPHY column provided in the data_query.

  • variables: ARRAY<STRUCT<column STRING, aggregation STRING>> with the columns that will be used to enrich the input polygons and their corresponding aggregation method (SUM, AVG, MAX, MIN, COUNT).

  • output: ARRAY<STRING>|NULL containing the name of an output table to store the results and optionally an SQL clause that can be used to partition it. The name of the output table should include project and dataset, e.g. ['project-id.dataset-id.table-name'] or ['project-id.dataset-id.table-name', 'PARTITION BY number']. If NULL the enrichment result is returned. When the output table is the same as the input, the input table will be enriched in place.

Output

The resulting table has all the input columns and one additional column for each variable in variables, named with a suffix indicating the aggregation method used.

The output table will be clustered by the spatial index to optimize its performance when filtering data by it or using it to join to other grid tables. This is important to visualize the results in a map efficiently. If an SQL clause is included in the output parameter this optimization will not be performed.
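
Because the result is clustered by the index column, joining two enriched grid tables on that column is efficient. A sketch (other-grid-table and its columns are hypothetical):

```sql
SELECT index, a.var1_sum, b.var2_sum
FROM `my-project.my-dataset.my-enriched-table` a
JOIN `my-project.my-dataset.other-grid-table` b USING (index);
```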

Examples

CALL `carto-un`.carto.ENRICH_GRID(
   'h3',
   R'''
   SELECT * FROM UNNEST(['8718496d8ffffff','873974865ffffff','87397486cffffff','8718496daffffff','873974861ffffff','8718496dbffffff','87397494bffffff','8718496ddffffff','873974864ffffff']) AS index
   ''',
   'index',
   R'''
   SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
   ''',
   'geom',
   [('var1', 'sum'), ('var2', 'sum'), ('var2', 'max')],
   ['`my-project.my-dataset.my-enriched-table`']
);
-- The table `my-project.my-dataset.my-enriched-table` will be created
-- with columns: index, var1_sum, var2_sum, var2_max
CALL `carto-un`.carto.ENRICH_GRID(
   'h3',
   'my-project.my-dataset.my-table',
   'index',
   R'''
   SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
   ''',
   'geom',
   [('var1', 'sum'), ('var2', 'sum'), ('var2', 'max')],
   ['my-project.my-dataset.my-table']
);
-- The columns var1_sum, var2_sum, var2_max will be added to the table
-- 'my-project.my-dataset.my-table'.

ENRICH_GRID_RAW

ENRICH_GRID_RAW(grid_type, input_query, input_index_column, data_query, data_geography_column, variables, output)

Description

This procedure enriches a query containing grid cell indexes of one of the supported types (H3, Quadbin) with data from another enrichment query that contains geographies, thus effectively transferring geography-based data to a spatial grid.

As a result of this process, each input grid cell will be enriched with the data of the enrichment query that spatially intersects it. The variable values corresponding to all intersecting enrichment features for a given input cell will be returned in an ARRAY column named __carto_enrichment. Each array contains STRUCTs with one field for each variable and additional measure fields __carto_intersection, __carto_total, __carto_dimension. See the output information for more details.

Input parameters

  • grid_type: STRING type of grid: "h3" or "quadbin".

  • input_query: STRING query to be enriched (Standard SQL); this query must produce valid grid indexes for the selected grid type in a column of the proper type (STRING for H3, and INT64 for Quadbin). It can include additional columns with data associated with the grid cells that will be preserved. A qualified table name can be given as well, e.g. 'project-id.dataset-id.table-name'.

  • input_index_column: STRING name of a column in the query that contains the grid indexes.

  • data_query: STRING query that contains both a geography column and the columns with the data that will be used to enrich the cells provided in the input query.

  • data_geography_column: STRING name of the GEOGRAPHY column provided in the data_query.

  • variables: ARRAY<STRING> of names of the columns in the enrichment query that will be added to the enriched results.

  • output: ARRAY<STRING>|NULL containing the name of an output table to store the results and optionally an SQL clause that can be used to partition it. The name of the output table should include project and dataset, e.g. ['project-id.dataset-id.table-name'] or ['project-id.dataset-id.table-name', 'PARTITION BY number']. If NULL the enrichment result is returned. When the output table is the same as the input, the input table will be enriched in place.

If the enrichment of an input table needs to be repeated, please note that dropping the added columns will cause problems in subsequent enrichments, since BigQuery keeps those columns for 7 days for time-travel purposes. We recommend storing the original table columns in a temporary table, dropping the input table, and then recreating the input table from the temporary table.

Output

The output table will contain all the input columns provided in the input_query and one extra ARRAY column named __carto_enrichment. The array contains STRUCTs with one field for each variable. Additional fields will be included with information about the intersection of the grid cell and the enrichment features.

  • __carto_dimension dimension of the enrichment geography: 2 for areas (polygons), 1 for lines, and 0 for points.

  • __carto_intersection area in square meters (for dimension = 2) or length in meters (for dimension = 1) of the intersection.

  • __carto_total area in square meters (for dimension = 2) or length in meters (for dimension = 1) of the enrichment feature.

The output table will be clustered by the spatial index to optimize its performance when filtering data by it or using it to join to other grid tables. This is important to visualize the results in a map efficiently. If an SQL clause is included in the output parameter this optimization will not be performed.
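
The RAW arrays can then be aggregated manually, in the same spirit as the polygon enrichment example earlier. A sketch computing an area-weighted sum of var1 per cell, assuming an enriched table like the one in the examples:

```sql
SELECT
  index,
  SUM(e.var1 * e.__carto_intersection / NULLIF(e.__carto_total, 0)) AS var1_sum
FROM `my-project.my-dataset.my-enriched-table`, UNNEST(__carto_enrichment) e
GROUP BY index;
```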

Examples

CALL `carto-un`.carto.ENRICH_GRID_RAW(
   'h3',
   R'''
   SELECT * FROM UNNEST(['8718496d8ffffff','873974865ffffff','87397486cffffff','8718496daffffff','873974861ffffff','8718496dbffffff','87397494bffffff','8718496ddffffff','873974864ffffff']) AS index
   ''',
   'index',
   R'''
   SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
   ''',
   'geom',
   ['var1', 'var2'],
   ['`my-project.my-dataset.my-enriched-table`']
);
-- The table `my-project.my-dataset.my-enriched-table` will be created
-- with columns: index, __carto_enrichment. The latter will contain STRUCTs with the fields var1, var2,
-- __carto_intersection, __carto_total, and __carto_dimension.
CALL `carto-un`.carto.ENRICH_GRID_RAW(
   'h3',
   'my-project.my-dataset.my-table',
   'index',
   R'''
   SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
   ''',
   'geom',
   ['var1', 'var2'],
   ['my-project.my-dataset.my-table']
);
-- The column __carto_enrichment will be added to the table
-- 'my-project.my-dataset.my-table'.
-- The new column will contain STRUCTs with the fields var1, var2,
-- __carto_intersection, __carto_total, and __carto_dimension.

ENRICH_POINTS

ENRICH_POINTS(input_query, input_geography_column, data_query, data_geography_column, variables, output)

Description

This procedure enriches a query containing geographic points with data from another query, spatially matching both and aggregating the result.

As a result of this process, each input point will be enriched with the data from the enrichment query that spatially intersects it. When the input point intersects with more than one enrichment polygon, point, or line, the data is aggregated using the aggregation methods specified.

Valid aggregation methods are: SUM, MIN, MAX, AVG, and COUNT.

If the enrichment of an input table needs to be repeated, please note that dropping the added columns will cause problems in subsequent enrichments, since BigQuery keeps those columns for 7 days for time-travel purposes. We recommend storing the original table columns in a temporary table, dropping the input table, and then recreating the input table from the temporary table.

Input parameters

  • input_query: STRING query to be enriched (Standard SQL). A qualified table name can be given as well, e.g. 'project-id.dataset-id.table-name'.

  • input_geography_column: STRING name of the GEOGRAPHY column in the query containing the points to be enriched.

  • data_query: STRING query that contains both a geography column and the columns with the data that will be used to enrich the points provided in the input query.

  • data_geography_column: STRING name of the GEOGRAPHY column provided in the data_query.

  • variables: ARRAY<STRUCT<column STRING, aggregation STRING>> with the columns that will be used to enrich the input points and their corresponding aggregation method (SUM, AVG, MAX, MIN, COUNT).

  • output: ARRAY<STRING>|NULL containing the name of an output table to store the results and optionally an SQL clause that can be used to partition it. The name of the output table should include project and dataset, e.g. ['project-id.dataset-id.table-name'] or ['project-id.dataset-id.table-name', 'PARTITION BY number']. If NULL the enrichment result is returned. When the output table is the same as the input, the input table will be enriched in place.

Output

The output table will contain all the input columns provided in the input_query and one extra column for each variable in variables, named after its corresponding enrichment column and including a suffix indicating the aggregation method used.

The output table will be clustered by the geography column to optimize the performance of spatial filters and joins. This is important to visualize the results in a map efficiently. If an SQL clause is included in the output parameter this optimization will not be performed.
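
Clustering by the geography column pays off when applying spatial filters to the result; for instance (the coordinates and radius are arbitrary):

```sql
SELECT *
FROM `my-project.my-dataset.my-enriched-table`
WHERE ST_DWITHIN(geom, ST_GEOGPOINT(-3.70, 40.42), 1000);  -- points within 1 km
```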

Examples

CALL `carto-un`.carto.ENRICH_POINTS(
   R'''
   SELECT id, geom FROM `my-project.my-dataset.my-input`
   ''',
   'geom',
   R'''
   SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
   ''',
   'geom',
   [('var1', 'sum'), ('var2', 'sum'), ('var2', 'max')],
   ['`my-project.my-dataset.my-enriched-table`']
);
-- The table `my-project.my-dataset.my-enriched-table` will be created
-- with columns: id, geom, var1_sum, var2_sum, var2_max
CALL `carto-un`.carto.ENRICH_POINTS(
   'my-project.my-dataset.my-input',
   'geom',
   R'''
   SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
   ''',
   'geom',
   [('var1', 'sum'), ('var2', 'sum'), ('var2', 'max')],
   ['my-project.my-dataset.my-input']
);
-- The columns var1_sum, var2_sum, var2_max will be added to the table
-- 'my-project.my-dataset.my-input'.

ENRICH_POINTS_RAW

ENRICH_POINTS_RAW(input_query, input_geography_column, data_query, data_geography_column, variables, output)

Description

This procedure enriches a query containing geographic points with data from another query, spatially matching both.

As a result of this process, each input point will be enriched with the data of the enrichment query that spatially intersects it. The variable values corresponding to all intersecting enrichment features for a given input point will be returned in an ARRAY column named __carto_enrichment. Each array value in this column contains STRUCTs with one field for each variable and the additional measure fields __carto_total and __carto_dimension. See the output information for details.

If the enrichment of an input table needs to be repeated, please note that dropping the added columns can cause problems in subsequent enrichments, as BigQuery retains those columns for 7 days for time travel purposes. We recommend storing the original table columns in a temporary table, dropping the input table, and then recreating the input table from the temporary table.

Input parameters

  • input_query: STRING query to be enriched (Standard SQL). A qualified table name can be given as well, e.g. 'project-id.dataset-id.table-name'.

  • input_geography_column: STRING name of the GEOGRAPHY column in the query containing the points to be enriched.

  • data_query: STRING query that contains both a geography column and the columns with the data that will be used to enrich the points provided in the input query.

  • data_geography_column: STRING name of the GEOGRAPHY column provided in the data_query.

  • variables: ARRAY<STRING> of names of the columns in the enrichment query that will be added to the enriched results.

  • output: ARRAY<STRING>|NULL containing the name of an output table to store the results and optionally an SQL clause that can be used to partition it. The name of the output table should include project and dataset, e.g. ['project-id.dataset-id.table-name'] or ['project-id.dataset-id.table-name', 'PARTITION BY number']. If NULL the enrichment result is returned. When the output table is the same as the input, the input table will be enriched in place.

Output

The output table will contain all the input columns provided in the input_query, and one extra ARRAY column named __carto_enrichment. The array contains STRUCTs with one field for each variable. Additional fields will be included with information about the intersection of the geographies:

  • __carto_dimension dimension of the enrichment geography: 2 for areas (polygons), 1 for lines, and 0 for points.

  • __carto_total area in square meters (for dimension = 2) or length in meters (for dimension = 1) of the enrichment feature.

The output table will be clustered by the geography column to optimize the performance of spatial filters and joins. This is important for visualizing the results on a map efficiently. If an SQL clause is included in the output parameter, this optimization will not be performed.
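Because the raw output keeps one STRUCT per intersecting feature, aggregations that ENRICH_POINTS does not offer can be computed downstream. A minimal Python sketch of such a post-processing step, using plain dicts to stand in for the __carto_enrichment STRUCTs (the field values are hypothetical):

```python
# Sketch: derive a density from one enriched point's __carto_enrichment
# array. Dicts stand in for the output STRUCTs; values are hypothetical.

def density_at_point(enrichment, variable):
    """Value per square meter of the polygons enclosing the point."""
    densities = [
        feat[variable] / feat["__carto_total"]
        for feat in enrichment
        if feat["__carto_dimension"] == 2 and feat["__carto_total"] > 0
    ]
    # Average in case the point sits on a boundary shared by polygons.
    return sum(densities) / len(densities) if densities else None

row = [{"var1": 5000.0, "__carto_dimension": 2, "__carto_total": 1e6}]
print(density_at_point(row, "var1"))  # 0.005
```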

Examples

CALL `carto-un`.carto.ENRICH_POINTS_RAW(
   R'''
   SELECT id, geom FROM `my-project.my-dataset.my-input`
   ''',
   'geom',
   R'''
   SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
   ''',
   'geom',
   ['var1', 'var2'],
   ['`my-project.my-dataset.my-enriched-table`']
);
-- The table `my-project.my-dataset.my-enriched-table` will be created
-- with columns: id, geom, __carto_enrichment. The latter will contain STRUCTs with the fields var1, var2,
-- __carto_total, and __carto_dimension.
CALL `carto-un-eu`.carto.ENRICH_POINTS_RAW(
   R'''
   SELECT id, geom FROM `my-project.my-dataset.my-input`
   ''',
   'geom',
   R'''
   SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
   ''',
   'geom',
   ['var1', 'var2'],
   ['`my-project.my-dataset.my-enriched-table`']
);
-- The table `my-project.my-dataset.my-enriched-table` will be created
-- with columns: id, geom, __carto_enrichment. The latter will contain STRUCTs with the fields var1, var2,
-- __carto_total, and __carto_dimension.
CALL carto.ENRICH_POINTS_RAW(
   R'''
   SELECT id, geom FROM `my-project.my-dataset.my-input`
   ''',
   'geom',
   R'''
   SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
   ''',
   'geom',
   ['var1', 'var2'],
   ['`my-project.my-dataset.my-enriched-table`']
);
-- The table `my-project.my-dataset.my-enriched-table` will be created
-- with columns: id, geom, __carto_enrichment. The latter will contain STRUCTs with the fields var1, var2,
-- __carto_total, and __carto_dimension.
CALL `carto-un`.carto.ENRICH_POINTS_RAW(
   'my-project.my-dataset.my-input',
   'geom',
   R'''
   SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
   ''',
   'geom',
   ['var1', 'var2'],
   ['my-project.my-dataset.my-input']
);
-- The column __carto_enrichment will be added to the table
-- 'my-project.my-dataset.my-input'.
-- The new column will contain STRUCTs with the fields var1, var2,
-- __carto_total, and __carto_dimension.
CALL `carto-un-eu`.carto.ENRICH_POINTS_RAW(
   'my-project.my-dataset.my-input',
   'geom',
   R'''
   SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
   ''',
   'geom',
   ['var1', 'var2'],
   ['my-project.my-dataset.my-input']
);
-- The column __carto_enrichment will be added to the table
-- 'my-project.my-dataset.my-input'.
-- The new column will contain STRUCTs with the fields var1, var2,
-- __carto_total, and __carto_dimension.
CALL carto.ENRICH_POINTS_RAW(
   'my-project.my-dataset.my-input',
   'geom',
   R'''
   SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
   ''',
   'geom',
   ['var1', 'var2'],
   ['my-project.my-dataset.my-input']
);
-- The column __carto_enrichment will be added to the table
-- 'my-project.my-dataset.my-input'.
-- The new column will contain STRUCTs with the fields var1, var2,
-- __carto_total, and __carto_dimension.

ENRICH_POLYGONS

ENRICH_POLYGONS(input_query, input_geography_column, data_query, data_geography_column, variables, output)

Description

This procedure enriches a query containing geographic polygons with data from another query, spatially matching both and aggregating the result.

As a result of this process, each input polygon will be enriched with the data from the enrichment query that spatially intersects it. When the input polygon intersects with more than one enrichment polygon, point, or line, the data is aggregated using the aggregation methods specified.

Valid aggregation methods are:

  • COUNT: It computes the number of enrichment features that contain the enrichment variable and are intersected by the input geography.
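The semantics of these aggregations (covered in more detail in the notes at the end of this page) can be sketched in Python. This is an illustration of the intended behavior, not CARTO's implementation; `features` models the intersecting enrichment features as (value, intersection measure, total measure) triples:

```python
# Illustrative sketch of how an input polygon's enriched value is
# aggregated from intersecting enrichment features. Each feature carries
# the variable value, the area (or length) of its intersection with the
# input polygon, and its total area (or length).

def aggregate(features, method):
    """features: list of (value, intersection, total) tuples."""
    if method == "count":
        return len(features)
    if method == "sum":
        # Extensive variables (e.g. population): weight each value by the
        # fraction of the feature that falls inside the input polygon.
        return sum(v * (inter / total) for v, inter, total in features)
    values = [v for v, _, _ in features]
    if method == "min":
        return min(values)
    if method == "max":
        return max(values)
    if method == "avg":
        # Weighted average using the intersections as weights.
        weights = [inter for _, inter, _ in features]
        return sum(v * w for (v, _, _), w in zip(features, weights)) / sum(weights)
    raise ValueError(f"unsupported aggregation: {method}")

# A polygon intersecting two enrichment polygons:
feats = [(100.0, 50.0, 100.0),   # half of a feature with value 100
         (40.0, 10.0, 40.0)]     # a quarter of a feature with value 40
print(aggregate(feats, "sum"))   # 100*0.5 + 40*0.25 = 60.0
print(aggregate(feats, "max"))   # 100.0
```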

If the enrichment of an input table needs to be repeated, please note that dropping the added columns can cause problems in subsequent enrichments, as BigQuery retains those columns for 7 days for time travel purposes. We recommend storing the original table columns in a temporary table, dropping the input table, and then recreating the input table from the temporary table.

Input parameters

  • input_query: STRING query to be enriched (Standard SQL). A qualified table name can be given as well, e.g. 'project-id.dataset-id.table-name'.

  • input_geography_column: STRING name of the GEOGRAPHY column in the query containing the polygons to be enriched.

  • data_query: STRING query that contains both a geography column and the columns with the data that will be used to enrich the polygons provided in the input query.

  • data_geography_column: STRING name of the GEOGRAPHY column provided in the data_query.

  • variables: ARRAY<STRUCT<column STRING, aggregation STRING>> with the columns that will be used to enrich the input polygons and their corresponding aggregation method (SUM, AVG, MAX, MIN, COUNT).

  • output: ARRAY<STRING>|NULL containing the name of an output table to store the results and optionally an SQL clause that can be used to partition it. The name of the output table should include project and dataset, e.g. ['project-id.dataset-id.table-name'] or ['project-id.dataset-id.table-name', 'PARTITION BY number']. If NULL the enrichment result is returned. When the output table is the same as the input, the input table will be enriched in place.

Output

The output table will contain all the input columns provided in the input_query and one extra column for each variable in variables, named after its corresponding enrichment column and including a suffix indicating the aggregation method used.

The output table will be clustered by the geography column to optimize the performance of spatial filters and joins. This is important for visualizing the results on a map efficiently. If an SQL clause is included in the output parameter, this optimization will not be performed.

Examples

CALL `carto-un`.carto.ENRICH_POLYGONS(
   R'''
   SELECT id, geom FROM `my-project.my-dataset.my-input`
   ''',
   'geom',
   R'''
   SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
   ''',
   'geom',
   [('var1', 'sum'), ('var2', 'sum'), ('var2', 'max')],
   ['`my-project.my-dataset.my-enriched-table`']
);
-- The table `my-project.my-dataset.my-enriched-table` will be created
-- with columns: id, geom, var1_sum, var2_sum, var2_max
CALL `carto-un-eu`.carto.ENRICH_POLYGONS(
   R'''
   SELECT id, geom FROM `my-project.my-dataset.my-input`
   ''',
   'geom',
   R'''
   SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
   ''',
   'geom',
   [('var1', 'sum'), ('var2', 'sum'), ('var2', 'max')],
   ['`my-project.my-dataset.my-enriched-table`']
);
-- The table `my-project.my-dataset.my-enriched-table` will be created
-- with columns: id, geom, var1_sum, var2_sum, var2_max
CALL carto.ENRICH_POLYGONS(
   R'''
   SELECT id, geom FROM `my-project.my-dataset.my-input`
   ''',
   'geom',
   R'''
   SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
   ''',
   'geom',
   [('var1', 'sum'), ('var2', 'sum'), ('var2', 'max')],
   ['`my-project.my-dataset.my-enriched-table`']
);
-- The table `my-project.my-dataset.my-enriched-table` will be created
-- with columns: id, geom, var1_sum, var2_sum, var2_max
CALL `carto-un`.carto.ENRICH_POLYGONS(
   'my-project.my-dataset.my-input',
   'geom',
   R'''
   SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
   ''',
   'geom',
   [('var1', 'sum'), ('var2', 'sum'), ('var2', 'max')],
   ['my-project.my-dataset.my-input']
);
-- The columns var1_sum, var2_sum, var2_max will be added to the table
-- 'my-project.my-dataset.my-input'.
CALL `carto-un-eu`.carto.ENRICH_POLYGONS(
   'my-project.my-dataset.my-input',
   'geom',
   R'''
   SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
   ''',
   'geom',
   [('var1', 'sum'), ('var2', 'sum'), ('var2', 'max')],
   ['my-project.my-dataset.my-input']
);
-- The columns var1_sum, var2_sum, var2_max will be added to the table
-- 'my-project.my-dataset.my-input'.
CALL carto.ENRICH_POLYGONS(
   'my-project.my-dataset.my-input',
   'geom',
   R'''
   SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
   ''',
   'geom',
   [('var1', 'sum'), ('var2', 'sum'), ('var2', 'max')],
   ['my-project.my-dataset.my-input']
);
-- The columns var1_sum, var2_sum, var2_max will be added to the table
-- 'my-project.my-dataset.my-input'.

ENRICH_POLYGONS_RAW

ENRICH_POLYGONS_RAW(input_query, input_geography_column, data_query, data_geography_column, variables, output)

Description

This procedure enriches a query containing geographic polygons with data from another query, spatially matching both.

As a result of this process, each input polygon will be enriched with the data of the enrichment query that spatially intersects it. The variable values corresponding to all intersecting enrichment features for a given input polygon will be returned in an ARRAY column named __carto_enrichment. Each array value in this column contains STRUCTs with one field for each variable and additional measure fields __carto_intersection, __carto_total, __carto_dimension. See the output information for details.

If the enrichment of an input table needs to be repeated, please note that dropping the added columns can cause problems in subsequent enrichments, as BigQuery retains those columns for 7 days for time travel purposes. We recommend storing the original table columns in a temporary table, dropping the input table, and then recreating the input table from the temporary table.

Input parameters

  • input_query: STRING query to be enriched (Standard SQL). A qualified table name can be given as well, e.g. 'project-id.dataset-id.table-name'.

  • input_geography_column: STRING name of the GEOGRAPHY column in the query containing the polygons to be enriched.

  • data_query: STRING query that contains both a geography column and the columns with the data that will be used to enrich the polygons provided in the input query.

  • data_geography_column: STRING name of the GEOGRAPHY column provided in the data_query.

  • variables: ARRAY<STRING> of names of the columns in the enrichment query that will be added to the enriched results.

  • output: ARRAY<STRING>|NULL containing the name of an output table to store the results and optionally an SQL clause that can be used to partition it. The name of the output table should include project and dataset, e.g. ['project-id.dataset-id.table-name'] or ['project-id.dataset-id.table-name', 'PARTITION BY number']. If NULL the enrichment result is returned. When the output table is the same as the input, the input table will be enriched in place.

Output

The output table will contain all the input columns provided in the input_query, and one extra ARRAY column named __carto_enrichment. The array contains STRUCTs with one field for each variable. Additional fields will be included with information about the intersection of the geographies:

  • __carto_dimension dimension of the enrichment geography: 2 for areas (polygons), 1 for lines, and 0 for points.

  • __carto_intersection area in square meters (for dimension = 2) or length in meters (for dimension = 1) of the intersection.

  • __carto_total area in square meters (for dimension = 2) or length in meters (for dimension = 1) of the enrichment feature.

Moreover, another column named __carto_input_area will be added containing the area of the input polygon in square meters.

The output table will be clustered by the geography column to optimize the performance of spatial filters and joins. This is important for visualizing the results on a map efficiently. If an SQL clause is included in the output parameter, this optimization will not be performed.
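The raw fields make it possible to reproduce (or replace) the aggregations of ENRICH_POLYGONS after the fact. A Python sketch of an intersection-weighted sum over one row's __carto_enrichment array, with dicts standing in for the STRUCTs and hypothetical values:

```python
# Sketch: intersection-weighted sum over one row's __carto_enrichment
# array. Dicts stand in for the output STRUCTs; values are hypothetical.

def weighted_sum(enrichment, variable):
    total = 0.0
    for feat in enrichment:
        if feat["__carto_dimension"] == 0:
            total += feat[variable]  # points: counted in full
        else:
            # Areas and lines: weight by the fraction intersected.
            frac = feat["__carto_intersection"] / feat["__carto_total"]
            total += feat[variable] * frac
    return total

row = [
    {"var1": 200.0, "__carto_dimension": 2,
     "__carto_intersection": 25.0, "__carto_total": 100.0},
    {"var1": 8.0, "__carto_dimension": 0,
     "__carto_intersection": 0.0, "__carto_total": 0.0},
]
print(weighted_sum(row, "var1"))  # 200*0.25 + 8 = 58.0
```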

Examples

CALL `carto-un`.carto.ENRICH_POLYGONS_RAW(
   R'''
   SELECT id, geom FROM `my-project.my-dataset.my-input`
   ''',
   'geom',
   R'''
   SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
   ''',
   'geom',
   ['var1', 'var2'],
   ['`my-project.my-dataset.my-enriched-table`']
);
-- The table `my-project.my-dataset.my-enriched-table` will be created
-- with columns: id, geom, __carto_enrichment. The latter will contain STRUCTs with the fields var1, var2,
-- __carto_intersection, __carto_total, and __carto_dimension.
CALL `carto-un-eu`.carto.ENRICH_POLYGONS_RAW(
   R'''
   SELECT id, geom FROM `my-project.my-dataset.my-input`
   ''',
   'geom',
   R'''
   SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
   ''',
   'geom',
   ['var1', 'var2'],
   ['`my-project.my-dataset.my-enriched-table`']
);
-- The table `my-project.my-dataset.my-enriched-table` will be created
-- with columns: id, geom, __carto_enrichment. The latter will contain STRUCTs with the fields var1, var2,
-- __carto_intersection, __carto_total, and __carto_dimension.
CALL carto.ENRICH_POLYGONS_RAW(
   R'''
   SELECT id, geom FROM `my-project.my-dataset.my-input`
   ''',
   'geom',
   R'''
   SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
   ''',
   'geom',
   ['var1', 'var2'],
   ['`my-project.my-dataset.my-enriched-table`']
);
-- The table `my-project.my-dataset.my-enriched-table` will be created
-- with columns: id, geom, __carto_enrichment. The latter will contain STRUCTs with the fields var1, var2,
-- __carto_intersection, __carto_total, and __carto_dimension.
CALL `carto-un`.carto.ENRICH_POLYGONS_RAW(
   'my-project.my-dataset.my-input',
   'geom',
   R'''
   SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
   ''',
   'geom',
   ['var1', 'var2'],
   ['my-project.my-dataset.my-input']
);
-- The column __carto_enrichment will be added to the table
-- 'my-project.my-dataset.my-input'.
-- The new column will contain STRUCTs with the fields var1, var2,
-- __carto_intersection, __carto_total, and __carto_dimension.
CALL `carto-un-eu`.carto.ENRICH_POLYGONS_RAW(
   'my-project.my-dataset.my-input',
   'geom',
   R'''
   SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
   ''',
   'geom',
   ['var1', 'var2'],
   ['my-project.my-dataset.my-input']
);
-- The column __carto_enrichment will be added to the table
-- 'my-project.my-dataset.my-input'.
-- The new column will contain STRUCTs with the fields var1, var2,
-- __carto_intersection, __carto_total, and __carto_dimension.
CALL carto.ENRICH_POLYGONS_RAW(
   'my-project.my-dataset.my-input',
   'geom',
   R'''
   SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
   ''',
   'geom',
   ['var1', 'var2'],
   ['my-project.my-dataset.my-input']
);
-- The column __carto_enrichment will be added to the table
-- 'my-project.my-dataset.my-input'.
-- The new column will contain STRUCTs with the fields var1, var2,
-- __carto_intersection, __carto_total, and __carto_dimension.

ENRICH_POLYGONS_WEIGHTED

ENRICH_POLYGONS_WEIGHTED(input_query, input_geography_column, data_query, data_geography_column, variables, weight_variable, custom_weight_query, output)

Description

This procedure enriches a query containing geographic polygons with custom data provided by the user.

As a result of this process, each input polygon will be enriched with the custom data that spatially intersects it, weighted accordingly by the specified feature. When the input polygon intersects with more than one polygon, point, or line of the provided datasets, the data is aggregated using the specified aggregation methods. Using a specified feature as the weight for the enrichment weights each intersection segment appropriately with regard to the total original segment: the attribution of the feature to each intersected segment results from the value of the weighted feature in the intersection segment over that in the total original segment.
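In other words, the share of a feature's value attributed to an intersection follows the weight variable rather than the geometric overlap. A small illustrative sketch in Python (the numbers are hypothetical):

```python
# Sketch of weight-based attribution: a feature's value is split according
# to the weight variable observed in the intersection vs. the whole feature.

def attribute(value, weight_in_intersection, weight_in_feature):
    return value * (weight_in_intersection / weight_in_feature)

# A census block with 1000 inhabitants (v1), weighted by building volume
# (v3): 25% of the block's building volume lies inside the input polygon,
# so 250 inhabitants are attributed to it, whatever the area overlap is.
print(attribute(1000.0, 250.0, 1000.0))  # 250.0
```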

Valid aggregation methods are:

  • COUNT: It computes the number of features that contain the enrichment variable and are intersected by the input geography.

If the enrichment of an input table needs to be repeated, please note that dropping the added columns can cause problems in subsequent enrichments, as BigQuery retains those columns for 7 days for time travel purposes. We recommend storing the original table columns in a temporary table, dropping the input table, and then recreating the input table from the temporary table.

Input parameters

  • input_query: STRING query to be enriched (Standard SQL). A qualified table name can be given as well, e.g. 'project-id.dataset-id.table-name'.

  • input_geography_column: STRING name of the GEOGRAPHY column in the query containing the polygons to be enriched.

  • data_query: STRING query that contains both a geography column and the columns with the data that will be used to enrich the polygons provided in the input query.

  • data_geography_column: STRING name of the GEOGRAPHY column provided in the data_query.

  • variables: ARRAY<STRUCT<variable STRING, aggregation STRING>> with the columns that will be used to enrich the input polygons and their corresponding aggregation method (SUM, AVG, MAX, MIN, COUNT). When enriching with multiple variables, all of them must have the same type.

  • weight_variable: STRUCT<variable STRING, aggregation STRING> Variable that will be used to weight the intersections of the input polygons with the polygons, lines, or points of the provided datasets. Its name and the aggregation method must be provided. Valid aggregation methods are: SUM, AVG, MAX, MIN, COUNT. This variable is mandatory; if NULL, an error is raised.

  • custom_weight_query: STRING query that contains the custom variable to be used as weight together with a geography column geom. If it is set to NULL, then the data_query is used for the weight_variable.

  • output: ARRAY<STRING>|NULL containing the name of an output table to store the results and optionally an SQL clause that can be used to partition it. The name of the output table should include project and dataset, e.g. ['project-id.dataset-id.table-name'] or ['project-id.dataset-id.table-name', 'PARTITION BY number']. If NULL the enrichment result is returned. When the output table is the same as the input, the input table will be enriched in place.

Output

The output table will contain all the input columns provided in the input_query and one extra column for each variable in variables, named after its corresponding enrichment column and including a suffix indicating the aggregation method used.

If a new output table is created, it will be clustered by the geography column to optimize the performance of spatial filters and joins. This is important for visualizing the results on a map efficiently. If an SQL clause is included in the output parameter, this optimization will not be performed.

Examples

CALL `carto-un`.carto.ENRICH_POLYGONS_WEIGHTED(
  R'''
  SELECT id, geom FROM `my-project.my-dataset.my-table`
  ''',
  'geom',
  R'''
  SELECT id, v1, v2, geom FROM `my-project.my-dataset.my-table-enrich`
  ''',
  'geom',
   [('v1','avg'),('v2','avg')],
  ('v3','sum'),
  R'''
  SELECT id, v3, geom FROM `my-project.my-dataset.my-table-weight`
  ''',
  ['my-project.my-dataset.my-table']
);
-- The table `my-project.my-dataset.my-table` will be created
-- with columns: id, geom, v1_avg, v2_avg
CALL `carto-un-eu`.carto.ENRICH_POLYGONS_WEIGHTED(
  R'''
  SELECT id, geom FROM `my-project.my-dataset.my-table`
  ''',
  'geom',
  R'''
  SELECT id, v1, v2, geom FROM `my-project.my-dataset.my-table-enrich`
  ''',
  'geom',
   [('v1','avg'),('v2','avg')],
  ('v3','sum'),
  R'''
  SELECT id, v3, geom FROM `my-project.my-dataset.my-table-weight`
  ''',
  ['my-project.my-dataset.my-table']
);
-- The table `my-project.my-dataset.my-table` will be created
-- with columns: id, geom, v1_avg, v2_avg
CALL carto.ENRICH_POLYGONS_WEIGHTED(
  R'''
  SELECT id, geom FROM `my-project.my-dataset.my-table`
  ''',
  'geom',
  R'''
  SELECT id, v1, v2, geom FROM `my-project.my-dataset.my-table-enrich`
  ''',
  'geom',
   [('v1','avg'),('v2','avg')],
  ('v3','sum'),
  R'''
  SELECT id, v3, geom FROM `my-project.my-dataset.my-table-weight`
  ''',
  ['my-project.my-dataset.my-table']
);
-- The table `my-project.my-dataset.my-table` will be created
-- with columns: id, geom, v1_avg, v2_avg
CALL `carto-un`.carto.ENRICH_POLYGONS_WEIGHTED(
  R'''
  SELECT id, geom FROM `my-project.my-dataset.my-table`
  ''',
  'geom',
  R'''
  SELECT id, v1, v2, v3, geom FROM `my-project.my-dataset.my-table-enrich`
  ''',
  'geom',
  [('v1','avg'),('v2','avg')],
  ('v3','sum'),
  NULL,
  ['my-project.my-dataset.my-table']
);
-- The table `my-project.my-dataset.my-table` will be created
-- with columns: id, geom, v1_avg, v2_avg
CALL `carto-un-eu`.carto.ENRICH_POLYGONS_WEIGHTED(
  R'''
  SELECT id, geom FROM `my-project.my-dataset.my-table`
  ''',
  'geom',
  R'''
  SELECT id, v1, v2, v3, geom FROM `my-project.my-dataset.my-table-enrich`
  ''',
  'geom',
  [('v1','avg'),('v2','avg')],
  ('v3','sum'),
  NULL,
  ['my-project.my-dataset.my-table']
);
-- The table `my-project.my-dataset.my-table` will be created
-- with columns: id, geom, v1_avg, v2_avg
CALL carto.ENRICH_POLYGONS_WEIGHTED(
  R'''
  SELECT id, geom FROM `my-project.my-dataset.my-table`
  ''',
  'geom',
  R'''
  SELECT id, v1, v2, v3, geom FROM `my-project.my-dataset.my-table-enrich`
  ''',
  'geom',
  [('v1','avg'),('v2','avg')],
  ('v3','sum'),
  NULL,
  ['my-project.my-dataset.my-table']
);
-- The table `my-project.my-dataset.my-table` will be created
-- with columns: id, geom, v1_avg, v2_avg
CALL `carto-un`.carto.ENRICH_POLYGONS_WEIGHTED(
  R'''
  SELECT id, geom FROM `my-project.my-dataset.my-table`
  ''',
  'geom',
  R'''
  SELECT id, v1, v2, v3, geom FROM `my-project.my-dataset.my-table-enrich`
  ''',
  'geom',
  [('v1','avg'),('v2','avg'), ('v2','sum')],
  ('v3','sum'),
  NULL,
  ['my-project.my-dataset.my-table']
);
-- The table `my-project.my-dataset.my-table` will be created
-- with columns: id, geom, v1_avg, v2_avg, v2_sum
CALL `carto-un-eu`.carto.ENRICH_POLYGONS_WEIGHTED(
  R'''
  SELECT id, geom FROM `my-project.my-dataset.my-table`
  ''',
  'geom',
  R'''
  SELECT id, v1, v2, v3, geom FROM `my-project.my-dataset.my-table-enrich`
  ''',
  'geom',
  [('v1','avg'),('v2','avg'), ('v2','sum')],
  ('v3','sum'),
  NULL,
  ['my-project.my-dataset.my-table']
);
-- The table `my-project.my-dataset.my-table` will be created
-- with columns: id, geom, v1_avg, v2_avg, v2_sum
CALL carto.ENRICH_POLYGONS_WEIGHTED(
  R'''
  SELECT id, geom FROM `my-project.my-dataset.my-table`
  ''',
  'geom',
  R'''
  SELECT id, v1, v2, v3, geom FROM `my-project.my-dataset.my-table-enrich`
  ''',
  'geom',
  [('v1','avg'),('v2','avg'), ('v2','sum')],
  ('v3','sum'),
  NULL,
  ['my-project.my-dataset.my-table']
);
-- The table `my-project.my-dataset.my-table` will be created
-- with columns: id, geom, v1_avg, v2_avg, v2_sum

GRIDIFY_ENRICH

GRIDIFY_ENRICH(input_query, grid_type, grid_level, do_variables, do_source, custom_query, custom_variables, output_table, options)

Description

This procedure converts the input geometries into a grid of the specified type and resolution, and enriches it with Data Observatory and custom data. The user must be subscribed to all the Data Observatory datasets involved in the enrichment.

Input parameters

  • input_query: STRING query containing the geometries to be gridified and enriched, stored in a column named geom. The geometries can be either a set of points, a polygon or a collection of polygons (Polygons or MultiPolygons).

  • grid_type: STRING type of grid. Supported values are h3 and quadbin.

  • grid_level: INT64 level or resolution of the grid (e.g. 15 for a quadbin grid, as in the examples below).

  • do_variables: ARRAY<STRUCT<variable STRING, aggregation STRING>> variables of the Data Observatory that will be used to enrich the grid cells, each with its corresponding aggregation method (sum, avg, max, min, count). It can be set to NULL.

  • do_source: STRING name of the location where the Data Observatory subscriptions of the user are stored, in <my-dataobs-project>.<my-dataobs-dataset> format. If only the <my-dataobs-dataset> is included, it uses the project carto-data by default. It can be set to NULL or ''.

  • custom_query: STRING query that contains a geography column called geom and the columns with the custom data that will be used to enrich the grid cells. It can be set to NULL or ''.

  • custom_variables: ARRAY<STRUCT<variable STRING, aggregation STRING>> list with the columns of the custom_query and their corresponding aggregation method (sum, avg, max, min, count) that will be used to enrich the grid cells. It can be set to NULL.

  • output_table: STRING containing the name of the output table to store the results of the gridification and the enrichment processes performed by the procedure. The name of the output table should include project and dataset: project.dataset.table_name.

  • options: STRING the JSON string containing the available options, as described below:

    • categorical_variables: ARRAY<STRING> The array containing the names of the categorical (a.k.a. qualitative) columns.

    • kring: INT64 size of the kring where the decay function will be applied. By default this value is set to 0, in which case no kring will be computed and the decay function won't be applied. This option is only applicable to numerical variables.

    • decay: STRING decay function to be applied within the kring (e.g. "uniform", as in the examples below).

Output

The output table will contain all the input columns enriched with Data Observatory and custom data for each cell of the specified grid.
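When a kring is set, values are combined across each cell's neighborhood according to the decay option. A rough Python sketch of the "uniform" case, assuming uniform decay means an equal-weight average over the cell and its kring neighbors (the neighbor map is a stand-in for real H3/quadbin adjacency, and actual GRIDIFY_ENRICH behavior may differ):

```python
# Rough sketch (assumption: "uniform" decay averages a cell with its kring
# neighbors using equal weights). Real grids would use H3/quadbin
# adjacency; here the neighborhood is stubbed with an explicit map.

def kring_uniform(values, neighbors, cell):
    ring = [cell] + neighbors[cell]
    return sum(values[c] for c in ring) / len(ring)

values = {"a": 9.0, "b": 3.0, "c": 6.0}
neighbors = {"a": ["b", "c"]}  # hypothetical kring-1 neighborhood of "a"
print(kring_uniform(values, neighbors, "a"))  # (9+3+6)/3 = 6.0
```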

Example

CALL `carto-un`.carto.GRIDIFY_ENRICH(
    -- Input query
    'SELECT geom FROM `cartobq.docs.twin_areas_target`',
    -- Grid params: grid type and level
    'quadbin', 15,
    -- Data Observatory enrichment
    [('population_14d9cf55', 'sum')],
    'my-dataobs-project.my-dataobs-dataset',
    -- Custom data enrichment
    '''
    SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
    ''',
    [('var1', 'sum'), ('var2', 'sum'), ('var2', 'max')],
    -- Output table
    'my-project.my-dataset.my-enriched-table',
    -- Options
    R'''
    {
        "kring": 1,
        "decay": "uniform"
    }
    '''
);
-- The table `my-project.my-dataset.my-enriched-table` will be created
-- with columns: index, population_14d9cf55_sum, var1_sum, var2_sum, var2_max
CALL `carto-un-eu`.carto.GRIDIFY_ENRICH(
    -- Input query
    'SELECT geom FROM `cartobq.docs.twin_areas_target`',
    -- Grid params: grid type and level
    'quadbin', 15,
    -- Data Observatory enrichment
    [('population_14d9cf55', 'sum')],
    'my-dataobs-project.my-dataobs-dataset',
    -- Custom data enrichment
    '''
    SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
    ''',
    [('var1', 'sum'), ('var2', 'sum'), ('var2', 'max')],
    -- Output table
    'my-project.my-dataset.my-enriched-table',
    -- Options
    R'''
    {
        "kring": 1,
        "decay": "uniform"
    }
    '''
);
-- The table `my-project.my-dataset.my-enriched-table` will be created
-- with columns: index, population_14d9cf55_sum, var1_sum, var2_sum, var2_max
CALL carto.GRIDIFY_ENRICH(
    -- Input query
    'SELECT geom FROM `cartobq.docs.twin_areas_target`',
    -- Grid params: grid type and level
    'quadbin', 15,
    -- Data Observatory enrichment
    [('population_14d9cf55', 'sum')],
    'my-dataobs-project.my-dataobs-dataset',
    -- Custom data enrichment
    '''
    SELECT geom, var1, var2 FROM `my-project.my-dataset.my-data`
    ''',
    [('var1', 'sum'), ('var2', 'sum'), ('var2', 'max')],
    -- Output table
    'my-project.my-dataset.my-enriched-table',
    -- Options
    R'''
    {
        "kring": 1,
        "decay": "uniform"
    }
    '''
);
-- The table `my-project.my-dataset.my-enriched-table` will be created
-- with columns: index, population_14d9cf55_sum, var1_sum, var2_sum, var2_max

Additional notes

SUM: It assumes the aggregated variable is an extensive property (e.g. population). Accordingly, the value corresponding to the intersected Data Observatory feature is weighted by the fraction of the area or length intersected. If the Data Observatory features are points or grid index matching is used, then a simple sum is performed.

MIN: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). Thus, the value is not altered by the intersected area/length as is the case for SUM.

MAX: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). Thus, the value is not altered by the intersected area/length as is the case for SUM.

AVG: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). Thus, the value is not altered by the intersected area/length as is the case for SUM. However, a weighted average is computed, using the intersection areas or lengths as the weights. When the Data Observatory features are points or grid index matching is used, a simple average is computed.

For other types of aggregation, the procedure can be used to obtain non-aggregated data that can later be combined with any desired custom aggregation.

variables: ARRAY<STRUCT<variable STRING, aggregation STRING>>. Variables of the Data Observatory that will be used to enrich the input polygons. For each variable, its slug and the aggregation method must be provided. Use 'default' to use the variable's default aggregation method. Valid aggregation methods are: SUM, AVG, MAX, MIN, COUNT. The catalog procedure can be used to find available variables and their slugs and default aggregation.

filters: ARRAY<STRUCT<dataset STRING, expression STRING>>. Filters to be applied to the Data Observatory datasets used in the enrichment can be passed here. Each filter is applied to the Data Observatory dataset or geography, identified by its corresponding slug, passed in the dataset field of the structure. The second field of the structure, expression, is an SQL expression that will be inserted in a WHERE clause and that can reference any column of the dataset or geography table. Please note that column names (not slugs) should be used here. The catalog procedures can be used to find both the column names and the corresponding table slugs.

variables: ARRAY<STRING>. Variables of the Data Observatory that will be used to enrich the input polygons. For each variable, its slug must be provided. The catalog procedure can be used to find available variables and their slugs and default aggregation.

filters ARRAY<STRUCT<dataset STRING, expression STRING>>. Filters to be applied to the Data Observatory datasets used in the enrichment can be passed here. Each filter is applied to the Data Observatory dataset or geography, identified by its corresponding slug, passed in the dataset field of the structure. The second field of the structure, expression, is an SQL expression that will be inserted in a WHERE clause and that can reference any column of the dataset or geography table. Please note that column names (not slugs) should be applied here. The catalog procedures and can be used to find both the column names and the corresponding table slugs.
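To make the filters mechanism concrete: each filter's expression is spliced into a WHERE clause against the corresponding dataset table. The following toy Python sketch (table and slug names are hypothetical, and this is not CARTO's actual query builder) shows the kind of templating described above:

```python
def build_query(table, columns, filters):
    """Render a SELECT over a dataset table, combining the filter
    expressions into a WHERE clause. Toy illustration of the mechanism;
    names are hypothetical."""
    query = f"SELECT {', '.join(columns)} FROM {table}"
    # each expression references *column names* of the table, not slugs
    expressions = [f"({f['expression']})" for f in filters]
    if expressions:
        query += " WHERE " + " AND ".join(expressions)
    return query

# hypothetical dataset table, columns and slug
q = build_query(
    "project.dataset.demographics_table",
    ["geoid", "population"],
    [{"dataset": "my_dataset_slug", "expression": "population > 1000"}],
)
```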

SUM: It assumes the aggregated variable is an extensive property (e.g. population). Accordingly, the value of each intersected Data Observatory feature is weighted by the fraction of its area or length that is intersected. If the Data Observatory features are points or grid index matching is used, a simple sum is performed.

MIN: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). Thus, unlike SUM, the value is not altered by the intersected area/length.

MAX: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). Thus, unlike SUM, the value is not altered by the intersected area/length.

AVG: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). Thus, unlike SUM, the value is not altered by the intersected area/length; instead, a weighted average is computed, using the intersection areas or lengths as the weights. When the Data Observatory features are points or grid index matching is used, a simple average is computed.

For other types of aggregation, the DATAOBS_ENRICH_GRID_RAW procedure can be used to obtain non-aggregated data to which any desired custom aggregation can later be applied.

variables: ARRAY<STRUCT<variable STRING, aggregation STRING>>. Variables of the Data Observatory that will be used to enrich the input polygons. For each variable, its slug and the aggregation method must be provided. Use 'default' to use the variable's default aggregation method. Valid aggregation methods are: SUM, AVG, MAX, MIN, and COUNT. The catalog procedure DATAOBS_SUBSCRIPTION_VARIABLES can be used to find the available variables, together with their slugs and default aggregation. When enriching with multiple variables, all of them must have the same type.

filters: ARRAY<STRUCT<dataset STRING, expression STRING>>. Filters to be applied to the Data Observatory datasets used in the enrichment. Each filter is applied to the Data Observatory dataset or geography identified by the slug passed in the dataset field of the structure. The second field of the structure, expression, is an SQL expression that will be inserted in a WHERE clause and that can reference any column of the dataset or geography table. Please note that column names (not slugs) must be used here. The catalog procedures DATAOBS_SUBSCRIPTIONS and DATAOBS_SUBSCRIPTION_VARIABLES can be used to find both the column names and the corresponding table slugs.

do_weight_filters: STRUCT<dataset STRING, expression STRING>. In case the weight_variable comes from the Data Observatory, filters can be applied to it here. Each filter is applied to the Data Observatory dataset or geography identified by the slug passed in the dataset field of the structure. The second field of the structure, expression, is an SQL expression that will be inserted in a WHERE clause and that can reference any column of the dataset or geography table. Please note that column names (not slugs) must be used here. The catalog procedures DATAOBS_SUBSCRIPTIONS and DATAOBS_SUBSCRIPTION_VARIABLES can be used to find both the column names and the corresponding table slugs.

For special types of aggregation, the DATAOBS_ENRICH_POINTS_RAW procedure can be used to obtain non-aggregated data to which any desired custom aggregation can later be applied.

variables: ARRAY<STRUCT<variable STRING, aggregation STRING>>. Variables of the Data Observatory that will be used to enrich the input polygons. For each variable, its slug and the aggregation method to be used must be provided. Use 'default' to use the variable's default aggregation method. Valid aggregation methods are: SUM, AVG, MAX, MIN, and COUNT. The catalog procedure DATAOBS_SUBSCRIPTION_VARIABLES can be used to find the available variables, together with their slugs and default aggregation.

filters: ARRAY<STRUCT<dataset STRING, expression STRING>>. Filters to be applied to the Data Observatory datasets used in the enrichment. Each filter is applied to the Data Observatory dataset or geography identified by the slug passed in the dataset field of the structure. The second field of the structure, expression, is an SQL expression that will be inserted in a WHERE clause and that can reference any column of the dataset or geography table. Please note that column names (not slugs) must be used here. The catalog procedures DATAOBS_SUBSCRIPTIONS and DATAOBS_SUBSCRIPTION_VARIABLES can be used to find both the column names and the corresponding table slugs.

variables: ARRAY<STRING> of slugs (unique identifiers) of the Data Observatory variables to add to the input points. The catalog procedure DATAOBS_SUBSCRIPTION_VARIABLES can be used to find the available variables, together with their slugs and default aggregation.

filters: ARRAY<STRUCT<dataset STRING, expression STRING>>. Filters to be applied to the Data Observatory datasets used in the enrichment. Each filter is applied to the Data Observatory dataset or geography identified by the slug passed in the dataset field of the structure. The second field of the structure, expression, is an SQL expression that will be inserted in a WHERE clause and that can reference any column of the dataset or geography table. Please note that column names (not slugs) must be used here. The catalog procedures DATAOBS_SUBSCRIPTIONS and DATAOBS_SUBSCRIPTION_VARIABLES can be used to find both the column names and the corresponding table slugs.

SUM: It assumes the aggregated variable is an extensive property (e.g. population). Accordingly, the value of each intersected Data Observatory feature is weighted by the fraction of its area or length that is intersected. If the Data Observatory features are points, a simple sum is performed.

MIN: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). Thus, unlike SUM, the value is not altered by the intersected area/length.

MAX: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). Thus, unlike SUM, the value is not altered by the intersected area/length.

AVG: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). Thus, unlike SUM, the value is not altered by the intersected area/length; instead, a weighted average is computed, using the intersection areas or lengths as the weights. When the Data Observatory features are points, a simple average is computed.

For other types of aggregation, the DATAOBS_ENRICH_POLYGONS_RAW procedure can be used to obtain non-aggregated data to which any desired custom aggregation can later be applied.

variables: ARRAY<STRUCT<variable STRING, aggregation STRING>>. Variables of the Data Observatory that will be used to enrich the input polygons. For each variable, its slug and the aggregation method to be used must be provided. Use 'default' to use the variable's default aggregation method. Valid aggregation methods are: SUM, AVG, MAX, MIN, and COUNT. The catalog procedure DATAOBS_SUBSCRIPTION_VARIABLES can be used to find the available variables, together with their slugs and default aggregation.

filters: ARRAY<STRUCT<dataset STRING, expression STRING>>. Filters to be applied to the Data Observatory datasets used in the enrichment. Each filter is applied to the Data Observatory dataset or geography identified by the slug passed in the dataset field of the structure. The second field of the structure, expression, is an SQL expression that will be inserted in a WHERE clause and that can reference any column of the dataset or geography table. Please note that column names (not slugs) must be used here. The catalog procedures DATAOBS_SUBSCRIPTIONS and DATAOBS_SUBSCRIPTION_VARIABLES can be used to find both the column names and the corresponding table slugs.

variables: ARRAY<STRING> of slugs (unique identifiers) of the Data Observatory variables to enrich the input polygons. The catalog procedure DATAOBS_SUBSCRIPTION_VARIABLES can be used to find the available variables, together with their slugs and default aggregation.

filters: ARRAY<STRUCT<dataset STRING, expression STRING>>. Filters to be applied to the Data Observatory datasets used in the enrichment. Each filter is applied to the Data Observatory dataset or geography identified by the slug passed in the dataset field of the structure. The second field of the structure, expression, is an SQL expression that will be inserted in a WHERE clause and that can reference any column of the dataset or geography table. Please note that column names (not slugs) must be used here. The catalog procedures DATAOBS_SUBSCRIPTIONS and DATAOBS_SUBSCRIPTION_VARIABLES can be used to find both the column names and the corresponding table slugs.

SUM: It assumes the aggregated variable is an extensive property (e.g. population). Accordingly, the value of each intersected Data Observatory feature is weighted by the fraction of the intersected weight variable. If the Data Observatory features are points, a simple sum is performed.

MIN: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). Thus, the value is not altered by the weight variable.

MAX: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). Thus, the value is not altered by the weight variable.

AVG: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). Thus, the value is not altered by the intersected area/length; instead, a weighted average is computed, using the intersection values as the weights. When the Data Observatory features are points, a simple average is computed.

For other types of aggregation, the DATAOBS_ENRICH_POLYGONS_RAW procedure can be used to obtain non-aggregated data to which any desired custom aggregation can later be applied.
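When a weight variable is used instead of area, the extensive apportionment weights each value by the share of the weight variable (e.g. population) that falls inside the intersection. A simplified Python sketch of this rule, assuming the per-feature weight totals are known (illustrative only, not CARTO's implementation):

```python
def weighted_sum(parts):
    """Apportion an extensive variable by a weight variable instead of area.

    `parts` is a list of (value, weight_in_intersection, weight_total)
    tuples, one per intersected source feature. Illustrative sketch only.
    """
    return sum(v * (w_in / w_tot) for v, w_in, w_tot in parts)

# A feature with value 80 whose intersection holds 30 of its 60 weight units,
# plus a feature with value 40 fully contained (20 of its 20 weight units):
total = weighted_sum([(80, 30, 60), (40, 20, 20)])
# apportions 80 * 0.5 + 40 * 1.0
```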

variables: ARRAY<STRUCT<variable STRING, aggregation STRING>>. Variables of the Data Observatory that will be used to enrich the input polygons. For each variable, its slug and the aggregation method to be used must be provided. Use 'default' to use the variable's default aggregation method. Valid aggregation methods are: SUM, AVG, MAX, MIN, and COUNT. The catalog procedure DATAOBS_SUBSCRIPTION_VARIABLES can be used to find the available variables, together with their slugs and default aggregation. When enriching with multiple variables, all of them must have the same type.

filters: ARRAY<STRUCT<dataset STRING, expression STRING>>. Filters to be applied to the Data Observatory datasets used in the enrichment. Each filter is applied to the Data Observatory dataset or geography identified by the slug passed in the dataset field of the structure. The second field of the structure, expression, is an SQL expression that will be inserted in a WHERE clause and that can reference any column of the dataset or geography table. Please note that column names (not slugs) must be used here. The catalog procedures DATAOBS_SUBSCRIPTIONS and DATAOBS_SUBSCRIPTION_VARIABLES can be used to find both the column names and the corresponding table slugs.

do_weight_filters: STRUCT<dataset STRING, expression STRING>. In case the weight_variable comes from the Data Observatory, filters can be applied to it here. Each filter is applied to the Data Observatory dataset or geography identified by the slug passed in the dataset field of the structure. The second field of the structure, expression, is an SQL expression that will be inserted in a WHERE clause and that can reference any column of the dataset or geography table. Please note that column names (not slugs) must be used here. The catalog procedures DATAOBS_SUBSCRIPTIONS and DATAOBS_SUBSCRIPTION_VARIABLES can be used to find both the column names and the corresponding table slugs.

SUM: It assumes the aggregated variable is an extensive property (e.g. population). Accordingly, the value of each intersected enrichment feature is weighted by the fraction of its area or length that is intersected. If the enrichment features are points, a simple sum is performed.

MIN: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). Thus, unlike SUM, the value is not altered by the intersected area/length.

MAX: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). Thus, unlike SUM, the value is not altered by the intersected area/length.

AVG: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). Thus, unlike SUM, the value is not altered by the intersected area/length; instead, a weighted average is computed, using the intersection areas or lengths as the weights. When the enrichment features are points, a simple average is computed.

For other types of aggregation, the ENRICH_GRID_RAW procedure can be used to obtain non-aggregated data to which any desired custom aggregation can later be applied.

For special types of aggregation, the ENRICH_POINTS_RAW procedure can be used to obtain non-aggregated data to which any desired custom aggregation can later be applied.

SUM: It assumes the aggregated variable is an extensive property (e.g. population). Accordingly, the value of each intersected enrichment feature is weighted by the fraction of its area or length that is intersected. If the enrichment features are points, a simple sum is performed.

MIN: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). Thus, unlike SUM, the value is not altered by the intersected area/length.

MAX: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). Thus, unlike SUM, the value is not altered by the intersected area/length.

AVG: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). Thus, unlike SUM, the value is not altered by the intersected area/length; instead, a weighted average is computed, using the intersection areas or lengths as the weights. When the enrichment features are points, a simple average is computed.

For other types of aggregation, the ENRICH_POLYGONS_RAW procedure can be used to obtain non-aggregated data to which any desired custom aggregation can later be applied.

SUM: It assumes the aggregated variable is an extensive property (e.g. population). Accordingly, the value of each intersected feature is weighted by the fraction of the intersected weight variable.

MIN: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). Thus, the value is not altered by the weight variable.

MAX: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). Thus, the value is not altered by the weight variable.

AVG: It assumes the aggregated variable is an intensive property (e.g. temperature, population density). A weighted average is computed, using the values of the intersected weight variable as the weights.

The enrichment operations performed using Data Observatory data and custom data are those described in the DATAOBS_ENRICH_GRID and ENRICH_GRID procedures, respectively. Please refer to their definitions for more detailed information on the process. For numerical variables, the user can also select a k-ring size to aggregate neighboring cells, and a decay function to weight the data of the neighboring cells according to their distance (a.k.a. neighboring order).

grid_level: INT64. Level or resolution of the cell grid. Check the available H3 levels and Quadbin levels.

do_variables: ARRAY<STRUCT<variable STRING, aggregation STRING>>. Variables of the Data Observatory that will be used to enrich the grid cells. For each variable, its slug and the aggregation method must be provided. Use default to use the variable's default aggregation method. Valid aggregation methods are: sum, avg, max, min, and count. The catalog procedure DATAOBS_SUBSCRIPTION_VARIABLES can be used to find the available variables, together with their slugs and default aggregation. It can be set to NULL.

STRING decay function to compute the distance decay. Available functions are: uniform, inverse, inverse_square and exponential. If set to NULL or '', uniform is used by default. This option is only applicable to numerical variables.
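The decay functions can be understood as weights applied to each neighboring cell as a function of its k-ring distance d. The exact expressions used by the procedure are not spelled out here, so the formulas in this Python sketch are assumptions chosen only to illustrate the shape of each option:

```python
import math

# Candidate distance-decay weights for a cell at k-ring distance d from the
# origin cell. Assumed formulas, for illustration only.
DECAY_FUNCTIONS = {
    "uniform": lambda d: 1.0,                       # every neighbor counts equally
    "inverse": lambda d: 1.0 / (1.0 + d),           # weight decreases with distance
    "inverse_square": lambda d: 1.0 / (1.0 + d) ** 2,
    "exponential": lambda d: math.exp(-d),          # fastest fall-off
}

def decayed_average(values_by_distance, decay="uniform"):
    """Weighted average of neighboring-cell values using a decay function.

    `values_by_distance` is a list of (distance, value) pairs, where
    distance 0 is the origin cell itself.
    """
    weight = DECAY_FUNCTIONS[decay]
    num = sum(v * weight(d) for d, v in values_by_distance)
    den = sum(weight(d) for d, _ in values_by_distance)
    return num / den
```

With uniform decay every cell in the k-ring contributes equally; the other options progressively down-weight cells that are further from the origin.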

This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 960401.
