How to access your Data Observatory subscriptions

Last updated 1 year ago

This guide shows how to access the data from your Data Observatory subscriptions in your CARTO Data Warehouse by using the Analytics Toolbox from a Python notebook. You can find the original notebook here.

To learn more about how to explore and subscribe to data from our Data Observatory, please check our documentation.

We first authenticate to our CARTO account with the carto_auth library so that we can access the CARTO Data Warehouse resources. We then use the Python client to explore our Data Observatory subscriptions and select the variables of interest. Finally, we enrich a sample dataset with one of our subscriptions.

!pip install carto_auth[carto-dw] pydeck pydeck-carto -q

Authentication to CARTO

We start by using the carto_auth package to authenticate to our CARTO account and to get the details needed to interact with the data available in the CARTO Data Warehouse. Note that the CARTO Data Warehouse is based on Google BigQuery, so we will be using that platform for storing and computing on the data. This also means that we will be leveraging the implementation of the Analytics Toolbox for BigQuery.

import pydeck as pdk
import pydeck_carto as pdkc
from carto_auth import CartoAuth
# Authentication with CARTO
carto_auth = CartoAuth.from_oauth()
# CARTO Data Warehouse client
carto_dw_client = carto_auth.get_carto_dw_client()

Listing our Data Observatory subscriptions and exploring their metadata

We first retrieve a list of all our subscriptions as a pandas dataframe in order to explore which datasets from the Data Observatory we have available. For more details about how to use the following SQL functions, please refer to the Analytics Toolbox documentation.

To understand how the Data Observatory structures its datasets, we recommend reading the Terminology section of the Data Observatory documentation.

datasets = list(carto_dw_client.list_datasets())

if datasets:
    print("Datasets in CARTO Data Warehouse:")
    for dataset in datasets:
        print("\t{}".format(dataset.dataset_id))
else:
    print("CARTO Data Warehouse project does not contain any datasets.")

# Set this to your own cdw_dataset: your Google Cloud project_id followed by the
# BigQuery dataset where your subscriptions are stored, "{project_id}.{bigquery_dataset_id}".
cdw_dataset = "carto-data.ac_7xhfwyml"
get_subscriptions_q = f"""
CALL `carto-un`.carto.DATAOBS_SUBSCRIPTIONS('{cdw_dataset}',"dataset_license = 'Public data'");
"""
subs_df = carto_dw_client.query(get_subscriptions_q).result().to_dataframe()
subs_df.sample(5)
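
The filter passed to DATAOBS_SUBSCRIPTIONS is a SQL expression nested inside a double-quoted string, so the quoting is easy to get wrong. The following is a minimal sketch of a helper that assembles the same call as above. Note that build_subscriptions_query is a hypothetical name used for illustration; it is not part of carto_auth or the Analytics Toolbox, and it assumes the two-argument form of the procedure shown in this guide.

```python
def build_subscriptions_query(cdw_dataset: str, filters: str) -> str:
    """Return the SQL that lists Data Observatory subscriptions.

    cdw_dataset: "{project_id}.{bigquery_dataset_id}" holding the subscriptions.
    filters: SQL expression over the subscription metadata columns.
    """
    # Hypothetical helper: the dataset id is wrapped in single quotes in the SQL.
    if not cdw_dataset or "'" in cdw_dataset:
        raise ValueError("cdw_dataset must be a non-empty id without quotes")
    # The filter is wrapped in double quotes, so it may only use single quotes.
    if '"' in filters:
        raise ValueError("use single quotes inside the filter expression")
    return (
        "CALL `carto-un`.carto.DATAOBS_SUBSCRIPTIONS("
        f"'{cdw_dataset}', \"{filters}\");"
    )


query = build_subscriptions_query(
    "carto-data.ac_7xhfwyml", "dataset_license = 'Public data'"
)
print(query)
```

The resulting string can be passed to carto_dw_client.query(...) exactly like get_subscriptions_q.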

Let’s take a look at the subscriptions we have for the United States.

subs_df.query("dataset_country == 'United States of America'").tail(5)

After exploring all the available datasets and their metadata, we decide to pick for this example the “Population” dataset from Worldpop and explore what variables it contains. For that we use the dataset_slug.

get_dataset_variables = f"""
CALL `carto-un`.carto.DATAOBS_SUBSCRIPTION_VARIABLES(
    "{cdw_dataset}",
    "dataset_slug = 'wp_population_704f6b75'"
    );
"""
vars_df = carto_dw_client.query(get_dataset_variables).result().to_dataframe(create_bqstorage_client=False)
vars_df

Accessing and exporting data from a Data Observatory subscription

Once we have explored the available variables and know what we want to do with the data, we can use the Python client for the CARTO Data Warehouse connection to run any available function from the Analytics Toolbox. Additionally, we can export the data to a geodataframe, or even to local files in formats such as CSV or Parquet.

In this example, we will retrieve the population variable for a 10 km buffer around Atlanta.

We will need the IDs of both data tables and geography tables of the specific subscription we want to work with.

dataset_id, geography_id = subs_df.query("dataset_slug == 'wp_population_704f6b75'")[["dataset_table", "associated_geography_table"]].values.ravel()
dataset_id, geography_id

usa_pop_q = f"""
WITH whole_usa AS (
SELECT population, geom
FROM `{cdw_dataset}.{dataset_id}` d
JOIN `{cdw_dataset}.{geography_id}` g
ON d.geoid = g.geoid
)
SELECT * FROM whole_usa
WHERE ST_INTERSECTS(geom, ST_BUFFER(ST_GEOGPOINT(-84.387655, 33.760213), 10000))
"""
atlanta_df = carto_dw_client.query(usa_pop_q).result().to_dataframe(create_bqstorage_client=False)
atlanta_df.sample(5)
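
To reuse this query for other locations, the point and the buffer radius can be turned into parameters. The sketch below only builds the SQL text; build_buffer_query is a hypothetical helper, and the two table names in the usage lines are placeholders standing in for the dataset_id and geography_id values retrieved from subs_df.

```python
def build_buffer_query(cdw_dataset, dataset_id, geography_id, lon, lat, radius_m):
    """Build the SQL selecting population rows that intersect a circular buffer."""
    # float() rejects non-numeric input before it is interpolated into the SQL text.
    lon, lat, radius_m = float(lon), float(lat), float(radius_m)
    if not (-180 <= lon <= 180 and -90 <= lat <= 90):
        raise ValueError("expected lon in [-180, 180] and lat in [-90, 90]")
    if radius_m <= 0:
        raise ValueError("radius_m must be a positive number of meters")
    # ST_GEOGPOINT takes longitude first, then latitude.
    return f"""
WITH joined AS (
  SELECT population, geom
  FROM `{cdw_dataset}.{dataset_id}` d
  JOIN `{cdw_dataset}.{geography_id}` g
  ON d.geoid = g.geoid
)
SELECT * FROM joined
WHERE ST_INTERSECTS(geom, ST_BUFFER(ST_GEOGPOINT({lon}, {lat}), {radius_m}))
"""


# Placeholder table names; in the notebook these come from subs_df.
atlanta_q = build_buffer_query(
    "carto-data.ac_7xhfwyml",
    "dataset_table_name",
    "geography_table_name",
    lon=-84.387655,
    lat=33.760213,
    radius_m=10000,
)
print(atlanta_q)
```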

Now that we have our data of interest in a dataframe, we can also save it on our local machine in several formats.

atlanta_df.to_csv("atlanta_df.csv")
atlanta_df.to_parquet("atlanta_df.parquet")

Enriching data with a Data Observatory subscription

retail_stores is a dataset with information about the revenue and size of retail stores in the USA, available by default as demo data in your CARTO Data Warehouse. We are going to enrich this table with the population variable from the previous example (slug_id: population_e3a78133), based on the population reported by Worldpop at the location of each retail store.

We define an output table where the enriched data will be placed, also within the CARTO Data Warehouse. Later we use the pydeck-carto package to visualize the results, rendering directly from the table in the data warehouse.

output_table_id = 'carto-dw-ac-7xhfwyml.shared.retail_stores_enriched'
enrich_q = f"""
CALL `carto-un`.carto.DATAOBS_ENRICH_POINTS(
   R'''
   SELECT cartodb_id, revenue, geom FROM `carto-demo-data.demo_tables.retail_stores`
   ''',
   'geom',
   [('population_e3a78133', 'sum')],
   NULL,
   ['{output_table_id}'],
   '{cdw_dataset}'
);
"""
carto_dw_client.delete_table(output_table_id, not_found_ok = True)
carto_dw_client.query(enrich_q).result()
carto_dw_client.query(f"SELECT * FROM `{output_table_id}` WHERE population_e3a78133_sum > 0  LIMIT 10").result().to_dataframe(create_bqstorage_client=False)
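
The third argument of the DATAOBS_ENRICH_POINTS call above is an array of (variable slug, aggregation) pairs written as SQL text. As a minimal sketch, that array can be generated from a Python list so that the pairs are easier to maintain; format_enrich_variables is a hypothetical helper written for this guide, not a function of the Analytics Toolbox.

```python
def format_enrich_variables(variables):
    """Render [(variable_slug, aggregation), ...] as a SQL array literal."""
    if not variables:
        raise ValueError("at least one (slug, aggregation) pair is required")
    items = []
    for slug, aggregation in variables:
        # Both values are interpolated into SQL text, so reject embedded quotes.
        if "'" in slug or "'" in aggregation:
            raise ValueError(f"unexpected quote in {(slug, aggregation)!r}")
        items.append(f"('{slug}', '{aggregation}')")
    return "[" + ", ".join(items) + "]"


variables_sql = format_enrich_variables([("population_e3a78133", "sum")])
print(variables_sql)
# prints: [('population_e3a78133', 'sum')]
```

The returned string can be interpolated into enrich_q in place of the hard-coded array.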
# Register CartoLayer in pydeck
pdkc.register_carto_layer()
credentials = pdkc.get_layer_credentials(carto_auth)

enriched_layer = pdk.Layer(
    "CartoLayer",
    data = "SELECT * FROM `carto-dw-ac-7xhfwyml.shared.retail_stores_enriched`",
    geo_column=pdk.types.String("geom"),
    type_=pdkc.MapType.QUERY,
    connection=pdkc.CartoConnection.CARTO_DW,
    credentials=credentials,
    opacity=0.2,
    pickable=True,
    stroked=True,
    point_radius_min_pixels=2,
    get_fill_color=pdkc.styles.color_continuous("population_e3a78133_sum", [x*100 for x in range(10)], colors = "Tropic")
    )

tooltip = {
    "html": "Population: <b>{population_e3a78133_sum}</b> - Revenue <b>{revenue}</b>",
    "style": {"background": "grey", "color": "white", "font-family": '"Helvetica Neue", Arial', "z-index": "10000"},
}

view_state = pdk.ViewState(latitude=33.64, longitude=-117.94, zoom=4)
r = pdk.Deck(
    [enriched_layer],
    tooltip = tooltip,
    initial_view_state=view_state,
    map_style=pdk.map_styles.LIGHT,
)
r.to_html(iframe_height = 400)
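
The list passed as the second argument of color_continuous in the layer above is hard-coded as ten values from 0 to 900. A minimal sketch of deriving such a list from a value range instead is shown below; linear_breaks is a hypothetical helper and not part of pydeck-carto, and the 0 to 1000 range is only chosen to reproduce the values used in this guide.

```python
def linear_breaks(min_value, max_value, steps=10):
    """Return `steps` evenly spaced values starting at min_value.

    The upper bound itself is excluded, matching range()-style lists.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    if max_value <= min_value:
        raise ValueError("max_value must be greater than min_value")
    step = (max_value - min_value) / steps
    return [min_value + i * step for i in range(steps)]


# Numerically equal to [x * 100 for x in range(10)], as floats.
breaks = linear_breaks(0, 1000)
print(breaks)
```

In the notebook, the bounds could be taken from the enriched column, for example its minimum and maximum values.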

To learn more about the Data Enrichment functions, please check the relevant section of the SQL Reference of the Analytics Toolbox. There is also additional information about the Enrichment workflow with the Analytics Toolbox.
