How to access CARTO’s Analytics Toolbox for Snowflake and create visualizations via Python notebooks


Last updated 1 year ago


This notebook guides you through connecting to both your CARTO and Snowflake accounts, and through using CARTO’s Analytics Toolbox and CARTO’s integration with Pydeck to perform spatial analytics at scale and create map visualizations from Python notebooks. You can find the original notebook here.

The outline of this notebook is as follows:

  • Authentication to CARTO, to be able to use ‘CartoLayer’ in Pydeck

  • Authentication to Snowflake (with credentials that have access to the database connected to CARTO, where the Analytics Toolbox is installed)

  • Operations and analysis using the Snowpark Python connector and CARTO’s Analytics Toolbox

  • Map visualizations with CARTO and Pydeck

NOTE: snowflake-snowpark-python is only compatible with Python >= 3.8, so be sure to run the notebook in an appropriate environment.

!pip install snowflake-snowpark-python pandas pydeck pydeck-carto shapely python-dotenv

Loading libraries

import pydeck as pdk
import pydeck_carto as pdkc
from carto_auth import CartoAuth


import os
import json
from shapely.geometry import shape
from dotenv import load_dotenv
import pandas as pd


from snowflake.snowpark.session import Session

Authentication to CARTO

To authenticate to your CARTO account, install the carto-auth package (imported as carto_auth) and use it to log in with your credentials.

# Authentication with CARTO
carto_auth = CartoAuth.from_oauth()

Authentication to Snowflake

The cell below creates a .env file with the environment variables used for connecting to Snowflake. Replace the XXXXXX placeholders with your own values.

with open(".env", "w+") as f:
    f.write(
"""
SF_ACCOUNT=XXXXXX
SF_USER=XXXXXX
SF_PASSWORD=XXXXXX
CARTO_APP_CREDS_FILE=creds.json
""")

load_dotenv() #loads env variables in .env file

We read our Snowflake credentials from the environment with os.environ and use them to create a Snowpark session.

def create_session_object(database, schema, verbose=True):
    """Create a Snowpark session for the given database and schema."""
    connection_parameters = {
        "account": os.environ.get("SF_ACCOUNT"),
        "user": os.environ.get("SF_USER"),
        "password": os.environ.get("SF_PASSWORD"),
        "database": database,
        "schema": schema,
    }
    session = Session.builder.configs(connection_parameters).create()
    if verbose:
        # Print the active warehouse, database and schema as a quick sanity check
        print(session.sql('select current_warehouse(), current_database(), current_schema()').collect())
    return session


sf_client = create_session_object("SFDATABASE", "CARTO")
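
A variable that is missing from the .env file tends to surface only later, as a connection error. As a minimal sketch, the credentials can be checked before the session is built. The helper name read_snowflake_env is hypothetical and is not part of Snowpark or CARTO; only the three variable names come from the cell above.

```python
import os

# Variable names written to the .env file earlier in this notebook.
REQUIRED_VARS = ("SF_ACCOUNT", "SF_USER", "SF_PASSWORD")


def read_snowflake_env(environ=os.environ):
    """Return the Snowflake settings, raising if any is missing or empty.

    Hypothetical helper for illustration; not a Snowpark or CARTO function.
    """
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {
        "account": environ["SF_ACCOUNT"],
        "user": environ["SF_USER"],
        "password": environ["SF_PASSWORD"],
    }


# Example with an in-memory mapping instead of the real environment.
params = read_snowflake_env({"SF_ACCOUNT": "acc", "SF_USER": "me", "SF_PASSWORD": "pw"})
print(sorted(params))
```

The returned dictionary uses the same keys as connection_parameters, so it could be merged with the database and schema entries before calling Session.builder.configs.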

Downloading data from Snowflake into a Python dataframe

“Crossfit” is a gym chain located in California. We will run a location analysis of “Crossfit” venues versus its competitors.

q = """
WITH crossfit_count AS (
SELECT CARTO_DEV_DATA.carto.H3_FROMGEOGPOINT(geom, 5) h3, COUNT(*) crossfit_gyms
FROM SFDATABASE.CARTO.GYMS_CA_CROSSFIT
GROUP BY h3
),
competition_count AS (
SELECT CARTO_DEV_DATA.carto.H3_FROMGEOGPOINT(geom, 5) h3, COUNT(*) competition_gyms
FROM SFDATABASE.CARTO.GYMS_CA_COMPETITION
GROUP BY h3
)
SELECT coalesce(a.h3,b.h3) h3, crossfit_gyms, competition_gyms, CARTO_DEV_DATA.carto.H3_BOUNDARY(coalesce(a.h3,b.h3)) geom
FROM crossfit_count a FULL OUTER JOIN competition_count b ON a.h3 = b.h3
"""
gyms_df = sf_client.sql(q).to_pandas()
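
The FULL OUTER JOIN with COALESCE in the query keeps every H3 cell that appears in either table. The following plain-Python sketch mimics that logic on made-up cell ids and counts. It is an illustration of the join semantics only, not how Snowflake executes the query, and the function name is hypothetical.

```python
def full_outer_join_counts(crossfit, competition):
    """Merge two {h3_cell: count} mappings, keeping cells present in either one."""
    rows = []
    # The union of keys plays the role of COALESCE(a.h3, b.h3).
    for cell in sorted(set(crossfit) | set(competition)):
        rows.append({
            "h3": cell,
            "crossfit_gyms": crossfit.get(cell),        # None stands in for SQL NULL
            "competition_gyms": competition.get(cell),
        })
    return rows


# Made-up cell ids and counts, for illustration only.
crossfit = {"cell_a": 2, "cell_b": 1}
competition = {"cell_b": 4, "cell_c": 3}

rows = full_outer_join_counts(crossfit, competition)
for row in rows:
    print(row)
```

Cells that appear in only one table carry a missing count on the other side, which is why the notebook replaces missing values with 0 after downloading the data.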

We can directly export the output of a query as a pandas dataframe. The geometry column is downloaded as GeoJSON text.

gyms_df.head()
# converts from geojson string to polygon
text_to_geom = lambda t : shape(json.loads(t))

gyms_df["GEOM"] = gyms_df.GEOM.apply(text_to_geom)
gyms_df = gyms_df.fillna(0)
gyms_df.head()
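
To see what the GEOM column holds before conversion, a GeoJSON string can be inspected with the standard library alone. The polygon below is hand-written and is not an actual H3 boundary; the sketch only shows the structure that shape() consumes.

```python
import json

# A hand-written GeoJSON polygon (illustrative coordinates, not a real H3 cell).
geojson_text = (
    '{"type": "Polygon", "coordinates": '
    '[[[-118.0, 33.0], [-117.0, 33.0], [-117.0, 34.0], [-118.0, 34.0], [-118.0, 33.0]]]}'
)

geometry = json.loads(geojson_text)
exterior_ring = geometry["coordinates"][0]

# GeoJSON stores positions as [longitude, latitude] and closes each ring
# by repeating the first position at the end.
is_closed = exterior_ring[0] == exterior_ring[-1]
lons = [lon for lon, lat in exterior_ring]
lats = [lat for lon, lat in exterior_ring]
bbox = (min(lons), min(lats), max(lons), max(lats))

print(geometry["type"], is_closed, bbox)
```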

Uploading a dataframe back to Snowflake

We transform our current dataframe, and we upload it back into our Snowflake database

total_gyms = gyms_df.drop(columns = ["GEOM"])
total_gyms["TOTAL_GYMS"] = gyms_df.CROSSFIT_GYMS + gyms_df.COMPETITION_GYMS

# We go from pandas DF to Snowflake DF. This creates a temp table with the data, which will be dropped at the end of the session.
snowflake_df = sf_client.create_dataframe(total_gyms)

# We persist the table.
snowflake_df.write.save_as_table("SFDATABASE.CARTO.GYMS_CA_TOTAL_CENTROID", mode = "overwrite")

Visualizing data in Snowflake with the pydeck-carto library

Here we visualize the uploaded data in two layers, using the new styling functions and the Analytics Toolbox installed in Snowflake.

  • hexagons: renders the H3 cells with a continuous color style representing the dominance ratio of Crossfit gyms vs. the total number of gyms

  • points: plots the location of the gyms, with a categorical color style representing the gym type (Crossfit gyms vs. competition gyms)

# Note that the attribute name must be upper-cased when passed to the styling functions,
# even though it is written in lower case in the query.
# This is because Snowflake stores unquoted column names in upper case.

# Register CartoLayer in pydeck
pdkc.register_carto_layer()

hexagons_query = """
SELECT  CARTO_DEV_DATA.carto.H3_BOUNDARY("H3") H3_GEOM,
        CROSSFIT_GYMS / TOTAL_GYMS AS dominance_ratio
        FROM SFDATABASE.CARTO.GYMS_CA_TOTAL_CENTROID
        """

credentials = pdkc.get_layer_credentials(carto_auth)

hexagons = pdk.Layer(
    "CartoLayer",
    data = hexagons_query,
    geo_column=pdk.types.String("H3_GEOM"),
    type_=pdkc.MapType.QUERY,
    connection=pdk.types.String("snowflake"),
    credentials=credentials,
    opacity=0.2,
    stroked=True,
    get_fill_color=pdkc.styles.color_continuous("DOMINANCE_RATIO", [x/10 for x in range(10)], colors = "Tropic"),
    get_line_color=[0,42,42],
    line_width_min_pixels=2
    )

points_query = """
SELECT GEOM, 'crossfit' AS CATEGORY
FROM SFDATABASE.CARTO.GYMS_CA_CROSSFIT
UNION ALL
SELECT GEOM, 'competitors' AS CATEGORY
FROM SFDATABASE.CARTO.GYMS_CA_COMPETITION
"""

points = pdk.Layer(
    "CartoLayer",
    data = points_query,
    geo_column=pdk.types.String("GEOM"),
    type_=pdkc.MapType.QUERY,
    connection=pdk.types.String("snowflake"),
    credentials=credentials,
    opacity=0.8,
    stroked=True,
    pickable=True,
    point_radius_min_pixels=2,
    get_fill_color=pdkc.styles.color_categories("CATEGORY", ["competitors", "crossfit"], colors = "Tropic")
    )

view_state = pdk.ViewState(latitude=33.64, longitude=-117.94, zoom=5)
r = pdk.Deck(
    [hexagons, points],
    initial_view_state=view_state,
    map_style=pdk.map_styles.LIGHT,
)
r.to_html(iframe_height = 700)
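
The dominance ratio and the breakpoints passed to color_continuous are simple arithmetic and can be checked without a Snowflake connection. The sketch below uses made-up counts, and the helper name dominance_ratio is hypothetical. It does not reproduce how pydeck-carto assigns colors.

```python
def dominance_ratio(crossfit_gyms, competition_gyms):
    """Share of Crossfit gyms among all gyms in one H3 cell."""
    total = crossfit_gyms + competition_gyms
    if total == 0:
        return None  # no gyms in the cell, so the ratio is undefined
    return crossfit_gyms / total


# Same expression as in the hexagon layer: ten breakpoints starting at 0.0.
breakpoints = [x / 10 for x in range(10)]

print(len(breakpoints), breakpoints[0], breakpoints[-1])
print(dominance_ratio(1, 3))  # 0.25
print(dominance_ratio(2, 0))  # 1.0
```

Note that the list stops at 0.9, so a cell containing only Crossfit gyms (ratio 1.0) lies above the last breakpoint.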

In the download step, we use the h3 module in CARTO’s Analytics Toolbox for Snowflake to compute the H3 cell of each gym in the “Crossfit” and “Competition” tables. We then join them by H3 id and download the data.