How to visualize data from Databricks

This notebook walks through connecting to a CARTO account and using CARTO's integration with pydeck to visualize data from a Databricks connection, relying on the native H3 spatial index support in both CARTO and Databricks. You can find the original notebook here.

Install dependencies

!pip install pydeck-carto pydeck carto-auth -q
import pydeck as pdk
import pydeck_carto as pdkc
from carto_auth import CartoAuth

Authenticate with CARTO

The call below obtains an OAuth access token that lets you interact with the CARTO platform with the same privileges and access to resources as your logged-in user.

carto_auth = CartoAuth.from_oauth()

Load an H3 layer from a Databricks connection

CARTO can connect to a Databricks SQL Warehouse or a Databricks cluster to push down SQL queries that will be fully executed in your Databricks environment. In both cases, the connection fully supports H3 indexes, leveraging Databricks' native capabilities.

Learn more about how to connect CARTO to Databricks here.
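
For readers new to H3, the following is a minimal sketch of what rolling cells up to a coarser resolution involves. It is not CARTO's or Databricks' implementation: it uses only the standard library and the documented H3 bit layout (a 4-bit resolution field at bit 52, followed by 3-bit digits per resolution level). The helper names `h3_resolution`, `h3_parent`, and `sum_by_parent` are hypothetical, the population values are invented, and the target resolution is an arbitrary choice for illustration. How `aggregation_res_level` maps to an actual H3 resolution at a given zoom is handled by CARTO and is not reproduced here.

```python
from collections import defaultdict

H3_RES_OFFSET = 52                    # bit position of the 4-bit resolution field
H3_RES_MASK = 0xF << H3_RES_OFFSET
H3_MAX_RES = 15
H3_DIGIT_BITS = 3                     # each resolution digit occupies 3 bits


def h3_resolution(cell: int) -> int:
    """Read the resolution stored in an integer H3 cell index."""
    return (cell & H3_RES_MASK) >> H3_RES_OFFSET


def h3_parent(cell: int, parent_res: int) -> int:
    """Return the ancestor of `cell` at the coarser resolution `parent_res`."""
    res = h3_resolution(cell)
    if not 0 <= parent_res <= res:
        raise ValueError("parent_res must be between 0 and the cell resolution")
    # Overwrite the resolution field with the parent resolution.
    parent = (cell & ~H3_RES_MASK) | (parent_res << H3_RES_OFFSET)
    # Mark every digit finer than the parent resolution as unused (0b111).
    for r in range(parent_res + 1, res + 1):
        parent |= 0b111 << ((H3_MAX_RES - r) * H3_DIGIT_BITS)
    return parent


def sum_by_parent(rows, parent_res):
    """Group (cell, population) pairs by parent cell and sum the population."""
    totals = defaultdict(int)
    for cell, population in rows:
        totals[h3_parent(cell, parent_res)] += population
    return dict(totals)


# A resolution-9 cell and a sibling obtained by changing its last digit.
cell = int("8928308280fffff", 16)
sibling = cell ^ (0b111 << ((H3_MAX_RES - 9) * H3_DIGIT_BITS))

rows = [(cell, 10), (sibling, 5)]     # invented population values
for parent, population in sum_by_parent(rows, 8).items():
    print(format(parent, "x"), population)
```

Conceptually, an aggregation expression such as `sum(population) as population` plays the role of `sum_by_parent` here, except that the grouping runs as SQL inside Databricks rather than in the notebook.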

# Register CartoLayer in pydeck
pdkc.register_carto_layer()

# Render CartoLayer in pydeck
layer = pdk.Layer(
    "CartoLayer",
    data="hive_metastore.carto_dev_data.derived_spatialfeatures_ukr_h3int_res10_v1_yearly_v2_interpolated",
    type_=pdkc.MapType.TABLE,
    connection=pdk.types.String("databricksconn_cluster"),
    credentials=pdkc.get_layer_credentials(carto_auth),
    aggregation_exp=pdk.types.String("sum(population) as population"),
    aggregation_res_level=5,
    geo_column=pdk.types.String("h3"),
    get_fill_color=pdkc.styles.color_bins("population", [1, 10, 100, 1000, 10000, 100000], "SunsetDark"),
    get_line_color=[0, 0, 0, 80],
    line_width_min_pixels=0.5,
    stroked=True,
    extruded=False,
    pickable=True
)

tooltip = {
    "html": "Population: <b>{population}</b>",
    "style": {"background": "grey", "color": "white", "font-family": '"Helvetica Neue", Arial', "z-index": "10000"},
}

view_state = pdk.ViewState(latitude=49.0, longitude=30.0, zoom=5)
pdk.Deck(layer, map_style=pdk.map_styles.ROAD, initial_view_state=view_state)
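
To make the `color_bins` styling in the layer above more concrete, here is a small sketch of how six breakpoints split population values into seven classes. It assumes threshold-scale semantics, where a value equal to a breakpoint falls into the upper class; this illustrates the idea only and is not pydeck-carto's actual code. The helper `bin_index` is hypothetical.

```python
from bisect import bisect_right

# Same breakpoints as passed to color_bins in the layer definition.
BREAKS = [1, 10, 100, 1000, 10000, 100000]


def bin_index(value, breaks=BREAKS):
    """Return the class index (0..len(breaks)) that `value` falls into."""
    return bisect_right(breaks, value)


for value in (0, 5, 50, 250000):
    print(value, "->", bin_index(value))
```

Each class index would then select one color from the chosen palette, so cells with a population below 1 and cells above 100000 get the first and last colors respectively.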
