Access data in Databricks

To make your Data Observatory subscriptions accessible in your Databricks account, CARTO first needs to create and register the subscriptions in your CARTO account. Once this is done, Admin users of your CARTO organization will be able to see the list of active subscriptions in the Data Observatory section in Settings.

Once your subscriptions have been created, the CARTO team will proceed with the data transfers to make the data accessible directly in your Databricks account. Please contact your CARTO representative if you need to start this process on demand (e.g. for public data subscriptions).

To make your subscriptions available in your Databricks account, CARTO leverages Databricks' native private exchange mechanism, powered by Delta Sharing.

For CARTO to create and share a private exchange with your Databricks account through Delta Sharing, you need to have Databricks' Unity Catalog enabled.

CARTO will create and maintain a private exchange, shared with your Databricks account, containing the data from all your Data Observatory subscriptions. To create this private share, we need the following information about your Databricks account:

  • Databricks Metastore ID or Databricks sharing identifier

  • Cloud

  • Region
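If you are unsure where to find these details, one possible way to look them up is from the SQL Editor. The query below is a minimal sketch that assumes Unity Catalog is already enabled on the workspace you run it from; the returned value typically encodes the cloud, the region and the metastore UUID, but check with your CARTO representative which exact value they need.

```sql
-- Returns the ID of the Unity Catalog metastore attached to the current workspace.
-- Assumption: the workspace is attached to a Unity Catalog metastore.
SELECT current_metastore();
```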

Once CARTO has completed the data transfer, you can navigate to the Delta Sharing section in the Catalog area and find CARTO's private share in the "Shared with me" section.
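The same check can also be sketched in SQL, if you prefer the SQL Editor over the Catalog UI. The provider name `carto_provider` below is a hypothetical placeholder, not the actual name used by CARTO; replace it with the provider name listed in your account. Running these commands may require metastore admin rights or the relevant provider privileges.

```sql
-- List the data providers that have shared data with this metastore.
SHOW PROVIDERS;

-- List the shares offered by one provider.
-- `carto_provider` is a hypothetical name used for illustration only.
SHOW SHARES IN PROVIDER carto_provider;
```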

To access the data from the private share, click "Create Catalog". You will be prompted to provide a catalog name.

Once the new catalog has been created, you will be able to access the data tables within the carto schema.

Alternatively, CARTO can provide you with the command to create the catalog associated with the private share directly in the SQL Editor, filled in with the details of your specific private share.
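As a rough sketch of the general shape of such a command, the statement below uses the standard Databricks `CREATE CATALOG ... USING SHARE` syntax. The names `carto_do`, `carto_provider` and `carto_share` are hypothetical placeholders; the actual provider and share names will be the ones CARTO gives you.

```sql
-- Create a catalog backed by the private share.
-- Hypothetical names: carto_do (catalog), carto_provider (provider), carto_share (share).
CREATE CATALOG IF NOT EXISTS carto_do
USING SHARE carto_provider.carto_share;

-- List the tables exposed in the carto schema of the new catalog.
SHOW TABLES IN carto_do.carto;
```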

It is important to note that the catalog created from the Delta Share is a “Delta Sharing Catalog”, which is read-only. This prevents you from preparing the data for faster geospatial queries and from carrying out any other processing of the data within that catalog.

We recommend copying the tables into a different catalog within your Databricks organization, which you can do with a query in the SQL Editor.
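A minimal sketch of such a copy, using a plain `CREATE TABLE ... AS SELECT`, could look like the following. All names are hypothetical placeholders: `carto_do` stands for the read-only catalog created from the share, `my_catalog.carto_data` for a writable catalog and schema you own, and `my_subscription_table` for one of your subscription tables.

```sql
-- Make sure the writable target schema exists.
-- Hypothetical names: my_catalog (writable catalog), carto_data (target schema).
CREATE SCHEMA IF NOT EXISTS my_catalog.carto_data;

-- Copy one shared table into the writable catalog.
-- carto_do is the hypothetical read-only catalog created from the share.
CREATE TABLE IF NOT EXISTS my_catalog.carto_data.my_subscription_table AS
SELECT *
FROM carto_do.carto.my_subscription_table;
```

Note that a copy made this way is a snapshot taken when the query runs, so it would need to be refreshed to pick up later updates to the shared data.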

To use the data from Data Observatory subscriptions that have been transferred directly to your Databricks account, do not use the Data Observatory tabs in Data Explorer, Workflows or Builder. Use your own Databricks connections instead.

Access the data from your data warehouse connection in CARTO

Once the transfer has been completed, you will be able to explore your subscription directly from your data warehouse in the Data Explorer.

Using your data for spatial analysis in Databricks

CARTO supports Databricks native GEOMETRY columns. No special table preparation is required - CARTO works directly with your geometry data.

To learn more about Databricks connections in CARTO please access this section. For map performance considerations please read these recommendations.
