Personal (former Single User) cluster
Follow the instructions below to install the CARTO Analytics Toolbox for Databricks on a Personal (former Single User) cluster. In order to complete all steps, you will need:
A .jar package that will be installed in the cluster.
A carto_sql_init.sh script that will be used as the init script for the cluster.
Upload your installer to the Databricks workspace
On the left-side panel in Databricks, click on 'Workspace'. Then locate and click on the three-dots button next to 'Share' and select 'Import'.

Use the import dialog to upload both the JAR package and the init script.
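Alternatively, this step can be scripted with the Databricks Python SDK instead of the UI. The sketch below is a minimal, unofficial example: the local file names and the /Shared/carto workspace folder are placeholders, not part of the installer, so adjust them to your own setup.
# Minimal sketch (assumption: databricks-sdk is installed and credentials
# are configured via environment variables or ~/.databrickscfg).
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.workspace import ImportFormat

w = WorkspaceClient()
w.workspace.mkdirs("/Shared/carto")  # placeholder destination folder

for local_file, workspace_path in [
    ("analyticstoolbox.jar", "/Shared/carto/analyticstoolbox.jar"),  # placeholder names
    ("carto_sql_init.sh", "/Shared/carto/carto_sql_init.sh"),
]:
    with open(local_file, "rb") as f:
        # ImportFormat.AUTO keeps these as plain workspace files, not notebooks
        w.workspace.upload(workspace_path, f, format=ImportFormat.AUTO, overwrite=True)
On the cluster's file system, files uploaded this way appear under /Workspace/Shared/carto.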
Create a cluster
In the Databricks workspace go to 'Compute' and make sure you are on the 'All-purpose compute' tab. Click on the 'Create compute' button.
When creating the cluster, take the following into consideration:
Photon Acceleration needs to be enabled in the cluster.
Recommended DBR is 15.4 LTS.
Some DBR versions don't support Photon Acceleration. Make sure you select a DBR that allows enabling Photon.
Bear in mind that performance will vary with the cluster size.

Now, scroll down to Advanced Options.
Check the 'Spark' tab and enter the following:
Spark config
spark.sql.extensions com.carto.analytics.toolbox.sql.SparkExtension
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator org.apache.sedona.core.serde.SedonaKryoRegistrator
Environment variables
SCALAPY_PYTHON_LIBRARY=python3.11
SCALAPY_PYTHON_PROGRAMNAME=/databricks/python3/bin/python
CARTO_AT_LOCATION=/Workspace/path_to_at_folder
The code above will be interpreted in a bash script. Be careful when setting environment variables in your cluster and make sure there are no intermediate or trailing spaces in any variable; even small formatting issues like this can cause errors that are difficult to debug.
You can get the path to your JAR file from the Workspace UI by clicking the options (three-dots) button next to the file in the directory where it was uploaded:

Still within Advanced Options, open the 'Init Scripts' tab and use the UI to locate and add the file path to the script provided with the installer package. Select 'Workspace' as the source:

Click on 'Create compute' and wait for the process to finish.
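If you'd rather automate the whole cluster setup, the same configuration can be expressed with the Databricks Python SDK. This is a sketch under assumptions: the node type, user name, and workspace paths are placeholders, and you should confirm the exact DBR version string available in your workspace before running it.
# Minimal sketch: create an equivalent Personal cluster programmatically.
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import compute

w = WorkspaceClient()

cluster = w.clusters.create_and_wait(
    cluster_name="carto-analytics-toolbox",
    spark_version="15.4.x-scala2.12",  # the recommended 15.4 LTS runtime
    runtime_engine=compute.RuntimeEngine.PHOTON,  # Photon must be enabled
    node_type_id="i3.xlarge",          # placeholder; size to your workload
    num_workers=1,
    data_security_mode=compute.DataSecurityMode.SINGLE_USER,
    single_user_name="you@example.com",  # placeholder
    spark_conf={
        "spark.sql.extensions": "com.carto.analytics.toolbox.sql.SparkExtension",
        "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
        "spark.kryo.registrator": "org.apache.sedona.core.serde.SedonaKryoRegistrator",
    },
    spark_env_vars={
        "SCALAPY_PYTHON_LIBRARY": "python3.11",
        "SCALAPY_PYTHON_PROGRAMNAME": "/databricks/python3/bin/python",
        "CARTO_AT_LOCATION": "/Workspace/path_to_at_folder",  # same placeholder as above
    },
    init_scripts=[
        compute.InitScriptInfo(
            workspace=compute.WorkspaceStorageInfo(
                destination="/Shared/carto/carto_sql_init.sh"  # placeholder path
            )
        )
    ],
)
print(f"Cluster {cluster.cluster_id} is running")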
🎉 Congrats! We're done, and you should be able to use the Analytics Toolbox functions and procedures from your Databricks notebooks.
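As a quick smoke test, you can try calling a Toolbox function from a notebook attached to the new cluster. The function name below is hypothetical, used only for illustration; check the Analytics Toolbox SQL reference for the functions shipped with your version.
# Hypothetical function name -- replace with one from the SQL reference.
spark.sql("SELECT carto.VERSION_CORE() AS at_version").show(truncate=False)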