Analytics Toolbox for BigQuery

Getting access

Access from CARTO Workspace

To access all the modules of the CARTO Analytics Toolbox (both core and advanced), you will need:

  • A CARTO account. If you don’t have one yet, you can create a trial account here.
  • A Google Cloud Platform account. You can find more information about how to get one here.

Getting access for a Google user

Access to the Analytics Toolbox for BigQuery is granted to all Google users who create a connection to BigQuery from the CARTO Workspace using OAuth. You can find step-by-step instructions on how to create a connection with BigQuery here.

Once you create a connection, your user will have the permissions needed to run all the functions and procedures of the Analytics Toolbox for BigQuery directly from the BigQuery console. They are available in a project that depends on the region of your BigQuery account, and always under the carto dataset. For example, in the US and EU multi-regions, the Analytics Toolbox functions are available in the carto-un and carto-un-eu projects, respectively. Check the full list of projects for the different cloud regions to choose the best one for the location of your data.

Getting access for a Service Account

Access to the Analytics Toolbox for BigQuery is granted to every service account that is used to create a connection to BigQuery from the CARTO Workspace (Service Account option). You can find step-by-step instructions on how to create a connection with BigQuery here.

These service accounts will have the permissions needed to run all the functions and procedures of the Analytics Toolbox for BigQuery. They are available in a project that depends on the region of your BigQuery account, and always under the carto dataset. For example, in the US and EU multi-regions, the Analytics Toolbox functions are available in the carto-un and carto-un-eu projects, respectively. Check the full list of projects for the different cloud regions to choose the best one for the location of your data.

Running the Analytics Toolbox

The Analytics Toolbox can be run from:

  • your BigQuery console, after creating an OAuth connection to BigQuery from the Workspace.
  • any BigQuery client, authenticated using a Service Account that has been previously used to create a connection to BigQuery from the Workspace.
  • directly from the Workspace, by:
    • Creating tilesets from the Data Explorer following this guide.
    • Creating custom SQL layers in Builder following this guide.
    • Enriching your data with Data Observatory subscriptions following this guide.

Free access to the core modules

If you are not a CARTO customer you can still use the core modules of the Analytics Toolbox. These modules are available to all BigQuery authenticated users through the carto-os and carto-os-eu projects. These projects are deployed in the US and EU multi-regions, respectively, and you may choose one or the other depending on the location of your data.
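
The project names mentioned so far can be captured in a small shell helper. The sketch below is illustrative only: the function names toolbox_project and toolbox_version_query are hypothetical, it covers just the four projects named in this guide (other regions require the full list of projects), and it assumes the hosted projects expose the same carto.VERSION_CORE() function that the manual installation section queries.

```shell
# Sketch: map a multi-region and an edition to a project named in this guide.
# $1: multi-region (US or EU)
# $2: "full" (access through a CARTO connection, default) or "core" (free access)
toolbox_project() {
  case "$1:${2:-full}" in
    US:full) echo "carto-un" ;;
    EU:full) echo "carto-un-eu" ;;
    US:core) echo "carto-os" ;;
    EU:core) echo "carto-os-eu" ;;
    *) echo "no project known for '$1' (${2:-full}); check the full list of projects" >&2
       return 1 ;;
  esac
}

# Build a fully qualified version query for the chosen project.
# Assumption: carto.VERSION_CORE() is available in the hosted project.
toolbox_version_query() {
  project="$(toolbox_project "$1" "${2:-full}")" || return 1
  printf 'SELECT `%s`.carto.VERSION_CORE();\n' "$project"
}

toolbox_version_query EU core
# To run the statement instead of printing it:
#   bq query --use_legacy_sql=false "$(toolbox_version_query EU core)"
```

Keeping the mapping in one function means the rest of a script never hard-codes a project name, so switching regions is a one-argument change.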

Manual Installation

The Analytics Toolbox is available in a specific project depending on the region of your BigQuery account. Please check the full list of projects for the different cloud regions in order to choose the optimal one depending on the location of your data.

The CARTO Analytics Toolbox contains two packages:

  • core: this is the public and open-source package. It contains all the core GIS functions that complement the GIS native functions available in BigQuery.
  • advanced: this is a premium package. It contains advanced GIS functions to power high-level GIS analytics in BigQuery.

We can divide the process into three steps: preparation, setup and installation. Setup is only needed the first time; installation must be repeated every time you want to install a new version of the packages.

In this guide we will use Google Cloud Shell to set up and install the toolbox. Open the GCP console and select the project where the toolbox will be installed, then use the “>_” button (top right) to “Activate Cloud Shell”.

Preparation

You will need a GCP project to install the Toolbox, as well as a storage bucket in the same project to store the JavaScript libraries it needs. To run the CARTO Analytics Toolbox, users will need permissions to read both the BigQuery dataset (where the functions and procedures will be installed) and the bucket.

We will set the project and bucket names, as well as the location where the toolbox will be created (which should match the location of the bucket), as Cloud Shell environment variables:

  • PROJECT: Project id where the toolbox dataset will be created
  • REGION: Location of the BigQuery dataset that will be created to install the Analytics Toolbox
  • BUCKET: Name of the bucket to store the JavaScript libraries needed by the Toolbox (please omit any protocol prefix like gs://)

Set these variables by executing the following in Cloud Shell (after replacing the appropriate values):

export PROJECT="<my-project>"
export REGION="<my-region>"
export BUCKET="<my-bucket>"
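
Since every later command depends on these three variables, a quick sanity check can catch a forgotten value early. The following is a minimal sketch, not part of the Toolbox: the function name check_toolbox_env is hypothetical, and it only checks the conditions stated above (all three variables set, placeholders replaced, no gs:// prefix on the bucket name).

```shell
# Sketch: verify the preparation variables before continuing.
# Prints "ok" on success; prints a reason to stderr and returns 1 otherwise.
check_toolbox_env() {
  for name in PROJECT REGION BUCKET; do
    # Read the variable whose name is stored in $name (empty if unset).
    eval "value=\${$name:-}"
    case "$value" in
      "")      echo "$name is not set" >&2; return 1 ;;
      "<"*">") echo "$name still holds a placeholder: $value" >&2; return 1 ;;
    esac
  done
  case "$BUCKET" in
    gs://*) echo "BUCKET must not include the gs:// prefix" >&2; return 1 ;;
  esac
  echo "ok"
}

# Usage, after the export lines above:
#   check_toolbox_env && echo "ready to continue"
```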

Setup

This step is only required before the first installation. Activate the Cloud Shell in the target project and make sure the environment variables from the preparation step above are set.

Before starting the process make sure the target GCP project exists and that it is the correct one by executing the following:

# Check project existence
gcloud projects describe $PROJECT

Then, create a BigQuery dataset named carto, where the Toolbox will be installed:

# Create dataset "carto"
bq mk --location=$REGION --description="CARTO dataset" -d $PROJECT:carto
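
Before running the setup commands for real, it can help to preview them. The sketch below wraps the two commands from this section in a helper that prints each command when DRY_RUN is 1 (the default here) and executes it otherwise. The names run, setup_toolbox_dataset and DRY_RUN are illustrative and not part of the Toolbox, and the my-project and US fallbacks are placeholders.

```shell
# Default to preview mode; set DRY_RUN=0 to actually execute the commands.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in preview mode, run it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# $1: project id, $2: dataset location
setup_toolbox_dataset() {
  # Stop before creating the dataset if the project check fails.
  run gcloud projects describe "$1" &&
  run bq mk --location="$2" --description="CARTO dataset" -d "$1:carto"
}

setup_toolbox_dataset "${PROJECT:-my-project}" "${REGION:-US}"
```

The preview drops shell quoting, so it is meant for reading rather than for copying back into a terminal.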

Installation

To install the Analytics Toolbox in the carto dataset we will use this installation package and follow the instructions below. Please note that this process should be repeated every time a new version of the Toolbox is available.

Access the Cloud Shell and set the environment variables as described in the preparation step above, then run the following commands:

# Download package
wget https://storage.googleapis.com/carto-analytics-toolbox-core/bigquery/carto-analytics-toolbox-core-bigquery-latest.zip
unzip carto-analytics-toolbox-core-bigquery-latest.zip

# Enter the directory
cd $(unzip -Z -1 carto-analytics-toolbox-core-bigquery-latest.zip | head -1)

# Prepare SQL code
sed -e 's!@@BUCKET@@!'"$BUCKET"'!g'  modules.sql > modules_rep.sql

# Copy libs to bucket
gsutil -m cp -r libs/ gs://$BUCKET/carto/

# Install the functions and procedures
bq --location=$REGION --project_id=$PROJECT query --use_legacy_sql=false --max_statement_results=10000 --format=prettyjson < modules_rep.sql
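
The "Prepare SQL code" step replaces every @@BUCKET@@ placeholder in modules.sql with the bucket name, using "!" as the sed delimiter so that slashes in paths need no escaping. The sketch below reproduces that substitution on a sample line and adds a check for leftover placeholders. The helper names render_bucket and count_leftovers are hypothetical, and the sample line is invented for illustration, since the actual contents of modules.sql are not shown in this guide.

```shell
# Apply the same substitution as the installation step to text read from stdin.
# $1: bucket name, without the gs:// prefix
render_bucket() {
  sed -e 's!@@BUCKET@@!'"$1"'!g'
}

# Hypothetical sample line, only to show the effect of the substitution.
sample='library=["gs://@@BUCKET@@/carto/libs/example.js"]'
printf '%s\n' "$sample" | render_bucket "my-bucket"
# prints: library=["gs://my-bucket/carto/libs/example.js"]

# Count the lines of a file that still contain the placeholder.
# grep -c exits non-zero when the count is 0, hence the "|| true".
count_leftovers() {
  grep -c '@@BUCKET@@' "$1" || true
}

# Usage, after the "Prepare SQL code" step:
#   count_leftovers modules_rep.sql    # expect 0 before running bq
```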

Check the installed version

Execute the following in the BigQuery console, in the same project where the Toolbox was installed:

SELECT carto.VERSION_CORE();

You can also check all the installed routines (functions and procedures) with:

SELECT * FROM carto.INFORMATION_SCHEMA.ROUTINES;

Congratulations! You have successfully installed the CARTO Analytics Toolbox in your BigQuery project. Now you can start using the functions.

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 960401.