Installation in a Google Cloud VPC
This guide walks you through configuring the CARTO Analytics Toolbox to work within a VPC when CARTO Self-hosted is installed on Google Cloud Platform.
Is your CARTO Self-hosted deployment in a Google Cloud VPC?
When the CARTO platform is self-hosted within a Google Cloud VPC, the functions and procedures of the Analytics Toolbox need to be accessed from within the same VPC.
This makes the installation method described in this guide the only suitable one for this kind of deployment.
Install CARTO Analytics Toolbox inside your BigQuery project
The first step is to install the Analytics Toolbox in a BigQuery project of your own.
Once the Analytics Toolbox is installed in your project, use this guide to deploy the AT Gateway in your VPC.
Deploy the infrastructure needed to allow Location Data Services usage
Some functionalities of the CARTO Analytics Toolbox for BigQuery require making external calls from BigQuery to CARTO services. These calls are implemented via BigQuery Remote Functions:
AT Gateway: Creating isolines, geocoding and routing require calls to the CARTO LDS API. Some functions of the Analytics Toolbox also require a request to the CARTO Platform backend (like importing from a URL or the 'Send by Email' component in Workflows). For this purpose, Cloud Run functions need to be deployed in your VPC.
When installing the Analytics Toolbox manually in your own project, there is some configuration required:
A BigQuery connection that allows BigQuery to call Cloud Run functions.
An AT Gateway endpoint inside your VPC.
Architecture overview
To deploy the Analytics Toolbox within a VPC, the CARTO platform needs to deploy some additional infrastructure pieces within your GCP project. In the following diagram, you can check how all these pieces interact with each other:
We'll set up the following pieces inside your project to start using the Analytics Toolbox on your CARTO Self-hosted platform:
One BQ connection used to perform requests against two different Cloud Run services.
One subnetwork used to deploy the containers created by the two Cloud Run services that are required.
One Cloud Run service needed for BigQuery to interact with the Self-hosted platform.
One VPC Serverless Access Connector that will be used by the Cloud Run services to access your VPC.
An internal DNS record pointing to the IP address of your CARTO Self-hosted platform.
Follow the steps below to set up the required infrastructure pieces:
All the following commands and instructions should be executed from the Cloud Shell in your console or from authenticated gcloud and bq CLI sessions.
1. Configure a BQ connection to enable requests to Cloud Run services
Your BigQuery project will need to make requests to the two Cloud Run services configured in this guide. To configure the BQ connection that allows this usage, you'll need to run the following command:
Create a connection from a command line:
Replace the following:
PROJECT_ID: your Google Cloud project ID
REGION: your connection region. US and EU regions are not available, so you'll have to select a more specific GCP region. You can check the list of available regions here
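As a minimal sketch, the connection can be created with bq mk --connection. The project ID and region below are placeholder values, and the connection name carto-conn is an assumption taken from the default value mentioned later in this guide. The command is wrapped in a function so that nothing runs until you call it.

```shell
# Placeholder values (assumptions): replace them with your own.
PROJECT_ID="my-project"
REGION="us-east1"
# Assumed name, matching the default connection referenced later in this guide.
CONNECTION_NAME="carto-conn"

# Creates a Cloud Resource connection, the connection type used by
# BigQuery remote functions.
create_bq_connection() {
  bq mk --connection \
    --location="$REGION" \
    --project_id="$PROJECT_ID" \
    --connection_type=CLOUD_RESOURCE \
    "$CONNECTION_NAME"
}

# Call it from Cloud Shell or an authenticated session:
# create_bq_connection
```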
Once the connection has been configured, GCP automatically creates a service account, which we'll use later to grant permissions to access the Cloud Run services. You can check that the service account has been created correctly by running the following command:
Obtain the Service Account created when configuring a BQ connection:
Replace the following:
PROJECT_ID: your Google Cloud project ID
REGION: your connection region
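One way to read the service account back is sketched below. It assumes the connection is named carto-conn and that the JSON returned by bq show exposes the account in a serviceAccountId field, as Cloud Resource connections usually do. All values are placeholders.

```shell
# Placeholder values (assumptions): replace them with your own.
PROJECT_ID="my-project"
REGION="us-east1"
CONNECTION_NAME="carto-conn"

# Prints the connection details as JSON.
show_bq_connection() {
  bq show --connection --format=json "$PROJECT_ID.$REGION.$CONNECTION_NAME"
}

# Reads JSON on stdin and prints the value of the "serviceAccountId" field
# (assumed field name; prints nothing if the field is absent).
extract_service_account() {
  sed -n 's/.*"serviceAccountId": *"\([^"]*\)".*/\1/p'
}

# Example:
# SERVICE_ACCOUNT="$(show_bq_connection | extract_service_account)"
```

If the extraction prints nothing, inspect the raw output of show_bq_connection to locate the service account manually.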
2. Deploy the AT Gateway container in Cloud Run
The BQ connection created in the previous step will have to call a Cloud Run service to use the AT. This service is the AT Gateway container. Before creating it, we need to create a subnetwork, because the service reaches your VPC through a VPC Access Connector:
Create subnet for the VPC Access Connector:
Replace the following:
VPC_NETWORK: the name of the network created in your VPC project
SUBNETWORK_IPS_RANGE: the range of IPs that this subnetwork will use. The IP range selected for the subnetwork must be a CIDR /28 block
REGION: the GCP region used when creating the BQ connection in the previous step. It has to be exactly the same region
PROJECT_ID: your Google Cloud project ID
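The subnet creation could look like the sketch below. The subnet name carto-at-subnet and every value are hypothetical placeholders. The small is_slash_28 helper only checks that the range is written with a /28 suffix before calling gcloud; it does not validate the address itself.

```shell
# Placeholder values (assumptions): replace them with your own.
PROJECT_ID="my-project"
REGION="us-east1"
VPC_NETWORK="my-vpc"
SUBNETWORK_IPS_RANGE="10.8.0.0/28"
# Hypothetical subnet name chosen for this example.
SUBNET_NAME="carto-at-subnet"

# Succeeds only when the range is written as a /28 CIDR block.
is_slash_28() {
  case "$1" in
    */28) return 0 ;;
    *) return 1 ;;
  esac
}

# Creates the subnet that the VPC Access Connector will use.
create_connector_subnet() {
  if ! is_slash_28 "$SUBNETWORK_IPS_RANGE"; then
    echo "SUBNETWORK_IPS_RANGE must be a /28 block" >&2
    return 1
  fi
  gcloud compute networks subnets create "$SUBNET_NAME" \
    --network="$VPC_NETWORK" \
    --range="$SUBNETWORK_IPS_RANGE" \
    --region="$REGION" \
    --project="$PROJECT_ID"
}

# create_connector_subnet
```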
Now that the subnet is correctly configured, you'll need to create a Serverless VPC Access connector for the Cloud Run services.
Create connector for the Cloud Run services:
Replace the following variables:
PROJECT_ID: your Google Cloud project ID
REGION: the same GCP region used when creating the BQ connection in the previous step
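A hedged sketch of the connector creation follows. The connector name carto-at-connector and the subnet name carto-at-subnet are hypothetical; use the subnet you created in the previous step.

```shell
# Placeholder values (assumptions): replace them with your own.
PROJECT_ID="my-project"
REGION="us-east1"
# Hypothetical names chosen for this example.
SUBNET_NAME="carto-at-subnet"
CONNECTOR_NAME="carto-at-connector"

# Creates a Serverless VPC Access connector on top of the /28 subnet.
create_vpc_connector() {
  gcloud compute networks vpc-access connectors create "$CONNECTOR_NAME" \
    --region="$REGION" \
    --subnet="$SUBNET_NAME" \
    --project="$PROJECT_ID"
}

# create_vpc_connector
```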
Once the connector has been correctly created, we can proceed with the Cloud Run services deployment. You'll have to execute the following commands:
Deploy AT Gateway service
The NODE_TLS_REJECT_UNAUTHORIZED environment variable is used to disable the verification of custom TLS certificates in the Self-hosted deployment
Replace the following variables:
PROJECT_ID: your Google Cloud project ID
REGION: the same GCP region used when creating the BQ connection in the previous step
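The deployment might be sketched as follows. This is an illustration under several assumptions, not the official command: the service name at-gateway, the connector name, the AT_GATEWAY_IMAGE placeholder (the actual image reference is provided by CARTO) and the choice to require authenticated invocations are all hypothetical, and the real deployment may need additional flags or environment variables. The second function grants the BQ connection's service account permission to invoke the service.

```shell
# Placeholder values (assumptions): replace them with your own.
PROJECT_ID="my-project"
REGION="us-east1"
SERVICE_NAME="at-gateway"                    # hypothetical service name
CONNECTOR_NAME="carto-at-connector"          # hypothetical connector name
AT_GATEWAY_IMAGE="AT_GATEWAY_IMAGE"          # image reference provided by CARTO
SERVICE_ACCOUNT="BQ_CONNECTION_SERVICE_ACCOUNT"  # obtained in step 1

# Deploys the AT Gateway container behind the VPC connector.
# NODE_TLS_REJECT_UNAUTHORIZED=0 turns off TLS certificate verification,
# which this guide uses for custom certificates in the Self-hosted deployment.
deploy_at_gateway() {
  gcloud run deploy "$SERVICE_NAME" \
    --image="$AT_GATEWAY_IMAGE" \
    --region="$REGION" \
    --project="$PROJECT_ID" \
    --vpc-connector="$CONNECTOR_NAME" \
    --set-env-vars="NODE_TLS_REJECT_UNAUTHORIZED=0" \
    --no-allow-unauthenticated
}

# Allows the BQ connection's service account to invoke the service.
grant_invoker_role() {
  gcloud run services add-iam-policy-binding "$SERVICE_NAME" \
    --member="serviceAccount:$SERVICE_ACCOUNT" \
    --role="roles/run.invoker" \
    --region="$REGION" \
    --project="$PROJECT_ID"
}

# deploy_at_gateway && grant_invoker_role
```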
3. Create DNS entry for CARTO Self-hosted platform
The AT Gateway service needs to access the CARTO Self-hosted LDS API to perform requests to the different LDS providers. Because these requests are handled inside the VPC, you must add an internal DNS record so that the Cloud Run service can reach the CARTO platform APIs.
Firstly, we have to obtain the internal IP address of the CARTO Self-hosted platform. Once the internal IP has been obtained, you can create a DNS zone inside GCP using the following command:
If you already have an internal DNS configured in your GCP project you can skip this step and directly add a new domain pointing to the CARTO platform internal IP address.
Replace the following variables:
PROJECT_ID: your Google Cloud project ID
DNS_ZONE_NAME: the name that your new DNS zone will use
VPC_NETWORK: name of the VPC network created in your GCP project
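A private zone can be created along these lines. The zone name, the DNS suffix carto.internal. and the description text are hypothetical values chosen for illustration; pick a suffix that contains the internal domain you plan to use for the platform.

```shell
# Placeholder values (assumptions): replace them with your own.
PROJECT_ID="my-project"
VPC_NETWORK="my-vpc"
DNS_ZONE_NAME="carto-internal"   # hypothetical zone name
DNS_SUFFIX="carto.internal."     # hypothetical DNS suffix, note the trailing dot

# Creates a private Cloud DNS zone visible only from the given VPC network.
create_private_dns_zone() {
  gcloud dns managed-zones create "$DNS_ZONE_NAME" \
    --description="Internal zone for CARTO Self-hosted" \
    --dns-name="$DNS_SUFFIX" \
    --visibility=private \
    --networks="$VPC_NETWORK" \
    --project="$PROJECT_ID"
}

# create_private_dns_zone
```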
Then we'll have to create a new record inside the new DNS zone, configuring a domain that points to the CARTO Self-hosted platform's internal IP address:
Start a transaction to add a record in your DNS zone
Add the new domain to your DNS zone
Execute the transaction to write the new changes in your DNS zone
Replace the following:
PROJECT_ID: your Google Cloud project ID
DNS_ZONE: the name of your DNS zone
CARTO_PLATFORM_IP: internal IP address of your CARTO Self-hosted deployment
INTERNAL_DOMAIN: the internal domain that will point to your CARTO Self-hosted deployment inside your VPC
Replace the CARTO_PLATFORM_IP variable in the previous command with the IP address used by your CARTO Self-hosted installation.
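The three transaction steps above can be sketched as a single function. The IP address, domain, zone name and the TTL of 300 seconds are placeholder assumptions; the record is created as an A record pointing at the platform's internal IP.

```shell
# Placeholder values (assumptions): replace them with your own.
PROJECT_ID="my-project"
DNS_ZONE="carto-internal"
CARTO_PLATFORM_IP="10.0.0.10"
INTERNAL_DOMAIN="carto.example.internal"

# Starts a transaction, adds an A record for the internal domain and
# executes the transaction. Stops at the first step that fails.
add_internal_record() {
  gcloud dns record-sets transaction start \
    --zone="$DNS_ZONE" --project="$PROJECT_ID" &&
  gcloud dns record-sets transaction add "$CARTO_PLATFORM_IP" \
    --name="$INTERNAL_DOMAIN." --ttl=300 --type=A \
    --zone="$DNS_ZONE" --project="$PROJECT_ID" &&
  gcloud dns record-sets transaction execute \
    --zone="$DNS_ZONE" --project="$PROJECT_ID"
}

# add_internal_record
```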
4. Check firewall rules to ensure that Cloud Run can reach the Self-hosted instance
The Cloud Run services need access to the CARTO Self-hosted environment, so you'll have to check that the firewall rules configured in your project allow traffic between these two pieces.
The CARTO Self-hosted platform has to be accessible on port 443, and it must be allowed to respond to requests performed by the Cloud Run services deployed in the previous steps.
All requests will be handled inside the VPC, so all network traffic involved in this process will take place between the subnetworks created and the CARTO Self-hosted instance.
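To review the current rules, something like the following sketch may help. It only lists the firewall rules attached to the VPC network (placeholder values below), so you can verify that ingress on TCP port 443 from the connector subnet range is allowed; it does not change anything.

```shell
# Placeholder values (assumptions): replace them with your own.
PROJECT_ID="my-project"
VPC_NETWORK="my-vpc"

# Lists the firewall rules whose network matches the given VPC network.
list_vpc_firewall_rules() {
  gcloud compute firewall-rules list \
    --filter="network:$VPC_NETWORK" \
    --project="$PROJECT_ID"
}

# list_vpc_firewall_rules
```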
Configure the AT Gateway in your CARTO Analytics Toolbox installation
Now that we've both installed the Analytics Toolbox and deployed the required infrastructure pieces in GCP, we have to configure the Analytics Toolbox so that it's able to use the AT Gateway.
The Analytics Toolbox provides a procedure to update the configuration values required by the remote functions. You can set these values by executing the following query in your BigQuery project:
Replace the following:
CONNECTION: name of the connection created in the previous step. The default value is {PROJECT_ID}.{REGION}.carto-conn
ENDPOINT: endpoint of the AT Gateway function deployed in Cloud Run
API_BASE_URL: the API base URL of your CARTO Self-hosted platform.
API_ACCESS_TOKEN: access token generated inside CARTO platform with permissions to use the LDS API
The expected ENDPOINT value can be obtained by executing the following command:
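As a sketch, and assuming the endpoint is simply the URL of the Cloud Run service, it can be read with gcloud run services describe. The service name at-gateway is a hypothetical placeholder; use the name you chose when deploying the AT Gateway.

```shell
# Placeholder values (assumptions): replace them with your own.
PROJECT_ID="my-project"
REGION="us-east1"
SERVICE_NAME="at-gateway"   # hypothetical service name

# Prints the HTTPS URL assigned to the Cloud Run service.
get_gateway_endpoint() {
  gcloud run services describe "$SERVICE_NAME" \
    --region="$REGION" \
    --project="$PROJECT_ID" \
    --format="value(status.url)"
}

# ENDPOINT="$(get_gateway_endpoint)"
```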
After running the previous query, the CARTO Analytics Toolbox should be ready to work in your BigQuery project. To check that the installation process has worked as expected, you can execute the following queries in the BigQuery console. They will create a table called geocode_test_table containing a geocoded address.
Finally, remember to set up your connections to BigQuery with the correct Analytics Toolbox location setting, so that all queries generated by CARTO applications use it.