Installation in a Google Cloud VPC
This guide will walk you through the process of configuring the CARTO Analytics Toolbox to work within a VPC, for a CARTO Self-hosted installation on Google Cloud Platform.
The first step is to install the CARTO Analytics Toolbox manually in your own project.
Once the Analytics Toolbox is installed in your project, use this guide to deploy the AT Gateway in your VPC.
Some functionalities of the CARTO Analytics Toolbox for BigQuery require making external calls from BigQuery to CARTO services. These calls are implemented via BigQuery remote functions:
AT Gateway: Creation of isolines, geocoding and routing require making calls to the CARTO LDS API. Some functions of the Analytics Toolbox also require making a request to the CARTO platform backend (like importing from a URL or the 'Send by Email' component in Workflows). For this purpose, Cloud Run services need to be deployed in your VPC.
When installing the Analytics Toolbox manually in your own project, some configuration is required:
Create a BigQuery connection that will allow BigQuery to call the Cloud Run services.
Deploy an AT Gateway endpoint inside your VPC.
To deploy the Analytics Toolbox within a VPC, the CARTO platform needs to deploy some additional infrastructure pieces within your GCP project. In the following diagram, you can check how all these pieces interact with each other:
We'll set up the following pieces inside your project to start using the Analytics Toolbox on your CARTO Self-hosted platform:
One BQ connection used to perform requests against two different Cloud Run services.
One subnetwork used to deploy the containers created by the two Cloud Run services that are required.
One Cloud Run service needed for BigQuery to interact with the Self-hosted platform.
One VPC Serverless Access Connector that will be used by the Cloud Run services to access your VPC.
An internal DNS record pointing to the IP address of your CARTO Self-hosted platform.
Follow these steps to set up the required infrastructure pieces:
Your BigQuery project will need to make requests to the two Cloud Run services configured in this guide. To configure the BQ connection that allows this usage, you'll need to run the following command:
Create a connection from a command line:
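A minimal sketch of the command, assuming the connection is named carto-conn (the default name referenced later in this guide); adjust the name to your own conventions:

```bash
# Create a Cloud Resource connection that BigQuery will use to call Cloud Run.
bq mk --connection \
  --project_id=PROJECT_ID \
  --location=REGION \
  --connection_type=CLOUD_RESOURCE \
  carto-conn
```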
Replace the following:
PROJECT_ID
: your Google Cloud project ID
REGION
: your connection region. The US and EU multi-regions are not available, so you'll have to select a more specific GCP region. You can check the list of available regions in the Google Cloud documentation.
Once the connection has been configured, GCP will automatically create a service account that we'll use to grant permissions to access the Cloud Run services. You can check that the service account has been created correctly by running the following command:
Obtain the Service Account created when configuring a BQ connection:
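A sketch of the command, assuming the connection was named carto-conn as above; the serviceAccountId field in the output is the service account that will be granted access to the Cloud Run services:

```bash
# Show the connection details, including the auto-generated service account.
bq show --connection PROJECT_ID.REGION.carto-conn
```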
Replace the following:
PROJECT_ID
: your Google Cloud project ID
REGION
: your connection region
The BQ connection created in the previous step will need to invoke a Cloud Run service to use the Analytics Toolbox. This service is the AT Gateway container, and prior to creating it we'll need to create a subnetwork, since the service will reach your VPC through a Serverless VPC Access connector.
Create a subnet for the VPC Access Connector:
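A sketch under those assumptions; at-gateway-subnet is an example name, and note that Serverless VPC Access connectors require a dedicated /28 range:

```bash
# Create a dedicated subnet for the Serverless VPC Access connector.
gcloud compute networks subnets create at-gateway-subnet \
  --project=PROJECT_ID \
  --network=VPC_NETWORK \
  --region=REGION \
  --range=SUBNETWORK_IPS_RANGE
```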
Replace the following:
VPC_NETWORK
: the name of the network created in your VPC project
SUBNETWORK_IPS_RANGE
: the range of IPs that this subnetwork will use
REGION
: the same GCP region used when creating the BQ connection in the previous step; the regions must match exactly
PROJECT_ID
: your Google Cloud project ID
Now that the subnet is correctly configured, you'll need to create a Serverless VPC Access connector for the Cloud Run services.
Create connector for the Cloud Run services:
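A sketch reusing the example subnet name from the previous step; carto-connector is an example connector name:

```bash
# Create the Serverless VPC Access connector on the subnet created above.
gcloud compute networks vpc-access connectors create carto-connector \
  --project=PROJECT_ID \
  --region=REGION \
  --subnet=at-gateway-subnet \
  --subnet-project=PROJECT_ID
```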
Replace the following variables:
PROJECT_ID
: your Google Cloud project ID
REGION
: the same GCP region used when creating the BQ connection in the previous step
Once the connector has been correctly created, we can proceed with the Cloud Run services deployment. You'll have to execute the following commands:
Deploy the AT Gateway service:
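A sketch of the deployment, reusing the example connector name from the previous step. The container image reference is a placeholder for the AT Gateway image provided by CARTO, and the second command grants the BigQuery connection's service account (obtained earlier with bq show --connection) permission to invoke the service:

```bash
# Deploy the AT Gateway as a private Cloud Run service attached to the VPC connector.
# AT_GATEWAY_IMAGE is a placeholder for the image provided by CARTO.
gcloud run deploy at-gateway \
  --project=PROJECT_ID \
  --region=REGION \
  --image=AT_GATEWAY_IMAGE \
  --vpc-connector=carto-connector \
  --no-allow-unauthenticated

# Allow the BQ connection's service account to invoke the service.
gcloud run services add-iam-policy-binding at-gateway \
  --project=PROJECT_ID \
  --region=REGION \
  --member="serviceAccount:CONNECTION_SERVICE_ACCOUNT" \
  --role="roles/run.invoker"
```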
Replace the following variables:
PROJECT_ID
: your Google Cloud project ID
REGION
: the same GCP region used when creating the BQ connection in the previous step
The AT Gateway service will need to access the CARTO Self-hosted LDS API to perform requests to the different LDS providers. As these requests are handled inside the VPC, it's mandatory to add an internal DNS record so that the Cloud Run service can reach the CARTO platform APIs.
Firstly, we have to obtain the internal IP address of the CARTO Self-hosted platform. Once the internal IP has been obtained, you can create a DNS zone inside GCP using the following command:
If you already have an internal DNS configured in your GCP project you can skip this step and directly add a new domain pointing to the CARTO platform internal IP address.
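A sketch of the zone creation, assuming the zone is created directly for the internal domain configured in the next step:

```bash
# Create a private DNS zone visible only to your VPC network.
gcloud dns managed-zones create DNS_ZONE_NAME \
  --project=PROJECT_ID \
  --dns-name="INTERNAL_DOMAIN." \
  --visibility=private \
  --networks=VPC_NETWORK \
  --description="Private zone for the CARTO Self-hosted platform"
```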
Replace the following variables:
PROJECT_ID
: your Google Cloud project ID
DNS_ZONE_NAME
: the name that your new DNS zone will use
VPC_NETWORK
: name of the VPC network created in your GCP project
Then we'll have to create a new record inside the new DNS zone, configuring a domain that points to the CARTO Self-hosted platform's internal IP address (see the combined sketch after these steps):
Start a transaction to add a record in your DNS zone
Add the new domain to your DNS zone
Execute the transaction to write the new changes in your DNS zone
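A combined sketch of the three steps above:

```bash
# 1. Start a transaction in the DNS zone.
gcloud dns record-sets transaction start \
  --project=PROJECT_ID \
  --zone=DNS_ZONE

# 2. Add an A record pointing the internal domain to the platform's internal IP.
gcloud dns record-sets transaction add CARTO_PLATFORM_IP \
  --project=PROJECT_ID \
  --zone=DNS_ZONE \
  --name="INTERNAL_DOMAIN." \
  --ttl=300 \
  --type=A

# 3. Execute the transaction to write the change.
gcloud dns record-sets transaction execute \
  --project=PROJECT_ID \
  --zone=DNS_ZONE
```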
Replace the following:
PROJECT_ID
: your Google Cloud project ID
DNS_ZONE
: the name of your DNS zone
CARTO_PLATFORM_IP
: internal IP address of your CARTO Self-hosted deployment
INTERNAL_DOMAIN
: the internal domain that will be pointing to your CARTO Self-hosted deployment inside your VPC
Remember to replace the CARTO_PLATFORM_IP variable in the previous commands with the internal IP address used by your CARTO Self-hosted installation.
The Cloud Run services need access to the CARTO Self-hosted environment, so you'll have to check that the firewall rules configured in your project allow traffic between these two pieces.
The CARTO Self-hosted platform has to be accessible on port 443, and it must be allowed to respond to requests performed by the Cloud Run services deployed in the previous steps.
All requests will be handled inside the VPC, so all network traffic involved in this process will take place between the subnetworks created and the CARTO Self-hosted instance.
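How you allow that traffic depends on how your firewall is organized; as a minimal sketch, assuming a rule scoped to the connector subnet range and a target tag on the CARTO Self-hosted instances (both placeholders):

```bash
# Allow HTTPS traffic from the connector subnet to the CARTO Self-hosted platform.
# TARGET_TAG is a placeholder for whatever identifies your Self-hosted instances.
gcloud compute firewall-rules create allow-carto-at-gateway \
  --project=PROJECT_ID \
  --network=VPC_NETWORK \
  --direction=INGRESS \
  --action=ALLOW \
  --rules=tcp:443 \
  --source-ranges=SUBNETWORK_IPS_RANGE \
  --target-tags=TARGET_TAG
```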
Now that we've both installed the Analytics Toolbox and deployed the required infrastructure pieces in GCP, we have to configure the Analytics Toolbox so that it's able to use the AT Gateway.
The Analytics Toolbox provides a procedure to update the configuration values required to start using the remote functions. It can be configured by executing the following query in your BigQuery project:
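As a sketch only: the procedure name below is a placeholder, so use the configuration procedure shipped with your Analytics Toolbox installation (and the dataset where you installed it, assumed here to be carto). It simply illustrates passing the values described below:

```sql
-- Placeholder procedure name; replace it with the one provided with your
-- Analytics Toolbox installation, assumed here to live in a dataset named `carto`.
CALL `PROJECT_ID.carto`.SETUP_AT_GATEWAY(
  'CONNECTION',        -- e.g. 'PROJECT_ID.REGION.carto-conn'
  'ENDPOINT',          -- URL of the AT Gateway Cloud Run service
  'API_ACCESS_TOKEN',  -- CARTO access token with permissions to use the LDS API
  'API_BASE_URL'       -- base URL of your CARTO Self-hosted platform
);
```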
Replace the following:
CONNECTION
: name of the connection created in the previous step. The default value is {PROJECT_ID}.{REGION}.carto-conn
ENDPOINT
: endpoint of the AT Gateway function deployed in Cloud Run
API_ACCESS_TOKEN
: access token generated inside the CARTO platform with permissions to use the LDS API
API_BASE_URL
: the base URL of your CARTO Self-hosted platform
The ENDPOINT
expected value can be obtained executing the following command:
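Assuming the service was deployed as at-gateway in the previous steps, a sketch:

```bash
# Print the HTTPS URL of the deployed AT Gateway service.
gcloud run services describe at-gateway \
  --project=PROJECT_ID \
  --region=REGION \
  --format='value(status.url)'
```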
After running the previous query, the CARTO Analytics Toolbox should be ready to work in your BigQuery project. In order to check that the installation process has worked as expected, you can execute the following queries in the BigQuery console. They will create a table called geocode_test_table containing a geocoded address.
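A hypothetical test, assuming the Toolbox was installed in a dataset named carto and that your GEOCODE_TABLE signature takes the input table and the address column; check the LDS module reference of your installed version:

```sql
-- Create a small table with a single address and geocode it in place.
-- Dataset names and the GEOCODE_TABLE signature are assumptions.
CREATE OR REPLACE TABLE `PROJECT_ID.DATASET.geocode_test_table` AS
SELECT 'Madrid, Spain' AS address;

CALL `PROJECT_ID.carto`.GEOCODE_TABLE(
  'PROJECT_ID.DATASET.geocode_test_table',  -- table to geocode
  'address'                                 -- column containing the address
);
```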
Congratulations! If the previous query execution finishes and you can obtain the geocoded address by querying the table that has been created, your CARTO Analytics Toolbox is successfully installed and configured inside your VPC.
Now, remember to set up your connections to BigQuery with the correct setting so that all queries generated by CARTO applications use your Analytics Toolbox installation.