Installation in an Azure VNet
This guide walks you through the process of configuring the CARTO Analytics Toolbox AT Gateway for Databricks to work within an Azure Virtual Network with a CARTO Self-Hosted installation on Azure.
Is your CARTO Self-Hosted deployment in an Azure VNet?
When the CARTO platform is self-hosted within an Azure VNet with an internal Load Balancer, the AT Gateway needs to be deployed inside a peered VNet so that it can reach the CARTO platform's internal endpoint. This guide covers that deployment using Azure Container Apps.
Prerequisites
Before following this guide, ensure you have:
CARTO Self-Hosted deployed on Azure, accessible through a private network endpoint (e.g., an internal Load Balancer with no public IP). This guide is designed for deployments where the CARTO platform is not publicly accessible
The Analytics Toolbox functions installed in your Databricks workspace (see the SQL Warehouse installation guide)
Azure CLI (az) authenticated with sufficient permissions to create VNets, subnets, Container Apps, and Private DNS zones
A Databricks workspace with Unity Catalog enabled and a SQL Warehouse provisioned
The internal IP address of your CARTO Self-Hosted platform's Load Balancer
Architecture overview
The AT Gateway acts as a bridge between Databricks and the CARTO Self-Hosted platform. Databricks SQL Warehouses use the http_request() function to call the AT Gateway over public HTTPS. The AT Gateway then forwards requests to the CARTO platform over the internal network via VNet peering.
The deployment consists of the following infrastructure pieces:
An Azure VNet with a dedicated subnet delegated to Microsoft.App/environments, used to deploy the AT Gateway container
An Azure Container Apps environment (Workload Profiles mode) with VNet integration and external ingress
Bidirectional VNet peering between the ACA VNet and the CARTO VNet (the VNet where CARTO Self-Hosted is deployed), enabling private network connectivity
The AT Gateway container deployed as a Container App, exposing a public HTTPS endpoint for Databricks and routing traffic internally to CARTO
An Azure Private DNS zone linked to the ACA VNet, resolving the CARTO domain to the internal Load Balancer IP
An HTTP Connection in Databricks pointing to the AT Gateway's public URL
Databricks http_request() routes through Databricks-managed infrastructure, not through the customer VNet. This is why the AT Gateway requires a public ingress endpoint — Databricks cannot reach endpoints that are only accessible within your VNet.
All following commands and instructions should be executed from an authenticated az CLI session.
Step 1: Prepare VNet Infrastructure
Create a dedicated VNet and subnet for the Azure Container Apps environment. This VNet will be peered with the VNet where CARTO Self-Hosted is deployed to allow the AT Gateway to reach the CARTO internal Load Balancer.
If using AKS: Do not create resources in the AKS node resource group (MC_*). Microsoft explicitly states that modifying resources in the managed node resource group is unsupported and may cause cluster failures. Always create networking resources in your own resource group.
Same-VNet alternative: If your CARTO Self-Hosted deployment uses a BYO (Bring Your Own) VNet — which is the Microsoft-recommended approach for AKS — you can add the ACA subnet directly to the CARTO VNet instead of creating a separate VNet. This eliminates the need for VNet peering (Step 2) and simplifies the architecture. Simply create the delegated subnet (Step 1.2) in the existing CARTO VNet and skip Step 2 entirely.
1.1 Create a VNet for Azure Container Apps
Replace the following:
RESOURCE_GROUP: the resource group where you want to create the ACA networking resources (use your own resource group, not the MC_* node resource group)
ACA_VNET_NAME: the name for the new VNet (e.g., aca-at-gateway-vnet)
ACA_VNET_CIDR: the address space for the VNet (e.g., 10.1.0.0/16)
REGION: the Azure region, which should match your CARTO Self-Hosted deployment region
The ACA VNet CIDR must not overlap with the CARTO VNet address space or any of the following Azure-reserved ranges: 169.254.0.0/16, 172.30.0.0/16, 172.31.0.0/16, 192.0.2.0/24.
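The command for this step was lost in this page's rendering; a minimal sketch, using the placeholders described above:

```bash
# Create the VNet that will host the Azure Container Apps environment.
az network vnet create \
  --resource-group {RESOURCE_GROUP} \
  --name {ACA_VNET_NAME} \
  --address-prefixes {ACA_VNET_CIDR} \
  --location {REGION}
```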
1.2 Create a subnet for the Container Apps environment
Replace the following:
ACA_SUBNET_NAME: the name for the subnet (e.g., aca-subnet)
ACA_SUBNET_CIDR: the subnet CIDR block (e.g., 10.1.0.0/27)
The subnet must use a minimum CIDR block of /27 (32 addresses) when using Workload Profiles. The subnet must be delegated to Microsoft.App/environments — this is required for Workload Profiles mode.
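A sketch of the subnet creation, including the required delegation:

```bash
# Create the subnet and delegate it to Microsoft.App/environments
# (required for Workload Profiles mode).
az network vnet subnet create \
  --resource-group {RESOURCE_GROUP} \
  --vnet-name {ACA_VNET_NAME} \
  --name {ACA_SUBNET_NAME} \
  --address-prefixes {ACA_SUBNET_CIDR} \
  --delegations Microsoft.App/environments
```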
Save the Subnet Resource ID from the output for use in later steps. You can retrieve it with:
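```bash
# Print the subnet resource ID (used as ACA_SUBNET_ID in Step 3).
az network vnet subnet show \
  --resource-group {RESOURCE_GROUP} \
  --vnet-name {ACA_VNET_NAME} \
  --name {ACA_SUBNET_NAME} \
  --query id --output tsv
```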
Step 2: Set Up VNet Peering
Create bidirectional VNet peering between the ACA VNet and the CARTO VNet so that the AT Gateway can reach the CARTO internal Load Balancer.
Enterprise hub-spoke topologies: For hub-spoke network topologies where CARTO Self-Hosted and ACA are in separate spoke VNets, the peered VNet approach documented here is the correct pattern. If VNet peering system routes do not work as expected (e.g., due to custom route tables on the hub), add a User-Defined Route (UDR) on the ACA subnet with the CARTO VNet address space pointing to the appropriate next hop.
2.1 Retrieve the CARTO VNet information
You need the resource ID of the VNet where your CARTO Self-Hosted platform is deployed.
If using AKS with a BYO (Bring Your Own) VNet:
Replace the following:
CARTO_RESOURCE_GROUP: the resource group containing your CARTO Self-Hosted deployment
CARTO_CLUSTER_NAME: the name of your AKS cluster
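One way to find the VNet, sketched here via the node subnet ID (the VNet resource ID is the portion of the output before the /subnets/... suffix):

```bash
# Print the subnet ID of the cluster's first node pool; strip the
# trailing "/subnets/{name}" segment to obtain the VNet resource ID.
az aks show \
  --resource-group {CARTO_RESOURCE_GROUP} \
  --name {CARTO_CLUSTER_NAME} \
  --query "agentPoolProfiles[0].vnetSubnetId" --output tsv
```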
If using AKS without a BYO VNet, the VNet is in the managed node resource group:
CARTO_NODE_RESOURCE_GROUP: the managed node resource group (typically MC_{RESOURCE_GROUP}_{CLUSTER_NAME}_{REGION})
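A sketch of listing the VNets in the managed node resource group:

```bash
# List the resource IDs of all VNets in the node resource group;
# there is typically a single VNet for the cluster.
az network vnet list \
  --resource-group {CARTO_NODE_RESOURCE_GROUP} \
  --query "[].id" --output tsv
```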
For embedded cluster deployments (k0s, k3s, or similar on a VM): identify the VNet where the VM is deployed using az network vnet list --resource-group {CARTO_RESOURCE_GROUP} or through the Azure Portal. Use the VNet resource ID in the following steps.
2.2 Create bidirectional VNet peering
Replace the following:
CARTO_VNET_ID: the resource ID of the CARTO VNet (obtained in Step 2.1)
CARTO_VNET_RESOURCE_GROUP: the resource group containing the CARTO VNet
CARTO_VNET_NAME: the name of the CARTO VNet
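A sketch of both peering directions; the peering names (aca-to-carto, carto-to-aca) are illustrative:

```bash
# Peering from the ACA VNet to the CARTO VNet
az network vnet peering create \
  --resource-group {RESOURCE_GROUP} \
  --name aca-to-carto \
  --vnet-name {ACA_VNET_NAME} \
  --remote-vnet {CARTO_VNET_ID} \
  --allow-vnet-access

# Resolve the ACA VNet resource ID for the reverse direction
ACA_VNET_ID=$(az network vnet show \
  --resource-group {RESOURCE_GROUP} \
  --name {ACA_VNET_NAME} \
  --query id --output tsv)

# Peering from the CARTO VNet back to the ACA VNet
az network vnet peering create \
  --resource-group {CARTO_VNET_RESOURCE_GROUP} \
  --name carto-to-aca \
  --vnet-name {CARTO_VNET_NAME} \
  --remote-vnet "$ACA_VNET_ID" \
  --allow-vnet-access
```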
2.3 Verify peering status
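A sketch of the status check, assuming the peering from the ACA side was named aca-to-carto (repeat from the CARTO side for the reverse direction):

```bash
az network vnet peering show \
  --resource-group {RESOURCE_GROUP} \
  --vnet-name {ACA_VNET_NAME} \
  --name aca-to-carto \
  --query peeringState --output tsv
```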
The peering state should be Connected for both directions.
Step 3: Deploy Azure Container Apps Environment
Create an Azure Container Apps environment with Workload Profiles mode and VNet integration. This environment will host the AT Gateway container.
You must use a Workload Profiles environment. Consumption-only environments do not support VNet peering egress, UDR, or custom outbound routing. If you use a Consumption-only environment, the AT Gateway will not be able to reach the CARTO platform through VNet peering.
Replace the following:
ACA_ENV_NAME: the name for the Container Apps environment (e.g., at-gateway-env)
ACA_SUBNET_ID: the subnet resource ID obtained in Step 1.2
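A sketch of the environment creation with the flags this guide relies on:

```bash
az containerapp env create \
  --resource-group {RESOURCE_GROUP} \
  --name {ACA_ENV_NAME} \
  --location {REGION} \
  --enable-workload-profiles true \
  --infrastructure-subnet-resource-id {ACA_SUBNET_ID} \
  --internal-only false
```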
Setting --internal-only false enables external ingress on the environment. This is required because Databricks http_request() needs to reach the AT Gateway over the public internet.
Setting --enable-workload-profiles true enables Workload Profiles mode, which is required for VNet peering egress to work correctly.
Step 4: Deploy the AT Gateway
Deploy the AT Gateway container on the Azure Container Apps environment. This container proxies requests between Databricks and the CARTO Self-Hosted platform.
Replace the following:
AT_GATEWAY_NAME: the name for the Container App (e.g., carto-at-gateway)
REGION: the Azure region matching your deployment
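A sketch of the deployment; {AT_GATEWAY_IMAGE} stands for the AT Gateway container image provided by CARTO, and the target port (8080) is an assumption to adjust to the image's actual listening port:

```bash
az containerapp create \
  --resource-group {RESOURCE_GROUP} \
  --name {AT_GATEWAY_NAME} \
  --environment {ACA_ENV_NAME} \
  --image {AT_GATEWAY_IMAGE} \
  --ingress external \
  --target-port 8080 \
  --min-replicas 1 \
  --max-replicas 3 \
  --env-vars NODE_TLS_REJECT_UNAUTHORIZED=0
```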
The NODE_TLS_REJECT_UNAUTHORIZED=0 environment variable disables TLS certificate verification, which is needed when the Self-Hosted deployment uses a custom or self-signed certificate.
Scaling: The AT Gateway is configured with a minimum of 1 and maximum of 3 replicas by default. Adjust the --min-replicas and --max-replicas values based on your expected query volume. For production workloads with high concurrency, consider increasing the minimum replica count to ensure consistent response times.
Once the deployment completes, retrieve the AT Gateway's public URL:
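```bash
# Print the public FQDN of the AT Gateway's external ingress.
az containerapp show \
  --resource-group {RESOURCE_GROUP} \
  --name {AT_GATEWAY_NAME} \
  --query properties.configuration.ingress.fqdn --output tsv
```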
Save this URL for use in the Databricks configuration step.
Step 5: Create Private DNS Zone for CARTO
The AT Gateway needs to resolve the CARTO Self-Hosted platform's domain to the internal Load Balancer IP address. Create an Azure Private DNS zone linked to the ACA VNet to enable this resolution.
If you are configuring the AT Gateway to connect to CARTO using a direct IP address rather than a domain name, you can skip this step. However, using a Private DNS zone is recommended for production deployments.
5.1 Create the Private DNS zone
Replace the following:
CARTO_DOMAIN: the domain used by your CARTO Self-Hosted platform (e.g., carto.internal.example.com)
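A sketch of the zone creation:

```bash
az network private-dns zone create \
  --resource-group {RESOURCE_GROUP} \
  --name {CARTO_DOMAIN}
```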
5.2 Link the DNS zone to the ACA VNet
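A sketch of the VNet link; the link name (aca-vnet-link) is illustrative:

```bash
az network private-dns link vnet create \
  --resource-group {RESOURCE_GROUP} \
  --zone-name {CARTO_DOMAIN} \
  --name aca-vnet-link \
  --virtual-network {ACA_VNET_NAME} \
  --registration-enabled false
```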
5.3 Create an A record pointing to the CARTO internal IP
Replace the following:
CARTO_INTERNAL_IP: the internal IP address of your CARTO Self-Hosted platform's Load Balancer. You can obtain this with:
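For AKS deployments, the IP can be read from the ingress Service of type LoadBalancer (the namespace placeholder is an assumption; the service name varies per installation). The A record is then created at the zone apex, since the zone name is the CARTO domain itself:

```bash
# Print the internal Load Balancer IP of the CARTO ingress service.
kubectl get svc --namespace {CARTO_NAMESPACE} \
  -o jsonpath='{.items[?(@.spec.type=="LoadBalancer")].status.loadBalancer.ingress[0].ip}'

# Create the A record in the Private DNS zone.
az network private-dns record-set a add-record \
  --resource-group {RESOURCE_GROUP} \
  --zone-name {CARTO_DOMAIN} \
  --record-set-name "@" \
  --ipv4-address {CARTO_INTERNAL_IP}
```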
Step 6: Verify Network Connectivity
Before configuring Databricks, verify that the AT Gateway container can reach the CARTO Self-Hosted platform through the VNet peering.
6.1 Test connectivity from the ACA environment
Deploy a temporary test container in the ACA environment to verify connectivity to the CARTO internal endpoint:
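A sketch of one way to do this; the image and app name are illustrative — any image that ships a shell and curl works, kept alive with a sleep command so you can exec into it:

```bash
az containerapp create \
  --resource-group {RESOURCE_GROUP} \
  --name connectivity-test \
  --environment {ACA_ENV_NAME} \
  --image mcr.microsoft.com/azure-cli \
  --command "/bin/sleep" "infinity"
```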
Once running, execute a connectivity test:
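A sketch, assuming the test container was named connectivity-test:

```bash
# Open an interactive shell inside the test container.
az containerapp exec \
  --resource-group {RESOURCE_GROUP} \
  --name connectivity-test \
  --command "/bin/sh"

# Then, inside the container shell (add -k if CARTO uses a
# custom or self-signed certificate):
curl -v https://{CARTO_DOMAIN}
```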
A successful TLS handshake with the CARTO Self-Hosted platform confirms that the VNet peering is working correctly. Look for an HTTP 200 response or a valid TLS certificate (CN=*.yourdomain.com) in the output.
6.2 Verify firewall and NSG rules
Ensure that the Network Security Groups (NSGs) associated with both the ACA subnet and the CARTO subnet allow:
Outbound traffic from the ACA subnet to the CARTO subnet on port 443
The CARTO Self-Hosted platform to respond to requests from the ACA subnet
Requests are handled entirely within the peered VNets, so all network traffic flows between the ACA subnet and the CARTO Self-Hosted instance.
6.3 Clean up the test container
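A sketch, assuming the test container was named connectivity-test:

```bash
az containerapp delete \
  --resource-group {RESOURCE_GROUP} \
  --name connectivity-test \
  --yes
```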
Step 7: Configure the AT Gateway in Databricks
Now that the AT Gateway is deployed and can reach the CARTO platform, configure Databricks to use it.
7.1 Create an HTTP Connection
Connect to your Databricks SQL Warehouse and run the following SQL:
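A sketch of the connection statement; the connection name (at_gateway) is illustrative, and the exact OPTIONS accepted depend on your Databricks Runtime version:

```sql
CREATE CONNECTION at_gateway TYPE HTTP
OPTIONS (
  host '{AT_GATEWAY_URL}',
  port '443',
  base_path '/'
);
```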
Replace the following:
AT_GATEWAY_URL: the public URL of the AT Gateway obtained in Step 4 (e.g., https://carto-at-gateway.niceocean-abcd1234.westeurope.azurecontainerapps.io)
7.2 Grant permissions on the connection
Replace the following:
PRINCIPAL: the Databricks principal (user, group, or service principal) that needs access to the AT Gateway
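A sketch of the grant, assuming the connection was named at_gateway:

```sql
GRANT USE CONNECTION ON CONNECTION at_gateway TO `{PRINCIPAL}`;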
7.3 Run the SETUP procedure
Replace the following:
API_BASE_URL: the API base URL of your CARTO Self-Hosted platform
API_ACCESS_TOKEN: an access token generated inside the CARTO platform with permissions to use the LDS API
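A sketch only — the exact procedure name, location, and signature are defined by the Analytics Toolbox version you installed (see the SQL Warehouse installation guide); the catalog/schema (carto.carto) and argument names here are assumptions:

```sql
CALL carto.carto.SETUP(
  '{API_BASE_URL}',
  '{API_ACCESS_TOKEN}'
);
```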
Keep your API Access Token secure. Anyone with access to this token can consume the LDS quota assigned to your account.
Step 8: Verification
After completing the configuration, verify that the Analytics Toolbox is working correctly through the AT Gateway.
8.1 Check the Analytics Toolbox version
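A sketch, assuming the Analytics Toolbox functions were installed under catalog carto, schema carto:

```sql
SELECT carto.carto.VERSION_CORE();
```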
You should receive a version string (e.g., 0.1.0), confirming that the AT Gateway is correctly proxying requests to the CARTO platform.
8.2 Check the LDS quota
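A sketch, again assuming the carto.carto catalog/schema location; the function name mirrors the LDS quota helper available in other CARTO warehouses and may differ in your installation:

```sql
SELECT carto.carto.LDS_QUOTA_INFO();
```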
You should receive a JSON response showing your available quota and configured providers:
This confirms that:
The AT Gateway is deployed and reachable from Databricks
VNet peering is working correctly between the ACA VNet and the CARTO VNet
The CARTO Self-Hosted platform is accessible through the internal network
Your LDS credentials are valid
Congratulations!
Your CARTO Analytics Toolbox AT Gateway for Databricks is now successfully deployed and configured inside your Azure VNet.
After installing or updating the Analytics Toolbox, the owner of the CARTO connection needs to refresh it by clicking the refresh button on the connection's card.
Now you can start using the functions in the SQL Reference.