Installation in an Amazon Web Services VPC

This guide walks you through configuring the CARTO Analytics Toolbox to work within a VPC with a CARTO Self-hosted installation on Amazon Web Services.

Install CARTO Analytics Toolbox inside your Redshift database

The first step is to install the Analytics Toolbox in your own Redshift database.

Once the Analytics Toolbox is installed in your project, use this guide to deploy the AT Gateway function in your VPC.

Deploy the infrastructure needed to allow Location Data Services usage

Some functionalities of the CARTO Analytics Toolbox for Redshift require performing external calls from Redshift to CARTO services. These calls are implemented via Amazon Lambda Functions:

  • Creating isolines, geocoding, and routing require calls to the CARTO LDS API. To call this API, Redshift needs access to a Lambda function that works as a proxy, so Lambda functions must be deployed in your VPC.

When installing the Analytics Toolbox manually in your own project, there is some configuration required to deploy the AT Gateway function.

Architecture overview

To deploy the Analytics Toolbox within a VPC, the CARTO platform needs to deploy some additional infrastructure pieces within your AWS project. In the following diagram, you can check how all these pieces interact with each other:

We'll set up the following pieces inside your project to start using the Analytics Toolbox on your CARTO Self-hosted platform:

  • One subnetwork used to deploy the containers created by the required Lambda function.

  • A Lambda function required for Redshift to interact with the Self-hosted platform.

  • An internal DNS record pointing to the IP address of your CARTO Self-hosted platform.

  • A VPC endpoint used to allow the communication between your Redshift instance and the VPC where CARTO Self-Hosted platform is installed.

Follow the steps below to set up the required infrastructure pieces:

All of the following commands and instructions should be executed from an authenticated AWS CLI session.

1. Deploy the AT Gateway Function

Redshift has to call a Lambda function to use the AT. This service is the AT Gateway Function and, before creating it, we need to create a subnetwork for it:

  1. Create a subnet for the Lambda function:

aws ec2 create-subnet \
    --vpc-id {VPC_NETWORK} \
    --cidr-block {SUBNETWORK_IPS_RANGE} \
    --region {REGION} \
    --tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value={SUBNETWORK_NAME}}]'

Replace the following:

  • VPC_NETWORK: the ID of the network created in your VPC project

  • SUBNETWORK_IPS_RANGE: the range of IPs that this subnetwork will use

The IP range selected for the subnetwork must be a /24 CIDR block.

  • REGION: the region used to create the subnetwork

  • SUBNETWORK_NAME: the name of the subnetwork that will be created
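
Because the range must be a /24 block, it can help to check the value before calling the AWS CLI. The following is a minimal sketch in POSIX shell; the function name is_cidr_24 is a hypothetical helper and not part of the AWS CLI or the Analytics Toolbox. It only checks the shape of the value, not whether the range is free in your VPC.

```shell
# Hypothetical helper: succeeds when "$1" looks like an IPv4 /24 CIDR block.
# Note: it uses the global variables ip, octet and old_ifs (POSIX sh has no "local").
is_cidr_24() {
  # The prefix length must be exactly /24.
  case "$1" in
    */24) ;;
    *) return 1 ;;
  esac
  ip="${1%/24}"
  # Only digits and dots are allowed, with no empty fields.
  case "$ip" in
    *[!0-9.]*|*..*|.*|*.) return 1 ;;
  esac
  # Split the address on dots into the positional parameters.
  old_ifs="$IFS"; IFS=.
  set -- $ip
  IFS="$old_ifs"
  [ "$#" -eq 4 ] || return 1
  # Every octet must be at most three digits and no greater than 255.
  for octet in "$@"; do
    [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
  done
}

# Usage (not executed here): only create the subnet when the range is valid.
# is_cidr_24 "$SUBNETWORK_IPS_RANGE" || { echo "not a /24 range" >&2; exit 1; }
```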

  2. Create a security group for your AT Gateway Lambda Function:

aws ec2 create-security-group \
  --group-name {GROUP_NAME} \
  --region {REGION} \
  --description "Security Group for AT Gateway Lambda Function" \
  --vpc-id {VPC_NETWORK}

As this security group will be used by your AT Gateway Lambda Function, it should be able to perform requests and receive requests from Redshift and your CARTO Self-Hosted installation.

Replace the following:

  • GROUP_NAME: the name of the security group created in your AWS project

  • REGION: the region used to create the security group

  • VPC_NETWORK: the ID of the network created in your VPC project

Once the subnetwork has been correctly created, we need a VPC endpoint that will be used for connecting Redshift with the VPC:

  3. Provision a VPC endpoint:

aws ec2 create-vpc-endpoint \
    --vpc-id {VPC_NETWORK} \
    --service-name com.amazonaws.{REGION}.lambda \
    --vpc-endpoint-type Interface \
    --security-group-ids {AT_GATEWAY_FUNCTION_SECURITY_GROUP} \
    --region {REGION} \
    --private-dns-enabled

Replace the following:

  • VPC_NETWORK: the ID of the network created in your VPC project

  • REGION: the region used to create the VPC endpoint

  • AT_GATEWAY_FUNCTION_SECURITY_GROUP: ID of the security group created in the previous step

  4. Create a role for the Lambda Function:

aws iam create-role \
    --role-name {ROLE_NAME} \
    --description "Role created for the CARTO AT Gateway Function" \
    --assume-role-policy-document file://carto-functions-role.json

Replace the following:

  • ROLE_NAME: name of the role that will be created for the AT Gateway Lambda Function

The carto-functions-role.json file should contain the following policy JSON, which allows the Lambda service to assume the role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowLambdaToAssumeRole",
      "Effect": "Allow",
      "Principal": {
        "Service": "lambda.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
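
One way to create this file without hand-editing it is a small shell helper that writes the trust policy for a given service principal. This is only a sketch: write_trust_policy is a hypothetical name, and the optional Sid field is omitted. The same helper could be reused later with redshift.amazonaws.com for the role that Redshift assumes.

```shell
# Hypothetical helper: writes an IAM trust policy for one service principal.
# $1 = service principal (for example lambda.amazonaws.com), $2 = output file.
# The optional "Sid" field is omitted in this sketch.
write_trust_policy() {
  cat > "$2" <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "$1" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Create the file referenced by --assume-role-policy-document in the current directory.
write_trust_policy "lambda.amazonaws.com" "carto-functions-role.json"
```
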
  5. Attach permissions to the role created in the previous step so that the function can run with access to the VPC network:

aws iam attach-role-policy \
    --role-name {ROLE_NAME} \
    --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole

  • ROLE_NAME: name of the role created in the previous step

  6. Deploy the AT Gateway Function in AWS Lambda:

aws lambda create-function \
  --function-name {NAME} \
  --description "CARTO AT Gateway Function" \
  --role {ROLE_ARN} \
  --code ImageUri=436593525295.dkr.ecr.us-east-1.amazonaws.com/at-gateway/aws-lambda:latest \
  --package-type Image \
  --region {REGION} \
  --timeout 180 \
  --vpc-config '{
    "SubnetIds": ["{SUBNET_ID}"],
    "SecurityGroupIds": ["{SECURITY_GROUP_ID}"]
  }' \
  --environment "Variables={NODE_TLS_REJECT_UNAUTHORIZED=0}"

Setting the NODE_TLS_REJECT_UNAUTHORIZED environment variable to 0 disables TLS certificate verification, so that custom TLS certificates in the Self-hosted deployment are accepted.

Replace the following variables:

  • NAME: name of the AT Gateway function

  • ROLE_ARN: ARN of the role created for the Lambda function in the previous steps

  • SUBNET_ID: ID of the subnetwork created for the Lambda function

  • SECURITY_GROUP_ID: ID of the security group created for the Lambda function
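
If you did not note down these identifiers when the resources were created, they can be looked up with the AWS CLI. The sketch below wraps three standard describe/get calls in shell functions; the function names are hypothetical, and the lookups assume that the subnet carries the Name tag set earlier and that the security group name is unique within the VPC.

```shell
# Look up the ARN of an IAM role. $1 = role name.
get_role_arn() {
  aws iam get-role --role-name "$1" --query 'Role.Arn' --output text
}

# Look up a subnet ID by its Name tag. $1 = subnet name, $2 = region.
get_subnet_id() {
  aws ec2 describe-subnets \
    --region "$2" \
    --filters "Name=tag:Name,Values=$1" \
    --query 'Subnets[0].SubnetId' \
    --output text
}

# Look up a security group ID. $1 = group name, $2 = VPC ID, $3 = region.
get_security_group_id() {
  aws ec2 describe-security-groups \
    --region "$3" \
    --filters "Name=group-name,Values=$1" "Name=vpc-id,Values=$2" \
    --query 'SecurityGroups[0].GroupId' \
    --output text
}

# Usage (not executed here):
# ROLE_ARN="$(get_role_arn "$ROLE_NAME")"
# SUBNET_ID="$(get_subnet_id "$SUBNETWORK_NAME" "$REGION")"
# SECURITY_GROUP_ID="$(get_security_group_id "$GROUP_NAME" "$VPC_NETWORK" "$REGION")"
```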

  7. Update the invoke configuration of the AT Gateway Lambda Function:

aws lambda put-function-event-invoke-config \
  --function-name {AT_GATEWAY_FUNCTION_NAME} \
  --maximum-retry-attempts 0 \
  --region us-east-1

Replace the following:

  • AT_GATEWAY_FUNCTION_NAME: name of the AT Gateway function created in the previous step

2. Create the role needed to perform requests from Redshift to the AT Gateway function

Once we have deployed the AT Gateway function in AWS Lambda, we have to create a role with permissions to perform requests to the function from Redshift. Your Redshift cluster will use that role when interacting with the AT Gateway function.

  1. Create a role to invoke the Lambda function

aws iam create-role \
    --role-name {ROLE_NAME} \
    --description "Role used to invoke the AT Gateway function from Redshift" \
    --assume-role-policy-document file://carto-invoke-function-role.json

Replace the following:

  • ROLE_NAME: name of the role that will be created for invoking the Lambda function

The carto-invoke-function-role.json file should contain the following policy JSON, which allows Redshift to assume the role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": "redshift.amazonaws.com"
      }
    }
  ]
}

  2. Create a policy for the role

aws iam create-policy \
    --policy-name {POLICY_NAME} \
    --description "Policy to allow invoking the AT Gateway function from Redshift" \
    --policy-document file://carto-invoke-function-role-policy.json

Replace the following:

  • POLICY_NAME: name of the policy that will be attached to the role created in the previous step

The carto-invoke-function-role-policy.json file should contain the following policy JSON, which allows the role to invoke the AT Gateway Lambda function:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Invoke",
            "Effect": "Allow",
            "Action": [
                "lambda:InvokeFunction"
            ],
            "Resource": "{LAMBDA_FUNCTION_ARN}"
        }
    ]
}

Replace the following:

  • LAMBDA_FUNCTION_ARN: ARN of the Lambda function deployed in the first step
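
Since the policy is passed with file://, the placeholder has to be replaced inside the file itself before running the command. As a rough sketch, the ARN can be looked up with aws lambda get-function and substituted with sed. The function names and the template file name below are hypothetical.

```shell
# Look up the ARN of a deployed function. $1 = function name, $2 = region.
get_lambda_arn() {
  aws lambda get-function --function-name "$1" --region "$2" \
    --query 'Configuration.FunctionArn' --output text
}

# Hypothetical helper: reads a policy template from stdin and replaces the
# {LAMBDA_FUNCTION_ARN} placeholder with the ARN given as $1.
fill_lambda_arn() {
  sed "s|{LAMBDA_FUNCTION_ARN}|$1|g"
}

# Usage (not executed here; policy-template.json is an assumed file name):
# fill_lambda_arn "$(get_lambda_arn "$AT_GATEWAY_FUNCTION_NAME" "$REGION")" \
#   < policy-template.json > carto-invoke-function-role-policy.json
```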

  3. Attach the policy to the role used to invoke the Lambda function

aws iam attach-role-policy \
    --role-name {ROLE_NAME} \
    --policy-arn {POLICY_ARN}

Replace the following:

  • ROLE_NAME: name of the role created to invoke the AT Gateway function from Redshift

  • POLICY_ARN: ARN of the policy created in the previous step

  4. Attach the role to the Redshift cluster

aws redshift modify-cluster-iam-roles \
  --cluster-identifier {CLUSTER_NAME} \
  --add-iam-roles {ROLE_ARN} \
  --region {REGION}

Replace the following:

  • CLUSTER_NAME: name of the Redshift cluster

  • ROLE_ARN: ARN of the role created to invoke the AT Gateway function from Redshift

  • REGION: the region where your Redshift cluster is deployed
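
Attaching a role is applied asynchronously, so it can be worth confirming that the cluster reports the role as in-sync before continuing. The sketch below reads the role list from aws redshift describe-clusters; the function names are hypothetical.

```shell
# Prints one line per IAM role attached to the cluster: role ARN, then apply status.
# $1 = cluster identifier, $2 = region.
cluster_role_status() {
  aws redshift describe-clusters \
    --cluster-identifier "$1" \
    --region "$2" \
    --query 'Clusters[0].IamRoles[].[IamRoleArn,ApplyStatus]' \
    --output text
}

# Succeeds when the role ARN given as $3 is attached and reported as in-sync.
role_is_in_sync() {
  cluster_role_status "$1" "$2" |
    awk -v arn="$3" '$1 == arn && $2 == "in-sync" { found = 1 } END { exit !found }'
}

# Usage (not executed here):
# role_is_in_sync "$CLUSTER_NAME" "$REGION" "$ROLE_ARN" && echo "role attached"
```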

3. Check security groups to ensure that the AT Gateway Lambda function can reach the Self-hosted instance

The Lambda function needs access to the CARTO Self-hosted environment, so you'll have to check that the security groups configured on your project allow the traffic between these two pieces.

The CARTO Self-hosted platform has to be accessible on port 443, and it must be allowed to respond to requests from the Lambda function deployed in the previous steps.

All requests will be handled inside the VPC, so all network traffic involved in this process will take place between the created subnetwork and the CARTO Self-hosted instance.
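
If the traffic is not yet allowed, one possible way to open it is an inbound rule on the security group that protects the Self-hosted instance, using the Lambda security group as the source. This is a sketch under assumptions: it presumes the Self-hosted instance has its own security group, and the function name is hypothetical. Your network setup may need a different rule.

```shell
# Allow inbound HTTPS (port 443) from the Lambda security group.
# $1 = security group ID of the CARTO Self-hosted instance (assumed to exist),
# $2 = security group ID of the AT Gateway Lambda function, $3 = region.
allow_https_from_lambda() {
  aws ec2 authorize-security-group-ingress \
    --group-id "$1" \
    --protocol tcp \
    --port 443 \
    --source-group "$2" \
    --region "$3"
}

# Usage (not executed here; CARTO_SECURITY_GROUP_ID is an assumed variable):
# allow_https_from_lambda "$CARTO_SECURITY_GROUP_ID" "$SECURITY_GROUP_ID" "$REGION"
```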

4. Create DNS entry for CARTO Self-hosted platform

The AT Gateway Function service needs to access the CARTO Self-hosted LDS API to perform requests to the different LDS providers. As the requests are handled inside the VPC, you must add an internal DNS record so that the Lambda functions can reach the CARTO platform APIs.

First, obtain the internal IP address of the CARTO Self-hosted platform.

If you already have an internal DNS configured in your AWS project, you can skip creating the zone and directly add a new domain pointing to the internal IP address of the CARTO platform. Otherwise, create a DNS zone inside AWS using the following command:

aws route53 create-hosted-zone \
    --name {DNS_ZONE} \
    --vpc '{"VPCRegion":"{REGION}","VPCId":"{VPC_ID}"}' \
    --caller-reference $(date +%s)

Replace the following:

  • DNS_ZONE: the name of your DNS zone

  • REGION: region where the zone is going to be created

  • VPC_ID: your AWS VPC id

Then create a new record inside the DNS zone, configuring a domain that points to the internal IP address of the CARTO Self-hosted platform:

aws route53 change-resource-record-sets \
  --hosted-zone-id {DNS_ZONE_ID} \
  --change-batch '{
    "Changes": [
      {
        "Action": "CREATE",
        "ResourceRecordSet": {
          "Name": "{INTERNAL_DOMAIN}",
          "Type": "A",
          "TTL": 300,
          "ResourceRecords": [
            {
              "Value": "{CARTO_PLATFORM_IP}"
            }
          ]
        }
      }
    ]
  }'

Replace the following:

  • DNS_ZONE_ID: the id of your DNS zone

  • INTERNAL_DOMAIN: the internal domain that will be pointing to your CARTO Self-hosted deployment inside your VPC

  • CARTO_PLATFORM_IP: internal IP address of your CARTO Self-hosted deployment
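
The DNS_ZONE_ID value is returned when the zone is created. If you need to look it up afterwards, the sketch below uses aws route53 list-hosted-zones-by-name and strips the /hostedzone/ prefix from the returned Id. The function name is hypothetical. Note that this call returns zones in lexicographic order starting at the given name, so the first result should be checked against the zone you expect.

```shell
# Prints the ID of the first hosted zone listed from the given DNS name onwards.
# $1 = DNS zone name. Uses the global variable zone_path.
get_zone_id() {
  zone_path="$(aws route53 list-hosted-zones-by-name \
    --dns-name "$1" \
    --query 'HostedZones[0].Id' \
    --output text)"
  # The API returns the Id as /hostedzone/<ID>, so drop the prefix.
  printf '%s\n' "${zone_path#/hostedzone/}"
}

# Usage (not executed here):
# DNS_ZONE_ID="$(get_zone_id "$DNS_ZONE")"
```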

Configure the AT Gateway in your CARTO Analytics Toolbox installation

Now that we've both installed the Analytics Toolbox and deployed the required infrastructure pieces in AWS, we have to configure the Analytics Toolbox so that it's able to use the AT Gateway function.

The Analytics Toolbox provides a procedure to update the configuration values needed to use the remote function. To configure it, run the following query in your Redshift database:

CALL carto.SETUP('{
   "lambda": "{AT_GATEWAY_LAMBDA}",
   "roles": "{ROLE_ARN}",
   "api_base_url": "{API_BASE_URL}",
   "api_access_token": "{API_ACCESS_TOKEN}"
}');

Replace the following:

  • AT_GATEWAY_LAMBDA: name of the AT Gateway Lambda function deployed in AWS

  • ROLE_ARN: ARN of the role created to allow requests from Redshift to your AT Gateway Lambda

  • API_BASE_URL: the API base URL of your CARTO Self-hosted platform

  • API_ACCESS_TOKEN: access token generated inside CARTO platform with permissions to use the LDS API

🎉Congratulations! Your CARTO Analytics Toolbox is now successfully installed and configured inside your VPC.
