Configure your own buckets

This documentation is for the CARTO Self-Hosted Legacy Version. Use only if you've installed this specific version. Explore our latest documentation for updated features.

Every CARTO Self-Hosted installation needs a set of storage buckets to store resources used by the platform. These storage buckets are part of the required infrastructure for importing and exporting data, map thumbnails, customization assets (custom logos and markers), and other internal data.

You can create and use your own storage buckets in any of the supported storage providers: Google Cloud Storage, AWS S3, or Azure Blob Storage.

Pre-requisites

  1. Create 2 buckets in your preferred Cloud provider:

    • Import Bucket

    • Thumbnails Bucket

There are no naming constraints.

Map thumbnail storage objects (.png files) can be configured to be public (default) or private. To change this, set WORKSPACE_THUMBNAILS_PUBLIC="false". Some features, such as branding and custom markers, won't work unless the bucket is public. However, there's a workaround to avoid making the whole bucket public: it requires allowing public objects, allowing ACLs (or non-uniform permissions), and disabling server-side encryption.
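On Google Cloud Storage, for example, allowing ACLs means turning off uniform bucket-level access. A minimal sketch with gsutil, using the placeholder bucket name from the configuration examples below (check the equivalent settings in the S3 or Azure docs if you use those providers):

# Allow per-object ACLs by disabling uniform bucket-level access
gsutil uniformbucketlevelaccess set off gs://<thumbnails_bucket_name>

# Individual objects can then be made public without exposing the whole bucket
gsutil acl ch -u AllUsers:R gs://<thumbnails_bucket_name>/<object>.png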

  2. Create the data export bucket. This bucket has to be created in a different storage provider depending on your data warehouse: GCS for BigQuery, or AWS S3 for Snowflake and Redshift.

For buckets created in AWS S3:

  • ACLs should be allowed.

  • If server-side encryption is enabled, the user must be granted permissions on the KMS key, following the AWS documentation (see the policy sketch below).
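For illustration only, this is a hedged sketch of the kind of KMS permissions typically required to read and write objects in a bucket with SSE-KMS enabled; the key ARN is a placeholder and the exact set of actions depends on your setup, so follow the AWS documentation for the authoritative list:

{
   "Version": "2012-10-17",
   "Statement": [
       {
           "Effect": "Allow",
           "Action": [
               "kms:GenerateDataKey",
               "kms:Decrypt"
           ],
           "Resource": "<your_kms_key_arn>"
       }
   ]
}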

  3. CORS configuration: the Thumbnails and Import buckets require the following CORS headers to be configured.

    • Allowed origins: *

    • Allowed methods: GET, PUT, POST

    • Allowed headers (common): Content-Type, Content-MD5, Content-Disposition, Cache-Control

    • GCS (extra): x-goog-content-length-range, x-goog-meta-filename

    • Azure (extra): Access-Control-Request-Headers, X-MS-Blob-Type

    • Max age: 3600

CORS is configured at bucket level in GCS and S3, and at storage account level in Azure.

How do I set up the CORS configuration? Check the provider docs: GCS, AWS S3, Azure Blob Storage.
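As a reference, this is roughly what the configuration could look like for GCS buckets, assuming you use gsutil; for S3 and Azure Blob Storage, apply the equivalent settings following the provider docs linked above. Save the rules as cors.json:

[
  {
    "origin": ["*"],
    "method": ["GET", "PUT", "POST"],
    "responseHeader": [
      "Content-Type",
      "Content-MD5",
      "Content-Disposition",
      "Cache-Control",
      "x-goog-content-length-range",
      "x-goog-meta-filename"
    ],
    "maxAgeSeconds": 3600
  }
]

and apply them to both buckets:

gsutil cors set cors.json gs://<import_bucket_name>
gsutil cors set cors.json gs://<thumbnails_bucket_name>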

  4. Generate credentials with Read/Write permissions to access those buckets (example commands are shown after this list). Our supported authentication methods are:

    • GCS: Service Account Key

    • AWS: Access Key ID and Secret Access Key

    • Azure Blob: Access Key
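For example, a GCS Service Account key or an AWS access key could be generated with commands like the following; the account names are placeholders, and for Azure Blob the access key can be copied from the storage account's Access keys section in the Azure Portal:

# GCS: create a JSON key file for the service account used by CARTO
gcloud iam service-accounts keys create carto-buckets-key.json \
  --iam-account=<service_account_name>@<gcp_project_id>.iam.gserviceaccount.com

# AWS: create an Access Key ID and Secret Access Key for the IAM user
aws iam create-access-key --user-name <iam_user_name>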

Single VM deployments (Docker Compose)

Import and Thumbnails buckets

In order to use Google Cloud Storage custom buckets you need to:

  1. Create a custom Service account.

  2. Grant this service account the following role, in addition to access to the buckets: roles/iam.serviceAccountTokenCreator (see the example command at the end of this section).

  3. Set the following variables in your customer.env file:

# Thumbnails bucket
WORKSPACE_THUMBNAILS_PROVIDER='gcp'
WORKSPACE_THUMBNAILS_PUBLIC=<true|false>
WORKSPACE_THUMBNAILS_BUCKET=<thumbnails_bucket_name>
WORKSPACE_THUMBNAILS_KEYFILENAME=/usr/src/certs/<gcp_key>.json
WORKSPACE_THUMBNAILS_PROJECTID=<gcp_project_id>

# Import bucket
IMPORT_PROVIDER='gcp'
IMPORT_BUCKET=<import_bucket_name>
IMPORT_KEYFILENAME=/usr/src/certs/<gcp_key>.json
IMPORT_PROJECTID=<gcp_project_id>

The service account key file used to access the GCP buckets should be copied into the certs folder, which is located inside the CARTO installation folder.

If <BUCKET>_KEYFILENAME is not defined, the GOOGLE_APPLICATION_CREDENTIALS environment variable is used as the default value. When the self-hosted service account is set up as the default service account of a Compute Engine instance, there's no need to set any of these variables, as the containers will inherit the instance's default credentials.

If <BUCKET>_PROJECTID is not defined, the GOOGLE_CLOUD_PROJECT environment variable is used as the default value.
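As a reference for step 2, the role could be granted at the Service Account level with a command along these lines; the service account email is a placeholder:

gcloud iam service-accounts add-iam-policy-binding \
  <service_account_name>@<gcp_project_id>.iam.gserviceaccount.com \
  --member="serviceAccount:<service_account_name>@<gcp_project_id>.iam.gserviceaccount.com" \
  --role="roles/iam.serviceAccountTokenCreator"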

Data export bucket

Configure data export bucket for BigQuery

To enable exporting data from BigQuery on the CARTO Self-Hosted platform, you need a GCS bucket to store the exported data, and a service account with permissions to manage the bucket. These are the required steps:

  1. Grant read/write permissions to the service account used by your CARTO Self-Hosted installation on the GCS export bucket created in the pre-requisites (an example command is shown below).

  2. Update the customer.env file with the following values:

EXPORTS_GCS_BUCKET_NAME=<GCP_BUCKET_NAME>
EXPORTS_GCS_CREDENTIALS_KEY_FILENAME=/usr/src/certs/<GCP_KEY_NAME>.json
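For step 1, the permissions could be granted with a command like the following, assuming roles/storage.objectAdmin gives enough read/write access for your policy; the bucket and service account names are placeholders:

gsutil iam ch \
  serviceAccount:<service_account_name>@<gcp_project_id>.iam.gserviceaccount.com:roles/storage.objectAdmin \
  gs://<GCP_BUCKET_NAME>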

Configure data exports in Snowflake and Redshift

Snowflake and Redshift require an AWS S3 bucket to export data from the CARTO platform. These are the steps needed to allow data exports from CARTO Self-Hosted with these providers:

  1. Create an IAM user and generate a programmatic key ID and secret. If server-side encryption is enabled, the user must be granted permissions on the KMS key used.

If you've already configured the Import and Thumbnails buckets using AWS S3, you can use the same user you already created for these buckets.

  2. Create an AWS IAM role with the following settings (an equivalent AWS CLI sketch is shown at the end of this section):

    1. Trusted entity type: Custom trust policy.

    2. Custom trust policy: Make sure to replace <your_aws_user_arn>.

    {
      "Version": "2012-10-17",
      "Statement": [
          {
              "Effect": "Allow",
              "Principal": {
                  "AWS": "<your_aws_user_arn>"
              },
              "Action": [
                  "sts:AssumeRole",
                  "sts:TagSession"
              ]
          }
      ]
    }
    3. Add permissions: Create a new permissions policy, replacing <your_aws_s3_bucket_name>.

    {
       "Version": "2012-10-17",
       "Statement": [
           {
               "Effect": "Allow",
               "Action": "s3:ListBucket",
               "Resource": "arn:aws:s3:::<your_aws_s3_bucket_name>"
           },
           {
               "Effect": "Allow",
               "Action": "s3:*Object",
               "Resource": "arn:aws:s3:::<your_aws_s3_bucket_name>/*"
           }
       ]
    }
  3. Add the following environment variables in your customer.env file and apply the changes:

EXPORTS_S3_BUCKET_NAME=<BUCKET_NAME>
EXPORTS_S3_BUCKET_REGION=<REGION>
EXPORTS_S3_BUCKET_ROLE_ARN=<ROLE_ARN>
EXPORTS_S3_BUCKET_ACCESS_KEY_ID=<ACCESS_KEY_ID>
EXPORTS_S3_BUCKET_SECRET_ACCESS_KEY=<SECRET_ACCESS_KEY>
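If you prefer the AWS CLI, the role and permissions policy from step 2 could be created with commands along these lines; this is a sketch that assumes the two JSON documents above were saved locally as trust-policy.json and s3-policy.json, and carto-exports-role is a placeholder name:

# Create the role with the custom trust policy
aws iam create-role \
  --role-name carto-exports-role \
  --assume-role-policy-document file://trust-policy.json

# Attach the S3 permissions as an inline policy
aws iam put-role-policy \
  --role-name carto-exports-role \
  --policy-name carto-exports-s3-access \
  --policy-document file://s3-policy.json

The Arn returned by create-role is the value to use for EXPORTS_S3_BUCKET_ROLE_ARN.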

Configure data exports in Amazon RDS for PostgreSQL

The bucket to export data from Amazon RDS for PostgreSQL can be configured from the CARTO platform UI. Once your Self-Hosted installation is finished, you can check the following documentation to learn how to configure your S3 bucket integration for Amazon RDS for PostgreSQL.

Orchestrated container deployment (Kubernetes)

Import and Thumbnails buckets

In order to use Google Cloud Storage custom buckets you need to:

  1. Add the following lines to your customizations.yaml and replace the <values> with your own settings:

appConfigValues:
  storageProvider: "gcp"
  workspaceImportsBucket: <import_bucket_name>
  workspaceImportsPublic: <false|true>
  workspaceThumbnailsBucket: <thumbnails_bucket_name>
  workspaceThumbnailsPublic: <false|true>
  thumbnailsBucketExternalURL: <public or authenticated external bucket URL>
  googleCloudStorageProjectId: <gcp_project_id>

Note that thumbnailsBucketExternalURL could be https://storage.googleapis.com/<thumbnails_bucket_name>/ for public access or https://storage.cloud.google.com/<thumbnails_bucket_name>/ for authenticated access.

  2. Select a Service Account that will be used by the application to interact with the buckets. There are two options:

    1. Using a custom Service Account that will be used not only for the buckets, but also for the services deployed by CARTO. If you are using Workload Identity, this is your option.

    2. Using a dedicated Service Account only for the buckets.

  3. Grant the selected Service Account the role roles/iam.serviceAccountTokenCreator in the GCP project where it was created.

⚠️ We don't recommend granting this role at the project IAM level, but instead at the Service Account permissions level (IAM > Service Accounts > your_service_account > Permissions).

  4. Grant the selected Service Account the roles/storage.admin role on the buckets you created.

  5. [OPTIONAL] Pass your GCP credentials as secrets. This is only required if you are going to use a dedicated Service Account only for the buckets:

    • Option 1: Automatically create the secret:

      appSecrets:
        googleCloudStorageServiceAccountKey:
          value: |
            <REDACTED>

    appSecrets.googleCloudStorageServiceAccountKey.value should be plain text, preserving the line breaks and indentation of the key file.

    • Option 2: Using an existing secret: Create a secret by running the command below, after replacing <PATH_TO_YOUR_SECRET.json> with the path to the Service Account key file:

      kubectl create secret generic \
        [-n my-namespace] \
        mycarto-google-storage-service-account \
        --from-file=key=<PATH_TO_YOUR_SECRET.json>

      Add the following lines to your customizations.yaml, without replacing any value:

      appSecrets:
        googleCloudStorageServiceAccountKey:
          existingSecret:
            name: mycarto-google-storage-service-account
            key: key

Data export bucket

Configure data export bucket for BigQuery

To enable exporting data from BigQuery on the CARTO Self-Hosted platform, these are the required steps:

  1. Grant read/write permissions to the service account used by your CARTO Self-Hosted installation on the GCS bucket created in the pre-requisites.

  2. Define the name of the bucket that will be used for exporting data in your customizations.yaml file:

appConfigValues:
    workspaceExportsBucket: <YOUR_EXPORTS_BUCKET>

Configure data exports in Snowflake and Redshift

Snowflake and Redshift require an AWS S3 bucket to export data from the CARTO platform. These are the steps needed to allow data exports from CARTO Self-Hosted with these providers:

  1. Create an IAM user and generate a programmatic key ID and secret. If server-side encryption is enabled, the user must be granted permissions on the KMS key used.

If you've already configured the Import and Thumbnails buckets using AWS S3, you can use the same user you already created for these buckets.

  2. Create an AWS IAM role with the following settings:

    1. Trusted entity type: Custom trust policy.

    2. Custom trust policy: Make sure to replace <your_aws_user_arn>.

    {
      "Version": "2012-10-17",
      "Statement": [
          {
              "Effect": "Allow",
              "Principal": {
                  "AWS": "<your_aws_user_arn>"
              },
              "Action": [
                  "sts:AssumeRole",
                  "sts:TagSession"
              ]
          }
      ]
    }
    3. Add permissions: Create a new permissions policy, replacing <your_aws_s3_bucket_name>.

    {
       "Version": "2012-10-17",
       "Statement": [
           {
               "Effect": "Allow",
               "Action": "s3:ListBucket",
               "Resource": "arn:aws:s3:::<your_aws_s3_bucket_name>"
           },
           {
               "Effect": "Allow",
               "Action": "s3:*Object",
               "Resource": "arn:aws:s3:::<your_aws_s3_bucket_name>/*"
           }
       ]
    }
  3. Update your customizations.yaml file with the following values:

appConfigValues:
    awsExportBucket: <BUCKET_NAME>
    awsExportBucketRegion: <REGION>
    exportAwsRoleArn: <ROLE_ARN>

Pass your AWS credentials as secrets:

  • Option 1: Automatically create the secret:

    appSecrets:
      exportAwsSecretAccessKey:
        value: <REDACTED>
      exportAwsAccessKeyId:
        value: <REDACTED>

appSecrets.exportAwsSecretAccessKey.value and appSecrets.exportAwsAccessKeyId.value should be in plain text.

  • Option 2: Using an existing secret: Create a secret by running the command below:

    kubectl create secret generic \
      [-n my-namespace] \
      mycarto-export-aws-access-key \
      --from-literal=key-id=<ACCESS_KEY_ID> \
      --from-literal=key-secret=<ACCESS_KEY_SECRET>

    Add the following lines to your customizations.yaml, without replacing any value:

    appSecrets:
      exportAwsSecretAccessKey:
        existingSecret:
          name: mycarto-export-aws-access-key
          key: key-secret
      exportAwsAccessKeyId:
        existingSecret:
          name: mycarto-export-aws-access-key
          key: key-id

Configure data exports in Amazon RDS for PostgreSQL

The bucket to export data from Amazon RDS for PostgreSQL can be configured from the CARTO platform UI. Once your Self-Hosted installation is finished, you can check the following documentation to learn how to configure your S3 bucket integration for Amazon RDS for PostgreSQL.
