This section contains guides to configure specific aspects of your CARTO Self-hosted installation.
The content of this section applies only to Single VM deployments and Standard Orchestrated containers deployments using Kots.
For CARTO Self-hosted using Kots
For every CARTO Self-Hosted installation, we need some configured buckets to store resources that will be used by the platform. These storage buckets are part of the required infrastructure for importing and exporting data, map thumbnails, customization assets (custom logos and markers) and other internal data.
You can create and use your own storage buckets in any of the following supported storage providers:
Select your preferred storage provider and configure your storage preferences by completing the necessary fields below:
When configuring Google Cloud Storage as your storage provider, you'll have to:
Create 3 buckets in GCS:
Assets Bucket
Temp Bucket
Data export Bucket (optional: only needed if you'd like to allow exporting data from your data warehouse)
Configure CORS: The Temp and Assets buckets require the following CORS configuration:
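As a sketch, a GCS CORS policy can be written to a JSON file and applied with gsutil. The origin, headers, and methods below are assumptions to adapt to your own Self-Hosted domain and requirements:

```shell
# Illustrative values only: replace the origin with your Self-Hosted domain.
cat > gcs-cors.json <<'EOF'
[
  {
    "origin": ["https://carto.example.com"],
    "method": ["GET", "PUT", "POST", "HEAD"],
    "responseHeader": ["Content-Type", "Content-MD5", "Content-Disposition"],
    "maxAgeSeconds": 3600
  }
]
EOF

# Apply the policy to each bucket (requires gsutil and GCP credentials):
# gsutil cors set gcs-cors.json gs://<your-temp-bucket>
# gsutil cors set gcs-cors.json gs://<your-assets-bucket>
```

You can verify the applied policy afterwards with `gsutil cors get gs://<bucket-name>`.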
Ensure that the identity used to access your GCS buckets has read/write permissions over all the buckets that will be used.
Provide the Project ID of the Google Cloud Platform (GCP) project where your GCS buckets are located.
Specify the names of the GCS buckets that your application will be using. This allows your application to target the specific buckets for storing and retrieving data.
When configuring AWS S3 as your storage provider, you'll have to:
Create 3 buckets in your AWS S3 account:
Assets Bucket
Temp Bucket
Data export Bucket (optional: only needed if you'd like to allow exporting data from your data warehouse)
When creating your buckets, please check that:
ACLs should be allowed.
If server-side encryption is enabled, the user must be granted permissions over the KMS key, following the AWS documentation.
Configure CORS: The Temp and Assets buckets require the following CORS configuration:
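A minimal sketch of an S3 CORS configuration, applied with the AWS CLI. The origin, headers, and methods are assumptions; substitute your own Self-Hosted domain and bucket names:

```shell
# Illustrative values only: replace the origin with your Self-Hosted domain.
cat > s3-cors.json <<'EOF'
{
  "CORSRules": [
    {
      "AllowedOrigins": ["https://carto.example.com"],
      "AllowedMethods": ["GET", "PUT", "POST", "HEAD"],
      "AllowedHeaders": ["*"],
      "MaxAgeSeconds": 3600
    }
  ]
}
EOF

# Apply to the Temp and Assets buckets (requires the AWS CLI and credentials):
# aws s3api put-bucket-cors --bucket <your-temp-bucket> \
#     --cors-configuration file://s3-cors.json
# aws s3api put-bucket-cors --bucket <your-assets-bucket> \
#     --cors-configuration file://s3-cors.json
```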
Provide an Access Key ID and Secret Access Key that will be used to access your S3 buckets. You can generate these credentials through the AWS Management Console by creating an IAM user with appropriate permissions for accessing S3 resources.
Configure the region in which these buckets are located. All the buckets must be created in the same AWS region.
Specify the names of the AWS buckets that your application will be using. This allows your application to target the specific buckets for storing and retrieving data.
Create an AWS IAM role with the following settings:
Trusted entity type: Custom trust policy
Custom trust policy: Make sure to replace <your_aws_user_arn> with the ARN of the user whose Access Key has been configured in the CARTO deployment configuration.
Add permissions: Create a new permissions policy. Note that you can omit the export bucket permissions if you don't want to enable exporting data from the CARTO platform.
This role has permissions to use both the exports bucket and the temp bucket to store data that will be imported into Redshift. In order to enable exporting data from Redshift, you'll have to specify the ARN of the role and the name of the exports bucket in the CARTO Self-Hosted configuration.
If you'd like to enable importing data into Redshift, providing the exports bucket's name is not mandatory, but you'll have to follow these instructions once the CARTO Self-Hosted deployment is ready.
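The custom trust policy described above can be sketched as follows. The <your_aws_user_arn> placeholder must be replaced as described, and the role name in the commented command is hypothetical:

```shell
# Standard AssumeRole trust policy; <your_aws_user_arn> is the placeholder
# described above and must be replaced with your user's ARN.
cat > trust-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "<your_aws_user_arn>" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

# Create the role (requires the AWS CLI; role name is hypothetical):
# aws iam create-role --role-name carto-redshift-role \
#     --assume-role-policy-document file://trust-policy.json
```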
Create an AWS IAM role with the following settings:
Trusted entity type: Custom trust policy
Custom trust policy: Make sure to replace <your_aws_user_arn> with the ARN of the user whose Access Key has been configured in the CARTO deployment configuration.
Add permissions: Create a new permissions policy. Note that you can omit the export bucket permissions if you don't want to enable exporting data from the CARTO platform.
This role has permissions to use the exports bucket to store the data exported from Snowflake. In order to enable exporting data from Snowflake you'll have to specify the ARN of the role and the name of the data export bucket in the CARTO Self-Hosted configuration.
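A hedged sketch of what a permissions policy for the exports bucket could look like. The actions, bucket name, and policy name below are assumptions based on typical S3 read/write policies, not CARTO's exact policy:

```shell
# Hypothetical policy granting read/write access to the exports bucket.
cat > exports-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::<your-exports-bucket>",
        "arn:aws:s3:::<your-exports-bucket>/*"
      ]
    }
  ]
}
EOF

# Attach it to the role created above (requires the AWS CLI):
# aws iam put-role-policy --role-name <your-role-name> \
#     --policy-name carto-exports-access \
#     --policy-document file://exports-policy.json
```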
When configuring Azure Blob as your storage provider, you'll have to:
Create 3 containers in your Azure Blob storage account:
Assets container
Temp container
Data export container (optional: only needed if you'd like to allow exporting data from your data warehouse)
Configure CORS: The Temp and Assets containers require the following CORS configuration:
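For Azure Blob storage, CORS rules are set at the storage account level; one way is via the Azure CLI. The account name and origin below are placeholders, and the methods and headers are assumptions to adapt to your requirements:

```shell
# Placeholders: replace with your storage account and Self-Hosted domain.
AZURE_STORAGE_ACCOUNT="<your_storage_account>"
SELF_HOSTED_ORIGIN="https://carto.example.com"   # assumed Self-Hosted domain
echo "Applying CORS for $SELF_HOSTED_ORIGIN on $AZURE_STORAGE_ACCOUNT"

# Requires az login and permissions on the storage account:
# az storage cors add --services b \
#     --methods GET PUT POST HEAD \
#     --origins "$SELF_HOSTED_ORIGIN" \
#     --allowed-headers "*" --exposed-headers "*" \
#     --max-age 3600 \
#     --account-name "$AZURE_STORAGE_ACCOUNT"
```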
Provide an Access Key that will be used to access your containers.
Specify the names of the containers that your application will be using. This allows your application to target the specific containers for storing and retrieving data.
CARTO Self-Hosted requires Redis (version 6 or above) to work. This Redis instance does not need persistence, as it is used solely as a cache.
Both the Single VM deployment and the Orchestrated container deployment already come with an internal Redis deployment, but it lacks backups, autoscaling, and monitoring. Cloud vendors offer Redis deployments at scale as a service:
In this section, you will see how to configure your Self-Hosted installation to use an external Redis provided by your cloud vendor. The needed changes can be applied from the Admin Console, where you'll have to update the following config:
By default, CARTO will try to connect to your Redis without TLS enabled. In case you want to connect via TLS, you can enable it and configure your SSL certificate if it's self-signed or signed by a custom CA.
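As a quick sanity check before pointing CARTO at an external Redis, you can verify connectivity with redis-cli. The host, port, and certificate path below are placeholders:

```shell
# Placeholders: replace with your managed Redis endpoint.
REDIS_HOST="redis.example.com"
REDIS_PORT="6379"
echo "Redis target: $REDIS_HOST:$REDIS_PORT"

# Without TLS (requires redis-cli; a healthy instance replies PONG):
# redis-cli -h "$REDIS_HOST" -p "$REDIS_PORT" ping

# With TLS and a self-signed / custom-CA certificate:
# redis-cli -h "$REDIS_HOST" -p "$REDIS_PORT" --tls \
#     --cacert ./custom-ca.pem ping
```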
CARTO Self-hosted can be configured to use Google Basemaps in Builder, allowing you to choose between different Basemap styles provided by Google. All you need is a Google Maps API key and a few simple configuration steps.
The CARTO Self-hosted deployment requires a Google Maps JavaScript API key in order to use Google Basemaps from Builder. You can follow these steps to generate a new key:
Enable Google Maps JavaScript API:
In the Google Cloud Console, navigate to the APIs & Services section and go to the Library tab
Click on the Enable APIs & Services button
Search for Google Maps JavaScript API and enable it
Create Credentials:
After enabling the API, navigate to Credentials tab
Click on Create Credentials and pick API key. Your new API key should appear as soon as it's generated!
Copy Your API Key: This is the API Key that the CARTO Self-Hosted instance will use to load the different Google Basemaps in Builder.
Ensure the security of your API key by applying a restrictive usage policy. After setting up your API key, consider configuring key restrictions, such as limiting it to the Google Maps JavaScript API or allowing usage only from your domain.
In order to enable Google Maps basemaps inside CARTO Self-Hosted, you need to own a Google Maps API key and add it to your configuration:
Once you've applied the new API Key, Google Basemaps should be available in Builder! 🎉
The CARTO Data Warehouse is a default connection that will help you get started with our platform. It gives you access to some demo datasets so you can start using the platform from the very beginning, in case you don't have your own data warehouse with spatial data yet. It will also allow using spatial datasets from CARTO's Data Observatory without connecting your own data warehouse to get access to cloud resources.
By default, the CARTO Data Warehouse in CARTO Self-Hosted is disabled. In this state, certain functionalities are limited compared to when it's enabled. Understanding these limitations is essential for optimal use of CARTO's capabilities.
The CARTO platform has some demo tables, maps, and workflows powered by the CARTO Data Warehouse that can't be used when it is not enabled. Once it's enabled, you'll be able to complete the onboarding experience using these demo resources.
Without the CARTO Data Warehouse, you can still explore the Spatial Data Catalog, but you can’t subscribe to or use the data it offers in Workflows/Builder directly. Instead, you’ll need to contact our team, who can deploy a copy of the data in your data warehouse.
With the activation of the CARTO Data Warehouse, a dedicated connection is automatically established for every newly created organization within your Self-Hosted installation. This seamlessly integrated connection allows users to import data and start using the platform's capabilities.
To take full advantage of the CARTO Data Warehouse in your organization, enable it in your configuration:
Once the update process has finished, every CARTO organization created in your Self-Hosted deployment will have access to the CARTO Data Warehouse by default. Consequently, you'll be able to complete the onboarding process and access the Data Observatory subscriptions.
When enabling the CARTO Data Warehouse in a Self-Hosted installation, there are additional requirements to fulfill, as a new component is added to the CARTO Self-Hosted architecture.
If you check the previous diagram, the CARTO Data Warehouse is a new external component connected to your Self-Hosted installation. Consequently, your deployment must be able to perform requests to an external component hosted on CARTO's side.
As explained in the deployment requirements, this data warehouse is powered by Google BigQuery, so you'll have to allow requests to the BigQuery APIs in order to use this feature.
CARTO Self-hosted supports operating behind an HTTP or HTTPS proxy. The proxy acts as a gateway, enabling CARTO Self-hosted components to establish connections with essential external services like the CARTO licensing system or auth.carto.com. You can find detailed information about these components and services in the network requirements section.
CARTO Self-hosted does not provide or install any proxy component; it's built to connect to existing proxy software deployed on your side.
A comprehensive list of domains that must be whitelisted by the proxy for the proper operation of CARTO Self-hosted can be found here. This list includes domains for the core services of CARTO Self-hosted, as well as some optional domains that should be enabled to access specific features.
In order to configure an external HTTP proxy on your CARTO Self-hosted installation, you'll have to:
Please be aware that proxy support is not available for our Single VM Deployment at this time.
Update your installation so that it uses an HTTP proxy:
The no-proxy flag receives a comma-separated list of domains to exclude from proxying. The .svc.cluster.local domain must be in the list to allow internal communication between components within your cluster.
The k8s_cluster_ip_service IP address is the one that belongs to the ClusterIP service that Kubernetes creates by default in your default namespace. You can obtain it by running the following command:
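A sketch of how that value can be obtained and wired into the no-proxy list; the ClusterIP shown is a placeholder for illustration:

```shell
# One way to obtain the ClusterIP of the default "kubernetes" Service
# (requires kubectl against your cluster):
# K8S_CLUSTER_IP_SERVICE=$(kubectl get service kubernetes -n default \
#     -o jsonpath='{.spec.clusterIP}')
K8S_CLUSTER_IP_SERVICE="10.96.0.1"   # placeholder value for illustration

# The no-proxy list must include this IP and the internal cluster domain:
NO_PROXY="localhost,127.0.0.1,${K8S_CLUSTER_IP_SERVICE},.svc.cluster.local"
echo "$NO_PROXY"
```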
Once your installation has been updated, you'll have to edit the config of CARTO Self-Hosted platform from the Admin Console to allow the usage of an external proxy.
Please, take into account that if you're configuring an external proxy in a CARTO Self-Hosted installation running in GKE with Workload Identity configured, you'll have to add the following excluded domains:
pubsub.googleapis.com,*.googleapis.com,169.254.169.254,metadata,metadata.google.internal
These domains are required when authenticating the requests performed from an installation using Workload Identity.
To configure an HTTPS proxy on CARTO Self-hosted, you'll have to change the following configuration:
Please be aware that proxy support is not available for our Single VM Deployment at this time.
Update your installation so that it uses an HTTPS proxy:
The no-proxy flag receives a comma-separated list of domains to exclude from proxying. The .svc.cluster.local domain must be in the list to allow internal communication between components within your cluster.
The k8s_cluster_ip_service IP address is the one that belongs to the ClusterIP service that Kubernetes creates by default in your default namespace. You can obtain it by running the following command:
Once your installation has been updated, you'll have to edit the config of CARTO Self-Hosted platform from the Admin Console to allow the usage of an external proxy.
Please, take into account that if you're configuring an external proxy in a CARTO Self-Hosted installation running in GKE with Workload Identity configured, you'll have to add the following excluded domains:
pubsub.googleapis.com,*.googleapis.com,169.254.169.254,metadata,metadata.google.internal
These domains are required when authenticating the requests performed from an installation using Workload Identity.
While certain data warehouses can be configured to work with a proxy, some providers will inherently bypass it. This means that the connection to these data warehouses won't be created through the proxy; instead, CARTO Self-hosted services will perform requests directly to the providers.
BigQuery: It supports both HTTP and HTTPS proxies.
PostgreSQL and Redshift: They use a TCP connection instead of HTTP(S), so the proxy is bypassed.
Databricks: Proxy is not supported, so the HTTPS connection will be bypassed.
Snowflake: It supports HTTP proxy, but HTTPS is not supported and will have to be bypassed. In order to bypass it, you'll have to add snowflakecomputing.com to the list of excluded domains.
Password authentication is not supported for the proxy connection.
Importing data using an HTTPS Proxy configured with a certificate signed by a Custom CA is not supported.
This document will walk you through the process of setting up OAuth connections in your CARTO Self-hosted installation, enabling secure and seamless authentication when creating your BigQuery connections from the CARTO platform.
The first thing that has to be configured is an OAuth consent screen to allow the creation of OAuth connections. Navigate to APIs & Services > OAuth consent screen and enable it by filling in the application name, a support email for your consent screen, the authorized domain for your application, and an email for developer contact. The authorized domain you choose should be the one used in the email addresses of the users that will use this feature.
The following scopes are required to create a BigQuery OAuth connection from the CARTO platform:
https://www.googleapis.com/auth/userinfo.email
https://www.googleapis.com/auth/userinfo.profile
https://www.googleapis.com/auth/bigquery
Navigate to APIs & Services > Credentials > Create credentials to access the OAuth credentials creation form. The following details will be required to create the OAuth client ID:
Application type: Web application.
Authorized JavaScript origins: https://<your_selfhosted_domain>
Authorized redirect URIs: https://<your_selfhosted_domain>/connections/bigquery/oauth
Once the create button is clicked, you should be able to download the credentials generated for your application. These credentials will contain the client_id and client_secret required to enable OAuth connections in the CARTO installation.
Update the config of your CARTO Self-Hosted deployment from the Admin Console. You'll have to provide your OAuth client ID and client secret:
Once you've configured your CARTO Self-Hosted platform to use the OAuth credentials created in GCP, the Sign in with Google button should be available when creating a BigQuery connection from the Workspace.
This guide outlines the steps to configure Single Sign-On (SSO) for your CARTO Self-Hosted instance. SSO integration enhances security and user experience by allowing users to log in with a single set of credentials across multiple systems.
Contact CARTO Support:
Initiate contact with the CARTO Support team to request assistance with SSO configuration.
Work closely with the Support team to communicate your organization's specific requirements.
Obtain organization ID:
Once SSO is successfully configured from the CARTO side, CARTO Support team will provide you with a unique identifier known as the organization ID. This organization ID is required to continue with the SSO configuration in your CARTO Self-Hosted installation.
In order to configure SSO in your orchestrated container deployment, the organization ID should be injected into your CARTO Self-Hosted instance. This value can be configured following these steps:
Inject the organization ID into your CARTO Self-hosted:
Deploy Changes:
Deploy the updated configuration to apply the changes to your CARTO Self-Hosted instance.
With the successful integration of SSO and the Organization ID, your CARTO Self-Hosted instance is now configured to provide a seamless and secure Single Sign-On experience for your users. You can now navigate to your CARTO deployment domain, and it should use your IdP to log into the platform.
What is Workload Identity?
Applications running on Google Kubernetes Engine might need access to Google Cloud APIs such as Compute Engine API, BigQuery API, or Storage APIs.
Workload Identity allows a Kubernetes service account in your GKE cluster to act as an IAM service account. Pods that use the configured Kubernetes service account automatically authenticate as the IAM service account when accessing Google Cloud APIs. Using Workload Identity allows you to assign distinct, fine-grained identities and authorization for each application in your cluster.
When you enable Workload Identity on a cluster, GKE automatically creates a fixed workload identity pool for the cluster's Google Cloud project. A workload identity pool allows IAM to understand and trust Kubernetes service account credentials. GKE uses this pool for all clusters in the project that use Workload Identity. The workload identity pool has the following format:
PROJECT_ID.svc.id.goog
When you configure a Kubernetes service account in a namespace to use Workload Identity, IAM authenticates the credentials using the following member name:
serviceAccount:PROJECT_ID.svc.id.goog[KUBERNETES_NAMESPACE/KUBERNETES_SERVICE_ACCOUNT]
In this member name:
PROJECT_ID: your Google Cloud project ID.
KUBERNETES_NAMESPACE: the namespace of the Kubernetes service account.
KUBERNETES_SERVICE_ACCOUNT: the name of the Kubernetes service account making the request.
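Putting those pieces together, the member name is plain string interpolation of the three values. The example values below are hypothetical:

```shell
# Hypothetical example values:
PROJECT_ID="my-gcp-project"
KUBERNETES_NAMESPACE="carto"
KUBERNETES_SERVICE_ACCOUNT="carto-common-backend"

# Assemble the IAM member name in the format described above:
MEMBER="serviceAccount:${PROJECT_ID}.svc.id.goog[${KUBERNETES_NAMESPACE}/${KUBERNETES_SERVICE_ACCOUNT}]"
echo "$MEMBER"
```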
The process of configuring Workload Identity includes using an IAM policy binding to bind the Kubernetes service account member name to an IAM service account that has the permissions your workloads need. Any Google Cloud API calls from workloads that use this Kubernetes service account are authenticated as the bound IAM service account.
In order to enable Workload Identity in your CARTO Self-Hosted installation, you'll have to follow these steps:
Create an IAM service account for your application, or use an existing IAM service account instead.
IAM_SERVICE_ACCOUNT_NAME: name of the new service account.
PROJECT_ID: ID of the project where the GKE cluster is deployed.
The service account needs the roles/iam.serviceAccountTokenCreator role to sign URLs; you can grant it with this command:
IAM_SERVICE_ACCOUNT_NAME: name of the new service account used in the previous step.
IAM_SERVICE_ACCOUNT_EMAIL: email of the service account generated with the previous command.
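The steps above can be sketched with standard gcloud commands. The account name and project ID are hypothetical placeholders; the email format follows GCP's standard service account naming:

```shell
# Hypothetical names; replace with your own project and account name.
IAM_SERVICE_ACCOUNT_NAME="carto-selfhosted"
PROJECT_ID="my-gcp-project"
IAM_SERVICE_ACCOUNT_EMAIL="${IAM_SERVICE_ACCOUNT_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"
echo "$IAM_SERVICE_ACCOUNT_EMAIL"

# Create the service account (requires gcloud credentials):
# gcloud iam service-accounts create "$IAM_SERVICE_ACCOUNT_NAME" \
#     --project "$PROJECT_ID"

# Grant the account the token creator role on itself so it can sign URLs:
# gcloud iam service-accounts add-iam-policy-binding "$IAM_SERVICE_ACCOUNT_EMAIL" \
#     --member "serviceAccount:$IAM_SERVICE_ACCOUNT_EMAIL" \
#     --role roles/iam.serviceAccountTokenCreator
```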
Send an email to the CARTO Support Team (support@carto.com) with the Service Account email:
Reach out to our CARTO support team and provide them with the email associated with the Service Account you’ve created. This step ensures seamless integration of your Service Account with your CARTO Self-Hosted deployment.
IMPORTANT: The provided Service Account will not work until the CARTO support team has been notified and has applied the grants mentioned above.
Create the Kubernetes service account used for Workload Identity:
Use the following command to generate the service account in your cluster:
SERVICE_ACCOUNT_NAME: name of the service account that will be generated in your namespace. You'll have to provide this name when configuring the CARTO platform.
NAMESPACE: namespace where you're deploying the CARTO Self-Hosted platform.
Once your service account is created in your Kubernetes cluster, you'll have to annotate it with the email of the service account that you generated in your GCP project:
SERVICE_ACCOUNT_NAME: name of the service account created in the previous step.
NAMESPACE: namespace where you're deploying the CARTO Self-Hosted platform.
GCP_SERVICE_ACCOUNT_EMAIL: email of the service account that you created in your GCP project.
Allow the Kubernetes service account created in your GKE cluster to impersonate the IAM service account by adding an IAM policy binding between the two service accounts. This binding allows the Kubernetes service account to act as the IAM service account.
IAM_SERVICE_ACCOUNT_EMAIL: email of the service account generated in the first step.
PROJECT_ID: ID of the project where the GKE cluster is deployed.
KUBERNETES_NAMESPACE: namespace where the CARTO application is deployed.
KUBERNETES_SERVICE_ACCOUNT: name of the Kubernetes service account used by the CARTO application. The default value is carto-common-backend.
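The create, annotate, and bind steps above can be sketched as follows. The invocations reflect standard GKE Workload Identity setup, and all values are placeholders matching the variables described above:

```shell
# Placeholder values, matching the variables described above:
SERVICE_ACCOUNT_NAME="carto-wi"
NAMESPACE="carto"
GCP_SERVICE_ACCOUNT_EMAIL="carto-selfhosted@my-gcp-project.iam.gserviceaccount.com"
echo "KSA $NAMESPACE/$SERVICE_ACCOUNT_NAME -> $GCP_SERVICE_ACCOUNT_EMAIL"

# 1. Create the Kubernetes service account (requires kubectl):
# kubectl create serviceaccount "$SERVICE_ACCOUNT_NAME" -n "$NAMESPACE"

# 2. Annotate it with the IAM service account email:
# kubectl annotate serviceaccount "$SERVICE_ACCOUNT_NAME" -n "$NAMESPACE" \
#     iam.gke.io/gcp-service-account="$GCP_SERVICE_ACCOUNT_EMAIL"

# 3. Bind roles/iam.workloadIdentityUser so the KSA can impersonate the IAM SA
#    (requires gcloud; PROJECT_ID as described above):
# gcloud iam service-accounts add-iam-policy-binding "$GCP_SERVICE_ACCOUNT_EMAIL" \
#     --role roles/iam.workloadIdentityUser \
#     --member "serviceAccount:PROJECT_ID.svc.id.goog[$NAMESPACE/$SERVICE_ACCOUNT_NAME]"
```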
Add the workload identity service account name in your Admin Console:
CARTO Self-Hosted running on a GKE cluster can take advantage of the GKE Workload Identity feature to create a connection between the CARTO Self-Hosted platform and BigQuery without any user action.
Set up GKE Workload Identity for CARTO Self-Hosted following the documentation.
Grant your Workload Identity service account the required BigQuery permissions on your data warehouse project.
Enable the BigQuery workload identity connection in your Admin Console:
In the CARTO connection owner ID field, you'll have to use the ID of the CARTO user who will be the owner of the connection (e.g. "auth0|3idsj230990sj4wsddd10"). This can be obtained by running the following curl command:
Once you've applied the changes to your customizations.yaml file, your CARTO deployment will automatically create a new BigQuery connection using Workload Identity, owned by the CARTO user specified in the deployment configuration.
This guide details configuring High Availability (HA) for CARTO Self-Hosted deployments using Kubernetes. HA ensures continuous service during server outages or failures, keeping your CARTO deployment operational.
To enable the high availability configuration on your CARTO Self-Hosted deployment, you'll have to check the following option in your configuration:
This change will enable the default options for high availability on your Self-Hosted installation, and we'll deploy multiple replicas for the services that are considered critical for CARTO platform's usage.
In case you'd like to customize the high availability configuration, these are the available options:
Minimum Number of Replicas: This setting determines the minimum number of copies of each CARTO service running at any given time. Increasing this value enhances fault tolerance by providing more replicas to handle requests during failures. The default value is 2.
Maximum Number of Replicas: This setting defines the upper limit on the number of replicas for each service. This allows you to scale your deployment up or down based on usage. The default value is 3.
While the HTTP cache component doesn't support true HA, you can still optimize its performance by allocating resources:
Cache Memory Request: This specifies the amount of RAM the cache container can request from your Kubernetes cluster.
Cache CPU Request: This defines the amount of CPU units the cache container can request from your cluster.
By following these steps, you can configure a highly available CARTO deployment using Kubernetes. This setup ensures service continuity during failures, improving uptime and fault tolerance for your CARTO instance.
When deploying CARTO Self-Hosted, each installation is accompanied by dedicated infrastructure, including a Service Account key necessary for using certain deployed services. However, if you prefer to use your own Google Cloud Platform (GCP) Service Account, please follow these steps before starting the Self-Hosted installation:
Create a Dedicated Service Account: Generate a dedicated Service Account specifically for your CARTO Self-Hosted deployment within your Google Cloud Platform console.
Contact CARTO Support: Reach out to our CARTO support team at support@carto.com and provide them with the email associated with the Service Account you've created. This step ensures seamless integration of your Service Account with your CARTO Self-Hosted deployment.
For any queries or assistance regarding this process, feel free to contact our support team. Once the service account has the proper permissions on CARTO infrastructure, you'll be able to use your custom service account to deploy the CARTO Self-Hosted platform.