Use Workload Identity in GCP
This documentation is for the CARTO Self-Hosted Legacy Version. Use only if you've installed this specific version. Explore our latest documentation for updated features.
What is Workload Identity?
Applications running on Google Kubernetes Engine might need access to Google Cloud APIs such as Compute Engine API, BigQuery API, or Storage APIs.
Workload Identity allows a Kubernetes service account in your GKE cluster to act as an IAM service account. Pods that use the configured Kubernetes service account automatically authenticate as the IAM service account when accessing Google Cloud APIs. Using Workload Identity allows you to assign distinct, fine-grained identities and authorization for each application in your cluster.
Enabling Workload Identity in your Self-Hosted installation is only available for the orchestrated container deployment of CARTO.
How does Workload Identity work?
When you enable Workload Identity on a cluster, GKE automatically creates a fixed workload identity pool for the cluster's Google Cloud project. A workload identity pool allows IAM to understand and trust Kubernetes service account credentials. GKE uses this pool for all clusters in the project that use Workload Identity. The workload identity pool has the following format:
PROJECT_ID.svc.id.goog
When you configure a Kubernetes service account in a namespace to use Workload Identity, IAM authenticates the credentials using the following member name:
serviceAccount:PROJECT_ID.svc.id.goog[KUBERNETES_NAMESPACE/KUBERNETES_SERVICE_ACCOUNT]
In this member name:
PROJECT_ID: your Google Cloud project ID.
KUBERNETES_NAMESPACE: the namespace of the Kubernetes service account.
KUBERNETES_SERVICE_ACCOUNT: the name of the Kubernetes service account making the request.
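The member name format above can be assembled from its three parts. The following is a minimal sketch; the project ID and namespace are hypothetical placeholder values, and carto-common-backend is the default service account name mentioned later in this guide.

```shell
# Hypothetical placeholder values; replace them with your own.
PROJECT_ID="my-gcp-project"
KUBERNETES_NAMESPACE="carto"
KUBERNETES_SERVICE_ACCOUNT="carto-common-backend"

# The workload identity pool is derived from the project ID.
WORKLOAD_POOL="${PROJECT_ID}.svc.id.goog"

# Member name that IAM uses to authenticate the Kubernetes service account.
MEMBER="serviceAccount:${WORKLOAD_POOL}[${KUBERNETES_NAMESPACE}/${KUBERNETES_SERVICE_ACCOUNT}]"

echo "$MEMBER"
```

Keep the member name quoted whenever it is passed to a command, because the square brackets are otherwise treated as a glob pattern by most shells.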
The process of configuring Workload Identity includes using an IAM policy binding to bind the Kubernetes service account member name to an IAM service account that has the permissions your workloads need. Any Google Cloud API calls from workloads that use this Kubernetes service account are authenticated as the bound IAM service account.
Configure CARTO deployment to use Workload Identity
In order to enable Workload Identity in your CARTO Self-Hosted installation, you'll have to follow these steps:
Create an IAM service account for your application, or use an existing IAM service account instead.
IAM_SERVICE_ACCOUNT_NAME: name of the new service account.
PROJECT_ID: ID of the project where the GKE cluster is deployed.
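As a sketch of this step, the service account can be created with the gcloud CLI. The name and project ID below are hypothetical placeholders, and the command is printed rather than executed so that it can be reviewed first.

```shell
# Hypothetical placeholder values; replace them with your own.
IAM_SERVICE_ACCOUNT_NAME="carto-selfhosted"
PROJECT_ID="my-gcp-project"

# Dry run: prints the command. Remove the leading "echo" to execute it.
echo gcloud iam service-accounts create "$IAM_SERVICE_ACCOUNT_NAME" \
  --project="$PROJECT_ID"

# Service account emails follow the pattern NAME@PROJECT_ID.iam.gserviceaccount.com.
IAM_SERVICE_ACCOUNT_EMAIL="${IAM_SERVICE_ACCOUNT_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"
echo "$IAM_SERVICE_ACCOUNT_EMAIL"
```

The derived email is the value used as IAM_SERVICE_ACCOUNT_EMAIL in the remaining steps.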
The service account needs the roles/iam.serviceAccountTokenCreator role to sign URLs. You can grant it with this command:
IAM_SERVICE_ACCOUNT_NAME: name of the new service account used in the previous step.
IAM_SERVICE_ACCOUNT_EMAIL: email of the service account generated with the previous command.
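A minimal sketch of the grant follows. It assumes the role is bound on the service account itself, so that the account can sign on its own behalf; your installation may instead require a project-level binding. The email is a hypothetical placeholder.

```shell
# Hypothetical placeholder value; replace it with your service account email.
IAM_SERVICE_ACCOUNT_EMAIL="carto-selfhosted@my-gcp-project.iam.gserviceaccount.com"
ROLE="roles/iam.serviceAccountTokenCreator"

# Assumption: the service account is both the resource and the member.
TOKEN_CREATOR_MEMBER="serviceAccount:${IAM_SERVICE_ACCOUNT_EMAIL}"

# Dry run: prints the command. Remove the leading "echo" to execute it.
echo gcloud iam service-accounts add-iam-policy-binding "$IAM_SERVICE_ACCOUNT_EMAIL" \
  --member="$TOKEN_CREATOR_MEMBER" \
  --role="$ROLE"
```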
Email the CARTO Support Team at support@carto.com to let us know which service account you want to use for Workload Identity. We will ensure that your service account is granted the required roles to run CARTO Self-Hosted.
IMPORTANT: You cannot change the Service Account without contacting support.
Add the following lines to your customizations.yaml file:
IAM_SERVICE_ACCOUNT_EMAIL: email of the service account generated in the first step.
The chart lets you disable creation of the commonBackendServiceAccount account by setting commonBackendServiceAccount.create: false, but this is not compatible with enableGCPWorkloadIdentity: true.
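The customizations.yaml fragment for this step might look like the output of the sketch below. The commonBackendServiceAccount and enableGCPWorkloadIdentity keys come from the text above, and iam.gke.io/gcp-service-account is the standard GKE annotation for linking a Kubernetes service account to an IAM service account. The annotations key and the overall layout are assumptions, so check them against your chart's values before use. The function name and email are hypothetical.

```shell
# Hypothetical placeholder value; replace it with your service account email.
IAM_SERVICE_ACCOUNT_EMAIL="carto-selfhosted@my-gcp-project.iam.gserviceaccount.com"

# Prints a YAML fragment to merge by hand into customizations.yaml.
# Assumption: the chart accepts an "annotations" map under
# commonBackendServiceAccount.
render_values_fragment() {
  cat <<EOF
commonBackendServiceAccount:
  enableGCPWorkloadIdentity: true
  annotations:
    iam.gke.io/gcp-service-account: "${IAM_SERVICE_ACCOUNT_EMAIL}"
EOF
}

render_values_fragment
```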
Allow the Kubernetes service account that is going to be created in your GKE cluster to impersonate the IAM service account by adding an IAM policy binding between the two service accounts. This binding allows the Kubernetes service account to act as the IAM service account.
IAM_SERVICE_ACCOUNT_EMAIL: email of the service account generated in the first step.
PROJECT_ID: ID of the project where the GKE cluster is deployed.
KUBERNETES_NAMESPACE: namespace where the CARTO application is deployed.
KUBERNETES_SERVICE_ACCOUNT: name of the Kubernetes service account used by the CARTO application. The default value is carto-common-backend.
You can find the gcloud command with the KUBERNETES_NAMESPACE and KUBERNETES_SERVICE_ACCOUNT values in the helm output notes once you execute the installation process.
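For orientation, the binding usually takes the following shape. This is a sketch rather than the exact command printed in the helm notes: it assumes the standard roles/iam.workloadIdentityUser role that GKE uses for this kind of binding, and the project, namespace, and email are hypothetical placeholders. Prefer the command from the helm output when the two differ.

```shell
# Hypothetical placeholder values; replace them with your own.
IAM_SERVICE_ACCOUNT_EMAIL="carto-selfhosted@my-gcp-project.iam.gserviceaccount.com"
PROJECT_ID="my-gcp-project"
KUBERNETES_NAMESPACE="carto"
KUBERNETES_SERVICE_ACCOUNT="carto-common-backend"

# Member name of the Kubernetes service account in the workload identity pool.
BINDING_MEMBER="serviceAccount:${PROJECT_ID}.svc.id.goog[${KUBERNETES_NAMESPACE}/${KUBERNETES_SERVICE_ACCOUNT}]"

# Dry run: prints the command. Remove the leading "echo" to execute it.
echo gcloud iam service-accounts add-iam-policy-binding "$IAM_SERVICE_ACCOUNT_EMAIL" \
  --role="roles/iam.workloadIdentityUser" \
  --member="$BINDING_MEMBER"
```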
Create a BigQuery connection managed using Workload Identity
CARTO Self-Hosted running on a GKE cluster can use the GKE Workload Identity feature to create a connection between the CARTO Self-Hosted platform and BigQuery without any user action.
Configuration
Set up GKE Workload Identity for CARTO Self-Hosted following the documentation.
Grant your Workload Identity service account the required BigQuery permissions on your data warehouse project.
Add the following environment variables in your customizations.yaml file:
WORKFLOWS_TEMP_LOCATION: BigQuery dataset ID used for storing temporary tables (e.g. my_gcp_project.my_dataset).
BILLING_PROJECT_ID: GCP project to be charged with the BigQuery costs.
WORKLOAD_IDENTITY_SA_EMAIL: service account email configured for Workload Identity.
CARTO_OWNER_ID: ID of the CARTO user who will be the owner of the connection (e.g. "auth0|3idsj230990sj4wsddd10"). This can be obtained by running the following curl command:
Follow the previous command output and grant the service account the following role:
WORKLOAD_IDENTITY_SA_EMAIL: service account email configured for Workload Identity.
PROJECT_ID: ID of the project where the GKE cluster is deployed.
KUBERNETES_NAMESPACE: namespace where the CARTO application is deployed.
Once you've applied the changes made in your customizations.yaml file, your CARTO deployment will automatically create a new BigQuery connection using Workload Identity, owned by the CARTO user specified in the deployment configuration.