Use Workload Identity in GCP

What is Workload Identity?

Applications running on Google Kubernetes Engine (GKE) might need access to Google Cloud APIs such as the Compute Engine API, the BigQuery API, or the Cloud Storage APIs.

Workload Identity allows a Kubernetes service account in your GKE cluster to act as an IAM service account. Pods that use the configured Kubernetes service account automatically authenticate as the IAM service account when accessing Google Cloud APIs. Using Workload Identity allows you to assign distinct, fine-grained identities and authorization for each application in your cluster.

In a Self-Hosted installation, Workload Identity is only available for the orchestrated container deployment of CARTO.

How does Workload Identity work?

When you enable Workload Identity on a cluster, GKE automatically creates a fixed workload identity pool for the cluster's Google Cloud project. A workload identity pool allows IAM to understand and trust Kubernetes service account credentials. GKE uses this pool for all clusters in the project that use Workload Identity. The workload identity pool has the following format:

PROJECT_ID.svc.id.goog

When you configure a Kubernetes service account in a namespace to use Workload Identity, IAM authenticates the credentials using the following member name:

serviceAccount:PROJECT_ID.svc.id.goog[KUBERNETES_NAMESPACE/KUBERNETES_SERVICE_ACCOUNT]

In this member name:

  • PROJECT_ID: your Google Cloud project ID.

  • KUBERNETES_NAMESPACE: the namespace of the Kubernetes service account.

  • KUBERNETES_SERVICE_ACCOUNT: the name of the Kubernetes service account making the request.
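As a minimal sketch, the member name can be assembled in a shell script from its three parts. The function name workload_identity_member and the example values (my-project, carto) are illustrative assumptions, not values taken from a real deployment:

```shell
# Build the IAM member name for a Kubernetes service account.
# Arguments: PROJECT_ID, KUBERNETES_NAMESPACE, KUBERNETES_SERVICE_ACCOUNT
# (the function name and the sample values below are hypothetical)
workload_identity_member() {
  printf 'serviceAccount:%s.svc.id.goog[%s/%s]\n' "$1" "$2" "$3"
}

# Example with placeholder values
MEMBER="$(workload_identity_member "my-project" "carto" "carto-common-backend")"
echo "$MEMBER"
```

Keep the resulting string quoted whenever you pass it to gcloud, because the square brackets would otherwise be interpreted by the shell.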

The process of configuring Workload Identity includes using an IAM policy binding to bind the Kubernetes service account member name to an IAM service account that has the permissions your workloads need. Any Google Cloud API calls from workloads that use this Kubernetes service account are authenticated as the bound IAM service account.

Configure CARTO deployment to use Workload Identity

To enable Workload Identity in your CARTO Self-Hosted installation, follow these steps:

  1. Create an IAM service account for your application, or use an existing IAM service account instead.

gcloud iam service-accounts create {IAM_SERVICE_ACCOUNT_NAME} \
    --project={PROJECT_ID}
  • IAM_SERVICE_ACCOUNT_NAME: name of the new service account.

  • PROJECT_ID: ID of the project where the GKE cluster is deployed.

The service account needs the roles/iam.serviceAccountTokenCreator role to sign URLs. You can grant it with this command:

gcloud iam service-accounts add-iam-policy-binding \
  {IAM_SERVICE_ACCOUNT_EMAIL} \
  --member=serviceAccount:{IAM_SERVICE_ACCOUNT_EMAIL} \
  --role=roles/iam.serviceAccountTokenCreator
  • IAM_SERVICE_ACCOUNT_NAME: name of the service account created in the previous step.

  • IAM_SERVICE_ACCOUNT_EMAIL: email of the service account created in the previous step.
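If you only have the service account name at hand, the email can usually be derived from it. The sketch below assumes a user-managed service account in a project without a domain prefix, where the address follows the NAME@PROJECT_ID.iam.gserviceaccount.com pattern. The function name and the sample values are hypothetical:

```shell
# Derive the email address of a user-managed IAM service account.
# Arguments: IAM_SERVICE_ACCOUNT_NAME, PROJECT_ID
# (assumes the standard NAME@PROJECT_ID.iam.gserviceaccount.com pattern)
iam_service_account_email() {
  printf '%s@%s.iam.gserviceaccount.com\n' "$1" "$2"
}

# Example with placeholder values
IAM_SERVICE_ACCOUNT_EMAIL="$(iam_service_account_email "carto-workload" "my-project")"
echo "$IAM_SERVICE_ACCOUNT_EMAIL"
```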

  2. Email the CARTO Support Team (support@carto.com) to let us know which Service Account you want to use for Workload Identity. We will ensure that your Service Account is granted the roles required to run CARTO Self-Hosted.

IMPORTANT: You cannot change the Service Account without contacting support.

  3. Add the workload identity email in your Admin Console:

  4. Allow the Kubernetes service account that is going to be created in your GKE cluster to impersonate the IAM service account by adding an IAM policy binding between the two service accounts. This binding allows the Kubernetes service account to act as the IAM service account.

gcloud iam service-accounts add-iam-policy-binding {IAM_SERVICE_ACCOUNT_EMAIL} \
    --role roles/iam.workloadIdentityUser \
    --member "serviceAccount:{PROJECT_ID}.svc.id.goog[{KUBERNETES_NAMESPACE}/{KUBERNETES_SERVICE_ACCOUNT}]"
  • IAM_SERVICE_ACCOUNT_EMAIL: email of the service account generated in the first step.

  • PROJECT_ID: ID of the project where the GKE cluster is deployed.

  • KUBERNETES_NAMESPACE: namespace where the CARTO application is deployed.

  • KUBERNETES_SERVICE_ACCOUNT: name of the Kubernetes service account used by the CARTO application. The default value is carto-common-backend.

After you run the installation process, the Helm output notes include this gcloud command with the KUBERNETES_NAMESPACE and KUBERNETES_SERVICE_ACCOUNT values already filled in.
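To confirm that the binding was applied, you can inspect the IAM policy of the service account. The sketch below is only an illustration: the helper policy_has_member is a hypothetical name, the policy JSON is a hand-written sample rather than real output, and the check only looks for the member string without verifying which role it is bound to.

```shell
# Succeed if the IAM policy JSON read from stdin mentions the given member.
# (hypothetical helper: a plain fixed-string match, it does not check the role)
policy_has_member() {
  grep -qF "\"$1\""
}

# Placeholder member and hand-written sample policy, for illustration only
MEMBER='serviceAccount:my-project.svc.id.goog[carto/carto-common-backend]'
SAMPLE_POLICY='{"bindings":[{"members":["serviceAccount:my-project.svc.id.goog[carto/carto-common-backend]"],"role":"roles/iam.workloadIdentityUser"}]}'

if printf '%s\n' "$SAMPLE_POLICY" | policy_has_member "$MEMBER"; then
  RESULT="binding found"
else
  RESULT="binding missing"
fi
echo "$RESULT"

# Against a real service account (requires gcloud and valid credentials):
# gcloud iam service-accounts get-iam-policy "{IAM_SERVICE_ACCOUNT_EMAIL}" --format=json \
#   | policy_has_member "$MEMBER"
```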

Create a BigQuery connection managed using Workload Identity

CARTO Self-Hosted running on a GKE cluster can use the GKE Workload Identity feature to create a connection between the CARTO Self-Hosted platform and BigQuery without any user action.

Configuration

  1. Set up GKE Workload Identity for CARTO Self-Hosted following the documentation.

  2. Grant your Workload Identity service account the required BigQuery permissions on your data warehouse project.

  3. Enable the BigQuery workload identity connection in your Admin Console:

In the CARTO connection owner ID field, enter the ID of the CARTO user who will own the connection (e.g. "auth0|3idsj230990sj4wsddd10"). You can obtain it by running the following curl command:

curl -s 'https://accounts.app.carto.com/users/me' \
  -H 'Authorization: Bearer <your_carto_jwt_token>' \
  | jq '.user_id'

Once you've applied the changes made in your customizations.yaml file, your CARTO deployment automatically creates a new BigQuery connection that uses Workload Identity and is owned by the CARTO user specified in the deployment configuration.
