Google BigQuery
Last updated
Last updated
CARTO can connect to your BigQuery Data Warehouse, allowing you to use your data for building Maps, Workflows and custom applications. There are three ways to set up a connection to Google BigQuery.
Recommended methods:
Sign in with Google: Connect your own Google account and use all the Google BigQuery permissions that you have access to, with the possibility of enforcing viewer credentials. This method is ideal if you want to use in CARTO exactly the same permissions that you have in your BigQuery console. This method is also called OAuth 2.0.
Workload Identity Federation: Leverage CARTO identities directly in Google Cloud Platform, with permissions being granted via IAM to a Workload Identity Pool, previously configured by your GCP administrators. This method is ideal if you want to use granular, restricted permissions exclusively for CARTO.
Other methods
Service Account: These are a set of credentials (a key in JSON format) generated in Google Cloud, representing a set of permissions for a database or a project, not associated with an individual. This is likely a quicker and more flexible solution for testing, but the other methods represent a more secure strategy for production environments.
Please make sure that your credentials (regardless of the method used) have the necessary permissions for CARTO to run. For more information, see Required BigQuery Permissions.
In all methods you will need to indicate a billing project. All queries performed by CARTO will use the billing account associated with the selected billing project. We recommend you review the different BigQuery pricing models, and more importantly, configure specific limits in BigQuery to avoid any unexpected charges.
CARTO is a fully cloud-native platform that runs queries on your behalf to power maps, workflows, etc. We never create or maintain any copies of your data.
To connect CARTO and BigQuery using your Google account simply click the Continue with Google button. This will open a Google login flow that will request the necessary scopes for CARTO to connect to your BigQuery data.
After allowing CARTO to access your Google BigQuery data, you will see a form where you'll specify the remaining details for this connection:
Name: This will be the name used to identify this connection across CARTO. It needs to be unique and there are special format rules: 3-50 characters long, containing only lowercase letters and numbers. Dashes and underscores are allowed if they're not leading or trailing.
Billing project: All queries performed by CARTO will run against this Google Cloud Platform project, and its associated billing account.
When using OAuth-based connections (such as this "Continue with Google"), you might be asked to reconnect at any time. It could happen, for example, after a few months or after changing your password. The reason is that this type of connections are linked to your Google account consent to CARTO, which you can also revoke at any moment.
Please make sure your Google account has adequate permissions for CARTO, at least at the billing project level. Learn more at Required BigQuery Permissions.
CARTO can connect to BigQuery by leveraging Workload Identity Federation.
Initial setup required
To connect to BigQuery using this method, the organization admin must first set up a Workload Identity Federation integration in CARTO. Once this is done, it will be available to all users within the organization. Read more about setting up a Workload Identity Federation for BigQuery integration.
To use it in your connections (and after your administator has completed the integration) simply click on Connect with Workload Identity Federation in the connection method selection screen. A new connection form will appear, and you'll need to complete the setup for your connection.
Name: This will be the name used to identify this connection across CARTO. It needs to be unique and there are special format rules: 3-50 characters long, containing only lowercase letters and numbers. Dashes and underscores are allowed if they're not leading or trailing.
IAM Principal: this is the identity that this connection will use in Google Cloud. Use this IAM directly in Google Cloud to grant permissions to this connection.
Service Account email for impersonation (optional): if you want to impersonate a Service Account using this connection, enter the service account email here. This assumes the IAM Principal has permissions to impersonate the service account.
Billing project: All queries performed by CARTO will run against this Google Cloud Platform project, and its associated billing account.
Please make sure that your Workload Identity Federation IAM Principal has adequate permissions in Google BigQuery for CARTO, at least at the billing project level. Learn more at Required BigQuery Permissions. If you're not sure, check with the Google Cloud administrator in your organization that integrated CARTO via Workload Identity Federation.
If you select Connect using a Service Account, you'll see a form where you'll specify the details for this connection:
Name: This will be the name used to identify this connection across CARTO. It needs to be unique and there are special format rules: 3-50 characters long, containing only lowercase letters and numbers. Dashes and underscores are allowed if they're not leading or trailing.
Service account key: The credentials file in JSON format. Please read the following instructions to learn how to create a service account and a service account key file in Google Cloud.
Billing project: All queries performed by CARTO will run against this Google Cloud Platform project, and its associated billing account.
Please make sure the Service Account has adequate permissions for CARTO, at least at the billing project level. Learn more at Required BigQuery Permissions.
Connections to Google BigQuery using both OAuth and Workload Identity Federation can be set up to require viewer credentials. This means that when the connection is shared, other users trying to access it will have to provide their own credentials to use it, instead of using the credentials (token) of the user that created the connection.
For more information, see Requiring viewing credentials for shared connections.
When you share Workload Identity Federation connections in "Requiring Viewer Credentials" mode, users will automatically use their own identity, and do not need to provide any additional information.
When creating the connection, CARTO will check that you have a minimum set of permissions that will allow the connection to operate with CARTO. These checks are performed at the Billing-project level.
You can then granularly specify a different set of permissions for each resource. For example, the connection could have edit permissions in some tables but read-only in others. Please note that you can give limited and granular permissions to resources in completely different projects than the billing project. We call this Resource level.
For each area + resource combination, connection credentials must have at least the “Minimum permissions” to work. Some optional features may require additional permissions to work as expected.
For the best experience in CARTO, we advise you to set up the “Recommended role”:
bigquery.dataEditor
bigquery.user
CARTO requires the following permissions at the billing project to connect to BigQuery:
Billing project (as specified in the connection)
bigquery.dataEditor
bigquery.user
bigquery.jobs.list
bigquery.jobs.create
resourcemanager.projects.get
CARTO requires the following permissions for each BigQuery resource in order to operate with those resources, such as projects, datasets, or tables.
Listing projects, datasets and tables in CARTO (Data Explorer)
bigquery.dataEditor
bigquery.user
resourcemanager.projects.get
resourcemanager.projects.list bigquery.tables.list
Projects, datasets, and tables used for map visualization (Builder)
bigquery.dataEditor
bigquery.user
bigquery.jobs.create
bigquery.tables.list
Projects, datasets, and tables used for spatial analysis (Workflows)
bigquery.dataEditor
bigquery.user
bigquery.jobs.create
bigquery.jobs.list
bigquery.tables.list
bigquery.tables.create
bigquery.datasets.create
bigquery.datasets.get
Projects, datasets, and tables used in custom applications (CARTO for Developers)
bigquery.dataEditor
bigquery.user
bigquery.jobs.create
You can also check our (more generic) guide about why CARTO requires each permission, with examples on setting different connections for different teams.
Analytics Toolbox location: This setting controls the location of the Analytics Toolbox used in SQL queries generated by Workflows components, Builder SQL Analyses, 'Create Tileset', 'Geocode Table' and 'Enrich Data' functionalities. By default, CARTO will automatically determine the corresponding AT Location based on the actual region of the data.
Data Observatory location: This settings controls the location of the Data Observatory subscriptions. This setting will be observed by Data Explorer, Workflows and Enrichment to access your data subscriptions.
By default, a specific project for your account (created automatically and maintained by CARTO) will be used. For example carto-data.ac_xxxxxxxx
Workflows temp. location: This setting controls the location (project.dataset) where Workflows will create temporal tables for each node. By default, it's a carto
dataset that will be created in the connection's project during the execution of a workflow. Learn more about it here.
Data Transfer Version Info: This setting is only necessary for Scheduling Workflows with 'Sign in with Google' connections. Learn about how to generate the code here.
Max number of concurrent queries: This setting controls the maximum number of simultaneous queries that CARTO will send to BigQuery using this connection.
Max query timeout: This setting controls the maximum allowed duration of queries that CARTO runs in BigQuery using this connection.
If you're using the cloud version of CARTO (SaaS), CARTO will connect to BigQuery using a set of static IPs for each region. Check this guide to find the IPs you need to allow for your specific region.
What is Workload Identity Federation and how does it work?
When you complete the integration, identities coming from CARTO (ie: CARTO users) will be recognized in your Google Cloud account as part of a Workload Identity Pool. You can then assign granular permissions to each of those CARTO users, using IAM rules directly in Google Cloud.
For example, you will be able to grant access to a BigQuery project for a specific set of CARTO users, or grant a specific permission to a specific group inside CARTO.
Learn more about Workload Identity Federation in Google Cloud.
Workload Identity Federation is recommended over using Service Account keys. Google discourages the use of Service Account keys whenever possible, and authenticating your connections via Workload Identitiy Federation can provide the same flexibility.