# Configure your own buckets

## 1. Overview

Every CARTO Self-Hosted installation requires **two cloud storage buckets** (plus an optional third for data exports) to handle data and assets used by the platform:

<table><thead><tr><th>Purpose</th><th width="152.7734375">Mandatory?</th><th width="358.0859375">Description</th><th>Example contents</th></tr></thead><tbody><tr><td><strong>1. Temp Bucket</strong></td><td>Yes</td><td>Used to upload and import datasets into CARTO.</td><td><code>.csv</code>, <code>.geojson</code>, <code>.zip</code></td></tr><tr><td><strong>2. Assets Bucket</strong></td><td>Yes</td><td>Stores generated map thumbnails and customization assets (logos, markers, etc.).</td><td><code>.png</code></td></tr><tr><td><strong>3. Export Bucket</strong></td><td>No (optional)</td><td>Used for exporting data from your data warehouse (BigQuery, Snowflake, Redshift, Databricks, Oracle...). Create this only if you plan to use data export features.</td><td><code>.csv</code>, <code>.json</code>, <code>.parquet</code></td></tr></tbody></table>

You can create and use your own storage buckets in any of the following supported storage providers:

* [Google Cloud Storage](https://cloud.google.com/storage)
* [AWS S3](https://aws.amazon.com/s3/)
* [Azure Blob Storage](https://azure.microsoft.com/es-es/products/storage/blobs/)

## 2. Prerequisites

### 2.1. Create the required buckets

1. **Temp Bucket**
2. **Assets Bucket**
3. **Exports Bucket** (optional)

> There are no naming constraints for these buckets.

## 3. Configuration notes

### 3.1. Thumbnails bucket access

* This bucket stores the `.png` image files used for map previews.
* **Bucket naming**: There are no specific naming constraints for this bucket; you can use any name that complies with your cloud provider's rules.
* **Access control**: By default, all **thumbnail objects are configured to be publicly accessible**. This ensures that features like custom branding and map markers function correctly. You can change this in the Admin Console using the "**Assets bucket is Public**" setting.

<figure><img src="https://3029946802-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FybPdpmLltPkzGFvz7m8A%2Fuploads%2Fgit-blob-c46e961bc6e16e114c2c632dba2dfb2145a49413%2Fimage%20(770).png?alt=media" alt="" width="563"><figcaption></figcaption></figure>

{% hint style="warning" %}
Important: setting the bucket to private will disable the custom markers feature, since it relies on public image assets.
{% endhint %}

To maintain a private bucket while still enabling these features, you must configure your bucket with a hybrid access model that meets the following conditions:

1. **Allow public objects**: The bucket policy must allow individual objects to be made public, even if the bucket itself is private by default.
2. **Enable fine-grained ACLs**: You must enable Access Control Lists (ACLs), sometimes referred to as "non-uniform permissions." This allows CARTO to set `public-read` permissions on specific thumbnail files while all other objects remain private.
3. **Disable server-side encryption**: Bucket-level server-side encryption is often incompatible with setting public ACLs and must be disabled.
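
For AWS S3, the first two conditions can be sketched with the AWS CLI. The bucket name below is a placeholder, and the commands are written to a script file so you can review them before running with valid AWS credentials:

```shell
# Sketch for AWS S3; "carto-assets" is a placeholder bucket name.
# Review, then run this script with an authenticated AWS CLI session.
cat > hybrid-access.sh <<'EOF'
#!/bin/sh
BUCKET="carto-assets"

# 1. Allow individual objects to be made public:
#    relax the public access block on this bucket.
aws s3api put-public-access-block \
  --bucket "$BUCKET" \
  --public-access-block-configuration \
  "BlockPublicAcls=false,IgnorePublicAcls=false,BlockPublicPolicy=false,RestrictPublicBuckets=false"

# 2. Enable fine-grained (non-uniform) ACLs: "ObjectWriter" ownership
#    re-enables per-object ACLs such as public-read.
aws s3api put-bucket-ownership-controls \
  --bucket "$BUCKET" \
  --ownership-controls 'Rules=[{ObjectOwnership=ObjectWriter}]'
EOF
```

Condition 3 depends on your provider's bucket-level encryption settings and is not shown here.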

### 3.2. Export bucket provider requirements

Optionally, you can create an Exports bucket to enable data exports from your data warehouse. The required storage provider depends on the warehouse:

| Data Warehouse              | Required Storage Provider |
| --------------------------- | ------------------------- |
| **BigQuery**                | Google Cloud Storage      |
| **Snowflake**               | AWS S3                    |
| **Redshift**                | AWS S3                    |
| **Amazon RDS (PostgreSQL)** | AWS S3                    |
| **Oracle**                  | AWS S3                    |

### 3.3. CORS configuration

You need to set up a CORS policy for the Assets and Temp buckets; this is mandatory for them to work as expected. Below you'll find the details to configure the policy:

<table><thead><tr><th width="189.984375">Setting</th><th>Value</th></tr></thead><tbody><tr><td>Allowed origins</td><td><code>*</code></td></tr><tr><td>Allowed methods</td><td><code>GET, PUT, POST</code></td></tr><tr><td>Allowed headers (common)</td><td><code>Content-Type, Content-MD5, Content-Disposition, Cache-Control</code></td></tr><tr><td>GCS extra headers (only for Google Cloud)</td><td><code>x-goog-content-length-range, x-goog-meta-filename</code></td></tr><tr><td>Azure Blob extra headers (only for Azure)</td><td><code>Access-Control-Request-Headers, X-MS-Blob-Type</code></td></tr><tr><td>Max age</td><td><code>3600</code></td></tr></tbody></table>

CORS configuration location:

* **GCS / S3:** Bucket level
* **Azure Blob:** Storage account level

For more details, refer to:

* [Google Cloud Storage](https://cloud.google.com/storage)
* [AWS S3](https://aws.amazon.com/s3/)
* [Azure Blob Storage](https://azure.microsoft.com/es-es/products/storage/blobs/)

### 3.4. Authentication requirements

Access to the buckets from the CARTO platform requires authentication. The available authentication methods for each provider are:

| Provider       | Auth Method                       |
| -------------- | --------------------------------- |
| **GCS**        | Service Account Key               |
| **AWS S3**     | Access Key ID + Secret Access Key |
| **Azure Blob** | Access Key                        |

{% hint style="info" %}
If you can't set up Service Account Keys, Access Keys, or Secret Access Keys due to security constraints or other reasons, you can use GCP Workload Identity or EKS Pod Identity with the Advanced Orchestrated Deployment method (Helm).
{% endhint %}

## 4. Configuration

Select your preferred storage provider:

<figure><img src="https://3029946802-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FybPdpmLltPkzGFvz7m8A%2Fuploads%2Fgit-blob-01ad702104126827eb6a3e0d9aea1da5967bdc21%2FScreenshot%202024-03-04%20at%2011.48.50.png?alt=media" alt=""><figcaption></figcaption></figure>

Once you've made your selection, configure your storage settings by completing the fields described below:

### 4.1 Google Cloud Storage

When configuring Google Cloud Storage as your storage provider, you'll have to:

1. Create the following buckets in GCS:
   * **Assets Bucket**
   * **Temp Bucket**
   * **Data export Bucket** (optional, in case you'd like to allow exporting data from your data warehouse)

{% hint style="info" %}
Custom markers won't work unless the assets bucket is public.
{% endhint %}

2. Configure CORS: the Temp and Assets buckets require the following CORS configuration:

```json
[
    {
      "origin": ["*"],
      "method": ["GET", "PUT", "POST"],
      "responseHeader": ["Content-Type", "Content-MD5", "Content-Disposition", "Cache-Control", "x-goog-content-length-range", "x-goog-meta-filename"],
      "maxAgeSeconds": 3600
    }
]
```

{% hint style="info" %}
How do I set up the CORS configuration? Check the [provider docs](https://cloud.google.com/storage/docs/using-cors).
{% endhint %}
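
One way to apply the configuration above is with `gsutil`. A minimal sketch, where the bucket names are placeholders and the apply step needs an authenticated gcloud session:

```shell
# Save the CORS rules shown above to a local file.
cat > cors.json <<'EOF'
[
    {
      "origin": ["*"],
      "method": ["GET", "PUT", "POST"],
      "responseHeader": ["Content-Type", "Content-MD5", "Content-Disposition", "Cache-Control", "x-goog-content-length-range", "x-goog-meta-filename"],
      "maxAgeSeconds": 3600
    }
]
EOF

# Apply to the Temp and Assets buckets (placeholder names);
# uncomment once authenticated with gcloud:
# gsutil cors set cors.json gs://<your-temp-bucket>
# gsutil cors set cors.json gs://<your-assets-bucket>
```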

3. Ensure that the identity used to access your GCS buckets has read/write permissions on all of them. It should have the Storage Admin role over the buckets that will be used.
4. Provide the Project ID of the Google Cloud Platform (GCP) project where your GCS buckets are located.
5. Specify the names of the GCS buckets that your application will be using. This allows your application to target the specific buckets for storing and retrieving data.
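
The Storage Admin grant from step 3 can be made per bucket with `gsutil iam ch`. A sketch, where the service account email and bucket names are placeholders; the script is written to a file for review before running with an authenticated gcloud session:

```shell
cat > grant-storage-admin.sh <<'EOF'
#!/bin/sh
# Placeholder identity and bucket names; requires an authenticated gcloud session.
SA="carto-selfhosted@my-project.iam.gserviceaccount.com"
for BUCKET in my-carto-temp my-carto-assets my-carto-exports; do
  # Grant the Storage Admin role on each bucket individually.
  gsutil iam ch "serviceAccount:${SA}:roles/storage.admin" "gs://${BUCKET}"
done
EOF
```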

### 4.2 AWS S3

When configuring AWS S3 as your storage provider, you'll have to:

1. Create 3 buckets in your AWS S3 account:
   * **Assets Bucket**
   * **Temp Bucket**
   * **Data export Bucket** (optional in case you'd like to allow exporting data from your data warehouse)

{% hint style="info" %}
Custom markers won't work unless the assets bucket is public.
{% endhint %}

{% hint style="danger" %}
When creating your buckets, please check that:

* ACLs are allowed.
* If server-side encryption is enabled, the user must be granted permissions over the KMS key, following the [AWS documentation](https://repost.aws/knowledge-center/s3-bucket-access-default-encryption).
{% endhint %}

2. Configure CORS: the Temp and Assets buckets require the following CORS configuration:

```json
[
    {
        "AllowedHeaders": [
            "Content-Type",
            "Content-MD5",
            "Content-Disposition",
            "Cache-Control"
        ],
        "AllowedMethods": [
            "PUT",
            "POST",
            "GET"
        ],
        "AllowedOrigins": [
            "*"
        ],
        "ExposeHeaders": [],
        "MaxAgeSeconds": 3600
    }
]
```

{% hint style="info" %}
How do I set up the CORS configuration? Check the [provider docs](https://docs.aws.amazon.com/AmazonS3/latest/userguide/enabling-cors-examples.html).
{% endhint %}
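
One way to apply the configuration above is with the AWS CLI. A minimal sketch, where the bucket names are placeholders and the apply step needs valid AWS credentials:

```shell
# Save the CORS rules shown above to a local file.
cat > cors.json <<'EOF'
[
    {
        "AllowedHeaders": ["Content-Type", "Content-MD5", "Content-Disposition", "Cache-Control"],
        "AllowedMethods": ["PUT", "POST", "GET"],
        "AllowedOrigins": ["*"],
        "ExposeHeaders": [],
        "MaxAgeSeconds": 3600
    }
]
EOF

# Apply to the Temp and Assets buckets (placeholder names);
# note the top-level CORSRules key required by put-bucket-cors.
# Uncomment once authenticated with the AWS CLI:
# aws s3api put-bucket-cors --bucket <your-temp-bucket> \
#   --cors-configuration "{\"CORSRules\": $(cat cors.json)}"
# aws s3api put-bucket-cors --bucket <your-assets-bucket> \
#   --cors-configuration "{\"CORSRules\": $(cat cors.json)}"
```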

3. Provide an Access Key ID and Secret Access Key that will be used to access your S3 buckets. You can generate these credentials through the AWS Management Console by creating an IAM user with appropriate permissions for accessing S3 resources.
4. Configure the region in which these buckets are located. All the buckets must be created in the same AWS region.
5. Specify the names of the AWS buckets that your application will be using. This allows your application to target the specific buckets for storing and retrieving data.

### 4.3 Configuration for Redshift

Create an AWS IAM role with the following settings:

1. Trusted entity type: `Custom trust policy`
2. Custom trust policy: Make sure to replace `<your_aws_user_arn>` with the ARN of the user whose Access Key has been configured in the CARTO deployment configuration

```json
{
  "Version": "2012-10-17",
  "Statement": [
      {
          "Effect": "Allow",
          "Principal": {
              "AWS": "<your_aws_user_arn>"
          },
          "Action": [
              "sts:AssumeRole",
              "sts:TagSession"
          ]
      }
  ]
}
```

3. Add permissions: Create a new permissions policy. Note that you can omit the export bucket statements if you don't want to enable exporting data from the CARTO platform.

```json
{
   "Version": "2012-10-17",
   "Statement": [
       {
           "Effect": "Allow",
           "Action": "s3:ListBucket",
           "Resource": "arn:aws:s3:::<your_aws_s3_data_export_bucket_name>"
       },
       {
           "Effect": "Allow",
           "Action": "s3:*Object",
           "Resource": "arn:aws:s3:::<your_aws_s3_data_export_bucket_name>/*"
       },
       {
           "Effect": "Allow",
           "Action": "s3:ListBucket",
           "Resource": "arn:aws:s3:::<your_aws_s3_temp_bucket_name>"
       },
       {
           "Effect": "Allow",
           "Action": "s3:*Object",
           "Resource": "arn:aws:s3:::<your_aws_s3_temp_bucket_name>/*"
       }
   ]
}
```

This role has permissions to use both the exports bucket and the temp bucket to store data that will be imported into Redshift. In order **to enable exporting data from Redshift you'll have to specify the ARN of the role and the name of the exports bucket** in the CARTO Self-Hosted configuration.

If you'd like to enable importing data into Redshift, it's not mandatory to provide the exports bucket's name, but you'll have to follow [these instructions](https://docs.carto.com/carto-user-manual/settings/advanced-settings/configuring-s3-bucket-for-redshift-imports) once the CARTO Self-Hosted deployment is ready.
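
The role above can also be created with the AWS CLI. A sketch, assuming you saved the trust policy as `trust-policy.json` and the permissions policy as `permissions-policy.json`; the role and policy names are placeholders, and the script needs valid AWS credentials:

```shell
cat > create-export-role.sh <<'EOF'
#!/bin/sh
# Placeholder role/policy names; assumes the two JSON documents shown
# above were saved as trust-policy.json and permissions-policy.json.
aws iam create-role \
  --role-name carto-redshift-export \
  --assume-role-policy-document file://trust-policy.json
aws iam put-role-policy \
  --role-name carto-redshift-export \
  --policy-name carto-redshift-export-access \
  --policy-document file://permissions-policy.json

# Print the role ARN to paste into the CARTO Self-Hosted configuration:
aws iam get-role --role-name carto-redshift-export \
  --query Role.Arn --output text
EOF
```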

### 4.4 Configuration for Snowflake

Create an AWS IAM role with the following settings:

1. Trusted entity type: `Custom trust policy`
2. Custom trust policy: Make sure to replace `<your_aws_user_arn>` with the ARN of the user whose Access Key has been configured in the CARTO deployment configuration

```json
{
  "Version": "2012-10-17",
  "Statement": [
      {
          "Effect": "Allow",
          "Principal": {
              "AWS": "<your_aws_user_arn>"
          },
          "Action": [
              "sts:AssumeRole",
              "sts:TagSession"
          ]
      }
  ]
}
```

3. Add permissions: Create a new permissions policy. Note that you can omit the export bucket statements if you don't want to enable exporting data from the CARTO platform.

```json
{
   "Version": "2012-10-17",
   "Statement": [
       {
           "Effect": "Allow",
           "Action": "s3:ListBucket",
           "Resource": "arn:aws:s3:::<your_aws_s3_data_export_bucket_name>"
       },
       {
           "Effect": "Allow",
           "Action": "s3:*Object",
           "Resource": "arn:aws:s3:::<your_aws_s3_data_export_bucket_name>/*"
       }
   ]
}
```

This role has permissions to use the exports bucket to store the data exported from Snowflake. In order **to enable exporting data from Snowflake you'll have to specify the ARN of the role and the name of the data export bucket** in the CARTO Self-Hosted configuration.

### 4.5 Azure Blob

When configuring Azure Blob as your storage provider, you'll have to:

1. Create 3 containers in your Azure Blob storage account:
   * **Assets Bucket**
   * **Temp Bucket**
   * **Data export Bucket** (optional in case you'd like to allow exporting data from your data warehouse)

{% hint style="info" %}
Custom markers won't work unless the assets bucket is public.
{% endhint %}

2. Configure CORS: the Temp and Assets buckets require the following CORS configuration:

```json
[
    {
      "origin": ["*"],
      "method": ["GET", "PUT", "POST"],
      "responseHeader": ["Content-Type", "Content-MD5", "Content-Disposition", "Cache-Control", "Access-Control-Request-Headers", "X-MS-Blob-Type"],
      "maxAgeSeconds": 3600
    }
]
```

{% hint style="info" %}
How do I set up the CORS configuration? Check the [provider docs](https://learn.microsoft.com/en-us/rest/api/storageservices/cross-origin-resource-sharing--cors--support-for-the-azure-storage-services#enabling-cors-for-azure-storage).
{% endhint %}
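
On Azure, CORS rules are set at the storage-account level (they apply to all containers). A sketch with the Azure CLI, where the account name is a placeholder; the command is written to a script for review before running with an authenticated `az` session:

```shell
cat > set-cors.sh <<'EOF'
#!/bin/sh
# Placeholder storage account name; requires an authenticated az session.
# --services b targets the Blob service.
az storage cors add \
  --account-name <your-storage-account> \
  --services b \
  --methods GET PUT POST \
  --origins '*' \
  --allowed-headers 'Content-Type,Content-MD5,Content-Disposition,Cache-Control,Access-Control-Request-Headers,X-MS-Blob-Type' \
  --max-age 3600
EOF
```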

3. Provide an Access Key that will be used to access your containers.
4. Specify the names of the containers that your application will be using. This allows your application to target the specific containers for storing and retrieving data.
