Importing rasters
Raster data plays a critical role in geospatial applications, enabling the visualization and analysis of spatial patterns, trends, and relationships. CARTO provides end-to-end support for raster data, allowing users to store, analyze, and render raster datasets directly in their cloud data warehouses.
This documentation outlines the complete process of preparing raster data, ensuring it meets CARTO’s specifications, and importing it into supported cloud data warehouses, including Google BigQuery, Snowflake, and Databricks.
Before you import raster data into your cloud data warehouse with CARTO, the data must meet specific format, tiling, and projection requirements.
Required specifications:
Format: Cloud Optimized GeoTIFF (COG)
Tiling Schema: Google Maps Tiling Schema
Projection: EPSG:4326
To process and prepare raster data efficiently, certain software dependencies must be installed, particularly Python and GDAL for geospatial data manipulation.
Check Python installation
Ensure that Python 3 is installed on your system. To verify, run python3 --version in a terminal.
If Python is not installed, download and install it from Python.org.
Set up a virtual environment (Recommended)
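Using a virtual environment helps manage dependencies and prevents conflicts with system-wide packages. Create one with python3 -m venv followed by a directory name of your choice, then activate it before installing anything (on Linux and macOS, by sourcing the bin/activate script inside that directory).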
Install GDAL (Python bindings):
Once the virtual environment is activated, install GDAL using pip:
This will install the necessary GDAL bindings for Python, allowing you to manipulate and process raster data.
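Before moving on, it can help to confirm that the setup steps above took effect. The following is a minimal sketch that uses only the Python standard library; the function name preflight and the labels it prints are illustrative choices, not part of CARTO or GDAL.

```python
import importlib.util
import shutil
import sys


def preflight():
    """Return a dict describing whether each prerequisite looks satisfied."""
    return {
        "python3": sys.version_info >= (3,),
        # Inside a virtual environment, sys.prefix differs from sys.base_prefix.
        "virtualenv_active": sys.prefix != sys.base_prefix,
        # The GDAL pip package exposes its Python bindings as the osgeo module.
        "gdal_bindings": importlib.util.find_spec("osgeo") is not None,
        # gdalinfo and gdalwarp are command-line tools that ship with GDAL itself.
        "gdalinfo_cli": shutil.which("gdalinfo") is not None,
        "gdalwarp_cli": shutil.which("gdalwarp") is not None,
    }


for name, ok in preflight().items():
    print(f"{name:18} {'yes' if ok else 'no'}")
```

Any line reporting "no" points at the step to revisit. Note that the command-line tools and the Python bindings are installed separately, so one can be present without the other.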
Use gdalinfo to get information and metadata about your file that will be useful for debugging and preparation.
NODATA values represent missing or invalid data within a raster file and can be inspected with the gdalinfo command. CARTO automatically ignores these values to ensure accurate analysis and visualization. Defining NODATA values properly is crucial, because they are not displayed on the map and are excluded from analytical queries.
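To read the NODATA value of each band programmatically, one option is to parse the JSON report that gdalinfo produces with its -json flag. The sketch below assumes gdalinfo is on your PATH when inspect_raster is called; the helper names are hypothetical, and the sample dictionary is a hand-written stand-in rather than real gdalinfo output.

```python
import json
import subprocess


def summarize(info):
    """Pick the fields most relevant to preparation out of gdalinfo's JSON report."""
    return {
        "driver": info.get("driverShortName"),
        "size": info.get("size"),
        # Bands without a declared NODATA value map to None.
        "nodata_by_band": {
            band["band"]: band.get("noDataValue") for band in info.get("bands", [])
        },
    }


def inspect_raster(path):
    """Run `gdalinfo -json` on a file and summarize the result."""
    out = subprocess.run(
        ["gdalinfo", "-json", path], check=True, capture_output=True, text=True
    ).stdout
    return summarize(json.loads(out))


# Hand-written stand-in for a gdalinfo report, used here so the example runs without GDAL.
sample_report = {
    "driverShortName": "GTiff",
    "size": [1024, 768],
    "bands": [{"band": 1, "noDataValue": -9999.0}, {"band": 2}],
}
print(summarize(sample_report))
```

A band that reports None here has no NODATA value declared, which is worth fixing before conversion if the raster contains missing pixels.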
It is advisable to reproject your raster to EPSG:4326 before converting it into a Cloud Optimized GeoTIFF. This is easily done with gdalwarp.
CARTO requires raster files to be in the Cloud Optimized GeoTIFF (COG) format. Use GDAL's gdalwarp tool to convert your raster to this format with the Google Maps tiling scheme, as in the example below:
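A minimal sketch of such a conversion follows, written as a Python helper that assembles the gdalwarp command line. It uses only the COG creation options discussed on this page (TILING_SCHEME, OVERVIEWS, ADD_ALPHA and RESAMPLING); the function name and the file names input.tif and output_cog.tif are placeholders assumed for illustration.

```python
import shlex


def cog_command(src, dst, resampling="NEAREST", add_alpha=False):
    """Build a gdalwarp argument list that writes a COG in the Google Maps tiling scheme."""
    return [
        "gdalwarp", src,
        "-of", "COG",
        "-co", "TILING_SCHEME=GoogleMapsCompatible",
        "-co", "OVERVIEWS=IGNORE_EXISTING",
        "-co", f"ADD_ALPHA={'YES' if add_alpha else 'NO'}",
        "-co", f"RESAMPLING={resampling}",
        dst,
    ]


# Print the command so it can be reviewed or pasted into a terminal.
print(shlex.join(cog_command("input.tif", "output_cog.tif")))
```

To run the command from Python instead of printing it, the same list can be passed to subprocess.run with check=True. Other creation options, such as compression, are left at GDAL's defaults in this sketch.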
Reprojecting a raster during the conversion to COG can introduce artifacts, especially when interpolating pixel values.
The RESAMPLING=NEAREST method is used to avoid these distortions by assigning the value of the nearest pixel rather than interpolating between multiple pixels. This method is particularly useful for categorical data (such as land cover classifications), where preserving exact pixel values is essential to maintain data integrity.
Other resampling methods exist, such as BILINEAR or CUBIC, but they are better suited for continuous data like elevation models or temperature maps, where smooth transitions between pixels are desirable.
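The difference is easy to see on a toy example. The sketch below resamples a single row of hypothetical land cover codes in pure Python; it is a simplified one-dimensional illustration of the two ideas, not GDAL's actual implementation.

```python
def resample_nearest(values, factor):
    """Upsample a 1-D row by repeating each source pixel `factor` times."""
    return [v for v in values for _ in range(factor)]


def resample_linear(values, n_out):
    """Upsample a 1-D row to n_out samples by linear interpolation."""
    step = (len(values) - 1) / (n_out - 1)
    out = []
    for i in range(n_out):
        pos = i * step
        left = min(int(pos), len(values) - 2)  # index of the left neighbour
        frac = pos - left                      # distance from the left neighbour
        out.append(values[left] * (1 - frac) + values[left + 1] * frac)
    return out


land_cover = [1, 1, 5, 5]  # hypothetical class codes: 1 = water, 5 = forest
print(resample_nearest(land_cover, 2))  # only the original codes appear
print(resample_linear(land_cover, 7))   # the midpoint becomes 3.0, a code that was never in the data
```

Nearest-neighbour output contains only codes 1 and 5, while interpolation invents an intermediate value at the class boundary. For a continuous variable such as elevation, that intermediate value would be a reasonable estimate rather than an error.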
Raster overviews, also known as pyramids, are lower-resolution copies of the original raster data stored within the file. Overviews allow CARTO to display raster data efficiently at different zoom levels by loading a lower-resolution version of the raster when the user is zoomed out, reducing processing and loading times.
By including the -co OVERVIEWS=IGNORE_EXISTING option, you ensure that overviews are generated afresh instead of being reused from the source file, allowing CARTO to request the appropriate resolution dynamically based on the zoom level. Without overviews, CARTO would have to load the full-resolution raster even at distant zoom levels, leading to slow rendering performance.
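To get a feel for how much smaller each overview is, the sketch below lists the dimensions of a pyramid in which every level halves the previous one. The stopping rule (stop once the largest side fits within 512 pixels) is an assumption made for illustration; the levels that GDAL actually writes depend on the COG driver's settings and the tiling scheme.

```python
def overview_sizes(width, height, min_size=512):
    """List (width, height) of successive half-resolution overviews."""
    sizes = []
    while max(width, height) > min_size:
        # Halve each side, rounding up, and never drop below one pixel.
        width = max(1, (width + 1) // 2)
        height = max(1, (height + 1) // 2)
        sizes.append((width, height))
    return sizes


for level, (w, h) in enumerate(overview_sizes(8192, 4096), start=1):
    print(f"overview {level}: {w} x {h} pixels")
```

Each level holds roughly a quarter of the pixels of the one before it, which is why a zoomed-out map can be drawn from a small fraction of the full-resolution data.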
When it comes to adding an alpha band to your COG, -co ADD_ALPHA=NO is the safer general option. In some cases, however, it is advisable to convert your NODATA values to an alpha band by using -co ADD_ALPHA=YES instead.
gdalwarp supports many other options when creating a COG. Take a look at the complete documentation of the COG driver to see all of them.
This section outlines the available methods for importing raster data into CARTO when using Google BigQuery and Snowflake as your data warehouse. The method you choose will depend on the file size and the level of control required during the upload process.
Available import methods:
Import interface: Best suited for smaller raster files (≤1GB) where advanced configuration is not necessary.
Raster Loader: Recommended for larger raster files or cases where more control is needed during the upload process.
The import interface is recommended for files smaller than 1GB (excluding Databricks). It is the most straightforward approach but has limitations on file size and complexity.
The CARTO Raster Loader is a Python utility that can import a COG raster file to Google BigQuery and Snowflake as a CARTO raster table.
The raster-loader library can be installed with pip:
Installation within a virtual environment is highly recommended.
Before uploading rasters to BigQuery, you need to authenticate with Google Cloud:
For Snowflake, authentication is performed during the uploading command itself.
Find a complete guide and reference at the Raster Loader documentation.
The basic command to upload a COG to BigQuery as a CARTO raster table is:
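As a rough sketch, the upload command can be assembled as shown below. The carto bigquery upload subcommand and the --file_path, --project, --dataset, --table and --overwrite flags are written from memory of the Raster Loader command-line interface and should be treated as assumptions to check against the Raster Loader documentation; all values are placeholders.

```python
import shlex


def bigquery_upload_command(file_path, project, dataset, table, overwrite=False):
    """Build the argument list for uploading a COG to BigQuery (flag names are assumed)."""
    cmd = [
        "carto", "bigquery", "upload",
        "--file_path", file_path,
        "--project", project,
        "--dataset", dataset,
        "--table", table,
    ]
    if overwrite:
        # Replace the destination table if it already exists.
        cmd.append("--overwrite")
    return cmd


# Placeholder values; substitute your own file, project, dataset and table.
print(shlex.join(bigquery_upload_command(
    "output_cog.tif", "my-gcp-project", "my_dataset", "my_raster_table"
)))
```

Printing the command first makes it easy to review the destination before any data is written.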
Options for raster bands
By default, Raster Loader will upload the first band in the raster file, but it's possible to specify a different band with a command like:
Uploading multiple bands, optionally with custom names, is supported by concatenating the bands to include and, if required, their labels.
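The band options can be pictured as a sequence of flags appended to the upload command. The sketch below assumes the flags are named --band and --band_name, with 1-based band numbers; both names are recalled from the Raster Loader command-line interface rather than stated on this page, so verify them in the Raster Loader documentation. The band labels are made up for the example.

```python
def band_args(bands):
    """Turn [(band_number, optional_label), ...] into command-line arguments."""
    args = []
    for number, label in bands:
        args += ["--band", str(number)]
        if label is not None:
            # The label follows the band it applies to.
            args += ["--band_name", label]
    return args


# A single, non-default band.
print(band_args([(2, None)]))

# Two bands with hypothetical custom names.
print(band_args([(1, "red"), (2, "green")]))
```

The resulting list would be appended to the arguments of the basic upload command.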
Options for very large files
For large raster files, you can use the --chunk_size flag to specify the number of rows to upload at once. The default chunk size is 1000 rows.
For example, the following command uploads the raster in chunks of 2000 rows:
For large raster files, you can also enable the --compress flag, which compresses the band data with gzip and can significantly reduce storage size.
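To see how the chunk size affects an upload, the sketch below counts how many batches a given number of rows would need and builds the corresponding flags. The --chunk_size and --compress flags come from the text above; the helper names and the figure of 25,000 rows are hypothetical.

```python
def upload_batches(total_rows, chunk_size=1000):
    """Number of batches needed to upload total_rows rows, chunk_size rows at a time."""
    return (total_rows + chunk_size - 1) // chunk_size  # ceiling division


def large_file_args(chunk_size=None, compress=False):
    """Build the optional flags for large uploads."""
    args = []
    if chunk_size is not None:
        args += ["--chunk_size", str(chunk_size)]
    if compress:
        args.append("--compress")
    return args


rows = 25_000  # hypothetical number of rows in the resulting table
print(upload_batches(rows))        # with the default chunk size of 1000
print(upload_batches(rows, 2000))  # with --chunk_size 2000
print(large_file_args(chunk_size=2000, compress=True))
```

A larger chunk size means fewer, bigger batches, so the right value is a trade-off between the number of requests and the size of each one.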
Once your raster data is stored in your cloud data warehouse, you can analyze and visualize it using the CARTO platform. Below are the available options:
Using Analytics Toolbox for Raster Analysis: Perform advanced spatial analysis on raster data with SQL-based functions inside your data warehouse, using the Raster modules for BigQuery and Snowflake.
Processing Raster Data with Workflows: Build low-code geospatial analysis pipelines that integrate raster data with other datasets and processing tools.
Visualizing Raster Data in CARTO Builder: Add and style raster layers directly in CARTO Builder for interactive map exploration and presentation.
Rendering Raster Data with CARTO and Deck.gl: Use Deck.gl for high-performance client-side rendering of raster layers in custom applications.
For more details, visit the linked sections or explore CARTO’s documentation.