Data Observatory
The Data Observatory is a spatial data platform that lets you augment your own data with up-to-date spatial datasets. Its catalog contains thousands of spatial datasets from public and premium sources, all vetted by our Data team. The Data Observatory streamlines the process of discovering, licensing, and accessing spatial data, reducing the operational inefficiencies usually involved.
We strongly recommend reading through the Terminology section to get familiar with all the components of the Data Observatory.
In the following sections you will find a collection of resources where you can learn how to subscribe to, access, analyze, and visualize Data Observatory datasets.
Some of these actions require the usage of CARTO tools and libraries that integrate with the Data Observatory, such as the Analytics Toolbox.
Data Observatory metadata structure
The Data Observatory metadata structure organizes all the data in the Data Observatory by the following attributes:
Countries
Categories
Licenses
Sources
Place types
These attributes allow you to filter and discover datasets. For example, you can start by filtering the catalog to Mexico, then focus on one or more categories such as points of interest or demographics.
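The filtering described above can be pictured with a small sketch. It assumes a hypothetical, simplified catalog entry (`DatasetMeta`) and a hypothetical helper (`filter_catalog`); neither is part of any CARTO API, and the sample entries are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetMeta:
    """Hypothetical, simplified catalog entry (not the real metadata schema)."""
    slug: str
    country: str
    category: str
    license: str


def filter_catalog(catalog, country=None, categories=None):
    """Keep entries that match the country and any of the given categories."""
    result = []
    for entry in catalog:
        if country is not None and entry.country != country:
            continue
        if categories is not None and entry.category not in categories:
            continue
        result.append(entry)
    return result


# Made-up sample entries, for illustration only.
CATALOG = [
    DatasetMeta("mx_demographics_a", "Mexico", "Demographics", "Public"),
    DatasetMeta("mx_poi_a", "Mexico", "Points of Interest", "Premium"),
    DatasetMeta("us_demographics_a", "United States", "Demographics", "Public"),
    DatasetMeta("mx_roads_a", "Mexico", "Road Traffic", "Premium"),
]

# First narrow down to Mexico, then to two categories of interest.
mexico_subset = filter_catalog(
    CATALOG,
    country="Mexico",
    categories={"Demographics", "Points of Interest"},
)
```

Filters left as `None` are simply skipped, which mirrors starting with a broad filter (country) and narrowing it down afterwards.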
Once a dataset is selected, additional metadata attributes become available, such as a summary description, list of key variables, temporal aggregation, update frequency, associated geography, sample data, detailed dataset schema, and map preview.
Data subscriptions
CARTO Workspace includes a Spatial Data Catalog that offers public and premium datasets. Once you subscribe to a public dataset, anyone in your organization can access that subscription. Subscription requests for premium datasets require the help of the CARTO team.
Check out the following guides to learn how to find and subscribe to Data Observatory datasets:
Data access and analysis
You can access your subscriptions directly from your data warehouses connected to CARTO. The Data Observatory is currently supported for BigQuery, Snowflake, Redshift and PostgreSQL connections.
You can check access information with the "Access in" button on the dataset’s detail page. Please refer to this step-by-step guide to learn more.
The Analytics Toolbox for BigQuery, Snowflake and Redshift offers a set of functions to enrich your datasets with any of the variables from your Data Observatory subscriptions by performing a spatial join between them and your own data. Enrichment is an essential step to incorporate Data Observatory data into your spatial analysis workflows.
It is also possible to enrich your tables with features from Data Observatory subscriptions using the "Enrich table" functionality from the Data Explorer.
Please check out the following resources to learn more:
BigQuery: data enrichment functions and step-by-step guide.
Snowflake: data enrichment functions.
Redshift: data enrichment functions.
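As a conceptual illustration of enrichment as a spatial join, here is a minimal Python sketch. It is not the Analytics Toolbox implementation: the function names (`point_in_polygon`, `enrich_points`), the planar coordinates, and the sample areas and stores are all assumptions made for the example. In practice the join runs inside your data warehouse through the enrichment functions listed above.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test on planar coordinates; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def enrich_points(points, areas, variable):
    """Attach `variable` from the first area containing each point (None if no match)."""
    enriched = []
    for point in points:
        value = None
        for area in areas:
            if point_in_polygon(point["lon"], point["lat"], area["polygon"]):
                value = area[variable]
                break
        enriched.append({**point, variable: value})
    return enriched


# Hypothetical subscription data: two adjacent square areas with a population variable.
AREAS = [
    {"geoid": "A", "polygon": [(0, 0), (2, 0), (2, 2), (0, 2)], "population": 1200},
    {"geoid": "B", "polygon": [(2, 0), (4, 0), (4, 2), (2, 2)], "population": 800},
]

# Hypothetical table of your own data to enrich.
STORES = [
    {"store_id": 1, "lon": 1.0, "lat": 1.0},
    {"store_id": 2, "lon": 3.0, "lat": 0.5},
    {"store_id": 3, "lon": 9.0, "lat": 9.0},
]

enriched_stores = enrich_points(STORES, AREAS, "population")
```

Each store keeps its original columns and gains a `population` column taken from the area that contains it; stores outside every area get `None`.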
Data visualization
To visualize Data Observatory datasets, you can use Builder in CARTO Workspace. Click the "Create map" action on the subscription’s detail page, in the Data Observatory section of the Data Explorer. Alternatively, you can add a new Data Observatory source to an existing map.
Datasets whose size is within platform limits will be visualized in full. For bigger datasets, a spatial filter is applied (a buffer around the centroid of the most populated city of the dataset’s country). To work with more of the data, you will need to filter it through SQL or through a Workflow, and visualizing it in full may require creating a tileset.
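To make the idea of a buffer-based spatial filter concrete, the sketch below keeps only the features within a given distance of a center point, using the haversine great-circle distance. The helper names, the 50 km radius, the approximate Mexico City coordinates, and the sample features are assumptions for illustration; the actual buffer size and filtering logic used by the platform are not documented here.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometers between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def filter_within_buffer(features, center_lon, center_lat, radius_km):
    """Keep features whose point lies inside the circular buffer."""
    return [
        f for f in features
        if haversine_km(f["lon"], f["lat"], center_lon, center_lat) <= radius_km
    ]


# Approximate coordinates of Mexico City, used as the buffer center (assumption).
CENTER_LON, CENTER_LAT = -99.13, 19.43

# Made-up features: one close to the center, one several hundred km away.
FEATURES = [
    {"id": "near", "lon": -99.20, "lat": 19.40},
    {"id": "far", "lon": -103.35, "lat": 20.67},
]

# The 50 km radius is an arbitrary value chosen for this example.
visible = filter_within_buffer(FEATURES, CENTER_LON, CENTER_LAT, radius_km=50.0)
```

Only the feature near the center survives the filter, which is why a large dataset previewed this way shows a single region rather than the whole country.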
Please refer to this guide to learn more on how to visualize Data Observatory datasets using Builder.
Creating Tilesets
Some of the spatial datasets offered in the Data Observatory are massive, and their visualization requires the creation of a tileset. When a Data Observatory subscription requires a tileset to be visualized in full, a "Create tileset" option will be available from the subscription’s page.
Tileset creation through the Workspace interface is currently available for CARTO Data Warehouse, BigQuery, Snowflake, Redshift, and PostgreSQL connections.
You can also create tilesets directly from your data warehouse using SQL commands. This option is currently available only through the tileset procedures in the Analytics Toolbox. You can create two types of tilesets: Simple and Aggregation. Both procedures take as input a SQL query specifying the data that you want to transform into a tileset. In the case of Data Observatory datasets, you can use the example query provided as part of the "Access in" functionality.
You can learn more on how to create a tileset with your Data Observatory subscriptions in this guide.
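As rough intuition for why tiling makes massive datasets renderable, the sketch below assigns points to map tiles with the standard Web Mercator XYZ tiling formula and counts the points per tile. This is a generic illustration, not the Analytics Toolbox tileset procedures; the assumption that an Aggregation-style tileset summarizes features per cell in a similar spirit, as well as the helper names and sample points, are mine.

```python
import math
from collections import Counter


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) index of the Web Mercator XYZ tile containing a point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


def aggregate_by_tile(points, zoom):
    """Count how many (lon, lat) points fall in each tile at the given zoom."""
    return Counter(lonlat_to_tile(lon, lat, zoom) for lon, lat in points)


# Made-up points: two near Mexico City, one near Paris.
POINTS = [(-99.13, 19.43), (-99.20, 19.40), (2.35, 48.85)]

counts = aggregate_by_tile(POINTS, zoom=1)
```

At low zoom levels many points collapse into a single tile count, so the renderer handles a few aggregated values instead of every individual feature.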