# tiler

We currently provide procedures to create the following kind of tilesets:

* Spatial index tiles (aggregates spatial indexes into tiles at specific resolutions)
* Geometry-based tiles of two types:
  * *simple* tilesets to visualize features individually
  * *aggregation* tilesets to generate aggregated point visualizations

## CREATE\_VECTOR\_TILESET <a href="#create_vector_tileset" id="create_vector_tileset"></a>

```sql
CREATE_VECTOR_TILESET(input, output, options)
```

**Description**

Generates a simple tileset.

* `input`: `STRING` that can either contain a table name (e.g. `database.schema.tablename`) or a full query (e.g.`(SELECT * FROM database.schema.tablename)`).
* `output`: `STRING` of the format `database.schema.tablename` where the resulting tileset will be stored. The database and schema must exist and the caller needs to have permissions to create a new table in it.
* `options`: `STRING` containing a valid JSON with the different options. Valid options are described in the table below.

{% hint style="warning" %}
**warning**

If a query is passed in `input`, it might be evaluated multiple times to generate the tileset. Thus, non-deterministic functions, such as \[`ROW_NUMBER`] should be avoided. If such a function is needed, the query should be saved into a table first and then passed as `input`, to avoid inconsistent results.
{% endhint %}

| Option                 | Description                                                                                                                                                                                                                                                                                                                                                                                       |
| ---------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `if_exists`            | Default: `"fail"`. A `STRING` that indicates if the process will fail if the table already exists, if set to `"fail"`. Or any existing table will be replaced, if set to `"replace"`.                                                                                                                                                                                                             |
| `geom_column`          | Default: `"geom"`. A `STRING` that indicates the name of the geography column that will be used. The geography column must be a WKB with type `BINARY`. If your input table contains geographies in WKT format, they can be converted to WKB by using `ST_ASWKB(ST_GEOMFROMWKT(<geom_column>)) AS <geom_column>` in your input query.                                                             |
| `zoom_min`             | Default: `0`. A `INTEGER` that defines the minimum zoom level at which tiles will be generated. Any zoom level under this level won't be generated.                                                                                                                                                                                                                                               |
| `zoom_max`             | Default: `12`. A `INTEGER` that defines the maximum zoom level at which tiles will be generated. Any zoom level over this level won't be generated.                                                                                                                                                                                                                                               |
| `max_tile_vertices`    | Default: `200000`. A `INTEGER` that sets the maximum number of vertices a tile can contain. This limit only applies when the input geometries are lines or polygons. When this maximum is reached, the procedure will drop features. You can configure in which order the features are kept by setting the `tile_feature_order` property.                                                         |
| `max_tile_features`    | Default: `10000`. A `INTEGER` that sets the maximum number of features a tile can contain. This limit only applies when the input geometries are points. When this limit is reached, the procedure will stop adding features into the tile. You can configure in which order the features are kept by setting the `tile_feature_order` property.                                                  |
| `tile_feature_order`   | Default: `RANDOM()` for points, `ST_AREA() DESC` for polygons, `ST_LENGTH() DESC` for lines. A `STRING` defining the order in which properties are added to a tile. This expects the SQL `ORDER BY` **keyword definition**, such as `"aggregated_total DESC"`. The `"ORDER BY"` part must not be included. You can use any source column even if it is not included in the tileset as a property. |
| `include_geoids`       | Default: `false`. Generates an additional `geoids` column that contains the `geoid` value from each row in the input data that intersects with the tile. This option is required to use the resulting tileset as a boundary.                                                                                                                                                                      |
| `metadata`             | Default: `{}`. A JSON object to specify the associated metadata of the tileset. Use this to set the name, description and legend to be included in the [TileJSON](https://github.com/mapbox/tilejson-spec/tree/master/2.2.0). Other fields will be included in the object extra\_metadata.                                                                                                        |
| `max_categories_limit` | Default: `10`. A `INTEGER` that sets the maximum number of categories a property can have.                                                                                                                                                                                                                                                                                                        |
| `properties`           | Default: `""`. A `STRING` that defines the properties that will be included associated with each cell feature. Each `property` is defined by using SQL syntax and can make use of the columns present in the input table. Note that every property different from Number will be casted to String. Different properties must be separated by `;`.                                                 |

**Example**

```scala
import com.carto.analytics.toolbox.ATExecute

ATExecute.sql(
 """
  |CALL_CARTO carto_un.carto.CREATE_VECTOR_TILESET(
  |
  | '(SELECT geom, population, category FROM database.schema.population_table)',
  | 'database.schema.population_tileset',
  | '{
  | "if_exists": "replace",
  | "geom_column": "geom",
  | "zoom_min": 0,
  | "zoom_max": 6,
  | "properties": "population; category",
  | "metadata": {
  |   "name": "Population",
  |   "description": "Population in the cities"
  |   }
  | }'
  | );
  | """.stripMargin,
  spark
)
```

## CREATE\_POINT\_AGG\_TILESET <a href="#create_point_agg_tileset" id="create_point_agg_tileset"></a>

```sql
CREATE_POINT_AGG_TILESET(input, output, options)
```

**Description**

Generates a point aggregation tileset.

* `input`: `STRING` that can either contain a table name (e.g. `database.schema.tablename`) or a full query (e.g.`(SELECT * FROM database.schema.tablename)`).
* `output`: `STRING` of the format `database.schema.tablename` where the resulting tileset will be stored. The database and schema must exist and the caller needs to have permissions to create a new table in it.
* `options`: `STRING` containing a valid JSON with the different options. Valid options are described in the table below.

{% hint style="warning" %}
**warning**

If a query is passed in `input`, it might be evaluated multiple times to generate the tileset. Thus, non-deterministic functions, such as \[`ROW_NUMBER`] should be avoided. If such a function is needed, the query should be saved into a table first and then passed as `input`, to avoid inconsistent results.
{% endhint %}

| Option                   | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
| ------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `if_exists`              | Default: `"fail"`. A `STRING` that indicates if the process will fail if the table already exists, if set to `"fail"`. Or any existing table will be replaced, if set to `"replace"`.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
| `geom_column`            | Default: `"geom"`. A `STRING` that indicates the name of the geography column that will be used. The geography column must be a WKB with type `BINARY`. If your input table contains geographies in WKT format, they can be converted to WKB by using `ST_ASWKB(ST_GEOMFROMWKT(<geom_column>)) AS <geom_column>` in your input query.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
| `zoom_min`               | Default: `0`. An `INTEGER` that defines the minimum zoom level at which tiles will be generated. Any zoom level under this level won't be generated.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| `zoom_max`               | Default: `12`; maximum: `20`. An `INTEGER` that defines the maximum zoom level at which tiles will be generated. Any zoom level over this level won't be generated.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| `aggregation_resolution` | <p>Default: <code>6</code>. An <code>INTEGER</code> that specifies the resolution of the spatial aggregation.<br><br>Aggregation for zoom <code>z</code> is based on quadgrid cells at <code>z + resolution level</code>. For example, with resolution <code>6</code>, the <code>z0</code> tile will be divided into cells that match the <code>z6</code> tiles, or the cells contained in the <code>z10</code> tile will be the boundaries of the <code>z16</code> tiles within them. In other words, each tile is subdivided into <code>4^resolution</code> cells, which is the maximum number of resulting features (aggregated) that the tiles will contain.<br><br>Note that adding more granularity necessarily means heavier tiles which take longer to be transmitted and processed in the final client, and you are more likely to hit the internal memory limits.</p>                                                                                          |
| `aggregation_placement`  | <p>Default: <code>"cell-centroid"</code>. A <code>STRING</code> that defines what type of geometry will be used to represent the cells generated in the aggregation, which will be the features of the resulting tileset. There are currently four options:<br></p><ul><li><code>"cell-centroid"</code>: Each feature will be defined as the centroid of the cell, that is, all points that are aggregated together into the cell will be represented in the tile by a single point positioned at the centroid of the cell.</li><li><code>"features-any"</code>: The aggregation cell will be represented by any random point from the source data contained within it. That is, if 10 points fall inside a cell, the procedure will randomly choose the location of one of them to represent the aggregation cell.</li><li><code>"features-centroid"</code>: The feature will be defined as the centroid (point) of the collection of points within the cell.</li></ul> |
| `metadata`               | Default: `{}`. A JSON object to specify the associated metadata of the tileset. Use this to set the name, description and legend to be included in the [TileJSON](https://github.com/mapbox/tilejson-spec/tree/master/2.2.0). Other fields will be included in the object extra\_metadata.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| `max_categories_limit`   | Default: `10`. A `INTEGER` that sets the maximum number of categories a property can have.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| `properties`             | Default: `""`. A `STRING` that defines the properties that will be included associated with each cell feature. Each `property` is defined by using SQL syntax and should include a formula to be applied to the values of the points that fall under the cell. This formula can be any SQL formula that uses an [aggregate function](https://docs.databricks.com/en/sql/language-manual/sql-ref-functions-builtin.html#aggregate-functions) supported by Databricks. Note that every property different from Number will be casted to String. Different properties must be separated by `;`.                                                                                                                                                                                                                                                                                                                                                                             |

{% hint style="warning" %}
**FEATURES PER TILE LIMITS**

The value of `aggregation_resolution` sets an upper bound to how many features can be present in a tile. For a value of *n*, a maximum of 4^*n* (4 raised to n) features can be present in a tile. For example, for an aggregation resolution of 8, the maximum number of features (points) will be 65536 per tile. This value can be too high and produce tiles that are too large when either the aggregation resolution is high or many properties are included. In that case, to improve the performance of the map visualizations, the `max_tile_features` should be used to limit the size of the tiles to about 1MB.
{% endhint %}

**Result**

The generated tileset consists of a table with the following columns, where each row represents a tile:

* `Z`: zoom level of the tile.
* `X`: X-index of the tile (`0` to `2^Z-1`).
* `Y`: Y-index of the tile (`0` to `2^Z-1`).
* `DATA`: contents of the tile, encoded as a MVT gzipped binary. It will contain the resulting points (location of the aggregated features) and their attributes (as defined by `properties`).

Additionally, there is a row identified by `Z=-1` which contains metadata about the tileset in the `DATA` column in JSON format. It contains the following properties:

* `bounds`: geographical extents of the source as a string in `Xmin, Ymin, Xmax, Ymax` format.
* `center`: center of the geographical extents as `X, Y, Z`, where the `Z` represents the zoom level where a single tile spans the whole extents size.
* `zmin`: minimum zoom level in the tileset.
* `zmax`: maximum zoom level in the tileset.
* `tilestats`: stats about the feature's properties. In addition to its name (`attribute`) and `type`, it contains `min`, `max`, `average`, `sum` and `quantiles` for numeric attributes and `categories` for text attributes.

**Example**

```scala
import com.carto.analytics.toolbox.ATExecute

ATExecute.sql(
 """
  |CALL_CARTO carto_un.carto.CREATE_POINT_AGG_TILESET(
  |
  | '(SELECT * FROM database.schema.cities_table)',
  | 'database.schema.cities_tileset',
  | '{
  | "if_exists": "replace",
  | "geom_column": "geom",
  | "zoom_min": 0,
  | "zoom_max": 12,
  | "aggregation_resolution": 6,
  | "aggregation_placement": "cell-centroid",
  | "properties": "COUNT(*) AS num_cities; SUM(POPULATION) AS population_sum; CASE WHEN COUNT(*) <= 1 THEN ANY_VALUE(city_name) ELSE NULL END AS city_name; ANY_VALUE(date) AS date",
  | "metadata": {
  |   "name": "Population",
  |   "description": "Population in the cities"
  |   }
  | }'
  | );
  | """.stripMargin,
  spark
)
```

In the example above, for all features we would get a property `"num_cities"` with the number of points that fall in it and `"population_sum"` with the sum of the population in those cities. In addition to this, when there is only one point that belongs to this property (and only in that case) we will also get the column values from the source data in `"city_name"`.

## CREATE\_H3\_AGG\_TILESET <a href="#create_h3_agg_tileset" id="create_h3_agg_tileset"></a>

```sql
CREATE_H3_AGG_TILESET(input, output, options)
```

**Description**

Creates a tileset that uses a H3 spatial index, aggregating data from an input table that uses that same spatial index.

Aggregated data is computed for all levels between `resolution_min` and `resolution_max`. For each resolution level, all tiles for the area covered by the source table are added, with data aggregated at level `resolution + aggregation resolution`.

* `input`: `STRING` that can either contain a table name (e.g. `database.schema.tablename`) or a full query (e.g.`(SELECT * FROM database.schema.tablename)`).
* `output`: `STRING` of the format `database.schema.tablename` where the resulting tileset will be stored. The database and schema must exist and the caller needs to have permissions to create a new table in it.
* `options`: `STRING` containing a valid JSON with the different options. Valid options are described in the table below.

{% hint style="warning" %}
**warning**

If a query is passed in `input`, it might be evaluated multiple times to generate the tileset. Thus, non-deterministic functions, such as \[`ROW_NUMBER`] should be avoided. If such a function is needed, the query should be saved into a table first and then passed as `input`, to avoid inconsistent results.
{% endhint %}

| Option                   | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| ------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `if_exists`              | Default: `"fail"`. A `STRING` that indicates if the process will fail if the table already exists, if set to `"fail"`. Or any existing table will be replaced, if set to `"replace"`.                                                                                                                                                                                                                                                                                                                                                                                                        |
| `resolution_min`         | Default: `0`. A `INTEGER` that defines the minimum resolution level for tiles. Any resolution level under this level won't be generated.                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| `resolution_max`         | Default: `6`. A `INTEGER` that defines the maximum resolution level for tiles. Any resolution level over this level won't be generated.                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| `h3_column`              | Default: `h3`. A `STRING` that indicates the name of the H3 spatial index column that will be used.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| `h3_resolution`          | A `INTEGER` defining the resolution of the tiles in the input table.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| `aggregation_resolution` | Default: `4`. A `INTEGER` defining the resolution to use when aggregating data at each resolution level. For a given `h3_resolution`, data is aggregated at `resolution_level + aggregation resolution`.                                                                                                                                                                                                                                                                                                                                                                                     |
| `metadata`               | Default: `{}`. A JSON object to specify the associated metadata of the tileset. Use this to set the name, description and legend to be included in the [TileJSON](https://github.com/mapbox/tilejson-spec/tree/master/2.2.0). Other fields will be included in the object extra\_metadata.                                                                                                                                                                                                                                                                                                   |
| `max_categories_limit`   | Default: `10`. A `INTEGER` that sets the maximum number of categories a property can have.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| `properties`             | Default: `""`. A `STRING` that defines the properties that will be included associated with each cell feature. Each `property` is defined by using SQL syntax and should include a formula to be applied to the values of the points that fall under the cell. This formula can be any SQL formula that uses an [aggregate function](https://docs.databricks.com/en/sql/language-manual/sql-ref-functions-builtin.html#aggregate-functions) supported by Databricks. Note that every property different from Number will be casted to String. Different properties must be separated by `;`. |

**Example**

```scala
import com.carto.analytics.toolbox.ATExecute

ATExecute.sql(
 """
  |CALL_CARTO carto_un.carto.CREATE_H3_AGG_TILESET(
  |
  | '(SELECT * FROM database.schema.input_table_h3_level10)',
  | 'your_database.your_schema.output_tileset_h3_level10',
  | '{
  | "if_exists": "replace",
  | "h3_column": "h3",
  | "h3_resolution": 10,
  | "resolution_min": 0,
  | "resolution_max": 6,
  | "aggregation_resolution": 4,
  | "properties": "SUM(population) AS population; ANY_VALUE(date) AS date",
  | "metadata": {
  |   "name": "Population",
  |   "description": "Population in the cities"
  |   }
  | }'
  | );
  | """.stripMargin,
  spark
)
```

## CREATE\_QUADBIN\_AGG\_TILESET <a href="#create_quadbin_agg_tileset" id="create_quadbin_agg_tileset"></a>

```sql
CREATE_QUADBIN_AGG_TILESET(input, output, options)
```

**Description**

Creates a tileset that uses a quadbin spatial index, aggregating data from an input table that uses that same spatial index.

Aggregated data is computed for all levels between `resolution_min` and `resolution_max`. For each resolution level, all tiles for the area covered by the source table are added, with data aggregated at level `resolution + aggregation resolution`.

* `input`: `STRING` that can either contain a table name (e.g. `database.schema.tablename`) or a full query (e.g.`(SELECT * FROM database.schema.tablename)`).
* `output`: `STRING` of the format `database.schema.tablename` where the resulting tileset will be stored. The database and schema must exist and the caller needs to have permissions to create a new table in it.
* `options`: `STRING` containing a valid JSON with the different options. Valid options are described in the table below.

{% hint style="warning" %}
**warning**

If a query is passed in `input`, it might be evaluated multiple times to generate the tileset. Thus, non-deterministic functions, such as \[`ROW_NUMBER`] should be avoided. If such a function is needed, the query should be saved into a table first and then passed as `input`, to avoid inconsistent results.
{% endhint %}

| Option                   | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| ------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `if_exists`              | Default: `"fail"`. A `STRING` that indicates if the process will fail if the table already exists, if set to `"fail"`. Or any existing table will be replaced, if set to `"replace"`.                                                                                                                                                                                                                                                                                                                                                                                                        |
| `resolution_min`         | Default: `0`. A `INTEGER` that defines the minimum resolution level for tiles. Any resolution level under this level won't be generated.                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| `resolution_max`         | Default: `12`. A `INTEGER` that defines the maximum resolution level for tiles. Any resolution level over this level won't be generated.                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| `quadbin_column`         | Default: `quadbin`. A `STRING` that indicates the name of the quadbin spatial index column that will be used.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| `quadbin_resolution`     | A `INTEGER` defining the resolution of the tiles in the input table.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| `aggregation_resolution` | Default: `6`. A `INTEGER` defining the resolution to use when aggregating data at each resolution level. For a given `quadbin_resolution`, data is aggregated at `resolution_level + aggregation resolution`.                                                                                                                                                                                                                                                                                                                                                                                |
| `metadata`               | Default: `{}`. A JSON object to specify the associated metadata of the tileset. Use this to set the name, description and legend to be included in the [TileJSON](https://github.com/mapbox/tilejson-spec/tree/master/2.2.0). Other fields will be included in the object extra\_metadata.                                                                                                                                                                                                                                                                                                   |
| `max_categories_limit`   | Default: `10`. A `INTEGER` that sets the maximum number of categories a property can have.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| `properties`             | Default: `""`. A `STRING` that defines the properties that will be included associated with each cell feature. Each `property` is defined by using SQL syntax and should include a formula to be applied to the values of the points that fall under the cell. This formula can be any SQL formula that uses an [aggregate function](https://docs.databricks.com/en/sql/language-manual/sql-ref-functions-builtin.html#aggregate-functions) supported by Databricks. Note that every property different from Number will be casted to String. Different properties must be separated by `;`. |

**Example**

```scala
import com.carto.analytics.toolbox.ATExecute

ATExecute.sql(
 """
  |CALL_CARTO carto_un.carto.CREATE_QUADBIN_AGG_TILESET(
  |
  | '(SELECT * FROM database.schema.input_table_quadbin_level14)',
  | 'your_database.your_schema.output_tileset_quadbin_level14',
  | '{
  | "if_exists": "replace",
  | "quadbin_column": "quadbin",
  | "quadbin_resolution": 14,
  | "resolution_min": 0,
  | "resolution_max": 8,
  | "aggregation_resolution": 4,
  | "properties": "SUM(population) AS population; ANY_VALUE(date) AS date",
  | "metadata": {
  |   "name": "Population",
  |   "description": "Population in the cities"
  |   }
  | }'
  | );
  | """.stripMargin,
  spark
)
```
