# clustering

This module contains functions that perform clustering on geographies.

## CREATE\_CLUSTERKMEANS <a href="#create_clusterkmeans" id="create_clusterkmeans"></a>

```sql
CREATE_CLUSTERKMEANS(input, output_table, geom_column, number_of_clusters)
```

**Description**

Takes a set of points as input and partitions them into clusters using the k-means algorithm. Creates a new table with the same columns as `input` plus a `cluster_id` column with the cluster index for each of the input features.

**Input parameters**

* `input`: `VARCHAR` name of the table or literal SQL query to be clustered.
* `output_table`: `VARCHAR(MAX)` qualified name of the output table, e.g. `<my-schema>.<my-output-table>`. The process will fail if the table already exists.
* `geom_column`: `VARCHAR` name of the geometry column to be clustered.
* `number_of_clusters`: `INT` number of clusters that will be generated.
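
Because the procedure fails when the output table already exists, re-running it usually means removing the previous output first. The following is a minimal sketch using the same placeholder names as the examples on this page. Note that `DROP TABLE` is destructive, so it is only appropriate when the earlier result is no longer needed.

```sql
-- Remove a previous result so the procedure can recreate the table.
-- Destructive: run only if the old output is disposable.
DROP TABLE IF EXISTS <my-schema>.<my-output-table>;

-- Recreate the clustered table with 5 clusters.
CALL carto.CREATE_CLUSTERKMEANS('<my-schema>.<my-table>', '<my-schema>.<my-output-table>', 'geom', 5);
```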

{% hint style="warning" %}
**warning**

Keep in mind that, due to restrictions on the Redshift `VARCHAR` size, the maximum number of features (points) that can be clustered is around 2500.
{% endhint %}
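
One way to stay under this limit is to check the input size first and, if needed, pass a query that samples the table. This is a sketch under assumptions: the value 2000 is an arbitrary conservative choice rather than a documented threshold, and it assumes the procedure accepts an input query containing `ORDER BY` and `LIMIT`.

```sql
-- Check how many features the input table holds.
SELECT COUNT(*) AS n_features
FROM <my-schema>.<my-table>;

-- If the count is too high, cluster a random sample instead.
-- The LIMIT value (2000) is an assumed safety margin below the ~2500 ceiling.
CALL carto.CREATE_CLUSTERKMEANS('SELECT * FROM <my-schema>.<my-table> ORDER BY RANDOM() LIMIT 2000', '<my-schema>.<my-output-table>', 'geom', 5);
```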

**Examples**

{% code overflow="wrap" lineNumbers="true" %}

```sql
CALL carto.CREATE_CLUSTERKMEANS('<my-schema>.<my-table>', '<my-schema>.<my-output-table>', 'geom', 5);
-- The table `<my-schema>.<my-output-table>` will be created
-- adding the column cluster_id to those in `<my-schema>.<my-table>`.
```

{% endcode %}

{% code overflow="wrap" lineNumbers="true" %}

```sql
CALL carto.CREATE_CLUSTERKMEANS('SELECT * FROM <my-schema>.<my-table>', '<my-schema>.<my-output-table>', 'geom', 5);
-- The table `<my-schema>.<my-output-table>` will be created
-- adding the column cluster_id to those returned in the input query.
```

{% endcode %}
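
Once the procedure has run, the `cluster_id` column of the output table can be queried like any other column. As an illustrative sketch, the following counts how many input features were assigned to each cluster.

```sql
-- Count the features assigned to each cluster in the output table.
SELECT cluster_id, COUNT(*) AS n_features
FROM <my-schema>.<my-output-table>
GROUP BY cluster_id
ORDER BY cluster_id;
```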

## ST\_CLUSTERKMEANS <a href="#st_clusterkmeans" id="st_clusterkmeans"></a>

```sql
ST_CLUSTERKMEANS(geog [, numberOfClusters])
```

**Description**

Takes a set of points as input and partitions them into clusters using the k-means algorithm. Returns an array of tuples with the cluster index for each of the input features and the input geometry.
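
When the points live in a table rather than in a literal, they first need to be gathered into a single geometry. The sketch below assumes a point column named `geom` in a placeholder table and uses the aggregate form of Redshift's `ST_COLLECT` to build that geometry. The cluster count of 3 is arbitrary.

```sql
-- Aggregate the point column into one geometry, then cluster it.
-- Assumes every value in geom is a point.
SELECT carto.ST_CLUSTERKMEANS(ST_COLLECT(geom), 3) AS clusters
FROM <my-schema>.<my-table>;
```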

**Input parameters**

* `geog`: `GEOMETRY` points to be clustered.
* `numberOfClusters` (optional): `INT` number of clusters that will be generated. It defaults to the square root of half the number of points (`sqrt(<NUMBER OF POINTS>/2)`). The number of output clusters cannot be greater than the number of distinct points in `geog`.
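
To see what the default would be for a given table, the formula above can be evaluated directly. This is only a sketch: the table name is a placeholder, and how the function rounds the result to an integer is not stated here, so the value is shown unrounded.

```sql
-- Evaluate sqrt(<NUMBER OF POINTS>/2) for a table of points.
-- Dividing by 2.0 avoids integer division.
SELECT COUNT(*) AS n_points,
       SQRT(COUNT(*) / 2.0) AS default_clusters_unrounded
FROM <my-schema>.<my-table>;
```

For four points this evaluates to roughly 1.41, which is consistent with the single cluster returned by the first example on this page.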

**Return type**

`SUPER`: containing objects with `cluster` as the cluster id and `geom` as the geometry in GeoJSON format.
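
To work with the result as rows, the `SUPER` value can be unnested with Redshift's PartiQL syntax. The sketch below assumes the returned value is an array of objects with the `cluster` and `geom` keys described above. The aliases `clustered`, `clusters`, and `item` are illustrative names.

```sql
WITH clustered AS (
  SELECT carto.ST_CLUSTERKMEANS(
    ST_GEOMFROMTEXT('MULTIPOINT ((0 0), (0 1), (5 0), (1 0))'), 2
  ) AS clusters
)
-- Unnest the SUPER array: one output row per clustered point.
-- "cluster" is quoted defensively in case it is parsed as a keyword.
SELECT item."cluster" AS cluster_id,
       JSON_SERIALIZE(item.geom) AS geom_geojson
FROM clustered AS c, c.clusters AS item;
```

`JSON_SERIALIZE` turns the nested GeoJSON object into a string, which can then be passed to functions that expect GeoJSON text.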

**Examples**

{% code overflow="wrap" lineNumbers="true" %}

```sql
SELECT carto.ST_CLUSTERKMEANS(ST_GEOMFROMTEXT('MULTIPOINT ((0 0), (0 1), (5 0), (1 0))'));
-- {"cluster":0,"geom":{"type":"Point","coordinates":[0.0,0.0]}}
-- {"cluster":0,"geom":{"type":"Point","coordinates":[0.0,1.0]}}
-- {"cluster":0,"geom":{"type":"Point","coordinates":[5.0,0.0]}}
-- {"cluster":0,"geom":{"type":"Point","coordinates":[1.0,0.0]}}
```

{% endcode %}

{% code overflow="wrap" lineNumbers="true" %}

```sql
SELECT carto.ST_CLUSTERKMEANS(ST_GEOMFROMTEXT('MULTIPOINT ((0 0), (0 1), (5 0), (1 0))'), 2);
-- {"cluster":0,"geom":{"type":"Point","coordinates":[0.0,0.0]}}
-- {"cluster":0,"geom":{"type":"Point","coordinates":[0.0,1.0]}}
-- {"cluster":1,"geom":{"type":"Point","coordinates":[5.0,0.0]}}
-- {"cluster":0,"geom":{"type":"Point","coordinates":[1.0,0.0]}}
```

{% endcode %}


