clustering

This module contains functions that perform clustering on geographies.

CREATE_CLUSTERKMEANS

CREATE_CLUSTERKMEANS(input, output_table, geom_column, number_of_clusters)

Description

Takes a set of points as input and partitions them into clusters using the k-means algorithm. Creates a new table with the same columns as input plus a cluster_id column with the cluster index for each of the input features.

  • input: VARCHAR name of the table or literal SQL query to be clustered.

  • output_table: VARCHAR name of the output table.

  • geom_column: VARCHAR name of the column to be clusterd.

  • number_of_clusters: INT number of clusters that will be generated.

Examples

CALL carto.CREATE_CLUSTERKMEANS('my-schema.my-input-table', 'my-schema.my-output-table', 'geom', 5);
-- The table `my-schema.my-output-table` will be created
-- adding the column cluster_id to those in `my-schema.my-input-table`.
CALL carto.CREATE_CLUSTERKMEANS('select * my-schema.my-input-table', 'my-schema.my-output-table', 'geom', 5);
-- The table `my-schema.my-output-table` will be created
-- adding the column cluster_id to those returned in the input query.

ST_CLUSTERKMEANS

ST_CLUSTERKMEANS(geog [, numberOfClusters])

Description

Takes a set of points as input and partitions them into clusters using the k-means algorithm. Returns an array of tuples with the cluster index for each of the input features and the input geometry.

  • geog: GEOMETRY points to be clustered.

  • numberOfClusters (optional): INT number of clusters that will be generated. It defaults to the square root of half the number of points (sqrt(<NUMBER OF POINTS>/2)). The output number of cluster cannot be greater to the number of distinct points of the geog.

Return type

SUPER: containing objects with cluster as the cluster id and geom as the geometry in GeoJSON format.

Examples

SELECT carto.ST_CLUSTERKMEANS(ST_GEOMFROMTEXT('MULTIPOINT ((0 0), (0 1), (5 0), (1 0))'));
-- {"cluster":0,"geom":{"type":"Point","coordinates":[0.0,0.0]}}
-- {"cluster":0,"geom":{"type":"Point","coordinates":[0.0,1.0]}}
-- {"cluster":0,"geom":{"type":"Point","coordinates":[5.0,0.0]}}
-- {"cluster":0,"geom":{"type":"Point","coordinates":[1.0,0.0]}}
SELECT carto.ST_CLUSTERKMEANS(ST_GEOMFROMTEXT('MULTIPOINT ((0 0), (0 1), (5 0), (1 0))'), 2);
-- {"cluster":0,"geom":{"type":"Point","coordinates":[0.0,0.0]}}
-- {"cluster":0,"geom":{"type":"Point","coordinates":[0.0,1.0]}}
-- {"cluster":1,"geom":{"type":"Point","coordinates":[5.0,0.0]}}
-- {"cluster":0,"geom":{"type":"Point","coordinates":[1.0,0.0]}}

Last updated

Was this helpful?