clustering

This module contains functions that perform clustering on geographies.

CREATE_CLUSTERKMEANS

CREATE_CLUSTERKMEANS(input, output_table, geom_column, number_of_clusters)

Description

Takes a set of points as input and partitions them into clusters using the k-means algorithm. Creates a new table with the same columns as input plus a cluster_id column with the cluster index for each of the input features.

Input parameters

  • input: VARCHAR name of the table or literal SQL query to be clustered.

  • output_table: VARCHAR(MAX) qualified name of the output table, e.g. <my-schema>.<my-output-table>. The process will fail if the table already exists.

  • geom_column: VARCHAR name of the column to be clusterd.

  • number_of_clusters: INT number of clusters that will be generated.

circle-exclamation

Examples

CALL carto.CREATE_CLUSTERKMEANS('<my-schema>.<my-table>', '<my-schema>.<my-output-table>', 'geom', 5);
-- The table `<my-schema>.<my-output-table>` will be created
-- adding the column cluster_id to those in `<my-schema>.<my-table>`.
CALL carto.CREATE_CLUSTERKMEANS('SELECT * FROM <my-schema>.<my-table>', '<my-schema>.<my-output-table>', 'geom', 5);
-- The table `<my-schema>.<my-output-table>` will be created
-- adding the column cluster_id to those returned in the input query.

ST_CLUSTERKMEANS

Description

Takes a set of points as input and partitions them into clusters using the k-means algorithm. Returns an array of tuples with the cluster index for each of the input features and the input geometry.

Input parameters

  • geog: GEOMETRY points to be clustered.

  • numberOfClusters (optional): INT number of clusters that will be generated. It defaults to the square root of half the number of points (sqrt(<NUMBER OF POINTS>/2)). The output number of cluster cannot be greater to the number of distinct points of the geog.

Return type

SUPER: containing objects with cluster as the cluster id and geom as the geometry in GeoJSON format.

Examples

Last updated

Was this helpful?