clustering
This module contains functions that perform clustering on geographies.
CREATE_CLUSTERKMEANS
CREATE_CLUSTERKMEANS(input, output_table, geom_column, number_of_clusters)Description
Takes a set of points as input and partitions them into clusters using the k-means algorithm. Creates a new table with the same columns as input plus a cluster_id column with the cluster index for each of the input features.
input:VARCHARname of the table or literal SQL query to be clustered.output_table:VARCHARname of the output table.geom_column:VARCHARname of the column to be clusterd.number_of_clusters:INTnumber of clusters that will be generated.
warning
Keep in mid that due to some restrictions in the Redshift VARCHAR size, the maximum number of features (points) allow to be clustered is around 2500.
Examples
CALL carto.CREATE_CLUSTERKMEANS('my-schema.my-input-table', 'my-schema.my-output-table', 'geom', 5);
-- The table `my-schema.my-output-table` will be created
-- adding the column cluster_id to those in `my-schema.my-input-table`.CALL carto.CREATE_CLUSTERKMEANS('select * my-schema.my-input-table', 'my-schema.my-output-table', 'geom', 5);
-- The table `my-schema.my-output-table` will be created
-- adding the column cluster_id to those returned in the input query.ST_CLUSTERKMEANS
ST_CLUSTERKMEANS(geog [, numberOfClusters])Description
Takes a set of points as input and partitions them into clusters using the k-means algorithm. Returns an array of tuples with the cluster index for each of the input features and the input geometry.
geog:GEOMETRYpoints to be clustered.numberOfClusters(optional):INTnumber of clusters that will be generated. It defaults to the square root of half the number of points (sqrt(<NUMBER OF POINTS>/2)). The output number of cluster cannot be greater to the number of distinct points of thegeog.
Return type
SUPER: containing objects with cluster as the cluster id and geom as the geometry in GeoJSON format.
Examples
SELECT carto.ST_CLUSTERKMEANS(ST_GEOMFROMTEXT('MULTIPOINT ((0 0), (0 1), (5 0), (1 0))'));
-- {"cluster":0,"geom":{"type":"Point","coordinates":[0.0,0.0]}}
-- {"cluster":0,"geom":{"type":"Point","coordinates":[0.0,1.0]}}
-- {"cluster":0,"geom":{"type":"Point","coordinates":[5.0,0.0]}}
-- {"cluster":0,"geom":{"type":"Point","coordinates":[1.0,0.0]}}SELECT carto.ST_CLUSTERKMEANS(ST_GEOMFROMTEXT('MULTIPOINT ((0 0), (0 1), (5 0), (1 0))'), 2);
-- {"cluster":0,"geom":{"type":"Point","coordinates":[0.0,0.0]}}
-- {"cluster":0,"geom":{"type":"Point","coordinates":[0.0,1.0]}}
-- {"cluster":1,"geom":{"type":"Point","coordinates":[5.0,0.0]}}
-- {"cluster":0,"geom":{"type":"Point","coordinates":[1.0,0.0]}}Last updated
Was this helpful?
