Snowflake ML

Extension Package provided by CARTO

The Snowflake ML extension package for CARTO Workflows includes a variety of components that enable users to integrate machine learning workflows with geospatial data. These components allow for creating, evaluating, explaining, forecasting, and managing ML models directly within CARTO Workflows, utilizing Snowflake ML’s capabilities.

Get Model by Name

Description

This component imports a pre-trained model into the current workflow. If the name provided is not fully qualified, it will default to the connection's default database and the PUBLIC schema. The component assumes that the provided FQN points to an existing Snowflake ML model.
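
The defaulting rule above can be sketched in a few lines of Python. This is an illustration only, not CARTO's implementation: the function name `qualify_model_name` and the `default_database` argument are hypothetical, and the handling of two-part names (SCHEMA.MODEL) is an assumption, since the text only covers names with no qualification.

```python
def qualify_model_name(name: str, default_database: str) -> str:
    """Expand a model name to DATABASE.SCHEMA.MODEL form.

    Naive sketch: splits on dots, so quoted identifiers that
    contain dots are not handled.
    """
    parts = name.split(".")
    if any(part == "" for part in parts) or len(parts) > 3:
        raise ValueError(f"Not a valid model name: {name!r}")
    if len(parts) == 3:
        # Already fully qualified: leave untouched.
        return name
    if len(parts) == 2:
        # Assumption: SCHEMA.MODEL only needs the default database.
        return f"{default_database}.{name}"
    # Bare model name: default database plus the PUBLIC schema.
    return f"{default_database}.PUBLIC.{name}"
```

For example, with a default database named ANALYTICS, `qualify_model_name("MY_MODEL", "ANALYTICS")` yields `ANALYTICS.PUBLIC.MY_MODEL`.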

Settings

Model's FQN: Fully qualified name for the model to be imported.

Outputs

Output table: This component generates a single-row table with the FQN of the imported model.

Create Classification Model

Description

This component trains a classification model on the provided input data. If the name provided is not fully qualified, it will default to the connection's default database and the PUBLIC schema.

For more details, please refer to the SNOWFLAKE.ML.CLASSIFICATION official documentation in Snowflake.

Inputs

Input table: A data table that is used as input for the model creation.

Settings

Model's FQN: Fully qualified name for the model to be saved as.
ID Column: Column containing a unique identifier per sample.
Target Column: Column to be used as label in the training data.
Data Split: Whether to split the data into training and evaluation sets. Choosing SPLIT is required to use the Evaluate component with the resulting model.
Test Fraction: Fraction of the data to reserve for evaluation. For example, 0.2 reserves 20% of the data. Only applies if Data Split is set to SPLIT.

Outputs

Output table: This component generates a single-row table with the FQN of the trained model.
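
For readers who want to relate these settings to Snowflake SQL, the sketch below renders a CREATE SNOWFLAKE.ML.CLASSIFICATION statement as a string. It is a rough approximation under stated assumptions, not the SQL the component actually runs: the function name is hypothetical, the mapping of Data Split and Test Fraction to the `evaluate` and `test_fraction` config keys is an assumption to verify against Snowflake's documentation, and the ID Column is not handled because this page does not say how the component treats it.

```python
def classification_create_sql(model_fqn: str, input_table: str,
                              target_column: str, split: bool = True,
                              test_fraction: float = 0.2) -> str:
    """Render an illustrative CREATE SNOWFLAKE.ML.CLASSIFICATION statement.

    Values are interpolated directly, so this is only suitable for
    trusted, already-validated identifiers.
    """
    if split and not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    if split:
        # Assumed mapping: Data Split = SPLIT -> hold out a test fraction.
        config = f"{{'evaluate': TRUE, 'test_fraction': {test_fraction}}}"
    else:
        # Assumed mapping: no split -> no evaluation metrics at training time.
        config = "{'evaluate': FALSE}"
    return (
        f"CREATE SNOWFLAKE.ML.CLASSIFICATION {model_fqn}(\n"
        f"  INPUT_DATA => SYSTEM$REFERENCE('TABLE', '{input_table}'),\n"
        f"  TARGET_COLNAME => '{target_column}',\n"
        f"  CONFIG_OBJECT => {config}\n"
        ")"
    )


if __name__ == "__main__":
    print(classification_create_sql(
        "ANALYTICS.PUBLIC.CHURN_MODEL", "ANALYTICS.PUBLIC.TRAINING", "LABEL"))
```

The table and model names in the usage line are placeholders.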

Create Forecasting Model

Description

This component trains a forecasting model on the provided input data. If the name provided is not fully qualified, it will default to the connection's default database and the PUBLIC schema.

For more details, please refer to the SNOWFLAKE.ML.FORECAST official documentation in Snowflake.

Inputs

Input table: A data table that is used as input for the model creation.

Settings

Model's FQN: Fully qualified name for the model to be saved as.
Time Series ID Column: Column containing a unique identifier per time series.
Timestamp Column: Column containing the series' timestamp in DATE or DATETIME format.
Target Column: Column to be used as target in the training data.
Consider exogenous variables: whether to consider exogenous variables for forecasting. If checked, the future values for the variables must be provided when forecasting. All variables in the input will be considered except the specified time series ID, timestamp and target column.
Method: which method to use when fitting the model. It can be best or fast.
Sample Frequency: the frequency of the time series. It can be auto or manual.
Period: number of units to define the sampling frequency. Only applies when the Sample Frequency has been set to manual.
Time Unit: time unit used to define the frequency. It can be seconds, minutes, hours, days, weeks, months, quarters, or years. Only applies when the Sample Frequency has been set to manual.
Aggregation (categorical): aggregation function used for categorical columns if needed due to the sampling frequency. It can be mode, first, or last.
Aggregation (numeric): aggregation function used for numeric columns if needed due to the sampling frequency. It can be mean, median, mode, min, max, sum, first, or last.
Aggregation (target): aggregation function used for the target column if needed due to the sampling frequency. It can be mean, median, mode, min, max, sum, first, or last.

Outputs

Output table: This component generates a single-row table with the FQN of the trained model.
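
As a rough guide to how the Method, Sample Frequency, Period, Time Unit, and Aggregation settings could translate into a SNOWFLAKE.ML.FORECAST configuration object, the sketch below builds that object as a Python dictionary. The function name and its arguments are hypothetical, and the key names (`method`, `frequency`, `aggregation_categorical`, `aggregation_numeric`, `aggregation_target`) should be checked against Snowflake's current documentation; the component's real mapping is not described on this page.

```python
from typing import Dict, Optional

TIME_UNITS = {"seconds", "minutes", "hours", "days",
              "weeks", "months", "quarters", "years"}


def forecast_config(method: str = "best",
                    period: Optional[int] = None,
                    time_unit: Optional[str] = None,
                    aggregation_categorical: Optional[str] = None,
                    aggregation_numeric: Optional[str] = None,
                    aggregation_target: Optional[str] = None) -> Dict[str, str]:
    """Build an illustrative config dictionary for a forecasting model."""
    if method not in ("best", "fast"):
        raise ValueError("method must be 'best' or 'fast'")
    config = {"method": method}

    # Leaving period and time_unit unset stands for Sample Frequency = auto:
    # no frequency key is emitted at all.
    if period is not None or time_unit is not None:
        if period is None or time_unit is None:
            raise ValueError("manual frequency needs both period and time_unit")
        if period < 1 or time_unit not in TIME_UNITS:
            raise ValueError("invalid period or time unit")
        # Period and Time Unit combine into a string such as '2 weeks'.
        config["frequency"] = f"{period} {time_unit}"

    # Only include the aggregations that were explicitly chosen.
    aggregations = {
        "aggregation_categorical": aggregation_categorical,
        "aggregation_numeric": aggregation_numeric,
        "aggregation_target": aggregation_target,
    }
    for key, value in aggregations.items():
        if value is not None:
            config[key] = value
    return config
```

Omitting `period` and `time_unit` leaves the frequency for the model to infer, which is how this sketch represents the auto option.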

Predict

Description

This component uses a pre-trained classification model, obtained from the Get Model by Name or Create Classification Model component, to run predictions on the input data.

For more details, please refer to the !PREDICT function official documentation in Snowflake.

Inputs

Model table: the pre-trained classification model.
Input table: A data table that is used as input for inference.

Settings

Keep input columns: Whether to keep all the input columns in the output table.
ID Column: Column containing a unique identifier per sample. Only applies when Keep input columns is set to false.

Outputs

Output table: The model's predictions.
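
The effect of the Keep input columns and ID Column settings can be pictured with the following sketch, which renders a query around Snowflake's !PREDICT method. It is an assumption-laden illustration rather than the component's actual query: the function name and the `prediction` output alias are hypothetical.

```python
from typing import Optional


def predict_sql(model_fqn: str, input_table: str,
                keep_input_columns: bool = True,
                id_column: Optional[str] = None) -> str:
    """Render an illustrative query that calls !PREDICT on every input row."""
    if keep_input_columns:
        # Keep every input column next to the prediction.
        selected = "*"
    else:
        # Otherwise only the identifier is needed to join results back.
        if not id_column:
            raise ValueError("id_column is required when input columns are dropped")
        selected = id_column
    return (
        f"SELECT {selected}, "
        f"{model_fqn}!PREDICT(INPUT_DATA => {{*}}) AS prediction "
        f"FROM {input_table}"
    )
```

When input columns are dropped, the ID column is what allows each prediction to be matched back to its source row.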

Forecast

Description

This component uses a pre-trained forecast model, obtained from the Get Model by Name or Create Forecasting Model component, to generate forecasts, optionally from input data with future values of the exogenous variables.

For more details, please refer to the !FORECAST function official documentation in Snowflake.

Inputs

Model table: the pre-trained forecast model.
Input table: A data table that is used as input for inference. Only needed if the model has been trained using exogenous variables.

Settings

Consider exogenous variables: whether the model was trained to use exogenous variables or not.
Number of periods: number of periods to forecast per time series. Only applies if Consider exogenous variables is false.
Time Series ID Column: Column containing a unique identifier per time series. Only applies if Consider exogenous variables is true.
Timestamp Column: Column containing the series' timestamp in DATE or DATETIME format. Only applies if Consider exogenous variables is true.
Prediction Interval: Expected confidence of the prediction interval.
Keep input columns: Whether to keep all the input columns in the output table.

Outputs

Output table: The model's predictions.
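
The two modes described in the settings (a fixed number of periods without exogenous variables, or an input table with future values when they are used) can be sketched as two shapes of a !FORECAST call. The function below is hypothetical and only approximates what an equivalent Snowflake call might look like; the Keep input columns setting is not covered, and the argument names should be verified against Snowflake's documentation.

```python
from typing import Optional


def forecast_call_sql(model_fqn: str, prediction_interval: float,
                      periods: Optional[int] = None,
                      input_table: Optional[str] = None,
                      series_column: Optional[str] = None,
                      timestamp_column: Optional[str] = None) -> str:
    """Render an illustrative CALL to a forecast model's !FORECAST method."""
    if not 0.0 < prediction_interval < 1.0:
        raise ValueError("prediction_interval must be strictly between 0 and 1")
    config = f"{{'prediction_interval': {prediction_interval}}}"

    if input_table is None:
        # No exogenous variables: only the horizon is needed.
        if periods is None or periods < 1:
            raise ValueError("periods must be a positive integer")
        args = [f"FORECASTING_PERIODS => {periods}"]
    else:
        # Exogenous variables: future values come from the input table,
        # so the series and timestamp columns must be identified.
        if not series_column or not timestamp_column:
            raise ValueError("series and timestamp columns are required")
        args = [
            f"INPUT_DATA => SYSTEM$REFERENCE('TABLE', '{input_table}')",
            f"SERIES_COLNAME => '{series_column}'",
            f"TIMESTAMP_COLNAME => '{timestamp_column}'",
        ]
    args.append(f"CONFIG_OBJECT => {config}")
    return f"CALL {model_fqn}!FORECAST({', '.join(args)})"
```

In the exogenous case the forecast horizon is implied by the rows of the input table, which is why Number of periods only applies to the other mode.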

Evaluate Classification

Description

This component returns evaluation metrics for a pre-trained classification model obtained from the Get Model by Name or Create Classification Model component.

For more details, please refer to the official Snowflake documentation for the !SHOW_EVALUATION_METRICS and !SHOW_GLOBAL_EVALUATION_METRICS functions.

Inputs

Model table: the pre-trained classification model.

Settings

Class level metrics: Whether to obtain the per-class evaluation metrics or the overall evaluation metrics.

Outputs

Output table: The model's evaluation metrics.
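
The Class level metrics setting plausibly selects between the two Snowflake methods referenced above. The sketch below makes that choice explicit; the function name is hypothetical and the mapping is inferred from the setting's description, not confirmed by this page.

```python
def classification_metrics_sql(model_fqn: str, class_level: bool) -> str:
    """Render an illustrative CALL for classification evaluation metrics."""
    # Inferred mapping: per-class metrics come from SHOW_EVALUATION_METRICS,
    # overall metrics from SHOW_GLOBAL_EVALUATION_METRICS.
    method = ("SHOW_EVALUATION_METRICS" if class_level
              else "SHOW_GLOBAL_EVALUATION_METRICS")
    return f"CALL {model_fqn}!{method}()"
```

Either call relies on metrics computed at training time, which is why the model must have been created with Data Split set to SPLIT.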

Evaluate Forecast

Description

This component returns evaluation metrics for a pre-trained forecast model obtained from the Get Model by Name or Create Forecasting Model component.

For more details, please refer to the !SHOW_EVALUATION_METRICS function official documentation in Snowflake.

Inputs

Model table: the pre-trained forecast model.
Input table (optional): additional out-of-sample data to compute the metrics on.

Settings

Compute metrics on additional out-of-sample data: When checked, the component will compute cross-validation metrics on additional out-of-sample data. Otherwise, the component will return the metrics generated at training time.
Time Series ID Column: Column containing a unique identifier per time series. Only applies when the metrics are being computed on additional out-of-sample data.
Timestamp Column: Column containing the series' timestamp in DATE or DATETIME format. Only applies when the metrics are being computed on additional out-of-sample data.
Target Column: Column to use as label in the input data. Only applies when the metrics are being computed on additional out-of-sample data.
Prediction Interval: Expected confidence of the prediction interval. Only applies when the metrics are being computed on additional out-of-sample data.

Outputs

Output table: The model's evaluation metrics.
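
The difference between returning training-time metrics and computing metrics on additional out-of-sample data can be sketched as two forms of the !SHOW_EVALUATION_METRICS call. The function is hypothetical and the argument names are assumptions to verify against Snowflake's documentation; this page does not show the component's actual SQL.

```python
from typing import Optional


def forecast_metrics_sql(model_fqn: str,
                         input_table: Optional[str] = None,
                         series_column: Optional[str] = None,
                         timestamp_column: Optional[str] = None,
                         target_column: Optional[str] = None,
                         prediction_interval: float = 0.95) -> str:
    """Render an illustrative CALL for forecast evaluation metrics."""
    if input_table is None:
        # No extra data: return the metrics generated at training time.
        return f"CALL {model_fqn}!SHOW_EVALUATION_METRICS()"

    # Out-of-sample data: every column role must be identified.
    if not (series_column and timestamp_column and target_column):
        raise ValueError("series, timestamp and target columns are required")
    if not 0.0 < prediction_interval < 1.0:
        raise ValueError("prediction_interval must be strictly between 0 and 1")
    args = [
        f"INPUT_DATA => SYSTEM$REFERENCE('TABLE', '{input_table}')",
        f"SERIES_COLNAME => '{series_column}'",
        f"TIMESTAMP_COLNAME => '{timestamp_column}'",
        f"TARGET_COLNAME => '{target_column}'",
        f"CONFIG_OBJECT => {{'prediction_interval': {prediction_interval}}}",
    ]
    return f"CALL {model_fqn}!SHOW_EVALUATION_METRICS({', '.join(args)})"
```

The column settings only matter in the second form, matching the "Only applies when..." notes in the settings list.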

Feature Importance (Classification)

This component displays the feature importances per variable of a pre-trained classification model.

For more details, please refer to Snowflake's !SHOW_FEATURE_IMPORTANCE function documentation.

Inputs

Model table: the pre-trained classification model.

Outputs

Output table: a table with the feature importance per variable.

Feature Importance (Forecast)

This component displays the feature importances per variable of a pre-trained forecast model.

For more details, please refer to Snowflake's !SHOW_FEATURE_IMPORTANCE function documentation.

Inputs

Model table: the pre-trained forecast model.

Outputs

Output table: a table with the feature importance per variable.


