Snowflake ML
Extension Package provided by CARTO
The Snowflake ML extension package for CARTO Workflows includes a variety of components that enable users to integrate machine learning workflows with geospatial data. These components allow for creating, evaluating, explaining, forecasting, and managing ML models directly within CARTO Workflows, utilizing Snowflake ML’s capabilities.
Get Model by Name
Description
This component imports a pre-trained model into the current workflow. If the name provided is not fully qualified, it will default to the connection's default database and the PUBLIC schema. The component assumes that the provided FQN points to an existing Snowflake ML model.
Settings
Model's FQN: Fully qualified name for the model to be imported.
Outputs
Output table: This component generates a single-row table with the FQN of the imported model.
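The name-defaulting rule described above can be sketched in a few lines. This is an illustration only, not the component's implementation: the helper name qualify_model_name is hypothetical, the handling of two-part names (SCHEMA.MODEL) is an assumption the text does not confirm, and quoted identifiers that contain dots are not handled.

```python
def qualify_model_name(name: str, default_database: str,
                       default_schema: str = "PUBLIC") -> str:
    """Return a three-part DATABASE.SCHEMA.MODEL name (hypothetical helper)."""
    parts = name.split(".")
    if not all(parts) or len(parts) > 3:
        raise ValueError(f"Cannot interpret model name: {name!r}")
    if len(parts) == 3:
        # Already fully qualified: leave untouched.
        return name
    if len(parts) == 2:
        # Assumption: SCHEMA.MODEL only needs the default database prepended.
        return f"{default_database}.{name}"
    # Bare model name: default database plus the PUBLIC schema.
    return f"{default_database}.{default_schema}.{name}"


print(qualify_model_name("CHURN_MODEL", "ANALYTICS"))
```

With a default database of ANALYTICS, the bare name CHURN_MODEL resolves to ANALYTICS.PUBLIC.CHURN_MODEL, while a three-part name is returned unchanged.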
Create Classification Model
Description
This component trains a classification model on the provided input data. If the name provided is not fully qualified, it will default to the connection's default database and the PUBLIC schema.
For more details, please refer to the SNOWFLAKE.ML.CLASSIFICATION official documentation in Snowflake.
Inputs
Input table: A data table that is used as input for the model creation.
Settings
Model's FQN: Fully qualified name for the model to be saved as.
ID Column: Column containing a unique identifier per sample.
Target Column: Column to be used as label in the training data.
Data Split: Whether to perform a train/evaluation split on the data. Choosing SPLIT is required to use the Evaluate component with the resulting model.
Test Fraction: Fraction of the data to reserve for evaluation. A value of 0.2 reserves 20% of the data for evaluation. Only applies if Data Split is SPLIT.
Outputs
Output table: This component generates a single-row table with the FQN of the trained model.
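To make the Data Split and Test Fraction settings concrete, the sketch below builds the kind of statement these settings could translate to. It is an assumption about the mapping, not the component's actual SQL: the function name classification_sql is hypothetical, the evaluate and test_fraction keys follow Snowflake's documented CONFIG_OBJECT for SNOWFLAKE.ML.CLASSIFICATION and should be verified there, and the handling of the ID Column is omitted.

```python
def classification_sql(model_fqn: str, input_table: str, target_column: str,
                       split: bool = True, test_fraction: float = 0.2) -> str:
    """Build a CREATE SNOWFLAKE.ML.CLASSIFICATION statement (illustrative only)."""
    if split:
        if not 0 < test_fraction < 1:
            raise ValueError("test_fraction must be strictly between 0 and 1")
        # Reserve test_fraction of the rows for evaluation.
        config = f"{{'evaluate': TRUE, 'test_fraction': {test_fraction}}}"
    else:
        # No held-out data, so no evaluation metrics are produced.
        config = "{'evaluate': FALSE}"
    return (
        f"CREATE SNOWFLAKE.ML.CLASSIFICATION {model_fqn}(\n"
        f"    INPUT_DATA => SYSTEM$REFERENCE('TABLE', '{input_table}'),\n"
        f"    TARGET_COLNAME => '{target_column}',\n"
        f"    CONFIG_OBJECT => {config}\n"
        ")"
    )


print(classification_sql("DB.PUBLIC.CHURN_MODEL", "DB.PUBLIC.TRAINING", "LABEL"))
```

The table, model, and column names in the usage line are placeholders.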
Create Forecasting Model
Description
This component trains a forecasting model on the provided input data. If the name provided is not fully qualified, it will default to the connection's default database and the PUBLIC schema.
For more details, please refer to the SNOWFLAKE.ML.FORECAST official documentation in Snowflake.
Inputs
Input table: A data table that is used as input for the model creation.
Settings
Model's FQN: Fully qualified name for the model to be saved as.
Time Series ID Column: Column containing a unique identifier per time series.
Timestamp Column: Column containing the series' timestamp in DATE or DATETIME format.
Target Column: Column to be used as target in the training data.
Consider exogenous variables: Whether to consider exogenous variables for forecasting. If checked, the future values for the variables must be provided when forecasting. All variables in the input will be considered except the specified time series ID, timestamp, and target columns.
Method: Which method to use when fitting the model. It can be best or fast.
Sample Frequency: The frequency of the time series. It can be auto or manual.
Period: Number of time units that define the sampling frequency. Only applies when Sample Frequency is set to manual.
Time Unit: Time unit used to define the frequency. It can be seconds, minutes, hours, days, weeks, months, quarters, or years. Only applies when Sample Frequency is set to manual.
Aggregation (categorical): Aggregation function used for categorical columns if needed due to the sampling frequency. It can be mode, first, or last.
Aggregation (numeric): Aggregation function used for numeric columns if needed due to the sampling frequency. It can be mean, median, mode, min, max, sum, first, or last.
Aggregation (target): Aggregation function used for the target column if needed due to the sampling frequency. It can be mean, median, mode, min, max, sum, first, or last.
Outputs
Output table: This component generates a single-row table with the FQN of the trained model.
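The interaction between Sample Frequency, Period, and Time Unit can be sketched as follows. This is a hedged illustration: forecast_config is a hypothetical helper, the default values are placeholders, and the dictionary keys (method, frequency, aggregation_categorical, aggregation_numeric, aggregation_target) are assumed to mirror the CONFIG_OBJECT of SNOWFLAKE.ML.FORECAST. The exact value casing Snowflake expects should be checked in its documentation.

```python
TIME_UNITS = ("seconds", "minutes", "hours", "days",
              "weeks", "months", "quarters", "years")
CATEGORICAL_AGGS = ("mode", "first", "last")
NUMERIC_AGGS = ("mean", "median", "mode", "min", "max", "sum", "first", "last")


def forecast_config(method="best", sample_frequency="auto", period=1,
                    time_unit="days", agg_categorical="mode",
                    agg_numeric="mean", agg_target="mean") -> dict:
    """Collect the forecasting settings into one dictionary (illustrative only)."""
    if method not in ("best", "fast"):
        raise ValueError("method must be 'best' or 'fast'")
    if agg_categorical not in CATEGORICAL_AGGS:
        raise ValueError("unsupported categorical aggregation")
    if agg_numeric not in NUMERIC_AGGS or agg_target not in NUMERIC_AGGS:
        raise ValueError("unsupported numeric or target aggregation")
    config = {
        "method": method,
        "aggregation_categorical": agg_categorical,
        "aggregation_numeric": agg_numeric,
        "aggregation_target": agg_target,
    }
    if sample_frequency == "manual":
        # Period and Time Unit only matter for a manual frequency.
        if time_unit not in TIME_UNITS or period < 1:
            raise ValueError("invalid period or time unit")
        config["frequency"] = f"{period} {time_unit}"
    elif sample_frequency != "auto":
        raise ValueError("sample_frequency must be 'auto' or 'manual'")
    return config


print(forecast_config(method="fast", sample_frequency="manual",
                      period=2, time_unit="weeks"))
```

With Sample Frequency set to auto, no frequency entry is produced, which leaves the frequency to be inferred from the data.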
Predict
Description
This component uses a pre-trained classification model (obtained with the Get Model by Name or Create Classification Model component) to perform predictions on the given input data.
For more details, please refer to the official Snowflake documentation for the !PREDICT function.
Inputs
Model table: the pre-trained classification model.
Input table: A data table that is used as input for inference.
Settings
Keep input columns: Whether to keep all the input columns in the output table.
ID Column: Column containing a unique identifier per sample. Only applies when Keep input columns is set to false.
Outputs
Output table: The model's predictions.
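As a rough sketch of how the two settings shape the output, the helper below builds a query around the !PREDICT method. The function name predict_sql and the prediction column alias are hypothetical, the query the component really runs may differ, and the INPUT_DATA => {*} form follows the examples in Snowflake's documentation.

```python
from typing import Optional


def predict_sql(model_fqn: str, input_table: str,
                keep_input_columns: bool = True,
                id_column: Optional[str] = None) -> str:
    """Build a SELECT that calls <model>!PREDICT on every row (illustrative only)."""
    if keep_input_columns:
        select_list = "*"
    else:
        # Without the input columns, the ID column ties predictions to rows.
        if not id_column:
            raise ValueError("id_column is required when input columns are dropped")
        select_list = id_column
    return (
        f"SELECT {select_list}, "
        f"{model_fqn}!PREDICT(INPUT_DATA => {{*}}) AS prediction "
        f"FROM {input_table}"
    )


print(predict_sql("DB.PUBLIC.CHURN_MODEL", "DB.PUBLIC.NEW_DATA",
                  keep_input_columns=False, id_column="CUSTOMER_ID"))
```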
Forecast
Description
This component uses a pre-trained forecast model (obtained with the Get Model by Name or Create Forecasting Model component) to perform predictions on the given input data.
For more details, please refer to the official Snowflake documentation for the !FORECAST function.
Inputs
Model table: the pre-trained forecast model.
Input table: A data table that is used as input for inference. Only needed if the model has been trained using exogenous variables.
Settings
Consider exogenous variables: whether the model was trained to use exogenous variables or not.
Number of periods: number of periods to forecast per time series. Only applies if Consider exogenous variables is false.
Time Series ID Column: Column containing a unique identifier per time series. Only applies if Consider exogenous variables is true.
Timestamp Column: Column containing the series' timestamp in DATE or DATETIME format. Only applies if Consider exogenous variables is true.
Prediction Interval: Expected confidence of the prediction interval.
Keep input columns: Whether to keep all the input columns in the output table.
Outputs
Output table: The model's predictions.
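The settings above split into two cases depending on whether the model uses exogenous variables. The sketch below shows one way the corresponding !FORECAST call could be assembled. It is an assumption, not the component's code: forecast_sql is a hypothetical name, the 0.95 default is a placeholder, and the argument names (FORECASTING_PERIODS, INPUT_DATA, SERIES_COLNAME, TIMESTAMP_COLNAME, CONFIG_OBJECT) are taken from Snowflake's documentation of the method.

```python
def forecast_sql(model_fqn, exogenous, prediction_interval=0.95, periods=None,
                 input_table=None, series_column=None, timestamp_column=None):
    """Build a CALL <model>!FORECAST(...) statement (illustrative only)."""
    if exogenous:
        # Future values of the exogenous variables come from the input table.
        if not (input_table and series_column and timestamp_column):
            raise ValueError("input table and column names are required")
        args = [
            f"INPUT_DATA => SYSTEM$REFERENCE('TABLE', '{input_table}')",
            f"SERIES_COLNAME => '{series_column}'",
            f"TIMESTAMP_COLNAME => '{timestamp_column}'",
        ]
    else:
        # Without exogenous variables only the horizon length is needed.
        if periods is None or periods < 1:
            raise ValueError("periods must be a positive integer")
        args = [f"FORECASTING_PERIODS => {periods}"]
    args.append(f"CONFIG_OBJECT => {{'prediction_interval': {prediction_interval}}}")
    return f"CALL {model_fqn}!FORECAST({', '.join(args)})"


print(forecast_sql("DB.PUBLIC.SALES_MODEL", exogenous=False, periods=14))
```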
Evaluate Classification
Description
This component returns evaluation metrics for a pre-trained classification model (obtained with the Get Model by Name or Create Classification Model component).
For more details, please refer to the official Snowflake documentation for the !SHOW_EVALUATION_METRICS and !SHOW_GLOBAL_EVALUATION_METRICS functions.
Inputs
Model table: the pre-trained classification model.
Settings
Class level metrics: Whether to obtain the per-class evaluation metrics or the overall evaluation metrics.
Outputs
Output table: The model's evaluation metrics.
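The Class level metrics setting effectively selects between the two Snowflake methods named above. A minimal sketch of that choice follows; the helper name evaluation_sql is hypothetical and the component's actual query is not shown in this document.

```python
def evaluation_sql(model_fqn: str, class_level_metrics: bool) -> str:
    """Pick the evaluation method matching the requested granularity."""
    method = (
        "SHOW_EVALUATION_METRICS"            # one set of metrics per class
        if class_level_metrics
        else "SHOW_GLOBAL_EVALUATION_METRICS"  # metrics aggregated over classes
    )
    return f"CALL {model_fqn}!{method}()"


print(evaluation_sql("DB.PUBLIC.CHURN_MODEL", class_level_metrics=True))
```

Both calls rely on evaluation having been enabled at training time, which is why Data Split must be set to SPLIT when creating the model.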
Evaluate Forecast
Description
This component returns evaluation metrics for a pre-trained forecast model (obtained with the Get Model by Name or Create Forecasting Model component).
For more details, please refer to the official Snowflake documentation for the !SHOW_EVALUATION_METRICS function.
Inputs
Model table: the pre-trained forecast model.
Input table (optional): additional out-of-sample data to compute the metrics on.
Settings
Compute metrics on additional out-of-sample data: When checked, the component will compute cross-validation metrics on additional out-of-sample data. Otherwise, the component will return the metrics generated at training time.
Time Series ID Column: Column containing a unique identifier per time series. Only applies when the metrics are being computed on additional out-of-sample data.
Timestamp Column: Column containing the series' timestamp in DATE or DATETIME format. Only applies when the metrics are being computed on additional out-of-sample data.
Target Column: Column to use as label in the input data. Only applies when the metrics are being computed on additional out-of-sample data.
Prediction Interval: Expected confidence of the prediction interval. Only applies when the metrics are being computed on additional out-of-sample data.
Outputs
Output table: The model's evaluation metrics.
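The difference between training-time metrics and metrics on additional out-of-sample data can be sketched as two shapes of the same call. This is an assumption-laden illustration: forecast_evaluation_sql is a hypothetical name, the 0.95 default is a placeholder, and the argument names are assumed to match Snowflake's documentation for the forecast model's !SHOW_EVALUATION_METRICS method, so they should be verified there.

```python
def forecast_evaluation_sql(model_fqn, out_of_sample=False, input_table=None,
                            series_column=None, timestamp_column=None,
                            target_column=None, prediction_interval=0.95):
    """Build a CALL <model>!SHOW_EVALUATION_METRICS(...) statement (illustrative)."""
    if not out_of_sample:
        # No arguments: return the metrics generated at training time.
        return f"CALL {model_fqn}!SHOW_EVALUATION_METRICS()"
    required = (input_table, series_column, timestamp_column, target_column)
    if not all(required):
        raise ValueError("input table and all three column names are required")
    args = [
        f"INPUT_DATA => SYSTEM$REFERENCE('TABLE', '{input_table}')",
        f"SERIES_COLNAME => '{series_column}'",
        f"TIMESTAMP_COLNAME => '{timestamp_column}'",
        f"TARGET_COLNAME => '{target_column}'",
        f"CONFIG_OBJECT => {{'prediction_interval': {prediction_interval}}}",
    ]
    return f"CALL {model_fqn}!SHOW_EVALUATION_METRICS({', '.join(args)})"


print(forecast_evaluation_sql("DB.PUBLIC.SALES_MODEL"))
```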
Feature Importance (Classification)
This component displays the feature importances per variable of a pre-trained classification model.
For more details, please refer to Snowflake's !SHOW_FEATURE_IMPORTANCE function documentation.
Inputs
Model table: the pre-trained classification model.
Outputs
Output table: a table with the feature importance per variable.
Feature Importance (Forecast)
This component displays the feature importances per variable of a pre-trained forecast model.
For more details, please refer to Snowflake's !SHOW_FEATURE_IMPORTANCE function documentation.
Inputs
Model table: the pre-trained forecast model.
Outputs
Output table: a table with the feature importance per variable.