Snowflake ML
Extension Package provided by CARTO
The Snowflake ML extension package for CARTO Workflows includes a variety of components that enable users to integrate machine learning workflows with geospatial data. These components allow for creating, evaluating, explaining, forecasting, and managing ML models directly within CARTO Workflows, utilizing Snowflake ML’s capabilities.
Description
This component imports a pre-trained model into the current workflow. If the name provided is not fully qualified, it will default to the connection's default database and the PUBLIC schema. The component assumes that the provided FQN points to an existing Snowflake ML model.
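The defaulting rule above can be sketched in a few lines of Python. This is an illustration only, not the component's implementation: the function name `qualify_model_name` and its parameters are hypothetical, quoted identifiers containing dots are not handled, and two-part names are rejected because the text does not say how they are resolved.

```python
def qualify_model_name(name: str, default_database: str,
                       default_schema: str = "PUBLIC") -> str:
    """Return a three-part DATABASE.SCHEMA.MODEL name (hypothetical helper)."""
    parts = name.split(".")
    if len(parts) == 3:
        # Already fully qualified: use it unchanged.
        return name
    if len(parts) == 1:
        # Bare model name: fall back to the default database and PUBLIC schema.
        return f"{default_database}.{default_schema}.{name}"
    # Assumption: the documentation does not describe partially qualified names.
    raise ValueError(f"Cannot resolve partially qualified name: {name!r}")
```

For example, `qualify_model_name("MY_MODEL", "ANALYTICS")` yields `ANALYTICS.PUBLIC.MY_MODEL`, while a name that already has three parts is returned as given.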
Settings
Model's FQN: Fully qualified name for the model to be imported.
Outputs
Output table: This component generates a single-row table with the FQN of the imported model.
Description
This component trains a classification model on the provided input data. If the name provided is not fully qualified, it will default to the connection's default database and the PUBLIC schema.
For more details, please refer to the official documentation in Snowflake.
Inputs
Input table: A data table that is used as input for the model creation.
Settings
Model's FQN: Fully qualified name for the model to be saved as.
ID Column: Column containing a unique identifier per sample.
Target Column: Column to be used as label in the training data.
Data Split: whether to perform a train/evaluation split on the data. Choosing SPLIT is required to use the Evaluate component with the resulting model.
Test Fraction: Fraction of the data to reserve for evaluation. For example, 0.2 reserves 20% of the data for evaluation. Only applies if Data Split is SPLIT.
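To make the Test Fraction setting concrete, here is a minimal Python sketch of a train/evaluation split. It is an assumption-laden illustration: the function `train_eval_split`, the seeded shuffle, and the rounding of the evaluation size are hypothetical choices and may differ from how the split is actually performed in Snowflake.

```python
import random
from typing import List, Sequence, Tuple


def train_eval_split(ids: Sequence[int], test_fraction: float,
                     seed: int = 0) -> Tuple[List[int], List[int]]:
    """Split sample IDs into (train, evaluation) lists (hypothetical helper)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(ids)
    # Seeded shuffle so the illustration is reproducible.
    random.Random(seed).shuffle(shuffled)
    # Assumption: the evaluation set size is rounded to the nearest integer.
    n_eval = round(len(shuffled) * test_fraction)
    return shuffled[n_eval:], shuffled[:n_eval]


train_ids, eval_ids = train_eval_split(range(100), test_fraction=0.2)
```

With 100 samples and a Test Fraction of 0.2, 20 IDs end up in the evaluation set and the remaining 80 are used for training.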
Outputs
Output table: This component generates a single-row table with the FQN of the trained model.
Description
This component trains a forecasting model on the provided input data. If the name provided is not fully qualified, it will default to the connection's default database and the PUBLIC schema.
Inputs
Input table: A data table that is used as input for the model creation.
Settings
Model's FQN: Fully qualified name for the model to be saved as.
Time Series ID Column: Column containing a unique identifier per time series.
Timestamp Column: Column containing the series' timestamp in DATE or DATETIME format.
Target Column: Column to be used as target in the training data.
Consider exogenous variables: whether to consider exogenous variables for forecasting. If checked, the future values for the variables must be provided when forecasting. All variables in the input will be considered except the specified time series ID, timestamp and target column.
Method: which method to use when fitting the model. It can be best or fast.
Sample Frequency: the frequency of the time series. It can be auto or manual.
Period: number of units to define the sampling frequency. Only applies when the Sample Frequency has been set to manual.
Time Unit: time unit used to define the frequency. It can be seconds, minutes, hours, days, weeks, months, quarters, or years. Only applies when the Sample Frequency has been set to manual.
Aggregation (categorical): aggregation function used for categorical columns if needed due to the sampling frequency. It can be mode, first, or last.
Aggregation (numeric): aggregation function used for numeric columns if needed due to the sampling frequency. It can be mean, median, mode, min, max, sum, first, or last.
Aggregation (target): aggregation function used for the target column if needed due to the sampling frequency. It can be mean, median, mode, min, max, sum, first, or last.
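The interaction between Period, Time Unit, and the aggregation settings can be illustrated with a small Python sketch. It shows the general idea of bucketing observations to a manual frequency and collapsing each bucket with an aggregation function; the helper `resample` and the epoch-aligned bucketing are assumptions, not the behavior of the component. Calendar units (months, quarters, years) are left out because they do not have a fixed length in seconds.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean, median, mode

# Aggregation names taken from the settings above.
AGGREGATIONS = {
    "mean": mean, "median": median, "mode": mode,
    "min": min, "max": max, "sum": sum,
    "first": lambda values: values[0],
    "last": lambda values: values[-1],
}
# Fixed-length units only (assumption: calendar units need separate handling).
UNIT_SECONDS = {"seconds": 1, "minutes": 60, "hours": 3600,
                "days": 86400, "weeks": 604800}
EPOCH = datetime(1970, 1, 1)


def resample(rows, period, time_unit, aggregation):
    """Bucket (timestamp, value) rows, sorted by timestamp, and aggregate each bucket."""
    step = period * UNIT_SECONDS[time_unit]
    aggregate = AGGREGATIONS[aggregation]
    buckets = defaultdict(list)
    for timestamp, value in rows:
        offset = int((timestamp - EPOCH).total_seconds())
        # Align each observation to the start of its bucket.
        bucket_start = EPOCH + timedelta(seconds=offset - offset % step)
        buckets[bucket_start].append(value)
    return [(start, aggregate(values)) for start, values in sorted(buckets.items())]


rows = [
    (datetime(2024, 1, 1, 0, 10), 1.0),
    (datetime(2024, 1, 1, 0, 50), 2.0),
    (datetime(2024, 1, 1, 1, 30), 3.0),
    (datetime(2024, 1, 1, 2, 5), 4.0),
]
two_hourly = resample(rows, period=2, time_unit="hours", aggregation="mean")
```

Here a Period of 2 with a Time Unit of hours groups the first three observations into the 00:00 bucket and the last one into the 02:00 bucket; the chosen aggregation then reduces each bucket to a single value.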
Outputs
Output table: This component generates a single-row table with the FQN of the trained model.
Description
Inputs
Model table: the pre-trained classification model.
Input table: A data table that is used as input for inference.
Settings
Keep input columns: Whether to keep all the input columns in the output table.
ID Column: Column containing a unique identifier per sample. Only applies when Keep input columns is set to false.
Outputs
Output table: The model's predictions.
Description
Inputs
Model table: the pre-trained forecast model.
Input table: A data table that is used as input for inference. Only needed if the model has been trained using exogenous variables.
Settings
Consider exogenous variables: whether the model was trained to use exogenous variables or not.
Number of periods: number of periods to forecast per time series. Only applies if Consider exogenous variables is false.
Time Series ID Column: Column containing a unique identifier per time series. Only applies if Consider exogenous variables is true.
Timestamp Column: Column containing the series' timestamp in DATE or DATETIME format. Only applies if Consider exogenous variables is true.
Prediction Interval: Expected confidence of the prediction interval.
Keep input columns: Whether to keep all the input columns in the output table.
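Which settings take effect depends on whether exogenous variables are used. The following Python sketch encodes the rules listed above; the class `ForecastSettings` and the function `applicable_settings` are hypothetical names introduced for illustration and are not part of CARTO Workflows or Snowflake.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ForecastSettings:
    """Hypothetical container mirroring the settings listed above."""
    consider_exogenous: bool
    prediction_interval: float
    keep_input_columns: bool
    number_of_periods: Optional[int] = None
    time_series_id_column: Optional[str] = None
    timestamp_column: Optional[str] = None


def applicable_settings(settings: ForecastSettings) -> Dict[str, Any]:
    """Return only the settings that apply for the chosen mode."""
    result: Dict[str, Any] = {
        "prediction_interval": settings.prediction_interval,
        "keep_input_columns": settings.keep_input_columns,
    }
    if settings.consider_exogenous:
        # With exogenous variables, the horizon comes from the input table,
        # so the column settings apply and Number of periods does not.
        result["time_series_id_column"] = settings.time_series_id_column
        result["timestamp_column"] = settings.timestamp_column
    else:
        # Assumption: a positive number of periods is expected in this mode.
        if settings.number_of_periods is None or settings.number_of_periods < 1:
            raise ValueError("number_of_periods must be a positive integer")
        result["number_of_periods"] = settings.number_of_periods
    return result
```

For a model trained without exogenous variables, only Number of periods, Prediction Interval, and Keep input columns are returned; any column names supplied are ignored.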
Outputs
Output table: The model's predictions.
Description
Inputs
Model table: the pre-trained classification model.
Settings
Class level metrics: Whether to obtain the per-class evaluation metrics or the overall evaluation metrics.
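The difference between per-class and overall metrics can be shown with a short Python sketch. It computes precision and recall for each class and then a macro average across classes. This is a generic illustration: the exact metrics returned by the component, and the way it averages them, are not stated here, so the function names and the macro-averaging choice are assumptions.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Precision and recall for each class label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, predicted in zip(y_true, y_pred):
        if truth == predicted:
            tp[truth] += 1
        else:
            fp[predicted] += 1  # predicted this class, but it was wrong
            fn[truth] += 1      # missed the true class
    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted_total = tp[label] + fp[label]
        actual_total = tp[label] + fn[label]
        metrics[label] = {
            "precision": tp[label] / predicted_total if predicted_total else 0.0,
            "recall": tp[label] / actual_total if actual_total else 0.0,
        }
    return metrics


def overall_metrics(per_class):
    """Macro average: unweighted mean of the per-class values (an assumption)."""
    n = len(per_class)
    return {
        "precision": sum(m["precision"] for m in per_class.values()) / n,
        "recall": sum(m["recall"] for m in per_class.values()) / n,
    }


by_class = per_class_metrics(["a", "a", "b", "b"], ["a", "b", "b", "b"])
overall = overall_metrics(by_class)
```

In this toy example class `a` has perfect precision but a recall of 0.5, a detail that is hidden once the values are averaged into a single overall figure.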
Outputs
Output table: The model's evaluation metrics.
Description
Inputs
Model table: the pre-trained forecast model.
Input table (optional): additional out-of-sample data to compute the metrics on.
Settings
Compute metrics on additional out-of-sample data: When checked, the component will compute cross-validation metrics on additional out-of-sample data. Otherwise, the component will return the metrics generated at training time.
Time Series ID Column: Column containing a unique identifier per time series. Only applies when the metrics are being computed on additional out-of-sample data.
Timestamp Column: Column containing the series' timestamp in DATE or DATETIME format. Only applies when the metrics are being computed on additional out-of-sample data.
Target Column: Column to use as label in the input data. Only applies when the metrics are being computed on additional out-of-sample data.
Prediction Interval: Expected confidence of the prediction interval. Only applies when the metrics are being computed on additional out-of-sample data.
Outputs
Output table: The model's evaluation metrics.
This component displays the feature importances per variable of a pre-trained classification model.
Inputs
Model table: the pre-trained classification model.
Outputs
Output table: a table with the feature importance per variable.
This component displays the feature importances per variable of a pre-trained forecast model.
Inputs
Model table: the pre-trained forecast model.
Outputs
Output table: a table with the feature importance per variable.
For more details, please refer to the official documentation in Snowflake.
This component uses a pre-trained classification model (imported or created with the corresponding components) to perform predictions on the given input data.
For more details, please refer to the official Snowflake documentation for the corresponding function.
This component uses a pre-trained forecast model (imported or created with the corresponding components) to perform predictions on the given input data.
For more details, please refer to the official Snowflake documentation for the corresponding function.
This component returns evaluation metrics for a pre-trained classification model (imported or created with the corresponding components).
For more details, please refer to the official Snowflake documentation for the corresponding functions.
This component returns evaluation metrics for a pre-trained forecast model (imported or created with the corresponding components).
For more details, please refer to the official Snowflake documentation for the corresponding function.
For more details, please refer to Snowflake's function documentation.