Custom

Components that allow to create custom user-defined procedures and SQL queries in your workflows.

Call Procedure

Description

This component executes an arbitrary CALL SQL statement.

Use $a and $b as placeholders for the input table names (you can use one, both, or none of them), and $output as placeholder for the output table name.

The procedure that you are calling is responsible for creating the output table in the path stored in the $output variable.

The following is a valid example of code to use in this component:

CALL MY_PROCEDURE($a, 10, `$output`);

Please take into account that $a, $b and $outputwill be parsed directly as they are. In some cases, the table names that work as input/output need to be quoted differently depending on the data warehouse. Users need to ensure the correct quoting, like `$output` or "$a".

Due to Redshift limitations, calling stored procedures that have INOUT string parameters will cause an exception when running the workflow. Stored procedures with such parameters are not supported.

Additional placeholders

  • $isStoredProcedure

When using the Workflows API, your workflow is executed as a stored procedure within your data warehouse. This changes how tables are handled.

To manage this behavior, we provide the $isStoredProcedure placeholder. This will be replaced with true when the workflow is run as a stored procedure. This happens when running workflows via API.

In stored procedures:

  • Table names are not fully qualified

  • Tables are expected to be temporary and session-scoped

Your SQL code should account for this logic to avoid errors when running workflows in API mode.

Here is a sample code for that (notice that no CALL is made in this case, since the component can run any arbitrary SQL code):

Example:

IF ($isStoredProcedure) THEN
  -- $output will be a simple table name (i.e mytable)
  CREATE TEMPORARY TABLE $output AS SELECT * FROM $a LIMIT 10;
ELSE
  -- $output will be a fully qualified name (i.e database.schema.mytable)
  CREATE TABLE $output AS SELECT * FROM $a LIMIT 10;
END IF

  • $dryrun

Workflows use a dry run mode to infer and propagate table schemas across components. When executing queries in this mode, you may want to differentiate logic from a standard run.

To handle this, use the $dryrun placeholder, which is replaced by true when the workflow is being dry-run (e.g., when clicking "Validate schema").

Example:

IF ($dryrun) THEN
  -- During dry run: propagate schema so other components can recognize output
  CREATE TABLE `$output` AS SELECT * FROM `$a`;
ELSE
  -- During normal run: actually create output tables
  CREATE TABLE `$output` AS SELECT * FROM `$a`;

  -- During normal run: write to a specific target table for post-processing
  CREATE OR REPLACE TABLE `project.schema.specific_table` AS SELECT * FROM `$a`;
END IF;

Inputs

  • Source table [Table]

  • Source table [Table]

  • SQL CALL statement [String]

Outputs

  • Result table [Table]

Custom Python Code

Description

This component runs arbitrary Python code.

Inputs

  1. Source table: Use df_ain your code to reference a dataframe that contains the data in the input. Use $a to reference the FQN of the input table in your code.

  2. Source table: Use df_bin your code to reference a dataframe that contains the data in the input. Use $b to reference the FQN of the input table in your code.

  3. Source table: Use df_cin your code to reference a dataframe that contains the data in the input. Use $c to reference the FQN of the input table in your code.

Settings

  • Python code: Define your custom python code that will be executed by the component.

Outputs

  • Output table: Use df_outputin your code to reference a dataframe that contains the data in the output. Use $output to reference the FQN of the output table in your code.

Custom SQL Select

Description

This component executes an arbitrary SELECT SQL statement.

A new table will be created from the result of the entered statement

Use $a, $b and $c as placeholders for the input table names (you can use one, all, or none of them).

The following is a valid example of code to use in this component, which will generate a new table with an added area column (based on a column named 'geom'), with only the rows corresponding to the five largest geometries:

SELECT *, ST_AREA(geom) AS area FROM `$a` ORDER BY area LIMIT 5

Please take into account that $a, $b and $cwill be parsed directly as they are. In some cases, they will need to be quoted differently depending on the data warehouse. Users need to ensure the correct quoting, like '$a' or "$a".

When using Workflows' variables in your Custom SQL code, remember that they don't need to be wrapped into curly brackets like {{ @var_name}}. Variables can be used directly as @var_name inside your Custom SQL code.

Inputs

  • Source table [Table]

  • Source table [Table]

  • SQL SELECT statement [String]

Outputs

  • Result table [Table]

Last updated

Was this helpful?