Custom

Components that allow to create custom user-defined procedures and SQL queries in your workflows.

Call Procedure

Description

This component executes an arbitrary CALL SQL statement.

Use $a and $b as placeholders for the input table names (you can use one, both, or none of them), and $output as placeholder for the output table name.

The procedure that you are calling is responsible for creating the output table in the path stored in the $output variable.

The following is a valid example of code to use in this component:

CALL MY_PROCEDURE($a, 10, `$output`);

Please take into account that $a, $b and $outputwill be parsed directly as they are. In some cases, the table names that work as input/output need to be quoted differently depending on the data warehouse. Users need to ensure the correct quoting, like `$output` or "$a".

Due to Redshift limitations, calling stored procedures that have INOUT string parameters will cause an exception when running the workflow. Stored procedures with such parameters are not supported.

An additional placeholder _$isStoredProcedure$ will be replaced by true when the workflow is being executed as a stored procedure (which happens when is executed trhough the API or as a scheduled workflow). When running in a stored procedure, table names are not fully qualified, and are supposed to be used to create session-wide temporary tables, instead of regular tables. Your code should have the corresponding logic to handle this situation. Otherwise, errors might appear when running the workflow in API mode.

Here is a sample code for that (notice that no CALL is made in this case, since the component can run any arbitrary SQL code):

IF ($isStoredProcedure) THEN
  -- $output will be a simple table name (i.e mytable)
  CREATE TEMPORARY TABLE $output AS SELECT * FROM $a LIMIT 10;
ELSE
  -- $output will be a fully qualified name (i.e database.schema.mytable)
  CREATE TABLE $output AS SELECT * FROM $a LIMIT 10;
END IF

Inputs

  • Source table [Table]

  • Source table [Table]

  • SQL CALL statement [String]

Outputs

  • Result table [Table]

Custom Python Code

Description

This component runs arbitrary Python code.

Inputs

  1. Source table: Use df_ain your code to reference a dataframe that contains the data in the input. Use $a to reference the FQN of the input table in your code.

  2. Source table: Use df_bin your code to reference a dataframe that contains the data in the input. Use $b to reference the FQN of the input table in your code.

  3. Source table: Use df_cin your code to reference a dataframe that contains the data in the input. Use $c to reference the FQN of the input table in your code.

Settings

  • Python code: Define your custom python code that will be executed by the component.

Outputs

  • Output table: Use df_outputin your code to reference a dataframe that contains the data in the output. Use $output to reference the FQN of the output table in your code.

Custom SQL Select

Description

This component executes an arbitrary SELECT SQL statement.

A new table will be created from the result of the entered statement

Use $a, $b and $c as placeholders for the input table names (you can use one, all, or none of them).

The following is a valid example of code to use in this component, which will generate a new table with an added area column (based on a column named 'geom'), with only the rows corresponding to the five largest geometries:

SELECT *, ST_AREA(geom) AS area FROM `$a` ORDER BY area LIMIT 5

Please take into account that $a, $b and $cwill be parsed directly as they are. In some cases, they will need to be quoted differently depending on the data warehouse. Users need to ensure the correct quoting, like '$a' or "$a".

When using Workflows' variables in your Custom SQL code, remember that they don't need to be wrapped into curly brackets like {{ @var_name}}. Variables can be used directly as @var_name inside your Custom SQL code.

Please avoid trailing semi-colon ;in your Custom SQL code as it might cause unexpected behavior.

Inputs

  • Source table [Table]

  • Source table [Table]

  • SQL SELECT statement [String]

Outputs

  • Result table [Table]

Last updated