# Scheduling workflows

Once you have designed a workflow, you might want to execute it periodically. Scheduling keeps the result of a workflow automatically up to date when its input data sources change between one execution and the next. For this purpose, CARTO offers a UI that helps you create a scheduled query in the different data warehouses.

Keep reading this page to learn more about scheduling the execution of a workflow.

### Creating a schedule

Once a workflow has been executed successfully, you can click on the button next to 'Run':

<figure><img src="https://3029946802-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FybPdpmLltPkzGFvz7m8A%2Fuploads%2Fgit-blob-45cb9fdd13aec6b752abf9a5597f07d98dc460f2%2FScreenshot%202023-07-31%20at%2010.29.39.png?alt=media" alt="" width="175"><figcaption></figcaption></figure>

{% hint style="info" %}
Check the provider-specific information to learn more about the permissions and prerequisites that are needed to enable this button.
{% endhint %}

Clicking on that button will open a new modal window where you can select the frequency and other settings for the schedule:

<figure><img src="https://3029946802-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FybPdpmLltPkzGFvz7m8A%2Fuploads%2Fgit-blob-65636247663a5b719dbab02f0e171e04b7f6e1d5%2FScreenshot%202023-07-31%20at%2010.52.21.png?alt=media" alt=""><figcaption></figcaption></figure>

* **Hours:** The workflow will be executed every X hours, on the hour.
* **Days:** The workflow will be executed every day at a specific time.
* **Weeks:** The workflow will be executed weekly, on a specific day, at a specific time.
* **Months:** The workflow will be executed monthly, on a specific day, at a specific time.
* **Custom:** Use a custom expression to define the schedule. See below for more information.

{% hint style="info" %}
**Custom expressions**

Please note that CARTO Workflows supports [BigQuery syntax](https://cloud.google.com/appengine/docs/flexible/scheduling-jobs-with-cron-yaml#formatting_the_schedule) for BigQuery and CARTO Data Warehouse connections. For Snowflake, PostgreSQL, and Oracle, [Cron syntax](https://crontab.guru/) is supported. For Databricks connections, [Quartz cron syntax](https://www.quartz-scheduler.org/documentation/quartz-2.3.0/tutorials/crontrigger.html) is supported.
{% endhint %}
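As an illustration, standard cron expressions like the following could be used for Snowflake or PostgreSQL connections (the schedules themselves are hypothetical examples, in `minute hour day-of-month month day-of-week` order):

```
0 * * * *     every hour, on the hour
30 2 * * 1    every Monday at 02:30
0 0 1 * *     on the first day of each month, at midnight
```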

### Updating, modifying and deleting a schedule

Once a workflow has been scheduled, there are different actions that can be performed:

* If your workflow has been modified, after a successful execution you will find this small notification.

<figure><img src="https://3029946802-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FybPdpmLltPkzGFvz7m8A%2Fuploads%2Fgit-blob-22413b52b2aeaa8ec7666a804958ad76a2ad9d88%2FScreenshot%202023-07-31%20at%2013.43.51.png?alt=media" alt="" width="134"><figcaption></figcaption></figure>

* Click on the schedule icon to sync the latest changes to the scheduled version of your workflow.

<figure><img src="https://3029946802-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FybPdpmLltPkzGFvz7m8A%2Fuploads%2Fgit-blob-31c01d075463fa9c0591c918044523f7b459f948%2FScreenshot%202023-07-31%20at%2013.44.39.png?alt=media" alt="" width="375"><figcaption></figcaption></figure>

* By clicking on the schedule icon at any time, you will be able to modify the frequency at which the workflow will be executed. Select a different setting and click on '*Save changes*'.
* Likewise, by clicking on the schedule icon at any time, you will find a '*Delete schedule*' button that permanently disables the schedule.

## Permissions and prerequisites <a href="#marked-bigquery" id="marked-bigquery"></a>

### BigQuery <a href="#marked-bigquery" id="marked-bigquery"></a>

In BigQuery, scheduled tasks are stored in the project, which allows creating tasks in different regions.

{% hint style="warning" %}
Users of [Workload Identity Federation](https://docs.carto.com/connections/bigquery#using-workload-identity-federation) **must set up a service account** to run scheduled workflows, as federated identities cannot run them directly. If you use WIF, you must configure service account impersonation and ensure that the service account has the appropriate BigQuery permissions.
{% endhint %}

#### Permissions <a href="#marked-permissions" id="marked-permissions"></a>

The connection requires the following roles:

* BigQuery User
* BigQuery Data Editor

The project used in the connection will need to have this API enabled:

* BigQuery Data Transfer API: <https://console.cloud.google.com/apis/library/bigquerydatatransfer.googleapis.com>

{% hint style="info" %}
In the [Connection settings](https://docs.carto.com/carto-user-manual/connections/creating-a-connection#connection-to-bigquery), if the selected billing project is different from the Service Account project, the user must run this operation to enable Cross-project Service Account Authorization:

<https://cloud.google.com/bigquery/docs/enable-transfer-service#cross-project_service_account_authorization>
{% endhint %}

#### Data Transfer Version Info <a href="#data-transfer-version-info" id="data-transfer-version-info"></a>

For connections created using 'Sign in with Google', a Data Transfer Version Info code needs to be provided.

{% hint style="info" %}
In order to create a Data Transfer Version Info code, you need to [**visit this page**](https://www.gstatic.com/bigquerydatatransfer/oauthz/auth?client_id=433065040935-hav5fqnc9p9cht3rqneus9115ias2kn1.apps.googleusercontent.com\&scope=https://www.googleapis.com/auth/bigquery%20https://www.googleapis.com/auth/drive\&redirect_uri=urn:ietf:wg:oauth:2.0:oob\&response_type=version_info).
{% endhint %}

Once there, select your Google account and Allow access:

<figure><img src="https://3029946802-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FybPdpmLltPkzGFvz7m8A%2Fuploads%2Fgit-blob-bbe076a20993260ca4e5874ec5968bc6c480cfce%2FScreenshot%202024-03-21%20at%2012.22.07.png?alt=media" alt="" width="375"><figcaption></figcaption></figure>

After allowing, you will be provided a code. Copy it and paste it into the Data Transfer Version Info setting in the connection's [Advanced Options](https://docs.carto.com/carto-user-manual/connections/bigquery#advanced-options).

#### Monitoring <a href="#marked-monitoring" id="marked-monitoring"></a>

The scheduled queries can be monitored from the BigQuery console, in the "Scheduled queries" section. Notice that the name of the scheduled query will be `"carto_scheduler.workflows.<id>"`. The id of the workflow is a UUID that can be found in the URL.

### Snowflake <a href="#marked-snowflake" id="marked-snowflake"></a>

In Snowflake, scheduled tasks are created in a schema; in this case, CARTO uses the `workflows_temp` schema. If the schema is removed, the task will be removed too.

Snowflake makes use of [**Tasks**](https://docs.snowflake.com/en/user-guide/tasks-intro) to allow scheduled execution of SQL code.\
Tasks can be created via SQL statements using the [CREATE TASK](https://docs.snowflake.com/en/sql-reference/sql/create-task) syntax.
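As a rough sketch of what such a task looks like (all names and the schedule below are hypothetical; the actual task is created and managed by CARTO):

```sql
-- Hypothetical example: run a query every day at 02:00 UTC
CREATE TASK my_schema.my_task
  WAREHOUSE = my_warehouse
  SCHEDULE = 'USING CRON 0 2 * * * UTC'
AS
  INSERT INTO my_schema.results SELECT * FROM my_schema.source;

-- Tasks are created suspended; they must be resumed to start running
ALTER TASK my_schema.my_task RESUME;
```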

#### Grant permissions <a href="#marked-permissions-1" id="marked-permissions-1"></a>

The connection requires access to the `workflows_temp` schema, as well as the global `EXECUTE TASK` privilege to be able to start a task.

```sql
GRANT ALL PRIVILEGES ON SCHEMA <schema> TO ROLE <role>;
GRANT EXECUTE TASK ON ACCOUNT TO ROLE <role>;
```

#### **Revoke permissions**

```sql
REVOKE ALL PRIVILEGES ON SCHEMA <schema> FROM ROLE <role>;
REVOKE EXECUTE TASK ON ACCOUNT FROM ROLE <role>;
```

#### Monitoring <a href="#marked-monitoring-1" id="marked-monitoring-1"></a>

The created tasks can be monitored from "Data > Databases > MYDATABASE > MYSCHEMA > Tasks". Notice that the name of the scheduled task will be `"carto_scheduler.workflows.<id>"`. The id of the workflow is a UUID that can be found in the URL.
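The same information can also be inspected with SQL (assuming the connection's database is `my_database`; the schema name matches the one used by the connection):

```sql
-- List the tasks created in the connection's schema
SHOW TASKS IN SCHEMA my_database.workflows_temp;

-- Review recent task executions
SELECT *
FROM TABLE(INFORMATION_SCHEMA.TASK_HISTORY())
ORDER BY SCHEDULED_TIME DESC;
```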

### Databricks

When scheduling a CARTO Workflow on a Databricks connection, a [native Workflow will be created in Databricks](https://docs.databricks.com/en/jobs/index.html) with a specific run schedule. In order to do so, your user will need the "Can Create" and "Can Manage" permissions for Jobs.

#### **"Can Create" Permissions**

This permission allows a user to **create new jobs**. Specifically, users with this permission can:

* Define new jobs/workflows with tasks and configurations.
* Attach tasks to clusters they have access to.
* Set parameters like schedules, libraries, and task dependencies during job creation.

#### **"Can Manage" Permissions**

This permission provides **full control** over a job and is the highest level of access. Users with **"Can Manage"** permissions can:

* Create, edit, delete, and clone jobs.
* Modify schedules, task configurations, and parameters.
* Manage job runs (e.g., start, stop, or restart runs).
* Assign or modify permissions for the job, delegating "Can Edit" or "Can View" permissions to others.

### Oracle <a href="#marked-oracle" id="marked-oracle"></a>

For Oracle connections, scheduled tasks are created using the [DBMS\_SCHEDULER](https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_SCHEDULER.html) package. When scheduling a workflow, a job is created in Oracle's scheduler that periodically executes the workflow's SQL.
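Conceptually, the job created resembles the following `DBMS_SCHEDULER` call (the job name, action, and schedule below are hypothetical; the actual job is created and managed by CARTO):

```sql
BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'MY_WORKFLOW_JOB',          -- hypothetical job name
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'BEGIN my_workflow_proc; END;',
    repeat_interval => 'FREQ=DAILY; BYHOUR=2',     -- every day at 02:00
    enabled         => TRUE
  );
END;
/
```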

#### Grant permissions <a href="#marked-permissions-oracle" id="marked-permissions-oracle"></a>

The `DBMS_SCHEDULER` package must be available in the database, and the user configured in the connection requires:

* The `CREATE JOB` privilege to create and manage scheduled jobs.
* Access to the `user_scheduler_jobs` dictionary view to list and monitor existing schedules.

```sql
GRANT CREATE JOB TO <user>;
```

{% hint style="info" %}
The `user_scheduler_jobs` view is accessible by default to any user that owns scheduler jobs. In locked-down environments where access to dictionary views has been restricted, you may need to grant `SELECT` on it explicitly.
{% endhint %}

#### **Revoke permissions**

```sql
REVOKE CREATE JOB FROM <user>;
```

#### Monitoring <a href="#marked-monitoring-oracle" id="marked-monitoring-oracle"></a>

The created jobs can be monitored by querying Oracle's scheduler views:

```sql
-- Existing jobs
SELECT * FROM user_scheduler_jobs;

-- Job execution history
SELECT * FROM user_scheduler_job_run_details;
```

Notice that the name of the scheduled query will be `"carto_scheduler.workflows.<id>"`. The id of the workflow is a UUID that can be found in the URL.

### PostgreSQL <a href="#marked-postgres" id="marked-postgres"></a>

For PostgreSQL, the [`pg_cron`](https://github.com/citusdata/pg_cron) extension provides the means to schedule the execution of a SQL query. Here are some resources to learn how to leverage `pg_cron` on different managed services based on PostgreSQL:

* Amazon RDS. Usage guide available [here](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/PostgreSQL_pg_cron.html).
* Amazon Aurora PostgreSQL supports `pg_cron` since [version 12](https://docs.aws.amazon.com/AmazonRDS/latest/AuroraPostgreSQLReleaseNotes/AuroraPostgreSQL.Extensions.html#AuroraPostgreSQL.Extensions.12).
* Google Cloud SQL. Extension setup instructions [here](https://cloud.google.com/sql/docs/postgres/extensions#pg_cron).
* Azure Databases for PostgreSQL supports `pg_cron` since [version 11](https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/concepts-extensions#postgres-11-extensions) and provides [some usage examples](https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/concepts-extensions#pg_cron).
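For reference, this is roughly how `pg_cron` schedules and unschedules a query (the job name and the scheduled SQL below are illustrative):

```sql
-- Schedule a query to run every night at 02:00
SELECT cron.schedule('nightly-workflow', '0 2 * * *',
                     $$SELECT my_workflow_function()$$);

-- Remove the job when it is no longer needed
SELECT cron.unschedule('nightly-workflow');
```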

#### Grant permissions <a href="#marked-permissions-2" id="marked-permissions-2"></a>

It requires the `pg_cron` extension to be installed and available, as well as access to the `cron` schema.

```sql
CREATE EXTENSION pg_cron;
GRANT USAGE ON SCHEMA cron TO postgres;
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA cron TO postgres;
```

#### **Revoke permissions**

```sql
REVOKE USAGE ON SCHEMA cron FROM postgres;
REVOKE EXECUTE ON ALL FUNCTIONS IN SCHEMA cron FROM postgres;
DROP EXTENSION pg_cron;
```

#### Monitoring <a href="#marked-monitoring-2" id="marked-monitoring-2"></a>

The created jobs can be monitored by running SQL queries in the database:

```sql
-- Existing jobs
SELECT * FROM cron.job;

-- Executed jobs
SELECT * FROM cron.job_run_details;
```

Notice that the name of the scheduled query will be `"carto_scheduler.workflows.<id>"`. The id of the workflow is a UUID that can be found in the URL.

## Schedule a Workflow in Redshift

Redshift doesn't offer an API to manage scheduled queries, which prevents CARTO from scheduling the execution of a workflow directly from the UI. However, you can schedule workflows on Redshift by following these simple steps:

1. Go to the "SQL" tab of the results panel of your workflow.
2. Click on the "Copy" button at the top right of the SQL tab to copy the SQL code corresponding to your entire workflow.
3. Redshift allows creating a schedule to run SQL statements using the query editor.\
   There are more details about how to use this feature [**here**](https://docs.aws.amazon.com/redshift/latest/mgmt/query-editor-schedule-query.html).
