Orchestration API

Orchestration API enables Cube to work with data orchestration tools and lets them push changes from upstream data sources to Cube, as opposed to letting Cube pull changes from upstream data sources via the scheduledRefresh configuration option of pre-aggregations.

Orchestration API can be used to implement both embedded analytics and internal or self-serve business intelligence use cases. When implementing real-time analytics, consider pulling data from upstream data sources with lambda pre-aggregations.

Under the hood, the Orchestration API is exposed via the /v1/pre-aggregations/jobs endpoint of the REST API.

Supported tools

Orchestration API has integration packages to work with popular data orchestration tools. Check the following guides to get tool-specific instructions:

Apache Airflow

Dagster

Prefect

Configuration

Orchestration API is enabled by default but inaccessible due to the default API scopes configuration. To allow access to the Orchestration API, enable the jobs scope, e.g., by setting the CUBEJS_DEFAULT_API_SCOPES environment variable to meta,data,graphql,jobs.
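As a minimal sketch of the step above, the following helper adds the jobs scope to an existing comma-separated scope list and renders a line for an environment file. The helper names with_jobs_scope and env_line are hypothetical and exist only for illustration; the variable name and the scope values come from the text above.

```python
# Hypothetical helpers for a deployment script; not part of Cube itself.
ENV_VAR = "CUBEJS_DEFAULT_API_SCOPES"


def with_jobs_scope(current: str) -> str:
    """Return the scope list with "jobs" appended if it is missing."""
    scopes = [s.strip() for s in current.split(",") if s.strip()]
    if "jobs" not in scopes:
        scopes.append("jobs")
    return ",".join(scopes)


def env_line(current: str) -> str:
    """Render a KEY=value line suitable for a .env file."""
    return f"{ENV_VAR}={with_jobs_scope(current)}"


print(env_line("meta,data,graphql"))
```

The helper is idempotent, so running it against a scope list that already contains jobs leaves the list unchanged.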

Building pre-aggregations

Orchestration API allows you to trigger pre-aggregation builds programmatically. This is useful for data orchestration tools that push changes from upstream data sources to Cube, and for any third party that needs to invalidate and rebuild pre-aggregations on demand.

You can trigger pre-aggregation builds and check build statuses using the /v1/pre-aggregations/jobs endpoint. It is possible to rebuild all pre-aggregations or specify the ones to be rebuilt:

Particular pre-aggregations.
Pre-aggregations that reference particular cubes.
Pre-aggregations that reference cubes from particular data sources.
For partitioned pre-aggregations, only partitions that contain data from a particular date range.
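The selection options above could be expressed in a request body along the following lines. This is a sketch under stated assumptions, not a definitive client: the body field names (action, selector, contexts, timezones, preAggregations, cubes, datasources, dateRange, tokens) are assumptions that should be checked against the REST API reference, and the base URL, token, and helper names are hypothetical. The code only prepares requests and does not send them.

```python
import json
import urllib.request

JOBS_PATH = "/v1/pre-aggregations/jobs"  # endpoint named in this document


def build_trigger_payload(pre_aggregations=None, cubes=None,
                          data_sources=None, date_range=None,
                          timezones=("UTC",)):
    """Build a body asking Cube to build the selected pre-aggregations."""
    # All field names here are assumptions; verify them in the REST API reference.
    selector = {
        "contexts": [{"securityContext": {}}],
        "timezones": list(timezones),
    }
    if pre_aggregations:
        selector["preAggregations"] = list(pre_aggregations)
    if cubes:
        selector["cubes"] = list(cubes)
    if data_sources:
        selector["datasources"] = list(data_sources)
    if date_range:
        selector["dateRange"] = list(date_range)
    return {"action": "post", "selector": selector}


def build_status_payload(tokens):
    """Build a body asking for the status of previously triggered jobs."""
    return {"action": "get", "tokens": list(tokens)}


def build_request(base_url, token, payload):
    """Prepare (but do not send) a POST request to the jobs endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + JOBS_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": token, "Content-Type": "application/json"},
        method="POST",
    )


# Example: rebuild only partitions of "orders" pre-aggregations for January 2024.
payload = build_trigger_payload(cubes=["orders"],
                                date_range=["2024-01-01", "2024-01-31"])
request = build_request("http://localhost:4000/cubejs-api", "YOUR_TOKEN", payload)
```

To actually submit the job, the prepared request can be passed to urllib.request.urlopen against a running deployment whose API scopes include jobs. Omitting all selection arguments leaves the selector unrestricted, which corresponds to rebuilding all pre-aggregations.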
