Querying concurrency

All queries to data APIs are processed asynchronously via a query queue. This helps balance the load on data sources and improves query performance.

Query queue

The query queue deduplicates queries to API instances and insulates upstream data sources from query spikes. It also executes queries to data sources concurrently for increased performance.
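To illustrate the two behaviors described above (deduplication and bounded concurrency), here is a minimal sketch in Python. It is not Cube's implementation: the QueryQueue class, the fake_data_source function, and the demo function are hypothetical names introduced only for this example.

```python
import asyncio
from typing import Awaitable, Callable, Dict


class QueryQueue:
    """Toy queue: shares identical in-flight queries and caps concurrency."""

    def __init__(self, concurrency: int) -> None:
        # At most `concurrency` queries reach the data source at once.
        self._slots = asyncio.Semaphore(concurrency)
        self._in_flight: Dict[str, "asyncio.Future[str]"] = {}

    async def execute(self, query: str, run: Callable[[str], Awaitable[str]]) -> str:
        task = self._in_flight.get(query)
        if task is None:
            # First request for this query: schedule it.
            task = asyncio.ensure_future(self._run(query, run))
            self._in_flight[query] = task
        # Later identical requests simply wait for the same task.
        return await task

    async def _run(self, query: str, run: Callable[[str], Awaitable[str]]) -> str:
        try:
            async with self._slots:
                return await run(query)
        finally:
            self._in_flight.pop(query, None)


async def demo() -> dict:
    calls = []

    async def fake_data_source(query: str) -> str:
        # Stand-in for a real data source (hypothetical).
        calls.append(query)
        await asyncio.sleep(0.01)
        return f"result of {query}"

    queue = QueryQueue(concurrency=2)
    results = await asyncio.gather(
        queue.execute("SELECT 1", fake_data_source),
        queue.execute("SELECT 1", fake_data_source),
        queue.execute("SELECT 2", fake_data_source),
    )
    return {"results": results, "data_source_calls": len(calls)}
```

Running asyncio.run(demo()) issues three requests, but the two identical ones share a single call to the data source.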

By default, Cube uses a single query queue for queries from all API instances and the refresh worker to all configured data sources.

You can read more about the query queue in this blog post.

Multiple query queues

You can use the context_to_orchestrator_id configuration option to route queries to multiple queues based on the security context.

If you're configuring multiple connections to data sources via the driver_factory configuration option, you must also configure context_to_orchestrator_id to ensure that queries are routed to the correct queues.
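As a rough sketch, the two options could be paired as follows in a Python configuration file. The tenant_id field, the CUBE_APP_ prefix, the postgres driver type, and the database naming scheme are assumptions made for illustration. In a real cube.py file, each function would be registered with the config decorator from the cube package; it is left out here so the sketch runs without Cube installed.

```python
# In cube.py these would be decorated with, for example:
#   from cube import config
#   @config('context_to_orchestrator_id')
# The decorator is omitted so the example is self-contained.


def context_to_orchestrator_id(ctx: dict) -> str:
    # One queue per tenant; `tenant_id` is an assumed security context field.
    tenant_id = ctx["securityContext"]["tenant_id"]
    return f"CUBE_APP_{tenant_id}"


def driver_factory(ctx: dict) -> dict:
    # One connection per tenant; driver type and database name are assumed.
    tenant_id = ctx["securityContext"]["tenant_id"]
    return {"type": "postgres", "database": f"tenant_{tenant_id}"}
```

Because both functions derive their result from the same security context field, queries for a given tenant always land in the queue that sits in front of that tenant's connection.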

Data sources

Cube supports various kinds of data sources, ranging from cloud data warehouses to embedded databases. Each data source scales differently, so Cube provides sensible defaults for each kind of data source out of the box.

Data source concurrency

By default, Cube uses the following concurrency settings for data sources:

Data source	Default concurrency
Amazon Athena	10
Amazon Redshift	5
Apache Pinot	10
ClickHouse	10
Databricks	10
Firebolt	10
Google BigQuery	10
Snowflake	8
All other data sources	5, or less if specified in the driver

You can use the CUBEJS_CONCURRENCY environment variable to adjust the maximum number of concurrent queries to a data source. It's recommended to use the default configuration unless you're sure that your data source can handle more concurrent queries.
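The relationship between the defaults and the override can be pictured with the following sketch. It is not Cube's actual resolution logic: the effective_concurrency function is hypothetical, the lookup keys are simply the names from the table above, and the fallback of 5 ignores the case where a driver specifies a lower value.

```python
from typing import Mapping

# Defaults copied from the table above, keyed by the names used there.
DEFAULT_CONCURRENCY = {
    "Amazon Athena": 10,
    "Amazon Redshift": 5,
    "Apache Pinot": 10,
    "ClickHouse": 10,
    "Databricks": 10,
    "Firebolt": 10,
    "Google BigQuery": 10,
    "Snowflake": 8,
}

# "All other data sources"; a driver may specify a lower value (not modeled).
FALLBACK_CONCURRENCY = 5


def effective_concurrency(data_source: str, env: Mapping[str, str]) -> int:
    """Return the override from CUBEJS_CONCURRENCY if set, else the default."""
    override = env.get("CUBEJS_CONCURRENCY")
    if override:
        return int(override)
    return DEFAULT_CONCURRENCY.get(data_source, FALLBACK_CONCURRENCY)
```

For example, passing os.environ as env would reflect whatever value is currently set for the process.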

Connection pooling

For data sources that support connection pooling, the maximum number of concurrent connections to the database can also be set by using the CUBEJS_DB_MAX_POOL environment variable. If changing this from the default, you must ensure that the new value is greater than the number of concurrent connections used by Cube's query queues and the refresh worker.
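A simple check along these lines can catch an undersized pool before deployment. The pool_is_large_enough function is hypothetical, and it assumes that the connections used by the query queues and the refresh worker simply add up; adjust the arithmetic if your deployment differs.

```python
def pool_is_large_enough(
    max_pool: int,
    queue_concurrency: int,
    refresh_worker_concurrency: int,
) -> bool:
    """Check a candidate CUBEJS_DB_MAX_POOL value against expected usage.

    Assumption: the query queues and the refresh worker each hold up to
    their own concurrency in connections, and the two totals are summed.
    """
    required = queue_concurrency + refresh_worker_concurrency
    # The pool must be strictly greater than the connections in use.
    return max_pool > required
```

With a concurrency of 5 for both the query queue and the refresh worker, a pool of 10 would fail this check while a pool of 16 would pass.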

Refresh worker

By default, the refresh worker uses the same concurrency settings as API instances. However, you can override this behavior in the refresh worker configuration.
