Connecting to data sources

Choose a data source to get started with below.

Note that Cube also supports connecting to multiple data sources out of the box.

Data warehouses

Query engines

Transactional databases

Time series databases

Streaming

Other data sources

API endpoints

Cube is designed to work with data sources that allow querying them with SQL.

Cube is not designed to access data files directly or to fetch data from REST, GraphQL, or other APIs. To use Cube with such data, either use a supported data source (e.g., use DuckDB to query Parquet files on Amazon S3) or create a custom data source driver.
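As an illustration of the DuckDB route mentioned above, the following minimal sketch builds the kind of SQL that DuckDB can run against Parquet files on Amazon S3. The helper name `parquet_select_sql`, the bucket `example-bucket`, and the `orders` prefix are hypothetical; `read_parquet` is DuckDB's table function for Parquet, and `s3://` paths rely on DuckDB's `httpfs` extension. The sketch only produces the query text; it does not connect to DuckDB or S3.

```python
def parquet_select_sql(bucket: str, prefix: str) -> str:
    """Build a DuckDB query that reads every Parquet file under an S3 prefix.

    Hypothetical helper for illustration; bucket and prefix are placeholders.
    """
    # Reject single quotes so the path cannot break out of the SQL string literal.
    for part in (bucket, prefix):
        if "'" in part:
            raise ValueError("single quotes are not allowed in the S3 path")

    # Normalize the prefix so the path has exactly one slash on each side of it.
    path = f"s3://{bucket}/{prefix.strip('/')}/*.parquet"

    # read_parquet accepts glob patterns; s3:// access needs DuckDB's httpfs extension.
    return f"SELECT * FROM read_parquet('{path}')"


print(parquet_select_sql("example-bucket", "orders"))
```

A query of this shape is SQL that a DuckDB data source can execute, which is what lets Cube work with the files without reading them itself.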

Data source drivers

Driver support

Most of the drivers for data sources are supported either directly by the Cube team or by their vendors. The rest are community-supported and are marked as such on their respective pages.

You can find the source code of the drivers that are part of the Cube distribution in cubejs-*-driver folders on GitHub.

Third-party drivers

The following drivers were contributed by the Cube community. They are not part of the Cube distribution; however, they can still be used with Cube:

You need to configure driver_factory to use a third-party driver.

Currently unsupported data sources

If you'd like to connect to a data source that is not yet listed on this page, please see the list of requested drivers and file an issue on GitHub.

You're more than welcome to contribute new drivers as well as new features and patches to existing drivers. Please check the contribution guidelines and join the #contributing-to-cube channel in our Slack community.

You can contact us to discuss an integration with a currently unsupported data source. We might be able to assist Cube Cloud users on the Enterprise Premier product tier.

Concurrency and pooling

All Cube database drivers come with presets for concurrency and pooling that work out of the box. The following information is included as a reference.

For increased performance, Cube uses multiple concurrent connections to configured data sources. The CUBEJS_CONCURRENCY environment variable controls concurrency settings for query queues and the refresh scheduler as well as the maximum concurrent connections.

For databases that support connection pooling, the maximum number of concurrent connections to the database can also be set with the CUBEJS_DB_MAX_POOL environment variable. If you change this from the default, you must ensure that the new value is greater than the number of concurrent connections used by Cube's query queues and refresh scheduler.
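The constraint above can be checked before deployment. The following is a minimal sketch, not part of Cube: the helper `check_pool_size` is hypothetical, and it assumes that the value of CUBEJS_CONCURRENCY is a reasonable stand-in for the number of connections used by the query queues and refresh scheduler. When either variable is unset, the driver presets apply and the check reports nothing.

```python
from typing import Mapping, Optional


def check_pool_size(env: Mapping[str, str]) -> Optional[str]:
    """Return a warning if CUBEJS_DB_MAX_POOL does not exceed CUBEJS_CONCURRENCY.

    Hypothetical pre-deployment check; returns None when either variable is
    unset (driver presets apply) or when the two values are consistent.
    """
    concurrency = env.get("CUBEJS_CONCURRENCY")
    max_pool = env.get("CUBEJS_DB_MAX_POOL")

    # Nothing to validate unless both values are overridden explicitly.
    if concurrency is None or max_pool is None:
        return None

    # Assumption: CUBEJS_CONCURRENCY approximates the connections Cube will use.
    if int(max_pool) <= int(concurrency):
        return (
            f"CUBEJS_DB_MAX_POOL ({max_pool}) should be greater than "
            f"CUBEJS_CONCURRENCY ({concurrency})"
        )
    return None


# Example with explicit values; pass os.environ to check the real environment.
print(check_pool_size({"CUBEJS_CONCURRENCY": "4", "CUBEJS_DB_MAX_POOL": "4"}))
```

Passing a plain mapping rather than reading `os.environ` inside the function keeps the check easy to run against a candidate configuration before it is applied.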