Documentation
SQL API

SQL API

SQL API enables Cube to deliver data over the Postgres-compatible protocol (opens in a new tab) to certain kinds of data applications, including but not limited to the following ones:

Often, the SQL API is used to enable internal or self-serve business intelligence (opens in a new tab) use cases.

Please use this GitHub issue (opens in a new tab) to suggest tools of your interest and vote for already proposed ones.

Example request

You can use the psql utility (opens in a new tab) to connect to the SQL API:

# Cube Core
PGPASSWORD=password \
  psql -h localhost \
  -p 15432 \
  -U user cube
# Cube Cloud
PGPASSWORD=password \
  psql -h awesome-ecom.sql.gcp-us-central1.cubecloudapp.dev \
  -p 5432 \
  -U cube awesome-ecom

Then, you can run queries in the Postgres dialect, just like the following one:

SELECT
  users.state,
  users.city,
  orders.status,
  MEASURE(orders.count)
FROM orders
CROSS JOIN users
WHERE
  users.state != 'us-wa'
  AND orders.created_at BETWEEN '2020-01-01' AND '2021-01-01'
GROUP BY 1, 2, 3
LIMIT 10;

You can also introspect the data model in a Postgres-native way by querying tables in information_schema (opens in a new tab) or using backslash commands (opens in a new tab):

SELECT *
FROM information_schema.tables
WHERE table_schema = 'public';
\d

Check query format for details about supported queries. Also, check SQL API reference for the list of supported SQL commands, functions, and operators.

Fundamentals

In the SQL API, each cube or view from the data model is represented as a table. Measures, dimensions, and segments are represented as columns in these tables. SQL API can execute regular queries, queries with post-processing, and queries with pushdown that can reference these tables and columns.

Under the hood, the SQL API uses Apache DataFusion (opens in a new tab) as its query engine. It's responsible for query planning and execution. As part of query planning, the SQL API also uses egg (opens in a new tab) (an e-graph term rewriting (opens in a new tab) library) to analyze incoming SQL queries and find the best query plan out of a wide variety of possible plans to execute.

Overall, query planning is a seamless process. SQL API does its best to execute a query as a regular query or a query with post-processing. If that is not possible, then the query would be executed as a query with pushdown. There are trade-offs associated with each query type:

Query typeIn-memory cachePre-aggregationsSQL support
Regular queriesβœ… Usedβœ… Used🟑 Very limited
Queries with post-processingβœ… Usedβœ… Used🟑 Limited
Queries with pushdownβœ… Used❌ Not usedβœ… Extensive

Query pushdown in the SQL API is available in public preview. Read more (opens in a new tab) in the blog.

Query pushdown is disabled by default. You should explicitly enable it. In future versions, it will be enabled by default. Also, enabling query pushdown would affect how ungrouped queries are executed; check query format for details.

Configuration

Cube Core

SQL API is disabled by default. To enable the SQL API, set CUBEJS_PG_SQL_PORT to a port number you'd like to connect to with a Postgres-compatible tool.

CredentialEnvironment variable, etc.
HostHost you're running Cube at
PortSet via CUBEJS_PG_SQL_PORT
User nameSet via CUBEJS_SQL_USER
PasswordSet via CUBEJS_SQL_PASSWORD
DatabaseAny valid string, e.g., cube

You can also use checkSqlAuth, canSwitchSqlUser, and CUBEJS_SQL_SUPER_USER to configure custom authentication.

Example

The following Docker Compose file will run Cube with the SQL API enabled on port 15432, accessible using user as the user name, password as the password, and any string as the database name:

docker-compose.yml
services:
  cube:
    image: cubejs/cube:latest
    ports:
      - 4000:4000
      - 15432:15432
    environment:
      - CUBEJS_DEV_MODE=true
      - CUBEJS_API_SECRET=SECRET
 
      - CUBEJS_DB_USER=cube
      - CUBEJS_DB_PASS=12345
      - CUBEJS_DB_HOST=demo-db-examples.cube.dev
      - CUBEJS_DB_NAME=ecom
      - CUBEJS_DB_TYPE=postgres
 
      - CUBEJS_PG_SQL_PORT=15432      # SQL API credential
      - CUBEJS_SQL_USER=user          # SQL API credential
      - CUBEJS_SQL_PASSWORD=password  # SQL API credential
    volumes:
      - .:/cube/conf

After running it with docker compose up, you can finally connect and execute an example request.

Cube Cloud

SQL API is enabled by default. To find your SQL API endpoint and credentials in Cube Cloud, go to the Overview page, click API credentials, and choose the SQL API tab.

By default, the SQL API is enabled on port 5432, the user name is cube, and a random string is generated for the password. You can customize these with CUBEJS_PG_SQL_PORT, CUBEJS_SQL_USER, and CUBEJS_SQL_PASSWORD environment variables by navigating toΒ Settings β†’ Configration.

Query planning

By default, the SQL API executes queries as regular queries or queries with post-processing. Such queries support only a limited set of SQL functions and operators, and sometimes you can get the following error: Error during rewrite: Can't detect Cube query and it may be not supported yet.

You can use the CUBESQL_SQL_PUSH_DOWN environment variable to instruct the SQL API to execute such queries as queries with pushdown.

Query pushdown in the SQL API is available in public preview. Read more (opens in a new tab) in the blog.

Query planning is a resource-intensive task, and sometimes you can get the following error: Error during rewrite: Can't find rewrite due to 10002 AST node limit reached. Use the following environment variables to allocate more resources for query planning: CUBESQL_REWRITE_MAX_NODES, CUBESQL_REWRITE_MAX_ITERATIONS, CUBESQL_REWRITE_TIMEOUT.

Streaming

By default, query results are loaded in a single batch. However, a more effective streaming mode can be used for large result sets. To enable it, set the CUBESQL_STREAM_MODE environment variable to true.

When the streaming mode is enabled, the maximum row limit does not apply to SQL API queries. They can return an unlimited number of rows.

Session limit

Each concurrent connection to the SQL API consumes some resources and attempting to establish too many connections at once can lead to an out-of-memory crash. You can use the CUBEJS_MAX_SESSIONS environment variable to adjust the session limit.