SQL API

The SQL API enables Cube to deliver data over the Postgres-compatible protocol (opens in a new tab) to a wide range of data visualization tools. In general, if an application connects to PostgreSQL (opens in a new tab) database, it can connect to Cube as well.

Unlike DAX API, MDX API, and Semantic Layer Sync, the SQL API provides a more generic way to work with Cube, limited to what the SQL query language and the Postgres protocol can offer.

The SQL API works with the following kinds of data applications:

BI tools, e.g., ThoughtSpot, Sigma, Looker Studio, enabling internal or self-serve business intelligence (opens in a new tab) use cases.
Data notebooks, e.g., Jupyter, Hex, or Deepnote.
Reverse ETL tools, e.g., Census or Hightouch.

Please use this GitHub issue (opens in a new tab) to suggest tools of your interest and vote for already proposed ones.

See SQL API reference for the list of supported SQL commands, functions, and operators. Also, check query format for details about supported queries.

Transport

The SQL API supports the following transports:

Transport	Description	When to use
Postgres	Same protocol that is used by the `psql` utility and other Postgres clients	Use by default
HTTP	JSON-based protocol that is also used by the REST API	Use when you need to run a SQL API query from an embedded analytics application and REST API is not an option

Postgres protocol

You can use the psql utility (opens in a new tab) to connect to the SQL API:

# Cube Core
PGPASSWORD=password \
  psql -h localhost \
  -p 15432 \
  -U user cube

# Cube Cloud
PGPASSWORD=password \
  psql -h awesome-ecom.sql.gcp-us-central1.cubecloudapp.dev \
  -p 5432 \
  -U cube awesome-ecom

Then, you can run queries in the Postgres dialect, just like the following one:

SELECT
  users.state,
  users.city,
  orders.status,
  MEASURE(orders.count)
FROM orders
CROSS JOIN users
WHERE
  users.state != 'us-wa'
  AND orders.created_at BETWEEN '2020-01-01' AND '2021-01-01'
GROUP BY 1, 2, 3
LIMIT 10;

You can also introspect the data model in a Postgres-native way by querying tables in information_schema (opens in a new tab) or using backslash commands (opens in a new tab):

SELECT *
FROM information_schema.tables
WHERE table_schema = 'public';

\d

HTTP protocol

You can use the curl utility to run a SQL API query over the HTTP protocol:

curl \
  -X POST \
  -H "Authorization: TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "SELECT 123 AS value UNION ALL SELECT 456 AS value UNION ALL SELECT 789 AS value"}' \
  http://localhost:4000/cubejs-api/v1/cubesql

See the /v1/cubesql endpoint reference for more details.

Fundamentals

In the SQL API, each cube or view from the data model is represented as a table. Measures, dimensions, and segments are represented as columns in these tables. SQL API can execute regular queries, queries with post-processing, and queries with pushdown that can reference these tables and columns.

Under the hood, the SQL API uses Apache DataFusion (opens in a new tab) as its query engine. It's responsible for query planning and execution. As part of query planning, the SQL API also uses egg (opens in a new tab) (an e-graph term rewriting (opens in a new tab) library) to analyze incoming SQL queries and find the best query plan out of a wide variety of possible plans to execute.

Overall, query planning is a seamless process. SQL API does its best to execute a query as a regular query or a query with post-processing. If that is not possible, then the query would be executed as a query with pushdown. There are trade-offs associated with each query type:

Query type	In-memory cache	Pre-aggregations	SQL support
Regular queries	✅ Used	✅ Used	🟡 Very limited
Queries with post-processing	✅ Used	✅ Used	🟡 Limited
Queries with pushdown	✅ Used	❌ Not used	✅ Extensive

Read more (opens in a new tab) about Query pushdown in the SQL API in the blog.

Configuration

Cube Core

SQL API is disabled by default. To enable the SQL API, set CUBEJS_PG_SQL_PORT to a port number you'd like to connect to with a Postgres-compatible tool.

Credential	Environment variable, etc.
Host	Host you're running Cube at
Port	Set via `CUBEJS_PG_SQL_PORT`
User name	Set via `CUBEJS_SQL_USER`
Password	Set via `CUBEJS_SQL_PASSWORD`
Database	Any valid string, e.g., `cube`

You can also use checkSqlAuth, canSwitchSqlUser, and CUBEJS_SQL_SUPER_USER to configure custom authentication.

Example

The following Docker Compose file will run Cube with the SQL API enabled on port 15432, accessible using user as the user name, password as the password, and any string as the database name:

docker-compose.yml

services:
  cube:
    image: cubejs/cube:latest
    ports:
      - 4000:4000
      - 15432:15432
    environment:
      - CUBEJS_DEV_MODE=true
      - CUBEJS_API_SECRET=SECRET
 
      - CUBEJS_DB_USER=cube
      - CUBEJS_DB_PASS=12345
      - CUBEJS_DB_HOST=demo-db-examples.cube.dev
      - CUBEJS_DB_NAME=ecom
      - CUBEJS_DB_TYPE=postgres
 
      - CUBEJS_PG_SQL_PORT=15432      # SQL API credential
      - CUBEJS_SQL_USER=user          # SQL API credential
      - CUBEJS_SQL_PASSWORD=password  # SQL API credential
    volumes:
      - .:/cube/conf

After running it with docker compose up, you can finally connect and execute an example request.

Cube Cloud

SQL API is enabled by default. To find your SQL API endpoint and credentials in Cube Cloud, go to the Overview page, click API credentials, and choose the SQL API tab.

By default, the SQL API is enabled on port 5432, the user name is cube, and a random string is generated for the password. You can customize these with CUBEJS_PG_SQL_PORT, CUBEJS_SQL_USER, and CUBEJS_SQL_PASSWORD environment variables by navigating to Settings → Configration.

Query planning

The SQL API executes queries as regular queries, queries with post-processing, or queries with pushdown.

Streaming

By default, query results are loaded in a single batch. However, a more effective streaming mode can be used for large result sets. To enable it, set the CUBESQL_STREAM_MODE environment variable to true.

When the streaming mode is enabled, the maximum row limit does not apply to SQL API queries. They can return an unlimited number of rows.

Session limit

Each concurrent connection to the SQL API consumes some resources and attempting to establish too many connections at once can lead to an out-of-memory crash. You can use the CUBEJS_MAX_SESSIONS environment variable to adjust the session limit.

Troubleshooting

`Can't find rewrite`

Query planning is a resource-intensive task, and sometimes you can get the following error: Error during rewrite: Can't find rewrite due to 10002 AST node limit reached. Use the following environment variables to allocate more resources for query planning: CUBESQL_REWRITE_MAX_NODES, CUBESQL_REWRITE_MAX_ITERATIONS, CUBESQL_REWRITE_TIMEOUT.

Queries Query format