Multi-fact queries

When a view includes measures from multiple root fact tables, Cube can automatically execute a multi-fact query. Instead of joining all fact tables together and risking row multiplication, Cube builds a separate aggregating subquery for each fact table and then joins the results on the common dimensions.

Multi-fact queries are powered by Tesseract, the next-generation data modeling engine . Tesseract is currently in preview. Use the CUBEJS_TESSERACT_SQL_PLANNER environment variable to enable it.

When a multi-fact query is triggered

A multi-fact query is triggered when a view has multiple root fact tables whose measures are queried together. Each distinct root fact table in the view becomes its own aggregating subquery, and the results are joined on the common dimensions shared across those facts.

Consider a data model with two fact cubes, orders and returns. Both are joined to two shared dimension tables: customers and a dates date spine:


cubes:
  - name: customers
    sql_table: customers
 
    dimensions:
      - name: id
        type: number
        sql: id
        primary_key: true
      - name: name
        type: string
        sql: name
      - name: city
        type: string
        sql: city
 
  - name: dates
    sql_table: dates
 
    dimensions:
      - name: date
        type: time
        sql: date
        primary_key: true
 
  - name: orders
    sql_table: orders
 
    joins:
      - name: customers
        relationship: many_to_one
        sql: "{orders}.customer_id = {customers.id}"
      - name: dates
        relationship: many_to_one
        sql: "DATE_TRUNC('day', {orders}.created_at) = {dates.date}"
 
    dimensions:
      - name: id
        type: number
        sql: id
        primary_key: true
      - name: customer_id
        type: number
        sql: customer_id
      - name: status
        type: string
        sql: status
 
    measures:
      - name: count
        type: count
      - name: total_amount
        type: sum
        sql: amount
 
  - name: returns
    sql_table: returns
 
    joins:
      - name: customers
        relationship: many_to_one
        sql: "{returns}.customer_id = {customers.id}"
      - name: dates
        relationship: many_to_one
        sql: "DATE_TRUNC('day', {returns}.created_at) = {dates.date}"
 
    dimensions:
      - name: id
        type: number
        sql: id
        primary_key: true
      - name: customer_id
        type: number
        sql: customer_id
 
    measures:
      - name: count
        type: count
      - name: total_refund
        type: sum
        sql: refund_amount


cube(`customers`, {
  sql_table: `customers`,
 
  dimensions: {
    id: {
      sql: `id`,
      type: `number`,
      primary_key: true
    },
 
    name: {
      sql: `name`,
      type: `string`
    },
 
    city: {
      sql: `city`,
      type: `string`
    }
  }
})
 
cube(`dates`, {
  sql_table: `dates`,
 
  dimensions: {
    date: {
      sql: `date`,
      type: `time`,
      primary_key: true
    }
  }
})
 
cube(`orders`, {
  sql_table: `orders`,
 
  joins: {
    customers: {
      relationship: `many_to_one`,
      sql: `${orders}.customer_id = ${customers.id}`
    },
 
    dates: {
      relationship: `many_to_one`,
      sql: `DATE_TRUNC('day', ${orders}.created_at) = ${dates.date}`
    }
  },
 
  dimensions: {
    id: {
      sql: `id`,
      type: `number`,
      primary_key: true
    },
 
    customer_id: {
      sql: `customer_id`,
      type: `number`
    },
 
    status: {
      sql: `status`,
      type: `string`
    }
  },
 
  measures: {
    count: {
      type: `count`
    },
 
    total_amount: {
      sql: `amount`,
      type: `sum`
    }
  }
})
 
cube(`returns`, {
  sql_table: `returns`,
 
  joins: {
    customers: {
      relationship: `many_to_one`,
      sql: `${returns}.customer_id = ${customers.id}`
    },
 
    dates: {
      relationship: `many_to_one`,
      sql: `DATE_TRUNC('day', ${returns}.created_at) = ${dates.date}`
    }
  },
 
  dimensions: {
    id: {
      sql: `id`,
      type: `number`,
      primary_key: true
    },
 
    customer_id: {
      sql: `customer_id`,
      type: `number`
    }
  },
 
  measures: {
    count: {
      type: `count`
    },
 
    total_refund: {
      sql: `refund_amount`,
      type: `sum`
    }
  }
})

You can then define a view where orders and returns are separate root fact tables. The shared dimension tables — customers and dates — are each included with their own root-level join paths, not nested under a specific fact like orders.customers. This makes their dimensions common to both facts so they can be used to join the subquery results. The prefix parameter disambiguates identically named members from different fact cubes:


views:
  - name: customer_overview
    cubes:
      - join_path: orders
        includes:
          - count
          - total_amount
        prefix: true
      - join_path: customers
        includes:
          - name
          - city
      - join_path: dates
        includes:
          - date
      - join_path: returns
        includes:
          - count
          - total_refund
        prefix: true


view(`customer_overview`, {
  cubes: [
    {
      join_path: orders,
      includes: [`count`, `total_amount`],
      prefix: true
    },
    {
      join_path: customers,
      includes: [`name`, `city`]
    },
    {
      join_path: dates,
      includes: [`date`]
    },
    {
      join_path: returns,
      includes: [`count`, `total_refund`],
      prefix: true
    }
  ]
})

This view has two root fact tables (orders and returns) and two shared dimension tables (customers and dates). Because each dimension table is included at its own root-level join path rather than scoped under a single fact, their dimensions are available as common join keys for both fact subqueries.

When you query measures from both facts — such as orders_count, orders_total_amount, returns_count, and returns_total_refund — grouped by common dimensions like name, city, and date, Cube detects the multiple roots and triggers a multi-fact query.

Join path requirements

To ensure correct join paths within a multi-fact view, follow these rules:

Within each root fact table, any join paths to related cubes (e.g., orders.line_items) should be listed explicitly in the view. This removes ambiguity about which tables are involved in each fact’s subquery.
Dimension tables that join to other, less granular dimension tables (e.g., customers joining to regions) should also declare those join paths explicitly in the view if those dimensions are needed.
Between root fact tables and root dimension tables, one-hop joins must be defined at the cube level (as shown in the orders and returns cubes above, each declaring a direct join to customers and dates). This allows the multi-fact view to unambiguously resolve how each fact reaches each common dimension table.

In the example above, both orders and returns declare direct joins to customers and dates. This means the view can build separate subqueries where each fact independently joins to the same dimension tables — without relying on transitive or implicit join paths.

How multi-fact queries work

Cube analyzes the join hints for each measure and groups them by their join key — the set of tables involved in the join path from the root to the measure’s cube. Measures that share the same join key are placed in the same group; measures with different join keys form separate groups. When there are two or more groups, the query is classified as multi-fact.

The query is then executed in the following stages:

1. Separate aggregating subqueries

For each group of measures, Cube builds an independent aggregating subquery. Each subquery joins only the tables needed for that group’s measures, applies all relevant filters and segments, and aggregates the results by the common dimensions.

For example, given a query for orders_count, orders_total_amount, returns_count, and returns_total_refund grouped by name, city, and date:

Subquery 1 (orders group): joins orders to customers and dates, computes COUNT(*) and SUM(amount), grouped by customers.name, customers.city, and dates.date.
Subquery 2 (returns group): joins returns to customers and dates, computes COUNT(*) and SUM(refund_amount), grouped by customers.name, customers.city, and dates.date.

2. Join on common dimensions

The results of the subqueries are joined with FULL JOIN semantics on all common dimension columns — in this case, name, city, and date. This ensures that all rows from both fact tables are represented, even when a customer has orders but no returns, or vice versa. The actual SQL implementation may vary depending on database capabilities.

3. Final result

The final SELECT pulls measures from their respective subqueries and dimensions from the joined result. Rows with data in only one fact table will show NULL for measures from the other.

For the customer_overview view, the result looks like:

name	city	date	orders_count	orders_total_amount	returns_count	returns_total_refund
Alice	New York	2025-01-15	2	200.00	0	NULL
Alice	New York	2025-02-10	2	225.00	1	100.00
Bob	Seattle	2025-01-20	3	550.00	2	130.00
Charlie	New York	2025-02-05	0	NULL	2	100.00
Diana	Boston	2025-03-01	1	400.00	0	NULL

Notice that Charlie has no orders and Diana has no returns — both are still included in the results with NULL values for the missing fact table.

More than two fact tables

Multi-fact queries are not limited to two root fact tables. If a view includes three or more fact tables, each one gets its own aggregating subquery, and all results are joined together on the common dimensions.

For instance, adding a reviews cube as a third root fact in the view and querying orders_count, returns_count, and reviews_count grouped by name, city, and date produces three separate subqueries, all joined on those common dimensions.

Every root fact table in a multi-fact view must be joinable to the same set of common dimension tables. The subquery results are joined on these common dimensions, so if a fact table cannot reach one of the dimension tables, the join will fail.

If a fact table does not naturally have a foreign key for one of the common dimension tables, you can create a synthetic join by selecting NULL for the missing foreign key in the cube’s sql definition:


cubes:
  - name: refunds
    sql: >
      SELECT *, NULL AS customer_id FROM refunds
    joins:
      - name: customers
        relationship: many_to_one
        sql: "{refunds}.customer_id = {customers.id}"
      - name: dates
        relationship: many_to_one
        sql: "DATE_TRUNC('day', {refunds}.created_at) = {dates.date}"
 
    dimensions:
      - name: id
        type: number
        sql: id
        primary_key: true
 
    measures:
      - name: count
        type: count
      - name: total_amount
        type: sum
        sql: amount


cube(`refunds`, {
  sql: `SELECT *, NULL AS customer_id FROM refunds`,
 
  joins: {
    customers: {
      relationship: `many_to_one`,
      sql: `${refunds}.customer_id = ${customers.id}`
    },
 
    dates: {
      relationship: `many_to_one`,
      sql: `DATE_TRUNC('day', ${refunds}.created_at) = ${dates.date}`
    }
  },
 
  dimensions: {
    id: {
      sql: `id`,
      type: `number`,
      primary_key: true
    }
  },
 
  measures: {
    count: {
      type: `count`
    },
 
    total_amount: {
      sql: `amount`,
      type: `sum`
    }
  }
})

In this example, the refunds table has no customer_id column. By selecting NULL AS customer_id in the cube’s SQL, the join to customers is syntactically valid. The customer_id will always be NULL, so refund rows will never match a specific customer, but the subquery can still participate in the multi-fact join on the full set of common dimensions.

Filters in multi-fact queries

Filters on common dimensions (like name, city, or date) are applied to every subquery, ensuring consistent filtering across all fact tables.

Filters on fact-specific dimensions (like orders.status) are applied only to the subquery for that specific fact table. Other fact table subqueries remain unaffected.

Measure filters (e.g., orders_count > 1) are applied as HAVING conditions after the subqueries are joined, filtering the combined result set.

Segments in multi-fact queries

Segments that belong to a specific fact table are applied only to that fact table’s subquery. For example, applying an orders.completed_orders segment filters only the orders subquery while leaving returns unaffected.

Was this page useful?