Skip to Content

Using dynamic union tables

Use case

Sometimes, you may have a lot of tables in a database, which actually relate to the same entity.

For example, you can have “per client” tables with the same data, but related to different customers: elon_musk_table, john_doe_table, steve_jobs_table, etc. In this case, it would make sense to create a single cube for customers, which should be backed by a union table from all customers tables.

Data modeling

You can use the sql parameter to define a cube over an arbitrary SQL query, e.g., a query that includes UNION or UNION ALL operators:

cubes: - name: customers sql: | SELECT *, 'Einstein' AS name FROM einstein_data UNION ALL SELECT *, 'Pascal' AS name FROM pascal_data UNION ALL SELECT *, 'Newton' AS name FROM newton_data measures: - name: count type: count dimensions: - name: name sql: name type: string
cube(`customers`, { sql: ` SELECT *, 'Einstein' AS name FROM einstein_data UNION ALL SELECT *, 'Pascal' AS name FROM pascal_data UNION ALL SELECT *, 'Newton' AS name FROM newton_data `, measures: { count: { type: `count` } }, dimensions: { name: { sql: `name`, type: `string` } } })

However, it can be quite annoying to write the SQL to union all tables manually. Luckily, you can use dynamic data modeling to generate necessary SQL based on a list of tables:

{%- set customer_tables = { "einstein_data": "Einstein", "pascal_data": "Pascal", "newton_data": "Newton" } -%} cubes: - name: customers sql: | {%- for table, name in customer_tables | items %} SELECT *, '{{ name | safe }}' AS name FROM {{ table | safe }} {% if not loop.last %}UNION ALL{% endif %} {% endfor %} measures: - name: count type: count dimensions: - name: name sql: name type: string
const customer_tables = [ { table: "einstein_data", name: "Einstein" }, { table: "pascal_data", name: "Pascal" }, { table: "newton_data", name: "Newton" } ] cube(`customers`, { sql: customer_tables .map(entry => `SELECT *, '${entry.name}' AS name FROM ${entry.table}`) .join(` UNION ALL `), measures: { count: { type: `count` } }, dimensions: { name: { sql: `name`, type: `string` } } })

Result

Querying count and name members of the dynamically defined customers cube would result in the following generated SQL:

SELECT "customers".name "customers__name", count(*) "customers__count" FROM ( SELECT *, 'Einstein' AS name FROM einstein_data UNION ALL SELECT *, 'Pascal' AS name FROM pascal_data UNION ALL SELECT *, 'Newton' AS name FROM newton_data ) AS "customers" GROUP BY 1 ORDER BY 2 DESC

Was this page useful?