Polymorphic cubes
In programming languages, polymorphism usually means the use of a single symbol to represent multiple different types. It can be quite common for a database and application to be designed in such a way that leverages a single database table for entities of different types that share common traits.
For example, you are working on an online education platform, where teachers
assign lessons to students. The database can contain only two tables: one for
users
and another one for lessons
. The users
table can contain a type
column, with possible values teacher
or student
. Here is how it could look:
id | type | name | school |
---|---|---|---|
1 | student | Carl Anderson | Balboa High School |
2 | student | Luke Skywalker | Balboa High School |
31 | teacher | John Doe | Balboa High School |
Lessons are assigned by teachers and completed by students. The lessons
table
has both teacher_id
and student_id
, which are actually references to the
user id
. The lessons
table can look like this:
id | teacher_id | student_id | name |
---|---|---|---|
100 | 31 | 1 | Multiplication and the meaning of the Factors |
101 | 31 | 2 | Division as an Unknown Factor Problem |
The best way to design such a data model is by using what we call Polymorphic
Cubes. It relies on the extends
feature and
prevents you from duplicating code, while preserving the correct domain logic.
Learn more about using extends
here.
The first step is to create a user
cube, which will act as a base cube for our
teachers
and students
cubes and will contain all common measures and
dimensions:
cubes:
- name: users
sql_table: users
measures:
- name: count
type: count
dimensions:
- name: name
sql: name
type: string
- name: school
sql: school
type: string
Then you can derive the teachers
and students
cubes from users
:
cube(`teachers`, {
extends: users,
sql: `
SELECT *
FROM ${users.sql()}
WHERE type = 'teacher'
`,
});
cube(`students`, {
extends: users,
sql: `
SELECT *
FROM ${users.sql()}
WHERE type = 'student'
`,
});
Currently, {cube.sql()}
function is not supported in YAML data models.
Please track this issue (opens in a new tab).
As a workaround, you can use JavaScript data models, put a SQL query in a Jinja variable, or load it from the template context.
Once we have those cubes, we can define correct joins from the lessons
cube:
cubes:
- name: lessons
sql_table: lessons
joins:
- name: students
relationship: many_to_one
sql: "{CUBE}.student_id = {students.id}"
- name: teachers
relationship: many_to_one
sql: "{CUBE}.teacher_id = {teachers.id}"