Context variables
You can use the following context variables within cube definitions:
CUBE
for referencing members of the same cube.FILTER_PARAMS
andFILTER_GROUP
for optimizing generated SQL queries.SQL_UTILS
for time zone conversion.COMPILE_CONTEXT
for creation of dynamic data models.
CUBE
You can use the CUBE
context variable to reference columns or members of
the current cube so you don't have to repeat the its name over and over.
It helps reference members while keeping the data model code DRY and easy to maintain.
cubes:
- name: users
sql_table: users
joins:
- name: contacts
sql: "{CUBE}.contact_id = {contacts.id}"
relationship: one_to_one
dimensions:
- name: id
sql: "{CUBE}.id"
type: number
primary_key: true
- name: name
sql: "COALESCE({CUBE.name}, {contacts.name})"
type: string
- name: contacts
sql_table: contacts
dimensions:
- name: id
sql: "{CUBE}.id"
type: number
primary_key: true
- name: name
sql: "{CUBE}.name"
type: string
FILTER_PARAMS
FILTER_PARAMS
context variable allows you to use filter
values from the Cube query during SQL generation.
This is useful for hinting your database optimizer to use a specific index or filter out partitions or shards in your cloud data warehouse so you won't be billed for scanning those.
Heavy usage of FILTER_PARAMS
is considered a bad practice. It usually
leads to hard-to-maintain data models. Good rule of thumb is to use
FILTER_PARAMS
only for predicate pushdown performance optimizations.
If you find yourself relying a lot on FILTER_PARAMS
, it might mean that
you need to rethink your approach to data modeling and potentially move
some transformations upstream. Also, you might reconsider the choice of the
data source.
FILTER_PARAMS
has to be a top-level expression in WHERE
and it has the
following syntax:
cubes:
- name: cube_name
sql: >
SELECT *
FROM table
WHERE {FILTER_PARAMS.cube_name.member_name.filter(sql_expression)}
dimensions:
- name: member_name
# ...
The filter()
function accepts sql_expression
, which could be either
a string or a function returning a string.
Example with string
See the example below for the case when a string is passed to filter()
:
cubes:
- name: order_facts
sql: >
SELECT *
FROM orders
WHERE {FILTER_PARAMS.order_facts.date.filter('date')}
measures:
- name: count
type: count
dimensions:
- name: date
sql: date
type: time
This will generate the following SQL...
SELECT COUNT(*) AS orders__count
FROM orders
WHERE
date >= '2018-01-01 00:00:00' AND
date <= '2018-12-31 23:59:59'
...for the ['2018-01-01', '2018-12-31']
date range passed for the
order_facts.date
dimension as in following query:
{
"measures": ["order_facts.count"],
"time_dimensions": [
{
"dimension": "order_facts.date",
"dateRange": ["2018-01-01", "2018-12-31"]
}
]
}
Example with function
You can also pass a function as a filter()
argument. This way, you can
add BigQuery shard filtering, which will reduce your billing cost.
cubes:
- name: events
sql: >
SELECT *
FROM schema.`events*`
WHERE {FILTER_PARAMS.events.date.filter(
lambda x, y: f"""
_TABLE_SUFFIX >= FORMAT_TIMESTAMP('%Y%m%d', TIMESTAMP({x})) AND
_TABLE_SUFFIX <= FORMAT_TIMESTAMP('%Y%m%d', TIMESTAMP({y}))
"""
)}
dimensions:
- name: date
sql: date
type: time
When a function is passed to filter()
, its arguments are passed as
strings from the data source driver and it's your responsibility to handle
type conversions in this case.
In the example above, the filter on a time dimension accepts two values: the lower and the upper bounds of a date range. If a filter accepts multiple values, they are passed to the function as individual parameters:
cube(`multi_filter`, {
sql: `
SELECT 123 AS value
-- Multiple values: ${FILTER_PARAMS.multi_filter.dummy.filter(
(...args) => JSON.stringify(args)
)}
`,
dimensions: {
dummy: {
sql: `1`,
type: `number`
}
}
})
FILTER_GROUP
If you use FILTER_PARAMS
in your query more than once, you must wrap them
with FILTER_GROUP
.
Otherwise, if you combine FILTER_PARAMS
with any logical operators other than
AND
in SQL or if you use filters with boolean operators
in your Cube queries, incorrect SQL might be generated.
FILTER_GROUP
has to be a top-level expression in WHERE
and it has the
following syntax:
cubes:
- name: cube_name
sql: >
SELECT *
FROM table
WHERE {FILTER_GROUP(
FILTER_PARAMS.cube_name.member_name.filter(sql_expression),
FILTER_PARAMS.cube_name.another_member_name.filter(sql_expression)
)}
dimensions:
- name: member_name
# ...
- name: another_member_name
# ...
Example
To understand the value of FILTER_GROUP
, consider the following data model
where two FILTER_PARAMS
are combined in SQL using the OR
operator:
cubes:
- name: filter_group
sql: >
SELECT *
FROM (
SELECT 1 AS a, 3 AS b UNION ALL
SELECT 2 AS a, 2 AS b UNION ALL
SELECT 3 AS a, 1 AS b
) AS data
WHERE
{FILTER_PARAMS.filter_group.a.filter("a")} OR
{FILTER_PARAMS.filter_group.b.filter("b")}
dimensions:
- name: a
sql: a
type: number
- name: b
sql: b
type: number
If the following query is run...
{
"dimensions": [
"filter_group.a",
"filter_group.b"
],
"filters": [
{
"member": "filter_group.a",
"operator": "gt",
"values": ["1"]
},
{
"member": "filter_group.b",
"operator": "gt",
"values": ["1"]
}
]
}
...the following (logically incorrect) SQL will be generated:
SELECT
"filter_group".a,
"filter_group".b
FROM (
SELECT *
FROM (
SELECT 1 AS a, 3 AS b UNION ALL
SELECT 2 AS a, 2 AS b UNION ALL
SELECT 3 AS a, 1 AS b
) AS data
WHERE
(a > 1) OR -- Incorrect logical operator here
(b > 1)
) AS "filter_group"
WHERE
"filter_group".a > 1 AND
"filter_group".b > 1
GROUP BY 1, 2
As you can see, since an array of filters has AND
semantics, Cube has
correctly used the AND
operator in the "outer" WHERE
. At the same time,
the hardcoded OR
operator has propagated to the "inner" WHERE
, leading to
a logically incorrect query.
Now, if the cube is defined the following way...
cubes:
- name: filter_group
sql: >
SELECT *
FROM (
SELECT 1 AS a, 3 AS b UNION ALL
SELECT 2 AS a, 2 AS b UNION ALL
SELECT 3 AS a, 1 AS b
) AS data
WHERE
{FILTER_GROUP(
FILTER_PARAMS.filter_group.a.filter("a"),
FILTER_PARAMS.filter_group.b.filter("b")
)}
# ...
...the following correct SQL will be generated for the same query:
SELECT
"filter_group".a,
"filter_group".b
FROM (
SELECT *
FROM (
SELECT 1 AS a, 3 AS b UNION ALL
SELECT 2 AS a, 2 AS b UNION ALL
SELECT 3 AS a, 1 AS b
) AS data
WHERE
(a > 1) AND -- Correct logical operator here
(b > 1)
) AS "filter_group"
WHERE
"filter_group".a > 1 AND
"filter_group".b > 1
GROUP BY 1, 2
You can also use boolean operators in the Cube query to express more complex filtering logic:
{
"dimensions": [
"filter_group.a",
"filter_group.b"
],
"filters": [
{
"or": [
{
"member": "filter_group.a",
"operator": "gt",
"values": ["1"]
},
{
"member": "filter_group.b",
"operator": "gt",
"values": ["1"]
}
]
}
]
}
With FILTER_GROUP
, the following correct SQL will be generated:
SELECT
"filter_group".a,
"filter_group".b
FROM (
SELECT *
FROM (
SELECT 1 AS a, 3 AS b UNION ALL
SELECT 2 AS a, 2 AS b UNION ALL
SELECT 3 AS a, 1 AS b
) AS data
WHERE
(a > 1) OR
(b > 1)
) AS "filter_group"
WHERE
"filter_group".a > 1 OR
"filter_group".b > 1
GROUP BY 1, 2
SQL_UTILS
convertTz
In case you need to convert your timestamp to user request timezone in cube or
member SQL you can use SQL_UTILS.convertTz()
method. Note that Cube will
automatically convert timezones for timeDimensions
fields in
queries.
Dimensions that use SQL_UTILS.convertTz()
should not be used as
timeDimensions
in queries. Doing so will apply the conversion multiple times
and yield wrong results.
In case the same database field needs to be queried in dimensions
and
timeDimensions
, create dedicated dimensions in the cube definition for the
respective use:
cubes:
- name: visitors
# ...
dimensions:
# Do not use in timeDimensions query property
- name: created_at_converted
sql: "{SQL_UTILS.convertTz(`created_at`)}"
type: time
# Use in timeDimensions query property
- name: created_at
sql: created_at
type: time
COMPILE_CONTEXT
COMPILE_CONTEXT
is evaluated only once per each key generated by context_to_app_id
.
The securityContext
defined in COMPILE_CONTEXT
doesn't change
its value for different users, however, it will change for
different tenants as defined in context_to_app_id
.
A global COMPILE_CONTEXT
contains securityContext
and any other variables provided by
extendContext
.
Use Jinja {{ }}
syntax to access COMPILE_CONTEXT
variable.
cubes:
- name: users
sql_table: "user_{{ COMPILE_CONTEXT.securityContext.deployment_id }}.users"
SECURITY_CONTEXT
SECURITY_CONTEXT
is deprecated and may be removed in the upcoming versions.
Use query_rewrite
instead.
SECURITY_CONTEXT
global variable holds a security context that is passed to Cube via API.
Please read the Security Context page for more information on how
to provide security context to Cube.
cube(`orders`, {
sql: `
SELECT *
FROM orders
WHERE ${SECURITY_CONTEXT.email.filter("email")}
`,
dimensions: {
date: {
sql: `date`,
type: `time`,
},
},
});
To ensure filter value presents for all requests requiredFilter
can be used:
cube(`orders`, {
sql: `
SELECT *
FROM orders
WHERE ${SECURITY_CONTEXT.email.requiredFilter("email")}
`,
dimensions: {
date: {
sql: `date`,
type: `time`,
},
},
});
You can access values of context variables directly in JavaScript in order to use it during your SQL generation. For example:
Use of this feature entails SQL injection security risk. Use it with caution.
cube(`orders`, {
sql: `
SELECT *
FROM ${
SECURITY_CONTEXT.type.unsafeValue() === "employee" ? "employee" : "public"
}.orders
`,
dimensions: {
date: {
sql: `date`,
type: `time`,
},
},
});