Skip to Content

Implementing funnel analysis

This functionality only works with data models written in JavaScript, not YAML. For more information, check out the Data Modeling Syntax page.

Funnels represent a series of events that lead users towards a defined goal. Funnel analysis is an approach commonly used in product, marketing and sales analytics.

Regardless of the domain, every funnel has the following traits:

  • The identity of the object moving through the funnel – e.g. user or lead
  • A set of steps, through which the object moves
  • The date and time of each step
  • The time to convert between steps

Since funnels have a pretty standard structure, they are good candidates for being extracted into reusable packages. Cube comes pre-packaged with a standard funnel package.

// First step is to require the Funnel package const Funnels = require(`Funnels`) cube(`PurchaseFunnel`, { extends: Funnels.eventFunnel({ userId: { sql: `user_id` }, time: { sql: `timestamp` }, steps: [ { name: `view_product`, eventsView: { sql: `select * from events where event = 'view_product'` } }, { name: `purchase_product`, eventsView: { sql: `select * from events where event = 'purchase_product'` }, timeToConvert: "1 day" } ] }) })

Cube will generate an SQL query for this funnel. Since funnel analysis in SQL is not straight forward, the SQL code itself is quite complicated, even for such a small funnel.

Show Funnel’s SQL

SELECT purchase_funnel.step "purchase_funnel.step", count(purchase_funnel.user_id) "purchase_funnel.conversions" FROM ( WITH joined_events AS ( select view_product_events.user_id view_product_user_id, purchase_product_events.user_id purchase_product_user_id, view_product_events.t FROM ( select user_id user_id, timestamp t from ( select * from events where event = 'view_product' ) e ) view_product_events LEFT JOIN ( select user_id user_id, timestamp t from ( select * from events where event = 'purchase_product' ) e ) purchase_product_events ON view_product_events.user_id = purchase_product_events.user_id AND purchase_product_events.t >= view_product_events.t AND ( purchase_product_events.t :: timestamptz AT TIME ZONE 'America/Los_Angeles' ) <= ( view_product_events.t :: timestamptz AT TIME ZONE 'America/Los_Angeles' ) + interval '1 day' ) select user_id, first_step_user_id, step, max(t) t from ( SELECT view_product_user_id user_id, view_product_user_id first_step_user_id, t, 'View Product' step FROM joined_events UNION ALL SELECT purchase_product_user_id user_id, view_product_user_id first_step_user_id, t, 'Purchase Product' step FROM joined_events ) as event_steps GROUP BY 1, 2, 3 ) AS purchase_funnel WHERE ( purchase_funnel.t >= '2018-07-01T07:00:00Z' :: timestamptz AND purchase_funnel.t <= '2018-07-31T06:59:59Z' :: timestamptz ) GROUP BY 1 ORDER BY 2 DESC LIMIT 5000

Funnel parameters

userId

A unique key to identify the users moving through the funnel.

userId: { sql: `user_id` }

nextStepUserId

In the situation where user_id changes between steps, you can pass a unique key to join two adjacent steps. For example, if a user signs in after having been tracked anonymously until that point in the funnel, you could use nextStepUserId to define a funnel where users are tracked by anonymous ID on the first step and then by an identified user ID on subsequent steps.

const Funnels = require(`Funnels`) cube(`OnboardingFunnel`, { extends: Funnels.eventFunnel({ userId: { sql: `id` }, time: { sql: `timestamp` }, steps: [ { name: `View Page`, eventsView: { sql: `select anonymous_id as id, timestamp from pages` } }, { name: `Sign Up`, eventsView: { sql: `select anonymous_id as id, user_id, timestamp from sign_ups` }, nextStepUserId: { sql: `user_id` }, timeToConvert: "1 day" }, { name: `Action`, eventsView: { sql: `select user_id as id from actions` }, timeToConvert: "1 day" } ] }) })

time

A timestamp of the event.

time: { sql: `timestamp` }

steps

An array of steps. Each step has 2 required and 1 optional parameters:

  • name (required) - Name of the step. It must be unique within a funnel.
  • eventsView (required) - Events table for the step. It must contain userId and time fields. For example, if we have defined the userId as user_id and time as timestamp, we need to have these fields in the table we’re selecting from.
  • timeToConvert (optional) - A time window during which conversion should happen. Set it depending on your funnel logic. If this is set to 1 day, for instance, it means the funnel will include only users who made a purchase within 1 day of visiting the product page.
steps: [ { name: `purchase_product`, eventsView: { sql: `select * from events where event = 'purchase_product'` }, timeToConvert: "1 day" } ]

Joining funnels

In order to provide additional dimensions, funnels can be joined with other cubes using user_id at the first step of a funnel. This will always use a many_to_one relationship, hence you should always join with the corresponding user cube. Here, by ‘user’ we understand this to be any entity that can go through a sequence of steps within funnel. It could be a real web user with an auto assigned ID or a specific email sent by an email automation that goes through a typical flow of events like ‘sent’, ‘opened’, ‘clicked’, and so on. For example, for our PurchaseFunnel we can add a join to another funnel as following:

cube(`PurchaseFunnel`, { joins: { Users: { relationship: `many_to_one`, sql: `${CUBE}.first_step_user_id = ${Users.id}` } }, extends: Funnels.eventFunnel({ // ... }) })

Using funnels

Cube is based on multidimensional analysis 

Funnel-based cubes have the following structure:

Measures

  • conversions - Count of conversions in the funnel. The most useful when broken down by steps. It’s the classic funnel view.
  • conversionsPercent - Percentage of conversions. It is useful when you want to inspect a specific step, or set of steps, and find out how a conversion has changed over time.

Dimensions

  • step - Describes funnels’ steps. Use it to break down conversions or conversionsPercent by steps, or to filter for a specific step.
  • time - time dimension for the funnel. Use it to filter your analysis for specific dates or to analyze how conversion changes over time.

Performance considerations

Funnel joins are extremely heavy for most modern databases and complexity grows in a non-linear way with the addition of steps. However, if the cardinality of the first event isn’t too high, very simple optimization can be applied: originalSql pre-aggregation.

It is best to use partitioned rollups to cache the steps instead. Add one to the PurchaseFunnel cube as follows:

cube(`PurchaseFunnel`, { extends: Funnels.eventFunnel({ // ... }), preAggregations: { main: { type: `originalSql` } } })

Was this page useful?