This tutorial teaches you how to build a performant dynamic dashboard from your Supabase data in less than 15 minutes using Cube and React.
You can try the live demo app and get the complete source code on GitHub.
Data visualization provides you with a graphical representation of your data. It can reveal trends and help you make data-driven decisions for your organizations, especially if you have massive datasets.
However, there are many challenges when aggregating and visualizing large datasets. This tutorial demonstrates how you can build a data visualization application with your data in Supabase and use Cube to help you overcome these challenges.
Configuring Supabase
Let’s use a massive dataset from Kaggle for our demo application. I am going to use the online Payments Fraud detection dataset.
We'll also use Supabase to store the data. Supabase is an open source Firebase alternative packed with features that include a dedicated PostgreSQL database for every project. So, with Supabase you can build apps with great developer experience and numerous features available out of the box.
Next, head over to Supabase and create a new database. Create a new table called fraud
and import the CSV data. You can import CSV data using the "Import data" option in your Supabase dashboard, as shown in the following picture.
After some time the data will be uploaded into a new fraud
table.
Exploring queries for data visualization
Next, we are going to create and run some queries. Head over to the SQL editor in your Supabase dashboard menu and run the following query.
This query gives us the count of records in the fraud table. Let’s analyze the query time for this SQL query. Open up the developer tool in your browser and analyze the query execution time.
For me, it took about 4.32s, which is slow but not bad considering we are working with a dataset that has over 6 million rows. However, we rarely work with simple queries such as this one for complex data analysis. Let’s create a more complex query.
The following query fetches relevant columns such as fraud type
, isfraud
, isflaggedfraudand
calculates the sum of fraud amount and counts frauds.
This query takes about 6 seconds to run. For a modern application, this is not a very good response time.
Imagine a dashboard application that takes about 6 seconds for a single reporting table to load. It will not deliver a robust user experience. According to Google's market research, users get more frustrated as page load time goes over 3 seconds. If you build a public-facing data application and have a very slow response time, users are most likely to bounce from your website.
Why Postgres can be slow
So what’s making the queries slow?
Supabase is actually not the bottleneck here. Supabase uses PostgreSQL under the hood, and Postgres is a traditional row-oriented database. Row-oriented databases store information in the disk row by row.
id | Name | City | Age |
---|---|---|---|
1 | Dave | Los Angeles | 29 |
2 | Matt | Toronto | 26 |
3 | Jeff | New York | 23 |
These types of databases are very efficient at reading and writing single rows. For instance, if I want to add a new record, I can add another row to the end of the table. If I know the id
of a record, I can look up the row and read all the columns.
This makes Postgres a great database for applications that heavily depend on reading and writing data.
However, row-oriented databases are not performant for analytical queries. For instance, if I want to get the sum of all the ages in the previous table, I must read each row and all its columns. Then I have to add up the age values. Even though we only need the age column, we read all the columns, which is not very memory efficient. Therefore Postgres has its shortcoming in data-intensive aggregations and analytics tasks.
You can learn more about how row-oriented databases work and their limitations in this blog post.
Column-oriented databases such as BigQuery and Snowflake is really good at aggregating data. However, it is often more challenging to manage and sync multiple databases of different paradigms when building applications.
Also, they wouldn't provide you the kind of developer experience that Supabase users love. So, what can we do?
How Cube can help
Cube is an open-source API-first headless business intelligence platform that connects to your data sources and makes queries fast, responsive, cost-effective, and consistent across your applications. Cube’s API layer is able to perform efficient aggregation on your data and serve it to applications. (Also, Cube has recently got 13,000 stars on GitHib — 3 times less than Supabase, but also not that bad, right?)
You can run a Cube instance, connect it to your Supabase project, and get a performant API that will serve analytical data to your application.
Getting started with Cube
The easiest way to get started with Cube is with Cube Cloud. It provides a fully managed, ready to use Cube cluster. However, if you prefer self-hosting, then follow this tutorial.
In this tutorial, you will create a new Cube deployment in Cube Cloud. You can select a cloud platform of your choice.
Next, select start from scratch to get started with a fresh instance.
Next, you will be asked to provide your database connection information. Select PostgreSQL.
Head back to your Supabase dashboard to retrieve the database connection information. From there please select the Database option and take note of the connection information.
Next, fill in the database connection information in Cube Cloud.
Cube can auto-generate a Data Schema from your SQL tables. It is used to model raw data into meaningful business definitions. The data schema is exposed through the querying API, allowing end-users to query a wide variety of analytical queries.
We will select the fraud
table for schema generation. It will take a couple of minutes for our Cube instance to get provisioned.
Use pre-aggregations to increase query performance
One of Cube’s most used features are pre-aggregations. They are used to reduce the execution time of a query and make it be well under a second (more likely like 200ms or less).
In Cube, pre-aggregations are condensed versions of the source data. They are materialized ahead of time and persisted as tables separately from the raw data. To learn more about pre-aggregations, plese follow this tutorial.
You can also check these in-depth video workshops on pre-aggregations:
In your Cube dashboard select Schema and then select Enter Development Mode. Select Fraud.js
in the files and add the following code to your schema.
Please save the changes and the pre-aggregation will be applied to your SQL queries.
Analyzing data with the Developer Playground
Select the developer playground option from your Cube dashboard. The Developer Playground is a tool that lets you experiment with your data and generate various data visualizations.
Let's create a new Query. Please select the measures and dimensions as shown in the following image and then select Run. It makes an identical query to our previous SQL query.
Notice that it takes only about 30 to 35 milliseconds to run the query and get the data back. That’s almost a 200x performance boost in the best-case scenario.
Autogenerate front-end code from Cube
Cube also gives us the ability to autogenerate part of our front-end code. For instance, if we want the table in the previous example as a React component, Cube can generate the necessary code. (However, use this feature with care. It's designed to get you something to start with, not to generate the complete code of your application.)
In your Chart menu select the Edit option and Cube will create a new table component in the codesandbox.
Next, let’s say we want to visualize the number of different types of frauds committed. We want to present this information as a pie chart. We can select Count as measures and Type as dimensions in the Cube dashboard to do this. We select the Pie chart option. We can also specify that we want React and the Chart.js library to generate our visualization.
Once the visualization is done you can open the front-end code by selecting Edit.
Putting it all together in a React App
Let’s put together a front-end React app for our data visualization. Create a new React app by running the following commands.
Next, add all the required npm dependencies to your project.
First of all we initialize cube by adding the following code to our App.js
file.
Next, go ahead and create two components one for showing the table view and the other for showing the Pie chart. Following is the code for the Table
component.
Following is the code for PieChart
component.
You can find the complete code for this demo application at this link.
Where to go from here
We have only covered the basics of Cube in this tutorial. Cube comes packed with features that can help you build data-intensive apps fast. Cube supports features such as multiple database connections, multi-tenancy, GraphQL API, SQL API, and more.
You can sign up for Cube Cloud for free if you would like to play around. To learn more about how Cube can help you to build your project head over to the official documentation page.
If you have questions, or feedback, we would love to hear what you have to say! Come join our Slack community. Click here to join!
That’s all for today. Feel free to leave Cube a ⭐ on GitHub if you liked this article. Happy hacking.