AWS Redshift
Prerequisites
- The hostname for the AWS Redshift cluster
- The username/password for the AWS Redshift cluster
- The name of the database to use within the AWS Redshift cluster
If the cluster is configured within a VPC, then Cube must have a network route to the cluster.
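One way to confirm that a network route exists is a plain TCP connection test, run from the machine or container where Cube will run. The sketch below is illustrative and not part of Cube: the helper name `can_reach` is hypothetical, and it assumes the cluster listens on Redshift's default port 5439 unless another port is passed.

```python
import socket

# Assumption: 5439 is Redshift's default port; pass a different
# value if your cluster was created with a custom port.
REDSHIFT_DEFAULT_PORT = 5439


def can_reach(host: str, port: int = REDSHIFT_DEFAULT_PORT, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout.

    This only checks network reachability (routing, security groups,
    firewalls). It does not validate credentials or the database name.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, `can_reach("my-redshift-cluster.cfbs3dkw1io8.eu-west-1.redshift.amazonaws.com")` returning `False` would suggest a routing, security group, or DNS problem rather than a Cube configuration problem.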
Setup
Manual
Add the following to a .env file in your Cube project:
CUBEJS_DB_TYPE=redshift
CUBEJS_DB_HOST=my-redshift-cluster.cfbs3dkw1io8.eu-west-1.redshift.amazonaws.com
CUBEJS_DB_NAME=my_redshift_database
CUBEJS_DB_USER=redshift_user
CUBEJS_DB_PASS=**********
Cube Cloud
In some cases you'll need to allow connections from your Cube Cloud deployment IP address to your database. You can copy the IP address from either the Database Setup step in deployment creation, or from Settings → Configuration in your deployment.
The following fields are required when creating an AWS Redshift connection:
Cube Cloud also supports connecting to data sources within private VPCs if dedicated infrastructure is used. Check out the VPC connectivity guide for details.
Environment Variables
Environment Variable | Description | Possible Values | Required |
---|---|---|---|
CUBEJS_DB_HOST | The host URL for a database | A valid database host URL | ✅ |
CUBEJS_DB_PORT | The port for the database connection | A valid port number | ❌ |
CUBEJS_DB_NAME | The name of the database to connect to | A valid database name | ✅ |
CUBEJS_DB_USER | The username used to connect to the database | A valid database username | ✅ |
CUBEJS_DB_PASS | The password used to connect to the database | A valid database password | ✅ |
CUBEJS_DB_SSL | If true, enables SSL encryption for database connections from Cube | true, false | ❌ |
CUBEJS_CONCURRENCY | The number of concurrent connections each queue has to the database. Default is 4 | A valid number | ❌ |
CUBEJS_DB_MAX_POOL | The maximum number of concurrent database connections to pool. Default is 16 | A valid number | ❌ |
CUBEJS_DB_EXPORT_BUCKET_REDSHIFT_ARN | The ARN of an AWS IAM role with permission to write to the export bucket | A valid IAM role ARN | ❌ |
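As a rough illustration of the table above, the following sketch reports which of the variables marked as required are missing or blank. It is a standalone helper, not something Cube provides: the names `REQUIRED_VARS` and `missing_required` are hypothetical, and treating whitespace-only values as missing is an assumption of this sketch.

```python
import os
from typing import List, Mapping

# The variables marked as required in the table above.
REQUIRED_VARS = (
    "CUBEJS_DB_HOST",
    "CUBEJS_DB_NAME",
    "CUBEJS_DB_USER",
    "CUBEJS_DB_PASS",
)


def missing_required(env: Mapping[str, str]) -> List[str]:
    """Return the required variable names that are unset or blank in env."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    # Check the current process environment and report any gaps.
    missing = missing_required(os.environ)
    if missing:
        print("Missing required variables: " + ", ".join(missing))
    else:
        print("All required variables are set.")
```

Running a check like this before starting Cube can surface a misconfigured .env file earlier than a failed database connection would.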
Pre-Aggregation Feature Support
count_distinct_approx
Measures of type count_distinct_approx cannot be used in pre-aggregations when using AWS Redshift as a source database.
Pre-Aggregation Build Strategies
To learn more about pre-aggregation build strategies, head here.
Feature | Works with read-only mode? | Is default? |
---|---|---|
Batching | ❌ | ✅ |
Export Bucket | ❌ | ❌ |
By default, AWS Redshift uses batching to build pre-aggregations.
Batching
Cube requires the Redshift user to have ownership of a schema in Redshift to support pre-aggregations. By default, the schema name is prod_pre_aggregations. It can be set using the pre_aggregations_schema configuration option.
No extra configuration is required to use batching with AWS Redshift.
Export bucket
AWS Redshift only supports using AWS S3 for export buckets.
AWS S3
For improved pre-aggregation performance with large datasets, enable export bucket functionality by configuring Cube with the environment variables below.
Ensure the AWS credentials are correctly configured in IAM to allow reads and writes to the export bucket in S3.
CUBEJS_DB_EXPORT_BUCKET_TYPE=s3
CUBEJS_DB_EXPORT_BUCKET=my.bucket.on.s3
CUBEJS_DB_EXPORT_BUCKET_AWS_KEY=<AWS_KEY>
CUBEJS_DB_EXPORT_BUCKET_AWS_SECRET=<AWS_SECRET>
CUBEJS_DB_EXPORT_BUCKET_AWS_REGION=<AWS_REGION>
SSL
To enable SSL-encrypted connections between Cube and AWS Redshift, set the CUBEJS_DB_SSL environment variable to true. For more information on how to configure custom certificates, please check out Enable SSL Connections to the Database.
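Before setting CUBEJS_DB_SSL to true, it can help to confirm that the cluster endpoint is willing to negotiate SSL at all. The sketch below sends the PostgreSQL wire protocol's SSLRequest message, which Redshift's PostgreSQL-compatible endpoint is assumed to answer with a single byte (`S` for yes, `N` for no). The helper name `server_accepts_ssl` is hypothetical, port 5439 is assumed as the default, and the probe does not perform a TLS handshake or validate certificates.

```python
import socket
import struct

# PostgreSQL SSLRequest: a 4-byte message length (8) followed by the
# 4-byte request code 80877103, both in network byte order.
SSL_REQUEST = struct.pack("!II", 8, 80877103)


def server_accepts_ssl(host: str, port: int = 5439, timeout: float = 5.0) -> bool:
    """Return True if the server answers the SSLRequest with b"S".

    Raises OSError if the host cannot be reached at all, so a network
    failure is not mistaken for "SSL not supported".
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(SSL_REQUEST)
        # The server replies with one byte: b"S" (willing) or b"N" (not willing).
        return sock.recv(1) == b"S"
```

A `False` result would suggest that enabling CUBEJS_DB_SSL is likely to fail against that endpoint, while `True` only shows that negotiation can begin.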