AWS Athena

Prerequisites

Setup

Manual

Add the following to a .env file in your Cube project:

CUBEJS_DB_TYPE=athena
CUBEJS_AWS_KEY=AKIA************
CUBEJS_AWS_SECRET=****************************************
CUBEJS_AWS_REGION=us-east-1
CUBEJS_AWS_S3_OUTPUT_LOCATION=s3://my-athena-output-bucket
CUBEJS_AWS_ATHENA_WORKGROUP=primary
CUBEJS_AWS_ATHENA_CATALOG=AwsDataCatalog
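Before starting Cube, it can help to sanity-check the .env file. The following is a minimal sketch, not part of Cube itself: the helper names `parse_env` and `check_athena_env` are hypothetical, and the choice of which variables to treat as mandatory is an assumption made for illustration.

```python
# Variable names come from the .env example above. Treating these four as
# mandatory is an assumption of this sketch, not a rule taken from Cube.
REQUIRED_VARS = (
    "CUBEJS_DB_TYPE",
    "CUBEJS_AWS_KEY",
    "CUBEJS_AWS_SECRET",
    "CUBEJS_AWS_REGION",
)


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def check_athena_env(settings):
    """Return a list of problems; an empty list means none were found."""
    problems = [f"{name} is missing" for name in REQUIRED_VARS if not settings.get(name)]
    # Only flag the type when it is present but set to something else.
    if settings.get("CUBEJS_DB_TYPE") not in (None, "", "athena"):
        problems.append("CUBEJS_DB_TYPE should be 'athena'")
    output = settings.get("CUBEJS_AWS_S3_OUTPUT_LOCATION")
    if output and not output.startswith("s3://"):
        problems.append("CUBEJS_AWS_S3_OUTPUT_LOCATION should start with s3://")
    return problems
```

Reading the file with `open(".env").read()` and passing the result through both helpers yields a list that can be printed or used to fail a deployment script early.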

Cube Cloud

ℹ️ Allowing connections from Cube Cloud IP

In some cases you'll need to allow connections from your Cube Cloud deployment IP address to your database. You can copy the IP address from either the Database Setup step in deployment creation, or from Settings → Configuration in your deployment.

In Cube Cloud, select AWS Athena when creating a new deployment and fill in the required fields:

[Image: Cube Cloud AWS Athena configuration screen]

Cube Cloud also supports connecting to data sources within private VPCs if dedicated infrastructure is used. Check out the VPC connectivity guide for details.

Environment Variables

| Environment Variable | Description | Possible Values | Required |
| --- | --- | --- | --- |
| CUBEJS_AWS_KEY | The AWS Access Key ID to use for database connections | A valid AWS Access Key ID | |
| CUBEJS_AWS_SECRET | The AWS Secret Access Key to use for database connections | A valid AWS Secret Access Key | |
| CUBEJS_AWS_REGION | The AWS region of the Cube deployment | A valid AWS region | |
| CUBEJS_AWS_S3_OUTPUT_LOCATION | The S3 path to store query results made by the Cube deployment | A valid S3 path | |
| CUBEJS_AWS_ATHENA_WORKGROUP | The name of the workgroup in which the query is being started | A valid Athena Workgroup | |
| CUBEJS_AWS_ATHENA_CATALOG | The name of the catalog to use by default | A valid Athena Catalog name | |
| CUBEJS_DB_SCHEMA | The name of the schema to use as information_schema filter. Reduces count of tables loaded during schema generation. | A valid schema name | |
| CUBEJS_CONCURRENCY | The number of concurrent connections each queue has to the database. Default is 5. | A valid number | |

Pre-Aggregation Feature Support

count_distinct_approx

Measures of type count_distinct_approx can be used in pre-aggregations when using AWS Athena as a source database. To learn more about AWS Athena's support for approximate aggregate functions, refer to the AWS Athena documentation.
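To illustrate what an approximate distinct count looks like on the Athena side, here is a small sketch that builds a query with `approx_distinct`, an approximate aggregate function available in Athena's Presto/Trino-based SQL dialect. The helper name, table, and columns are hypothetical, and the SQL that Cube actually generates for a count_distinct_approx measure is not specified here.

```python
def approx_distinct_query(table, column, group_by):
    """Build an Athena query that estimates distinct values per group.

    Identifiers are interpolated as-is, so only pass trusted names.
    """
    return (
        f"SELECT {group_by}, approx_distinct({column}) AS approx_count\n"
        f"FROM {table}\n"
        f"GROUP BY {group_by}"
    )


# Hypothetical table and columns, used purely for illustration.
print(approx_distinct_query("orders", "user_id", "status"))
```

Running the generated statement in the Athena console next to an exact `COUNT(DISTINCT ...)` is a simple way to see how close the estimate is on your own data.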

Pre-Aggregation Build Strategies

To learn more, see the documentation on pre-aggregation build strategies.

| Feature | Works with read-only mode? | Is default? |
| --- | --- | --- |
| Batching | | Yes |
| Export Bucket | | No |

By default, AWS Athena uses a batching strategy to build pre-aggregations.

Batching

Batching requires no extra configuration for AWS Athena.

Export Bucket

AWS Athena only supports using AWS S3 for export buckets.

AWS S3

For improved pre-aggregation performance with large datasets, enable export bucket functionality. Ensure the AWS credentials are correctly configured in IAM to allow reads and writes to the export bucket in S3, then configure Cube with the following environment variables:

CUBEJS_DB_EXPORT_BUCKET_TYPE=s3
CUBEJS_DB_EXPORT_BUCKET=my.bucket.on.s3
CUBEJS_DB_EXPORT_BUCKET_AWS_KEY=<AWS_KEY>
CUBEJS_DB_EXPORT_BUCKET_AWS_SECRET=<AWS_SECRET>
CUBEJS_DB_EXPORT_BUCKET_AWS_REGION=<AWS_REGION>
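The IAM requirement above (reads and writes to the export bucket) can be sketched as a policy document. This is a minimal illustration generated with Python: the function name is hypothetical, and the exact set of S3 actions Cube needs is an assumption here, so review it against your own security requirements before attaching it to a user or role.

```python
import json


def export_bucket_policy(bucket):
    """Build an IAM policy document granting read/write access to one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level permissions apply to the bucket ARN itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Object-level permissions apply to keys inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


# Bucket name taken from the example configuration above.
print(json.dumps(export_bucket_policy("my.bucket.on.s3"), indent=2))
```

The printed JSON can be pasted into the IAM console as an inline policy for the credentials referenced by CUBEJS_DB_EXPORT_BUCKET_AWS_KEY and CUBEJS_DB_EXPORT_BUCKET_AWS_SECRET.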

SSL

Cube does not require any additional configuration to enable SSL as AWS Athena connections are made over HTTPS.