Redshift

Connect Kubit directly to Amazon Redshift so it queries your data in place. You grant Kubit's VPCs access to your cluster, create a read-only user, and grant it SELECT on the data it should see. Nothing is copied out of your account.

Steps

1. Grant access to VPCs

Share the AWS region of your Redshift cluster with the Kubit team.
Kubit provides you with AWS account and VPC IDs.
Follow AWS's Granting access to a VPC guide to allow access from Kubit's VPCs to Redshift.

2. Create a user

In the database you are sharing with Kubit, create a user for Kubit with a password that meets AWS's password requirements.


CREATE USER kubit WITH PASSWORD '<password>';
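
To confirm the user was created, a quick check against the pg_user catalog view can help. This is a minimal sketch that assumes the user name kubit from the statement above; adjust it if you chose a different name.

```sql
-- Returns one row if the kubit user exists; an empty result means it was not created.
SELECT usename, usesuper, usecreatedb
FROM pg_user
WHERE usename = 'kubit';
```

For a read-only user, both usesuper and usecreatedb would be expected to be false.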

3. Grant permissions

Let the Kubit user read your analytical data. Either grant access to all tables in a schema:


GRANT USAGE ON SCHEMA public TO kubit;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO kubit;

Or restrict it to a specific set of tables or views:


GRANT USAGE ON SCHEMA public TO kubit;
GRANT SELECT ON TABLE public.table_1 TO kubit;
GRANT SELECT ON TABLE public.table_2 TO kubit;
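
Note that GRANT ... ON ALL TABLES covers only the tables that exist when it runs. The sketch below shows one way to extend read access to tables created later in the schema, and then to check that the grants took effect. It assumes the public schema and the example table public.table_1 used above. The default-privileges statement is optional and only makes sense if you chose the schema-wide option.

```sql
-- Optional: make tables created later in the schema readable by kubit as well.
-- Without a FOR USER clause, this applies only to tables created by the user running it.
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO kubit;

-- Check the grants: each expression should return true.
SELECT
    has_schema_privilege('kubit', 'public', 'usage') AS can_use_schema,
    has_table_privilege('kubit', 'public.table_1', 'select') AS can_read_table_1;
```

If either check returns false, re-run the corresponding GRANT as a user with sufficient privileges on the schema or table.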

Best practices

Cluster type

Amazon Redshift Serverless and Provisioned (Classic) suit different workloads and cost structures. Serverless fits unpredictable, intermittent, or ad hoc workloads with pay-as-you-go pricing (RPU-hours). Provisioned fits consistent, high-throughput, predictable workloads where reserving capacity gives better long-term cost efficiency.

Because analytical workloads tend to be unpredictable and ad hoc, we recommend Amazon Redshift Serverless with Kubit.

Distribution and sort keys

Kubit analyzes time-series event data, so for the best performance review AWS's Choose the best sort keys guide. For fact tables, include these columns in the sort key:

The event date column.
The event name column, since it is the most common filter.

In a dimensional model, also consider distribution keys and styles. A fact table can have only one distribution key but usually joins to several dimension tables, so the key can match only one of those joins. Consult AWS's Choose the best distribution style guide to decide which.
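
As an illustration of the recommendations above, here is a minimal sketch of an event fact table with a compound sort key on event date and event name. The table and column names are hypothetical. Using user_id as the distribution key is an assumption that suits models where most joins and per-user aggregations run on the user identifier. It is not a Kubit requirement.

```sql
-- Hypothetical event fact table; rename columns to match your own model.
CREATE TABLE public.events (
    event_date      DATE          NOT NULL,  -- leading sort key column: time-range filters
    event_name      VARCHAR(256)  NOT NULL,  -- second sort key column: most common filter
    user_id         VARCHAR(64)   NOT NULL,  -- assumed join key to the user dimension
    event_timestamp TIMESTAMP     NOT NULL
)
DISTSTYLE KEY
DISTKEY (user_id)
COMPOUND SORTKEY (event_date, event_name);
```

With a compound sort key, filters on event_date alone or on event_date plus event_name benefit the most, which is why the date column comes first here.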

Next steps

Understand the model: Warehouse Native
Allow Kubit through your firewall: IP Whitelist