Configure Existing Redshift Cluster for Datacoral
As part of the Datacoral installation, you can either spin up a new Redshift cluster or use an existing Redshift cluster as the warehouse. Click here for instructions on how to create a new Datacoral installation with your existing Redshift cluster directly through the Amazon Redshift Console.
Note that if you want to use an existing Redshift cluster, you must make sure its networking configuration is set up appropriately: services within the Datacoral VPC must be able to connect to the VPC that the Redshift cluster is in. In addition, Datacoral will not provide additional management capabilities for an existing cluster, such as:
- WLM Management
- Managed resizes
- Query management
To utilize an existing Redshift cluster with the Datacoral installation, please follow the steps below:
Step 1: Create datacoral user
Execute the following commands as the master user or another privileged user in Redshift.
Please refer to this link to set the password according to the Redshift password policy.
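The password rules can be checked before the user is created. The sketch below is illustrative only and is not Datacoral's own command. It assumes the user is named `datacoral`, as the step title suggests, and encodes the password rules as documented for Redshift's CREATE USER: 8 to 64 characters, at least one uppercase letter, one lowercase letter, and one digit, using printable ASCII except ', ", \, / and @. The helper names `is_valid_redshift_password` and `create_user_sql` are hypothetical. Confirm the rules against the linked policy before relying on them.

```python
import re

# Characters that Redshift does not accept in a user password (assumed from
# the CREATE USER documentation): ' " \ / @
_FORBIDDEN = set("'\"\\/@")


def is_valid_redshift_password(password: str) -> bool:
    """Return True if the password satisfies the assumed Redshift rules."""
    if not 8 <= len(password) <= 64:
        return False
    # Only printable ASCII codes 33-126 are allowed, minus the forbidden set.
    for ch in password:
        if not 33 <= ord(ch) <= 126 or ch in _FORBIDDEN:
            return False
    has_upper = any(ch.isupper() for ch in password)
    has_lower = any(ch.islower() for ch in password)
    has_digit = any(ch.isdigit() for ch in password)
    return has_upper and has_lower and has_digit


def create_user_sql(username: str, password: str) -> str:
    """Build a CREATE USER statement (hypothetical helper, not a Datacoral API)."""
    # Restrict the name to a plain lowercase identifier so it needs no quoting.
    if not re.fullmatch(r"[a-z_][a-z0-9_]*", username):
        raise ValueError("user name must be a simple lowercase identifier")
    if not is_valid_redshift_password(password):
        raise ValueError("password does not satisfy the Redshift password rules")
    return f"CREATE USER {username} PASSWORD '{password}';"
```

Calling `create_user_sql("datacoral", "<your password>")` returns a statement that can be run as the master user in any SQL client connected to the cluster.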
Step 2: Grant privileges to datacoral user
Execute the following commands as the master user or another privileged user in Redshift.
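The exact privileges Datacoral requires are not listed on this page, so the following is only a sketch of what a set of grants could look like. The database name, the schema names, and the choice of privileges (CREATE on the database, USAGE and SELECT on each schema) are all assumptions made for illustration, and `grant_statements` is a hypothetical helper. Substitute the privileges given in the Datacoral instructions.

```python
from typing import List


def grant_statements(database: str, schemas: List[str],
                     user: str = "datacoral") -> List[str]:
    """Build example GRANT statements for the given user.

    The privilege set here is an assumption for illustration, not the
    list Datacoral actually requires.
    """
    # Allow the user to create schemas in the database.
    statements = [f"GRANT CREATE ON DATABASE {database} TO {user};"]
    for schema in schemas:
        # USAGE lets the user reference objects in the schema;
        # SELECT lets the user read its existing tables.
        statements.append(f"GRANT USAGE ON SCHEMA {schema} TO {user};")
        statements.append(
            f"GRANT SELECT ON ALL TABLES IN SCHEMA {schema} TO {user};"
        )
    return statements


if __name__ == "__main__":
    # "analytics" and "public" are placeholder names.
    for stmt in grant_statements("analytics", ["public"]):
        print(stmt)
```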
Step 3: Set up connectivity to your existing Redshift cluster
You can allow Datacoral to connect to your existing Redshift cluster using one of the options below:
- Add Datacoral's Elastic IP to your Redshift cluster's security group. Click here to see detailed instructions for this option
- VPC peering
  - Set up VPC peering between the Datacoral VPC and the Redshift VPC
  - Enable outbound rules to the Redshift port from the Datacoral VPC
  - Enable inbound rules on the Redshift port for traffic from the Datacoral VPC
  - Add route table entries to subnets and security groups as applicable
- For a more advanced configuration that allows access to your Redshift cluster through a copy role, please go through the steps here
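Whichever option you choose, a quick way to confirm that the network path is open is a plain TCP check against the cluster endpoint. The sketch below assumes the cluster listens on 5439, the default Redshift port, and uses a hypothetical helper name, `can_reach`. It only proves reachability from the machine it runs on, so it is meaningful only when run from a host that shares the network path Datacoral will use (for example, a host inside the peered VPC).

```python
import socket


def can_reach(host: str, port: int = 5439, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, `can_reach("<your-cluster-endpoint>")` returns `False` when a security group, route table, or peering connection is still blocking traffic.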
Step 4: Follow the Datacoral onboarding flow
Once the above steps are done, you can provide the credentials for your Redshift cluster in the onboarding flow.
[Optional] Step 5: Enable Redshift Logging
All Redshift queries that Datacoral uses for monitoring, along with their outputs, can be logged to an S3 bucket. See the steps here for how to set this up yourself.