Configure Cloudera Data Warehouse for Datacoral

To set up the Datacoral platform with Cloudera Data Warehouse (CDW), follow the steps below.

Step 1: Set up the S3 bucket policy and KMS key policy

If you are onboarding into Datacoral using your own AWS account, follow the instructions below:

  1. Log in to your DWX UI and navigate to 'Environments'
  2. Select the CDW cluster in your environment that you wish to connect to
  3. Navigate to the CDW cluster configuration page
  4. Check the box that says 'Configure an external bucket'. Once you check the box, a JSON snippet containing the bucket policy appears. Copy the policy
  5. Log in to the AWS account where Datacoral is, or will be, installed
  6. Go to the S3 service console and select the data bucket that Datacoral created for your data. It should be named {installation}-datacoral
  7. Update the bucket policy by pasting the JSON that you copied in step 4, which should look like the JSON below
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "read-write-access-for-cdw-env-xyz",
          "Effect": "Allow",
          "Principal": {
            "AWS": [
              "arn:aws:iam::<accountid>:role/<role1>",
              "arn:aws:iam::<accountid>:role/<role2>"
            ]
          },
          "Action": [
            "s3:Get*",
            "s3:ListBucket",
            "s3:GetBucketLocation"
          ],
          "Resource": [
            "arn:aws:s3:::<installation>-datacoral",
            "arn:aws:s3:::<installation>-datacoral/*"
          ]
        }
      ]
    }
  8. Since Datacoral's S3 bucket is encrypted, you also need to add the following statement to the key policy of the KMS key. Replace <role1> and <role2> in the JSON below with the roles you see in the bucket policy above
    {
      "Sid": "Enable IAM User Permissions for CDW user",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "<role1>",
          "<role2>"
        ]
      },
      "Action": "kms:*",
      "Resource": "*"
    }
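
Steps 7 and 8 use the same role ARNs in two places, so copying them by hand is an easy place to introduce a typo. The following is a minimal sketch, using only the Python standard library, of how the KMS statement could be derived from the copied bucket policy and appended to an existing key policy document. The function names (cdw_role_arns, kms_statement, add_statement) and the "existing-statement" entry are hypothetical and not part of Datacoral or Cloudera tooling; the sketch only manipulates JSON locally and does not call AWS.

```python
import json


def cdw_role_arns(bucket_policy):
    """Collect the AWS principals listed in a bucket policy, in order, without duplicates."""
    arns = []
    for statement in bucket_policy.get("Statement", []):
        principal = statement.get("Principal")
        if not isinstance(principal, dict):
            continue  # e.g. "Principal": "*" carries no role ARNs
        aws = principal.get("AWS", [])
        if isinstance(aws, str):
            aws = [aws]  # IAM allows a single ARN as a plain string
        for arn in aws:
            if arn not in arns:
                arns.append(arn)
    return arns


def kms_statement(role_arns):
    """Build the key policy statement from step 8 for the given roles."""
    return {
        "Sid": "Enable IAM User Permissions for CDW user",
        "Effect": "Allow",
        "Principal": {"AWS": list(role_arns)},
        "Action": "kms:*",
        "Resource": "*",
    }


def add_statement(policy, statement):
    """Return a copy of policy with statement appended, replacing any statement with the same Sid."""
    kept = [s for s in policy.get("Statement", []) if s.get("Sid") != statement["Sid"]]
    return {**policy, "Statement": kept + [statement]}


# The bucket policy copied from the CDW configuration page (placeholders left as-is).
bucket_policy = json.loads("""
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "read-write-access-for-cdw-env-xyz",
    "Effect": "Allow",
    "Principal": {"AWS": ["arn:aws:iam::<accountid>:role/<role1>",
                          "arn:aws:iam::<accountid>:role/<role2>"]},
    "Action": ["s3:Get*", "s3:ListBucket", "s3:GetBucketLocation"],
    "Resource": ["arn:aws:s3:::<installation>-datacoral",
                 "arn:aws:s3:::<installation>-datacoral/*"]
  }]
}
""")

# Hypothetical stand-in for the key policy that already exists on the KMS key.
key_policy = {"Version": "2012-10-17", "Statement": [{"Sid": "existing-statement"}]}

roles = cdw_role_arns(bucket_policy)
updated_key_policy = add_statement(key_policy, kms_statement(roles))
print(json.dumps(updated_key_policy, indent=2))
```

Because add_statement replaces a statement with a matching Sid rather than appending a second copy, running the sketch again after the roles change should leave a single CDW statement in the key policy.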

If you are onboarding into Datacoral using Datacoral's hosted installation, follow the instructions below:

  1. Log in to your CDP console and navigate to 'Environments'
  2. Select the CDW cluster in your environment that you wish to connect to
  3. Navigate to the CDW cluster configuration page
  4. Check the box that says 'Configure an external bucket'. Once you check the box, a JSON snippet containing the bucket policy appears. Copy the bucket policy
  5. Email the bucket policy to support@datacoral.co with the subject 'Configure CDW for {installation}', replacing {installation} with the installation name you chose

Step 2: Add Elastic IP to your network policy

Find the Elastic IP of your Datacoral installation in the installation settings. You need to whitelist this IP by adding it to the network policy in the AWS account where Cloudera Data Warehouse is hosted.
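
The text above does not say which mechanism the network policy uses. As a rough sketch, assuming it is an EC2 security group, the entry for a single Elastic IP would be a /32 CIDR rule. The helper below (ingress_rule, a hypothetical name) validates the address with the standard library and builds a dictionary in the shape EC2 uses for ingress permissions. The address 203.0.113.10 and port 443 are placeholders only; use the Elastic IP from your installation settings and the port your CDW endpoint actually listens on.

```python
import ipaddress
import json


def ingress_rule(elastic_ip, port, description="Datacoral installation"):
    """Build an EC2-style ingress permission that allows one IPv4 address on one TCP port."""
    # IPv4Address raises a ValueError subclass if the string is not a valid IPv4 address.
    ip = ipaddress.IPv4Address(elastic_ip)
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        # /32 limits the rule to exactly this address.
        "IpRanges": [{"CidrIp": f"{ip}/32", "Description": description}],
    }


# Placeholder values for illustration only.
rule = ingress_rule("203.0.113.10", 443)
print(json.dumps(rule, indent=2))
```

Validating the address before building the rule catches copy-and-paste mistakes, such as a trailing character or a hostname, before anything is applied to the network policy.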

Step 3: Follow the Datacoral onboarding flow

Once the above steps are done, you can provide your Cloudera Data Warehouse credentials in the onboarding flow.