Datacoral Documentation

Install Datacoral

You can create your Datacoral Installation within your own AWS Account using our CLI. By following the steps below, you will be able to start seeing your data flowing within an afternoon!

Before you start

As part of our installation we deploy resources in your AWS account. To do this, Datacoral requires you to have the following:

| Prerequisite | Details |
| --- | --- |
| AWS CLI | Install and configure the AWS CLI on your laptop/workstation. (Advanced) If you have multiple AWS profiles on your machine, prefix the Datacoral CLI commands with AWS_PROFILE=<profile with admin credentials>, or reset the default profile. If you aren't sure which profiles you have, the credentials file containing your profiles is usually in ~/.aws/. |
| AWS Admin Privileges | Datacoral itself does not get admin privileges, but you need them to prepare the AWS account for Datacoral. The preparation step creates the IAM roles that the Datacoral software uses. |
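
If you are unsure which profile names are available for AWS_PROFILE, a short script can list them. The following is a minimal sketch, assuming the credentials file uses the standard INI layout with one `[profile-name]` section per profile; the function name `list_aws_profiles` is hypothetical and not part of the Datacoral or AWS CLI.

```python
import configparser


def list_aws_profiles(credentials_text: str) -> list[str]:
    """Return the profile names found in the text of an AWS credentials file."""
    parser = configparser.ConfigParser()
    # Each [section] header in the credentials file names one profile.
    parser.read_string(credentials_text)
    return parser.sections()
```

To use it against your own file, pass in the file content, for example `list_aws_profiles(Path.home().joinpath(".aws", "credentials").read_text())` with `Path` imported from `pathlib`.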

In addition, before you start creating the installation, you will need to have the following information handy:

| Information needed | Description |
| --- | --- |
| AWS Region and Availability Zone | We recommend deploying your Datacoral installation in the same region as your primary production systems, to minimize network latency when data moves between those systems and your Datacoral installation. |
| CIDR Block (Advanced) | If you plan to use VPC peering instead of security groups to allow connections to your production databases, you need to specify a non-overlapping CIDR block for the Datacoral VPC to use. Please use a 16-bit prefix, like 10.192.0.0/16. |
| Redshift Cluster Configuration (Advanced) | Make sure you know how many nodes, and what type of nodes, you need for your Redshift cluster. If you don't know, you can resize the cluster later, but you may incur some downtime. |

Limitations

The self-service initialization works for most customers. However, the following types of deployments cannot be self-served and must be guided by Datacoral customer success:

  • Deployments that do not use Redshift. The self-service install requires Redshift to be used - either an existing instance or new instance.
  • Connecting to existing Redshift clusters.

Pre-Prerequisites

Follow the steps in Install Datacoral CLI to install the Datacoral CLI on your laptop/workstation.

Step 1

Prepare your AWS account

Create necessary roles, KMS Key, SSH Keypair

The following command gets your AWS account ready for a Datacoral Installation. In addition, this command fetches the CIDR blocks of the VPCs in your AWS account. The Datacoral Installation's VPC will use a CIDR block that does not overlap with your existing VPCs.

datacoral prepare-aws-account

This command will take about 10 minutes to run. Please keep your laptop/workstation running until this command completes.

Step 2

Initialize Datacoral Installation

Initialize your Datacoral Installation in your AWS account. This step provisions several core AWS resources needed for the Datacoral Installation. These resources include:

  • Separate VPC for all Datacoral resources
  • S3 buckets for data and monitoring
  • CloudTrail and VPC logs
  • Lambda functions for monitoring and administration
  • New Redshift Cluster for all of your data. If you want to use your existing Redshift Cluster, please contact us at support@datacoral.co.

You will get an email once the initialization is completed. This could take anywhere between one and two hours.

Create Redshift params

Create a JSON file with the following attributes:

{
  "nodetype": "<node type>",
  "nodecount": 1
}

<node type> can be one of:

  1. Dense compute - dc2.large and dc2.8xlarge
  2. Dense storage - ds2.xlarge and ds2.8xlarge
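
The parameters file can also be produced by a small script that rejects typos in the node type before you start a long-running initialization. This is a minimal sketch, assuming only the two attributes and the four node types listed above are relevant; `build_redshift_params` is a hypothetical helper, not part of the Datacoral CLI.

```python
import json

# Node types listed in this guide; anything else is treated as a typo here.
ALLOWED_NODE_TYPES = {"dc2.large", "dc2.8xlarge", "ds2.xlarge", "ds2.8xlarge"}


def build_redshift_params(nodetype: str, nodecount: int) -> str:
    """Return the content of the Redshift parameters file as a JSON string."""
    if nodetype not in ALLOWED_NODE_TYPES:
        raise ValueError(f"unsupported node type: {nodetype}")
    if nodecount < 1:
        raise ValueError("nodecount must be at least 1")
    return json.dumps({"nodetype": nodetype, "nodecount": nodecount}, indent=2)
```

Writing the returned string to a file, for example with `Path("parameters-file.json").write_text(...)`, gives you the file to pass to the CLI.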

To learn more, see AWS Redshift Pricing.

By default, Datacoral uses the CIDR block "10.0.0.0/16". The allowed block size is between a /16 netmask (65,536 IP addresses) and a /22 netmask (1,024 IP addresses). If you would like to use another CIDR block, specify it by adding the switch --vpc-cidr-block [VPC CIDR block] to the command below.
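
Before passing a custom block to --vpc-cidr-block, you may want to confirm that it falls within the allowed size range and does not overlap the CIDR blocks of your existing VPCs. The sketch below uses Python's standard `ipaddress` module; `check_vpc_cidr` is a hypothetical helper, and the list of existing blocks is something you would supply yourself.

```python
import ipaddress


def check_vpc_cidr(candidate: str, existing: list[str]) -> ipaddress.IPv4Network:
    """Validate a candidate VPC CIDR block for size and overlap."""
    # Raises ValueError if the string is malformed or has host bits set.
    network = ipaddress.IPv4Network(candidate)
    # A /16 is the largest allowed block and a /22 the smallest.
    if not 16 <= network.prefixlen <= 22:
        raise ValueError(f"{candidate}: prefix must be between /16 and /22")
    for other in existing:
        if network.overlaps(ipaddress.IPv4Network(other)):
            raise ValueError(f"{candidate} overlaps existing block {other}")
    return network
```

For example, `check_vpc_cidr("10.192.0.0/16", ["10.0.0.0/16"])` passes, while `check_vpc_cidr("10.0.4.0/22", ["10.0.0.0/16"])` raises a `ValueError` because the candidate lies inside the existing block.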

Run the following Datacoral CLI command with the path to the parameters file created in the previous step:

datacoral init-installation --add-managed-redshift /path/to/parameters-file.json

Step 3

Add Collect Slices

Once the Datacoral installation is initialized, you will receive a notification email. You can then add slices to start ingesting data into the installation. Please refer to the supported slices.

Step 4

Connect to Redshift

Depending on your network settings, you may need to authorize access to the Redshift cluster by modifying its inbound rules. Follow the Redshift documentation to continue setting up your connection to Redshift.

Now, set up a connection to Redshift using a Postgres client of your choice. See the Redshift documentation for details. Some examples are:

  • Postico on Mac
  • Postgres Command Line Client (psql) on Mac, Windows, or Linux
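
Most Postgres clients, including psql, accept a connection URI. The sketch below builds one, assuming the cluster listens on the Redshift default port 5439 and that you want an SSL connection; `redshift_uri` is a hypothetical helper, and the host, database, and user values come from your own cluster. The password is left out so that the client prompts for it.

```python
from urllib.parse import quote


def redshift_uri(host: str, database: str, user: str, port: int = 5439) -> str:
    """Build a libpq-style connection URI for a Redshift cluster."""
    # Percent-encode the user and database names in case they contain
    # characters that are not valid in a URI.
    return (
        f"postgresql://{quote(user, safe='')}@{host}:{port}"
        f"/{quote(database, safe='')}?sslmode=require"
    )
```

You can pass the resulting string to psql as its single argument, for example `psql "postgresql://master@<cluster endpoint>:5439/<database>?sslmode=require"`.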

We also offer a hosted open source SQL editor and BI tool, Metabase, as part of the Datacoral Installation. Let us know at support@datacoral.co if you want it.

Step 5

Lock Down Redshift

When creating a new Redshift cluster, Datacoral creates the following users in Redshift:

Master User

The master user has admin privileges on the entire cluster. Datacoral never needs to use these credentials while running the installation. You can change this user's password from your Postgres client and not share it with us.

ALTER USER master PASSWORD '<your new password>';

Note: Redshift passwords must be 8 to 64 characters long and contain at least one lowercase letter, one uppercase letter, and one number. See the Redshift ALTER USER documentation for more details.
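
If you would rather generate the new password than invent one, the following sketch produces a random password containing a lowercase letter, an uppercase letter, a digit, and a special character. It assumes, based on the Redshift ALTER USER documentation, that passwords are 8 to 64 characters long and may not contain a single quote, double quote, backslash, forward slash, at sign, or space; `generate_password` and `SAFE_SPECIALS` are hypothetical names.

```python
import secrets
import string

# Special characters that avoid the ones assumed to be rejected: ' " \ / @ and space.
SAFE_SPECIALS = "!#$%^&*()-_=+"


def generate_password(length: int = 24) -> str:
    """Generate a random password with all four character classes present."""
    if not 8 <= length <= 64:
        raise ValueError("length must be between 8 and 64")
    # Guarantee one character from each class, then fill the remainder.
    required = [
        secrets.choice(string.ascii_lowercase),
        secrets.choice(string.ascii_uppercase),
        secrets.choice(string.digits),
        secrets.choice(SAFE_SPECIALS),
    ]
    alphabet = string.ascii_letters + string.digits + SAFE_SPECIALS
    chars = required + [secrets.choice(alphabet) for _ in range(length - len(required))]
    # Shuffle so the guaranteed characters do not sit in fixed positions.
    secrets.SystemRandom().shuffle(chars)
    return "".join(chars)
```

The generated value can be pasted into the ALTER USER statement in place of `<your new password>`.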

Datacoral User

Datacoral uses a dedicated user with read/write privileges to access the cluster. This user can create, alter, and drop schemas and tables. The Datacoral Installation software connects to Redshift as this user to perform operations such as:

  • inserting raw data from collect slices into tables
  • automatically updating schemas of raw data tables
  • implementing materialized views
  • monitoring for long running queries
  • analyzing and vacuuming tables

For added safety, you can alter the password of this user and then encrypt the password before sending it to us. Contact us at support@datacoral.co if you want to do this.

Step 6

Analyze Away!

The installation is now ready for you to use. You can use the Datacoral CLI to add collect slices that pull data from different data sources and load it into Redshift. See the Datacoral CLI Guide and the Datacoral Security Architecture for more details.

Once the data from the collect slices lands in Redshift, you can transform that data by creating materialized views. See Materialized Views.

Copyright © 2019 Datacoral