Step 1: Identify your database credentials
Use an existing database user OR create a new database user for Datacoral. If creating a new user, please run the following statement on your database, replacing
datacoral-password with a password of your choice.
Encrypt password (recommended)
All customer data is encrypted at rest as well as in motion using customer managed KMS keys. Datacoral’s cross-account role does not have decrypt permissions on the KMS keys. However, we do recommend using pre-encrypted credentials, as it allows you to source control configuration files in order to automate installations for Data Ops and share configurations without compromising security.
Please refer to the instructions on encrypting your password before adding them to encrypt your passwords.
Step 2: Make sure your database is addressable from your Datacoral VPC
1. Database accessible via the internet
If your database is hosted on AWS and is not publicly accessible, you need to authorize ingress for Datacoral's elastic IP to the db security group. You can do this in the AWS RDS console or via AWS CLI:
2. Database accessible via VPC Peering
Setup VPC Peering between the Datacoral Installation's VPC and your database's VPC. See VPC peering for how to set it up, if you don't want any traffic from your database to go over the internet.
MySQL Database hosted in AWS:
Modify the subnet group and security groups so that network connections are allowed to the Datacoral VPC.
RDS subnet group - the network ACL should allow inbound and outbound connection to the Datacoral VPC.
RDS security group - we require inbound and outbound rule to and from the Datacoral VPC.
MySQL Database hosted outside AWS:
Please follow the appropriate networking setup to allow for ingress traffic for Datacoral to access your database.
Step 3: Specify the connector configuration
To get a template for the MySQL connector configuration save the output of the
describe --input-parameters command as follows:
Direct Connection to source DB
Fill the below mandatory fields.
- Database - name of your MySQL database
- Host - database host (FQDN)
- Port - database port, for more on ports refer this link
- Username - database user name
- Password - database password
Please specify the input parameters below: Connecting to source via SSH (optional)
- sshTunnelHost - SSH server host name
- sshUser - SSH server username
- privateKey - SSH server private key (We support only RSA keys)
- passphrase - SSH server passphrase
In order to generate SSH
passphrase, please run the following command in a terminal
Please provide passphrase when prompted
Optional input parameters:
|a comma separated list of tables you want to pull. You can also set it to [".%"] for all tables in the public schema or ["%"] for all tables from all the schemas. The default is set to [".%"]|
|list of tables that you do not want to pull|
|set the frequency of extraction in cron format|
Step 4: Specify advanced connector configuration
After a connector is added and bootstrapped, additional options are made available after schema discovery. You can describe the full connector along with the advanced options using the
These advanced configurations can be set the individual table level for fine grained optimization to match your use case.
columnList- List of columns to be replicated from the MySQL table. Helpful to Column-level SQL based transformation at the source for Snapshot & Incremental modes. Ex. conversion of timestamp columns to standard format
columnBlacklist- List of columns which need to be excluded from the MySQL table. The columns that are in this list will not be a part of the data warehouse table. Example:"columnBlacklist" :["password"]
extractionQuery- Custom data extraction query.
Only one of columnList, columnBlacklist and extractionQuery parameters can be provided.
Step 5: Add the connector
connector-nameName of your connector. A schema with your connector-name is automatically created in your warehouse
params-fileFile path to your input parameters file. Ex. mysql_parameters_file.json