MySQL Connector JSON Setup Guide
Another option to update/add MySQL connector is through the JSON editor. Here is a quick guide on how to add and update the connector
Add a connector
Go to the JSON editor tab of the connector and fill in the fields and click on Add connector
- Database - name of your MySQL database
- Host - database host (FQDN)
- Port - database port, for more on ports refer this link
- Username - database user name
- Password - database password
Please specify the input parameters below: Connecting to source via SSH (optional)
- sshTunnelHost - SSH server host name
- sshUser - SSH server username
- privateKey - SSH server private key (We support only RSA keys)
- passphrase - SSH server passphrase
Updating the connector
Step 1: Update the connector
Click on update in the top right corner of the loadunit list page
Step 2: Edit the configuration
You will be redirected to the JSON editor page. Here is a guide on how to edit this file.
A loadunit can be updated in one of the two ways
1) Extraction configuration 2) Load configuration
Use cases for configuration
Use case | Extract mode | Load mode |
---|---|---|
Need only the latest data | Snapshot | Replace |
Need all instances of past data | Snapshot | Append |
Need all the data for analysis | Incremental | Append |
Need all the data with the latest changes | Incremental | Merge |
Note :
- Extraction modes can be paginated in case of higher volumes of data.
- Merge load mode will necessarily need a primary key, but its recommended to provide the timestamp column as well
- If the table is high volume, snapshot is not a feasible option
Extract configuration:
This pertains to the extraction mode, the frequency of extraction and pagination. The extraction configuration of the loadunit is found in inputParams
> loadunits
The following parameters can be added or edited.
The extractionmode
parameter can take two values - snapshot and incremental
Parameter | Description | S | S(P) | I | I(P) |
---|---|---|---|---|---|
timestampCol | Timestamp column name | m | m | ||
schedule | Parameter will take input in cron format, set | o | o | o | o |
paginate | Takes boolean input, true when paginating | m | m | ||
pageSize | Takes numeric input indicating size of page | m | |||
pageSortCol | Unique key (usually primary key) | m | |||
pageCount | Number of pages | m | |||
dataformat | The format of the extracted data | o | o | o | o |
columnlist | Comma-separated column names that needs to be included | o | o | o | o |
columnBlacklist | Comma-separated column names that needs to be excluded | o | o | o | o |
extractionquery | A SQL query to extract at the source | o | o | o | o |
verifyrecords | Takes a boolean input to verify if the number of records fetched is equal to the number of records in the source | o | o | o | o |
lagInMinutes | Lag in minutes before starting the loading of data | o | o | o | o |
redshiftTimestampCol | Parameter used to compare latest value of timestampCol at the source and redshift warehouse | o | o | o | o |
Legend:
- S - snapshot mode
- S(P) - snapshot mode with pagination
- I - incremental mode
- I(P) - incremental mode with pagination
- m - mandatory parameter
- o - optional parameter
note
- Changing the schedule is likely to impact downstream dependencies
- It is recommended to move to incremental mode from snapshot paginate in case the data volume increases
- Dataformat will depend on the connector, for mySQL it is csv
- It is recommended to keep
lagInMinutes
as a multiple of schedule interval - Exection query includes where clause as well
Load configuration:
This pertains to load configuration onto the warehouse. The load configuration of the loadunit is found in loaderConfig
> loadunit
The following parameters can be added or edited. There are three load modes : merge, update and append to learn more please visit features and capabilities tab.
Parameter | Description | Merge | Update | Append |
---|---|---|---|---|
targetTable | Name the target table in the warehouse in the format <connector-name>.<target-table-name> | o | o | o |
dataFormat | Format of the loaded data, ex:parquet, json, avro, orc, csv (warehouse specific) | o | o | o |
copyOptions | Click redshift, snowflake to learn more | o | o | o |
TruncateTargetTable | Takes boolean input to indicate load mode. It is true for replace,false for append and merge modes | m | m | m |
IsUpdate | Boolean, Only for merge load mode | m | ||
PrimaryKeyCol | Primary key of the source | m | ||
UpdateTSCol | Timestamp column which captures the updated time | m | ||
DeleteTSCol | Timestamp column which indicates record deletion | o | ||
softDeletes | Takes boolean value, should be set as true to capture soft deletes | o |
Legend:
- m - mandatory parameter
- o - optional parameter
note
- The new loadunit should be added in both inputParams and loaderConfig with all the mandatory fields
- Timestamp column should be indexed and auto incremented
- Extraction query is applicable only for database connectors
Ask a question
If you have questions or feedback, feel free to reach out at hello@datacoral.co or Request a demo