MySQL Connector JSON Setup Guide

Another way to add or update the MySQL connector is through the JSON editor. Here is a quick guide to adding and updating the connector.

Add a connector

Go to the JSON editor tab of the connector, fill in the fields below, and click on Add connector:

  • Database - name of your MySQL database
  • Host - database host (FQDN)
  • Port - database port; for more on ports, refer to this link
  • Username - database user name
  • Password - database password

    Connecting to source via SSH (optional)

    Please specify the input parameters below:
  • sshTunnelHost - SSH server host name
  • sshUser - SSH server username
  • privateKey - SSH server private key (only RSA keys are supported)
  • passphrase - SSH server passphrase
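Putting the fields above together, a complete connector configuration might look like the sketch below. All values are illustrative, and the exact JSON key names (e.g. `database`, `host`) are assumptions based on the parameter lists above:

```json
{
  "database": "mydb",
  "host": "mysql.example.com",
  "port": 3306,
  "username": "datacoral_user",
  "password": "********",
  "sshTunnelHost": "bastion.example.com",
  "sshUser": "tunnel_user",
  "privateKey": "-----BEGIN RSA PRIVATE KEY-----\n...",
  "passphrase": "********"
}
```

If you are not connecting via SSH, the four optional keys (sshTunnelHost, sshUser, privateKey, passphrase) can be omitted.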

Updating the connector

Step 1: Update the connector

Click on Update in the top-right corner of the loadunit list page.

Step 2: Edit the configuration

You will be redirected to the JSON editor page. Here is a guide on how to edit this file.

A loadunit can be updated in one of two ways:

1) Extraction configuration
2) Load configuration

Use cases for configuration

| Use case | Extract mode | Load mode |
| --- | --- | --- |
| Need only the latest data | Snapshot | Replace |
| Need all instances of past data | Snapshot | Append |
| Need all the data for analysis | Incremental | Append |
| Need all the data with the latest changes | Incremental | Merge |

Note:

  • Extraction modes can be paginated in case of higher volumes of data.
  • Merge load mode requires a primary key; it is recommended to provide the timestamp column as well.
  • If the table is high volume, snapshot is not a feasible option.

Extract configuration:

This pertains to the extraction mode, the frequency of extraction, and pagination. The extraction configuration of the loadunit is found in inputParams > loadunits.

The following parameters can be added or edited. The extractionmode parameter can take two values: snapshot and incremental.

| Parameter | Description | S | S(P) | I | I(P) |
| --- | --- | --- | --- | --- | --- |
| timestampCol | Timestamp column name | | | m | m |
| schedule | Takes input in cron format | o | o | o | o |
| paginate | Takes boolean input, true when paginating | | m | | m |
| pageSize | Takes numeric input indicating the size of a page | | m | | |
| pageSortCol | Unique key (usually the primary key) | | m | | |
| pageCount | Number of pages | | m | | |
| dataformat | The format of the extracted data | o | o | o | o |
| columnlist | Comma-separated column names to be included | o | o | o | o |
| columnBlacklist | Comma-separated column names to be excluded | o | o | o | o |
| extractionquery | A SQL query to extract at the source | o | o | o | o |
| verifyrecords | Takes a boolean input to verify that the number of records fetched equals the number of records at the source | o | o | o | o |
| lagInMinutes | Lag in minutes before starting the loading of data | o | o | o | o |
| redshiftTimestampCol | Used to compare the latest value of timestampCol at the source and the Redshift warehouse | o | o | o | o |

Legend:

  • S - snapshot mode
  • S(P) - snapshot mode with pagination
  • I - incremental mode
  • I(P) - incremental mode with pagination
  • m - mandatory parameter
  • o - optional parameter
Note:

  • Changing the schedule is likely to impact downstream dependencies.
  • It is recommended to move from snapshot with pagination to incremental mode if the data volume increases.
  • Dataformat depends on the connector; for MySQL it is csv.
  • It is recommended to keep lagInMinutes as a multiple of the schedule interval.
  • The extraction query includes the WHERE clause as well.
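For example, an incremental loadunit entry under inputParams > loadunits could be sketched as follows. The surrounding structure, the loadunit name orders, and all values are illustrative assumptions; only the parameter names come from the table above:

```json
{
  "inputParams": {
    "loadunits": {
      "orders": {
        "extractionmode": "incremental",
        "timestampCol": "updated_at",
        "schedule": "*/30 * * * *",
        "dataformat": "csv",
        "verifyrecords": true,
        "lagInMinutes": 60
      }
    }
  }
}
```

Here the cron expression `*/30 * * * *` runs the extraction every 30 minutes, and lagInMinutes is kept as a multiple of that interval, as recommended in the notes above.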

Load configuration:

This pertains to loading data into the warehouse. The load configuration of the loadunit is found in loaderConfig > loadunit.

The following parameters can be added or edited. There are three load modes: merge, update, and append; to learn more, please visit the features and capabilities tab.

| Parameter | Description | Merge | Update | Append |
| --- | --- | --- | --- | --- |
| targetTable | Name of the target table in the warehouse, in the format `<connector-name>.<target-table-name>` | o | o | o |
| dataFormat | Format of the loaded data, e.g. parquet, json, avro, orc, csv (warehouse specific) | o | o | o |
| copyOptions | Click redshift, snowflake to learn more | o | o | o |
| TruncateTargetTable | Takes boolean input to indicate load mode: true for replace, false for append and merge modes | m | m | m |
| IsUpdate | Boolean, only for merge load mode | m | | |
| PrimaryKeyCol | Primary key of the source | m | | |
| UpdateTSCol | Timestamp column which captures the updated time | m | | |
| DeleteTSCol | Timestamp column which indicates record deletion | o | | |
| softDeletes | Takes boolean value; set to true to capture soft deletes | o | | |

Legend:

  • m - mandatory parameter
  • o - optional parameter
Note:

  • A new loadunit should be added in both inputParams and loaderConfig with all the mandatory fields.
  • The timestamp column should be indexed and auto-incremented.
  • The extraction query is applicable only for database connectors.
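As a sketch, a merge-mode loadunit entry under loaderConfig could look like this. The loadunit name orders and all values are illustrative assumptions; only the parameter names come from the table above:

```json
{
  "loaderConfig": {
    "orders": {
      "targetTable": "mysql_connector.orders",
      "dataFormat": "csv",
      "TruncateTargetTable": false,
      "IsUpdate": true,
      "PrimaryKeyCol": "id",
      "UpdateTSCol": "updated_at",
      "softDeletes": false
    }
  }
}
```

TruncateTargetTable is false because this is merge mode (it is true only for replace), and PrimaryKeyCol and UpdateTSCol are supplied since merge mode requires a primary key and a timestamp column.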

Ask a question

If you have questions or feedback, feel free to reach out at hello@datacoral.co or Request a demo