Datacoral generated column names
MySQL CDC adds 4 extra columns to the tables in your warehouse so that you can effectively track changes to your data. The columns are:
__dc_cdc_modified_at: When a row is first read from the source or is modified in the source, this column is updated with the UTC timestamp of when that row was synced.
__dc_cdc_deleted_at: When a row is deleted at the source, this column is updated with the UTC timestamp. If soft deletes are not enabled, the entire row is deleted in the warehouse instead.
The two columns above (__dc_cdc_modified_at and __dc_cdc_deleted_at) are of type BIGINT in the warehouse and are populated with the UTC timestamp, at nanosecond precision, at the time of sync.
__dc_timelabel: The timelabel at which the row was synced. This will be of string type.
__dc_load_time: When a row is first read from the source or is modified in the source, this column is updated with the UTC timestamp of when that row was synced. This will be of timestamp type in the warehouse.
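As a rough illustration of how these columns might be consumed downstream, the sketch below converts a BIGINT value into a readable UTC datetime and checks whether a row has been soft deleted. It assumes the BIGINT values are nanoseconds since the Unix epoch and that __dc_cdc_deleted_at is empty (None) for rows that have not been deleted. Neither detail is stated above, so verify both against your own warehouse. The helper names are hypothetical.

```python
from datetime import datetime, timezone

NANOS_PER_SECOND = 1_000_000_000


def ns_to_utc(ns):
    """Convert nanoseconds since the Unix epoch (assumed encoding) to a UTC datetime.

    Python datetimes only hold microseconds, so the last three digits are dropped.
    """
    if ns is None:
        return None
    seconds, nanos = divmod(ns, NANOS_PER_SECOND)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base.replace(microsecond=nanos // 1000)


def is_soft_deleted(row):
    """Treat a row as soft deleted when __dc_cdc_deleted_at is set (assumed: None otherwise)."""
    return row.get("__dc_cdc_deleted_at") is not None


# Example row shaped like a warehouse record; the values are made up for illustration.
row = {
    "id": 42,
    "__dc_cdc_modified_at": 1_600_000_000_123_456_789,
    "__dc_cdc_deleted_at": None,
}
modified_at = ns_to_utc(row["__dc_cdc_modified_at"])
deleted = is_soft_deleted(row)
```

Integer arithmetic is used instead of dividing by 1e9 because a float cannot represent a 19-digit nanosecond value exactly.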
Datacoral Slot Table
As part of its normal operation, the MySQL CDC connector creates a table in Athena to hold the data being replicated. This table contains all the records (inserts, updates, and deletes) that the connector has read from the MySQL binlogs. It is useful for maintaining an audit log or for analyzing deletes that usually get replicated in the warehouses. In Athena, the table is partitioned by the timelabel for which the data was read from the binlogs.
Feel free to create an Athena Materialized View to analyze the data in this table!
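Because the slot table is partitioned by timelabel, queries that filter on the partition column let Athena skip partitions outside the range. The sketch below builds such a query string. The table name, the partition column name, and the timelabel format are all hypothetical placeholders, since none of them are specified here. It also assumes timelabels are fixed-width strings that sort chronologically, so that a string range comparison is meaningful.

```python
import re

# Accept only simple identifiers so that names can be embedded in the query safely.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_slot_table_query(table, partition_column, start_timelabel, end_timelabel):
    """Return a SELECT restricted to an inclusive range of timelabel partitions."""
    for name in (table, partition_column):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    # Double any single quotes so the timelabels stay inside their string literals.
    start = start_timelabel.replace("'", "''")
    end = end_timelabel.replace("'", "''")
    return (
        f'SELECT * FROM "{table}" '
        f"WHERE \"{partition_column}\" BETWEEN '{start}' AND '{end}'"
    )


# Hypothetical names and timelabels; substitute the ones from your installation.
query = build_slot_table_query(
    "mysql_cdc_slot", "timelabel", "20210101000000", "20210102000000"
)
```

The resulting string can be pasted into the Athena console or passed to whichever client you already use to run Athena queries.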