Non-Datacoral Collect Slice
Overview
Some tables in your warehouse may not be populated by Datacoral, for example when they are loaded by other tools or manually. Datacoral allows you to include these tables as dependencies in materialized views by first declaring them as non-datacoral tables. To do that, add a nondatacoral slice corresponding to the schema where the tables are populated.
Note that there needs to be one nondatacoral slice per schema, which keeps tables in different schemas logically separate. Each non-datacoral table becomes a loadunit within the slice.
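The one-slice-per-schema rule can be illustrated with a small planning sketch. This is not part of any Datacoral tooling: the function name group_tables_by_schema and the example table names are hypothetical, and the code only shows how a list of schema-qualified tables maps onto slices and loadunits.

```python
from collections import defaultdict


def group_tables_by_schema(qualified_tables):
    """Map each schema to its tables: one slice per schema, one loadunit per table."""
    slices = defaultdict(list)
    for name in qualified_tables:
        # Split "schema.table" on the first dot only.
        schema, table = name.split(".", 1)
        slices[schema].append(table)
    return {schema: sorted(tables) for schema, tables in slices.items()}


# Hypothetical tables that are loaded by tools other than Datacoral.
tables = ["marketing.leads", "finance.invoices", "marketing.campaigns"]
plan = group_tables_by_schema(tables)
for schema, loadunits in plan.items():
    print(f"slice for schema {schema!r}: loadunits {loadunits}")
```

Here the two marketing tables end up as loadunits of a single slice, while the finance table needs a slice of its own.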
A schedule can be specified at the slice level or at the loadunit level. Once the slice is added, it simply creates SUCCESS timelabels for those loadunits based on the schedule. Because the slice has no visibility into the actual data within the tables, a SUCCESS timelabel does not guarantee that the data in the tables is fresh.
So, be aware of how often each table is updated by the external tool or manually, and specify an appropriate schedule for the non-datacoral slice.
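To see why the schedule should match the external update frequency, here is a minimal sketch that models the behavior described above. It is only an illustration under stated assumptions: timelabels are represented as plain datetime values, the schedule is a fixed interval rather than a cron expression, and the helper names are hypothetical rather than Datacoral APIs.

```python
from datetime import datetime, timedelta


def fixed_interval_timelabels(start, interval, count):
    """Timelabels that a fixed-interval schedule would produce."""
    return [start + i * interval for i in range(count)]


def data_age_at(timelabel, external_load_times):
    """Age of the newest external load completed at or before the timelabel."""
    earlier = [t for t in external_load_times if t <= timelabel]
    return timelabel - max(earlier) if earlier else None


start = datetime(2024, 1, 1, 0, 0)
# Assumption: the slice is scheduled hourly ...
slice_timelabels = fixed_interval_timelabels(start, timedelta(hours=1), 6)
# ... while the external tool loaded the table only once, at 00:00.
external_loads = [start]

for label in slice_timelabels:
    # Every timelabel is SUCCESS, regardless of how old the data is.
    print(label.isoformat(), "SUCCESS", "data age:", data_age_at(label, external_loads))
```

In this model each hourly timelabel is marked SUCCESS even though the data behind it gets one hour older every time, which is why a schedule no more frequent than the external load is the safer choice.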
Steps to add this slice to your installation
The steps to launch your slice are:
- Add the schedule
- Add the list of tables as loadunits and specify the datalayout
- Add the Non-Datacoral slice
Setup instructions
1. Create parameters_file.json with all loadunits
To get a template for the Non-Datacoral slice configuration, save the output of the describe --input-parameters command to a file such as nondatacoral_parameters_file.json.
Input parameters:
schedule - in cron format (note: you can specify different schedules for different loadunits)
2. Modify the nondatacoral_parameters_file.json file to add the loadunits you need.
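As a rough illustration of this step, the sketch below assembles a parameters structure with a slice-level schedule and per-loadunit overrides. The key names ("schedule", "loadunits") and the nesting are assumptions made for the example; the authoritative layout is whatever the describe --input-parameters template contains. The datalayout setting is left out because its format is not shown here.

```python
import json


def build_parameters(default_schedule, loadunits):
    """Build a hypothetical parameters structure for a nondatacoral slice.

    default_schedule: cron expression applied at the slice level.
    loadunits: mapping of table name to per-loadunit overrides
               (for example, a different cron schedule).
    """
    params = {"schedule": default_schedule, "loadunits": {}}
    for table, overrides in loadunits.items():
        params["loadunits"][table] = dict(overrides)
    return params


params = build_parameters(
    "0 0 * * *",  # slice-level schedule: daily at midnight
    {
        "leads": {},  # inherits the slice-level schedule
        "campaigns": {"schedule": "0 * * * *"},  # hourly override
    },
)
text = json.dumps(params, indent=2)
# The text could then be written to nondatacoral_parameters_file.json.
print(text)
```

Keeping the slice-level schedule as the default and overriding it only where a table is refreshed more or less often keeps the file short and makes the exceptions easy to spot.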
3. Add the Slice
slice-name - Name of your slice. A schema with your slice-name is automatically created in your warehouse.
params-file - File path to your input parameters file. Ex. nondatacoral_parameters_file.json
Supported load units
Notes
By default, the slice runs daily. If desired, you can change the slice configuration and specify different schedules for individual loadunits.
Slice output
AWS S3: No data is stored.
AWS Redshift: Schema - the schema name is the same as the slice-name. Tables - the loadunits specified in the slice. Note that these tables already exist and are readable by the datacoral user.
Questions? Interested?
If you have questions or feedback, feel free to reach out at hello@datacoral.co.