Some tables in your warehouse may not be populated by Datacoral, for example when they are loaded by other tools or manually. Datacoral lets you include such tables as dependencies in materialized views once you declare them as non-datacoral tables. To do that, add a nondatacoral slice for the schema where those tables are populated.
Note that there needs to be one nondatacoral slice per schema since it allows for logical separation of tables in different schemas. Each non-datacoral table becomes a loadunit within the slice.
A schedule can be specified at the slice level or at the loadunit level. Once the slice is added, it simply creates SUCCESS timelabels for those loadunits based on the schedule. Because the slice has no visibility into the actual data within the tables, a SUCCESS timelabel is no guarantee of the freshness of the data in the tables.
So, be aware of how often each table is updated by an external tool or manually, and specify an appropriate schedule for the non-datacoral slice.
Steps to add this slice to your installation
The steps to launch your slice are:
- Add the schedule
- Add list of tables as loadunits, specify datalayout
- Add the Non-Datacoral slice
1. Create nondatacoral_parameters_file.json with all loadunits
To get a starting template, save the output of the describe --input-parameters command as follows:
datacoral collect describe --slice-type nondatacoral \
    --input-parameters > nondatacoral_parameters_file.json
2. Edit the nondatacoral_parameters_file.json file to add the loadunits you need
schedule: in cron format (note: you can specify different schedules for different loadunits)
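A hand-edited parameters file can easily end up with a stray comma or missing brace, so it may be worth confirming that it still parses as JSON before adding the slice. The sketch below assumes python3 is available on your PATH. check_params_file is a hypothetical helper name, not part of the Datacoral CLI, and it checks JSON syntax only, not the fields the slice expects.

```shell
# Hypothetical helper (not part of the Datacoral CLI): verifies that a
# parameters file is syntactically valid JSON. It does not validate the
# slice-specific fields inside the file.
check_params_file() {
  params_file="$1"
  if [ ! -f "$params_file" ]; then
    echo "missing file: $params_file" >&2
    return 1
  fi
  # python3's json.tool exits non-zero when the input is not valid JSON.
  if python3 -m json.tool "$params_file" > /dev/null 2>&1; then
    echo "valid JSON: $params_file"
  else
    echo "invalid JSON: $params_file" >&2
    return 1
  fi
}
```

For example, check_params_file nondatacoral_parameters_file.json returns a non-zero status if the file is missing or malformed, so it can gate the next step in a script.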
3. Add the Slice
datacoral collect add --slice-type nondatacoral --slice-name <slice-name> --parameters-file <params-file>
slice-name: Name of your slice. A schema with your slice-name is automatically created in your warehouse.
params-file: File path to your input parameters file, e.g. nondatacoral_parameters_file.json
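If you add slices from a script, the command can be wrapped so that both arguments are checked before anything is sent to Datacoral. This is only a sketch: add_nondatacoral_slice is a hypothetical function name, while the datacoral collect add invocation and its flags are copied from the step above.

```shell
# Hypothetical wrapper around the documented "datacoral collect add" command.
# It checks its arguments locally, then runs the command unchanged.
add_nondatacoral_slice() {
  slice_name="$1"
  params_file="$2"
  if [ -z "$slice_name" ] || [ -z "$params_file" ]; then
    echo "usage: add_nondatacoral_slice <slice-name> <params-file>" >&2
    return 2
  fi
  if [ ! -r "$params_file" ]; then
    echo "cannot read parameters file: $params_file" >&2
    return 1
  fi
  datacoral collect add --slice-type nondatacoral \
    --slice-name "$slice_name" \
    --parameters-file "$params_file"
}
```

Since there needs to be one nondatacoral slice per schema, a wrapper like this could be called once per schema, each time with its own parameters file.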
Supported load units
By default, the slice runs daily. If desired, you can change the slice configuration and specify different schedules for different loadunits.
No data stored in AWS S3
AWS Redshift:
- Schema: the schema name will be the same as the slice-name.
- Tables: the list of loadunits specified in the slice. Note that these tables already exist and are readable by the datacoral user.