How to Add Transformations
Create a Materialized View
Step 1. Populate the Configuration
There are two options for populating the configuration of a materialized view. Both work well and will be explained in the following sections.
Populate a DPL File
A DPL file represents the parameter configurations for a materialized view. Each file will have the format shown below. There are more optional paramters discussed in the Organize Commands Referrence too.
schema-name
: Name of an existing schema that the Datacoral platform has permission to manageinterval-in-minutes
: The interval in minutes must be an element of the following set → 5, 10, 15, 20, 30, 60, 120, 180, 240, 360, 480, 720, or 1440dist-key
: Optional expression to specify the dist key for the materialized view. Example:diststyle key distkey(column1)
sort-key(s)
: Optional expression to specify sort keys for the materialized view. Example:compound sortkey(column1, column2)
A list of existing schemas in a Datacoral installation can be found by logging into the Datacoral app and clicking Organize in the left navigation bar.
Example
The following are fake values in an example view_test.dpl
DPL file.
Utilize CLI Arguments
The CLI command to create a materialized view also has options to pass the necessary paramters during creation of the view. An example of the formatting will be shown in the next step, but the same values must be gathered as in the above DPL file.
Step 2. Create the Materialized View
The new materialized view will be created in your installation by running one of the following commands.
Creating from DPL File
The command to create a materialized view from a DPL configuration file is seen below.
mv_file_name
: The path to the file in which the paramters were specified during step 1
Example
Using our example DPL file from the previous step,
Creating with CLI Arguments
The command to create a materialized view by passing arguments to the parameters is seen below.
Example
Using the same parameter values as in our DPL file, but altering the name of the materialized view so the two examples are unique, results in the command seen below.
Step 3. Verify the Materialized View
After creating a materialized view, execute the following command to verify the view was successfully created.
schema-name
: Name of the schema used to create the materialized view
Example
For each materialized view created as examples thus far (both via DPL and via CLI args) the commends would respectively be as follows.
Troubleshooting
Failure in fetching dependencies
We have built a SQL parser to automatically detect data dependencies to simplify the cognitive load for analysts while creating transformations within a pipeline. This parser is built to handle all the SQL dialects that exist in different warehouses. However, we are coming across some cases where our parser fails even though the SQL is valid in the dialect of the underlying warehouse. In order to allow users to make progress even in these cases, we have offered a way for analysts to specify the table names that the SQL query includes. The requirement to specify these dependencies will be removed over time as we enhance the parser. In such cases, you can specify the list of such table dependencies using the dependencies field.
Example
The SQL parser for the above query will fail. To be able to create the materialized view with the same query, we can specify the dependencies like below
Update a Materialized View
The file-based management flow allows updating of one or more materialized views with a single command.
Step 1. Fetch the Materialized View Configuration
The existing parameters for a materialized view can be saved to a file by running the following command.
schema-name
: Name of the schema used to create the materialized viewview-name
: Name given that uniquely identifies the viewparameter-file
: The path to the file in which the parameters will be saved
Example
Again, the materialized view created earlier will be used.
Step 2. Modify the Configurations
There are a few rules as to how a materialized view can be updated:
view-type
must be "regular" in order to be updated- Only the SQL query can be edited or manipulated
- New columns can be added after the last existing column in the query
- Existing columns and column types cannot be edited or removed
Example
Below are the modifications to the original view_test1
materialized view. Notice the addition of the age
column.
Step 3. Create an Update Set
An update set is represented by a manifest file stored on a local disk. This manifest file contains two pieces of information for each materialized view in the update set:
- The action type to be performed on the materialized view
- The file path to the DPL file in which the updates are saved
NOTES:
- The only action type currently available is "update"
- Each update set manifest can only contain up to one action for a given materialized view
- An update set manifest can contain actions for multiple different materialized views
The modified materialized view configuration can be added to the update set by running the following command.
mv_file_name
: The path to the file in which the configuration was saved during step 1
Example
In the example of the configuration in step 3, the following command will add the modified configuration to the update set.
Step 4. Stage the Updates
The materialized view updates will be staged in your installation by running the following command.
Example
Rounding out the example of updating a materialized view is the command to stage the modifications.
Please note that in this example, the timelabel during which to stage these updates will be automatically computed. For more information regarding optional control parameters, please read the Organize CLI referrence.
Delete a Materialized View
A materialized view can be deleted from an installation by running the following command
schema-name
: Name of the schema used to create the materialized viewview-name
: Name that uniquely identifies the materialized view
Example
In order to delete the materialized views created over the course of this page, the commands below are run.