Configuring a Union
This page demonstrates the steps needed to create a union.
A Union operation can be performed by following the steps below.
Click Data Prep in the top menu.
Select the intended type of Data Preparation (Union in this example).
The Union operator appears on the canvas.
Select a dataset in the left panel, then drag and drop it onto the canvas on the right. Repeat this step for the second dataset.
Both datasets now appear on the canvas along with the Union operator.
Connect the two datasets to the Union node.
Click the edge connecting either dataset to the Union node. A screen pops up where records can be selected based on either a date range or a time interval. The desired columns for the result set can also be selected on this screen.
Repeat the above step for the edge connecting the second dataset to the Union node.
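To illustrate what the edge settings do conceptually, here is a minimal Python sketch of filtering one dataset by a date range and keeping only the chosen columns before it reaches the Union node. It is not the product's implementation: the function name `filter_edge`, the sample dataset, its column names, and the inclusive treatment of both range bounds are all assumptions made for this example.

```python
from datetime import date


def filter_edge(rows, date_field, start, end, columns):
    """Keep rows whose date_field lies in [start, end], then keep only `columns`.

    Hypothetical helper: the inclusive bounds are an assumption of this sketch.
    """
    selected = []
    for row in rows:
        if start <= row[date_field] <= end:
            selected.append({col: row[col] for col in columns})
    return selected


# Hypothetical sample dataset; the column names are illustrative only.
orders = [
    {"order_id": 1, "order_date": date(2024, 1, 5), "amount": 10.0, "note": "a"},
    {"order_id": 2, "order_date": date(2024, 2, 9), "amount": 20.0, "note": "b"},
    {"order_id": 3, "order_date": date(2024, 3, 1), "amount": 30.0, "note": "c"},
]

# Select January and February records and drop the "note" column.
edge_output = filter_edge(
    orders,
    date_field="order_date",
    start=date(2024, 1, 1),
    end=date(2024, 2, 29),
    columns=["order_id", "amount"],
)
```

Each edge has its own settings, so the two datasets can be filtered and trimmed independently.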
Click the Union node. A panel appears at the bottom showing the superset of the columns from both datasets. When one dataset has a column that is missing from the other, an entry called "Add missing data" appears in the action column. This action is optional. By default, rows from the dataset lacking this column get null values for it.
Alternatively, clicking "Add missing data" opens a window that allows a value to be added for the field.
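The column handling described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the product's implementation: `union_rows` and its `fill_values` parameter are hypothetical names, the sample datasets are made up, and the sketch keeps duplicate rows because the page does not say whether the Union node removes them.

```python
def union_rows(first, second, fill_values=None):
    """Stack two datasets over the superset of their columns.

    A column absent from a row becomes None (the default mode), unless
    `fill_values` supplies a value for it, which mimics "Add missing data".
    Hypothetical helper; duplicates are kept (an assumption of this sketch).
    """
    fill_values = fill_values or {}

    # Build the superset of columns in first-seen order.
    columns = []
    for row in first + second:
        for col in row:
            if col not in columns:
                columns.append(col)

    result = []
    for row in first + second:
        result.append(
            {col: row[col] if col in row else fill_values.get(col) for col in columns}
        )
    return result


# Hypothetical datasets: "email" exists only in web, "region" only in store.
web = [{"id": 1, "email": "a@example.com"}]
store = [{"id": 2, "region": "EU"}]

default_mode = union_rows(web, store)                   # missing columns become None
with_fill = union_rows(web, store, {"region": "unknown"})  # "region" gets a supplied value
```

In `with_fill`, only the row that lacked `region` receives the supplied value; the row that already had `region` is unchanged, and `email` still falls back to None because no value was supplied for it.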
Click the Save button at the top right of the canvas. This prompts for the output parameters that determine the target dataset to which the result set of the union is written.
Select the workspace and collection
Select either an existing dataset in the collection or create a new one by right-clicking
Select the directory format (Parquet/CSV)
Select the transfer mode (where applicable)
append
upsert
truncate and insert
drop & create
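As a rough guide to how the transfer modes differ, the sketch below models the target dataset as a list of rows. It reflects the usual meaning of these terms rather than confirmed product behavior: `write_result` is a hypothetical function, and the `key` column used for upsert is an assumption, since this page does not say how matching rows are identified.

```python
def write_result(target, rows, mode, key=None):
    """Return the target dataset's contents after writing `rows` in `mode`.

    Hypothetical model based on the common meaning of each mode.
    """
    if mode == "append":
        # Existing rows stay; new rows are added after them.
        return target + rows
    if mode == "upsert":
        if key is None:
            raise ValueError("upsert needs a key column (assumed for this sketch)")
        # Rows with a matching key are replaced; others are inserted.
        merged = {row[key]: row for row in target}
        for row in rows:
            merged[row[key]] = row
        return list(merged.values())
    if mode in ("truncate and insert", "drop & create"):
        # Both discard existing rows. Drop & create would typically also
        # rebuild the schema, which this row-only model does not represent.
        return list(rows)
    raise ValueError(f"unknown transfer mode: {mode}")


existing = [{"id": 1, "v": "old"}, {"id": 2, "v": "keep"}]
incoming = [{"id": 1, "v": "new"}, {"id": 3, "v": "add"}]

appended = write_result(existing, incoming, "append")
upserted = write_result(existing, incoming, "upsert", key="id")
replaced = write_result(existing, incoming, "truncate and insert")
```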
Click Next
Enter a name for the job being executed.
The job can be executed one time or scheduled as a recurring job that runs at the specified frequency.
Click Submit to execute the job.
The submitted job can be monitored on the Job Monitoring screen.