Dataset Transformations

These transformations apply to the full dataset.

Group By

Group By combines rows that share the same value in a field into a single summary row, for example “count of countries in each continent”. Aggregate functions (such as COUNT, MIN, or AVG) are applied to each group, and rows can be grouped on one or more columns.
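As a rough illustration of the “count of countries in each continent” case, the sketch below runs an equivalent GROUP BY query. It uses Python's built-in sqlite3 module as a stand-in for Spark SQL so that it runs anywhere; the `countries` table, its columns, and the sample values are hypothetical and not part of the product.

```python
import sqlite3

# Hypothetical sample data; table name, column names, and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE countries (name TEXT, continent TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO countries VALUES (?, ?, ?)",
    [
        ("France", "Europe", 68),
        ("Germany", "Europe", 84),
        ("Japan", "Asia", 125),
        ("India", "Asia", 1400),
        ("Kenya", "Africa", 54),
    ],
)

# One summary row per continent, with aggregate functions applied to each group.
group_by_rows = conn.execute(
    """
    SELECT continent,
           COUNT(*)        AS country_count,
           MIN(population) AS min_population
    FROM countries
    GROUP BY continent
    ORDER BY continent
    """
).fetchall()

for row in group_by_rows:
    print(row)
```

Grouping on more than one column works the same way: list the additional columns after GROUP BY, separated by commas.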

Filter Row

Query

Filter data using valid Spark SQL queries. Multiple filter conditions can be combined using the AND operator, as described in the Description pane.
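The following is a minimal sketch of two filter conditions combined with AND. It again uses Python's built-in sqlite3 module as a stand-in for Spark SQL; the `orders` table and its columns are hypothetical. Whether the Query editor expects a full SELECT statement or only the condition is not stated here, so the condition is kept in its own variable.

```python
import sqlite3

# Hypothetical sample data; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, country TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "US", 120.0),
        (2, "US", 40.0),
        (3, "DE", 300.0),
        (4, "US", 75.5),
    ],
)

# Two filter conditions combined with AND; a row must satisfy both to be kept.
filter_condition = "country = 'US' AND amount > 50"

filtered_rows = conn.execute(
    f"SELECT order_id, country, amount FROM orders "
    f"WHERE {filter_condition} ORDER BY order_id"
).fetchall()

for row in filtered_rows:
    print(row)
```

Order 2 is dropped because it fails the amount condition, and order 3 is dropped because it fails the country condition.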

UI Query

Filter data by using the UI model to select a column and apply filter/where conditions.

Duplicate Row

UI Query

Filter data by using the UI model to select a column and apply filter/where conditions, and then provide a replacement value for each column in the mapping section.

Query

Filter data using valid Spark SQL queries, and then provide a replacement value for each column in the mapping section. Multiple filter conditions can be combined using the AND operator, as described in the Description pane.

Case

A CASE function lets you choose different outcomes based on certain conditions. It requires a WHEN statement (condition) at the beginning. If the condition is true, it performs the action specified in the THEN clause. Additional WHEN/THEN clauses can be added using ELSE/ELSE-IF clauses.

UI Query

Editor Query

The CASE statement can be specified as text, as described in the Description pane.
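For reference, a CASE statement written as text might look like the sketch below. It uses standard SQL CASE syntax, run through Python's built-in sqlite3 module as a stand-in for Spark SQL; the `scores` table, its columns, and the band labels are hypothetical, and the exact text the editor accepts may differ.

```python
import sqlite3

# Hypothetical sample data; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (student TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("Ana", 92), ("Ben", 71), ("Cho", 45)],
)

# WHEN conditions are checked in order; the first true one selects its THEN value.
# ELSE supplies the value when no WHEN condition matches.
case_rows = conn.execute(
    """
    SELECT student,
           CASE
               WHEN score >= 90 THEN 'high'
               WHEN score >= 60 THEN 'medium'
               ELSE 'low'
           END AS band
    FROM scores
    ORDER BY student
    """
).fetchall()

for row in case_rows:
    print(row)
```

Because conditions are evaluated top to bottom, the more restrictive condition is listed first so that a score of 92 is labeled 'high' rather than 'medium'.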
