String Calculation Operations

The following 'Calculation' operations are available for string-type transformations.

Suffix

Adds a suffix to the chosen column. The suffix can be the value of another column or a fixed value (constant). The separator between the two values can be one of the following:

  1. None

  2. Space

  3. The character “-”

  4. The character “~”

  5. The character “_”
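As a rough illustration of the behaviour described above, the following Python sketch appends a constant suffix with one of the listed separators. The function name `add_suffix` and the option keys in `SEPARATORS` are hypothetical and are not part of the product.

```python
# Separator options from the list above, keyed by hypothetical option names.
SEPARATORS = {"none": "", "space": " ", "hyphen": "-", "tilde": "~", "underscore": "_"}


def add_suffix(value: str, suffix: str, separator: str = "none") -> str:
    """Append `suffix` to `value`, joined by the chosen separator."""
    return f"{value}{SEPARATORS[separator]}{suffix}"


print(add_suffix("ORD1001", "EU", "hyphen"))  # ORD1001-EU
print(add_suffix("ORD1001", "EU"))            # ORD1001EU
```

A prefix works the same way with the two values swapped.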

Prefix

Adds a prefix to the chosen column. The prefix can be the value of another column or a fixed value (constant). The separator between the two values can be one of the following:

  1. None

  2. Space

  3. The character “-”

  4. The character “~”

  5. The character “_”

Padding

Pads the string on the left or right with the specified character until it reaches the specified length. If more than one padding character is specified, only the first one is used.
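A minimal Python sketch of this padding rule is shown below. The function name `pad` and its parameters are hypothetical. It assumes the padding string is not empty and that a value already at or beyond the target length is returned unchanged, which the description above does not state.

```python
def pad(value: str, length: int, fill: str, side: str = "left") -> str:
    """Pad `value` to `length` using only the first character of `fill`."""
    fill_char = fill[0]  # only the first padding character is used
    if side == "left":
        return value.rjust(length, fill_char)  # padding added on the left
    return value.ljust(length, fill_char)      # padding added on the right


print(pad("42", 5, "0"))             # 00042
print(pad("42", 5, "*#", "right"))   # 42***
```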

Concat

Concatenates the value in the selected column with the value of another column or with a constant, producing a single attribute.

The separator between the two values can be one of the following:

  1. None

  2. Space

  3. The character “-”

  4. The character “~”

  5. The character “_”
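The sketch below illustrates the two variants (another column or a constant) on a single row represented as a Python dictionary. The function `concat`, its parameters, and the sample column names are hypothetical.

```python
def concat(row: dict, column: str, other_column=None, constant=None, separator: str = "") -> str:
    """Join row[column] with another column's value or with a constant."""
    other = row[other_column] if other_column is not None else constant
    return f"{row[column]}{separator}{other}"


row = {"first_name": "Ada", "last_name": "Lovelace"}
print(concat(row, "first_name", other_column="last_name", separator=" "))  # Ada Lovelace
print(concat(row, "last_name", constant="UK", separator="_"))              # Lovelace_UK
```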

Length

Returns the length of the value in the selected column.

Remove Special Characters

Name              Character
Dot               .
Comma             ,
Backward slash    \
Forward slash     /
Pound/Hash        #
Exclamation       !
Dollar            $
Percentage        %
Caret             ^
Ampersand         &
Asterisk          *
Semicolon         ;
Colon             :
Open brace        {
Close brace       }
Equal             =
Hyphen            -
Underscore        _
Back quote        `
Tilde             ~
Open bracket      (
Close bracket     )

Extract domain from URL

Extracts the domain from the URL in the string.
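One way to picture this operation is with Python's standard `urllib.parse` module. The sketch assumes that "domain" means the full host name of the URL (including any `www.` part); the product may define it differently. The function name `extract_domain` is hypothetical.

```python
from typing import Optional
from urllib.parse import urlparse


def extract_domain(url: str) -> Optional[str]:
    """Return the host name of `url`, or None if none can be found."""
    if "://" not in url:
        url = "//" + url  # let urlparse treat a scheme-less value as a host
    return urlparse(url).hostname


print(extract_domain("https://www.example.com/path?q=1"))  # www.example.com
print(extract_domain("example.com/products"))              # example.com
```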

Generate Stemming

Reduces the words in the text to their base or root form. This eliminates variations of a word, such as different tenses, plurals, or derivations, which can improve text analysis and information retrieval.

For example, the words “programming,” “programmer,” and “programs” can all be reduced to the common stem “program.” In other words, “program” can stand in for all three inflected forms.
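The toy Python sketch below reproduces the example above by stripping a few common suffixes. It is not the algorithm used by the product, which is not documented here, and it is far cruder than real stemmers such as Porter's. The suffix list and the function name `toy_stem` are illustrative only.

```python
SUFFIXES = ("ing", "er", "s")  # illustrative list, checked in this order


def toy_stem(word: str) -> str:
    """Strip one known suffix, then collapse a doubled final consonant."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            # "programm" -> "program"
            if word[-1] == word[-2] and word[-1] not in "aeiou":
                word = word[:-1]
            break
    return word


for w in ("programming", "programmer", "programs"):
    print(w, "->", toy_stem(w))  # each prints "... -> program"
```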

Remove Diacritics

Removes all diacritical marks from the text and returns text consisting only of standard Latin letters.

Diacritical mark: a sign, such as an accent or cedilla, that is written above or below a letter to indicate a pronunciation different from that of the same letter when unmarked or differently marked.
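A common way to do this, sketched here in Python with the standard `unicodedata` module, is to decompose each character into a base letter plus combining marks and then drop the marks. This is an illustration, not necessarily how the product implements it. Letters that have no decomposition, such as “ø” or “ß”, pass through this sketch unchanged.

```python
import unicodedata


def remove_diacritics(text: str) -> str:
    """Drop combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(remove_diacritics("Crème brûlée, façade, São Paulo"))
# Creme brulee, facade, Sao Paulo
```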

Split Value

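Splits an input string into two values. The split is controlled by the following options:

  • The delimiter, specified in the “Split By” textbox.

  • Whether to start from the beginning or the end of the string.

  • Which occurrence of the delimiter to split at: first or last.

  • Whether to include the pattern in the result.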
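The Python sketch below shows one possible reading of these options. It folds the direction and occurrence settings into a single `from_end` flag and, when the pattern is included, keeps the delimiter at the end of the first value. Both choices are assumptions, as are the function and parameter names.

```python
from typing import Tuple


def split_value(value: str, delimiter: str, from_end: bool = False,
                include_delimiter: bool = False) -> Tuple[str, str]:
    """Split `value` in two at the first (or last) occurrence of `delimiter`."""
    if from_end:
        head, sep, tail = value.rpartition(delimiter)  # last occurrence
    else:
        head, sep, tail = value.partition(delimiter)   # first occurrence
    if include_delimiter:
        head += sep  # assumption: the delimiter stays with the first value
    return head, tail


print(split_value("a-b-c", "-"))                          # ('a', 'b-c')
print(split_value("a-b-c", "-", from_end=True))           # ('a-b', 'c')
print(split_value("a-b-c", "-", include_delimiter=True))  # ('a-', 'b-c')
```

If the delimiter does not occur, `partition` returns the whole string as the first value and `rpartition` returns it as the second.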

Pattern Matcher

Returns true or false depending on whether the input string matches the provided pattern.
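As a sketch only: assuming the pattern is a regular expression and that the whole string must match (neither of which is confirmed here), the check could look like this in Python. The function name `matches_pattern` is hypothetical.

```python
import re


def matches_pattern(value: str, pattern: str) -> bool:
    """Return True if the entire `value` matches the regular expression."""
    return re.fullmatch(pattern, value) is not None


print(matches_pattern("AB-1234", r"[A-Z]{2}-\d{4}"))  # True
print(matches_pattern("AB-12", r"[A-Z]{2}-\d{4}"))    # False
```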

Substring

Returns a substring of an input string according to the specified start and end positions.

The substring begins at the specified start position (“Start From Index”) and extends up to, but does not include, the character at the specified end position (“End At Index”). Positions are counted from index 0.
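Python slicing follows the same convention (zero-based start, exclusive end), so it gives a convenient way to check expected results. The function and parameter names below simply mirror the option labels and are otherwise hypothetical.

```python
def substring(value: str, start_from_index: int, end_at_index: int) -> str:
    """Return characters from start_from_index up to, not including, end_at_index."""
    return value[start_from_index:end_at_index]


print(substring("transformation", 0, 5))  # trans
print(substring("transformation", 5, 9))  # form
```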

Replace All Occurrences

Replaces all occurrences of the specified text in the input string.
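In Python terms, this corresponds to `str.replace`, which substitutes every occurrence of a literal search text. The sketch assumes a literal (non-regex), case-sensitive match; the function and parameter names are hypothetical.

```python
def replace_all(value: str, search: str, replacement: str) -> str:
    """Replace every occurrence of `search` in `value` with `replacement`."""
    return value.replace(search, replacement)


print(replace_all("2024/01/15", "/", "-"))  # 2024-01-15
```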
