Cleanse


Description

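The Cleanse node applies text-cleansing operations to selected text columns in a data set, such as trimming spaces, converting case, replacing stop words or common names, and replacing text that matches a regular expression.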


Configuration Options

Basic Configuration Options

Setting | Description/Parameters
Text Columns to Cleanse | Columns from the parent node to which text cleansing is applied.
Trim Leading/Trailing Spaces | Removes all leading and trailing spaces from the text field.
Datetime Breakout | Breaks a datetime column out into six string columns: year, month, day, hour, minute, and second.
Case Conversion | Converts the case of the text. Options are Upper or Lower.
Replace Common Names | Replaces each word found in the list of common names with the designated text string.
Replace Stop Words | Replaces each word found in the list of stop words with the designated text string.
Encryption | Encrypts or decrypts a field. Options are Encrypt or Decrypt.
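
To make the basic options concrete, here is a rough Python sketch of what trimming, case conversion, and datetime breakout could look like on a single value. It is an illustration only, not the node's implementation: the function names, the output keys, and the zero-padding of the datetime parts are all assumptions.

```python
from datetime import datetime
from typing import Dict


def trim_spaces(value: str) -> str:
    """Remove leading and trailing spaces (spaces only, not tabs or newlines)."""
    return value.strip(" ")


def convert_case(value: str, option: str) -> str:
    """Convert case; 'Upper' and 'Lower' mirror the two documented options."""
    if option == "Upper":
        return value.upper()
    if option == "Lower":
        return value.lower()
    raise ValueError(f"unsupported case option: {option!r}")


def datetime_breakout(value: datetime) -> Dict[str, str]:
    """Split a datetime into six string parts.

    Assumption: parts are zero-padded; the node's actual output format
    and column names are not documented here.
    """
    return {
        "year": f"{value.year:04d}",
        "month": f"{value.month:02d}",
        "day": f"{value.day:02d}",
        "hour": f"{value.hour:02d}",
        "minute": f"{value.minute:02d}",
        "second": f"{value.second:02d}",
    }


if __name__ == "__main__":
    print(convert_case(trim_spaces("  Acme Corp  "), "Upper"))
    print(datetime_breakout(datetime(2024, 3, 5, 14, 7, 9)))
```

Encryption is left out of the sketch because the algorithm and key handling used by the node are not described on this page.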

Advanced Configuration Options

Regular Expression Replacement

Setting | Description
Description | Description of the replacement.
Search for RegEx | Regular expression used for pattern matching.
Replace With | Replacement text.
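
The three settings above map naturally onto a small rule record. The following Python sketch shows one way such rules could be applied to a value. The `RegexRule` type and `apply_rules` function are hypothetical names, the rules are assumed to run in the order they were added, and Python's `re` syntax may differ from the regular expression dialect the node actually uses.

```python
import re
from typing import List, NamedTuple


class RegexRule(NamedTuple):
    """Hypothetical record mirroring the three settings of a replacement."""
    description: str   # Description
    search_for: str    # Search for RegEx
    replace_with: str  # Replace With


def apply_rules(value: str, rules: List[RegexRule]) -> str:
    """Apply each rule in turn (assumed order: the order rules were added)."""
    for rule in rules:
        value = re.sub(rule.search_for, rule.replace_with, value)
    return value


if __name__ == "__main__":
    rules = [
        RegexRule("Remove digits", r"\d+", ""),
        RegexRule("Collapse whitespace", r"\s+", " "),
    ]
    print(apply_rules("Order  12345   shipped", rules))
```

Because each rule sees the output of the previous one, ordering matters: here the whitespace rule cleans up the gap left by removing the digits.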

Stop Words and Common Names

Setting | Description
Stop Words | a, able, about, across, after, all, almost, also, am, among, an, and, any, are, as, at, be, because, been, but, by, can, cannot, could, dear, did, do, does, either, else, ever, every, for, from, get, got, had, has, have, he, her, hers, him, his, how, however, i, if, in, into, is, it, its, just, least, let, like, likely, may, me, might, most, must, my, neither, no, nor, not, of, off, often, on, only, or, other, our, own, rather, said, say, says, she, should, since, so, some, than, that, the, their, them, then, there, these, they, this, tis, to, too, twas, us, wants, was, we, were, what, when, where, which, while, who, whom, why, will, with, would, yet, you, your
Common Names | See CommonNames.txt.
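
As an illustration of word-list replacement, the Python sketch below replaces each listed word with a designated string. It assumes whole-word, case-insensitive matching, which this page does not confirm. The `replace_words` function is a hypothetical name, and `STOP_WORDS` holds only a small subset of the list above to keep the example short. The same function would work for common names loaded from a file such as CommonNames.txt.

```python
import re
from typing import Iterable

# Small subset of the documented stop words, for illustration only.
STOP_WORDS = {"a", "an", "and", "of", "the", "to"}


def replace_words(value: str, words: Iterable[str], replacement: str) -> str:
    """Replace every whole-word occurrence of a listed word.

    Assumptions: matching is case-insensitive and limited to whole words.
    """
    word_list = list(words)
    if not word_list:
        return value  # nothing to replace
    # Longest words first so a longer entry wins over a shorter prefix.
    word_list.sort(key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(w) for w in word_list) + r")\b",
        re.IGNORECASE,
    )
    # A function replacement keeps backslashes in `replacement` literal.
    return pattern.sub(lambda _match: replacement, value)


if __name__ == "__main__":
    print(replace_words("The cat and the hat", STOP_WORDS, "_"))
```

Replacing with an empty string removes the words but leaves the surrounding spaces in place, so a trim or whitespace-collapsing step may be wanted afterwards.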


Actions

Action | Description
Add Regex | Adds a new regular expression rule for performing text replacement.
Preview | Once the node is configured, the result set can be previewed at any time.