Partition

Updated on Jan 13, 2023
Published on Jan 28, 2022

Description

The Partition node randomly splits a data set into training and scoring rows, using a seed number. It creates a new 'ROLE' column that holds, by default, the value 'TRAINING' or 'SCORING' for each row.
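
To make the behavior concrete, here is a minimal Python sketch of a seeded random split. It is an illustration only, not the node's actual implementation: the function name `partition_by_fraction`, its parameters, the 80% figure, and the seed value 42 are all assumptions chosen for the example.

```python
import random


def partition_by_fraction(rows, percent_training=80.0, seed=42,
                          column="ROLE", training_value="TRAINING",
                          scoring_value="SCORING"):
    """Return copies of `rows` with a role column assigned at random.

    Hypothetical helper: it mimics the described behavior, not the
    product's internal algorithm.
    """
    n_training = round(len(rows) * percent_training / 100.0)

    # Shuffle row positions with a dedicated, seeded generator so the
    # same seed always yields the same split.
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    training_idx = set(indices[:n_training])

    # Build new dicts so the input rows are left unmodified.
    return [
        {**row, column: training_value if i in training_idx else scoring_value}
        for i, row in enumerate(rows)
    ]


rows = [{"id": i} for i in range(10)]
result = partition_by_fraction(rows, percent_training=80.0, seed=42)
for row in result:
    print(row)
```

Running the sketch twice with the same seed gives the same assignment, while changing the seed changes which rows land in each group.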

Configuration Options

Basic Configuration Options

Setting	Description/Parameters
`Partition Column Name`	Name of the new column to be created. Data is partitioned into two groups, training and scoring, and this column records each row's role.
`Partition Mode`	`Partition By Fraction` splits data into training/scoring records by fraction, e.g., 80% of records become training and 20% become scoring. `Partition By Record Count` assigns an exact number of records to training/scoring, e.g., the first ten records to training, or the last five records to scoring.
`Percent Training`	Percent of the data set that has the `Training Value` assigned to it.
`Percent Scoring`	Percent of the data set that has the `Scoring Value` assigned to it.
`Training Value`	Text value assigned to the `Percent Training` fraction.
`Scoring Value`	Text value assigned to the `Percent Scoring` fraction.
`Row Assignment`	Select `Training Records` to set the exact number of training records; the rest become scoring records. Select `Scoring Records` to set the exact number of scoring records; the rest become training records.
`Partition Records`	Number of records assigned to the role selected in `Row Assignment`.
`Sample Randomly`	`Sample Randomly` assigns training/scoring rows at random throughout the data set. `Split Time Series Data` splits rows into training/scoring while maintaining order.
`Random Seed`	If `Static`, use the same seed value each time this node is run. If `Random`, use a new seed value each time.
`Seed Value`	Number used to seed the random split between the training and scoring fractions. Changing this number alters the random distribution.
`Split Columns`	Column(s) by which the data set is split into subsets. Each resulting subset is assigned to training/scoring.
`Sort By Columns`	Sort data rows by these columns.
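
The `Partition By Record Count` mode combined with an order-preserving split can be sketched in Python as follows. This is a hedged illustration rather than the node's implementation: the function `partition_by_record_count` and its parameter names are hypothetical, and the rule that training rows come first and scoring rows come last is an assumption based on the "first ten records to training, or the last five records to scoring" example.

```python
def partition_by_record_count(rows, partition_records,
                              row_assignment="training",
                              column="ROLE", training_value="TRAINING",
                              scoring_value="SCORING"):
    """Assign an exact number of rows to one role while keeping row order.

    Hypothetical helper. Assumes training rows form the leading block and
    scoring rows the trailing block, as in a time series split.
    """
    total = len(rows)
    # Clamp the requested count to the available number of rows.
    count = max(0, min(partition_records, total))

    if row_assignment == "training":
        n_training = count          # first `count` rows are training
    elif row_assignment == "scoring":
        n_training = total - count  # last `count` rows are scoring
    else:
        raise ValueError("row_assignment must be 'training' or 'scoring'")

    return [
        {**row, column: training_value if i < n_training else scoring_value}
        for i, row in enumerate(rows)
    ]


series = [{"t": i} for i in range(12)]
first_ten_training = partition_by_record_count(series, 10, "training")
last_five_scoring = partition_by_record_count(series, 5, "scoring")
```

Because no shuffling takes place, the seed settings play no role in this sketch; only the row order of the input (for example after sorting by a time column) determines the result.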

Actions

Action	Description
`Preview`	Once the node is configured, the combined result set can be previewed at any time.
