Partition
  • 06 Jan 2023


Description

The Partition node splits a data set at random, using a predefined seed number. It creates a new 'ROLE' column that holds one of the default values 'TRAINING' or 'SCORING' for each row.
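
The idea of a seeded random split can be illustrated with a short Python sketch. This is not the node's actual implementation: the function name `partition_by_fraction`, its parameters, and the rounding of the training count are assumptions made for illustration only.

```python
import random

def partition_by_fraction(rows, percent_training=80, seed=1,
                          training_value="TRAINING", scoring_value="SCORING",
                          column="ROLE"):
    """Return copies of the rows with a role column from a seeded random split.

    Hypothetical helper: names and rounding behaviour are assumptions,
    not the documented behaviour of the Partition node.
    """
    rng = random.Random(seed)              # same seed -> same split every run
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    # Assumption: the training count is rounded to the nearest whole row.
    n_training = round(len(rows) * percent_training / 100)
    training = set(indices[:n_training])
    return [
        {**row, column: training_value if i in training else scoring_value}
        for i, row in enumerate(rows)
    ]

rows = [{"id": i} for i in range(10)]
first = partition_by_fraction(rows, percent_training=80, seed=42)
second = partition_by_fraction(rows, percent_training=80, seed=42)
```

Because both calls use the same seed, `first` and `second` contain identical role assignments; changing the seed would generally produce a different assignment with the same 80/20 proportions.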


Configuration Options

Basic Configuration Options

| Setting | Description / Parameters |
| --- | --- |
| Partition Column Name | Name of the new column to be created. Data is always partitioned into two groups, which are called roles. |
| Partition Mode | Partition By Fraction splits data into training/scoring records by fraction, e.g. 80% of records become training and 20% become scoring. Partition By Record Count assigns an exact number of records to training/scoring, e.g. the first 10 records to training or the last 5 records to scoring. |
| Percent Training | Percentage of the data set that is assigned the Training Value. |
| Percent Scoring | Percentage of the data set that is assigned the Scoring Value. |
| Training Value | Text value assigned to the Percent Training fraction. |
| Scoring Value | Text value assigned to the Percent Scoring fraction. |
| Row Assignment | Select Training Records to set the exact number of training records; the rest become scoring records. Select Scoring Records to set the exact number of scoring records; the rest become training records. |
| Partition Records | Number of records to assign to the role selected in Row Assignment. |
| Sample Randomly | Sample Randomly assigns training/scoring rows at random throughout the data set. Split Time Series Data splits rows into training/scoring while maintaining their order. |
| Random Seed | If Static, the same seed value is used each time this node is run. If Random, a new seed value is used each time. |
| Seed Value | Number used to seed the random split between the scoring and training fractions. Changing this number changes the random distribution. |
| Split Columns | The data set is split over the selected column(s). Each resulting subset is assigned to training/scoring. |
| Sort By Columns | Sort data rows by these columns. |
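
The interaction between Partition By Record Count, Row Assignment and the random versus ordered split can be sketched as follows. This is only an illustration under stated assumptions, not the node's implementation: the function `partition_by_record_count` and its parameter names are hypothetical, and the ordered branch follows the example in the Partition Mode description (first N records to training, last N records to scoring).

```python
import random

def partition_by_record_count(rows, partition_records, row_assignment="training",
                              sample_randomly=False, seed=1,
                              training_value="TRAINING", scoring_value="SCORING",
                              column="ROLE"):
    """Assign an exact number of rows to one role; all other rows get the other role.

    Hypothetical helper: names and edge-case handling are assumptions.
    """
    n = min(partition_records, len(rows))   # never select more rows than exist
    if row_assignment == "training":
        selected, other = training_value, scoring_value
    else:
        selected, other = scoring_value, training_value

    positions = list(range(len(rows)))
    if sample_randomly:
        # Random placement of the selected rows, repeatable for a fixed seed.
        chosen = set(random.Random(seed).sample(positions, n))
    elif row_assignment == "training":
        chosen = set(positions[:n])              # ordered: first n rows train
    else:
        chosen = set(positions[len(rows) - n:])  # ordered: last n rows score

    return [{**row, column: selected if i in chosen else other}
            for i, row in enumerate(rows)]

# Ordered (time-series style) split: the last 3 of 8 rows become scoring rows.
series = [{"t": t} for t in range(8)]
ordered = partition_by_record_count(series, 3, row_assignment="scoring")
roles = [r["ROLE"] for r in ordered]

# Random split with a static seed: exactly 3 training rows, placed at random.
shuffled = partition_by_record_count(series, 3, row_assignment="training",
                                     sample_randomly=True, seed=7)
```

In the ordered case the row order is preserved, so the earliest rows are used for training and the most recent rows for scoring, which is the usual reason to avoid random sampling on time series data.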

Actions

| Action | Description |
| --- | --- |
| Preview | After configuring the node, the combined result set can be previewed by clicking the Preview button. |
