Ensemble of Resampled Datasets - Yousef's Notes
Ensemble of Resampled Datasets

Ensemble of Resampled Datasets

  • Randomly chunk majority examples into H subsets and create H training sets.
  • After training H models, we make predictions by averaging (or taking the majority) of the outputs of H models.
    Figure: H=4
  • Imbalanced binary learning problem
  • Chunk the examples of the majority class into four subsets.
  • Copy four times all the examples of the minority class.
  • Simple and scalable approach.
  • We can train and run our models in parallel on different GPU.
  • Ensemble tends to produce a better prediction than the individual