Problem: A single decision tree is noisy: small changes in the training data can produce a very different tree. (= high variance)
Solutions: Bagging, Boosting, Random Forests
Typical performance (rule of thumb, not a guarantee): Boosting > Random Forests > Bagging > Single Tree
1. Bagging (= bootstrap aggregation)
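- Draw bootstrap samples (random samples with replacement, each the same size as the training set) -> fit a tree to each
- Average their predictions (regression) or take a majority vote (classification)!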
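A minimal sketch of bagging for 1-D regression, using only the standard library. The helper names (`fit_stump`, `bagged_predictor`) and the use of one-split trees (stumps) instead of full trees are assumptions made to keep the example short; they are not from any particular library.

```python
import random
from statistics import mean

def fit_stump(xs, ys):
    """Fit a one-split regression tree (stump) on 1-D data by least squares."""
    best = None
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if not left or not right:
            continue  # a split must leave points on both sides
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, lm, rm)
    if best is None:  # all x values identical: predict the overall mean
        m = mean(ys)
        return lambda x: m
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def bagged_predictor(xs, ys, n_trees=25, seed=0):
    """Fit one stump per bootstrap sample and average their predictions."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        # Bootstrap sample: n indices drawn with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: mean(tree(x) for tree in trees)
```

Usage: `predict = bagged_predictor(xs, ys)` returns a function, so `predict(3.5)` gives the averaged prediction at x = 3.5.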
2. Random Forests
- De-correlate the trees! How?
- At each split, consider only a random subset of about sqrt(number of features) features as candidates (a common default for classification)
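A small sketch of the per-split feature sampling described above. The function name `candidate_features` is hypothetical, and rounding sqrt(p) down (with a floor of 1) is an assumption; implementations differ on the exact rounding.

```python
import math
import random

def candidate_features(n_features, rng):
    """Return the random subset of feature indices considered at one split."""
    # About sqrt(p) candidates; rounding down is an assumption of this sketch.
    m = max(1, math.isqrt(n_features))
    return sorted(rng.sample(range(n_features), m))

rng = random.Random(0)
for split in range(3):
    # A fresh subset is drawn at every split, not once per tree.
    print("split", split, "candidates:", candidate_features(16, rng))
```

Because each split sees a different subset, a single strong feature cannot dominate every tree, which is what lowers the correlation between trees.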
3. Boosting (= stage-wise additive modeling)
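- AdaBoost: reweight the training examples each round so the next weak learner focuses on the ones misclassified so far
- Gradient Boosting Machine (GBM): fit each new tree to the negative gradient of the loss (= the residuals, for squared error)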
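A minimal sketch of stage-wise additive modeling: gradient boosting with squared-error loss on 1-D data, where each stage fits a stump to the current residuals. The names (`fit_stump`, `gradient_boost`), the stump base learner, and the defaults `n_stages=50` and `learning_rate=0.1` are assumptions chosen for illustration.

```python
from statistics import mean

def fit_stump(xs, ys):
    """Fit a one-split regression tree (stump) on 1-D data by least squares."""
    best = None
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if not left or not right:
            continue  # a split must leave points on both sides
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, lm, rm)
    if best is None:  # all x values identical: predict the overall mean
        m = mean(ys)
        return lambda x: m
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def gradient_boost(xs, ys, n_stages=50, learning_rate=0.1):
    """Build F(x) = base + learning_rate * sum of stumps, one stage at a time."""
    base = mean(ys)  # stage 0: a constant model
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_stages):
        # For squared error, the negative gradient is just the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Add a shrunken copy of the new stump; earlier stages stay fixed.
        preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)
```

Unlike bagging, the stumps here are fit sequentially, and each one depends on the errors left by all the previous stages.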
References: