Why use Random Forests? The simple reason is to build a large, indeed very large, number of weak learners and aggregate their results to form an extremely strong learner. If the weak learners are highly correlated then this is not likely to end well. The key trick in building a random forest is to inject randomness into every tree: each tree is trained on its own random (bootstrap) sample of the training data, and each split within a tree is chosen from a random subset of the features.
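To make that concrete, here is a minimal sketch of those two sources of randomness, written against scikit-learn's DecisionTreeClassifier on a made-up dataset; real Random Forest implementations add more machinery, so treat this as an illustration rather than the algorithm itself:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Toy dataset purely for illustration
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

rng = np.random.default_rng(0)
trees = []
for _ in range(100):
    # Bagging: each tree trains on a bootstrap sample (rows drawn with replacement)
    rows = rng.integers(0, len(X), size=len(X))
    # Per-split randomness: each split considers only sqrt(n_features) candidate features
    tree = DecisionTreeClassifier(max_features="sqrt",
                                  random_state=int(rng.integers(1_000_000)))
    trees.append(tree.fit(X[rows], y[rows]))

# Aggregate by majority vote across the ensemble
votes = np.stack([t.predict(X) for t in trees])
forest_pred = (votes.mean(axis=0) > 0.5).astype(int)
print("training accuracy of the aggregated vote:", (forest_pred == y).mean())
```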
Hit '>Play' above to see how Random Forests effortlessly handle missing data and outliers
Random forests can be shown to be very robust to outliers in the data. Random forests, and the decision trees they are built from, are local models, each fitting a simple model to its own subspace of the data. These local models tend to crowd out outliers by isolating them in small leaves.
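One way to see that locality is to corrupt a single label in a toy regression problem and check how little the forest's nearby predictions move; the data and numbers below are made up for illustration:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = np.linspace(0, 10, 200).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.1, size=200)

y_outlier = y.copy()
y_outlier[100] = 50.0  # one wildly wrong label in the middle of the range

clean = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
dirty = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y_outlier)

# Predictions just either side of the corrupted point barely move, because
# the trees isolate the outlier in its own small leaf.
x_probe = np.array([[4.8], [5.2]])
print("without outlier:", clean.predict(x_probe))
print("with outlier:   ", dirty.predict(x_probe))
```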
The power of Random Forests derives from being composed of uncorrelated decision trees. The video above explains how bagging and random feature selection allow us to build uncorrelated trees in our forests.
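In a library these two ideas show up as ordinary parameters. A minimal sketch using scikit-learn's RandomForestClassifier (the parameter names are scikit-learn's, and the dataset is a stand-in):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Toy data for illustration only
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

forest = RandomForestClassifier(
    n_estimators=500,      # many weak learners to aggregate
    bootstrap=True,        # bagging: each tree sees a bootstrap sample of rows
    max_features="sqrt",   # random subset of features considered at each split
    oob_score=True,        # evaluate each tree on the rows it never saw
    random_state=0,
).fit(X, y)

print("out-of-bag accuracy:", forest.oob_score_)
```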
The video also describes additional desirable properties of Forests, such as handling missing data, both in the training dataset and in live samples.
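The sketch below does not reproduce the video's exact mechanism; it shows one common, simple stand-in, median imputation wrapped in a pipeline, so the same treatment applies to gaps in the training data and in live samples (the data is synthetic):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline

X, y = make_classification(n_samples=500, n_features=8, random_state=0)

# Knock out some training values at random to simulate gaps
rng = np.random.default_rng(0)
mask = rng.random(X.shape) < 0.05
X_missing = X.copy()
X_missing[mask] = np.nan

# Median-impute, then fit the forest; the same pipeline handles live
# samples that arrive with missing fields.
model = make_pipeline(
    SimpleImputer(strategy="median"),
    RandomForestClassifier(n_estimators=300, random_state=0),
).fit(X_missing, y)

live_sample = np.array([[0.1, np.nan, -1.2, 0.4, np.nan, 0.0, 1.5, -0.3]])
print("prediction for a live sample with gaps:", model.predict(live_sample))
```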
Random Forests can also readily identify which records are most similar to one another and which features matter most in shaping the decision boundaries.
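Both come almost for free from a trained forest: feature importances are exposed directly, and record similarity can be approximated by how often two records land in the same leaf across trees. Here is a sketch using scikit-learn's apply() on a made-up dataset:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=10, n_informative=3, random_state=0)
forest = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)

# Feature importance: how much each feature contributed to the splits
print("most important features:", np.argsort(forest.feature_importances_)[::-1][:3])

# Proximity: fraction of trees in which two records end up in the same leaf
leaves = forest.apply(X)                 # shape (n_samples, n_trees)
record = 0
proximity = (leaves == leaves[record]).mean(axis=1)
proximity[record] = 0                    # ignore self-similarity
print("records most similar to record 0:", np.argsort(proximity)[::-1][:5])
```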
As discussed, Forests are very robust to outliers, building local models that crowd out and isolate idiosyncratic data points.
