I have studied the Random Forest and RainForest papers, but I find them a bit confusing. In summary, I understand these algorithms to work in the following steps. Could you help me find out whether I am right?
I appreciate your help.
In Random Forest:
- define number of trees
- partition data by bootstrapping
- construct a tree on each partition (at each node a random subsample of the features is considered for the split)
- label leaf nodes
- for classifying a new instance vote over all trees.
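
To make the Random Forest steps above concrete, here is a minimal sketch in plain Python. It is only an illustration under some assumptions, not the reference implementation from the paper: the function names (`fit_forest`, `build_tree`, `predict_forest`, and so on) are made up for this example, splits are binary thresholds on numeric features chosen by Gini impurity, and class labels are assumed not to be tuples. Note that a bootstrap sample is drawn with replacement, so the samples overlap instead of forming a strict partition of the data.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, feature_ids):
    """Return the (feature, threshold) that lowers weighted Gini the most, or None."""
    best, best_score = None, gini(labels)
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, n_sub, rng, depth=0, max_depth=5):
    # A random subsample of the features is drawn at every node.
    feature_ids = rng.sample(range(len(rows[0])), n_sub)
    split = best_split(rows, labels, feature_ids) if depth < max_depth else None
    if split is None:
        # Leaf: label it with the majority class of the samples that reach it.
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    return (
        f,
        t,
        build_tree([rows[i] for i in left], [labels[i] for i in left],
                   n_sub, rng, depth + 1, max_depth),
        build_tree([rows[i] for i in right], [labels[i] for i in right],
                   n_sub, rng, depth + 1, max_depth),
    )

def predict_tree(tree, row):
    # Internal nodes are tuples (feature, threshold, left, right); leaves are labels.
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def fit_forest(rows, labels, n_trees=25, n_sub=1, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample: draw len(rows) indices with replacement.
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], n_sub, rng))
    return forest

def predict_forest(forest, row):
    # Majority vote over the predictions of all trees.
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]
```

For example, `forest = fit_forest(rows, labels)` followed by `predict_forest(forest, new_row)` classifies `new_row` by voting over the 25 trees, where `rows` is a list of equal-length numeric tuples and `labels` is the matching list of class labels.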
In RainForest:
- Partition dataset
- Build the AVC-set of a partition
- Build the tree over the partition by computing a purity criterion (such as the Gini index) over the AVC-sets
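
As a rough illustration of the AVC-set step, the sketch below builds the AVC-set of one attribute (for each distinct attribute value, the count of records per class label) in a single scan, and then evaluates the Gini index of candidate splits from those counts alone, without touching the records again. The helper names (`build_avc_set`, `best_threshold`), the dictionary record layout, and the `"label"` key are assumptions made for this example and are not taken from the paper; it also assumes a numeric attribute with at least two distinct values and binary splits of the form `value <= t`.

```python
from collections import Counter, defaultdict

def build_avc_set(records, attribute, label_key="label"):
    """One scan over the records: count class labels per attribute value."""
    avc = defaultdict(Counter)
    for rec in records:
        avc[rec[attribute]][rec[label_key]] += 1
    return avc

def gini_from_counts(counts):
    """Gini impurity computed from a Counter of class counts."""
    n = sum(counts.values())
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_threshold(avc):
    """Best split 'value <= t' by weighted Gini, using only the AVC counts."""
    total = Counter()
    for counts in avc.values():
        total.update(counts)
    n = sum(total.values())

    left = Counter()
    best_t, best_score = None, float("inf")
    # Skip the largest value: splitting there would leave the right side empty.
    for value in sorted(avc)[:-1]:
        left.update(avc[value])
        right = total - left
        n_left = sum(left.values())
        score = (n_left * gini_from_counts(left)
                 + (n - n_left) * gini_from_counts(right)) / n
        if score < best_score:
            best_t, best_score = value, score
    return best_t, best_score

records = [
    {"age": 25, "label": "no"},
    {"age": 25, "label": "no"},
    {"age": 40, "label": "yes"},
    {"age": 40, "label": "no"},
    {"age": 60, "label": "yes"},
]
avc = build_avc_set(records, "age")
threshold, score = best_threshold(avc)
```

The point of the sketch is that the AVC-set is usually much smaller than the partition itself (one row of class counts per distinct attribute value), which is why the split criterion can be computed from it.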