Random Forests


[Basic Algorithm] Random Forests

August 9, 2011

Random Forests, also called random trees [2][3], is an ensemble prediction model composed of multiple decision trees, and it can serve as a fast and effective multi-class classification model. Each decision tree in an RF consists of a number of split nodes and leaf nodes: a split node directs an input to the left or right branch according to the outcome of a test on the input, while a leaf node determines the final output of that single tree, which in classification problems is a probability distribution over classes (or simply the most probable class) and in regression problems is an estimate of the target function. The output of the entire RF is aggregated from its decision trees, by argmax for classification or by averaging for regression.
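As a rough illustration of this aggregation step, here is a minimal Python sketch; the `trees` list and its per-tree distribution interface are assumptions of the sketch, not something specified by [1]-[3]:

```python
import numpy as np

def rf_predict(trees, x):
    """Aggregate the outputs of individually trained decision trees.

    `trees` is assumed to be a list of callables mapping an input x to
    a class-probability distribution (a 1-D numpy array); this is an
    illustrative interface, not code from the cited papers.
    """
    # Average the per-tree class distributions (the "Avg" rule) ...
    avg_dist = np.mean([tree(x) for tree in trees], axis=0)
    # ... then return the most probable class (the "argmax" rule).
    return int(np.argmax(avg_dist))
```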

Node Test
Node tests are usually very simple, yet many simple tests combined can be surprisingly powerful, and that is exactly the character of ensemble prediction models. The node test varies from application to application. For example, [1] recognizes human body parts from depth maps, and the node test it uses is a depth-comparison test centered at a pixel x:

$$ f_\theta(I, \mathbf{x}) = d_I\!\left(\mathbf{x} + \frac{\mathbf{u}}{d_I(\mathbf{x})}\right) - d_I\!\left(\mathbf{x} + \frac{\mathbf{v}}{d_I(\mathbf{x})}\right) > \tau $$
Put simply, the test checks whether the difference between the depths at two displacements u and v from pixel x exceeds a threshold τ, where d_I denotes the depth map of image I. The displacements are divided by the depth at x itself, which makes the depth difference independent of the depth of x, i.e., of how far the body is from the camera. This node test looks meaningless at first glance, and indeed a single test on its own performs only slightly better than random classification. But just as the Haar feature is a very weak feature on its own, the key to making it work is the subsequent boosting or bagging, which combines many weak tests into an effective ensemble.
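To make the test concrete, here is a minimal Python sketch of such a depth-comparison feature; the function name, the (row, col) pixel convention, and the large-constant handling of out-of-image probes are assumptions of this sketch rather than the exact implementation of [1]:

```python
import numpy as np

def depth_feature(depth, x, u, v, tau, background=1e6):
    """Depth-comparison node test in the style of [1] (a sketch).

    `depth` is a 2-D depth map, `x` a (row, col) pixel, `u`/`v` pixel
    offsets, and `tau` the split threshold.
    """
    d_x = depth[x]
    # Normalize the offsets by the depth at x so the feature is
    # invariant to how far the body is from the camera.
    def probe(offset):
        r = int(x[0] + offset[0] / d_x)
        c = int(x[1] + offset[1] / d_x)
        if 0 <= r < depth.shape[0] and 0 <= c < depth.shape[1]:
            return depth[r, c]
        return background  # out-of-image probes read as far background
    # The test: depth difference at the two displaced probes vs. tau.
    return probe(u) - probe(v) > tau
```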

Training
RF belongs to the bagging family of models, so its general training process is similar to bagging; the key point is the random selection of samples, which helps the model avoid overfitting. Each decision tree in an RF is trained separately and independently of the others. For each decision tree, a subset of the samples is drawn before training; because the drawing is done with replacement, some samples may appear multiple times in this subset while others may not appear at all. A conventional decision-tree training algorithm is then applied to the sample subset of each single tree.
The creation of a single decision tree roughly follows this process:
1) randomly generate a subset of the samples;
2) divide the current node into left and right children: compare all the candidate splits and select the best one;
3) repeat 2) until the maximum node depth is reached, or the classification accuracy at the current node is satisfactory.
This process is greedy; a minimal sketch of it follows below.
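The sketch below illustrates steps 1)-3) in Python; the `splits` list of boolean test functions and the `gain` scoring function are assumed interfaces of this sketch, not code from the cited papers:

```python
import numpy as np

def bootstrap_subset(n, rng):
    """Draw n sample indices with replacement: some samples appear
    several times, others not at all (standard bagging; a sketch)."""
    return rng.integers(0, n, size=n)

def grow_tree(X, y, splits, gain, depth=0, max_depth=10):
    """Greedy top-down tree growth (illustrative sketch)."""
    # Stop on pure nodes or at the maximum depth; a leaf stores the
    # empirical class distribution of the samples reaching it.
    if depth >= max_depth or len(np.unique(y)) == 1:
        return ('leaf', np.bincount(y, minlength=2) / len(y))
    # Compare all candidate splits and keep the best one (greedy step).
    best = max(splits, key=lambda s: gain(s, X, y))
    mask = np.array([best(x) for x in X])
    if mask.all() or not mask.any():  # degenerate split: make a leaf
        return ('leaf', np.bincount(y, minlength=2) / len(y))
    return ('split', best,
            grow_tree(X[~mask], y[~mask], splits, gain, depth + 1, max_depth),
            grow_tree(X[mask], y[mask], splits, gain, depth + 1, max_depth))
```

Because each tree sees its own bootstrap subset and only a limited pool of candidate splits at each node, the trees end up decorrelated, which is what the bagging aggregation relies on [2].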
Of course, for different applications, there will be differences in the details of the training process, such as the generation of sample subsets and the definition of optimal segmentation.
In [1], the training samples of a decision tree are actually the pixels x of the images, and the feature values are given by the node tests described above. However, even for a fixed-size image the number of candidate pixels x is very large, and the candidate displacements (u, v) and depth-difference thresholds are practically infinite. Therefore, when training a single decision tree, [1] forms the sample subset by randomly sampling the set of pixels x, randomly generating combinations of displacements (u, v) and depth-difference thresholds, and randomly sampling the set of training depth maps itself.
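This randomized proposal step might look like the following sketch; the offset and threshold ranges and the number of candidates are illustrative assumptions, not the parameters used in [1]:

```python
import numpy as np

def random_candidates(n, rng, max_offset=100.0, max_tau=0.5):
    """Randomly propose n (u, v, tau) feature candidates for one tree.

    The offset and threshold ranges here are illustrative assumptions;
    the paper's actual parameter ranges are not reproduced.
    """
    u = rng.uniform(-max_offset, max_offset, size=(n, 2))  # pixel offsets
    v = rng.uniform(-max_offset, max_offset, size=(n, 2))
    tau = rng.uniform(-max_tau, max_tau, size=n)           # depth thresholds
    return list(zip(u, v, tau))

# e.g. candidates = random_candidates(2000, np.random.default_rng(0))
```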
The optimal split is typically defined as the one that maximizes the information gain, such as the definition in [1]:
$$ G(\phi) = H(Q) - \sum_{s \in \{l, r\}} \frac{|Q_s(\phi)|}{|Q|}\, H(Q_s(\phi)) $$
Here Q is the set of samples at the node, Q_l(φ) and Q_r(φ) are the subsets sent to the left and right children by split φ, and H denotes the Shannon entropy, computed from the distribution of body-part labels within the corresponding subset.
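Evaluated directly from this definition, the gain computation is short; here is a hedged Python sketch (the boolean `test` interface is an assumption of the sketch):

```python
import numpy as np

def entropy(y):
    """Shannon entropy of the empirical label distribution of y."""
    p = np.bincount(y) / len(y)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def information_gain(test, X, y):
    """Information gain G(phi) of splitting (X, y) with a boolean
    `test`, matching the formula above (illustrative sketch)."""
    mask = np.array([test(x) for x in X])
    if mask.all() or not mask.any():
        return 0.0  # degenerate split: one side would be empty
    n = len(y)
    return (entropy(y)
            - (mask.sum() / n) * entropy(y[mask])
            - ((~mask).sum() / n) * entropy(y[~mask]))
```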

References:
[1] J. Shotton, A. Fitzgibbon, M. Cook, T. Sharp, M. Finocchio, R. Moore, A. Kipman, and A. Blake. Real-Time Human Pose Recognition in Parts from Single Depth Images. In Proc. CVPR, 2011.
[2] L. Breiman. Random Forests. Machine Learning, 45(1):5–32, 2001.
[3] T. Hastie, R. Tibshirani, and J. H. Friedman. The Elements of Statistical Learning. Springer, 2003. ISBN-13 978-0387952840.
[4] V. Lepetit, P. Lagger, and P. Fua. Randomized Trees for Real-Time Keypoint Recognition. In Proc. CVPR, pages 2:775–781, 2005.

Reposted from http://lincccc.com/?p=47

From: http://blog.csdn.net/yangtrees/article/details/7488937
