OpenCV Random Forest Parameters


[Original source]: http://blog.csdn.net/sangni007/article/details/7488727


Inheritance structure in OpenCV 2.3:

API:

CvRTParams defines the training parameters for random trees (R.T.). It extends CvDTreeParams, the parameter class for a single decision tree, but does not use all of its fields; for example, random trees usually need no pruning, so the pruning parameters are ignored.

max_depth: the maximum depth a single tree may reach.
min_sample_count: the minimum number of samples a node needs to keep splitting; a node with fewer samples becomes a leaf.
regression_accuracy: termination criterion for regression trees; splitting stops once every node reaches the required accuracy.
use_surrogates: whether to use surrogate splits. Usually false; set it to true when the data has missing values or when computing variable importance (for example, when the variable is color and some regions of the image are completely black because of the lighting).
max_categories: clusters all possible values of a categorical variable into at most this many groups to keep training fast; the tree then grows with suboptimal splits. Only relevant for categorical variables that can take more than two values.
priors: prior weights for the classes or values you particularly care about, so that training pays more attention to their classification or regression accuracy. Usually left unset (NULL).
calc_var_importance: whether to compute variable importance. Usually set to true.
nactive_vars: the number of variables randomly selected at each tree node, among which the best split is sought. If set to 0, the square root of the total number of variables is used automatically.
max_num_of_trees_in_the_forest: the maximum number of trees in the forest.
forest_accuracy: the target accuracy (used as a termination criterion).
termcrit_type: the type of termination criterion:
--CV_TERMCRIT_ITER: terminate on the number of trees; max_num_of_trees_in_the_forest takes effect.
--CV_TERMCRIT_EPS: terminate on accuracy; forest_accuracy takes effect.
--CV_TERMCRIT_ITER | CV_TERMCRIT_EPS: both conditions apply.
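Taken together, these fields map directly onto the CvRTParams constructor arguments, in the order listed above. A minimal sketch, assuming the OpenCV 2.3 C++ API; the values are illustrative:

```cpp
#include <opencv2/ml/ml.hpp>

// Illustrative CvRTParams construction; the arguments follow the
// OpenCV 2.3 constructor order described above.
CvRTParams params(
    10,               // max_depth
    2,                // min_sample_count
    0,                // regression_accuracy
    false,            // use_surrogates
    16,               // max_categories
    0,                // priors (NULL: no class weighting)
    true,             // calc_var_importance
    0,                // nactive_vars (0: sqrt of the number of variables)
    100,              // max_num_of_trees_in_the_forest
    0,                // forest_accuracy (unused with CV_TERMCRIT_ITER)
    CV_TERMCRIT_ITER  // termcrit_type: stop on the tree count
);
```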
CvRTrees::train: trains the random trees.

Return bool: whether training succeeded.
train_data: the training samples (each sample is a fixed number of variables), stored as a Mat arranged by rows or by columns; must be of type CV_32FC1.
tflag: the layout of train_data:
--CV_ROW_SAMPLE: one sample per row.
--CV_COL_SAMPLE: one sample per column.
responses: the sample outputs, stored as a one-dimensional Mat corresponding to train_data; must be CV_32FC1 or CV_32SC1. For classification problems the responses are class labels; for regression problems they are the function values to be approximated.
var_idx: the variables of interest; NULL means all variables.
sample_idx: the samples of interest; NULL means all samples.
var_type: the type of the responses:
--CV_VAR_CATEGORICAL: category labels (classification).
--CV_VAR_ORDERED (also CV_VAR_NUMERICAL): numeric values (regression).
missing_mask: marks missing data; an 8-bit Mat of the same size as train_data.
params: the training parameters, a CvRTParams.
CvRTrees::train (short version): trains the random trees from a CvMLData.
Return bool: whether training succeeded.
data: the training data in CvMLData format, which can be read from an external .csv file and is stored internally as a Mat; it likewise carries the values, responses, and missing mask.
params: the training parameters, a CvRTParams.
CvRTrees::predict: predicts (classification or regression) for an input sample.
Return double: the prediction result.
sample: the input sample, in the same format as train_data in CvRTrees::train.
missing_mask: marks missing data in the sample.
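For instance, a trained forest can be applied to a single feature vector as follows. This is a minimal sketch assuming the OpenCV 2.3 C++ API; classify_sample and the feature layout are illustrative, not part of the library:

```cpp
#include <vector>
#include <opencv2/core/core.hpp>
#include <opencv2/ml/ml.hpp>

// Hypothetical helper: wrap a feature vector as a 1 x N CV_32FC1 row
// (the same layout as train_data) and run it through a trained forest.
double classify_sample(const CvRTrees& rtrees,
                       const std::vector<float>& features)
{
    cv::Mat sample(1, (int)features.size(), CV_32FC1);
    for (int i = 0; i < (int)features.size(); ++i)
        sample.at<float>(0, i) = features[i];
    // Returns the majority-vote class label for classification,
    // or the averaged tree output for regression.
    return rtrees.predict(sample);
}
```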

Example:

#include <cv.h>
#include <ml.h>
#include <stdio.h>
#include <map>

void print_result(float train_err, float test_err,
                  const CvMat* _var_imp)
{
    printf("train error    %f\n", train_err);
    printf("test error     %f\n", test_err);

    if (_var_imp)
    {
        cv::Mat var_imp(_var_imp), sorted_idx;
        cv::sortIdx(var_imp, sorted_idx,
                    CV_SORT_EVERY_ROW + CV_SORT_DESCENDING);

        printf("variable importance:\n");
        int i, n = (int)var_imp.total();
        int type = var_imp.type();
        CV_Assert(type == CV_32F || type == CV_64F);

        for (i = 0; i < n; i++)
        {
            int k = sorted_idx.at<int>(i);
            printf("%d\t%f\n", k, type == CV_32F ?
                   var_imp.at<float>(k) :
                   var_imp.at<double>(k));
        }
    }
    printf("\n");
}

int main()
{
    const char* filename = "data.xml";
    int response_idx = 0;

    CvMLData data;
    data.read_csv(filename);                  // read data
    data.set_response_idx(response_idx);      // set response index
    data.change_var_type(response_idx,
                         CV_VAR_CATEGORICAL); // set response type
    // split train and test data
    CvTrainTestSplit spl(0.5f);
    data.set_train_test_split(&spl);
    data.set_miss_ch("?");                    // set missing value marker

    CvRTrees rtrees;
    rtrees.train(&data, CvRTParams(10, 2, 0, false,
                 16, 0, true, 0, 100, 0, CV_TERMCRIT_ITER));
    print_result(rtrees.calc_error(&data, CV_TRAIN_ERROR),
                 rtrees.calc_error(&data, CV_TEST_ERROR),
                 rtrees.get_var_importance());

    return 0;
}

 

