Readers of this blog should understand that a feature selection implementation consists of three parts:
- a search algorithm;
- an evaluation function;
- the data.
Therefore, the general form of the code is:
AttributeSelection attsel = new AttributeSelection(); // create a new AttributeSelection instance
Ranker search = new Ranker(); // choose a search method
PrincipalComponents eval = new PrincipalComponents(); // choose an evaluation method
attsel.setEvaluator(eval); // set the evaluation method
attsel.setSearch(search); // set the search method
attsel.SelectAttributes(data); // run attribute selection on the data
int[] indices = attsel.selectedAttributes(); // indices of the selected attributes
System.out.println(attsel.toResultsString()); // print a summary of the selection
The available search methods and evaluation functions are listed below.
Attribute evaluation methods:
CfsSubsetEval: evaluates a subset of attributes by the predictive ability of each feature together with the degree of redundancy among them.
GainRatioAttributeEval: evaluates each attribute by its gain ratio with respect to the class.
InfoGainAttributeEval: evaluates each attribute by its information gain with respect to the class.
ChiSquaredAttributeEval: evaluates each attribute by its chi-squared statistic with respect to the class.
SymmetricalUncertAttributeEval: evaluates each attribute by its symmetrical uncertainty with respect to the class.
ClassifierSubsetEval: evaluates a subset of attributes with a classifier, on the training data or a separate hold-out test set.
ConsistencySubsetEval: evaluates a subset of attributes by the consistency of the class values when the training set is projected onto the subset.
CostSensitiveAttributeEval: a meta-evaluator that makes a base attribute evaluator cost-sensitive.
CostSensitiveSubsetEval: the same, for subset evaluators.
FilteredAttributeEval: runs an arbitrary attribute evaluator on data that has been passed through an arbitrary filter.
FilteredSubsetEval: the same, for subset evaluators.
LatentSemanticAnalysis: performs latent semantic analysis and transformation of the data; used with a Ranker search.
OneRAttributeEval: evaluates attributes using the OneR classifier.
PrincipalComponents: performs a principal components analysis and transformation of the data.
ReliefFAttributeEval: evaluates an attribute by repeatedly sampling an instance and considering the value of the attribute for the nearest instances of the same and of a different class.
SignificanceAttributeEval: evaluates an attribute by computing its probabilistic significance as a two-way function.
SymmetricalUncertAttributeSetEval: evaluates an attribute set by its symmetrical uncertainty with respect to another attribute set.
WrapperSubsetEval: evaluates an attribute set using a learning scheme.
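To make the individual evaluators above concrete, here is a minimal sketch of the quantity InfoGainAttributeEval ranks by: the information gain H(class) - H(class | attribute). This is plain Java with no Weka dependency, and the class and method names (InfoGainSketch, infoGain) are illustrative, not part of Weka.

```java
import java.util.*;

public class InfoGainSketch {
    // Shannon entropy (base 2) of a discrete label array.
    static double entropy(int[] labels) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int y : labels) counts.merge(y, 1, Integer::sum);
        double h = 0.0;
        for (int c : counts.values()) {
            double p = (double) c / labels.length;
            h -= p * Math.log(p) / Math.log(2);
        }
        return h;
    }

    // Information gain of a discrete attribute with respect to the class:
    // IG = H(class) - sum over attribute values v of P(v) * H(class | v).
    static double infoGain(int[] attr, int[] labels) {
        Map<Integer, List<Integer>> groups = new HashMap<>();
        for (int i = 0; i < attr.length; i++)
            groups.computeIfAbsent(attr[i], k -> new ArrayList<>()).add(labels[i]);
        double cond = 0.0;
        for (List<Integer> g : groups.values()) {
            int[] sub = g.stream().mapToInt(Integer::intValue).toArray();
            cond += ((double) sub.length / labels.length) * entropy(sub);
        }
        return entropy(labels) - cond;
    }

    public static void main(String[] args) {
        int[] labels  = {0, 0, 1, 1};
        int[] perfect = {0, 0, 1, 1};   // perfectly predicts the class
        int[] useless = {0, 1, 0, 1};   // independent of the class
        System.out.println(infoGain(perfect, labels)); // 1.0
        System.out.println(infoGain(useless, labels)); // 0.0
    }
}
```

A perfectly predictive attribute scores the full class entropy (1.0 bit here), and an irrelevant one scores 0, which is exactly the ordering an individual evaluator hands to a Ranker search.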
Search algorithms:
BestFirst: greedy hill-climbing search with backtracking; expands on a best-first basis.
ExhaustiveSearch: exhaustive search, starting from the empty set.
FCBFSearch: feature selection based on correlation analysis (fast correlation-based filter).
GeneticSearch: the simple genetic algorithm described by Goldberg (1989).
GreedyStepwise: single-step greedy search, forward or backward.
LinearForwardSelection: linear forward search.
RaceSearch: races subsets of features against each other on cross-validation error.
RandomSearch: random search.
Ranker: ranks attributes by their individual evaluations.
RankSearch: uses an attribute evaluator to rank the attributes, then evaluates subsets in that order.
ScatterSearchV1: scatter search.
SubsetSizeForwardSelection: forward linear search by feature subset size; an extension of linear search.
TabuSearch: tabu search.
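As a sketch of how a greedy forward search (the idea behind GreedyStepwise and the other forward-selection methods above) operates: repeatedly add the single attribute that most improves a merit score, and stop when no addition helps. The merit function below is a toy stand-in for a real subset evaluator such as CfsSubsetEval; all names here are illustrative, not Weka API.

```java
import java.util.*;
import java.util.function.Function;

public class GreedyForwardSketch {
    // Greedy forward selection: start from the empty set and, at each step,
    // add the attribute that most improves the merit score. Stop when no
    // single addition improves it.
    static Set<Integer> select(int numAttrs, Function<Set<Integer>, Double> merit) {
        Set<Integer> chosen = new HashSet<>();
        double best = merit.apply(chosen);
        while (true) {
            int bestAttr = -1;
            double bestScore = best;
            for (int a = 0; a < numAttrs; a++) {
                if (chosen.contains(a)) continue;
                Set<Integer> trial = new HashSet<>(chosen);
                trial.add(a);
                double score = merit.apply(trial);
                if (score > bestScore) { bestScore = score; bestAttr = a; }
            }
            if (bestAttr < 0) return chosen; // no improvement: stop
            chosen.add(bestAttr);
            best = bestScore;
        }
    }

    public static void main(String[] args) {
        // Toy merit: per-attribute relevance minus a size penalty,
        // standing in for a real subset evaluator.
        double[] relevance = {0.5, 0.9, 0.1};
        Function<Set<Integer>, Double> merit = s -> {
            double sum = 0.0;
            for (int a : s) sum += relevance[a];
            return sum - 0.4 * s.size();
        };
        // Picks attribute 1 first, then 0, then stops: attribute 2's
        // relevance no longer outweighs the size penalty.
        System.out.println(select(relevance.length, merit));
    }
}
```

A backward variant starts from the full set and removes attributes instead; BestFirst adds backtracking on top of the same basic move.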
Subset search methods:
1. BestFirst
2. GreedyStepwise
3. FCBFSearch (ASU)
Subset evaluation methods:
1. CfsSubsetEval
2. SymmetricalUncertAttributeSetEval (ASU)
Individual search methods:
1. Ranker
Individual evaluation methods:
1. CorrelationAttributeEval
2. GainRatioAttributeEval
3. InfoGainAttributeEval
4. OneRAttributeEval
5. PrincipalComponents (used with a Ranker search to perform PCA and transform the data)
6. ReliefFAttributeEval
7. SymmetricalUncertAttributeEval
Code in a similar style can be consulted at: http://java-ml.sourceforge.net/content/feature-subset-selection
Reference: an introduction to feature selection in Weka, from "Machine Learning".