A classic way to consider a typical text categorization is to
??
- Participle, scan all features, create a feature dictionary
- Re-scan all features, use feature dictionaries to map features to feature space numbers to get eigenvectors
- Learning Parameters W
- Storage Learning parameter W, storage feature mapping dictionary
- Predictive truncation Load Learning parameter w, load feature map dictionary
- Scan data, map all features using feature map dictionaries to feature space numbers to get eigenvectors
- Using the Learning parameter w to predict the dot product of the obtained eigenvector
??
??
What do Feature hashing do?
Without the use of the feature dictionary, you do not have to consider the extra space of the storage dictionaries and hash the features directly.
A conflict? It will show that the effect is not very significant!
??
??
??
With the same memory footprint we can store more weights!
? ?
Feature Hashing Related-1