Classification of NLP Techniques
NLP technology modules can be grouped into the following categories:
1. Classification algorithms: SVM, naive Bayes, k-nearest neighbors, decision trees, ensemble learning (principles and applications)
2. Clustering algorithms: k-means, hierarchical clustering, density-based clustering (principles and applications)
3. Probabilistic graphical models: HMM, CRF (principles and applications)
4. Topic models: LDA, pLSA, LSA (principles and applications)
5. Semi-supervised learning (algorithmic ideas)
6. New word discovery (technical implementation details)
7. Term weighting measures: term frequency, TF-IDF;
Distance measures: Euclidean distance, cosine distance;
Feature selection algorithms: chi-square statistic, mutual information, information gain
8. Text representation methods: vector space model, Word2vec, Doc2vec
9. Deep learning algorithms: CNN, RNN, LSTM, seq2seq (principles and applications)
10. Keyword extraction: PageRank, TextRank (methods and principles)
11. Sentence similarity: SimHash (method and principle)
12. Relation extraction (principles and applications)
13. ETL processing for big data
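
To make item 7 concrete, here is a minimal Python sketch of TF-IDF weighting together with Euclidean and cosine distance. It is an illustration under stated assumptions, not a reference implementation: the function names and the toy documents are invented for this example, and it assumes tf = count / document length and idf = log(N / df), which is only one of several common TF-IDF variants.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumption: tf = count / document length, idf = log(N / df).
    Smoothed or log-scaled variants are equally common in practice.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


def euclidean_distance(a, b):
    """Euclidean distance between two sparse vectors stored as dicts."""
    terms = set(a) | set(b)
    return math.sqrt(sum((a.get(t, 0.0) - b.get(t, 0.0)) ** 2 for t in terms))


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # convention chosen here for all-zero vectors
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical toy corpus, already tokenized.
docs = [
    ["nlp", "text", "model"],
    ["nlp", "topic", "model"],
    ["image", "pixel"],
]
weights = tf_idf(docs)
d_related = cosine_distance(weights[0], weights[1])    # documents share terms
d_unrelated = cosine_distance(weights[0], weights[2])  # no shared terms
```

With this corpus, the first two documents share "nlp" and "model", so their cosine distance is below 1, while the third document shares no terms with the first and sits at the maximum distance of 1. Storing vectors as dicts keeps the sketch short; a real vector space model would usually use sparse matrices.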