Feedforward Neural Network Language Model (NNLM): C++ Core Code Implementation


This article is from a CSDN blog. When reproducing, please credit the source: http://blog.csdn.net/a635661820/article/details/44730507
Reference: A Neural Probabilistic Language Model
As described in my other blog post introducing NNLM, this is a brief implementation of it. I simplified a few things: the input layer is not directly connected to the output layer (the original paper found that adding these direct edges brings no significant improvement), and there is no parallelized training. Some of the core code is pasted below. Overall I designed the algorithm in object-oriented C++, divided into roughly six classes (a sketch of how they fit together follows the list):

  1. CLex class: a class for handling the text and the lexicon
  2. CInput class: the input layer, with its associated variables
  3. CHidden class: the hidden layer, with its associated variables
  4. COutput class: the output layer, with its associated variables
  5. CAlgothrim class: the algorithm class, containing the forward pass and the backward update; it ties the three layer classes together
  6. CForecast class: the test class, for running predictions with the model after training
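
To see how these classes fit together, here is a minimal driver sketch. It is an illustration under assumptions, not code from the original post: the corpus file name is made up, and whether Run() calls Initialize() itself is not shown anywhere, so treat the call order as a guess. The interfaces match the headers listed below.

```cpp
// Hypothetical driver, assuming the class interfaces shown in the headers below.
#include "Input.h"
#include "Hidden.h"
#include "Output.h"
#include "Algothrim.h"

int main()
{
    CInput  input;     // builds the dictionary and the word mapping matrix
    CHidden hidden;    // allocates H, a, d
    COutput output;    // U, b, y, p are allocated later via Initialize()
    output.Initialize(input.pChnLex->ulVocSizeC);   // guess: sized to the dictionary

    CAlgothrim algothrim(&input, &hidden, &output);
    char corpusFile[] = "corpus.txt";               // hypothetical file name
    algothrim.Run(corpusFile);                      // train until convergence
    algothrim.SaveParameters();                     // write all weights to weight.txt
    return 0;
}
```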

The manually set network parameters are defined as macros in MacroDefinition.h, including the number of hidden-layer neurons, the dimensionality of the word feature vectors, and so on. Only the core code is shown here, namely CInput, CHidden, COutput, and CAlgothrim.
The initial macro values follow the paper:
```cpp
// Macro definitions of the tunable parameters, kept in one place for easy debugging
#define M 100                   // dimensionality m of each word's feature vector
#define N 3                     // number of input-layer words; the (n+1)-th word is the one to predict
#define HN 60                   // number of hidden-layer neurons
#define EPSILON 0.001           // learning rate of the neural network
#define CONVERGE_THRESHOLD -4   // manually set convergence threshold for the cumulative log probability
#define FOREWORD_NUM 5          // show the 5 highest-probability words when predicting
```
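
With these defaults, the input vector x has M*N = 300 entries, H is an HN x (M*N) = 60 x 300 matrix, and U is a |V| x HN matrix, so almost all parameters live in the word vectors and the output layer. A quick sanity-check helper (a sketch of my own; vocab stands for the dictionary size |V|):

```cpp
// Rough parameter count of the network for a given vocabulary size.
unsigned long ParameterCount(unsigned long vocab)
{
    unsigned long nVec    = vocab * M;           // word feature vectors (the matrix vec)
    unsigned long nHidden = HN * (M * N) + HN;   // H plus the bias vector d
    unsigned long nOutput = vocab * HN + vocab;  // U plus the bias vector b
    return nVec + nHidden + nOutput;
}
```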

The structure of the class is defined in the Input.h file:
```cpp
// Definition of the input layer
class CInput
{
public:
    bool LoadCorpusFile(char *corpusFile);     // read the corpus file into memory
    bool ComputeX(int k);                      // compute the x vector for one input window of a sentence
    string GetWordById(unsigned long id);      // return the word with the given id
    unsigned long GetIdByWord(string word);    // return the id of the given word
    bool NextSentence(int i);                  // read the next sentence from the corpus
    CInput();                                  // build the dictionary and the word mapping matrix
    virtual ~CInput();                         // release the word mapping matrix and the dictionary

    CLex *pChnLex;                // pointer to the Chinese dictionary
    float **vec;                  // mapping matrix of the input word vectors
    vector<string> sentence;      // one sentence (a line of text)
    vector<string> corpus;        // the corpus
    float *x;                     // input-layer word feature vector
    long unsigned expectedId;     // id of the expected output word of the current training window
};
```

The implementation of the class is in Input.cpp:
```cpp
CInput::CInput()
{
    // constructor: build the dictionary and the word mapping matrix
    pChnLex = new CLex;                            // dynamically allocate the dictionary object
    if (NULL == pChnLex)                           // allocation failed
    {
        printf("Dynamic allocation of the dictionary failed!\n");
        exit(1);
    }
    if (!pChnLex->LoadLexicon_Null("OUTPUT.VOC"))  // connect the dictionary pointer to the lexicon file
    {
        printf("Loading the Chinese dictionary failed!\n");  // report the failure
    }
    unsigned long sizeV = pChnLex->ulVocSizeC;     // record the size of the dictionary
    srand(time(NULL));                             // seed the random number generator
    vec = new float *[sizeV];                      // allocate the word mapping matrix
    if (NULL == vec)                               // allocation failed
    {
        printf("Allocating the word mapping matrix failed!\n");
        exit(1);
    }
    for (unsigned long i = 1; i < sizeV; i++)      // initialize the matrix
    {
        vec[i] = new float[M];                     // M dimensions for each word
        if (NULL == vec[i])
        {
            printf("Allocating a word mapping vector failed!\n");
            exit(1);
        }
        for (int j = 0; j < M; j++)                // random decimals in [-0.5, 0.5]
        {
            vec[i][j] = (float)(((rand() / 32767.0) * 2 - 1) / 2);
        }
    }
    x = new float[M * N];                          // input-layer word feature vector
    if (NULL == x)
    {
        cerr << "Allocating the input-layer word feature vector failed" << endl;
        exit(1);
    }
    memset(x, 0, M * N * sizeof(float));           // clear the x vector
}

CInput::~CInput()
{
    // destructor: release the word mapping matrix and the Chinese dictionary
    unsigned long sizeV = pChnLex->ulVocSizeC;     // record the dictionary size
    for (unsigned long i = 1; i < sizeV; i++)      // release the word mapping matrix
    {
        delete [] vec[i];
        vec[i] = NULL;
    }
    delete [] vec;
    vec = NULL;
    delete pChnLex;                                // release the Chinese dictionary
    pChnLex = NULL;
    delete [] x;                                   // release x
    x = NULL;
}

bool CInput::NextSentence(int i)
{
    // read the i-th sentence from the corpus;
    // its space-separated words go into the sentence container
    string line;
    string word;
    if (i >= corpus.size())                        // subscript error
    {
        cerr << "!!!!! Error subscript" << endl;
        return false;
    }
    line = corpus[i];                              // the i-th sentence
    stringstream inString(line);                   // attach a stream to the sentence
    sentence.clear();                              // clear the previous sentence
    while (inString >> word)
    {
        sentence.push_back(word);                  // split the sentence on spaces
    }
    return true;
}

unsigned long CInput::GetIdByWord(string word)
{
    // return the id of the given word
    return pChnLex->FindWord(word);
}

string CInput::GetWordById(unsigned long id)
{
    // return the word with the given id
    return pChnLex->GetLexiconByID(id);
}

bool CInput::ComputeX(int k)
{
    // compute the x vector for one input window of a sentence;
    // k controls training a sentence several times (sliding window)
    long unsigned id;
    long unsigned vSize = pChnLex->ulVocSizeC;     // dictionary size
    int i, j, t;
    for (i = k, t = 0; i < N + k; i++, t++)
    {
        if (i >= sentence.size())                  // the sentence has been fully trained
        {
            return false;
        }
        id = GetIdByWord(sentence[i]);             // get the id of the i-th input word
        if (id >= vSize)
        {
            cerr << "Input sentence error (probably the training corpus is too small and the vocabulary too sparse)" << endl;
            exit(1);
        }
        for (j = 0; j < M; j++)
        {
            x[M * t + j] = vec[id][j];
        }
    }
    if (k + N >= sentence.size())                  // the sentence has been fully trained
    {
        expectedId = 0;                            // mark with an id that will not occur
        return false;
    }
    expectedId = GetIdByWord(sentence[k + N]);     // id of the expected output word
    return true;
}

bool CInput::LoadCorpusFile(char *corpusFile)
{
    // read the corpus file into memory
    ifstream inFile(corpusFile);                   // the file to read
    if (!inFile)                                   // read failed
    {
        return false;
    }
    string line;
    while (getline(inFile, line))                  // read the corpus line by line into the container
    {
        corpus.push_back(line);
    }
    inFile.close();                                // close the file
    return true;
}
```
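
Since ComputeX returns false once the N-word window slides off the end of the sentence, a per-sentence training loop can simply advance k until it fails. A rough sketch of that loop (TrainSentence is a name I introduce for illustration; the forward and backward calls are only indicated):

```cpp
// Hypothetical per-sentence loop, assuming the CInput interface above.
void TrainSentence(CInput &input, int sentenceIndex)
{
    if (!input.NextSentence(sentenceIndex))  // load the sentence into input.sentence
        return;
    for (int k = 0; ; k++)                   // slide the N-word window
    {
        if (!input.ComputeX(k))              // fills input.x, sets input.expectedId
            break;                           // sentence exhausted
        // forward pass and backward update for this window would go here, e.g.
        // hidden.OutHidden(input.x); output.Output(hidden.a, input.expectedId); ...
    }
}
```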

The structure of the hidden layer is defined in Hidden.h:
```cpp
// Definition of the hidden layer
class CHidden
{
public:
    CHidden();                   // allocate the input-to-hidden weight matrix, the hidden output vector, and the bias vector
    virtual ~CHidden();          // release the weight matrix H and the vectors a, d
    void OutHidden(float *x);    // compute the output of the hidden layer

    float **H;    // weight matrix from the input layer to the hidden layer (names follow the paper for easy cross-checking)
    float *a;     // output vector of the hidden layer
    float *d;     // bias vector of the hidden layer
};
```

The implementation of the hidden layer is in Hidden.cpp:
```cpp
CHidden::CHidden()
{
    // constructor: allocate the input-to-hidden weight matrix,
    // the hidden-layer output vector, and the hidden-layer bias vector
    H = new float *[HN];    // weight matrix H from the input layer to the hidden layer
    if (NULL == H)
    {
        cerr << "Allocating the input-to-hidden weight matrix failed!" << endl;
        exit(0);
    }
    int i, j;
    for (i = 0; i < HN; i++)
    // ... (the rest of this listing is cut off in the source)
```
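
The listing is cut off here in the source. From the header comment and the paper, OutHidden must compute a = tanh(d + Hx); a minimal sketch under that assumption:

```cpp
// Sketch of the hidden-layer forward pass, a = tanh(d + Hx),
// assuming H is HN x (M*N), x has M*N entries, and a, d have HN entries.
#include <cmath>

void CHidden::OutHidden(float *x)
{
    for (int i = 0; i < HN; i++)
    {
        float sum = d[i];                  // start from the bias
        for (int j = 0; j < M * N; j++)
            sum += H[i][j] * x[j];         // accumulate the weighted inputs
        a[i] = (float)tanh(sum);           // tanh activation, as in the paper
    }
}
```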
The structure of the output layer is defined in Output.h:
```cpp
// Definition of the output layer
class COutput
{
public:
    void Initialize(long unsigned lexiconSize);       // allocate the hidden-to-output weight matrix, the bias vector, the output vector, and the word probability vector
    void SoftMax(void);                               // softmax layer: normalize and compute the probabilities
    void Output(float *a, long unsigned expectedId);  // compute the output of the output layer
    COutput();
    virtual ~COutput();                               // release the weight matrix U and the vectors b, y, p

    float **U;               // weight matrix from the hidden layer to the output layer
    float *y;                // output of the output layer
    float *b;                // bias vector of the output-layer neurons
    float *p;                // word probability vector
    long unsigned vSize;     // size of the corpus dictionary
    float L;                 // cumulative log probability
};
```

The implementation of the output layer is in the Output.cpp file:
```cpp
COutput::COutput()
{
    // do nothing
}

COutput::~COutput()
{
    // release the weight matrix U and the vectors b, y, p
    long unsigned i;
    for (i = 1; i < vSize; i++)    // release U
    {
        delete [] U[i];
        U[i] = NULL;
    }
    delete [] U;
    U = NULL;
    delete [] b;                   // release b
    delete [] y;                   // release y
    delete [] p;                   // release p
    b = NULL;
    y = NULL;
    p = NULL;
}

void COutput::Output(float *a, long unsigned expectedId)
{
    // compute the output of the output layer;
    // a is the output vector of the hidden layer, expectedId is the id of the expected output word
    long unsigned i;
    int j;
    float zigma;
    for (i = 1; i < vSize; i++)    // according to y <- b + Ua
    {
        zigma = 0.0;
        for (j = 0; j < HN; j++)
        // ... (the rest of this listing is cut off in the source)
```
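
The listing is cut off again, and SoftMax is not shown at all. Its job, per the header comment, is to normalize y into probabilities, p_i = exp(y_i) / sum_j exp(y_j). A sketch under that assumption (indexing from 1 follows the dictionary convention used throughout the code; a production version would also subtract max(y) before exponentiating for numerical stability):

```cpp
// Sketch of the softmax layer, assuming y and p are indexed 1..vSize-1
// as in the rest of the code.
#include <cmath>

void COutput::SoftMax(void)
{
    float sum = 0.0f;
    for (long unsigned i = 1; i < vSize; i++)   // denominator: sum of exponentials
        sum += (float)exp(y[i]);
    for (long unsigned i = 1; i < vSize; i++)   // normalize each output into a probability
        p[i] = (float)exp(y[i]) / sum;
}
```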
The structure of the algorithm class is defined in Algothrim.h:
```cpp
// Definition of the algorithm class
class CAlgothrim
{
public:
    bool SaveParameters();                                  // save all network parameters to the file weight.txt
    void PrintOverInfo(long unsigned count, int sentNum);   // report that training has finished
    void PrintStartInfo(char *corpusFile);                  // print information about this run of network training
    bool WriteResult();                                     // write the probability vector of the output layer to a file
    bool Run(char *corpusFile);                             // framework function for model training
    void Initialize();                                      // allocate the intermediate gradient vectors
    void UpdateAll();                                       // backward pass: update all parameters of the network
    void PrintParametersAfterUpdate();                      // print all network parameters after the backward update
    void UpdateInput();                                     // update vec of the input layer
    void UpdateHidden();                                    // update H and d of the hidden layer
    void PrintParametersBeforeUpdate();                     // print all network parameters before the backward update
    void UpdateOut();                                       // update the output-layer parameters U and b
    CAlgothrim(CInput *pIn, CHidden *pHi, COutput *pOu);    // the 3 parameters are pointers to the input, hidden, and output layers
    virtual ~CAlgothrim();                                  // release the partial-derivative vectors

    CInput *pInput;      // pointer to the input layer
    CHidden *pHidden;    // pointer to the hidden layer
    COutput *pOut;       // pointer to the output layer
    float *LPa;          // partial derivative of L with respect to a
    float *LPx;          // partial derivative of L with respect to x
    float *LPy;          // partial derivative of L with respect to y
};
```

The implementation of the algorithm class is in Algothrim.cpp, which is the core part of the program:
```cpp
CAlgothrim::CAlgothrim(CInput *pIn, CHidden *pHi, COutput *pOu)
{
    // constructor: the 3 parameters are pointers to the input, hidden, and output layers
    pInput = pIn;
    pHidden = pHi;
    pOut = pOu;
}

CAlgothrim::~CAlgothrim()
{
    // release the partial-derivative vectors
    delete [] LPa;    // release LPa
    LPa = NULL;
    delete [] LPx;    // release LPx
    LPx = NULL;
    delete [] LPy;    // release LPy
    LPy = NULL;
}

void CAlgothrim::UpdateOut()
{
    // update the output-layer parameters U and b, computing LPy on the way
    long unsigned vSize = pInput->pChnLex->ulVocSizeC;   // dictionary size
    long unsigned j;
    long unsigned wt = pInput->expectedId;               // id of the expected word
    int k;
    memset(LPa, 0, HN * sizeof(float));                  // clear LPa
    for (j = 1; j < vSize; j++)                          // update
    {
        if (j == wt)      // according to LPy_j <- 1(j == wt) - p_j
        {
            LPy[j] = 1 - pOut->p[j];
        }
        else
        {
            LPy[j] = 0 - pOut->p[j];
        }
        pOut->b[j] += EPSILON * LPy[j];                  // according to b_j <- b_j + epsilon * LPy_j
        for (k = 0; k < HN; k++)
        // ... (the rest of this listing is cut off in the source)
```
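
The source stops inside UpdateOut. If the remaining updates follow the gradients in the paper, the inner loop above finishes by accumulating LPa and updating U, after which UpdateHidden pushes the gradient through the tanh and UpdateInput distributes it to the shared word vectors. A hedged sketch of those two updates (windowId is a hypothetical cache of the current window's word ids, not a member of the original CInput):

```cpp
// Sketch of the remaining backward updates, following the gradients in the paper.
// Assumes UpdateOut() finishes its inner loop by accumulating
//   LPa[k] += LPy[j] * pOut->U[j][k];
//   pOut->U[j][k] += EPSILON * LPy[j] * pHidden->a[k];
#include <cstring>

void CAlgothrim::UpdateHidden()
{
    memset(LPx, 0, M * N * sizeof(float));   // clear dL/dx before accumulating
    for (int k = 0; k < HN; k++)
    {
        // gradient through tanh: dL/do_k = LPa[k] * (1 - a_k^2)
        float g = LPa[k] * (1 - pHidden->a[k] * pHidden->a[k]);
        pHidden->d[k] += EPSILON * g;                        // d_k <- d_k + epsilon * g
        for (int j = 0; j < M * N; j++)
        {
            LPx[j] += g * pHidden->H[k][j];                  // accumulate dL/dx
            pHidden->H[k][j] += EPSILON * g * pInput->x[j];  // H_kj <- H_kj + epsilon * g * x_j
        }
    }
}

void CAlgothrim::UpdateInput()
{
    // each of the N window words receives its M-entry slice of dL/dx
    for (int t = 0; t < N; t++)
    {
        unsigned long id = pInput->windowId[t];   // hypothetical cache, see above
        for (int j = 0; j < M; j++)
            pInput->vec[id][j] += EPSILON * LPx[M * t + j];
    }
}
```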
