My manager asked me to convert the bin file produced by Word2vec training into a TXT file, though I don't know what the TXT file will be used for. In fact, when Word2vec trains on a corpus you can choose whether it outputs a bin file or a TXT file. But training the bin file already took far too long, and I was afraid that training a TXT file directly would be just as slow, so I decided to try converting the existing bin file instead.
I used Gensim for this. It has to be installed separately, and installing it on my computer was quite troublesome.
# -*- coding: utf-8 -*-
import codecs

import gensim


def main():
    path_to_model = 'Result.bin'
    output_file = 'file.txt'
    bin2txt(path_to_model, output_file)


def bin2txt(path_to_model, output_file):
    output = codecs.open(output_file, 'w', 'utf-8')
    model = gensim.models.KeyedVectors.load_word2vec_format(path_to_model, binary=True)
    print('Done loading word2vec!')
    # On Gensim 4.0 and later, use model.key_to_index instead of model.vocab.
    vocab = model.vocab
    for item in vocab:
        vector = list()
        for dimension in model[item]:
            vector.append(str(dimension))
        vector_str = ",".join(vector)
        # One line per word: the word, a tab, then the comma-separated vector.
        line = item + "\t" + vector_str
        # I originally used write(), but the output had no line breaks.
        # I switched to writelines(), which I have not tested yet.
        output.writelines(line + "\n")
    output.close()


if __name__ == "__main__":
    main()
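Since installing Gensim was the troublesome part, the same conversion can also be sketched with only the Python 3 standard library. This is a rough sketch, not the Gensim implementation. It assumes the bin file uses the layout written by the original word2vec C tool: a text header line "vocab_size vector_size", then for each word its UTF-8 bytes, a single space, and vector_size little-endian float32 values. The function names read_word2vec_bin and bin_to_txt are made up for this illustration.

```python
import struct


def read_word2vec_bin(path):
    """Yield (word, vector) pairs from a word2vec binary file.

    Assumption: the file follows the original word2vec C tool layout
    (text header, then "word<space><float32 * vector_size>" per entry).
    """
    with open(path, "rb") as f:
        # Header line, e.g. b"71291 200\n"
        vocab_size, vector_size = map(int, f.readline().split())
        fmt = "<%df" % vector_size          # little-endian float32 values
        nbytes = struct.calcsize(fmt)
        for _ in range(vocab_size):
            chars = bytearray()
            while True:
                ch = f.read(1)
                if not ch:
                    raise EOFError("file ended in the middle of a word")
                if ch == b" ":              # a space terminates the word
                    break
                if ch != b"\n":             # skip the newline after the previous vector
                    chars += ch
            data = f.read(nbytes)
            if len(data) != nbytes:
                raise EOFError("file ended in the middle of a vector")
            yield chars.decode("utf-8"), struct.unpack(fmt, data)


def bin_to_txt(bin_path, txt_path):
    """Write one "word<TAB>v1,v2,..." line per word; return the word count."""
    count = 0
    with open(txt_path, "w", encoding="utf-8") as out:
        for word, vector in read_word2vec_bin(bin_path):
            out.write(word + "\t" + ",".join(str(x) for x in vector) + "\n")
            count += 1
    return count
```

Calling bin_to_txt('Result.bin', 'file.txt') would produce the same tab-and-comma layout as the Gensim script. Because the values are printed as Python floats, they may show more decimal digits than Gensim's float32 output.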
Converting a Word2vec training result from a bin file to a TXT file in Python