We have achieved the simple question and answer, but now a problem arises: I cannot be sure the question will be phrased exactly as "your name"; it might be "who are you", "what's your name", and so on. This leads to another technology in artificial intelligence:

Natural Language Processing (NLP): roughly speaking, letting the computer understand what a sentence means. NLP is the computer "thinking about" what you said, so that it knows "who are you", "your name" and "what's your name" all express the same thing.

What this requires is: semantic similarity.
Next we will use Python to implement a simple piece of natural language processing, with the help of two powerful third-party Python libraries.
The first is a library called jieba, which segments a Chinese string into words:

pip install jieba

We usually call this library the "stutter" segmenter, because jieba literally means "to stutter" in Chinese; the library is made in China. Its basic usage:
```python
import jieba

key_word = "你叫什么名字"        # "What's your name", the sentence to segment
cut_word = jieba.cut(key_word)   # use jieba's cut method to segment the sentence
print(cut_word)                  # <generator object Tokenizer.cut at 0x03676390>
                                 # jieba.cut returns a generator; if you are not
                                 # familiar with generators, just turn it into a list
cut_word_list = list(cut_word)
print(cut_word_list)             # ['你', '叫', '什么', '名字'] ("you / call / what / name")
```
The test code makes it obvious: the Chinese string is split into words, which are stored in a list.
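If you do not have Chinese text at hand, the generator-then-list pattern above can be reproduced with a tiny stand-in for jieba.cut. The fake_cut function below is purely hypothetical and just splits on spaces; it only exists to show why the list() conversion is needed:

```python
def fake_cut(sentence):
    """A hypothetical stand-in for jieba.cut: yields tokens lazily,
    just as jieba returns a generator rather than a list."""
    for token in sentence.split():
        yield token

cut = fake_cut("what is your name")
print(cut)          # a generator object, like the result of jieba.cut
print(list(cut))    # ['what', 'is', 'your', 'name']
```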
The second is a language-modelling library called gensim:

pip install gensim

This library is very powerful: it encapsulates many machine-learning algorithms and is currently a mainstream library for artificial-intelligence applications. It is not trivial to understand and requires some Python data-processing skills.
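To build some intuition for what gensim is about to do for us, here is a minimal pure-Python sketch of the same pipeline: give every word an integer id, turn each document into sparse (word_id, count) pairs, then compare those vectors. The helper names (build_token2id, doc2bow, cosine_sim) are my own inventions that only mimic the behaviour of gensim's corpora.Dictionary and doc2bow; gensim itself assigns the ids differently.

```python
from collections import Counter
from math import sqrt

def build_token2id(docs):
    """Assign each unique word an integer id (roughly what corpora.Dictionary does)."""
    token2id = {}
    for doc in docs:
        for word in doc:
            token2id.setdefault(word, len(token2id))
    return token2id

def doc2bow(token2id, doc):
    """Turn a token list into sorted (word_id, count) pairs, like dictionary.doc2bow."""
    counts = Counter(doc)
    return sorted((token2id[w], c) for w, c in counts.items() if w in token2id)

def cosine_sim(vec_a, vec_b):
    """Cosine similarity between two sparse bag-of-words vectors."""
    a, b = dict(vec_a), dict(vec_b)
    dot = sum(a[i] * b[i] for i in a if i in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# a toy corpus, already tokenized (as jieba.cut would produce for Chinese)
docs = [["what", "is", "your", "name"],
        ["how", "old", "are", "you"],
        ["what", "name", "is", "that"]]
query = ["what", "is", "your", "name"]

token2id = build_token2id(docs)
corpus = [doc2bow(token2id, d) for d in docs]
query_vec = doc2bow(token2id, query)

sims = [cosine_sim(query_vec, v) for v in corpus]
best = max(range(len(sims)), key=lambda i: sims[i])
print(sims)        # similarity of the query to each document
print(docs[best])  # the closest document
```

The query is matched to the document that shares the most words with it, which is exactly the job the gensim script that follows does at scale.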
```python
import jieba
from gensim import corpora, models, similarities

l1 = ["你叫什么名字",      # "What's your name?"
      "你今年几岁了",      # "How old are you this year?"
      "你多高，你胸多大",  # "How tall are you, how big is your chest?"
      "你胸多大"]          # "How big is your chest?"
a = "你今年几岁了"         # the sentence we want to match against l1

# segment every sentence in l1
all_doc_list = []
for doc in l1:
    doc_list = [word for word in jieba.cut(doc)]
    all_doc_list.append(doc_list)
print(all_doc_list)

# segment the test sentence as well
doc_test_list = [word for word in jieba.cut(a)]

# Build the corpus
dictionary = corpora.Dictionary(all_doc_list)  # make a "bag of words"
# Understanding the bag of words:
# a bag of words is simply a dictionary that maps each word (key)
# to an integer flag (value), for example:
# {'什么': 0, '你': 1, '名字': 2, '叫': 3, ..., '了': 5, '今年': 6, '几岁': 7, ...}
# What it is used for becomes clear with the next step.
print("token2id", dictionary.token2id)
print("dictionary", dictionary, type(dictionary))

# The corpus: match the words of every list in all_doc_list against
# the keys of the dictionary.  For example ['你', '今年', '几岁', '了']
# yields [(1, 1), (5, 1), (6, 1), (7, 1)]: the first number is the
# word id (1 stands for 你) and the second is how often it occurs,
# and likewise 5 = 了, 6 = 今年, 7 = 几岁.
corpus = [dictionary.doc2bow(doc) for doc in all_doc_list]
print("corpus", corpus, type(corpus))

# turn the sentence whose similarity we want to find into a vector too
doc_test_vec = dictionary.doc2bow(doc_test_list)
print("doc_test_vec", doc_test_vec, type(doc_test_vec))

# Train an LSI model on the corpus (the initial corpus); for now it is
# enough to know that LSI is one of gensim's models, no details here
lsi = models.LsiModel(corpus)
print("lsi", lsi, type(lsi))
# the training result over the corpus
print("lsi[corpus]", lsi[corpus])
# the vector representation of doc_test_vec under the trained model
print("lsi[doc_test_vec]", lsi[doc_test_vec])

# Text similarity:
# sparse-matrix similarity, initialised with the training result of the main corpus
index = similarities.SparseMatrixSimilarity(lsi[corpus],
                                            num_features=len(dictionary.keys()))
print("index", index, type(index))

# compute the similarity of doc_test_vec against every sentence in the corpus
sim = index[lsi[doc_test_vec]]
print("sim", sim, type(sim))

# sort the (index, similarity) pairs so the highest similarity comes first
# cc = sorted(enumerate(sim), key=lambda item: item[1], reverse=True)
cc = sorted(enumerate(sim), key=lambda item: -item[1])
print(cc)

text = l1[cc[0][0]]
print(a, text)
```
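The script above calls models.LsiModel without explaining it. Conceptually, LSI (Latent Semantic Indexing) amounts to a truncated SVD of the term-document count matrix: documents are projected onto a small number of latent "topic" dimensions and compared there, so two documents can look similar even without sharing exact words. Here is a rough numpy sketch of that idea; the toy matrix and k = 2 are arbitrary choices for illustration, not what gensim computes internally for our corpus:

```python
import numpy as np

# Toy term-document count matrix: rows are words, columns are documents.
# (This is what corpus, built with doc2bow, amounts to in dense form.)
A = np.array([[1, 0, 1],
              [1, 0, 0],
              [0, 1, 0],
              [0, 1, 1],
              [1, 0, 1]], dtype=float)

# LSI is essentially a truncated singular value decomposition
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2                                   # number of latent topics to keep
doc_vectors = np.diag(s[:k]) @ Vt[:k]   # each column: one document in topic space

def cosine(u, v):
    """Cosine similarity of two dense vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# compare documents in the latent space instead of the raw word space
print(cosine(doc_vectors[:, 0], doc_vectors[:, 2]))
print(cosine(doc_vectors[:, 0], doc_vectors[:, 1]))
```

In this toy matrix, documents 0 and 2 share several words while document 1 shares none with document 0, and the latent-space cosines reflect that.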
High energy ahead. Coming next, Python AI Road, Part 4: jieba and gensim had better not be separated, the simplest similarity implementation.