Question and Answer system

Last Update:2016-08-02 Source: Internet

Author: User

Tags knowledge base


Question and answer system on Knowledge Base: Entity, text and system viewpoint

Editor's note: This article is based on a keynote given by Dr. Trivanyun of Fudan University at the Deep Learning Meetup hosted by Ctrip Technology Center, presenting the knowledge-graph-based QA system developed at Fudan University. Follow Ctrip Technology Center (Ctriptech) for more technical talks. The lecture slides can be downloaded at the end of the article.

A QA system answers questions posed in natural language, and such systems have achieved significant success in areas such as the Internet, communications, and healthcare. IBM's Watson won the $1 million first prize in the Jeopardy! human-versus-machine challenge. Siri runs on the iPhone and has changed the way people interact with their phones. Many other companies have also built successful text or voice QA systems, including Google Now, Amazon's Alexa, and Microsoft's Cortana. In healthcare, QA systems help physicians and patients interact in a timely way.

QA systems can be divided into two categories according to their answer corpus. The first uses ordinary plain text, such as Web documents, community Q&A content, search engine results, and encyclopedia data. The second uses a knowledge graph, usually represented as RDF triples. Because of this structure, a knowledge-graph-based QA system can often give more accurate and concise answers than one based on a plain-text corpus. In addition, many large-scale knowledge graphs at the scale of a billion or more have emerged in recent years, including WolframAlpha, Google Knowledge Graph, and Freebase, and they give the QA systems built on them sufficient coverage. An open-domain QA system based on a knowledge graph is therefore now feasible.
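To make the triple representation concrete, here is a minimal sketch of a knowledge graph stored as (subject, predicate, object) triples with a wildcard lookup. The triples, the predicate names, and the `match` helper are illustrative assumptions, not part of the system described in the talk.

```python
# Toy triples (subject, predicate, object). The predicate names are
# hypothetical and not taken from any real knowledge graph schema.
TRIPLES = [
    ("Honolulu", "isA", "city"),
    ("Honolulu", "locatedIn", "Hawaii"),
    ("Hawaii", "isA", "state"),
]


def match(subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in TRIPLES
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


print(match(subject="Honolulu", predicate="locatedIn"))
```

Answering a factual question then reduces to finding the right subject and predicate, which is why structured answers tend to be more concise than passages retrieved from plain text.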

I. System architecture

The QA system has a three-tier architecture consisting of an entity layer, a language layer, and an application layer, as shown in the figure.

The bottom layer is the entity layer, which provides the most basic computing units for the upper layers, including semantic community search, semantic disambiguation, and a co-occurrence network module. The middle layer is the language layer, which bridges the entity layer and the application layer and handles short texts that carry semantic information. The top layer is the integrated QA system, which includes question templates and deep learning modules.

1. Entity layer model

1.1 Semantic Community Search

As shown in the figure, each node in the semantic community network represents the meaning of a word, and each edge represents a relationship between two words. The model can find the community a word belongs to and the similarity between words. For example, pot and bowl belong to the same semantic community, so their similarity is high. Pot and plate belong to different communities that share two words, so their similarity is medium. Pot and tube belong to different communities that share only one word, so their similarity is lower.
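The overlap idea above can be sketched as follows. The source does not say how similarity is computed, so this sketch assumes Jaccard overlap between communities. The community names and their members are invented to mirror the pot, bowl, plate, and tube example.

```python
# Hypothetical communities; membership is invented for illustration only.
COMMUNITIES = {
    "cookware": {"pot", "bowl", "pan", "kettle"},
    "tableware": {"plate", "fork", "pan", "kettle"},
    "piping": {"tube", "pipe", "hose", "kettle"},
}


def communities_of(word):
    """Return every community (a set of words) that contains the word."""
    return [members for members in COMMUNITIES.values() if word in members]


def similarity(word_a, word_b):
    """1.0 if the words share a community, else the best Jaccard overlap
    between their communities (an assumed measure), or 0.0 if unknown."""
    best = 0.0
    for community_a in communities_of(word_a):
        for community_b in communities_of(word_b):
            if community_a is community_b:
                return 1.0
            shared = len(community_a & community_b)
            best = max(best, shared / len(community_a | community_b))
    return best


for pair in [("pot", "bowl"), ("pot", "plate"), ("pot", "tube")]:
    print(pair, round(similarity(*pair), 3))
```

With this toy data, the pot and plate communities share two words and the pot and tube communities share one, so the scores fall in the same order as the high, medium, and lower similarity described above.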

1.2 Semantic disambiguation

2. Language layer model

2.1 Verb semantic templates

Based on the correlation between verbs and nouns, a theory of verb semantic templates is proposed. It includes conceptualized verb templates such as verb $cconcept and fixed verb templates such as verb $iobject. Verb semantic templates are mainly used to conceptualize language entities, so they must be both general and specific. Based on the minimum description length principle, we propose verb templates that satisfy both properties.

3. Application layer model

As shown in the figure, the QA system starts from the question and goes through entity recognition, template extraction, and prediction index construction to finally find the answer. The key step is extracting the right entity attribute from the question. The question template addresses this by translating the entities in the question into their corresponding concepts. For example, Honolulu is conceptualized as city.

So how does the question template find the corresponding attribute of the entity? We propose a method based on a probabilistic graphical model that makes the returned answer as close as possible to the predicted answer, as shown in the figure. First the entity in the question is identified, then the question is conceptualized to obtain its template, then the corresponding attribute is found from the template, and finally the value is looked up by that attribute.
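The four steps above (identify the entity, conceptualize the question into a template, pick the attribute, look up the value) can be sketched as below. This is only an illustration under stated assumptions: the concept table, the template probabilities, and the knowledge base values are all hypothetical placeholders, and a plain argmax over an assumed P(attribute | template) table stands in for the probabilistic model from the talk.

```python
import re

# All data below is hypothetical and exists only to make the sketch runnable.
ENTITY_CONCEPT = {"Honolulu": "city"}
TEMPLATE_ATTRIBUTE_PROB = {
    "how many people live in $city?": {"population": 0.9, "area": 0.1},
}
KB = {
    ("Honolulu", "population"): "<population value>",
    ("Honolulu", "area"): "<area value>",
}


def answer(question):
    """Entity -> template -> most probable attribute -> value, or None."""
    for entity, concept in ENTITY_CONCEPT.items():
        pattern = re.escape(entity)
        if not re.search(pattern, question, flags=re.IGNORECASE):
            continue
        # Conceptualize: replace the entity mention with "$" + its concept.
        template = re.sub(
            pattern, lambda _m: "$" + concept, question, flags=re.IGNORECASE
        ).strip().lower()
        distribution = TEMPLATE_ATTRIBUTE_PROB.get(template)
        if distribution is None:
            continue
        attribute = max(distribution, key=distribution.get)
        return KB.get((entity, attribute))
    return None


print(answer("How many people live in Honolulu?"))
```

Because the template refers to the concept rather than the entity, the same learned mapping would apply to any other city added to the concept table.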

II. Results of the study

Based on the three-tier architecture above, 27,126,355 question templates were learned, covering 2,782 question intent groups, and the QA system was successfully built, as shown in 1. Entity-based question answering reaches a success rate of up to 59%, as shown in 2 for knowledge-graph-based question answering in CGF. High accuracy was also obtained on the QALD test, as shown in 3.

III. QA research based on deep learning

First, why is deep learning suitable for entity attribute search? Deep learning has a natural advantage on sequential problems, and questions are generally sequences.

1. CNN

In the simplest CNN, the bottom layer is the extraction layer for the question, which first splits the continuous question into a sequence of individual units. Convolution operations are then performed over these units. Finally, a maximum is taken over the results to find the most probable attribute, from which the entity's attribute value is obtained. A bidirectional LSTM model, which captures the context of the question better, was also presented.
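The convolve-then-take-the-maximum idea can be illustrated with a small pure-Python sketch: one-dimensional convolution over token vectors, max-over-time pooling, and a softmax over candidate attributes. The token vectors, filters, output weights, and attribute names are all hypothetical, and this is not the network from the talk.

```python
import math


def conv_max_pool(tokens, filt):
    """Slide one filter over the token vectors (needs len(tokens) >= len(filt)),
    apply ReLU to each response, and keep the maximum."""
    window = len(filt)
    responses = []
    for start in range(len(tokens) - window + 1):
        total = 0.0
        for offset in range(window):
            total += sum(w * x for w, x in zip(filt[offset], tokens[start + offset]))
        responses.append(max(0.0, total))
    return max(responses)


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def predict_attribute(tokens, filters, output_weights, attributes):
    """Return the most probable attribute and the full distribution."""
    features = [conv_max_pool(tokens, f) for f in filters]
    scores = [sum(w * x for w, x in zip(row, features)) for row in output_weights]
    probs = softmax(scores)
    best = max(range(len(attributes)), key=lambda i: probs[i])
    return attributes[best], probs


# Hypothetical inputs: 4 tokens with 2-dimensional vectors, 2 filters of width 2.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
filters = [[[1.0, 0.0], [0.0, 1.0]], [[-1.0, 0.0], [0.0, -1.0]]]
output_weights = [[1.0, 0.0], [0.0, 1.0]]  # one row per attribute
attributes = ["population", "area"]

print(predict_attribute(tokens, filters, output_weights, attributes))
```

In a trained model the filters and output weights would be learned from question and attribute pairs; here they are fixed by hand so the example stays deterministic.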

2. KB-based QA + deep learning

To strengthen the features used by the CNN above, we propose the following model. It is similar to the CNN but uses three CNNs, each of which predicts independently, and the maximum RMS value is taken at the end. Compared with a single CNN, it adds the answer context and the answer type to the answer path attribute. Answer context denotes the information around a candidate answer, and answer type denotes the kind of candidate answer.
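As a rough sketch of how three independent predictions might be merged, the snippet below scores each candidate answer on answer path, answer context, and answer type, then picks the candidate with the highest combined score. The source does not define how the three outputs are combined, so the plain sum used here is an assumption, and the candidate names and scores are hypothetical.

```python
# The three views named in the text; each would come from its own CNN.
COLUMNS = ("path", "context", "type")


def combined_score(scores):
    """Sum the per-view scores (an assumed combination rule)."""
    return sum(scores[column] for column in COLUMNS)


def best_candidate(candidates):
    """Return the candidate name with the highest combined score."""
    return max(candidates, key=lambda name: combined_score(candidates[name]))


# Hypothetical per-view scores for two candidate answers.
candidates = {
    "candidate_a": {"path": 0.7, "context": 0.6, "type": 0.9},
    "candidate_b": {"path": 0.8, "context": 0.2, "type": 0.3},
}

print(best_candidate(candidates))
```

Here candidate_b has the better path score alone, but candidate_a wins once context and type are taken into account, which is the motivation for adding the two extra views.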

IV. Reflections on QA systems

For the QA system, we now have the following problems:

1. A lack of high-quality training data sets: for example, WebQuestions has only 3,778 QA pairs, and QALD has only 100 QA pairs;

2. The data in the knowledge graph itself is incomplete.

At the same time, KB-based QA has limited coverage but gives accurate answers, whereas IR-based QA has unlimited coverage but gives fuzzy answers. How can the two models be combined to answer questions more broadly and more accurately? This is a problem we are working on, and it has good prospects.

(This article was compiled by He June of Ctrip Technology Center.)

Presentation ppt Download:

Question and answer system on the Knowledge Base: Entity, Text and system View-Trivanyun

Note: This article is an original work of Ctrip Technology Center. To reprint it, please email niuq#ctrip.com (replace # with @).

Deep Learning Meetup Series:

The application of deep learning in the Ctrip strategy community

Application of deep learning in Sogou wireless search advertisement

Question and answer system on Knowledge Base: Entity, text and system viewpoint

A deep learning model for user online ad click Behavior Prediction

The inference technology in the Knowledge Atlas and its application in the college entrance examination robot

Tags: deep learning, Ctrip Technology Center
