Paper tasting | Using knowledge graphs to enhance neural networks for natural language processing tasks



Citation: K. M. Annervaz, Somnath Basu Roy Chowdhury, and Ambedkar Dukkipati. Learning beyond Datasets: Knowledge Graph Augmented Neural Networks for Natural Language Processing. CoRR, abs/1802.05930, 2018.

URL: https://arxiv.org/pdf/1802.05930.pdf

Motivation

Machine learning has become the standard solution for many AI problems, but the learning process still depends heavily on task-specific training data. Some models can incorporate prior knowledge in a Bayesian setup, but they cannot access structured external knowledge on demand. The goal of this paper is to develop a deep learning model that uses an attention mechanism to extract task-relevant prior knowledge from a knowledge graph. The paper aims to show that when a deep learning model can access structured knowledge in the form of a knowledge graph, it can be trained with a small amount of labeled training data, reducing the traditional deep learning model's dependence on task-specific training data.

Model

The input to the model is the sequence of word vectors of a sentence, x = [x_1, x_2, ..., x_T]. An LSTM unit computes the hidden state for each word vector, h_t = f(x_t, h_{t-1}), and the resulting hidden states are summed and averaged to obtain o = (1/T) \sum_{t=1}^{T} h_t. From this, the context vector C = ReLU(o^T W) is computed. The context vectors corresponding to entities and relations are multiplied with the entity and relation vectors, and a softmax over the products yields the weight of each entity and relation, \alpha_{e_i} and \alpha_{r_i}. The entity and relation vectors themselves are computed by the DKRL model (a knowledge-graph representation learning model based on textual entity descriptions; see https://aaai.org/ocs/index.php/AAAI/AAAI16/paper/view/12216/12004).

All entity and relation vectors are then combined in a weighted average using the previously computed attention weights, yielding an aggregate entity vector E and relation vector R for the text.
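The averaging, attention, and weighted-sum steps above can be sketched in plain numpy. All dimensions, the random stand-ins for the DKRL entity/relation embeddings, and the separate projection matrices W_e and W_r are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: d = LSTM hidden size, k = embedding size,
# n_e / n_r = number of candidate entities / relations.
d, k, n_e, n_r = 8, 4, 5, 3

h = rng.standard_normal((10, d))        # LSTM hidden states h_1..h_T
o = h.mean(axis=0)                      # o = (1/T) * sum_t h_t

W_e = rng.standard_normal((d, k))       # projection into entity space
W_r = rng.standard_normal((d, k))       # projection into relation space
C_e = np.maximum(o @ W_e, 0.0)          # context vector, ReLU(o^T W)
C_r = np.maximum(o @ W_r, 0.0)

E_vecs = rng.standard_normal((n_e, k))  # stand-ins for DKRL entity vectors
R_vecs = rng.standard_normal((n_r, k))  # stand-ins for DKRL relation vectors

def softmax(z):
    z = z - z.max()                     # numerical stability
    return np.exp(z) / np.exp(z).sum()

alpha_e = softmax(E_vecs @ C_e)         # attention weights alpha_{e_i}
alpha_r = softmax(R_vecs @ C_r)         # attention weights alpha_{r_i}

E = alpha_e @ E_vecs                    # weighted entity vector for the text
R = alpha_r @ R_vecs                    # weighted relation vector for the text
```

The softmax guarantees that the weights over entities (and over relations) sum to one, so E and R are convex combinations of the candidate embeddings.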

Following the TransE assumption (head + relation ≈ tail), a fact tuple is constructed from E and R; this fact representation is combined with the LSTM output, and the joint model is trained end-to-end to produce the text-classification result.
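A minimal sketch of this final step, assuming the retrieved fact is represented by concatenating E and R and that the augmented feature vector feeds a single softmax classification layer (the paper's exact composition and layer sizes may differ):

```python
import numpy as np

rng = np.random.default_rng(1)
k, d, n_classes = 4, 8, 3               # illustrative sizes

E = rng.standard_normal(k)              # attention-weighted entity vector
R = rng.standard_normal(k)              # attention-weighted relation vector
o = rng.standard_normal(d)              # averaged LSTM hidden state

# Under TransE, head + relation ~ tail, so (E, R) pins down a fact;
# here the fact vector is simply their concatenation (an assumption).
fact = np.concatenate([E, R])

# Augment the text representation with the retrieved fact and classify.
features = np.concatenate([o, fact])
W_cls = rng.standard_normal((d + 2 * k, n_classes))
b = rng.standard_normal(n_classes)

logits = features @ W_cls + b
probs = np.exp(logits - logits.max())
probs /= probs.sum()                    # class probabilities
```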

The original model architecture for computing entity and relation representations from text is shown in the figure below.

The model that computes the entity and relation representations is trained jointly with the LSTM module for text classification, as shown in the diagram below.

The number of entities and relations in a knowledge graph is large, so computing an attention weight for every entity and relation individually is expensive. To reduce the attention space, the paper clusters the entity and relation vectors with the k-means algorithm and introduces a convolution-based model to learn representations of sets of knowledge-graph entities and relations.
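The attention-space reduction can be illustrated with a plain-numpy k-means (Lloyd's algorithm): attention is computed over cluster centroids instead of individual entity vectors. The sizes and random entity vectors are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
n_entities, k, n_clusters = 200, 4, 10   # illustrative sizes

E_vecs = rng.standard_normal((n_entities, k))  # entity embeddings

# Initialize centroids from randomly chosen entities.
centroids = E_vecs[rng.choice(n_entities, n_clusters, replace=False)]

for _ in range(20):
    # Assignment step: nearest centroid for each entity.
    dists = ((E_vecs[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    assign = dists.argmin(axis=1)
    # Update step: each centroid becomes the mean of its cluster.
    for c in range(n_clusters):
        members = E_vecs[assign == c]
        if len(members):
            centroids[c] = members.mean(axis=0)

# The attention softmax now ranges over n_clusters centroids (10)
# rather than n_entities vectors (200).
```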

Experiments

The paper uses the News20 and DBpedia datasets for the text-classification task, and the Stanford Natural Language Inference (SNLI) dataset for the natural language inference task. Freebase (FB15k) and WordNet (WN18) serve as the external knowledge bases.

Figures (a) and (b) show, respectively, the accuracy and loss curves on the SNLI dataset. The experiments compare three input settings: 100% of the training data, 70% of the training data, and 70% of the training data + KG. The results show that introducing the knowledge graph not only reduces the deep learning model's dependence on training data but also significantly improves prediction accuracy. In addition, the proposed method scales well to large amounts of prior information and can be applied to common NLP tasks in general.


Notes by: Deng Shumin, College of Computer Science, Zhejiang University, direct-admission PhD student (class of 2017); research interests: joint representation learning of knowledge graphs and text, and time-series prediction.



openkg.cn


OpenKG.CN, the Chinese Open Knowledge Graph, aims to promote the openness and interconnection of Chinese knowledge graph data, and to advance the popularization and wide application of knowledge graphs and semantic technologies.

Reprint notice: reprints must credit the source "OpenKG.CN" along with the author and a link to the original. If the title is changed, please also note the original title.


