When the Knowledge Graph "Meets" Deep Learning

Tags: knowledge graph, vector, knowledge base

Author: Xiao, associate professor and doctoral supervisor at the School of Computer Science and Technology, Fudan University, and deputy director of the Shanghai Internet Big Data Engineering Technology Center. His main research directions are big data management and mining, knowledge bases, and related areas.

The arrival of the big data era has brought an unprecedented data dividend to the rapid development of artificial intelligence. "Fed" on big data, AI technology has made unprecedented progress. That progress is manifested in knowledge engineering, represented by knowledge graphs, and in machine learning and related fields, represented by deep learning. As deep learning exhausts the big data dividend, a ceiling on the performance of deep learning models is looming. On the other hand, a large number of knowledge graphs are emerging; this treasury of human prior knowledge has not yet been used effectively by deep learning. Integrating knowledge graphs with deep learning has therefore become one of the important ideas for further improving deep learning models. Symbolism, represented by knowledge graphs, and connectionism, represented by deep learning, are increasingly leaving their original tracks of independent development and embarking on a new road of synergy.

The historical background of the fusion of knowledge graphs and deep learning

Big data has brought an unprecedented data dividend to machine learning, and especially to deep learning. Thanks to large-scale annotated data, deep neural networks can acquire effective hierarchical feature representations and thus achieve excellent results in fields such as image recognition. But as the data dividend disappears, the limitations of deep learning are becoming more apparent, especially its reliance on large-scale labeled data and its difficulty in using prior knowledge effectively. These limitations hinder the further development of deep learning. Moreover, in the practice of deep learning, more and more people find that the results of deep learning models often conflict with prior or expert knowledge. How to free deep learning from its dependence on large-scale samples, how to let deep learning models effectively exploit existing prior knowledge, and how to make their results consistent with prior knowledge have become important problems in the field of deep learning.

At present, human society has accumulated a great deal of knowledge. In particular, in recent years, with the help of knowledge graph technology, various machine-friendly online knowledge graphs have sprung up. A knowledge graph is a kind of semantic network that expresses various entities, concepts, and the semantic relations among them. Relative to traditional knowledge representations (such as ontologies and traditional semantic networks), knowledge graphs have the advantages of high entity/concept coverage, diverse semantic relationships, a machine-friendly structure (usually represented in RDF format), and high quality, which make the knowledge graph the most important knowledge representation of the big data and AI era. Using the knowledge contained in knowledge graphs to guide the learning of deep neural network models, and thereby improve model performance, has become one of the most important research problems.

At present, applying deep learning techniques to knowledge graphs is relatively direct: a large number of deep learning models can effectively accomplish end-to-end tasks such as entity recognition, relation extraction, and relation completion, which can be used to construct or enrich knowledge graphs. This paper mainly discusses the converse: the application of knowledge graphs in deep learning models. According to the current literature, there are two main approaches. The first is to feed the semantic information of the knowledge graph into the deep learning model, expressing the discrete knowledge graph as continuous vectors so that the prior knowledge in the graph can become input to deep learning. The second is to use knowledge as a constraint on the optimization objective to guide the learning of the model, usually by expressing the knowledge in the graph as terms of the optimization objective. The former already has a large body of literature and has become a current research hotspot: knowledge graph vector representations, as an important feature, are effectively applied in practical tasks such as question answering and recommendation. The latter line of research is just beginning; this paper focuses on first-order predicate logic as a constraint on deep learning models.

Knowledge graphs as input to deep learning

Knowledge graphs are a typical representative of the recent progress of AI symbolism. The entities, concepts, and relationships in a knowledge graph are expressed by discrete, explicit symbolic representations. Such discrete symbolic representations are difficult to apply directly to neural networks, which are based on continuous numerical representations. To allow neural networks to use the symbolic knowledge in knowledge graphs effectively, researchers have put forward many knowledge graph representation learning methods. Representation learning for knowledge graphs aims to learn real-valued vector representations of the graph's constituent elements (nodes and edges). These continuous representations can then be used as input to neural networks, so that neural network models can make full use of the prior knowledge in the graph. This trend has spawned a great deal of research on knowledge graph representations. This chapter begins with a brief review of representation learning for knowledge graphs, and then introduces how these vectors can be applied to practical tasks built on deep learning models, especially question answering and recommendation.

1. Representation learning for knowledge graphs

Representation learning for knowledge graphs aims to learn vector representations of entities and relations. Its key is to reasonably define a loss function f_r(h, t) for each fact (triple <h, r, t>) in the knowledge graph, where h and t are the vector representations of the triple's two entities. Typically, when the fact <h, r, t> holds, f_r(h, t) is expected to be small. Considering all the facts in the knowledge graph, the vector representations of entities and relations can be learned by minimizing the total loss Σ_{<h,r,t> ∈ O} f_r(h, t), where O represents the set of all facts in the knowledge graph. Different representation learning methods define corresponding loss functions using different principles. This paper introduces the basic ideas of knowledge graph representation learning through the distance-based model and the translation-based model [1].
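The generic objective above can be sketched in a few lines. This is a minimal illustration, not any particular paper's model: the toy embeddings are assumptions, and the concrete per-triple loss chosen here (squared distance between h + r and t, a TransE-style choice) merely stands in for an arbitrary f_r.

```python
import numpy as np

def f_r(h, r, t):
    """Per-triple loss: small when the fact <h, r, t> holds.
    (Illustrative TransE-style choice; any f_r could be plugged in.)"""
    return float(np.sum((h + r - t) ** 2))

def total_loss(entity_emb, relation_emb, facts):
    """Sum f_r(h, t) over every fact <h, r, t> in the fact set O."""
    return sum(f_r(entity_emb[h], relation_emb[r], entity_emb[t])
               for h, r, t in facts)

# Toy knowledge graph with one fact (all values are assumptions)
entity_emb = {"YaoMing": np.array([1.0, 0.0]),
              "Shanghai": np.array([1.0, 1.0])}
relation_emb = {"bornIn": np.array([0.0, 1.0])}
O = [("YaoMing", "bornIn", "Shanghai")]
print(total_loss(entity_emb, relation_emb, O))  # 0.0: this fact fits perfectly
```

Training then means adjusting the embedding vectors (e.g. by gradient descent) to drive this total loss down.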

Distance-based models. The representative work is the SE model [2]. The basic idea is that when two entities belong to the same triple <h, r, t>, their vector representations should be close to each other after projection. Therefore, the loss function is defined as the distance after projection, f_r(h, t) = ||W_{r,1}h − W_{r,2}t||, where the matrices W_{r,1} and W_{r,2} project the head entity h and the tail entity t of the triple, respectively. However, since SE introduces two separate projection matrices, it is difficult to capture the semantic correlation between entities and relations. To address this problem, Socher et al. used a third-order tensor to replace the linear transformation layer of traditional neural networks in the scoring function. Bordes et al. proposed the semantic matching energy model, which captures the interaction between entity vectors and relation vectors by introducing Hadamard products of multiple matrices.
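The SE score described above can be sketched directly. The dimensions, example vectors, and identity projection matrices below are illustrative assumptions; in practice W_{r,1} and W_{r,2} are learned per relation.

```python
import numpy as np

def se_score(h, t, W_r1, W_r2, ord=1):
    """SE loss f_r(h, t) = ||W_r1 h - W_r2 t||: each relation owns two
    projection matrices, one for the head and one for the tail entity."""
    return float(np.linalg.norm(W_r1 @ h - W_r2 @ t, ord=ord))

d = 3
h = np.ones(d)
t = np.ones(d)
W_r1 = np.eye(d)  # assumed projections; normally learned per relation r
W_r2 = np.eye(d)

# Identical projections of identical entities give distance 0,
# i.e. the triple is scored as highly plausible.
print(se_score(h, t, W_r1, W_r2))  # 0.0
```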

Translation-based models. The representative work, the TransE model, depicts the correlation between entities and relations via vector translation in the embedding space [3]. The model assumes that if <h, r, t> holds, the embedding of the tail entity t should be close to the embedding of the head entity h plus the relation vector r, i.e., h + r ≈ t. Therefore, TransE uses f_r(h, t) = ||h + r − t|| as the scoring function: when the triple holds, the score is low; otherwise, the score is higher. TransE is very effective at handling simple 1-to-1 relations (where the ratio of the numbers of entities connected at the two ends of the relation is 1:1), but its performance drops significantly on complex N-to-1, 1-to-N, and N-to-N relations. For these complex relations, Wang et al. proposed the TransH model, which acquires different entity representations under different relations by projecting entities onto relation-specific hyperplanes. Lin et al. proposed the TransR model, which projects entities into a relation-specific subspace through a projection matrix, likewise acquiring different entity representations under different relations.
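The TransE score, and the margin-based ranking loss it is commonly trained with, can be sketched as follows. The idea of the ranking loss is that a corrupted triple (e.g. with a randomly replaced tail entity) should score worse than the true triple by at least a margin γ; the vectors and margin below are illustrative assumptions.

```python
import numpy as np

def transe_score(h, r, t, ord=2):
    """TransE scoring function f_r(h, t) = ||h + r - t||."""
    return float(np.linalg.norm(h + r - t, ord=ord))

def margin_ranking_loss(pos, neg, gamma=1.0):
    """pos/neg are (h, r, t) embedding triples; the corrupted (neg)
    triple should score at least gamma worse than the true (pos) one."""
    return max(0.0, gamma + transe_score(*pos) - transe_score(*neg))

h = np.array([0.0, 0.0])
r = np.array([1.0, 0.0])
t_true = np.array([1.0, 0.0])     # h + r == t_true, so score 0
t_corrupt = np.array([5.0, 5.0])  # far from h + r, so high score

loss = margin_ranking_loss((h, r, t_true), (h, r, t_corrupt))
print(loss)  # 0.0: the corrupted triple is already worse by more than the margin
```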

Besides these two typical classes of models, there are many other knowledge graph representation learning models. For example, Sutskever et al. used tensor factorization and Bayesian clustering to learn relational structure. Ranzato et al. introduced a three-way restricted Boltzmann machine, parameterized by a tensor, to learn vector representations of the knowledge graph.

Current mainstream knowledge graph representation learning methods still suffer from a variety of problems, such as failing to describe the semantic correlation between entities and relations well, being unable to handle complex relations, model complexity caused by the introduction of large numbers of parameters, and low computational efficiency that makes it hard to scale to large knowledge graphs. As a means of providing prior knowledge to machine learning and deep learning, representation learning for knowledge graphs still has a long way to go.

Applications of knowledge graph vector representations

Application 1: question answering systems. Natural language question answering is an important form of human-computer interaction. Deep learning makes it possible to generate answers from question-answer corpora, but most deep question answering models still find it difficult to use large amounts of knowledge to produce accurate answers. Yin et al., targeting simple factual questions, proposed a deep learning question answering model based on the encoder-decoder framework that can make full use of knowledge graphs [4]. In a deep neural network, the semantics of a question are often expressed as a vector, and questions with similar vectors are considered to have similar semantics; this is the typical connectionist approach. On the other hand, the knowledge representation of a knowledge graph is discrete, with no gradual transitions between items of knowledge; this is the typical symbolist approach. By embedding the knowledge graph into vectors, a question can be matched against triples (i.e., by computing vector similarity), so that the best-matching triple for a particular question can be found in the knowledge base. The matching process is shown in Figure 1. For the question q, "How tall is Yao Ming?", the words in the question are first represented as a vector array H_q. Candidate matching triples are then looked up in the knowledge graph. Finally, the semantic similarity between the question and each candidate triple's attributes is computed by the following similarity formula: S(q, τ) = x_q^T M u_τ. Here, S(q, τ) represents the similarity between question q and candidate triple τ; x_q is the vector representing the question (derived from H_q); u_τ is the vector of the triple in the knowledge graph; and M is a parameter matrix to be learned.
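The bilinear matching score S(q, τ) = x_q^T M u_τ can be sketched as follows. The question vector, the matrix M (identity here, though it is learned in practice), and the candidate triple embeddings are all illustrative assumptions.

```python
import numpy as np

def similarity(x_q, M, u_tau):
    """Bilinear matching score S(q, tau) = x_q^T M u_tau."""
    return float(x_q @ M @ u_tau)

x_q = np.array([1.0, 0.0])  # assumed vector for "How tall is Yao Ming?"
M = np.eye(2)               # parameter matrix, learned in practice

# Assumed embeddings of two candidate triples from the knowledge graph
candidates = {
    "<YaoMing, height, 2.26m>": np.array([0.9, 0.1]),
    "<YaoMing, bornIn, Shanghai>": np.array([0.1, 0.9]),
}

# Rank the candidates by similarity to the question and pick the best
best = max(candidates, key=lambda k: similarity(x_q, M, candidates[k]))
print(best)  # the height triple matches the question vector best
```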

Fig. 1 A neural generative question answering model based on a knowledge graph

Application 2: recommender systems. Personalized recommendation is one of the most important intelligent services of social media and e-commerce websites. With the wide application of knowledge graphs, much research has realized that knowledge graphs can be used to improve the description of user and item content (features) in content-based recommender systems, and thus enhance recommendation quality. Meanwhile, recommendation algorithms based on deep learning increasingly outperform traditional models based on collaborative filtering [5]. However, work that integrates knowledge graphs into personalized recommendation within a deep learning framework is still rare. Zhang et al. made such an attempt, making full use of three typical kinds of knowledge: structural knowledge (the knowledge graph), textual knowledge, and visual knowledge (images) [6]. The authors obtain vector representations of the structural knowledge through network embedding, use an SDAE (stacked denoising autoencoder) and a stacked convolutional autoencoder to extract textual and visual knowledge features respectively, and finally integrate the three kinds of features into a collaborative ensemble learning framework, using them jointly to realize personalized recommendation. Through experiments on film and book datasets, the authors demonstrate that the proposed algorithm fusing deep learning and knowledge graphs performs well.

Knowledge graphs as constraints on deep learning

Hu et al. proposed a model that merges first-order predicate logic into deep neural networks and successfully applied it to sentiment classification and named entity recognition [7]. Logic rules are a flexible representation of high-order cognition and structured knowledge, and a typical form of knowledge representation. Introducing the various logic rules that people have accumulated into deep neural networks, so that human intention and domain knowledge can guide the models, is of great importance. Other research has attempted to introduce logic rules into probabilistic graphical models, the representative work being Markov logic networks [8], but little work has introduced logic rules into deep neural networks.

The framework proposed by Hu et al. can be summed up as a "teacher-student network", as shown in Figure 2, comprising two parts: the teacher network q(y|x) and the student network p_θ(y|x). The teacher network is responsible for modeling the knowledge expressed by the logic rules, while the student network uses back-propagation, under the constraints of the teacher network, to learn the logic rules. This framework can introduce logic rules into most tasks with deep neural network models, including sentiment analysis and named entity recognition. By introducing logic rules, results improve over the base deep neural network models.

Fig. 2 The "teacher-student network" model for introducing logic rules into deep neural networks

The learning process consists of the following steps:

1. Using soft logic, each logic rule is expressed as a continuous truth value in the interval [0, 1].

2. Based on posterior regularization, the logic rules are used to constrain the teacher network while keeping the teacher network and the student network as close as possible. The optimization problem is:

q(Y|X) = argmin_{q, ξ ≥ 0} KL(q(Y|X) || p_θ(Y|X)) + C Σ_{l,g} ξ_{l,g}, subject to λ_l (1 − E_q[r_{l,g}(X, Y)]) ≤ ξ_{l,g},

where the ξ_{l,g} are slack variables, L is the number of rules, and G_l is the number of groundings of rule l. The KL term (Kullback-Leibler divergence) ensures that the teacher network and the student network stay as consistent as possible; the subsequent regularization term expresses the constraints from the logic rules.
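This constrained problem has a closed-form solution in Hu et al.'s derivation: the teacher re-weights the student's distribution, exponentially down-weighting labels that violate the rules, q(y|x) ∝ p_θ(y|x) · exp(−C Σ_l λ_l (1 − r_l(x, y))). A toy sketch follows, in which the student distribution, rule truth values, and the constants C and λ are illustrative assumptions.

```python
import numpy as np

def teacher_distribution(p_student, rule_truth, C=1.0, lam=1.0):
    """Project the student's p_theta(y|x) toward the rules.
    p_student: array of p_theta(y|x) over labels y.
    rule_truth: soft truth value r_l(x, y) of the rule for each label."""
    unnorm = p_student * np.exp(-C * lam * (1.0 - rule_truth))
    return unnorm / unnorm.sum()  # renormalize into a distribution

p_student = np.array([0.5, 0.5])   # student is undecided between two labels
rule_truth = np.array([1.0, 0.0])  # the rule fully supports label 0
q = teacher_distribution(p_student, rule_truth, C=2.0)
print(q[0] > q[1])  # True: the teacher shifts mass toward the rule-consistent label
```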

3. The student network is trained so that its predictions match both the true labels and the teacher network's predictions as well as possible. The optimization function is:

θ^{(t+1)} = argmin_θ (1/N) Σ_n (1 − π) ℓ(y_n, σ_θ(x_n)) + π ℓ(s_n^{(t)}, σ_θ(x_n)),

where t is the training iteration, ℓ is the task-specific loss function (for example, in classification, ℓ is the cross entropy), σ_θ is the student's prediction function, s_n^{(t)} is the teacher network's prediction, and π is the imitation weight balancing the two terms.
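The per-example student objective is thus a convex combination of the ordinary task loss against the gold label and an imitation loss against the teacher's soft prediction. The sketch below uses cross entropy for ℓ, matching the classification example; the probabilities and the value of π are illustrative assumptions.

```python
import numpy as np

def cross_entropy(target, pred, eps=1e-12):
    """Cross entropy between a target distribution and a prediction."""
    target, pred = np.asarray(target), np.asarray(pred)
    return float(-np.sum(target * np.log(pred + eps)))

def student_loss(y_true, s_teacher, sigma_theta, pi=0.5):
    """(1 - pi) * loss vs. gold label + pi * loss vs. teacher prediction."""
    return ((1.0 - pi) * cross_entropy(y_true, sigma_theta)
            + pi * cross_entropy(s_teacher, sigma_theta))

y_true = np.array([1.0, 0.0])       # hard gold label
s_teacher = np.array([0.8, 0.2])    # teacher's rule-regularized soft prediction
sigma_theta = np.array([0.7, 0.3])  # student's current prediction
print(round(student_loss(y_true, s_teacher, sigma_theta), 4))  # 0.4414
```

Minimizing this by gradient descent pulls the student toward both the data and the rule-consistent teacher.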

Repeat steps 1-3 until convergence.

Conclusion

As deep learning develops further, how to effectively use the large amount of existing prior knowledge, and thereby reduce models' dependence on large-scale labeled samples, is gradually becoming a mainstream research direction. Representation learning for knowledge graphs lays a necessary foundation for exploration in this direction. The groundbreaking work that has recently emerged on integrating knowledge into deep neural network models is also instructive. But generally speaking, methods for using prior knowledge are still very limited, and academia still faces great challenges in this direction, mainly embodied in two aspects. First, how can high-quality continuous representations of all kinds of knowledge be obtained? At present, knowledge graph representation learning, whatever principle it is based on, inevitably incurs semantic loss: once symbolic knowledge is embedded into vectors, a great deal of semantic information is discarded, and only very vague semantic similarity relations can be expressed. How to acquire high-quality continuous representations of knowledge graphs remains an open question. Second, how can common-sense knowledge be integrated into deep learning models? Many practical tasks (such as dialogue, question answering, and reading comprehension) require machines to understand common sense, and the scarcity of common-sense knowledge has seriously hindered the development of general artificial intelligence. Introducing common sense into deep learning models will be a major challenge in AI research, and also a great opportunity.

References
[1] Liu Zhiyuan, Sun Maosong, Lin Yankai, et al. Research progress in knowledge representation learning [J]. Journal of Computer Research and Development, 2016, 53(2): 247-261.
[2] Bordes A, Weston J, Collobert R, et al. Learning structured embeddings of knowledge bases [C]// AAAI Conference on Artificial Intelligence. San Francisco, California, USA: AAAI, 2011.
[3] Bordes A, Usunier N, Garcia-Duran A, et al. Translating embeddings for modeling multi-relational data [C]// Advances in Neural Information Processing Systems, 2013: 2787-2795.
[4] Yin J, Jiang X, Lu Z, Shang L, Li H, Li X. Neural generative question answering [C]// IJCAI, 2016.
[5] Semeraro G, Lops P, Basile P. Knowledge infusion into content-based recommender systems [C]// ACM Conference on Recommender Systems, 2009.
[6] Zhang F, Yuan N J, Lian D, Xie X, Ma W Y. Collaborative knowledge base embedding for recommender systems [C]// Proc. of KDD, 2016.
[7] Hu Z, Ma X, Liu Z, Hovy E, Xing E. Harnessing deep neural networks with logic rules [J]. arXiv preprint arXiv:1603.06318, 2016.
[8] Richardson M, Domingos P. Markov logic networks [J]. Machine Learning, 2006, 62(1-2): 107-136.
