Knowledge Graph



1. What is a knowledge graph
In the Internet age, search engines are an important tool for obtaining information and knowledge online. When a user enters a query, the search engine returns the pages it considers most relevant to the keywords. Search engines followed this pattern from their inception until May 2012, when Google introduced the "Knowledge Graph" to its search page for the first time: in addition to the usual list of page links, users now also see a more intelligent, direct answer related to the query. For example, when a user searches for "Marie Curie", Google displays details about Marie Curie on the right side of the results page, such as a short biography, her place of birth, her dates of birth and death, and even historical figures related to her, such as Einstein and Pierre Curie (her husband).
The term "Knowledge Graph" was first proposed by Google. A knowledge graph is divided into a schema layer (concepts) and a data layer (instances), and its contents are expressed as RDF triples of the form <entity 1, relation, entity 2>. The schema layer can also be called the ontology; a basic ontology includes concepts, a concept hierarchy, attributes, attribute value types, relations, and the domain and range concept sets of each relation. On this basis, rules or axioms can be added to the schema layer to express more complex constraints. Knowledge graphs have very broad applications and can be used for query understanding, knowledge-based question answering, short text analysis, document representation, and so on.
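To make the two-layer design concrete, here is a minimal sketch using the rdflib Python package (an assumption; the article names no toolkit). The schema layer declares a concept and a relation with its domain and range, and the data layer adds <entity 1, relation, entity 2> triples about Marie Curie from the example above. All URIs and property names are illustrative.

```python
from rdflib import Graph, Literal, Namespace, RDF, RDFS

EX = Namespace("http://example.org/kg/")  # hypothetical namespace
g = Graph()

# Schema layer (ontology): a concept, a relation, and its domain/range.
g.add((EX.Person, RDF.type, RDFS.Class))
g.add((EX.spouseOf, RDF.type, RDF.Property))
g.add((EX.spouseOf, RDFS.domain, EX.Person))
g.add((EX.spouseOf, RDFS.range, EX.Person))

# Data layer (instances): facts from the Marie Curie example above.
g.add((EX.Marie_Curie, RDF.type, EX.Person))
g.add((EX.Marie_Curie, EX.birthPlace, Literal("Warsaw")))
g.add((EX.Marie_Curie, EX.spouseOf, EX.Pierre_Curie))

# Query the data layer: who is Marie Curie's spouse?
for s, p, o in g.triples((EX.Marie_Curie, EX.spouseOf, None)):
    print(o)  # -> http://example.org/kg/Pierre_Curie
```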
2. A brief introduction to knowledge graph construction
At present, knowledge graphs are usually built with a combination of top-down and bottom-up approaches. The top-down approach means constructing the ontology in advance with an ontology editor. This is not a process that starts from scratch: it relies on schema information extracted from high-quality knowledge obtained from encyclopedias and structured data, that is, a structured taxonomy. The graph schema defines domains, categories (types), and topics (entities). Each domain contains several categories, and each category contains multiple topics and is associated with multiple properties or relations, which specify the attributes and relations required of the topics belonging to that category. The bottom-up approach uses the various extraction techniques described above, especially mining search logs and web tables, to discover categories, attributes, and relations, and incorporates the high-confidence patterns into the knowledge graph; in other words, it extracts relations. The top-down approach favors the extraction of new instances and guarantees extraction quality, while the bottom-up approach can discover new patterns; a toy sketch of their interplay follows this paragraph. When knowledge is acquired from multiple data sources, knowledge fusion is essential, and many issues must then be considered, such as entity disambiguation (the classic "Michael Jordan" problem) and coreference resolution.
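The combination of the two approaches can be illustrated with a toy Python sketch: a hand-curated schema (top-down) absorbs only those mined patterns whose confidence clears a threshold (bottom-up). The category and property names and the threshold value are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Category:
    name: str                      # e.g. "Film" within a "Media" domain
    properties: set = field(default_factory=set)

# Top-down: a schema curated in advance with an ontology editor.
schema = {"Film": Category("Film", {"director", "release_date"})}

# Bottom-up: (category, property, confidence) patterns mined from
# search logs and web tables by the extractors described above.
mined = [("Film", "box_office", 0.93), ("Film", "paper_color", 0.12)]

CONFIDENCE_THRESHOLD = 0.8  # assumed cut-off; tuned per data source

for category, prop, conf in mined:
    # Only high-confidence patterns are merged into the knowledge graph.
    if conf >= CONFIDENCE_THRESHOLD and category in schema:
        schema[category].properties.add(prop)

print(schema["Film"].properties)
# -> {'director', 'release_date', 'box_office'}
```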
Knowledge storage and maintenance also need to be considered. A relational database with a JSON extension is commonly used, and graph databases such as Neo4j and OrientDB are also options. In a typical pipeline, a web crawler continuously captures the infoboxes and related text of encyclopedia pages and stores the knowledge locally as triple text files. A multithreaded data import program continuously scans the triple files and loads them into MongoDB at the data import layer, guaranteeing the uniqueness of the data during import; a minimal sketch of this step follows. With this process, the backend of a knowledge graph system can be built quickly without human supervision, so the data accumulates continuously and steadily enriches the fields of knowledge the graph covers. Since data sources such as Wikipedia are constantly updated (for example, the end of the Obama presidency), the data must be refreshed regularly. There are two common practices: rebuilding from scratch and incremental building. The first method is simple but time-consuming. The second saves time, but when new wiki data arrives the changes must be translated into rules, and writing a program that parses those rules to update the graph is very difficult, so the first method is generally used. Besides such automatic updates, the graph can also be updated manually: human judgment is the most accurate, and people can write rules by hand to add or modify the edges and nodes of the knowledge graph.
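Here is a minimal sketch of the import step, assuming the pymongo driver and a local MongoDB instance (the article names MongoDB but gives no code). A unique compound index on <subject, predicate, object> enforces de-duplication, and upserts let re-crawled or updated facts overwrite older ones. The file name, field names, and connection address are illustrative.

```python
from pymongo import ASCENDING, MongoClient

client = MongoClient("mongodb://localhost:27017")  # assumed address
triples = client["kg"]["triples"]

# Uniqueness of the imported data is enforced by the database itself.
triples.create_index(
    [("s", ASCENDING), ("p", ASCENDING), ("o", ASCENDING)], unique=True
)

def import_file(path: str) -> None:
    """Load tab-separated <subject, predicate, object> lines."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            s, p, o = line.rstrip("\n").split("\t")
            # Upsert: duplicate triples are skipped, and updated facts
            # (e.g. a changed Wikipedia infobox) replace the old ones.
            triples.update_one(
                {"s": s, "p": p, "o": o},
                {"$set": {"s": s, "p": p, "o": o}},
                upsert=True,
            )

import_file("triples.tsv")  # hypothetical crawler output file
```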
