I. Knowledge Graphs
Against the backdrop of the Internet and big data, search engines such as Google, Baidu, and Sogou have each built their own knowledge graph: Knowledge Graph (Google), Zhixin (Baidu), and Knowledge Cube (Sogou). These are used mainly to improve search quality.
1. What is a knowledge graph
A knowledge graph is a graph-based data structure consisting of nodes (points) and edges. Each node is an entity identified by a globally unique ID, and each relationship (also called a property) connects two nodes. In layman's terms, a knowledge graph is a network of relationships that connects all kinds of different (heterogeneous) information. It makes it possible to analyze problems from the perspective of "relationships".
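The node-and-edge structure described above can be sketched in a few lines of Java. This is only an illustration, not how any search engine stores its graph: the class name KnowledgeGraphSketch, the IDs "e1" and "e2", and the relation name "spouse" are all assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class KnowledgeGraphSketch {
    // A directed edge: a named relationship between two entity IDs.
    static final class Edge {
        final String fromId;
        final String relation;
        final String toId;

        Edge(String fromId, String relation, String toId) {
            this.fromId = fromId;
            this.relation = relation;
            this.toId = toId;
        }
    }

    // Entities keyed by their globally unique ID (ID -> display name).
    private final Map<String, String> entities = new HashMap<>();
    private final List<Edge> edges = new ArrayList<>();

    public void addEntity(String id, String name) {
        if (entities.putIfAbsent(id, name) != null) {
            throw new IllegalArgumentException("duplicate entity ID: " + id);
        }
    }

    public void relate(String fromId, String relation, String toId) {
        if (!entities.containsKey(fromId) || !entities.containsKey(toId)) {
            throw new IllegalArgumentException("both entities must exist");
        }
        edges.add(new Edge(fromId, relation, toId));
    }

    // Names of the entities reachable from `id` over one edge with the given relation.
    public List<String> neighbors(String id, String relation) {
        List<String> names = new ArrayList<>();
        for (Edge e : edges) {
            if (e.fromId.equals(id) && e.relation.equals(relation)) {
                names.add(entities.get(e.toId));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        KnowledgeGraphSketch graph = new KnowledgeGraphSketch();
        graph.addEntity("e1", "Yao Ming");    // hypothetical IDs
        graph.addEntity("e2", "Yao's wife");
        graph.relate("e1", "spouse", "e2");   // hypothetical relation name
        System.out.println(graph.neighbors("e1", "spouse"));
    }
}
```

Scanning the whole edge list on every lookup keeps the sketch short; a real graph store would index edges by node instead.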
2. Knowledge cards
A knowledge card is designed to give users more information related to what they searched for. For example, when we enter "Yao" as a keyword in a search engine, the area on the right of the results page that used to hold ads is replaced by a knowledge card. The page still shows the list of documents that match the keyword.
3. The role of the knowledge graph
The knowledge graph was first proposed by Google, mainly to improve existing search engines. For example, a search for Yao Ming returns not only Yao's own information but also information tied to related search keywords, such as Yao's daughter and Yao's wife. In other words, the larger a search engine's knowledge graph, the more information it can associate with a given keyword and the more likely it is to surface what the searcher most wants to see, so a knowledge graph can greatly improve the quality and breadth of search.
This also explains why search-engine giants such as Google and Baidu are drawn to building knowledge graphs that match their own users' search habits. According to incomplete statistics, Google's Knowledge Graph so far contains 500 million entities and 3.5 billion facts (in the forms entity-attribute-value and entity-relationship-entity).
4. Mining the knowledge graph
A knowledge graph can be built by extracting and integrating big data, and it should then be mined further to increase its knowledge coverage. Common mining techniques:
Inference: a rule engine mines entity attributes or relationships to discover unknown, implied relationships.
Entity importance ranking: when a query contains multiple keywords, the search engine chooses the entities most relevant to the query to display. The PageRank algorithm is commonly used to compute the importance of entities in the knowledge graph.
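To make the entity importance idea concrete, here is a minimal PageRank sketch in Java using plain power iteration. It is a textbook formulation under stated assumptions, not the ranking any particular search engine uses: entities are numbered 0..n-1, links[i] holds the entities that entity i points to, and the damping factor 0.85 and the three-entity graph are illustrative values.

```java
import java.util.Arrays;

class PageRankSketch {
    // links[i] lists the indices of the entities that entity i points to.
    static double[] pageRank(int[][] links, double damping, int iterations) {
        int n = links.length;
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n); // start from a uniform distribution

        for (int it = 0; it < iterations; it++) {
            double[] next = new double[n];
            double dangling = 0.0; // rank held by entities with no outgoing links

            for (int i = 0; i < n; i++) {
                if (links[i].length == 0) {
                    dangling += rank[i];
                } else {
                    // An entity splits its rank evenly among the entities it links to.
                    double share = rank[i] / links[i].length;
                    for (int j : links[i]) {
                        next[j] += share;
                    }
                }
            }
            for (int j = 0; j < n; j++) {
                // Dangling rank is spread evenly so the ranks keep summing to 1.
                next[j] = (1 - damping) / n + damping * (next[j] + dangling / n);
            }
            rank = next;
        }
        return rank;
    }

    public static void main(String[] args) {
        // Hypothetical graph: entities 0 and 1 link to each other, entity 2 links to 0.
        int[][] links = { {1}, {0}, {0} };
        System.out.println(Arrays.toString(pageRank(links, 0.85, 50)));
    }
}
```

In this small graph entity 0 ends up ranked highest because two entities point to it, while entity 2, which nothing points to, keeps only the baseline share.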
II. The Neo4j graph database
Neo4j is a graph database built from vertices and edges. It is often used for Weibo friend-relationship analysis, urban planning, social networking, recommendation, and similar applications.
1. Features
Support for ACID transactions
The Enterprise edition of Neo4j supports clustering to ensure high availability (HA)
Scales easily to billions of nodes and relationships
Has its own high-level query language, Cypher, for efficient retrieval
Supports CSV data import and can be used from programs written in Java
2. The Cypher language
Cypher keywords such as MATCH, WHERE, RETURN, CREATE, DELETE, SET, FOREACH, and WITH play a role similar to SELECT and the other keywords of SQL. For example:
SQL Statement
SELECT name FROM Person
LEFT JOIN Person_Department
  ON Person.Id = Person_Department.PersonId
LEFT JOIN Department
  ON Department.Id = Person_Department.DepartmentId
WHERE Department.name = 'IT Department'
Cypher Statement
MATCH (p:Person)<-[:EMPLOYEE]-(d:Department)
WHERE d.name = "IT Department"
RETURN p.name
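Connecting from a Java program
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Open a JDBC connection to the local Neo4j server.
Connection con = DriverManager.getConnection("jdbc:neo4j://localhost:7474/");
// {1} is a positional parameter that is bound below.
String query = "MATCH (:Person {name:{1}})-[:EMPLOYEE]-(d:Department) RETURN d.name AS dept";
try (PreparedStatement stmt = con.prepareStatement(query)) {
    stmt.setString(1, "John");
    ResultSet rs = stmt.executeQuery();
    while (rs.next()) {
        String department = rs.getString("dept");
        // ....
    }
}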
3. Application scenarios
Anti-fraud: by examining a user's different accounts, such as bank and credit card accounts, check whether the accounts and the related users' transaction information are normal, and use that to judge the user's credit rating.
Recommendation: by querying the graph database for a node's purchases and friends, the system can recommend closely related friends or goods the user is likely to buy.
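The recommendation scenario can be sketched as a friends-of-friends lookup. The Java example below works on an in-memory map rather than on Neo4j itself, and the ranking rule (more mutual friends first) is an assumption chosen for illustration; the user names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class FriendRecommendationSketch {
    // Suggest friends-of-friends who are not yet friends of `user`,
    // ordered by how many mutual friends they share with `user`.
    static List<String> recommend(Map<String, Set<String>> friends, String user) {
        Set<String> direct = friends.getOrDefault(user, Set.of());
        // TreeMap keeps candidates in alphabetical order, which breaks ties predictably.
        Map<String, Integer> mutualCounts = new TreeMap<>();

        for (String friend : direct) {
            for (String candidate : friends.getOrDefault(friend, Set.of())) {
                if (!candidate.equals(user) && !direct.contains(candidate)) {
                    mutualCounts.merge(candidate, 1, Integer::sum);
                }
            }
        }

        List<String> ranked = new ArrayList<>(mutualCounts.keySet());
        // List.sort is stable, so equal counts stay in alphabetical order.
        ranked.sort((a, b) -> mutualCounts.get(b) - mutualCounts.get(a));
        return ranked;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> friends = Map.of(
            "alice", Set.of("bob", "carol"),
            "bob", Set.of("alice", "dave", "erin"),
            "carol", Set.of("alice", "dave"),
            "dave", Set.of("bob", "carol"),
            "erin", Set.of("bob"));
        System.out.println(recommend(friends, "alice"));
    }
}
```

In a graph database the same idea is a two-hop traversal from the user's node, which is where a relationship-oriented store tends to be more convenient than chained SQL joins.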
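Because of how Neo4j stores data, with each node keeping direct references to its relationships, following a relationship from a node is an O(1) operation regardless of the total size of the graph, which makes queries efficient.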