Introduction to the Neo4j Database

Source: Internet
Author: User
Tags: neo4j

Neo4j is one of the leading graph databases and has been adopted by many Internet companies. It is an open-source NoSQL graph database written in Java. Besides modeling the relationships between data well, Neo4j supports the ACID properties familiar from traditional relational databases and performs well in storage efficiency, cluster support, and failover. I recently learned about Neo4j through a lab project and became very interested in its design ideas and architecture, so I wrote this post to share my understanding and help you get to know this database better.

Design Concept

Neo4j is designed to describe the relationships between entities more naturally and more efficiently. In real life, every entity is closely connected to other entities, and these relationships often carry even more information than the attributes of the entities themselves. Traditional relational databases, however, focus on describing the internal attributes of entities and usually implement relationships between entities with foreign keys. Resolving a relationship therefore usually requires a join, and joins are often time-consuming. The explosive growth of the Internet, especially the mobile Internet, has put traditional relational databases under heavy strain, and for relationship-heavy applications such as social networks they have little advantage. Graph databases, which focus on describing the relationships between data, have therefore become an important part of the NoSQL family, and Neo4j is one of the most prominent of them.

There are only two kinds of data in a Neo4j database:

  • Node: a node is similar to an entity in an E-R diagram. Each node can have zero or more properties stored as key-value pairs, and the properties do not need to be defined in advance. In addition, each node can be given labels to distinguish different types of nodes.
  • Relationship: a relationship is similar to a relationship in an E-R diagram. It connects a start node to an end node. Like a node, a relationship can have multiple properties, and it carries a type.
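
The two kinds of data described above can be sketched in a few lines of Python. This is only an illustrative in-memory model, not Neo4j's API: the class names `Node` and `Relationship`, their fields, and the sample data are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Set


@dataclass
class Node:
    """A node: optional labels plus schema-free key-value properties."""
    node_id: int
    labels: Set[str] = field(default_factory=set)
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed link from a start node to an end node."""
    rel_id: int
    rel_type: str
    start: Node
    end: Node
    properties: Dict[str, Any] = field(default_factory=dict)


# Properties are not declared in advance; each node carries its own keys.
alice = Node(1, {"Person"}, {"name": "Alice", "age": 30})
bob = Node(2, {"Person"}, {"name": "Bob"})
knows = Relationship(0, "KNOWS", alice, bob, {"since": 2015})
```

Note that `alice` has an `age` property while `bob` does not, which is the "no need to define attributes in advance" point in miniature.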

[Figure: the node and relationship structure, illustrated with an example graph database]

Based on this design philosophy, Neo4j has the following features:

  • Relationships are materialized when they are created, so following a relationship at query time is an O(1) operation.
  • All relationships are equally important in Neo4j.
  • It provides graph algorithms such as depth-first search, breadth-first search, shortest path, simple path, and Dijkstra.
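
To make the last item concrete, here is a minimal textbook sketch of Dijkstra's algorithm in Python. It is not Neo4j's implementation; the adjacency-list representation, the function name `dijkstra`, and the sample `roads` graph are assumptions chosen for illustration.

```python
import heapq
from typing import Dict, List, Tuple

Graph = Dict[str, List[Tuple[str, float]]]


def dijkstra(graph: Graph, source: str) -> Dict[str, float]:
    """Return the cheapest cost from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for neighbor, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return dist


# Hypothetical weighted graph: A -> C -> B is cheaper than A -> B directly.
roads: Graph = {
    "A": [("B", 4.0), ("C", 1.0)],
    "C": [("B", 2.0)],
    "B": [("D", 5.0)],
}
distances = dijkstra(roads, "A")
```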

Neo4j Storage Structure

Now let's take a look at how data is stored in Neo4j. The first is the node format:

Node: in_use(byte) + next_rel_id(int) + next_prop_id(int)

The meaning of each field is as follows:

  • in_use: 1 indicates that the node is in use, and 0 indicates that it has been deleted.
  • next_rel_id (int): the ID of the node's first relationship, which is the head of its relationship chain.
  • next_prop_id (int): the ID of the node's first property, which is the head of its property chain.
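
A record with this layout can be decoded with Python's `struct` module. The sketch below assumes big-endian byte order and 4-byte signed ints, and it treats 0xffffffff (read as -1) as a "no record" marker; those are assumptions for illustration, as are the names `NodeRecord` and `parse_node`. The sample bytes are the node 1 record quoted in the traversal example in this article.

```python
import struct
from typing import NamedTuple

NO_ID = -1  # 0xffffffff read as a signed 32-bit int; assumed "none" marker


class NodeRecord(NamedTuple):
    in_use: bool
    next_rel_id: int
    next_prop_id: int


# Big-endian: 1 unsigned byte followed by two signed 4-byte ints (9 bytes).
NODE_FORMAT = struct.Struct(">Bii")


def parse_node(raw: bytes) -> NodeRecord:
    """Decode one fixed-size node record."""
    in_use, next_rel_id, next_prop_id = NODE_FORMAT.unpack(raw)
    return NodeRecord(bool(in_use), next_rel_id, next_prop_id)


record = parse_node(bytes.fromhex("01 00000002 ffffffff"))
```

Because every record has the same size, the record for node `n` can be located by a simple offset calculation (`n * NODE_FORMAT.size`) rather than by a search.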

Relationship format: in_use + first_node + second_node + rel_type + first_prev_rel_id + first_next_rel_id + second_prev_rel_id + second_next_rel_id + next_prop_id

  • in_use, next_prop_id: same as above.
  • first_node: the start node of the current relationship.
  • second_node: the end node of the current relationship.
  • rel_type: the relationship type.
  • first_prev_rel_id & first_next_rel_id: the IDs of the previous and next relationships of the start node.
  • second_prev_rel_id & second_next_rel_id: the IDs of the previous and next relationships of the end node.
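
The relationship record can be packed and unpacked the same way. The article does not give field widths for this record, so the sketch assumes one byte for in_use and a 4-byte signed int for each of the other eight fields, in big-endian order. The names `RelRecord`, `parse_rel`, and `pack_rel` and the sample values are hypothetical.

```python
import struct
from typing import NamedTuple


class RelRecord(NamedTuple):
    in_use: bool
    first_node: int
    second_node: int
    rel_type: int
    first_prev_rel_id: int
    first_next_rel_id: int
    second_prev_rel_id: int
    second_next_rel_id: int
    next_prop_id: int


# Assumed layout: 1 unsigned byte + 8 signed 4-byte ints = 33 bytes.
REL_FORMAT = struct.Struct(">B8i")


def parse_rel(raw: bytes) -> RelRecord:
    """Decode one fixed-size relationship record."""
    fields = REL_FORMAT.unpack(raw)
    return RelRecord(bool(fields[0]), *fields[1:])


def pack_rel(rec: RelRecord) -> bytes:
    """Encode a relationship record back into its byte form."""
    return REL_FORMAT.pack(int(rec.in_use), *rec[1:])


# Hypothetical record: node 1 -> node 4, type 0, next relationship of node 1 is 1.
sample = RelRecord(True, 1, 4, 0, -1, 1, -1, -1, -1)
raw = pack_rel(sample)
```

The two prev/next pairs make each relationship a member of two doubly linked lists at once, one per endpoint, which is what lets a traversal continue from either node.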

After reading the storage structure, it becomes clear why Neo4j is so fast at querying the relationships of a node: the IDs of a node's relationships are stored directly in fixed fields of the node and relationship records, so they can be followed directly without looking them up in another table.

The following is an example of graph traversal:

  • Start from node 1 and traverse breadth-first. Its stored record is 01 00000002 ffffffff.
  • The next relationship ID is 2, so read relationship 2: 01 00000001 00000004 00000000 00000001 ffffffff. This gives node 1 -> node 4, and the next relationship is 1.
  • Relationship 1: 01 00000001 00000003 00000000 00000002 00000000 00000003 ffffffff ffffff. This gives node 1 -> node 3. Node 3 has other relationships, so node 3 is put into the queue, and the next relationship to read is 0.
  • Relationship 0: 01 00000001 00000002 00000000 00000001 ffffffffff ffffffff. This gives node 1 -> node 2. All relationships of node 1 have now been read, so node 3 is taken from the queue.
  • Node 3 is visited in the same way.
  • The final result is as follows:

(1)--[KNOWS,2]-->(4)
(1)--[KNOWS,1]-->(3)
(1)--[KNOWS,0]-->(2)
(1)--[KNOWS,1]-->(3)--[KNOWS,5]-->(7)
(1)--[KNOWS,1]-->(3)--[KNOWS,4]-->(6)
(1)--[KNOWS,1]-->(3)--[KNOWS,3]-->(5)
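
The traversal above can be sketched as a breadth-first walk over linked relationship chains. The stores below are hypothetical in-memory stand-ins for the record files: the chain pointers were chosen so that the walk reproduces the order of the result listed above, and they are not decoded from the hex dumps. Only the "next" pointers are modeled, and only outgoing relationships are followed.

```python
from collections import deque
from typing import Dict, List, NamedTuple

NO_ID = -1  # end-of-chain marker (0xffffffff as a signed int)


class Rel(NamedTuple):
    first_node: int
    second_node: int
    first_next: int   # next relationship in the start node's chain
    second_next: int  # next relationship in the end node's chain


# Hypothetical stores: node id -> first relationship id, rel id -> record.
NODES: Dict[int, int] = {1: 2, 2: 0, 3: 5, 4: 2, 5: 3, 6: 4, 7: 5}
RELS: Dict[int, Rel] = {
    0: Rel(1, 2, NO_ID, NO_ID),
    1: Rel(1, 3, 0, NO_ID),
    2: Rel(1, 4, 1, NO_ID),
    3: Rel(3, 5, 1, NO_ID),
    4: Rel(3, 6, 3, NO_ID),
    5: Rel(3, 7, 4, NO_ID),
}


def bfs_paths(start: int) -> List[str]:
    """Breadth-first walk along outgoing relationships, one path per entry."""
    paths: List[str] = []
    prefix = {start: f"({start})"}  # path text leading to each visited node
    queue = deque([start])
    while queue:
        node = queue.popleft()
        rel_id = NODES[node]
        while rel_id != NO_ID:
            rel = RELS[rel_id]
            if rel.first_node == node:
                target = rel.second_node
                if target not in prefix:
                    prefix[target] = f"{prefix[node]}--[KNOWS,{rel_id}]-->({target})"
                    paths.append(prefix[target])
                    queue.append(target)
                rel_id = rel.first_next   # stay on this node's chain
            else:
                rel_id = rel.second_next  # incoming relationship, skip it
    return paths


for line in bfs_paths(1):
    print(line)
```

Each step is a direct lookup by record ID, so the cost of expanding a node depends on how many relationships it has, not on the total size of the store.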

Differences between Neo4j and relational databases

After the explanation above, the difference between Neo4j and an RDBMS (Relational Database Management System) should already be fairly clear. The following table summarizes it:

| Neo4j | RDBMS |
|---|---|
| Allows simple and diverse management of data | Highly structured data |
| Data can be added and defined flexibly, is not limited by data type or quantity, and does not need to be defined in advance | The table schema must be predefined; modifying or adding data structures and types is complex, and data is strictly constrained |
| Relationship queries run in constant time | Relationship queries (joins) are time-consuming |
| Introduces a new query language, Cypher, which makes queries simpler | Queries are more complex, especially when join or union operations are involved |
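
The third row of the comparison can be illustrated with a toy contrast in Python. This is a deliberately simplified sketch: a real RDBMS would normally use an index rather than a full scan, and the names `friendship_rows`, `friends_by_join`, `adjacency`, and `friends_by_traversal` are hypothetical.

```python
from typing import Dict, List, Tuple

# Relational style: relationships live in a separate "friendship" table,
# so finding one person's friends means searching that table.
friendship_rows: List[Tuple[int, int]] = [(1, 2), (1, 3), (3, 5), (3, 6)]


def friends_by_join(person_id: int) -> List[int]:
    """Scan every row; the cost grows with the size of the table."""
    return [dst for src, dst in friendship_rows if src == person_id]


# Graph style: each node already holds references to its neighbours,
# so the work depends on that node's degree, not on the total data size.
adjacency: Dict[int, List[int]] = {1: [2, 3], 3: [5, 6]}


def friends_by_traversal(person_id: int) -> List[int]:
    """Follow the references stored with the node itself."""
    return adjacency.get(person_id, [])
```

Both functions return the same friends for a given person; what differs is how much data each has to touch to find them.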

[Figures: how a relationship query is resolved in an RDBMS and in Neo4j]

Installing and using Neo4j is not the focus of this article. If you want to use Neo4j, the official Neo4j website has plenty of information.



What is Neo4j? How is it configured? Can it be used on its own?

Neo4j is an embedded, disk-based, fully transactional Java persistence engine that stores data in graphs rather than in tables. It scales well: on a single machine it can handle graphs with billions of nodes, relationships, and properties, and it can be scaled out to run in parallel on multiple machines. Compared with relational databases, graph databases are good at handling large amounts of complex, interconnected, loosely structured data that changes rapidly and is queried frequently. In a relational database, such queries lead to a large number of table joins, which can cause performance problems. Neo4j focuses on this performance degradation that a traditional RDBMS suffers when queries involve many joins. Because the data is modeled as a graph, Neo4j traverses nodes and edges at a constant speed that is largely independent of the volume of data making up the graph. In addition, Neo4j provides very fast graph algorithms, recommendation systems, and OLAP-style analysis, which are difficult to achieve with current RDBMS systems.

The Neo4j database runs on a server and the program runs on my own computer. How can a Java program connect to the database without using REST?
