Neo4j Database Introduction

Source: Internet
Author: User
Tags: neo4j

Neo4j is a leading graph database and has become the first choice of many internet companies. It is an open-source NoSQL graph database written in Java. Neo4j supports the ACID properties of traditional relational databases while representing the relationships between data well, and it performs well in storage efficiency, cluster support, and failover redundancy. I recently got to know Neo4j through a project in our laboratory and became very interested in its design ideas and architecture, so I am writing this post to share what I have learned and to help you understand the database better.

Design concept

Neo4j is designed to describe the relationships between entities more naturally and more efficiently. In real life, every entity is connected to the entities around it, and the information carried by these relationships can outweigh the attributes of the entities themselves. Traditional relational databases focus on the attributes inside an entity and usually express relationships between entities through foreign keys. Resolving a relationship therefore usually requires a join, and joins are often time-consuming. With the explosive growth of the internet, especially the mobile internet, relationship-heavy workloads such as social networks have strained traditional relational databases, which have little advantage for this kind of query. Graph databases, which focus on the relationships between data, have emerged as an important branch of NoSQL, and Neo4j is one of the best known among them.

Neo4j stores only two kinds of data:

    • Node: A node is similar to an entity in the E-R model. Each node can have zero or more properties stored as key-value pairs; properties have no type requirement and do not need to be defined in advance. In addition, each node can carry labels that distinguish different kinds of nodes.
    • Relationship: A relationship is similar to a relationship in the E-R model. It has a start node and an end node. In addition, like nodes, relationships can have multiple properties and can be tagged with a type.
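
The two kinds of data described above can be sketched in a few lines of Python. This is only an illustration of the property-graph idea, not how Neo4j represents data internally; the class names, the sample people, and the `since` property are all made up for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Set


@dataclass
class Node:
    """A node: optional labels plus free-form key-value properties."""
    node_id: int
    labels: Set[str] = field(default_factory=set)
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed link from a start node to an end node."""
    rel_id: int
    rel_type: str
    start: Node
    end: Node
    properties: Dict[str, Any] = field(default_factory=dict)


# Properties need no schema: each node may carry different keys.
# The people and property values below are hypothetical sample data.
alice = Node(1, {"Person"}, {"name": "Alice", "age": 30})
bob = Node(2, {"Person"}, {"name": "Bob"})
knows = Relationship(0, "KNOWS", alice, bob, {"since": 2015})

print(f"({knows.start.properties['name']})-[:{knows.rel_type}]->({knows.end.properties['name']})")
```

Note that `alice` has an `age` property and `bob` does not, which is the "no need to define in advance" point from the list above.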

Its structure is as follows: [figure not preserved]. An example of an actual graph database is shown here: [figure not preserved].

Based on this design, Neo4j has the following features:

    • Relationships are materialized when they are created, so following a relationship from a node at query time is an O(1) operation.
    • All relationships are equally important in Neo4j.
    • Built-in algorithms include depth-first search, breadth-first search, shortest path, simple paths, and Dijkstra.
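
To give a feel for one of the algorithms named in the list, here is a minimal Dijkstra sketch over a plain adjacency dictionary. It is a generic textbook version, not Neo4j's implementation, and the weighted graph is hypothetical sample data.

```python
import heapq
from typing import Dict, List, Tuple


def dijkstra(adj: Dict[int, List[Tuple[int, float]]], source: int) -> Dict[int, float]:
    """Return the lowest total weight from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue  # stale heap entry: a cheaper route was already found
        for neighbor, weight in adj.get(node, []):
            new_cost = cost + weight
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return dist


# Hypothetical weighted graph: node -> [(neighbor, weight), ...]
graph = {1: [(2, 4.0), (3, 1.0)], 3: [(2, 2.0), (4, 5.0)], 2: [(4, 1.0)]}
distances = dijkstra(graph, 1)
print(distances)
```

Here the cheapest way to node 2 goes through node 3 (1.0 + 2.0) rather than along the direct edge (4.0).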

Storage structure of Neo4j

Now let's look at how Neo4j stores data. Each node is stored as a fixed-format record: Node: in_use(byte) + next_rel_id(int) + next_prop_id(int). The fields mean the following:

    • in_use: 1 means the node is in use; 0 means it has been deleted
    • next_rel_id (int): the ID of the node's next relationship (the first one in its relationship chain)
    • next_prop_id (int): the ID of the node's next property (the first one in its property chain)
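
A record with this layout is 1 + 4 + 4 = 9 bytes. The sketch below packs and unpacks such a record with Python's `struct` module. The article gives only the field order and widths, so the big-endian byte order, the unsigned integers, and the use of `ffffffff` as a "no record" marker are assumptions made for illustration.

```python
import struct

# Assumption: big-endian, one unsigned byte followed by two unsigned 32-bit ints.
NODE_RECORD = struct.Struct(">BII")
NO_ID = 0xFFFFFFFF  # assumption: all-ones means "no relationship / no property"


def pack_node(in_use: bool, next_rel_id: int, next_prop_id: int) -> bytes:
    """Serialize one node record into its fixed 9-byte form."""
    return NODE_RECORD.pack(1 if in_use else 0, next_rel_id, next_prop_id)


def unpack_node(raw: bytes) -> dict:
    """Parse a 9-byte node record back into named fields."""
    in_use, next_rel_id, next_prop_id = NODE_RECORD.unpack(raw)
    return {"in_use": bool(in_use), "next_rel_id": next_rel_id, "next_prop_id": next_prop_id}


record = pack_node(True, 2, NO_ID)
print(NODE_RECORD.size, record.hex())
print(unpack_node(record))
```

Because every record has the same size, the record for node N can be located by multiplying N by the record size, with no index lookup.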

Relationship format: in_use + first_node + second_node + rel_type + first_prev_rel_id + first_next_rel_id + second_prev_rel_id + second_next_rel_id + next_prop_id

    • in_use, next_prop_id: same as above
    • first_node: the start node of the relationship
    • second_node: the end node of the relationship
    • rel_type: the relationship type
    • first_prev_rel_id & first_next_rel_id: the IDs of the previous and next relationships of the start node
    • second_prev_rel_id & second_next_rel_id: the IDs of the previous and next relationships of the end node
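
As a rough illustration of this layout, the following sketch decodes a relationship record from a hex dump. The sample bytes are the dump of relationship 1 from the traversal example in this article; the big-endian unsigned encoding and the helper names `RelRecord` and `parse_rel` are assumptions for the sketch.

```python
import struct
from collections import namedtuple

# Assumption: one unsigned byte followed by eight big-endian unsigned 32-bit ints.
REL_RECORD = struct.Struct(">B8I")

RelRecord = namedtuple(
    "RelRecord",
    "in_use first_node second_node rel_type "
    "first_prev_rel_id first_next_rel_id "
    "second_prev_rel_id second_next_rel_id next_prop_id",
)


def parse_rel(hex_dump: str) -> RelRecord:
    """Decode a whitespace-separated hex dump into a relationship record."""
    return RelRecord._make(REL_RECORD.unpack(bytes.fromhex(hex_dump)))


rel1 = parse_rel("01 00000001 00000003 00000000 00000002 00000000 00000003 ffffffff ffffffff")
print(REL_RECORD.size)
print(rel1.first_node, rel1.second_node, rel1.first_next_rel_id)
```

The decoded record says that relationship 1 links node 1 to node 3 and that the next relationship in node 1's chain is relationship 0.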

After looking at this storage structure, you can see why Neo4j resolves a node's relationships so quickly: every relationship of a node is reachable through IDs stored directly in the node and relationship records, so a query simply follows these IDs and never needs to look up another table.
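
The idea of following stored IDs instead of searching a table can be sketched as a walk along a linked chain. The records below keep only the fields needed for the walk. Relationships 1 and 0 use the pointer values from this article's traversal example; the pointers of relationship 2 and the meaning of `ffffffff` as "end of chain" are inferred, not taken from the article.

```python
NO_ID = 0xFFFFFFFF  # assumption: all-ones marks the end of a chain

# Node 1 points at relationship 2 as the head of its chain.
nodes = {1: {"next_rel_id": 2}}

rels = {
    # Relationship 2 is not dumped in the article; its pointers are inferred.
    2: {"first_node": 1, "second_node": 4, "first_next_rel_id": 1, "second_next_rel_id": NO_ID},
    1: {"first_node": 1, "second_node": 3, "first_next_rel_id": 0, "second_next_rel_id": NO_ID},
    0: {"first_node": 1, "second_node": 2, "first_next_rel_id": NO_ID, "second_next_rel_id": NO_ID},
}


def relationship_ids_of(node_id: int):
    """Yield the IDs of a node's relationships by following the chain pointers."""
    rel_id = nodes[node_id]["next_rel_id"]
    while rel_id != NO_ID:
        rel = rels[rel_id]
        yield rel_id
        # Each record sits in two chains; pick the pointer for this node's side.
        if rel["first_node"] == node_id:
            rel_id = rel["first_next_rel_id"]
        else:
            rel_id = rel["second_next_rel_id"]


chain = list(relationship_ids_of(1))
print(chain)
```

Each step is a direct lookup by ID, so the cost of listing a node's relationships depends on how many relationships that node has, not on the total size of the graph.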

Here is an example of a graph traversal:

    • The breadth-first traversal starts from node 1, whose stored record is: 00000002 ffffffff
    • relationship 1: 01 00000001 00000003 00000000 00000002 00000000 00000003 ffffffff ffffffff connects node 1 and node 3. Node 3 has other relationships, so node 3 is put into the queue, and relationship 0 is visited next.
    • relationship 0: 01 00000001 00000002 00000000 00000001 ffffffff ffffffff ffffffff ffffffff connects node 1 and node 2. All of node 1's relationships have now been visited, so node 3 is taken out of the queue.
    • Node 3 is visited in the same way as above.

The traversal yields the following paths:

(1)–[KNOWS,2]–>(4)
(1)–[KNOWS,1]–>(3)
(1)–[KNOWS,0]–>(2)
(1)–[KNOWS,1]–>(3)–[KNOWS,5]–>(7)
(1)–[KNOWS,1]–>(3)–[KNOWS,4]–>(6)
(1)–[KNOWS,1]–>(3)–[KNOWS,3]–>(5)
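
The order of the paths above can be reproduced with a small breadth-first traversal. The sketch below is a simplification: instead of raw records it uses a dictionary of outgoing relationships, listed in chain order and read off from the paths in this example, and the function name `bfs_paths` is made up for the illustration.

```python
from collections import deque

# node -> [(relationship id, neighbor), ...] in chain order,
# reconstructed from the example paths (a simplification of the real records).
outgoing = {
    1: [(2, 4), (1, 3), (0, 2)],
    3: [(5, 7), (4, 6), (3, 5)],
}


def bfs_paths(start: int) -> list:
    """Return one path per reached node, in breadth-first discovery order."""
    paths = []
    visited = {start}
    queue = deque([(start, f"({start})")])
    while queue:
        node, prefix = queue.popleft()
        for rel_id, neighbor in outgoing.get(node, []):
            if neighbor in visited:
                continue
            visited.add(neighbor)
            path = f"{prefix}-[KNOWS,{rel_id}]->({neighbor})"
            paths.append(path)
            queue.append((neighbor, path))
    return paths


paths = bfs_paths(1)
for path in paths:
    print(path)
```

All three neighbors of node 1 are reported before any neighbor of node 3, which is what distinguishes the breadth-first order from a depth-first one.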

The difference between Neo4j and relational databases

The explanation above should already have given you a sense of how Neo4j differs from an RDBMS (relational database management system). The following table summarizes the differences:

Neo4j | RDBMS
Allows simple and diverse management of data | Data is highly structured
Data can be added and defined flexibly, with no limits on data type or number and no need to define anything in advance | Table schemas must be predefined; modifying or adding data structures and types is complex, and data is strictly constrained
Relationship queries take constant time | Relationship queries are time-consuming
Introduces a new query language, Cypher, which makes query statements simpler | Query statements are more complex, especially when joins or unions are involved
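
The contrast in the last two rows can be made concrete with a "friends of friends" lookup. The sketch below answers it once with SQL self-joins, using Python's built-in `sqlite3` module, and once by following direct references in a dictionary. The table names, people, and data are hypothetical, and the dictionary only stands in for a graph store; it is not Neo4j.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE knows (src INTEGER REFERENCES person(id), dst INTEGER REFERENCES person(id))")
conn.executemany("INSERT INTO person VALUES (?, ?)", [(1, "Ann"), (2, "Bo"), (3, "Cy")])
conn.executemany("INSERT INTO knows VALUES (?, ?)", [(1, 2), (2, 3)])

# Relational style: every extra hop adds another self-join on the knows table.
sql = """
SELECT p.name
FROM knows AS k1
JOIN knows AS k2 ON k2.src = k1.dst
JOIN person AS p ON p.id = k2.dst
WHERE k1.src = ?
"""
via_joins = [row[0] for row in conn.execute(sql, (1,))]

# Graph style: each node holds direct references to its neighbors.
names = {1: "Ann", 2: "Bo", 3: "Cy"}
adjacency = {1: [2], 2: [3], 3: []}
via_traversal = [names[fof] for friend in adjacency[1] for fof in adjacency[friend]]

print(via_joins, via_traversal)
conn.close()
```

Both approaches return the same answer; the difference is that the join has to match rows between tables for every hop, while the traversal only follows the references of the nodes it visits.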

Finally, the following two images show how the two differ when querying relationships: [image: RDBMS] [image: Neo4j]

Installing and using Neo4j is not the focus of this article. If you want to get started with Neo4j, the official Neo4j website has plenty of material.

