Test machine: CPU i3 2.4 GHz, 4 cores; 8 GB memory
Method 1: Use the native (transactional) interface
JVM: -Xms1024m -Xmx1024m -Xmn512m -XX:PermSize=128m -XX:MaxPermSize=256m
4000 nodes (50 attributes), 4000 relationships: 1 second, CPU usage 25%, memory 761 MB
8000 nodes (50 attributes), 8000 relationships: 2 seconds, CPU usage 25%, memory 829 MB
16000 nodes (50 attributes), 16000 relationships: 5 seconds, CPU usage 25%, memory 983 MB
24000 nodes (50 attributes), 24000 relationships: 9 seconds, CPU usage 25%, memory 1079 MB
32000 nodes (50 attributes), 32000 relationships: 14 seconds, CPU usage 25%, memory 1187 MB
40000 nodes (50 attributes), 40000 relationships: after running for more than 1 minute, the insert fails with OutOfMemoryError: Java heap space.
Conclusion: with the transactional insert interface and a 1 GB JVM heap, a little over 30000 nodes and relationships can be inserted at most; inserting more causes an out-of-memory error.
Method 2: Use the BatchInserter interface
JVM: default settings
40000 nodes (50 attributes), 40000 relationships: 6 seconds, CPU usage 25%, memory 288 MB
80000 nodes (50 attributes), 80000 relationships: 17 seconds, CPU usage 25%, memory 288 MB
120000 nodes (50 attributes), 120000 relationships: 31 seconds, CPU usage 25%, memory 289 MB
200000 nodes (50 attributes), 200000 relationships: 56 seconds, CPU usage 25%, memory 288 MB
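To make the two sets of timings easier to compare, the figures can be converted into nodes per second. The following is only a rough sketch: the class and method names are made up for illustration, the timings are whole seconds as reported above, each run also inserted the same number of relationships, and the two methods ran with different JVM settings.

```java
import java.util.Locale;

// Illustrative helper (not part of Neo4j): converts the reported timings into throughput.
class InsertThroughput {
    // Each row is {node count, elapsed seconds}, copied from the test results above.
    static final int[][] TRANSACTIONAL = {{4000, 1}, {8000, 2}, {16000, 5}, {24000, 9}, {32000, 14}};
    static final int[][] BATCH_INSERTER = {{40000, 6}, {80000, 17}, {120000, 31}, {200000, 56}};

    // Nodes inserted per second for one test run.
    static double nodesPerSecond(int nodes, int seconds) {
        return (double) nodes / seconds;
    }

    // Prints one line per test run for the given method.
    static void print(String label, int[][] rows) {
        for (int[] row : rows) {
            System.out.printf(Locale.ROOT, "%s: %d nodes in %d s = %.0f nodes/s%n",
                    label, row[0], row[1], nodesPerSecond(row[0], row[1]));
        }
    }

    public static void main(String[] args) {
        print("transactional", TRANSACTIONAL);
        print("BatchInserter", BATCH_INSERTER);
    }
}
```

By this measure the transactional interface drops from 4000 nodes/s at 4000 nodes to about 2286 nodes/s at 32000 nodes, while the BatchInserter runs range from about 6667 nodes/s at 40000 nodes to about 3571 nodes/s at 200000 nodes.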
Analysis:
According to the official documentation, when a small amount of data is inserted (fewer than about 5000 items, judging by this test), the transactional insert interface (Neo4j's usual data operation interface) is recommended, and its speed is acceptable. When the data volume is large, the dedicated BatchInserter interface is recommended. It does not create transactions during insertion, which is presumably why its memory usage is so small: memory stays essentially unchanged across the different data volumes tested. It follows that there are two ways to import a large amount of data into Neo4j quickly:
Large-to-small method
This method splits a large dataset into sets of 5000 records or fewer and inserts each set through the transactional insert interface. Based on the test results above (about 1 second per 4000 records, so roughly 25 seconds for 25 such sets), 100000 records can be inserted within about 30 seconds. The disadvantage is that the dataset has to be split into small sets. The advantage is that if a Neo4j database is already running, only the relevant code needs to change and the database does not need to be paused during the import.
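The splitting step can be sketched in plain Java as follows. This is a minimal illustration, not Neo4j code: the class name, the 5000-record limit taken from the test above, and the insertInOneTransaction callback are all assumptions, and the callback is where the real transactional insert of one small set would go.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Illustrative sketch of the large-to-small method; no Neo4j classes are used here.
class ChunkedInsert {
    // Upper bound per transaction, taken from the roughly 5000-item observation above.
    static final int MAX_BATCH_SIZE = 5000;

    // Splits the records into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> split(List<T> records, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < records.size(); start += batchSize) {
            int end = Math.min(start + batchSize, records.size());
            batches.add(new ArrayList<>(records.subList(start, end)));
        }
        return batches;
    }

    // Calls the hypothetical callback once per batch; the callback is assumed to
    // open a transaction, insert the batch, and commit. Returns the batch count.
    static <T> int insertAll(List<T> records, Consumer<List<T>> insertInOneTransaction) {
        List<List<T>> batches = split(records, MAX_BATCH_SIZE);
        for (List<T> batch : batches) {
            insertInOneTransaction.accept(batch);
        }
        return batches.size();
    }

    public static void main(String[] args) {
        List<Integer> records = new ArrayList<>();
        for (int i = 0; i < 100000; i++) {
            records.add(i);
        }
        // Placeholder callback: a real one would write the batch to the database.
        int transactions = insertAll(records, batch -> { });
        System.out.println(transactions + " transactions");
    }
}
```

With 100000 records and a limit of 5000, the loop runs 20 times, so each transaction only ever holds one small set in memory.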
Batch insert method
This method achieves fast insertion regardless of the amount of data, with a good balance between speed and memory. It is suitable for importing a large amount of data in one go, for example during database initialization. The disadvantage is that the database must be paused while the data is imported through the BatchInserter interface, so the business cannot keep running uninterrupted.
Suggestion:
Adopt the large-to-small method. When more than 1000 records need to be inserted (imported), insert them in batches. This is fast and keeps memory usage from changing much, which avoids an OOM error.