Berkeley DB Basic Tutorial

Source: Internet
Author: User


I. Introduction to Berkeley DB

(1) Berkeley DB is an embedded database suited to managing very large volumes of simple data. For example, Google uses it to store account information, and Heritrix uses it to store its Frontier.

(2) Key/value pairs are the basis on which Berkeley DB manages data; each key/value pair represents one record.

(3) Berkeley DB uses a B-tree as its underlying storage structure. It can be thought of as a HashMap that is able to hold very large amounts of data.
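The "big HashMap over a B-tree" picture can be sketched in plain Java. This is only an in-memory analogy, not Berkeley DB code: the class name KeyValueSketch is hypothetical, a TreeMap stands in for the B-tree, and the byte-wise key order is an assumption made for the illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.TreeMap;

// In-memory analogy only: a sorted map standing in for a B-tree of key/value records.
class KeyValueSketch {
    // Keys and values are raw bytes; keys are compared byte by byte (unsigned).
    private final TreeMap<byte[], byte[]> records =
            new TreeMap<byte[], byte[]>(Arrays::compareUnsigned);

    // Stores one record; an existing record with the same key is replaced.
    void put(String key, String value) {
        records.put(key.getBytes(StandardCharsets.UTF_8),
                    value.getBytes(StandardCharsets.UTF_8));
    }

    // Returns the stored value, or null when no record has this key.
    String get(String key) {
        byte[] data = records.get(key.getBytes(StandardCharsets.UTF_8));
        return data == null ? null : new String(data, StandardCharsets.UTF_8);
    }

    // Returns the smallest key in sort order, or null when there are no records.
    String firstKey() {
        return records.isEmpty()
                ? null
                : new String(records.firstKey(), StandardCharsets.UTF_8);
    }

    int size() {
        return records.size();
    }

    public static void main(String[] args) {
        KeyValueSketch kv = new KeyValueSketch();
        kv.put("2", "Student2");
        kv.put("1", "Student1");
        System.out.println(kv.firstKey() + " -> " + kv.get(kv.firstKey()));
    }
}
```

Unlike a HashMap, the sorted structure also gives ordered traversal, which is what makes cursor-style iteration over the records possible.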

(4) It is an Oracle product. The C++ version appeared first, followed by the Java version and others.

It does not support SQL statements. The application operates on the database through the API.

The following content is reproduced from Baidu Library.

Berkeley DB is an open-source embedded database library developed by the US company Sleepycat Software. It provides scalable, high-performance, transaction-protected data management services for applications.

Berkeley DB provides a concise set of API functions for data access and management.

It is a toolkit in the classic C-library style that offers programmers a broad set of functions and is designed to give application developers industrial-strength database services.

The main features are as follows:
Embedded: The library is linked directly into the application and runs in the same address space, so database operations require no interprocess communication, whether between computers on a network or between processes on the same computer. Berkeley DB provides APIs for a variety of programming languages, including C, C++, Java, Perl, Tcl, Python, and PHP, and all database operations take place inside the library. Multiple processes, or multiple threads of the same process, can use the database at the same time as if each were using it alone. Underlying services such as locking, transaction logging, shared buffer management, and memory management are carried out transparently by the library.

Portable: It runs on nearly all UNIX and Linux systems and their variants, on Windows, and on a variety of embedded real-time operating systems. It runs on both 32-bit and 64-bit systems and is used in many high-end Internet servers, desktops, PDAs, set-top boxes, network switches, and other applications. Once Berkeley DB is linked into an application, the end user is generally unaware that a database system is present at all.

Scalable: This shows in several ways. The database library itself is very compact (less than 300KB of text space), yet it can manage databases up to 256TB in size. It supports a high degree of concurrency, with thousands of users able to operate on the same database at the same time. Berkeley DB has a small enough footprint to run in tightly constrained embedded systems, and on high-end servers it can make use of several gigabytes of memory and several terabytes of disk space.


Berkeley DB performs better than relational and object-oriented databases in embedded applications, for two reasons:
(1) Database operations require no interprocess communication, because the database library runs in the same address space as the application. The overhead of communication between processes on one machine, or between machines on a network, is much larger than the cost of a function call;
(2) Because Berkeley DB exposes all operations through a single set of API functions, there is no query language to parse and no execution plan to generate, which greatly improves execution efficiency.


Berkeley DB System Structure

Berkeley DB consists of five basic subsystems: the data access subsystem, the memory pool management subsystem, the transaction subsystem, the lock subsystem, and the log subsystem.

The data access subsystem is the core component inside the Berkeley DB library package, while the other subsystems sit outside it.
Each subsystem supports a different level of application.


1. Data Access Subsystem
The data access (Access Methods) subsystem provides several ways of creating and accessing database files.

Berkeley DB provides the following four file storage methods:
hash files, B-trees, fixed-length records (queues), and variable-length records (simple storage based on record numbers). The application can choose whichever file organization suits it best.

Programmers can use any of these structures when creating a table, and can mix files of different storage types in the same application.


When transaction management is not needed, the modules in this subsystem can be used on their own to give applications fast, efficient data access.
The data access subsystem suits applications that only need fast access to formatted files, without transactions.

2. Memory Pool Management Subsystem
The memory pool subsystem manages the shared buffers used by Berkeley DB. It allows multiple threads or multiple processes that access the database at the same time to share one cache, and it is responsible for writing modified pages back to the file and for allocating memory for newly paged-in pages. It can also be used independently of the rest of Berkeley DB, by an application that wants it to allocate memory for the application's own files and pages.

The Memory pool management subsystem is suitable for applications that require flexible, page-oriented, buffered shared file access.
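The behavior described for the memory pool (a shared page cache that writes modified pages back to the file) can be illustrated with a small sketch. Everything here is hypothetical: the class PageCacheSketch, the least-recently-used eviction rule, and the Map that stands in for the file are assumptions for illustration, not Berkeley DB's actual implementation.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical page cache: fixed capacity, LRU eviction, dirty pages written back.
class PageCacheSketch {
    private static final class Page {
        byte[] data;
        boolean dirty;
        Page(byte[] data) { this.data = data; }
    }

    private final int capacity;
    private final Map<Integer, byte[]> file;  // stands in for the on-disk file
    // accessOrder = true keeps the least recently used page first
    private final LinkedHashMap<Integer, Page> cache = new LinkedHashMap<>(16, 0.75f, true);

    int fileReads = 0;   // how many pages were loaded from the file
    int fileWrites = 0;  // how many pages were written back to the file

    PageCacheSketch(int capacity, Map<Integer, byte[]> file) {
        if (capacity < 1) throw new IllegalArgumentException("capacity must be >= 1");
        this.capacity = capacity;
        this.file = file;
    }

    // synchronized so that several threads can share one cache
    synchronized byte[] read(int pageNo) {
        return fetch(pageNo).data.clone();
    }

    synchronized void write(int pageNo, byte[] data) {
        Page page = fetch(pageNo);
        page.data = data.clone();
        page.dirty = true;  // changed in memory only, until evicted or flushed
    }

    // Writes every modified page back to the file.
    synchronized void flush() {
        for (Map.Entry<Integer, Page> e : cache.entrySet()) {
            writeBack(e.getKey(), e.getValue());
        }
    }

    private Page fetch(int pageNo) {
        Page page = cache.get(pageNo);
        if (page == null) {
            evictIfFull();
            byte[] onDisk = file.get(pageNo);
            fileReads++;
            page = new Page(onDisk == null ? new byte[0] : onDisk.clone());
            cache.put(pageNo, page);
        }
        return page;
    }

    private void evictIfFull() {
        if (cache.size() < capacity) return;
        Iterator<Map.Entry<Integer, Page>> it = cache.entrySet().iterator();
        Map.Entry<Integer, Page> eldest = it.next();
        writeBack(eldest.getKey(), eldest.getValue());
        it.remove();
    }

    private void writeBack(int pageNo, Page page) {
        if (page.dirty) {
            file.put(pageNo, page.data.clone());
            fileWrites++;
            page.dirty = false;
        }
    }

    public static void main(String[] args) {
        PageCacheSketch pool = new PageCacheSketch(2, new HashMap<>());
        pool.write(1, new byte[] {1});
        pool.write(2, new byte[] {2});
        pool.write(3, new byte[] {3});  // cache is full, so the oldest page is evicted
        System.out.println("pages written back so far: " + pool.fileWrites);
    }
}
```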


3. Transaction subsystem
The transaction subsystem provides transaction management for Berkeley DB.

It allows a set of changes to the database to be treated as one atomic unit: either all of the changes are applied or none of them are. By default the system provides strict ACID transaction properties, but an application may choose to relax the isolation guarantees. The subsystem uses two-phase locking and write-ahead logging to ensure the correctness and consistency of the data.

It can also be used by an application on its own to protect the application's own data updates. The transaction subsystem suits applications whose data changes need transactional guarantees.
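The all-or-nothing property can be illustrated with a toy example. This sketch buffers changes and applies them only at commit, which is a deliberate simplification and not how Berkeley DB works internally (the text above names two-phase locking and write-ahead logging). The class AtomicBatchSketch and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration of atomicity: changes become visible together at commit, or never.
class AtomicBatchSketch {
    private final Map<String, String> store = new HashMap<>();

    final class Txn {
        private final Map<String, String> pending = new HashMap<>();
        private boolean open = true;

        // Records a change inside the transaction; nothing is visible yet.
        void put(String key, String value) {
            checkOpen();
            pending.put(key, value);
        }

        // Applies every pending change in one step.
        void commit() {
            checkOpen();
            synchronized (store) {
                store.putAll(pending);
            }
            open = false;
        }

        // Discards every pending change.
        void abort() {
            checkOpen();
            pending.clear();
            open = false;
        }

        private void checkOpen() {
            if (!open) throw new IllegalStateException("transaction is closed");
        }
    }

    Txn begin() {
        return new Txn();
    }

    String get(String key) {
        synchronized (store) {
            return store.get(key);
        }
    }

    public static void main(String[] args) {
        AtomicBatchSketch db = new AtomicBatchSketch();
        AtomicBatchSketch.Txn txn = db.begin();
        txn.put("1", "Student1");
        txn.put("2", "Student2");
        txn.abort();
        System.out.println("after abort: " + db.get("1"));
    }
}
```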

4. Lock Subsystem
The lock (Locking) subsystem provides the locking mechanism for Berkeley DB: shared control over an object that allows many concurrent readers but only one writer at a time. The data access subsystem can use it to obtain read and write permission on pages or records, and the transaction subsystem uses it to implement concurrency control across multiple transactions. It can also be used by an application on its own. The lock subsystem suits applications that need a flexible, fast, configurable lock manager.
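The many-readers, one-writer rule is the same discipline that the JDK's ReentrantReadWriteLock enforces, so it can be shown without Berkeley DB at all. The class LockedCounterSketch below is a hypothetical stand-in for a shared page or record; it says nothing about how Berkeley DB's lock manager is built.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Stand-in for a shared object: many readers may hold the lock, writers are exclusive.
class LockedCounterSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private long count = 0;

    // Shared (read) access: several threads may be in here at once.
    long read() {
        lock.readLock().lock();
        try {
            return count;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Exclusive (write) access: only one thread at a time, and no readers meanwhile.
    void increment() {
        lock.writeLock().lock();
        try {
            count++;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Starts several writer threads and returns the final count once all have finished.
    static long runWriters(int threads, int incrementsPerThread) {
        LockedCounterSketch counter = new LockedCounterSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for writers", e);
            }
        }
        return counter.read();
    }

    public static void main(String[] args) {
        System.out.println("final count: " + runWriters(4, 1000));
    }
}
```

Because every update happens under the write lock, no increment is lost even though four threads update the same value.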

5. Log Subsystem
The log (Logging) subsystem implements the write-ahead logging policy. It supports the transaction subsystem in recovering data and ensuring data consistency. It is rarely used by an application on its own and normally serves only as a module called by the transaction subsystem.
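Write-ahead logging means that a change is recorded in the log before the data itself is modified, so that the data can be rebuilt after a failure. The sketch below shows the idea with in-memory lists; the class WriteAheadLogSketch, its record layout, and the simulated crash are all assumptions made for illustration and do not reflect Berkeley DB's log format.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy write-ahead log: every change is appended to the log before it touches the data.
class WriteAheadLogSketch {
    // Stands in for the durable log file. Records: {"PUT", txId, key, value} or {"COMMIT", txId}.
    private final List<String[]> log = new ArrayList<>();
    // Stands in for cached data that would be lost in a crash.
    private Map<String, String> data = new HashMap<>();

    void put(int txId, String key, String value) {
        log.add(new String[] {"PUT", String.valueOf(txId), key, value});  // log first
        data.put(key, value);                                             // then change the data
    }

    void commit(int txId) {
        log.add(new String[] {"COMMIT", String.valueOf(txId)});
    }

    // Simulates a crash followed by recovery: the data is rebuilt from the log,
    // replaying only the changes of transactions that reached their COMMIT record.
    void crashAndRecover() {
        Set<String> committed = new HashSet<>();
        for (String[] record : log) {
            if (record[0].equals("COMMIT")) {
                committed.add(record[1]);
            }
        }
        Map<String, String> rebuilt = new HashMap<>();
        for (String[] record : log) {
            if (record[0].equals("PUT") && committed.contains(record[1])) {
                rebuilt.put(record[2], record[3]);
            }
        }
        data = rebuilt;
    }

    String get(String key) {
        return data.get(key);
    }

    public static void main(String[] args) {
        WriteAheadLogSketch db = new WriteAheadLogSketch();
        db.put(1, "1", "Student1");
        db.commit(1);
        db.put(2, "2", "Student2");  // transaction 2 never commits
        db.crashAndRecover();
        System.out.println(db.get("1") + " / " + db.get("2"));
    }
}
```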

Together, the parts above make up the complete Berkeley DB database system. The relationships among them are as follows:

In this model, the application calls the data access subsystem and the transaction subsystem directly, and these in turn call the lower-level memory pool, lock, and log subsystems.

Because the subsystems are relatively independent, the application can specify at startup which data management services it will use. It can use all of them or only some.

For example, suppose an application needs to support concurrent operations by multiple users but has no need for transaction management. It can simply use the lock subsystem without the transaction subsystem. Other applications may need a fast, single-user B-tree storage structure without transaction management. Such an application can disable the lock subsystem and the transaction subsystem, which reduces overhead.


Berkeley DB Storage Features Overview

The logical unit of organization for data managed by Berkeley DB is a set of independent or related databases, each of which consists of a number of records, all represented as (key, value) pairs. If a set of related (key, value) pairs is regarded as a table, then each database can store only one table, which differs from a typical relational database. In effect, a "database" in Berkeley DB is equivalent to a table in a typical relational database system, and a "key/data" pair is equivalent to a row. Berkeley DB does not provide direct access to columns as a relational database does. Instead, the application itself encapsulates the fields (columns) inside the data item of the "key/data" pair.
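Since the database sees each data item only as bytes, the application decides how fields are laid out inside it. One possible layout is sketched below using the JDK's DataOutputStream. The record type StudentRecordSketch and its three fields are hypothetical, chosen only to show the round trip from fields to bytes and back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical record: three "columns" packed by the application into one data item.
class StudentRecordSketch {
    final String name;
    final int age;
    final double score;

    StudentRecordSketch(String name, int age, double score) {
        this.name = name;
        this.age = age;
        this.score = score;
    }

    // Packs the fields, in a fixed order, into the bytes stored as the "data" of a record.
    byte[] toBytes() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(name);
            out.writeInt(age);
            out.writeDouble(score);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Unpacks the fields in the same order in which they were written.
    static StudentRecordSketch fromBytes(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String name = in.readUTF();
            int age = in.readInt();
            double score = in.readDouble();
            return new StudentRecordSketch(name, age, score);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = new StudentRecordSketch("Student1", 20, 88.5).toBytes();
        StudentRecordSketch copy = StudentRecordSketch.fromBytes(data);
        System.out.println(copy.name + ", " + copy.age + ", " + copy.score);
    }
}
```

The database never interprets these bytes, so reader and writer must agree on the field order and types.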
Physically, when a database is created the application can choose a storage structure that suits the characteristics of its data. The four file storage structures to choose from are: hash files, B-trees, fixed-length records (queues), and variable-length records (simple storage based on record numbers).

A physical file may hold a single database, or it may hold several related or unrelated databases. Except for queues, these databases may use any mix of storage structures. A queue database must be stored alone in its own file and cannot be mixed with other storage types.

Apart from the limits imposed by maximum file length and available storage, a file can in theory hold any number of databases. The system therefore usually needs two parameters to locate a database: the "file name" and the "database name". This is another way in which Berkeley DB differs from a typical relational database.


The Berkeley DB storage system provides a set of interface functions with which an application manages and manipulates databases. These include:

(1) Creating, opening, closing, deleting, and renaming databases, as well as retrieving, adding, and deleting data;
(2) Auxiliary functions, such as reading database status information, reading information about the file, reading information about the database environment, emptying the contents of a database, synchronizing and backing up a database, upgrading versions, and reporting error messages;
(3) A cursor mechanism for accessing and traversing groups of records, and for associating and equi-joining two or more related databases;
(4) Interface functions for tuning the access configuration. For example, an application can set its own B-tree sort function, the minimum number of keys stored on each page, the fill factor of the hash buckets, the hash function, the maximum size of the hash table, the maximum length of the queue, the byte order in which the database is stored, the size of the underlying storage page, the memory allocation function, the cache size, the size of fixed-length records and their padding byte, the delimiter used for variable-length records, and so on.
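A custom B-tree sort function matters as soon as keys are not plain text. The sketch below uses a TreeMap as a stand-in for the B-tree to show how numeric keys stored as strings sort under a byte-wise order versus an application-supplied comparator. The class KeyOrderSketch and both comparators are hypothetical. In the Java Edition such a comparator would, to my knowledge, be registered through DatabaseConfig.setBtreeComparator, but check that against the documentation for your version.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Shows how the choice of key comparator changes the order in which records are stored.
class KeyOrderSketch {
    // Byte-wise order: keys are compared byte by byte, so "10" sorts before "2".
    static final Comparator<byte[]> BYTEWISE = Arrays::compareUnsigned;

    // Custom order: keys are decoded as decimal integers before being compared.
    static final Comparator<byte[]> NUMERIC =
            (a, b) -> Long.compare(toLong(a), toLong(b));

    private static long toLong(byte[] key) {
        return Long.parseLong(new String(key, StandardCharsets.UTF_8));
    }

    // Inserts the keys into a sorted map (the B-tree stand-in) and reads them back in order.
    static List<String> sortedKeys(Comparator<byte[]> order, String... keys) {
        TreeMap<byte[], Boolean> tree = new TreeMap<>(order);
        for (String key : keys) {
            tree.put(key.getBytes(StandardCharsets.UTF_8), Boolean.TRUE);
        }
        List<String> result = new ArrayList<>();
        for (byte[] key : tree.keySet()) {
            result.add(new String(key, StandardCharsets.UTF_8));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sortedKeys(BYTEWISE, "2", "10", "1"));  // prints [1, 10, 2]
        System.out.println(sortedKeys(NUMERIC, "2", "10", "1"));   // prints [1, 2, 10]
    }
}
```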



II. Application of Berkeley DB

1. From the official site, http://www.oracle.com/technetwork/database/database-technologies/berkeleydb/overview/index.html, download the Berkeley DB installation file and the Java development package.

2. Install Berkeley DB on Windows by clicking Next through the installer. The Windows version is installed here for ease of development; the Linux version should be used for production.

(If an error occurs while setting PATH, run the installer as an administrator.)

3. Add the JAR files from the Java development package to the build path. The main ones are je-6.0.11.jar, JEJConsole.jar, and Epjejconsole.jar.


Test program:

package com.ljh.test;

import static org.junit.Assert.*;

import org.junit.After;
import org.junit.Before;
import org.junit.Test;

// Note: the read, count, and delete tests assume the write test has already
// populated the database with keys 0 to 9.
public class BerkeleyDBUtilTest {

    private BerkeleyDBUtil dbUtil = null;

    @Before
    public void setup() {
        dbUtil = new BerkeleyDBUtil("D:/tmp");
    }

    @Test
    public void testWriteToDatabase() {
        for (int i = 0; i < 10; i++) {
            dbUtil.writeToDatabase(i + "", "Student" + i, true);
        }
    }

    @Test
    public void testReadFromDatabase() {
        String value = dbUtil.readFromDatabase("2");
        assertEquals("Student2", value);
    }

    @Test
    public void testGetEveryItem() {
        int size = dbUtil.getEveryItem().size();
        assertEquals(10, size);
    }

    @Test
    public void testDeleteFromDatabase() {
        dbUtil.deleteFromDatabase("4");
        assertEquals(9, dbUtil.getEveryItem().size());
    }

    // Close the database and environment after each test
    @After
    public void cleanup() {
        dbUtil.closeDB();
    }
}


Basic operations of Berkeley DB:

The utility class covers the following operations:

(1) Opening the database

(2) Writing data to the database

(3) Reading a record by its key

(4) Reading the full list of records

(5) Deleting a record by its key

(6) Closing the database

Note: because the individual operations may all act on the same database, it is worth considering whether the utility class should be a singleton.
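One way to answer the singleton question is to keep a single shared instance per environment directory, so that repeated lookups reuse one open handle. The following is a minimal sketch under that assumption. SharedDbUtilSketch is a hypothetical class, and the counter merely marks the place where a real utility would open its Environment and Database.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical holder: at most one shared instance per home directory.
final class SharedDbUtilSketch {
    private static final Map<String, SharedDbUtilSketch> INSTANCES = new ConcurrentHashMap<>();

    // Counts how many times the (simulated) open step has run.
    static final AtomicInteger OPEN_COUNT = new AtomicInteger();

    private final String homeDirectory;

    // Private constructor: callers must go through getInstance().
    private SharedDbUtilSketch(String homeDirectory) {
        this.homeDirectory = homeDirectory;
        // A real utility class would open its environment and database here.
        OPEN_COUNT.incrementAndGet();
    }

    // Returns the shared instance for this directory, creating it on first use.
    static SharedDbUtilSketch getInstance(String homeDirectory) {
        return INSTANCES.computeIfAbsent(homeDirectory, SharedDbUtilSketch::new);
    }

    String homeDirectory() {
        return homeDirectory;
    }

    public static void main(String[] args) {
        SharedDbUtilSketch first = getInstance("D:/tmp");
        SharedDbUtilSketch second = getInstance("D:/tmp");
        System.out.println("same instance: " + (first == second));
    }
}
```

With a shared instance, closing the database becomes a decision for the whole application rather than for each caller, which is the main cost of this design.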

package com.ljh.test;

import java.io.File;
import java.io.UnsupportedEncodingException;
import java.util.ArrayList;

import com.sleepycat.je.Cursor;
import com.sleepycat.je.CursorConfig;
import com.sleepycat.je.Database;
import com.sleepycat.je.DatabaseConfig;
import com.sleepycat.je.DatabaseEntry;
import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentConfig;
import com.sleepycat.je.LockConflictException;
import com.sleepycat.je.LockMode;
import com.sleepycat.je.OperationStatus;
import com.sleepycat.je.Transaction;
import com.sleepycat.je.TransactionConfig;

public class BerkeleyDBUtil {

    // Database environment
    private Environment env = null;

    // Database
    private static Database frontierDatabase = null;

    // Database name
    private static String dbName = "frontier_database";

    public BerkeleyDBUtil(String homeDirectory) {
        // 1. Create the EnvironmentConfig
        EnvironmentConfig envConfig = new EnvironmentConfig();
        envConfig.setTransactional(true);
        envConfig.setAllowCreate(true);

        // 2. Use the EnvironmentConfig to configure the Environment
        env = new Environment(new File(homeDirectory), envConfig);

        // 3. Create the DatabaseConfig
        DatabaseConfig dbConfig = new DatabaseConfig();
        dbConfig.setTransactional(true);
        dbConfig.setAllowCreate(true);

        // 4. Use the Environment and DatabaseConfig to open the Database
        frontierDatabase = env.openDatabase(null, dbName, dbConfig);
    }

    /*
     * Write a record to the database, passing in the key and value.
     * If an existing key may be overwritten, use put() directly;
     * if it may not, use putNoOverwrite().
     */
    public boolean writeToDatabase(String key, String value, boolean isOverwrite) {
        Transaction txn = null;
        try {
            // Set the key/value; note that DatabaseEntry holds byte arrays
            DatabaseEntry theKey = new DatabaseEntry(key.getBytes("UTF-8"));
            DatabaseEntry theData = new DatabaseEntry(value.getBytes("UTF-8"));
            OperationStatus status = null;

            // 1. Configure the transaction
            TransactionConfig txConfig = new TransactionConfig();
            txConfig.setSerializableIsolation(true);
            txn = env.beginTransaction(null, txConfig);

            // 2. Write the data
            if (isOverwrite) {
                status = frontierDatabase.put(txn, theKey, theData);
            } else {
                status = frontierDatabase.putNoOverwrite(txn, theKey, theData);
            }
            txn.commit();
            txn = null; // committed, so there is nothing left to abort

            if (status == OperationStatus.SUCCESS) {
                System.out.println("Wrote to database " + dbName + ": " + key + ", " + value);
                return true;
            } else if (status == OperationStatus.KEYEXIST) {
                System.out.println("Write to database " + dbName + ": " + key + ", " + value
                        + " failed, the key already exists");
                return false;
            } else {
                System.out.println("Write to database " + dbName + ": " + key + ", " + value + " failed");
                return false;
            }
        } catch (LockConflictException lockConflict) {
            System.out.println("Write to database " + dbName + ": " + key + ", " + value
                    + " hit a lock conflict");
            return false;
        } catch (Exception e) {
            // Error handling
            System.out.println("Write to database " + dbName + ": " + key + ", " + value + " raised an error");
            return false;
        } finally {
            // Abort only if the transaction was started but never committed
            if (txn != null) {
                txn.abort();
            }
        }
    }

    /*
     * Read data from the database: pass in the key, get back the value.
     */
    public String readFromDatabase(String key) {
        Transaction txn = null;
        try {
            DatabaseEntry theKey = new DatabaseEntry(key.getBytes("UTF-8"));
            DatabaseEntry theData = new DatabaseEntry();

            // 1. Configure the transaction
            TransactionConfig txConfig = new TransactionConfig();
            txConfig.setSerializableIsolation(true);
            txn = env.beginTransaction(null, txConfig);

            // 2. Read the data
            OperationStatus status = frontierDatabase.get(txn, theKey, theData, LockMode.DEFAULT);
            txn.commit();
            txn = null;

            if (status == OperationStatus.SUCCESS) {
                // 3. Convert the bytes to a String
                byte[] retData = theData.getData();
                String value = new String(retData, "UTF-8");
                System.out.println("Read from database " + dbName + ": " + key + ", " + value);
                return value;
            } else {
                System.out.println("No record found for key '" + key + "'.");
                return "";
            }
        } catch (LockConflictException lockConflict) {
            System.out.println("Read from database " + dbName + ": " + key + " hit a lock conflict");
            return "";
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
            return "";
        } finally {
            if (txn != null) {
                txn.abort();
            }
        }
    }

    /*
     * Traverse all records in the database and return the list of keys.
     */
    public ArrayList<String> getEveryItem() {
        System.out.println("=========== Traversing all data in database " + dbName + " ===========");
        Cursor myCursor = null;
        ArrayList<String> resultList = new ArrayList<String>();
        Transaction txn = null;
        try {
            txn = this.env.beginTransaction(null, null);
            CursorConfig cc = new CursorConfig();
            cc.setReadCommitted(true);
            myCursor = frontierDatabase.openCursor(txn, cc);
            DatabaseEntry foundKey = new DatabaseEntry();
            DatabaseEntry foundData = new DatabaseEntry();

            // Walk the cursor with getFirst() and getNext() to collect the data
            OperationStatus status = myCursor.getFirst(foundKey, foundData, LockMode.DEFAULT);
            while (status == OperationStatus.SUCCESS) {
                String theKey = new String(foundKey.getData(), "UTF-8");
                String theData = new String(foundData.getData(), "UTF-8");
                resultList.add(theKey);
                System.out.println("Key | Data: " + theKey + " | " + theData);
                status = myCursor.getNext(foundKey, foundData, LockMode.DEFAULT);
            }
            myCursor.close();
            myCursor = null;
            txn.commit();
            txn = null;
            return resultList;
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
            return null;
        } catch (Exception e) {
            System.out.println("getEveryItem raised an exception");
            return null;
        } finally {
            // Close the cursor before aborting the transaction it belongs to
            if (myCursor != null) {
                myCursor.close();
            }
            if (txn != null) {
                txn.abort();
            }
        }
    }

    /*
     * Delete a record from the database by key, retrying up to three times
     * when a lock conflict occurs.
     */
    public boolean deleteFromDatabase(String key) {
        long sleepMillis = 0;
        for (int i = 0; i < 3; i++) {
            if (sleepMillis != 0) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                sleepMillis = 0;
            }
            Transaction txn = null;
            try {
                // 1. Configure the transaction
                TransactionConfig txConfig = new TransactionConfig();
                txConfig.setSerializableIsolation(true);
                txn = env.beginTransaction(null, txConfig);
                DatabaseEntry theKey = new DatabaseEntry(key.getBytes("UTF-8"));

                // 2. Delete the data and commit
                OperationStatus res = frontierDatabase.delete(txn, theKey);
                txn.commit();
                txn = null;

                if (res == OperationStatus.SUCCESS) {
                    System.out.println("Deleted from database " + dbName + ": " + key);
                    return true;
                } else if (res == OperationStatus.KEYEMPTY) {
                    System.out.println("Not found in database " + dbName + ": " + key + ". Cannot delete");
                } else {
                    System.out.println("Delete operation failed because " + res.toString());
                }
                return false;
            } catch (UnsupportedEncodingException e) {
                e.printStackTrace();
                return false;
            } catch (LockConflictException lockConflict) {
                System.out.println("Delete operation failed with a lock conflict");
                sleepMillis = 1000;
                continue;
            } finally {
                // Abort only if the transaction was started but never committed
                if (txn != null) {
                    txn.abort();
                }
            }
        }
        return false;
    }

    public void closeDB() {
        if (frontierDatabase != null) {
            frontierDatabase.close();
        }
        if (env != null) {
            env.close();
        }
    }
}
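The delete method above retries when it hits a lock conflict. That retry-with-pause pattern can be factored into a small helper, sketched here with plain Java. The names RetrySketch, withRetries, and ConflictException are hypothetical. ConflictException is only a stand-in for a lock-conflict error and is not a Berkeley DB class.

```java
import java.util.function.Supplier;

// Generic retry helper: re-runs an operation a limited number of times after a conflict.
class RetrySketch {
    // Hypothetical stand-in for a lock-conflict error.
    static class ConflictException extends RuntimeException {
        ConflictException(String message) {
            super(message);
        }
    }

    // Runs the operation, retrying after a pause when it reports a conflict.
    // Rethrows the last conflict once maxAttempts attempts have failed.
    static <T> T withRetries(int maxAttempts, long sleepMillis, Supplier<T> operation) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        ConflictException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.get();
            } catch (ConflictException e) {
                last = e;
                if (attempt < maxAttempts) {
                    pause(sleepMillis);
                }
            }
        }
        throw last;
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting to retry", e);
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = withRetries(3, 10, () -> {
            if (++calls[0] < 3) {
                throw new ConflictException("lock conflict");
            }
            return "deleted";
        });
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Each attempt should run in a fresh transaction, as the delete method does, because a transaction that hit a conflict has to be aborted before the work is tried again.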




