The Thinking Logic of Computer Programs (60): Random-Access Files and a Simple Key-Value Database
Section 57 introduced byte streams, and section 58 introduced character streams. Both read and write files in stream mode, which has several restrictions:
- A stream either reads or writes, never both.
- Random access is not possible. A stream can only be read from beginning to end and cannot be re-read. Partial re-reading can be achieved through buffering, but with limits.
Java also has a class, RandomAccessFile, that has neither restriction. It can read, write, and do both at arbitrary positions. It is a wrapper class that sits closer to the operating system API.
This section introduces the class and then applies it to implement a simple key-value database. How can we implement a database? Let's first look at how RandomAccessFile is used.
RandomAccessFile
Constructor
RandomAccessFile has the following constructors:
public RandomAccessFile(String name, String mode) throws FileNotFoundException
public RandomAccessFile(File file, String mode) throws FileNotFoundException
The parameters name and file are straightforward: the file path and a File object. The parameter mode is the open mode, which can take four values:
- "r": read only
- "rw": read and write
- "rws": like "rw", and in addition every update to the file's content or metadata is written synchronously to the device
- "rwd": like "rw", and in addition every update to the file's content is written synchronously to the device. Unlike "rws", metadata updates need not be synchronized.
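The difference between "rw" and "r" can be seen in a small sketch. The class and method names here are hypothetical and only serve the illustration: "rw" creates a missing file, while "r" refuses to open one that does not exist.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.RandomAccessFile;

final class OpenModeDemo {
    /** Writes one int in "rw" mode, then reopens the file in "r" mode and reads it back. */
    static int writeThenRead(File file, int value) throws IOException {
        // "rw" creates the file if it does not exist yet.
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.writeInt(value);
        }
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            return raf.readInt();
        }
    }

    /** Returns true if opening a missing file in "r" mode throws FileNotFoundException. */
    static boolean readOnlyOpenFails(File missing) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(missing, "r")) {
            return false; // not reached when the file is missing
        } catch (FileNotFoundException e) {
            return true; // "r" never creates a file
        }
    }
}
```

The "rws" and "rwd" modes are opened the same way; they differ only in when updates reach the device.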
DataInput/DataOutput Interface
Although RandomAccessFile is not a subclass of InputStream or OutputStream, it has read and write methods similar to those of streams. It also implements the DataInput and DataOutput interfaces. These methods were introduced earlier; a few are listed here as a reminder:
// Reads a single byte; returns 0 to 255, or -1 at end of file
public int read() throws IOException
public int read(byte b[]) throws IOException
public int read(byte b[], int off, int len) throws IOException
public final double readDouble() throws IOException
public final int readInt() throws IOException
public final String readUTF() throws IOException
public void write(int b) throws IOException
public final void writeInt(int v) throws IOException
public void write(byte b[]) throws IOException
public void write(byte b[], int off, int len) throws IOException
public final void writeUTF(String str) throws IOException
public void close() throws IOException
RandomAccessFile has two other read methods:
public final void readFully(byte b[]) throws IOException
public final void readFully(byte b[], int off, int len) throws IOException
Unlike the corresponding read methods, they guarantee that the requested number of bytes is read. If the end of the file is reached before enough bytes have been read, they throw an EOFException.
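A minimal sketch of the contrast, using a hypothetical helper class: the file holds only 3 bytes, and both methods are asked to fill an 8-byte buffer. read simply reports how many bytes it got, while readFully fails.

```java
import java.io.EOFException;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

final class ReadFullyDemo {
    /** Fills a file with 3 bytes, then asks read and readFully for 8 bytes each. */
    static String compare(File file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(0);
            raf.write(new byte[] {1, 2, 3});
            byte[] buf = new byte[8];

            raf.seek(0);
            int n = raf.read(buf);      // returns how many bytes were actually read (at most 3 here)
            raf.seek(0);
            try {
                raf.readFully(buf);     // needs all 8 bytes, but only 3 exist
                return "read=" + n + ", readFully=ok";
            } catch (EOFException e) {
                return "read=" + n + ", readFully=EOFException";
            }
        }
    }
}
```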
Random Access
RandomAccessFile keeps a file pointer that marks the current read/write position. Every read and write advances it automatically. Unlike a stream, RandomAccessFile lets you read this pointer and change it, using the following methods:
// Get the current file pointer
public native long getFilePointer() throws IOException;
// Move the file pointer to pos
public native void seek(long pos) throws IOException;
RandomAccessFile adjusts the file pointer through native methods that call the operating system API.
InputStream has a skip method that skips n bytes of the input stream; its default implementation actually reads the n bytes. RandomAccessFile has a similar method, but it is implemented by moving the file pointer:
public int skipBytes(int n) throws IOException
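The following sketch (hypothetical class and method names) shows the three methods working together. readIntAt jumps straight to the n-th int instead of reading everything before it, and pointerTrace records how the pointer moves after a write, a read, and a skip that would run past the end of the file.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

final class SeekDemo {
    /** Writes count ints (0, 10, 20, ...) and then reads the one at index directly. */
    static int readIntAt(File file, int count, int index) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(0);
            for (int i = 0; i < count; i++) {
                raf.writeInt(i * 10);
            }
            raf.seek((long) index * Integer.BYTES); // jump straight to the wanted int
            return raf.readInt();
        }
    }

    /** Returns {pointer after write, pointer after read, bytes skipped, pointer after skip}. */
    static long[] pointerTrace(File file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(0);
            raf.write(new byte[16]);
            long afterWrite = raf.getFilePointer(); // 16: the write advanced the pointer
            raf.seek(4);
            raf.readInt();
            long afterRead = raf.getFilePointer();  // 8: seek to 4, then 4 bytes read
            int skipped = raf.skipBytes(100);       // stops at end of file, so only 8 are skipped
            long afterSkip = raf.getFilePointer();  // 16
            return new long[] {afterWrite, afterRead, skipped, afterSkip};
        }
    }
}
```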
RandomAccessFile can return the length of the file in bytes directly:
public native long length() throws IOException;
It can also directly modify the file length:
public native void setLength(long newLength) throws IOException;
If the current file length is smaller than newLength, the file is extended, and the content of the extended part is undefined. If the current file length is greater than newLength, the file is truncated and the excess is discarded. If the file pointer was beyond newLength, it is set to newLength after the call.
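As a small illustration of these rules (the class name is hypothetical), the sketch below truncates a 100-byte file to 40 bytes and then extends it to 64, recording the pointer and the length along the way.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

final class TruncateDemo {
    /** Returns {pointer after truncate, length after truncate, length after extend, pointer after extend}. */
    static long[] resize(File file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(0);
            raf.write(new byte[100]);           // length 100, pointer 100
            raf.setLength(40);                  // truncate: pointer was past 40, so it moves to 40
            long pointerAfterTruncate = raf.getFilePointer();
            long lengthAfterTruncate = raf.length();
            raf.setLength(64);                  // extend: new bytes have undefined content
            return new long[] {
                pointerAfterTruncate, lengthAfterTruncate, raf.length(), raf.getFilePointer()
            };
        }
    }
}
```

Because the content of an extended region is undefined, code that relies on it should write the bytes explicitly, as BasicDB does with its padding.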
Methods to note
RandomAccessFile also has the following two methods:
public final void writeBytes(String s) throws IOException
public final String readLine() throws IOException
writeBytes appears to write a string directly, and readLine appears to read a string line by line. In fact both are problematic: they have no notion of encoding and assume that one byte represents one character, which is clearly wrong for Chinese text. Avoid both methods.
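The problem can be made visible with a short sketch. The helper class is hypothetical; it expects a non-empty string without line breaks. For a two-character Chinese string, writeBytes writes only 2 bytes (the high byte of each char is lost), an explicit UTF-8 encoding writes 6, and readLine then turns those 6 bytes into 6 meaningless characters.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

final class EncodingDemo {
    /** Returns {bytes written by writeBytes, bytes written as UTF-8, chars returned by readLine}. */
    static long[] compare(File file, String s) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(0);
            raf.writeBytes(s);               // one byte per char; the high byte is dropped
            long viaWriteBytes = raf.length();

            raf.setLength(0);
            raf.seek(0);
            raf.write(s.getBytes(StandardCharsets.UTF_8)); // explicit encoding
            long viaUtf8 = raf.length();

            raf.seek(0);
            String line = raf.readLine();    // turns every byte into one char
            return new long[] {viaWriteBytes, viaUtf8, line.length()};
        }
    }
}
```

A safer pattern is to encode with getBytes and a named charset when writing, and to decode with new String(bytes, charset) after readFully when reading.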
BasicDB Design
For everyday file reading and writing, streams are enough. In some system programs, however, streams are a poor fit; there, RandomAccessFile is closer to the operating system and is more convenient and efficient.
Next, let's look at how to use RandomAccessFile to implement a simple key-value database, which we call BasicDB.
Function
BasicDB offers an interface similar to Map: entries can be saved, looked up, and deleted by key, but the data is persisted to a file.
In addition, whereas HashMap and TreeMap keep all data in memory, BasicDB keeps only metadata such as index information in memory and stores the values in a file. Compared with HashMap and TreeMap, BasicDB can use far less memory and hold far more key-value pairs, especially when the values are relatively large. BasicDB stays efficient through its index and the random-access capability of RandomAccessFile.
Interface
Externally, BasicDB provides the following constructor:
public BasicDB(String path, String name) throws IOException
path is the directory that holds the database files; it must already exist. name is the name of the database. BasicDB uses two files whose names start with name: metadata is stored in the file with the suffix .meta, and the values of the key-value pairs are stored in the file with the suffix .data. For example, if name is student, the two files are student.meta and student.data. The files need not exist: if they are missing, a new database is created; if they exist, the existing database is loaded.
BasicDB provides the following public methods:
// Save a key-value pair; the key is a String and the value is a byte array
public void put(String key, byte[] value) throws IOException
// Get the value for a key; returns null if the key does not exist
public byte[] get(String key) throws IOException
// Delete by key
public void remove(String key)
// Ensure that all data is saved to the files
public void flush() throws IOException
// Close the database
public void close() throws IOException
To keep the implementation simple, we assume that a value's byte array is at most 1020 bytes long; a longer one causes an exception. This limit can of course be adjusted in the code.
After put and remove are called, the changes are not necessarily reflected in the files right away. To make sure they are saved, call flush.
Use
BasicDB's values are byte arrays, which may look restrictive and inconvenient. The main goal is simplicity, and any data can be converted to a byte array. For a string, use the getBytes() method; for an object, use the streams introduced earlier to convert it to a byte array.
For example, to save some student information to the database, the code can be:
private static byte[] toBytes(Student student) throws IOException {
    ByteArrayOutputStream bout = new ByteArrayOutputStream();
    DataOutputStream dout = new DataOutputStream(bout);
    dout.writeUTF(student.getName());
    dout.writeInt(student.getAge());
    dout.writeDouble(student.getScore());
    return bout.toByteArray();
}

public static void saveStudents(Map<String, Student> students) throws IOException {
    BasicDB db = new BasicDB("./", "students");
    for (Map.Entry<String, Student> kv : students.entrySet()) {
        db.put(kv.getKey(), toBytes(kv.getValue()));
    }
    db.close();
}
This stores the student information in the students database in the current directory. The toBytes method converts a Student to a byte array.
Later sections introduce serialization. With serialization, the byte array could be replaced by any serializable object; even if the byte array is kept, toBytes becomes more concise with serialization.
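Reading a student back requires the inverse of toBytes. The sketch below is one way to write it, assuming a Student with a name, an age, and a score as used above. The nested Student class and the fromBytes method are hypothetical stand-ins, not part of the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

final class StudentCodec {
    /** Hypothetical stand-in for the article's Student class. */
    static final class Student {
        final String name;
        final int age;
        final double score;

        Student(String name, int age, double score) {
            this.name = name;
            this.age = age;
            this.score = score;
        }
    }

    /** Same field order as the article's toBytes: name, age, score. */
    static byte[] toBytes(Student student) throws IOException {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        DataOutputStream dout = new DataOutputStream(bout);
        dout.writeUTF(student.name);
        dout.writeInt(student.age);
        dout.writeDouble(student.score);
        return bout.toByteArray();
    }

    /** Reads the fields back in exactly the order they were written. */
    static Student fromBytes(byte[] bytes) throws IOException {
        DataInputStream din = new DataInputStream(new ByteArrayInputStream(bytes));
        String name = din.readUTF();
        int age = din.readInt();
        double score = din.readDouble();
        return new Student(name, age, score);
    }
}
```

With such a pair, a value returned by db.get(key) could be passed to fromBytes to rebuild the object.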
Design
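We adopt the following simple design:
- Each key-value pair is split in two. The value is stored in the .data file; the key and the position of the value in the .data file are stored in the .meta file.
- In the .data file, every value occupies a fixed 1024 bytes: the first 4 bytes hold the actual length, followed by the actual content, followed by zero bytes as padding when the content is shorter than 1020 bytes.
- Index information is kept both in the .meta file and in memory. It is read into memory in full at initialization, and updates to the index reach the file only when flush is called.
- Deleting a key-value pair does not modify the .data file. The entry is removed from the index and its position is recorded as free space, to be reused by a later put. The free-space information is also stored in the .meta file.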
For the moment, we do not consider consistency issues caused by concurrent access and abnormal shutdown.
This design is obviously rough and mainly used to demonstrate some basic concepts. Let's look at the code below.
Implementation of BasicDB
Internal components
BasicDB has the following static variables:
private static final int MAX_DATA_LENGTH = 1020;
// Padding bytes
private static final byte[] ZERO_BYTES = new byte[MAX_DATA_LENGTH];
// Suffix of the data file
private static final String DATA_SUFFIX = ".data";
// Suffix of the metadata file, which holds the index and the free-space information
private static final String META_SUFFIX = ".meta";
The in-memory data structures for the index and the free space are:
// Index: key -> position of the value in the .data file
Map<String, Long> indexMap;
// Free space: positions in the .data file
Queue<Long> gaps;
The files are represented by:
// Value data file
RandomAccessFile db;
// Metadata file
File metaFile;
Constructor
The constructor code is:
public BasicDB(String path, String name) throws IOException {
    File dataFile = new File(path + name + DATA_SUFFIX);
    metaFile = new File(path + name + META_SUFFIX);
    db = new RandomAccessFile(dataFile, "rw");
    if (metaFile.exists()) {
        loadMeta();
    } else {
        indexMap = new HashMap<>();
        gaps = new ArrayDeque<>();
    }
}
When the metadata file exists, loadMeta is called to load the metadata into memory. For now, assume that it does not exist and look at the other code first. Note that path is concatenated directly with name, so it should end with a file separator, as in "./".
Save key-value pairs
The code for the put method is:
public void put(String key, byte[] value) throws IOException {
    Long index = indexMap.get(key);
    if (index == null) {
        index = nextAvailablePos();
        indexMap.put(key, index);
    }
    writeData(index, value);
}
It first looks up the key in the index. If the key is not there, it calls nextAvailablePos() to find a storage position for the value and saves the key and the position in the index. Finally, it calls writeData to write the value to the data file.
The code for the nextAvailablePos method is:
private long nextAvailablePos() throws IOException {
    if (!gaps.isEmpty()) {
        return gaps.poll();
    } else {
        return db.length();
    }
}
It first looks for free space. If there is any, it is reused; otherwise the new position is the end of the file.
The writeData method actually writes value data. Its code is:
private void writeData(long pos, byte[] data) throws IOException {
    if (data.length > MAX_DATA_LENGTH) {
        throw new IllegalArgumentException("maximum allowed length is "
                + MAX_DATA_LENGTH + ", data length is " + data.length);
    }
    db.seek(pos);
    db.writeInt(data.length);
    db.write(data);
    db.write(ZERO_BYTES, 0, MAX_DATA_LENGTH - data.length);
}
It first checks the length. If the length is acceptable, it seeks to the given position, writes the actual data length, writes the content, and finally writes the padding.
Note that in this implementation the index and the free-space information are not saved to the file in real time. To save them, call flush, which we look at later.
Get value based on key
The get method code is:
public byte[] get(String key) throws IOException {
    Long index = indexMap.get(key);
    if (index != null) {
        return getData(index);
    }
    return null;
}
If the key exists, getData is called to fetch the data. The code of getData is:
private byte[] getData(long pos) throws IOException {
    db.seek(pos);
    int length = db.readInt();
    byte[] data = new byte[length];
    db.readFully(data);
    return data;
}
The code is also simple: seek to the given position, read the actual length, and then call readFully to read exactly that many bytes.
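The write and read paths can be tried in isolation. The sketch below is a stripped-down, hypothetical SlotFileDemo that mirrors writeData and getData outside of BasicDB. Each value occupies one slot of 4 + 1020 = 1024 bytes, so overwriting a slot with a shorter value never changes the file length or disturbs its neighbours.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

final class SlotFileDemo {
    static final int MAX_DATA_LENGTH = 1020;
    // 4-byte length prefix plus the padded data area: 1024 bytes per slot
    static final int SLOT_SIZE = Integer.BYTES + MAX_DATA_LENGTH;

    /** Writes length, data, and zero padding so that the slot always spans SLOT_SIZE bytes. */
    static void writeSlot(RandomAccessFile db, long pos, byte[] data) throws IOException {
        if (data.length > MAX_DATA_LENGTH) {
            throw new IllegalArgumentException("maximum allowed length is "
                    + MAX_DATA_LENGTH + ", data length is " + data.length);
        }
        db.seek(pos);
        db.writeInt(data.length);
        db.write(data);
        db.write(new byte[MAX_DATA_LENGTH - data.length]); // zero padding
    }

    /** Reads the length prefix, then exactly that many bytes of content. */
    static byte[] readSlot(RandomAccessFile db, long pos) throws IOException {
        db.seek(pos);
        byte[] data = new byte[db.readInt()];
        db.readFully(data);
        return data;
    }
}
```

The fixed slot size wastes space for small values, but it lets any slot be rewritten in place and reused for any other value.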
Delete a key-value pair
The code of the remove method is:
public void remove(String key) {
    Long index = indexMap.remove(key);
    if (index != null) {
        gaps.offer(index);
    }
}
It removes the key from the index and adds the freed position to the free-space queue.
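How remove and nextAvailablePos cooperate can be modelled without any file at all. The following SlotAllocator is a hypothetical in-memory model, not part of BasicDB: it tracks a simulated file length and hands out freed positions before growing the file.

```java
import java.util.ArrayDeque;
import java.util.Queue;

final class SlotAllocator {
    private final Queue<Long> gaps = new ArrayDeque<>();
    private final int slotSize;
    private long fileLength; // simulated length of the data file

    SlotAllocator(int slotSize) {
        this.slotSize = slotSize;
    }

    /** Reuses the oldest freed position if there is one, else appends a new slot at the end. */
    long allocate() {
        Long gap = gaps.poll();
        if (gap != null) {
            return gap;
        }
        long pos = fileLength;
        fileLength += slotSize; // in BasicDB the file grows when writeData writes the slot
        return pos;
    }

    /** Marks a position as free space for later reuse. */
    void free(long pos) {
        gaps.offer(pos);
    }
}
```

With a slot size of 1024, three allocations return 0, 1024, and 2048; after free(1024), the next allocation returns 1024 again, and the one after that returns 3072.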
Synchronize metadata: flush
The code of the flush method is:
public void flush() throws IOException {
    saveMeta();
    db.getFD().sync();
}
Recall that getFD() returns the file descriptor, whose sync method ensures that the file content is saved to the device. The code of the saveMeta method is:
private void saveMeta() throws IOException {
    DataOutputStream out = new DataOutputStream(
            new BufferedOutputStream(new FileOutputStream(metaFile)));
    try {
        saveIndex(out);
        saveGaps(out);
    } finally {
        out.close();
    }
}
The index and the free-space information are saved in the same file. saveIndex saves the index; its code is:
private void saveIndex(DataOutputStream out) throws IOException {
    out.writeInt(indexMap.size());
    for (Map.Entry<String, Long> entry : indexMap.entrySet()) {
        out.writeUTF(entry.getKey());
        out.writeLong(entry.getValue());
    }
}
It first saves the number of key-value pairs and then, for each index entry, the key and the position of its value in the .data file.
saveGaps saves the free-space information. Its code is:
private void saveGaps(DataOutputStream out) throws IOException {
    out.writeInt(gaps.size());
    for (Long pos : gaps) {
        out.writeLong(pos);
    }
}
Again the count is saved first, followed by each free position.
We used the streams introduced earlier for saving, and the code is somewhat verbose. With the serialization described in later sections, it would be more concise.
Load metadata
The constructor mentioned the loadMeta method. It is the inverse of saveMeta, and its code is:
private void loadMeta() throws IOException {
    DataInputStream in = new DataInputStream(
            new BufferedInputStream(new FileInputStream(metaFile)));
    try {
        loadIndex(in);
        loadGaps(in);
    } finally {
        in.close();
    }
}
loadIndex loads the index. Its code is:
private void loadIndex(DataInputStream in) throws IOException {
    int size = in.readInt();
    indexMap = new HashMap<String, Long>((int) (size / 0.75f) + 1, 0.75f);
    for (int i = 0; i < size; i++) {
        String key = in.readUTF();
        long index = in.readLong();
        indexMap.put(key, index);
    }
}
loadGaps loads the free-space information. Its code is:
private void loadGaps(DataInputStream in) throws IOException {
    int size = in.readInt();
    gaps = new ArrayDeque<>(size);
    for (int i = 0; i < size; i++) {
        long index = in.readLong();
        gaps.add(index);
    }
}
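The metadata format written by saveMeta and read by loadMeta can be checked with an in-memory round trip. The sketch below is a hypothetical MetaCodec that uses the same layout (entry count, then key and position for each entry, then gap count, then each gap) but writes to a byte array instead of the .meta file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Collection;
import java.util.Map;

final class MetaCodec {
    /** Encodes the index and the free positions in the order saveIndex and saveGaps use. */
    static byte[] save(Map<String, Long> index, Collection<Long> gaps) throws IOException {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bout)) {
            out.writeInt(index.size());
            for (Map.Entry<String, Long> entry : index.entrySet()) {
                out.writeUTF(entry.getKey());
                out.writeLong(entry.getValue());
            }
            out.writeInt(gaps.size());
            for (Long pos : gaps) {
                out.writeLong(pos);
            }
        }
        return bout.toByteArray();
    }

    /** Decodes into the given collections; must read in the same order save wrote. */
    static void load(byte[] bytes, Map<String, Long> index, Collection<Long> gaps)
            throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int indexSize = in.readInt();
            for (int i = 0; i < indexSize; i++) {
                String key = in.readUTF();
                long pos = in.readLong();
                index.put(key, pos);
            }
            int gapCount = in.readInt();
            for (int i = 0; i < gapCount; i++) {
                gaps.add(in.readLong());
            }
        }
    }
}
```

Because the format has no markers between fields, the reader depends entirely on reading the fields in the order they were written.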
Close
The code for closing the database is:
public void close() throws IOException {
    flush();
    db.close();
}
It synchronizes the data and then closes the data file.
Summary
This section introduced RandomAccessFile, which supports random reads and writes and is closer to the operating system API. For some system programs it is more convenient and efficient than streams. Using RandomAccessFile, we implemented a very simple key-value database and walked through its usage, interface, design, and implementation. The example also made use of the containers and streams described in earlier sections.
The database is simple and rough, but it has some good properties: it uses little memory, it can store a large number of key-value pairs, and values can be accessed efficiently by key. The complete code can be downloaded from GitHub: https://github.com/swiftma/program-logic.
There is another way to access files: memory-mapped files. What are their characteristics, and what are they used for? Let's continue exploring in the next section.