Cassandra data storage structure
Data in Cassandra falls into three main types:
CommitLog: records the data and operations submitted by clients. It is persisted to disk so that data not yet flushed to disk as an SSTable can be recovered.
Memtable: the in-memory form of the data that users write; its object structure is described in detail later. There is also a BinaryMemtable form, but Cassandra does not currently use it, so it is not covered here.
SSTable: data persisted to disk, stored in three file formats: Data, Index, and Filter.
CommitLog data format
The CommitLog holds only one kind of data: a sequence of bytes assembled in a fixed format, written to an I/O buffer and flushed to disk at intervals for persistence. As mentioned in the earlier overview of the configuration file, the CommitLog can be persisted in two ways, periodic and batch. The data format is the same in both. The former is asynchronous and the latter synchronous, so they differ in how often data is flushed to disk. The class structure related to CommitLog is shown below:
Figure 1. CommitLog-related class structure diagram
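The difference between the two persistence modes can be sketched in a few lines of Java. This is only an illustration under stated assumptions: SyncModeSketch, its Mode enum, onTempFile and periodicTick are hypothetical names, not Cassandra classes, and the sketch syncs on every append in batch mode, whereas a real implementation may group several writes into one sync.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; not Cassandra's CommitLog classes.
class SyncModeSketch implements AutoCloseable {
    enum Mode { PERIODIC, BATCH }

    private final FileChannel channel;
    private final Mode mode;
    private int syncCount = 0;        // number of times data was forced to disk
    private boolean unsynced = false; // true if appended data awaits a sync

    SyncModeSketch(Path file, Mode mode) {
        try {
            this.channel = FileChannel.open(file,
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        this.mode = mode;
    }

    // Convenience factory (an assumption for this demo): log to a temp file.
    static SyncModeSketch onTempFile(Mode mode) {
        try {
            return new SyncModeSketch(Files.createTempFile("clog", ".log"), mode);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Appends a record. In BATCH mode the call returns only after the sync.
    void append(byte[] record) {
        try {
            ByteBuffer buf = ByteBuffer.wrap(record);
            while (buf.hasRemaining()) {
                channel.write(buf);
            }
            unsynced = true;
            if (mode == Mode.BATCH) {
                sync();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // In PERIODIC mode a background timer would call this at a fixed interval.
    void periodicTick() {
        if (!unsynced) {
            return;
        }
        try {
            sync();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private void sync() throws IOException {
        channel.force(false); // flush file content (not metadata) to the device
        syncCount++;
        unsynced = false;
    }

    int syncCount() {
        return syncCount;
    }

    @Override
    public void close() {
        try {
            channel.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch, a periodic log acknowledges writes before they reach the disk, so records appended since the last tick can be lost on a crash. A batch log trades write latency for that guarantee.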
The persistence strategy is simple: the RowMutation object holding the data the user submitted is serialized into a byte array, and the object and the byte array are then passed to a LogRecordAdder object, which calls the write method of CommitLogSegment to perform the write. The code of the write method is as follows:
Listing 1. CommitLogSegment.write
public CommitLogSegment.CommitLogContext write(RowMutation rowMutation,
        Object serializedRow) {
    long currentPosition = -1L;
    ...
    Checksum checkum = new CRC32();
    if (serializedRow instanceof DataOutputBuffer) {
        DataOutputBuffer buffer = (DataOutputBuffer) serializedRow;
        // record layout: length, serialized row, CRC32 of the row
        logWriter.writeLong(buffer.getLength());
        logWriter.write(buffer.getData(), 0, buffer.getLength());
        checkum.update(buffer.getData(), 0, buffer.getLength());
    }
    else {
        assert serializedRow instanceof byte[];
        byte[] bytes = (byte[]) serializedRow;
        logWriter.writeLong(bytes.length);
        logWriter.write(bytes);
        checkum.update(bytes, 0, bytes.length);
    }
    logWriter.writeLong(checkum.getValue());
    ...
}
The method works as follows. If the ID of the current ColumnFamily has not yet been recorded, a CommitLogHeader object is generated from this ID, the current position in the CommitLog file is recorded in it, and the header is serialized and written over the previous header. The header may therefore contain the IDs of several ColumnFamilies whose RowMutations have not yet been flushed to disk. If the ID is already recorded, the header is left unchanged. Either way, the serialized RowMutation object is then written to the CommitLog file buffer, followed by a CRC32 checksum. The byte array is laid out as follows:
Figure 2. CommitLog file array structure
The header in the figure above contains the ID of each ColumnFamily, which makes it easy to tell which data has not yet been flushed to disk.
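The write path described above can be sketched as a small in-memory model. This is a minimal sketch under stated assumptions, not Cassandra's code: CommitLogWriteSketch, markFlushed and lowestDirtyPosition are hypothetical names, the header is a plain map rather than a serialized structure at the start of the file, and positions are counted from the start of the record area.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.CRC32;

// Hypothetical in-memory illustration; names are not Cassandra's.
class CommitLogWriteSketch {
    // ColumnFamily id -> position of its first unflushed record in the log
    private final Map<Integer, Integer> header = new HashMap<>();
    private final ByteArrayOutputStream body = new ByteArrayOutputStream();

    // Appends one record. Returns true if the header changed, which is the
    // point at which a real log would rewrite the header on disk.
    boolean write(int cfId, byte[] serializedRow) {
        boolean headerChanged = false;
        if (!header.containsKey(cfId)) {
            header.put(cfId, body.size());
            headerChanged = true;
        }
        CRC32 crc = new CRC32();
        crc.update(serializedRow, 0, serializedRow.length);
        // Record layout as in Listing 1: length, row bytes, CRC32 value.
        ByteBuffer rec = ByteBuffer.allocate(8 + serializedRow.length + 8);
        rec.putLong(serializedRow.length).put(serializedRow).putLong(crc.getValue());
        body.write(rec.array(), 0, rec.capacity());
        return headerChanged;
    }

    // Assumed hook: called once a ColumnFamily's data has reached an SSTable.
    void markFlushed(int cfId) {
        header.remove(cfId);
    }

    // Lowest recorded position among unflushed ColumnFamilies, or -1 if none.
    int lowestDirtyPosition() {
        int min = -1;
        for (int pos : header.values()) {
            if (min == -1 || pos < min) {
                min = pos;
            }
        }
        return min;
    }

    int size() {
        return body.size();
    }
}
```

Tracking only the first unflushed position per ColumnFamily keeps the header small, and the lowest of those positions is where a replay would need to start.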
The role of the CommitLog is to recover data that was not written to disk. How is that data recovered from what is stored in the CommitLog file? The code is in the recover method:
Listing 2. CommitLog.recover
public static void recover(File[] clogs) throws IOException {
    ...
    final CommitLogHeader clHeader = CommitLogHeader.readCommitLogHeader(reader);
    // seek to the lowest position at which any ColumnFamily has unflushed data
    int lowPos = CommitLogHeader.getLowestPosition(clHeader);
    if (lowPos == 0) break;
    reader.seek(lowPos);
    while (!reader.isEOF()) {
        try {
            // each record: length, serialized row, CRC32 of the row
            bytes = new byte[(int) reader.readLong()];
            reader.readFully(bytes);
            claimedCRC32 = reader.readLong();
        }
        ...
        ByteArrayInputStream bufIn = new ByteArrayInputStream(bytes);
        Checksum checksum = new CRC32();
        checksum.update(bytes, 0, bytes.length);
        // skip records whose checksum does not match
        if (claimedCRC32 != checksum.getValue()) { continue; }
        final RowMutation rm =
            RowMutation.serializer().deserialize(new DataInputStream(bufIn));
    }
    ...
}
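The replay loop in Listing 2 can be tried in isolation with the following sketch. It assumes the record layout of Listing 1 (an 8-byte length, the row bytes, an 8-byte CRC32 value). LogReplaySketch, frame and replay are hypothetical names, and the log is a byte array in memory rather than a file. Stopping at a truncated tail is a choice made for this sketch, not something the listing shows.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical illustration; not Cassandra's CommitLog.recover.
class LogReplaySketch {
    // Frames a payload as: 8-byte length, payload bytes, 8-byte CRC32 value.
    static byte[] frame(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return ByteBuffer.allocate(8 + payload.length + 8)
                .putLong(payload.length).put(payload).putLong(crc.getValue())
                .array();
    }

    // Returns the payloads, from startPos onward, whose stored CRC32 matches.
    // Records with a mismatching checksum are skipped, as in Listing 2.
    static List<byte[]> replay(byte[] log, int startPos) {
        List<byte[]> good = new ArrayList<>();
        ByteBuffer in = ByteBuffer.wrap(log);
        in.position(startPos);
        while (in.remaining() >= 8) {
            long len = in.getLong();
            if (len < 0 || len > in.remaining() - 8) {
                break; // truncated or nonsensical tail: stop reading
            }
            byte[] bytes = new byte[(int) len];
            in.get(bytes);
            long claimedCRC32 = in.getLong();
            CRC32 checksum = new CRC32();
            checksum.update(bytes, 0, bytes.length);
            if (claimedCRC32 != checksum.getValue()) {
                continue; // corrupt record: skip it and try the next one
            }
            good.add(bytes);
        }
        return good;
    }

    public static void main(String[] args) {
        byte[] a = frame(new byte[] {1, 2, 3});
        byte[] b = frame(new byte[] {4, 5});
        byte[] log = new byte[a.length + b.length];
        System.arraycopy(a, 0, log, 0, a.length);
        System.arraycopy(b, 0, log, a.length, b.length);
        log[8] ^= 0x01; // corrupt the first payload byte of record a
        System.out.println(replay(log, 0).size()); // prints 1
    }
}
```

In a real recovery each surviving payload would then be deserialized back into a RowMutation and applied again. The sketch stops at returning the raw bytes.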