Facebook recently unveiled Apollo, its Paxos-like NoSQL database. Built on the Apache Thrift 2 RPC framework, Apollo is a tiered storage system in which all data is partitioned into shards, much like regions in HBase. Its greatest benefit is low-latency online storage, especially on flash and in memory.
Unlike document or key/value stores, Apollo stores data structures: maps, queues, trees, key/values, and so on. Each individual item in the system is small, from 1 byte to 1 MB, while total data size ranges from 1 MB to more than 10 PB. A deployment can run on as few as three servers or scale to thousands.
Each shard has four components.
The first is the consensus protocol, which the replicas in a shard use to reach quorum agreement.
The second component is storage. Primary storage is currently based on RocksDB, a key/value store built on Google's LevelDB. Although RocksDB is a key/value store, Facebook uses it to emulate the other data structures. Apollo is designed to be agnostic about the underlying storage engine, and the development team is also adding support for MySQL as an alternative.
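To illustrate how a plain key/value store can emulate a richer structure, here is a minimal Python sketch of a double-ended queue layered on a dict that stands in for RocksDB. The class name `KVDeque` and the key layout (`<name>/head`, `<name>/tail`, `<name>/item/<index>`) are assumptions made for illustration; the source does not describe Apollo's actual encoding.

```python
class KVDeque:
    """A deque emulated on a flat key/value store (a dict stands in for RocksDB).

    Hypothetical layout: "<name>/head" and "<name>/tail" hold integer bounds,
    and "<name>/item/<i>" holds the element at logical index i.
    """

    def __init__(self, store, name):
        self.store = store
        self.name = name
        # head is the index of the first element; tail is one past the last.
        self.store.setdefault(f"{name}/head", 0)
        self.store.setdefault(f"{name}/tail", 0)

    def _key(self, index):
        return f"{self.name}/item/{index}"

    def __len__(self):
        return self.store[f"{self.name}/tail"] - self.store[f"{self.name}/head"]

    def push_back(self, value):
        tail = self.store[f"{self.name}/tail"]
        self.store[self._key(tail)] = value
        self.store[f"{self.name}/tail"] = tail + 1

    def push_front(self, value):
        head = self.store[f"{self.name}/head"] - 1
        self.store[self._key(head)] = value
        self.store[f"{self.name}/head"] = head

    def back(self):
        if len(self) == 0:
            raise IndexError("back() on empty deque")
        return self.store[self._key(self.store[f"{self.name}/tail"] - 1)]

    def pop_front(self):
        if len(self) == 0:
            raise IndexError("pop_front() on empty deque")
        head = self.store[f"{self.name}/head"]
        value = self.store.pop(self._key(head))
        self.store[f"{self.name}/head"] = head + 1
        return value


store = {}                  # stand-in for the underlying key/value engine
d2 = KVDeque(store, "d2")
d2.push_back("a")
d2.push_back("b")
d2.push_front("z")
back_value = d2.back()
front_value = d2.pop_front()
print(back_value, front_value)
```

Every deque operation touches only a handful of keys, which is why this kind of emulation maps naturally onto a store that only knows how to get and put individual keys.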
The third component is the client API, which consists of the read() and write() methods. Every operation Apollo performs at the shard level is atomic, so you can specify preconditions, and the reads or writes are carried out only if those preconditions are satisfied. A code example follows:
read(conditions: {map(m1).contains(x)},
     reads: {deque(d2).back()})
The code above says: "If map m1 contains x, return the value at the back of the double-ended queue (deque) d2."
You can combine any number of conditions with any number of reads.
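The conditional read can be mimicked in a few lines of Python. This is only a sketch under stated assumptions: the `Shard` class and the helper functions `map_contains` and `deque_back` are hypothetical names, not Apollo's client API, and atomicity is trivial here because everything runs in a single thread.

```python
from collections import deque


class Shard:
    """Toy single-process shard holding named maps and deques in memory."""

    def __init__(self):
        self.maps = {}
        self.deques = {}

    def read(self, conditions, reads):
        """Check every condition, then perform every read, as one step.

        Returns the list of read results, or None if any condition fails.
        """
        if not all(cond(self) for cond in conditions):
            return None
        return [r(self) for r in reads]


# Hypothetical helpers that build conditions and reads as callables.
def map_contains(name, key):
    return lambda shard: key in shard.maps.get(name, {})


def deque_back(name):
    return lambda shard: shard.deques[name][-1]


shard = Shard()
shard.maps["m1"] = {"x": 1}
shard.deques["d2"] = deque(["first", "last"])

# Condition holds: m1 contains "x", so the back of d2 is returned.
hit = shard.read(conditions=[map_contains("m1", "x")], reads=[deque_back("d2")])
# Condition fails: m1 has no "y", so nothing is read.
miss = shard.read(conditions=[map_contains("m1", "y")], reads=[deque_back("d2")])
print(hit, miss)
```

Because conditions and reads are both lists, any number of each can be combined in a single call.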
Writes are very similar and also let you specify conditions:
write(conditions: {ver(k1) == v}, reads: {},
      writes: {val(k1) := x})
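Read literally, the write above says: if the version of key k1 equals v, set the value of k1 to x. The following Python sketch shows that compare-and-set pattern. The `VersionedShard` class is hypothetical, and the versioning scheme (versions start at 0 and increase by 1 on every successful write) is an assumption; the source does not say how Apollo assigns versions.

```python
class VersionedShard:
    """Toy shard where each key carries a version number (assumed scheme)."""

    def __init__(self):
        self.data = {}  # key -> (version, value)

    def ver(self, key):
        # Keys that were never written are treated as version 0.
        return self.data.get(key, (0, None))[0]

    def val(self, key):
        return self.data.get(key, (0, None))[1]

    def write(self, conditions, writes):
        """Apply all writes only if every key is at its expected version.

        conditions: {key: expected_version}; writes: {key: new_value}.
        Returns True if the writes were applied, False otherwise.
        """
        if any(self.ver(k) != expected for k, expected in conditions.items()):
            return False
        for k, value in writes.items():
            self.data[k] = (self.ver(k) + 1, value)
        return True


shard = VersionedShard()
# First writer saw version 0 and wins.
ok_first = shard.write(conditions={"k1": 0}, writes={"k1": "x"})
# Second writer also saw version 0, which is now stale, so it is rejected.
ok_stale = shard.write(conditions={"k1": 0}, writes={"k1": "y"})
print(ok_first, ok_stale, shard.val("k1"), shard.ver("k1"))
```

A rejected writer would typically re-read the key, pick up the new version, and retry.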
The last component is the fault-tolerant state machine (FTSM). FTSMs are used primarily by system code, but user code can use them too. Each FTSM belongs to a shard: in a shard with three machines, for example, all three execute the same code at the same time, and each can access its machine's persistent storage. Most importantly, if a node fails, the code continues to execute in the correct order, an order on which all nodes agree.
State machines are used for load balancing, data migration, shard creation and destruction, and coordinating cross-shard transactions. They can also have external side effects, such as sending RPC requests to remote machines, but whenever they want to change persisted state, the change must be submitted through Raft so that the servers agree on it.
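The core idea of a replicated state machine can be sketched in Python: every replica applies the same agreed log of commands in the same order, so a replica that was down simply replays what it missed and ends up in the same state. This sketch assumes the log has already been agreed on; it does not implement Raft, and the `Replica` class and the `set`/`add` commands are hypothetical.

```python
class Replica:
    """One machine's copy of a deterministic state machine."""

    def __init__(self):
        self.state = {}    # stand-in for the machine's persistent storage
        self.applied = 0   # number of log entries applied so far

    def catch_up(self, log):
        """Apply every committed entry not yet applied, strictly in log order."""
        for op, key, amount in log[self.applied:]:
            if op == "set":
                self.state[key] = amount
            elif op == "add":
                self.state[key] = self.state.get(key, 0) + amount
            else:
                raise ValueError(f"unknown op: {op}")
            self.applied += 1


# Assumed to be the output of the consensus protocol: an agreed, ordered log.
committed_log = [("set", "counter", 10), ("add", "counter", 5)]

replicas = [Replica() for _ in range(3)]
for r in replicas[:2]:       # the third replica is "down" for now
    r.catch_up(committed_log)

committed_log.append(("add", "counter", 1))
for r in replicas:           # the failed replica recovers and replays the log
    r.catch_up(committed_log)

print([r.state for r in replicas])
```

Determinism is what makes this work: given the same log, every replica computes the same state, regardless of when it applies the entries.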
Facebook is currently using Apollo as a replacement for some of its memcached use cases; memcached is deployed at Facebook on a large scale. The company is also experimenting with Apollo as a reliable queuing system for delivering Facebook messages to iOS, Android, and carrier SMS. It may also be used for faster analytics.
Apollo is still under development and is not yet open source, but it is expected to be released publicly later.
For references related to this post, see: Http://mingkr.com/facebook-nosql-apollo