CalvinFS won the best paper award at FAST '15; a colleague had already come across it back in 2013; I found out that Brother King, a former colleague, is interested in Calvin; and other teams have recently become more interested in distributed transactions and storage ... These things together motivated me to write this article. For reference, my previous article dates from 2012 (although I have kept personal study and work notes since then).

The most valuable part of Yale's CalvinFS is its metadata management, which is Calvin (a modified version). Without a cross-IDC Calvin there would be no cross-IDC CalvinFS. What follows is written from a spectator's perspective. Some of it is simple to describe but actually very difficult to handle, and the authors of distributed systems such as Calvin probably spent a great deal of effort getting it right.

Two things are worth mentioning before getting into Calvin itself.

The first is ACID and CAP. Both contain an A and a C, so I mention them together. ACID is a database concept, namely the transaction; CAP is a distributed-systems concept, which says that no system can be perfect on all three properties. The A and the C mean different things in the two. In ACID, A is atomicity, while in CAP, A is availability. As for C, in ACID it means that related content (indexes and data, associations, and so on) changes consistently, while in CAP it generally means that several replicas of the same content change consistently. In ACID, I is isolation and D is durability, both of which are easier to understand. The P in CAP is partition tolerance.

The second is 2PC and Paxos. To begin with, Paxos has too many variants, and material found online often mixes several of them, so it is better to learn it from books or the original papers. Distributed transactions usually bring up 2PC, 3PC and Paxos, and the middle one is rarely discussed. The main problem with 2PC is blocking. When a non-coordinator service goes down there are mostly reasonable solutions, but when the coordinator goes down after receiving OK from everyone in the prepare phase ...
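The blocking window mentioned above can be illustrated with a toy model. This is only a sketch under stated assumptions: `Participant`, `two_phase_commit`, `blocked` and the `coordinator_crashes_after_prepare` flag are hypothetical names invented for illustration, there is no network, timeout or recovery logic, and it is not how any particular system implements 2PC.

```python
from enum import Enum
from typing import List, Optional


class State(Enum):
    INIT = "init"
    PREPARED = "prepared"    # voted yes, holding locks, waiting for a decision
    COMMITTED = "committed"
    ABORTED = "aborted"


class Participant:
    """Hypothetical participant that always votes yes."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.state = State.INIT

    def prepare(self) -> bool:
        # After voting yes the participant may not decide on its own.
        self.state = State.PREPARED
        return True

    def decide(self, commit: bool) -> None:
        self.state = State.COMMITTED if commit else State.ABORTED


def two_phase_commit(participants: List[Participant],
                     coordinator_crashes_after_prepare: bool = False
                     ) -> Optional[bool]:
    # Phase 1: collect votes.
    votes = [p.prepare() for p in participants]
    if coordinator_crashes_after_prepare:
        # The decision is never delivered; participants stay PREPARED.
        return None
    # Phase 2: broadcast the decision.
    commit = all(votes)
    for p in participants:
        p.decide(commit)
    return commit


def blocked(participants: List[Participant]) -> List[str]:
    """Names of participants stuck waiting for the coordinator."""
    return [p.name for p in participants if p.state is State.PREPARED]
```

In the crash case every participant remains in `PREPARED`: it has promised to commit if asked, so it can neither release its locks nor abort unilaterally until the coordinator (or a replacement) tells it the outcome.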
As for Paxos, ever since ZooKeeper appeared many people have been using it. Besides synchronizing data, other use cases include leader election for HA, and so on.

Impressive features generally come with a compromise somewhere in CAP, and Calvin is no exception. P problems such as a machine going down or an IDC failure always exist, so in general the choice is only between A and C. What Calvin mainly sacrifices is A, availability (this is similar to Spanner's choice and is in fact a direction that is slowly gaining acceptance). Spanner benefits from its TrueTime global clock design, which lets it reach external consistency, and a real system is unlikely to do much better than that. Calvin chooses sequential consistency, which is weaker than external consistency.

Every system or design only fits some scenarios, and Calvin is no exception. Although it supports ACID at the level of strong consistency, that does not mean availability must be given up unconditionally in every scenario. Strongly consistent transactions are simply an option: when the guarantees (isolation I, consistency C) are not needed, this layer can be skipped to improve performance (availability). The cost of strong consistency is unavoidable, but it can be shifted to the less common path. For example, in a workload with many reads and few writes, write latency can be increased to optimize read performance. And even for writes, accepting somewhat higher latency in exchange for greater throughput has become the design direction of many cross-IDC systems.

Speaking of transactions and consistency, Calvin's implementation is very interesting: it takes a route different from what was popular at the time. Transactions are a difficult point in distributed system design. The 2PC blocking problem can be mitigated with virtual nodes, and Spanner putting Paxos underneath 2PC is the same idea.
However, blocking remains a problem, and the deadlocks caused by concurrent updates in an ordinary system are magnified in a distributed system, where each update operation is heavier. Calvin gets rid of 2PC altogether. It does so by replicating the update operations rather than the results of the updates; in practice, a global queue (implemented with Paxos) is used to replicate the operation log. Deadlock is avoided through predeclared read/write sets and ordered locking (transactions that cannot predeclare their read/write sets use the OLLP mechanism, Optimistic Lock Location Prediction).

This transaction design is sensitive to how quickly a request returns on a single machine, especially since a wait of up to 10 ms is deliberately introduced for batching. Calvin compensates with a warm-up scheme, which I find interesting: the idea is to do some of the work before the transaction starts and load the data into memory ahead of time.

Note, however, that because the operation log is replicated instead of the results, every replica must produce the same result after receiving the same log. This means a problem caused by nondeterminism (such as a bad local disk) cannot abort the transaction. Compared with 2PC this is a disadvantage: the faulty replica itself can only be resynchronized or restored from a replica in the correct state.

One point I want to grumble about: which scenarios should be warmed up, and which data should be warmed up? The warm-up mechanism has to answer both questions. For the former, the Calvin paper simply uses an empirical value obtained by experiment; for the latter, there is no good solution ...

Apart from determinism, many of Calvin's other ideas, such as accepting cross-IDC latency while working to improve throughput and the performance of the main scenarios, are now the shared understanding and direction of most distributed system designs.
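The point about replicating operations rather than results can be shown with a small example. This is a minimal single-threaded sketch, not Calvin's implementation: `Txn`, `transfer` and `Replica` are hypothetical names, the agreed log is just a Python list standing in for the Paxos-backed global queue, and the serial loop stands in for granting locks strictly in log order.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Store = Dict[str, int]


@dataclass
class Txn:
    """A logged operation with its read/write sets declared up front."""
    read_set: Tuple[str, ...]
    write_set: Tuple[str, ...]
    logic: Callable[[Store], Store]  # must be deterministic


def transfer(src: str, dst: str, amount: int) -> Txn:
    """Hypothetical operation: move up to `amount` from src to dst."""
    def logic(reads: Store) -> Store:
        moved = min(amount, reads[src])
        return {src: reads[src] - moved, dst: reads[dst] + moved}
    return Txn(read_set=(src, dst), write_set=(src, dst), logic=logic)


class Replica:
    def __init__(self, initial: Store) -> None:
        self.store = dict(initial)

    def apply_log(self, log: List[Txn]) -> Store:
        # Transactions run strictly in log order, so no two of them can
        # wait on each other in a cycle.
        for txn in log:
            reads = {key: self.store[key] for key in txn.read_set}
            writes = txn.logic(reads)
            for key in txn.write_set:
                self.store[key] = writes[key]
        return self.store


initial = {"a": 100, "b": 0, "c": 0}
log = [transfer("a", "b", 30), transfer("b", "c", 50)]

replica_1 = Replica(initial).apply_log(log)
replica_2 = Replica(initial).apply_log(log)
reordered = Replica(initial).apply_log(list(reversed(log)))
```

Both replicas that apply the same log end in the same state without exchanging results, while the replica that applies the operations in a different order ends in a different state. That is why the order has to be agreed globally and why each operation has to be deterministic.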
The linear scalability its design aims for is also what other systems pursue, although I have some doubts about how well it can actually scale out ...

This was written rather casually, so please correct me if there are errors or omissions. Also, as mentioned above, some of these problems are very difficult to handle; I hope the light treatment here does not mislead anyone into thinking Calvin's solutions are simple.
A few words on Calvin, a cross-IDC distributed database