ACID, yield

Source: Internet
Author: User

1. What is ACID?

The first thing to note is that in IT, many terms carry different meanings in different contexts. For example, some products claim to support "100% ACID" and "strong consistency". What do these terms actually mean? Without the specific context, they are easy to misunderstand. Before adopting a NoSQL or NewSQL product, focus on the differences in the isolation levels that products support rather than on whether they "implement ACID": ACID is the foundation of every database product, so the label itself distinguishes little between them.

Isolation level is the part of ACID most prone to ambiguity. The isolation levels offered by database products differ from those defined in the SQL standard not only in name but also in actual meaning. For example, Oracle's Serializable is actually Snapshot Isolation (footnote 1), and DB2's Repeatable Read is actually Serializable (footnote 2). In academic usage, the I (Isolation) in ACID means serializability by default; otherwise the problem would not have been hard enough to need decades of research. Without serializability, the Consistency in ACID is naturally unreliable. Articles 3 and 4 (footnotes 3 and 4) list 18 database products and compare their default and maximum isolation levels. A screenshot of that comparison was included here for reference.


Since serializability is so important, why do databases provide isolation levels weaker than serializability? Serializability does guarantee data consistency even when the database knows nothing about the application logic, but it comes at a price, usually a loss of performance. Lower isolation levels improve performance and availability and are sufficient for many applications. For example, in a lock-based system, read locks can be held only briefly, which makes transactions less likely to deadlock or roll back. Of course, lower isolation levels allow various consistency anomalies (footnote 2). With so many problems, why are low isolation levels still so widely used? Article 3 explains: "One possibility is that anomalies are rare and the performance benefits of weak isolation outweigh the cost of inconsistencies. Another possibility is that applications are performing their own concurrency control external to the database; database programmers can use commands like SELECT FOR UPDATE, manual LOCK TABLE, and UNIQUE constraints to manually perform their own synchronization." This explanation matches what actually happens in application development.
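
As a rough illustration of the "applications perform their own synchronization" point, the sketch below uses a UNIQUE constraint so that the database, not the isolation level, rejects a conflicting write. It is a minimal sketch using Python's built-in sqlite3 module; the table `usernames` and the function `claim_username` are hypothetical names chosen for this example, and SQLite was picked only because it runs without setup (it does not support SELECT FOR UPDATE).

```python
import sqlite3


def claim_username(conn: sqlite3.Connection, user_id: int, name: str) -> bool:
    """Try to claim a username; return False if it is already taken."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "INSERT INTO usernames (name, user_id) VALUES (?, ?)",
                (name, user_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected the duplicate row.
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usernames (name TEXT UNIQUE, user_id INTEGER)")

first = claim_username(conn, 1, "alice")   # succeeds
second = claim_username(conn, 2, "alice")  # rejected by the constraint
rows = conn.execute("SELECT name, user_id FROM usernames").fetchall()
print(first, second, rows)
```

The application never checks "does this name exist?" before inserting, so there is no read-then-write window for a weak isolation level to expose; the constraint itself is the synchronization point.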

2. Highly available transactions instead of CAP

Article 3 is a VLDB 2014 paper, "Highly Available Transactions: Virtues and Limitations". The title presumably follows the example of J. Gray, "The Transaction Concept: Virtues and Limitations", VLDB 1981. It considers the following scenario:

  1. A high-availability environment over a WAN. The paper emphasizes network faults (network partitions) and latency, and its test environment is likewise LevelDB running on Amazon.
  2. Multiple replicas, so serializability is not achievable; only lower isolation levels can be provided.

Database isolation levels (transactional isolation) and the replica consistency of distributed systems are different concepts. The paper treats both of them, together with high availability, in a unified way, which at least helps resolve the terminology confusion mentioned above.

Serializability cannot be achieved with high availability under network partitions: "Indeed, serializable transactions - the gold standard of traditional ACID databases - are not achievable with high availability in the presence of network partitions [27]." Setting HA aside, the traditional distributed serializability and SI mechanisms are well studied: "We have a strong understanding of weak isolation in the single-server context from which it originated [2, 11, 37]", and papers offer techniques for providing distributed serializability [13, 24, 26, 41, 60] or snapshot isolation [42, 58]. These systems all come with various prerequisites.

Section 4 of the paper defines what HA means: high availability, sticky availability, and transactional availability. The definition of sticky availability introduces the distinction between fully replicated and partially replicated systems. Most real systems are partially replicated. According to a footnote, a sharded system should be a special case of partial replication (each data item has only one copy). Section 5 describes what a HAT is: "HAT systems provide transactions with transactional availability or sticky transactional availability." It then lists the semantics that HATs can and cannot provide.

HAT can provide:

  1. Atomicity (no matter how many nodes are involved)
  2. Read Committed (RC) and Repeatable Read (RR)
  3. Session-level guarantees: read-your-writes, monotonic reads (i.e., time doesn't go backwards), and causality within and across transactions
  4. Eventual consistency, meaning that if writes to a data item stop, all transaction reads will eventually return the last written value
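
To make the session guarantees in item 3 more concrete, here is a small Python sketch. It is not the paper's algorithm: `Replica`, `Session`, and the integer version numbers are assumptions invented for this illustration. The session remembers the newest version it has seen for each key and refuses a read from a replica that is older than that, which is one simple way to picture read-your-writes and monotonic reads.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Holds key -> (version, value); a higher version means a newer write."""
    data: dict = field(default_factory=dict)

    def write(self, key, version, value):
        current_version, _ = self.data.get(key, (0, None))
        if version > current_version:
            self.data[key] = (version, value)

    def read(self, key):
        return self.data.get(key, (0, None))


@dataclass
class Session:
    """Client session that tracks the highest version seen per key."""
    seen: dict = field(default_factory=dict)

    def write(self, replica, key, version, value):
        replica.write(key, version, value)
        self.seen[key] = max(self.seen.get(key, 0), version)

    def read(self, replica, key):
        version, value = replica.read(key)
        if version < self.seen.get(key, 0):
            # Returning this value would make time go backwards for the session.
            raise RuntimeError("replica is stale for this session")
        self.seen[key] = version
        return value


a, b = Replica(), Replica()
session = Session()
session.write(a, "x", version=1, value="hello")
own_write = session.read(a, "x")  # same replica: the session sees its own write

try:
    session.read(b, "x")          # b has not received the write yet
    stale_detected = False
except RuntimeError:
    stale_detected = True

b.write("x", 1, "hello")          # replication eventually delivers the write
after_sync = session.read(b, "x")
print(own_write, stale_detected, after_sync)
```

A client that always talks to the same replica (the "sticky" case) never hits the stale branch, which is why these guarantees are tied to sticky availability. Once the write reaches `b`, reads there return the last written value, matching the eventual consistency item.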

HAT cannot:

  1. Guarantee data recency during partitions.
  2. Be "100% ACID compliant": HATs cannot guarantee serializability, yet they meet the default and sometimes maximum guarantees of many current "ACID" databases.
  3. Guarantee global integrity constraints (e.g., uniqueness constraints across data items), although they can perform local checking of predicates (e.g., per-record integrity maintenance like null value checks).
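
The difference between local predicate checks and global constraints can be sketched in a few lines of Python. This is a toy model under stated assumptions: `PartitionedReplica` and `merge` are hypothetical names, and the "partition" is simply two objects that do not communicate until they are merged.

```python
class PartitionedReplica:
    """One side of a network partition; it can only see its own data."""

    def __init__(self):
        self.users = {}  # user_id -> email

    def insert(self, user_id, email):
        # Local predicate check: needs only the record itself.
        if email is None:
            raise ValueError("email must not be null")
        # Uniqueness check: limited to what this replica has seen.
        if email in self.users.values():
            raise ValueError("duplicate email")
        self.users[user_id] = email


def merge(a, b):
    """Combine both sides after the partition heals."""
    merged = dict(a.users)
    merged.update(b.users)
    return merged


left, right = PartitionedReplica(), PartitionedReplica()

# During the partition, both sides accept the same email address.
left.insert(1, "sam@example.com")
right.insert(2, "sam@example.com")

# The null check still works, because it is purely local.
try:
    left.insert(3, None)
    null_rejected = False
except ValueError:
    null_rejected = True

merged = merge(left, right)
emails = list(merged.values())
has_duplicates = len(emails) != len(set(emails))
print(null_rejected, has_duplicates)
```

Each replica's uniqueness check passes on its own data, yet the merged state violates the constraint. Enforcing it would require the two sides to coordinate, which is exactly what a partition prevents.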

3. List of all phenomena/anomalies

  1. Phenomena/anomalies related to the isolation levels of traditional databases.
    • P0, Dirty Write
    • P1, Dirty Read
    • P2, Non-repeatable Read, Fuzzy Read
    • P3, Phantom. Note the three cases: insert, delete, and update.
    • P4, Lost Update
    • A5 (Data Item Constraint Violation)
      • A5A Read Skew
      • A5B Write Skew
    • Also defined in "A Critique of ANSI SQL Isolation Levels" (footnote 2) but not listed separately here: P4C, A1, A2, A3.
    • MAV (Monotonic Atomic View): a transaction's modifications are read atomically.
    • RA, Read Atomic isolation, similar to Oracle's TSC
  2. Session-related phenomena/anomalies.
    • Read Your Writes
    • Monotonic Writes
    • Monotonic Reads
    • Writes Follow Reads
    • PRAM, Pipelined Random Access Memory (a combination of the three above)
    • Causal consistency (a combination of the four above, that is, PRAM plus Writes Follow Reads)
  3. Phenomena/anomalies related to distributed system consistency.
    • Recency (this seems too broad and arguably should not be counted)
    • Safe Register
    • Regular Register
    • Linearizability
  4. Serializability
    • The traditional definition of serializability assumes a single copy of the data.
    • In multi-replica scenarios, the corresponding notion is one-copy serializability.
    • Combined with linearizability, it becomes strict serializability.
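
Of the anomalies listed above, A5B Write Skew is the one that separates Snapshot Isolation from true serializability. The Python sketch below is a toy model, not any product's implementation: `SnapshotStore`, `Txn`, and the on-call example are assumptions made up for illustration. Each transaction reads from a private snapshot and commits under a first-committer-wins rule that only inspects the keys it wrote.

```python
class Txn:
    def __init__(self, store):
        self.store = store
        self.snapshot = dict(store.data)           # reads come from here
        self.start_versions = dict(store.version)  # versions at snapshot time
        self.writes = {}

    def read(self, key):
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # First-committer-wins: abort only if a key we WROTE has changed.
        for key in self.writes:
            if self.store.version[key] != self.start_versions[key]:
                return False
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.version[key] += 1
        return True


class SnapshotStore:
    def __init__(self, initial):
        self.data = dict(initial)
        self.version = {key: 0 for key in initial}

    def begin(self):
        return Txn(self)


def go_off_call(txn, me, other):
    """Application rule: leave only if the other person is still on call."""
    if txn.read(other):
        txn.write(me, False)


store = SnapshotStore({"alice": True, "bob": True})
t1, t2 = store.begin(), store.begin()
go_off_call(t1, "alice", "bob")
go_off_call(t2, "bob", "alice")
c1, c2 = t1.commit(), t2.commit()
print(c1, c2, store.data)

# Lost update (P4) is still prevented: both transactions write the same key.
t3, t4 = store.begin(), store.begin()
t3.write("alice", True)
t4.write("alice", True)
c3, c4 = t3.commit(), t4.commit()
print(c3, c4)
```

Both `t1` and `t2` commit because their write sets do not overlap, yet the final state has nobody on call, which no serial order of the two transactions could produce. The second pair shows that the same rule does catch conflicting writes to a single key.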


Footnotes:

  1. http://iggyfernandez.wordpress.com/2010/09/20/dba-101-what-does-serializable-really-mean/
  2. http://research.microsoft.com/pubs/69541/tr-95-51.pdf
  3. http://www.bailis.org/blog/hat-not-cap-introducing-highly-available-transactions/
  4. http://www.bailis.org/blog/when-is-acid-acid-rarely/
