"NoSQL Essence" Reading notes

Source: Internet
Author: User
Tags orientdb amazon dynamodb

Part One: Concepts

1. Why use NoSQL

Impedance mismatch: The difference between the relational model of a relational database and the data structures an application holds in memory.

Integration database : A single shared database in which multiple applications, usually developed by different teams, store their data.

Application database : A database whose contents are accessed directly only from the codebase of one application, and that codebase is maintained by a single team.

Reasons to use NoSQL : First, the amount of data to be processed is very large, or the data access performance required is very high, so the data must be placed on a cluster. Second, a more convenient style of data interaction can improve the productivity of application development.

Common characteristics of NoSQL databases :

    • Do not use the relational model
    • Run well on clusters
    • Open source
    • Built for 21st-century Internet companies
    • Schemaless

2. Aggregate data models

Aggregate : A collection of related objects that is treated as a single unit.

Aggregate-oriented databases : Key-value databases, document databases, column-family databases.

Aggregate-ignorant : The data model of a relational database has no concept of an aggregate, so it is called "aggregate-ignorant".

If most data interactions are performed within the same aggregate, use an aggregate-oriented database. If the interactions need the data organized in many different formats, an aggregate-ignorant database is the better choice.
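As a minimal sketch of what treating related objects as one unit can look like, the snippet below models a hypothetical order as a single nested structure stored under one key. The field names (`customer_id`, `line_items`, `shipping_address`) and the plain dictionary used as a store are illustrative assumptions, not something taken from the notes.

```python
# A hypothetical "order" aggregate: the order, its line items, and its
# shipping address are stored and retrieved together as one unit.
order_aggregate = {
    "order_id": 99,
    "customer_id": 1,
    "line_items": [
        {"product": "NoSQL Distilled", "price": 30, "quantity": 2},
        {"product": "Refactoring", "price": 40, "quantity": 1},
    ],
    "shipping_address": {"city": "Chicago"},
}

# An aggregate-oriented store keeps the whole unit under a single key.
store = {("order", 99): order_aggregate}


def order_total(order):
    """Sum price * quantity over the line items inside one aggregate."""
    return sum(item["price"] * item["quantity"] for item in order["line_items"])


total = order_total(store[("order", 99)])
```

Everything needed to compute the total lives inside one aggregate, which is the kind of interaction an aggregate-oriented database handles well.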

3. More details on data models

Graph database : Organizes data into a graph made of nodes and edges, which suits data with complex relationships.

Implicit schema : The set of assumptions about the data's structure that is written into the code that manipulates the data.

Materialized views : In an aggregate-oriented database, aggregate data reorganized in a different way from the primary aggregates. (Usually computed with map-reduce.)

4. Distribution models

Ways to distribute data :

Sharding : Different pieces of data are stored on separate servers, and each server is the sole source for its own subset of the data.

Replication : Data is copied to multiple servers, so each piece of data can be found on more than one node.
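The two distribution methods can be combined. The sketch below is one possible placement scheme under stated assumptions, not how any particular database does it: `shard_for` picks a shard with a stable hash, and `servers_for` returns that server plus the following ones as replicas. Both function names and the node names are hypothetical, and the sketch assumes `replication_factor` is no larger than the number of servers.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a key to a shard index with a stable hash (not Python's hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def servers_for(key, servers, replication_factor):
    """Pick the first server for a key plus the next servers as replicas."""
    first = shard_for(key, len(servers))
    return [servers[(first + i) % len(servers)] for i in range(replication_factor)]


servers = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical node names
placement = servers_for("customer:42", servers, replication_factor=2)
```

With `replication_factor=1` this is plain sharding: each key lives on exactly one server. A larger factor adds replication on top, so every key can be found on several nodes.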

Forms of replication :

Master-slave replication : One node acts as the authoritative source of the data and handles the writes; the slave nodes stay synchronized with the master and can handle reads.

Peer-to-peer replication : Any node can accept writes, and the nodes coordinate with each other to synchronize their copies of the data.

Master-slave replication reduces the chance of update conflicts, but it makes the master node a bottleneck for write operations. Peer-to-peer replication avoids loading all writes onto a single node.

5. Consistency

Write-write and read-write conflicts : A write-write conflict occurs when two clients try to modify the same data at the same time; a read-write conflict occurs when one client reads data in the middle of another client's write.

Update consistency : The pessimistic approach locks data records to prevent conflicts; the optimistic approach detects conflicts after the fact and fixes them.

Eventual consistency : The nodes may hold inconsistent data for a while, but once every write has propagated to all nodes, the system becomes consistent.

CAP theorem : Of the three properties consistency, availability, and partition tolerance, only two can be satisfied at the same time. Whenever a network partition is possible, availability has to be traded off against consistency.

Quorum : When operating on a replicated database in a distributed model, you do not need to contact every replica; as long as enough replicas take part in the operation, strong consistency can be maintained.
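How many replicas count as "enough" is usually expressed with two inequalities: a write quorum needs W > N/2, and a read is guaranteed to see the latest write when R + W > N, where N is the replication factor. The helpers below are a small sketch of that arithmetic; the function names are made up for illustration.

```python
def write_quorum_ok(n, w):
    """A write quorum needs confirmations from a majority of the replicas."""
    return w > n / 2


def read_quorum_ok(n, r, w):
    """A read sees the latest write if read and write sets must overlap."""
    return r + w > n


# With a replication factor of 3, two write confirmations and
# two read responses are enough for strong consistency.
n = 3
strong = write_quorum_ok(n, 2) and read_quorum_ok(n, 2, 2)
```

Lowering R or W below these thresholds buys latency and availability at the price of possibly reading stale data.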

6. Version stamps

Version stamp : Used to detect concurrency conflicts. When you read a piece of data and then update it, you can check its version stamp to make sure that nobody else updated the data between your read and your write.
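The check described above can be sketched as a conditional write. The in-memory `VersionedStore` class below is a hypothetical illustration that uses a simple counter as the version stamp; it is not the API of any real database.

```python
class VersionConflict(Exception):
    """Raised when the record changed between the read and the write."""


class VersionedStore:
    """In-memory store that keeps a counter version stamp per key."""

    def __init__(self):
        self._data = {}  # key -> (value, version)

    def read(self, key):
        return self._data.get(key, (None, 0))

    def write(self, key, value, expected_version):
        _, current_version = self._data.get(key, (None, 0))
        if current_version != expected_version:
            raise VersionConflict(key)
        self._data[key] = (value, current_version + 1)
        return current_version + 1


store = VersionedStore()
_, version = store.read("room:101")
store.write("room:101", "booked by Alice", version)  # succeeds

try:
    # A second client still holding the stale version is rejected.
    store.write("room:101", "booked by Bob", version)
    conflict_detected = False
except VersionConflict:
    conflict_detected = True
```

The second write fails because the stamp it read is no longer current, which is exactly the lost update the stamp is meant to catch.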

Version stamp implementations : Counters, UUIDs, content hashes, timestamps, and so on.

Vector stamp : A vector of version stamps, one per node, used to detect conflicting updates made on different nodes.
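A stamp made of one counter per node (often called a vector stamp or vector clock) can tell a newer version apart from a genuine conflict. The comparison below is a minimal sketch; the dictionary representation, the function name, and the node names `blue` and `green` are assumptions made for illustration.

```python
def compare_vector_stamps(a, b):
    """Compare two vector stamps given as dicts of node name -> counter.

    Returns "equal", "before" (a is older), "after" (a is newer),
    or "conflict" (neither stamp descends from the other).
    """
    nodes = set(a) | set(b)
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "conflict"
    if a_ahead:
        return "after"
    if b_ahead:
        return "before"
    return "equal"


older = {"blue": 43, "green": 54}
newer = {"blue": 43, "green": 55}
diverged = {"blue": 44, "green": 54}
```

Here `older` is an ancestor of both other stamps, while `newer` and `diverged` were each advanced on a different node, so comparing them reports a conflict that has to be resolved.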

7. Map-reduce

Map-reduce pattern : A way of organizing data processing so that it takes advantage of the multiple machines in a cluster while keeping the processing and the data it needs together on the same machine as much as possible. It is a pattern for running parallel computations on a cluster.

Map : Reads data from an aggregate and boils it down to relevant key-value pairs. A map operation reads only one record at a time, so it can run in parallel on the node that stores the record.

Reduce : Map tasks emit many values for the same key, and the reduce task collapses them into a single output value. Each reduce function operates only on the map results for a single key, so multiple reducers can run in parallel, partitioned by key.
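The following single-process sketch shows the shape of the two steps. The order records and the function names are hypothetical; a real framework would run the mappers on the nodes that hold the records and the reducers in parallel per key.

```python
from collections import defaultdict

# Hypothetical order aggregates; each one is a single record for map.
orders = [
    {"id": 1, "items": [{"product": "tea", "qty": 2}, {"product": "mug", "qty": 1}]},
    {"id": 2, "items": [{"product": "tea", "qty": 3}]},
]


def map_order(order):
    """Map: read one aggregate and emit (key, value) pairs."""
    for item in order["items"]:
        yield item["product"], item["qty"]


def reduce_quantities(key, values):
    """Reduce: collapse all values that share one key into a single value."""
    return sum(values)


def map_reduce(records, mapper, reducer):
    grouped = defaultdict(list)
    for record in records:  # each mapper call sees exactly one record
        for key, value in mapper(record):
            grouped[key].append(value)
    # Each key is reduced independently of every other key.
    return {key: reducer(key, values) for key, values in grouped.items()}


totals = map_reduce(orders, map_order, reduce_quantities)
```

The grouping step in the middle is what a framework performs when it shuffles map output to the reducers.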

Pipelines : Reduce functions whose output has the same form as their input can be combined into pipelines, which improves parallelism and reduces the amount of data that has to be transferred.

If the results of a map-reduce computation are going to be used widely, they can be stored as a materialized view. The materialized view can then be kept up to date with incremental map-reduce operations.
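An incremental update can be sketched as follows, assuming the stored view holds per-key sums. Because summing produces output in the same form as its input, only the map output of the new records has to be folded into the existing view. The view contents and the `update_view` name are hypothetical.

```python
def update_view(view, new_pairs):
    """Fold newly mapped (key, value) pairs into a stored view of sums.

    This works because summation is a combinable reducer: old results
    can be reduced again together with new values.
    """
    updated = dict(view)
    for key, value in new_pairs:
        updated[key] = updated.get(key, 0) + value
    return updated


# Hypothetical stored view of quantity sold per product.
view = {"tea": 5, "mug": 1}
# Map output for one newly arrived order only.
new_pairs = [("tea", 1), ("pot", 2)]
view = update_view(view, new_pairs)
```

A reducer that is not combinable, such as one computing a median, would need the original values again and could not be updated this way.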

Part Two: Implementation

8. Key-value databases

Key-value database : A simple hash table, mainly used when all access to the database goes through the primary key. (Riak, Redis, Memcached DB, Berkeley DB, HamsterDB, Amazon DynamoDB, Project Voldemort)

Applicable cases :

Storing session information

User Configuration information

Shopping Cart Data

Unsuitable use cases :

Relationships among data

Transactions that contain multiple operations

Operations on sets of keys
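The access pattern of a key-value database can be sketched with a tiny in-memory stand-in. The `KeyValueStore` class and the session key format below are hypothetical and do not reflect the client API of any of the products listed above.

```python
class KeyValueStore:
    """Minimal in-memory stand-in for a key-value database.

    The value is opaque to the store: it can only be fetched, replaced,
    or deleted as a whole, and only by its key.
    """

    def __init__(self):
        self._buckets = {}

    def put(self, key, value):
        self._buckets[key] = value

    def get(self, key, default=None):
        return self._buckets.get(key, default)

    def delete(self, key):
        self._buckets.pop(key, None)


sessions = KeyValueStore()
sessions.put("session:abc123", {"user": "alice", "cart": ["tea"]})
session = sessions.get("session:abc123")
sessions.delete("session:abc123")
```

Session data fits because it is always read and written as a whole under one session ID. Finding every session that contains a given cart item would mean scanning all keys, which is why relationships and set operations are poor fits.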

9. Document Database

Document database : Stores and retrieves documents, which can be in formats such as XML, JSON, and BSON. (MongoDB, CouchDB, Terrastore, OrientDB, RavenDB)

Applicable cases :

Event logging

Content management system and blog platform

Web analytics and real-time analytics

E-commerce applications

Unsuitable use cases :

Complex transactions that span multiple operations

Queries against an aggregate structure that keeps changing
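Unlike a key-value store, a document database can look inside the stored value. The sketch below imitates that with plain Python dictionaries; the event documents and the `find` helper are hypothetical and much simpler than a real query language.

```python
# Hypothetical event-log documents; documents in one collection
# need not share the same fields.
events = [
    {"type": "login", "user": "alice", "device": {"os": "linux"}},
    {"type": "purchase", "user": "bob", "amount": 30},
    {"type": "login", "user": "carol"},
]


def find(documents, **criteria):
    """Return documents whose top-level fields match all given criteria."""
    return [
        doc for doc in documents
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


logins = find(events, type="login")
bob_purchases = find(events, type="purchase", user="bob")
```

Event logging works well here because each event type can carry its own fields without a schema change.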

10. Column Family Database

Column-family database : Stores keys and the values they map to; the values can be grouped into multiple column families, each of which acts as a map of data. (Cassandra, HBase, Hypertable, Amazon SimpleDB)

Applicable cases :

Event logging

Content management system and blogging platform

Counters

Expiring usage (data with a time limit)

Unsuitable use cases :

Systems that require ACID transactions for reads and writes

Early prototypes or initial technology spikes
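One way to picture the column-family model is as a two-level map: a row key leads to column families, and each family is itself a map of columns. The sketch below assumes that simplified picture; the row keys, family names, and helper functions are hypothetical.

```python
# row key -> column family -> column name -> value
# Row keys, family names, and columns here are hypothetical.
rows = {
    "customer:1234": {
        "profile": {"name": "Martin", "billing_city": "Chicago"},
        "orders": {"order:99": "shipped", "order:100": "pending"},
    },
}


def read_family(table, row_key, family):
    """Fetch only one column family of a row, not the whole row."""
    return table.get(row_key, {}).get(family, {})


def write_column(table, row_key, family, column, value):
    """Add or overwrite a single column; rows may have different columns."""
    table.setdefault(row_key, {}).setdefault(family, {})[column] = value


write_column(rows, "customer:1234", "orders", "order:101", "pending")
order_columns = read_family(rows, "customer:1234", "orders")
```

Reading the `orders` family leaves the `profile` family untouched, which reflects the idea that data accessed together is grouped into the same family.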

11. Graph Database

Graph database : Stores entities and the relationships between those entities. (Neo4j, InfiniteGraph, OrientDB, FlockDB)

Applicable cases :

Connected data

Routing, dispatch, and location-based services

Recommendation engines

Unsuitable use cases :

Situations where you need to update all of the entities, or a subset of them.
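A recommendation query is essentially a short traversal over relationships. The sketch below uses an adjacency list in plain Python rather than a graph database or query language; the people and the `friends_of_friends` helper are hypothetical.

```python
# Directed "FRIEND" relationships between hypothetical person nodes.
friends = {
    "anna": ["barbara", "carol"],
    "barbara": ["dawn"],
    "carol": ["dawn", "elizabeth"],
    "dawn": [],
    "elizabeth": [],
}


def friends_of_friends(graph, person):
    """Suggest nodes two hops away that are not already direct friends."""
    direct = set(graph.get(person, []))
    suggestions = set()
    for friend in direct:
        for candidate in graph.get(friend, []):
            if candidate != person and candidate not in direct:
                suggestions.add(candidate)
    return sorted(suggestions)


recommended = friends_of_friends(friends, "anna")
```

The traversal only touches the nodes near its starting point. An update that has to visit every entity gains nothing from that locality, which is why bulk updates are listed as a poor fit.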

12. Schema migrations

Databases with strong schemas, such as relational databases, can be migrated by saving each schema change, together with its data migration, in a version-controlled sequence.

Schemaless databases can be migrated with the same techniques as strong-schema databases, and they can also be migrated incrementally. Either way, watch out for the implicit schema in any code that accesses the data.
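An incremental migration can be sketched as code that upgrades each document when it is read, so old and new shapes coexist until every record has been touched. The `schema_version` field, the customer fields, and the name-splitting rule below are all hypothetical.

```python
def migrate_customer(doc):
    """Upgrade a customer document to the newest shape when it is read.

    Hypothetical change: version 1 stored a single "name" string, while
    version 2 splits it into "first_name" and "last_name".
    """
    if doc.get("schema_version", 1) >= 2:
        return doc
    first, _, last = doc.get("name", "").partition(" ")
    migrated = {k: v for k, v in doc.items() if k != "name"}
    migrated.update(first_name=first, last_name=last, schema_version=2)
    return migrated


old_doc = {"id": 7, "name": "Ada Lovelace"}
new_doc = migrate_customer(old_doc)
```

The assumption that `name` exists and contains a space-separated first and last name is exactly the kind of implicit schema the notes warn about.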

13. Polyglot persistence

Polyglot persistence : Using different data storage technologies to handle different data storage needs.

Encapsulating data access in a service reduces the impact that a change of database has on other parts of the system.
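A minimal sketch of this encapsulation, assuming a shopping-cart feature: other code talks only to `CartService`, and the storage backend behind it can be swapped without touching callers. Both class names and the backend interface (`load`, `save`) are hypothetical.

```python
class InMemoryCartStore:
    """One possible backend; a real one might wrap a key-value database."""

    def __init__(self):
        self._carts = {}

    def load(self, customer_id):
        return list(self._carts.get(customer_id, []))

    def save(self, customer_id, items):
        self._carts[customer_id] = list(items)


class CartService:
    """The only entry point other code uses; it hides the storage choice."""

    def __init__(self, backend):
        self._backend = backend

    def add_item(self, customer_id, item):
        items = self._backend.load(customer_id)
        items.append(item)
        self._backend.save(customer_id, items)
        return items


carts = CartService(InMemoryCartStore())
carts.add_item("alice", "tea")
items = carts.add_item("alice", "mug")
```

Any object that offers the same `load` and `save` methods can replace `InMemoryCartStore`, so a later move to a different database stays local to the backend class.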

Adding more database technologies makes programming and operations more complex, so the benefits of a candidate database must be weighed against the complexity of introducing it.

14. Beyond NoSQL

File systems : Suited to a relatively small number of large files that can be processed in big chunks.

Event sourcing : Persists every change made to the persistent state, rather than persisting only the current state of the application.
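The idea of storing changes instead of state can be sketched with a small account example. The event tuples and the function names are hypothetical; the point is that the current state is never stored directly and is rebuilt by replaying the log.

```python
def apply_event(balance, event):
    """Apply one recorded change to the current state."""
    kind, amount = event
    if kind == "deposit":
        return balance + amount
    if kind == "withdraw":
        return balance - amount
    raise ValueError(f"unknown event kind: {kind}")


def replay(events, initial=0):
    """Rebuild state by replaying every stored event in order."""
    state = initial
    for event in events:
        state = apply_event(state, event)
    return state


# Hypothetical event log for one account; it is appended to, never edited.
event_log = [("deposit", 100), ("withdraw", 30), ("deposit", 5)]
current_balance = replay(event_log)
balance_after_two = replay(event_log[:2])
```

Replaying a prefix of the log reconstructs the state at any earlier point, something a store that keeps only the current state cannot do.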

Memory image : Keeps the application state in main memory.

Version control : Built on the file system, and the most commonly used event-sourcing system. It lets team members collaboratively modify complex, interconnected systems.

XML databases, object databases ...

15. Choosing a database

Reasons to choose NoSQL :

Improve programmer productivity by using a database that better matches the application's needs.

Improve data access performance with a combination of technologies that can handle large amounts of data, reduce latency, and increase data throughput.

Before deciding to use a NoSQL technology, be sure to test whether it improves programmer productivity and data access performance as expected.

Service encapsulation of the database : Wrapping the database in a service makes it possible to change the underlying database technology later, as requirements change or technologies mature. Parts of an application can be split into separate services, which allows NoSQL databases to be introduced into an existing application.

Most applications, especially non-strategic ones, should continue to use relational database technology, at least until the NoSQL ecosystem becomes more mature.
