Microservices in Practice (V): Event-Driven Data Management for Microservices

Editor's note: This is the fifth article in a series on building applications with microservices. The first article introduces the microservices architecture pattern and discusses its advantages and disadvantages. The second and third articles describe different aspects of communication within a microservices architecture, and the fourth explores the problem of service discovery. This article looks at the distributed data management problems that arise in a microservices architecture.

1.1 Microservices and the problem of distributed data management

A monolithic application typically has a single relational database. A key benefit is that the application can use ACID transactions, which provide some important guarantees:
    • Atomicity – Changes are made atomically
    • Consistency – The state of the database is always consistent
    • Isolation – Even though transactions execute concurrently, they appear to execute serially
    • Durability – Once a transaction has committed, its changes are not lost


As a result, the application can simply begin a transaction, change (insert, update, and delete) multiple rows, and commit the transaction.

Another benefit of using a relational database is that it provides SQL, a rich, declarative, and standardized query language. You can easily write a query that combines data from multiple tables; the RDBMS query planner then determines the best way to execute it, so you do not have to worry about low-level details such as how to access the database. And because all of the application's data is in one database, it is easy to query.

In a microservices architecture, however, data access becomes much more complex because the data owned by each microservice is private to that microservice and can only be accessed through its API. Encapsulating the data in this way keeps the microservices loosely coupled so that they can evolve independently of one another. If multiple services accessed the same data, schema updates would require time-consuming, coordinated updates to all of those services.

To make matters worse, different microservices often use different kinds of databases. Modern applications store and process diverse kinds of data, and a relational database is not always the best choice. For some use cases, a particular NoSQL database might have a more convenient data model and offer much better performance and scalability. For example, a service that stores and queries text might use a text search engine such as Elasticsearch. Similarly, a service that stores social graph data might use a graph database such as Neo4j. Consequently, microservices-based applications often use a mixture of SQL and NoSQL databases, the so-called polyglot persistence approach.

A partitioned, polyglot-persistent architecture for data storage has many benefits, including loosely coupled services and better performance and scalability. However, it also introduces some distributed data management challenges.

The first challenge is how to implement business transactions that maintain consistency across multiple services. To see why this is a problem, consider an online B2B store. The Customer Service maintains information about customers, including their credit lines. The Order Service manages orders and must verify that a new order does not exceed the customer's credit limit. In a monolithic application, the Order Service can simply use an ACID transaction to check the available credit and create the order.
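The monolithic case can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table layout, the `open_store` helper, and the rule that available credit equals the limit minus the sum of existing orders are hypothetical and are not taken from the article.

```python
import sqlite3


def open_store():
    """Create an in-memory database with the two (hypothetical) tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, credit_limit INTEGER)")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER)"
    )
    conn.execute("INSERT INTO customers (id, credit_limit) VALUES (1, 100)")
    conn.commit()
    return conn


def create_order(conn, customer_id, total):
    """Check the available credit and insert the order in one local transaction."""
    # Begin explicitly so that the reads and the insert share one transaction.
    conn.execute("BEGIN IMMEDIATE")
    with conn:  # commits on success, rolls back if an exception escapes
        credit_limit = conn.execute(
            "SELECT credit_limit FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()[0]
        used = conn.execute(
            "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer_id = ?",
            (customer_id,),
        ).fetchone()[0]
        if used + total > credit_limit:
            raise ValueError("credit limit exceeded")
        cursor = conn.execute(
            "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
            (customer_id, total),
        )
        return cursor.lastrowid
```

Because both tables live in the same database, the check and the insert either succeed together or are rolled back together. This is exactly what stops working once each table belongs to a different service.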

In a microservices architecture, by contrast, the order and customer tables are private to their respective services.

The Order Service cannot access the customer table directly; it can only use the API provided by the Customer Service. The Order Service could potentially use distributed transactions, also known as two-phase commit (2PC). However, 2PC is usually not a viable option in modern applications. The CAP theorem requires you to choose between availability and ACID-style consistency, and availability is usually the better choice. Moreover, many modern technologies, such as most NoSQL databases, do not support 2PC. Maintaining data consistency across services and databases is essential, so we need another solution.

The second challenge is how to implement queries that retrieve data from multiple services. For example, imagine that the application needs to display a customer and his recent orders. If the Order Service provides an API for retrieving a customer's orders, you can retrieve this data using an application-side join: the application retrieves the customer from the Customer Service and the customer's orders from the Order Service. Suppose, however, that the Order Service only supports lookup of orders by their primary key (perhaps because it uses a NoSQL database that only supports primary-key retrieval). In that situation there is no obvious way to retrieve the needed data.
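An application-side join can be sketched as follows. The two classes are in-memory stand-ins for the remote services, and their method names (`get_customer`, `find_orders_by_customer`) are hypothetical; a real application would call each service's API over the network.

```python
class CustomerService:
    """In-memory stand-in for the Customer Service API (hypothetical)."""

    def __init__(self, customers):
        self._customers = customers

    def get_customer(self, customer_id):
        return self._customers[customer_id]


class OrderService:
    """In-memory stand-in for the Order Service API (hypothetical)."""

    def __init__(self, orders):
        self._orders = orders

    def find_orders_by_customer(self, customer_id):
        # This is the query that a primary-key-only store could not answer.
        return [o for o in self._orders if o["customer_id"] == customer_id]


def customer_with_orders(customer_service, order_service, customer_id):
    """Join the results of two service calls in the application."""
    customer = customer_service.get_customer(customer_id)
    orders = order_service.find_orders_by_customer(customer_id)
    return {**customer, "orders": orders}
```

The join only works because the stand-in Order Service can filter by customer. If it could look up orders by order ID alone, `find_orders_by_customer` could not be implemented, which is the situation the paragraph above describes.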

1.2 Event-Driven architecture

For many applications, the solution is to use an event-driven architecture. In this architecture, a microservice publishes an event when something notable happens, such as when it updates a business entity. Other microservices subscribe to those events. When a microservice receives an event, it can update its own business entities, which might lead to more events being published.

You can use events to implement business transactions that span multiple services. A transaction consists of a series of steps. Each step consists of a microservice updating a business entity and publishing an event that triggers the next step. The following sequence shows how an event-driven approach can check for available credit when an order is created; the microservices exchange events through a message broker.
    1. The Order Service creates an order with status NEW and publishes an Order Created event.
    2. The Customer Service consumes the Order Created event, reserves credit for the order, and publishes a Credit Reserved event.
    3. The Order Service consumes the Credit Reserved event and changes the status of the order to OPEN.

      More complex scenarios can introduce more steps, such as reserving inventory while checking the user's credit.
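The three steps above can be sketched with a toy in-process broker. Everything here is an assumption made for illustration: the `MessageBroker` class delivers events synchronously and is not a real broker client, the event names are spelled as identifiers, and the `CreditLimitExceeded` branch is added to show what a failed credit check might look like.

```python
from collections import defaultdict


class MessageBroker:
    """Toy synchronous broker; a real one would be an external system."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in list(self._subscribers[event_type]):
            handler(payload)


class OrderService:
    def __init__(self, broker):
        self.broker = broker
        self.orders = {}
        broker.subscribe("CreditReserved", self.on_credit_reserved)
        broker.subscribe("CreditLimitExceeded", self.on_credit_limit_exceeded)

    def create_order(self, order_id, customer_id, total):
        # Step 1: create the order in state NEW and publish Order Created.
        self.orders[order_id] = {"customer_id": customer_id, "total": total, "status": "NEW"}
        self.broker.publish(
            "OrderCreated",
            {"order_id": order_id, "customer_id": customer_id, "total": total},
        )

    def on_credit_reserved(self, event):
        # Step 3: the credit was reserved, so the order becomes OPEN.
        self.orders[event["order_id"]]["status"] = "OPEN"

    def on_credit_limit_exceeded(self, event):
        # Assumed failure path: cancel the order.
        self.orders[event["order_id"]]["status"] = "CANCELLED"


class CustomerService:
    def __init__(self, broker, available_credit):
        self.broker = broker
        self.available_credit = dict(available_credit)
        broker.subscribe("OrderCreated", self.on_order_created)

    def on_order_created(self, event):
        # Step 2: reserve credit for the order and publish the outcome.
        customer_id, total = event["customer_id"], event["total"]
        if self.available_credit.get(customer_id, 0) >= total:
            self.available_credit[customer_id] -= total
            self.broker.publish("CreditReserved", {"order_id": event["order_id"]})
        else:
            self.broker.publish("CreditLimitExceeded", {"order_id": event["order_id"]})
```

Neither service calls the other directly or touches the other's data. Each one only updates its own state and reacts to events, which is what keeps them loosely coupled.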


Provided that (a) each service atomically updates its database and publishes an event, and (b) the message broker guarantees that events are delivered at least once, you can implement business transactions that span multiple services. Note that these are not ACID transactions. They offer much weaker guarantees, such as eventual consistency. This transaction model has been referred to as the BASE model.

You can also use events to maintain materialized views that pre-join data owned by multiple microservices. The service that maintains the view subscribes to the relevant events and updates the view. For example, the Customer Order View Updater Service, which maintains a customer orders view, subscribes to the events published by the Customer Service and the Order Service.

When the Customer Order View Updater Service receives a customer or order event, it updates the customer order view datastore. You could implement the customer order view using a document database such as MongoDB and store one document for each customer. The Customer Order View Query Service handles requests for a customer and his recent orders by querying the customer order view datastore.
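A minimal sketch of such a view updater follows, with a plain dictionary standing in for the document database. The event field names (`customer_id`, `name`, `order_id`, `status`, `total`) are hypothetical.

```python
class CustomerOrderView:
    """Keeps one pre-joined document per customer, updated from events."""

    def __init__(self):
        self.documents = {}  # customer_id -> document

    def _document(self, customer_id):
        # Create the document on first use, whichever event arrives first.
        return self.documents.setdefault(
            customer_id, {"customer_id": customer_id, "name": None, "orders": {}}
        )

    def on_customer_event(self, event):
        self._document(event["customer_id"])["name"] = event["name"]

    def on_order_event(self, event):
        orders = self._document(event["customer_id"])["orders"]
        orders[event["order_id"]] = {"status": event["status"], "total": event["total"]}

    def query(self, customer_id):
        """What the query service would return for one customer."""
        return self.documents.get(customer_id)
```

Since events from the two services can arrive in any order, the sketch creates the document lazily instead of assuming that the customer event comes first.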

An event-driven architecture has several benefits and drawbacks. It enables the implementation of transactions that span multiple services and provide eventual consistency, and it enables an application to maintain materialized views. One drawback is that the programming model is more complex than when using ACID transactions. To recover from application-level failures, you often must implement compensating transactions; for example, you must cancel an order if the credit check fails. Applications must also deal with inconsistent data, because changes made by in-flight transactions are visible, and an application can also see inconsistencies if it reads from a materialized view that has not yet been updated. Another drawback is that subscribers must detect and ignore duplicate events.
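One common way for a subscriber to ignore duplicates is to remember the identifiers of the events it has already processed. The sketch below assumes that every event carries a unique `event_id` field, which the article does not specify, and it keeps the identifiers in memory only.

```python
class IdempotentHandler:
    """Wraps an event handler so that redelivered events are ignored."""

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()  # in-memory only; see the note after the code

    def __call__(self, event):
        if event["event_id"] in self._seen:
            return False  # duplicate delivery, skip it
        self._handler(event)
        self._seen.add(event["event_id"])
        return True


reserved = []
reserve_credit = IdempotentHandler(lambda event: reserved.append(event["order_id"]))
reserve_credit({"event_id": "e-1", "order_id": "o1"})
reserve_credit({"event_id": "e-1", "order_id": "o1"})  # redelivery has no effect
```

In a real service the set of processed identifiers would need to be stored durably, ideally in the same local transaction as the state change, so that a restart does not cause old events to be applied again.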

1.3 Achieving atomicity

In an event-driven architecture there is also the problem of atomically updating the database and publishing an event. For example, the Order Service must insert a row into the order table and publish an Order Created event, and these two operations must be done atomically. If the service crashes after updating the database but before publishing the event, the system becomes inconsistent. The standard way to ensure atomicity is to use a distributed transaction involving the database and the message broker. However, for the reasons described above, such as the CAP theorem, this is exactly what we do not want to do.

1.3.1 Publishing events using local transactions

One way to achieve atomicity is for the application to publish events using a multi-step process involving only local transactions. The trick is to have an event table, which functions as a message queue, in the database that stores the state of the business entities. The application begins a (local) database transaction, updates the state of the business entities, inserts an event into the event table, and commits the transaction. A separate application thread or process queries the event table, publishes the events to the message broker, and then uses a local transaction to mark the events as published.

The Order Service inserts a row into the order table and inserts an Order Created event into the event table. The event publisher thread or process queries the event table for unpublished events, publishes them, and then updates the event table to mark the events as published.
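The event-table approach can be sketched with sqlite3. The schema, the `open_store` helper, and the `publish` callback (which stands in for a message broker client) are assumptions made for this example.

```python
import json
import sqlite3


def open_store():
    """Create an in-memory database holding both the orders and the event table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER,"
        " total INTEGER, status TEXT)"
    )
    conn.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY, type TEXT, payload TEXT,"
        " published INTEGER DEFAULT 0)"
    )
    return conn


def create_order(conn, customer_id, total):
    """Insert the order and its event in the same local transaction."""
    with conn:  # both inserts commit together or not at all
        cursor = conn.execute(
            "INSERT INTO orders (customer_id, total, status) VALUES (?, ?, 'NEW')",
            (customer_id, total),
        )
        order_id = cursor.lastrowid
        payload = json.dumps(
            {"order_id": order_id, "customer_id": customer_id, "total": total}
        )
        conn.execute(
            "INSERT INTO events (type, payload) VALUES ('OrderCreated', ?)", (payload,)
        )
    return order_id


def publish_pending(conn, publish):
    """Publisher loop body: send unpublished events, then mark them as published."""
    rows = conn.execute(
        "SELECT id, type, payload FROM events WHERE published = 0 ORDER BY id"
    ).fetchall()
    for event_id, event_type, payload in rows:
        publish(event_type, json.loads(payload))
        with conn:  # separate local transaction for the published flag
            conn.execute("UPDATE events SET published = 1 WHERE id = ?", (event_id,))
    return len(rows)
```

If the publisher crashes after calling `publish` but before setting the flag, the event is sent again on the next run. The sketch therefore gives at-least-once delivery, and subscribers still have to tolerate duplicates.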

This approach has several benefits and drawbacks. One benefit is that it guarantees an event is published for each update without relying on 2PC. Also, the application publishes business-level events, which eliminates the need to infer them. One drawback is that the approach is potentially error-prone, since the developer must remember to publish events. A further limitation is that it is challenging to implement with some NoSQL databases because of their limited transaction and query capabilities.

This approach eliminates the need for 2PC by having the application use local transactions to update state and publish events. Let's now look at an approach that achieves atomicity by having the application simply update state.

1.3.2 Mining a database transaction log

Another way to achieve atomicity without 2PC is for the events to be published by a thread or process that mines the database's transaction or commit log. The application updates the database, which results in changes being recorded in the database's transaction log. The transaction log miner thread or process reads the transaction log and publishes events to the message broker.

An example of this approach is the LinkedIn Databus project. Databus mines the Oracle transaction log and publishes events corresponding to the changes. LinkedIn uses Databus to keep the various data stores in its systems consistent with one another.

Another example is the streams mechanism in AWS DynamoDB, a managed NoSQL database. A DynamoDB stream contains the time-ordered sequence of changes (create, update, and delete operations) made to the items in a DynamoDB table in the last 24 hours. An application can read those changes from the stream and, for example, publish them as events.

Transaction log mining has various benefits and drawbacks. One benefit is that it guarantees an event is published for each update without using 2PC. Transaction log mining can also simplify the application by separating event publishing from the application's business logic. A major drawback is that the format of the transaction log is specific to each database and can even change between database versions. Also, it can be difficult to reverse engineer high-level business events from the low-level updates recorded in the transaction log.
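The last drawback is easier to see in code. The sketch below translates low-level change records into business events. The record layout (`table`, `op`, `before`, `after`) is invented for this example and is not the format used by Databus, Oracle, or DynamoDB streams.

```python
def to_business_event(change):
    """Map one low-level change record to a business event, or None."""
    if change["table"] != "orders":
        return None  # changes to other tables are not the miner's concern here
    before, after = change.get("before"), change.get("after")
    if change["op"] == "INSERT":
        return {"type": "OrderCreated", "order_id": after["id"], "total": after["total"]}
    if change["op"] == "UPDATE" and before["status"] != after["status"]:
        # The business meaning has to be inferred from which column changed.
        if after["status"] == "OPEN":
            return {"type": "OrderOpened", "order_id": after["id"]}
        if after["status"] == "CANCELLED":
            return {"type": "OrderCancelled", "order_id": after["id"]}
    return None  # an update with no known business meaning


def mine(log, publish):
    """Read change records in order and publish the events they imply."""
    published = 0
    for change in log:
        event = to_business_event(change)
        if event is not None:
            publish(event)
            published += 1
    return published
```

The application code never mentions events at all, which is the simplification described above. The cost is that `to_business_event` has to guess the business intent from row-level differences.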

Transaction log mining eliminates the need for 2PC by having the application do one thing: update the database. Let's now look at a different approach that eliminates the updates and relies solely on events.

1.3.3 Using event sourcing

Event sourcing achieves atomicity without 2PC by using a radically different, event-centric approach to persisting business entities. Rather than store the current state of an entity, the application stores a sequence of state-changing events. The application reconstructs an entity's current state by replaying the events. Whenever the state of a business entity changes, a new event is appended to the list of events. Since saving an event is a single operation, it is inherently atomic.

To see how event sourcing works, consider the Order entity as an example. In a traditional approach, each order maps to a row in an order table and to rows in, for example, an Order_line_item table. When using event sourcing, however, the Order Service stores an order as a sequence of state-changing events: Created, Approved, Shipped, Cancelled. Each event contains sufficient data to reconstruct the order's state.
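Replaying events to rebuild an order can be sketched as a fold over the event list. The event type names and fields below are hypothetical; only the four state changes themselves come from the text.

```python
from functools import reduce


def apply_event(state, event):
    """Return the order state that results from applying one event."""
    kind = event["type"]
    if kind == "OrderCreated":
        return {
            "order_id": event["order_id"],
            "items": list(event["items"]),
            "status": "CREATED",
        }
    if kind == "OrderApproved":
        return {**state, "status": "APPROVED"}
    if kind == "OrderShipped":
        return {**state, "status": "SHIPPED"}
    if kind == "OrderCancelled":
        return {**state, "status": "CANCELLED"}
    raise ValueError(f"unknown event type: {kind}")


def replay(events):
    """Rebuild the current state by applying the events in order."""
    return reduce(apply_event, events, None)
```

Replaying only a prefix of the list yields the order's state as of that point in its history, without any extra bookkeeping.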

Events persist in an event store, which is a database of events. The store has an API for adding and retrieving an entity's events. The event store also behaves like the message broker described earlier: it provides an API that enables services to subscribe to events, and it delivers all events to all interested subscribers. The event store is the backbone of an event-driven microservices architecture.
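The two roles of the event store, database and broker, can be sketched with an in-memory class. The method names (`append`, `load`, `subscribe`) are hypothetical and do not correspond to any particular event store product.

```python
from collections import defaultdict


class EventStore:
    """In-memory sketch: stores events per entity and notifies subscribers."""

    def __init__(self):
        self._streams = defaultdict(list)  # entity_id -> list of events
        self._subscribers = []

    def subscribe(self, handler):
        """Register a callback that receives every appended event."""
        self._subscribers.append(handler)

    def append(self, entity_id, event):
        """Save one event for an entity, then deliver it to subscribers."""
        self._streams[entity_id].append(event)
        for handler in list(self._subscribers):
            handler(entity_id, event)

    def load(self, entity_id):
        """Return a copy of the events recorded for one entity, oldest first."""
        return list(self._streams.get(entity_id, []))
```

Because appending is the only write operation, saving the state change and publishing the event are the same step. That is why this design does not need a separate mechanism to keep the database and the broker in agreement.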

Event sourcing has several benefits. It solves one of the key problems in implementing an event-driven architecture and makes it possible to reliably publish events whenever state changes. As a result, it solves data consistency issues in a microservices architecture. Also, because it persists events rather than domain objects, it mostly avoids the object-relational impedance mismatch problem.

Event sourcing also provides a 100% reliable audit log of the changes made to a business entity and makes it possible to determine the state of an entity at any point in time. Another major benefit is that the business logic consists of loosely coupled business entities that exchange events. These benefits make it relatively easy to migrate from a monolithic application to a microservices architecture.

Event sourcing also has some drawbacks. It is a different and unfamiliar style of programming, so there is a learning curve. The event store only directly supports lookup of business entities by primary key, so you must use Command Query Responsibility Segregation (CQRS) to implement queries. As a result, applications must handle eventually consistent data.

1.4 Summary

In a microservices architecture, each microservice has its own private datastore. Different microservices might use different SQL and NoSQL databases. While this database architecture has significant benefits, it creates some distributed data management challenges. The first challenge is how to implement business transactions that maintain consistency across multiple services. The second challenge is how to implement queries that retrieve data from multiple services.

For many applications, the solution is to use an event-driven architecture. One challenge with implementing an event-driven architecture is how to atomically update state and publish events. There are a few ways to accomplish this, including using the database as a message queue, transaction log mining, and event sourcing.

Future articles in this series will explore other aspects of microservices in depth.
The other articles in this seven-part series are as follows:
    • Introduction to Microservices: http://dockone.io/article/394
    • Building Microservices: Using an API Gateway: http://dockone.io/article/482
    • Building Microservices: Inter-Process Communication in a Microservices Architecture: http://dockone.io/article/549
    • Service Discovery in a Microservices Architecture: http://dockone.io/article/771


original link: Event-driven Data Management for microservices (translator: 杨峰)
