Data architecture design in micro-service development

Source: Internet
Author: User
Preface


Micro-services are currently a very popular architectural style. By making services small and atomic, and by building on the flexibility and high availability of a distributed architecture, they allow loose coupling between business functions, flexible recombination of business capabilities, and high availability of the system as a whole. This provides a good foundation for business innovation and business continuity. This article shares design ideas and key points for the data architecture under a micro-service architecture, covering the following topics.


Multi-layer data architecture design in micro-service technology framework

Key points in the design of data architecture

Point 1: Data ease of use

Point 2: Primary data, secondary data, and data decoupling

Point 3: Splitting databases and tables

Point 4: Multi-source data adaptation

Point 5: Multi-source data caching

Point 6: Data mart


To make this easier to follow, this article uses a simplified sales model as a running example. Figure 1 shows the relationships among the customer, the seller, the commodity, the pricing, and the order (payment, logistics, and other elements are omitted here).


Figure 1 Sales Model


In this sales model, the seller provides commodities and sets prices, and the customer chooses a product to buy, which creates a sales order. Following micro-service design principles, the system can be divided into a customer service, a seller service, a commodity service, a pricing service, an order service, and public services (such as authentication, permissions, and notifications), as shown in Figure 2.


Figure 2 Micro-service functions

Multi-layer data architecture design in micro-service architecture


Distributed architectures are generally divided into three tiers: SaaS (Software as a Service), PaaS (Platform as a Service), and IaaS (Infrastructure as a Service). The SaaS layer provides business services to the outside, the PaaS layer provides the basic application platform, and the IaaS layer provides the infrastructure. Micro-services cut vertically through these three tiers and are independent of one another. The data architecture design therefore has to consider what each of the three tiers requires of the data, as well as the independence of each micro-service.


Hierarchical design of data architecture


Fig. 3 The Micro-service technology framework


As shown in Figure 3, the IaaS layer provides the physical environment in which the system runs (this involves a lot of hardware and network content and is omitted in this article). The PaaS layer is divided into three sub-layers. The basic service layer is mainly responsible for data storage and processing. The transaction framework layer is mainly responsible for micro-service registration, scheduling management, and distributed transaction processing. The application service layer mainly implements the various micro-service APIs, which are called directly by other micro-services and by services in the SaaS layer. The SaaS layer provides the business services that are exposed externally.


The data architecture is divided into a Raw data layer, a Logic data (inner) layer, and a Logic data (outer) layer (the IaaS layer mainly concerns the basic hardware environment and is omitted in this article). The Raw data layer holds the underlying content in a database, in files, or in other forms. The Logic data (inner) layer holds the logical data used by the micro-service APIs, such as customer data and order data. The Logic data (outer) layer holds the data provided to external services, such as customer order data. The resulting layered data architecture is shown in Figure 4.


Figure 4 Data Tiering architecture
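
The three data layers described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the sample rows, the function names `get_customer` and `get_orders`, and the `CustomerOrderView` shape are hypothetical and do not come from the original design.

```python
from dataclasses import dataclass

# Raw data layer: rows as they might sit in a database or file.
# (Hypothetical sample data, for illustration only.)
RAW_CUSTOMERS = {1: {"name": "Alice"}}
RAW_ORDERS = [
    {"order_id": 100, "customer_id": 1, "amount": 25.0},
    {"order_id": 101, "customer_id": 1, "amount": 40.0},
]


# Logic data (inner) layer: per-service accessors used by the micro-service APIs.
def get_customer(customer_id):
    return RAW_CUSTOMERS[customer_id]


def get_orders(customer_id):
    return [o for o in RAW_ORDERS if o["customer_id"] == customer_id]


# Logic data (outer) layer: a business-facing view that combines inner data,
# here "customer order data".
@dataclass
class CustomerOrderView:
    customer_name: str
    order_count: int
    total_amount: float


def customer_order_view(customer_id):
    customer = get_customer(customer_id)
    orders = get_orders(customer_id)
    return CustomerOrderView(
        customer_name=customer["name"],
        order_count=len(orders),
        total_amount=sum(o["amount"] for o in orders),
    )
```

In this sketch the outer view never touches the raw rows directly, so the storage format could change without affecting callers of `customer_order_view`.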


In addition, much information is presented as charts or reports. On top of the Logic data (outer) layer, you can therefore build information blocks (commonly used blocks of information), set a view type (display mode) for them, and render the final view.


As shown in Figure 4, the closer a layer is to the external service layer, the more the customer influences its design, and the more usability, ease of use, and applicability need to be considered. Conversely, the farther a layer is from the external service layer, the more the design is concerned with data storage.


The benefit of the three-layer data architecture is that data transitions layer by layer from the system implementation to the business implementation, which keeps business data and system data loosely coupled. It also allows both the business and the system to be extended flexibly.

Key points in the design of data architecture


The previous section covered the layered design of the data architecture. The following are the key points in data architecture design.


Point 1: Data ease of use


The ultimate goal of data, however implemented, is to be used by the business (or the customer). Therefore, the ease of use of data is critical when providing services externally.

Figure 5 Data ease of use


As shown in Figure 5, in the Logic data (inner) layer customer information is stored by splitting the personal information into several child tables, for the sake of flexibility and redundancy. For example, the person address table can store an unlimited number of addresses per customer. The advantage is that each time a person's address changes, the existing record is not updated in place. Instead, a new address record is created and the original address is kept as historical data, which makes it easy to recover data quickly and to trace history. However, when the Logic data (outer) layer provides data externally, the first consideration is to provide sufficient information in a single request (after all, queries are far more frequent than modifications) and to leave out information the business scenario does not need. For example, if an ordinary customer is offered only three commonly used addresses, the outer data design places address 1, address 2, and address 3 in a single table.
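
The contrast between the two layers can be sketched as follows. This is a minimal illustration, not the original implementation: the function names and record fields are hypothetical, and treating the three most recent addresses as the "commonly used" ones is an assumption.

```python
from datetime import date

# Inner layer: append-only address history (hypothetical structure).
address_history = []


def add_address(customer_id, address, valid_from):
    """Record a new address row instead of overwriting the previous one."""
    address_history.append(
        {"customer_id": customer_id, "address": address, "valid_from": valid_from}
    )


def outer_customer_record(customer_id, name, max_addresses=3):
    """Build the flat outer-layer record with address_1 .. address_3.

    Assumption: the most recent addresses stand in for the 'common' ones.
    """
    rows = [r for r in address_history if r["customer_id"] == customer_id]
    rows.sort(key=lambda r: r["valid_from"], reverse=True)
    record = {"customer_id": customer_id, "name": name}
    for i in range(max_addresses):
        record[f"address_{i + 1}"] = rows[i]["address"] if i < len(rows) else None
    return record
```

A caller reads one flat record per customer, while the full history stays available in the inner layer for recovery and tracing.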


Point 2: Primary data, secondary data, and data decoupling


It is not realistic for the data of each micro-service API to be completely independent. An order, for example, involves commodities, customers (including shippers), sellers, and prices. If all of this data were managed inside the order service API, every change to customer information, every price adjustment, and so on would have to be synchronized into the order API's data, and the degree of data coupling would become very high. Data design therefore needs to reduce the interdependence between data. First, determine the primary and secondary data of each micro-service API. Primary data is the core data of a micro-service API and is concentrated in that one API, such as order data in the order service API. Secondary data is data referenced or mapped from other micro-service APIs, such as commodity data and price data in the order service API. Second, to reduce coupling, use data association tables to represent the relationships between data. To remove a relationship, you simply delete the association table, which does not affect the data itself. This is shown in Figure 6.


Fig. 6 Primary and secondary data, and data decoupling
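
The association-table idea can be sketched with Python's built-in `sqlite3` module. The table and column names (`orders`, `commodity_ref`, `order_commodity`) are hypothetical, and a single in-memory database stands in for what would be separate services in practice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Primary data of the order service.
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, status TEXT);
    -- Secondary data: a reference to commodity data owned by another service.
    CREATE TABLE commodity_ref (commodity_id INTEGER PRIMARY KEY, name TEXT);
    -- Association table: the only place the relationship is recorded.
    CREATE TABLE order_commodity (
        order_id INTEGER,
        commodity_id INTEGER,
        PRIMARY KEY (order_id, commodity_id)
    );
    INSERT INTO orders VALUES (1, 'open');
    INSERT INTO commodity_ref VALUES (10, 'keyboard');
    INSERT INTO order_commodity VALUES (1, 10);
    """
)


def commodities_for_order(order_id):
    """Resolve an order's commodities through the association table."""
    rows = conn.execute(
        """
        SELECT c.name
        FROM order_commodity AS oc
        JOIN commodity_ref AS c ON c.commodity_id = oc.commodity_id
        WHERE oc.order_id = ?
        """,
        (order_id,),
    ).fetchall()
    return [row[0] for row in rows]


linked = commodities_for_order(1)

# Removing the relationship touches only the association table.
conn.execute("DELETE FROM order_commodity WHERE order_id = 1")
after_unlink = commodities_for_order(1)
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
commodity_count = conn.execute("SELECT COUNT(*) FROM commodity_ref").fetchone()[0]
```

After the delete, both the order row and the commodity row are still present; only the link between them is gone.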


Point 3: Splitting databases and tables


As business volume grows, a single database or a single table accumulates a large amount of data. Order data is an example: as the number of customers increases over time, more and more orders are generated. Once the data reaches a certain size, the performance of data operations drops sharply, which is what we commonly describe as the database "grinding to a halt". Splitting databases and tables should therefore be considered in the data architecture design phase.


As shown in Figure 7, splitting databases means dividing the order data into a current application database, historical databases, and a historical archive database. The current application database supports the creation of new orders and the create, read, update, and delete operations on orders that are still in progress. The historical databases (in this example, one for the last 3 months and one for the last year) serve customers who want to view past orders. The historical archive data (archived by year) is in principle not exposed directly to customers and is used for reference and statistical analysis. The current application database can be split further by customer number range, which keeps the size of each database under control. Splitting tables means storing a single piece of information across two or more tables. For example, order information can be split into a basic information table and a details table, which serve basic order queries and order detail queries respectively. In short, the core idea of splitting databases and tables is to control the load on any single database (both data volume and the amount of information per record) and to handle the growth of business data with multiple tables and databases.


Fig. 7 Splitting databases and tables
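
The routing rules described above can be sketched as a small Python function. The database names, the `CUSTOMER_RANGE_SIZE` value, and the exact 90-day and 365-day cut-offs are assumptions chosen for illustration; the article only states the general scheme.

```python
from datetime import date, timedelta

# Hypothetical size of each customer-number range in the current database.
CUSTOMER_RANGE_SIZE = 1_000_000


def route_order(customer_id, order_date, completed, today):
    """Return the name of the database that should hold this order."""
    if not completed:
        # Orders in progress live in the current application database,
        # split further by customer number range.
        return f"orders_current_{customer_id // CUSTOMER_RANGE_SIZE}"
    age = today - order_date
    if age <= timedelta(days=90):
        return "orders_history_3m"
    if age <= timedelta(days=365):
        return "orders_history_1y"
    # Older orders go to the archive, organized by year.
    return f"orders_archive_{order_date.year}"


def split_order(order):
    """Split one order into a basic-information row and detail rows."""
    basic = {key: order[key] for key in ("order_id", "customer_id", "status")}
    details = [{"order_id": order["order_id"], **item} for item in order["items"]]
    return basic, details
```

Passing `today` explicitly keeps the routing deterministic, which makes the rule easy to check against known dates.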


Point 4: Multi-source data adaptation


Beyond traditional relational databases, there are many other kinds of data sources, such as images, audio, video, and other multimedia files or data streams, as well as heterogeneous data in CSV, TXT, Doc, Excel, PDF, XML, and other formats. All of this data needs to be processed and converted into manageable data. The data architecture design therefore needs to configure a corresponding read/write adapter for each type of data source, together with unified scheduling, as shown in Figure 8.


Fig. 8 Multi-source data adaptation
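
A minimal sketch of per-source adapters behind one dispatch function, using only the Python standard library, might look like this. It covers reading only, and just three of the formats mentioned; the `ADAPTERS` registry, the `load` function, and the assumed XML layout (a root element containing flat record elements) are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET


def read_csv(text):
    """CSV adapter: one dict per row, keyed by the header line."""
    return list(csv.DictReader(io.StringIO(text)))


def read_xml(text):
    """XML adapter: assumes <root><record><field>value</field>...</record></root>."""
    root = ET.fromstring(text)
    return [{field.tag: field.text for field in record} for record in root]


def read_txt(text):
    """Plain-text adapter: one record per line."""
    return [{"line": line} for line in text.splitlines()]


# Unified scheduling: a single registry maps each source type to its adapter.
ADAPTERS = {"csv": read_csv, "xml": read_xml, "txt": read_txt}


def load(source_type, text):
    """Dispatch to the adapter registered for this source type."""
    try:
        adapter = ADAPTERS[source_type]
    except KeyError:
        raise ValueError(f"no adapter registered for {source_type!r}") from None
    return adapter(text)
```

Every adapter returns the same shape, a list of dicts, so downstream code does not need to know which source the data came from. Supporting a new format means registering one more function.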


Point 5: Multi-source data caching


The performance of data processing depends not only on the complexity of the processing logic but also, to a large extent, on the time spent operating on the target data (including reads and writes on disk devices and network transmission). Network speeds have improved greatly, especially with optical fiber, but the read/write efficiency of mechanical disks has not improved significantly, so reducing disk reads and writes is an important way to improve efficiency. Data caching keeps commonly used data (data that does not change frequently) and recently used data in memory. This significantly reduces the cost of disk operations and improves the performance of the whole data system, as shown in Figure 9.
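
Keeping recently used data in memory can be sketched with a small read-through cache. The least-recently-used eviction policy, the class name `ReadThroughCache`, and the `loader` callback that stands in for a disk or network read are all assumptions for illustration; the article does not prescribe a particular caching policy.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Keep the most recently used items in memory; load the rest on demand."""

    def __init__(self, loader, capacity=2):
        self._loader = loader        # stand-in for a slow disk or network read
        self._capacity = capacity
        self._items = OrderedDict()  # ordered from least to most recently used
        self.loads = 0               # number of times the slow loader was called

    def get(self, key):
        if key in self._items:
            # Cache hit: mark the key as most recently used.
            self._items.move_to_end(key)
            return self._items[key]
        # Cache miss: fall back to the slow source and remember the result.
        value = self._loader(key)
        self.loads += 1
        self._items[key] = value
        if len(self._items) > self._capacity:
            # Evict the least recently used entry.
            self._items.popitem(last=False)
        return value
```

The `loads` counter makes the saving visible: repeated reads of the same key reach the slow source only once, as long as the key has not been evicted.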

