A flexible data architecture

Source: Internet
Author: User

E-commerce data architectures tend to be complex, as anyone who has worked in e-commerce knows from experience.
Where exactly does the complexity come from? A few concrete examples:
1 Each product has its own unique attributes. For example, the attributes of shoes and computers are completely different. How do you construct a product data architecture that covers both?
2 How do you maintain it, so that the data structure does not become extremely complex, and impossible to extend or maintain, as the business grows?
3 More kinds of goods may be sold in the future, and people will keep inventing new products. How do you adapt to that change?
4 The number of concurrent requests per second is huge. How should the data architecture account for that?
5 When product data is misconfigured, how do you roll back quickly in place and stop the bleeding in production?
6 How do you support the entire development cycle?
7 Other reasons.
In a scenario like e-commerce, many of these problems become complicated.
If we still design the data architecture the traditional way, for example with a data schema tightly bound to the domain model, it will inevitably grow more and more complex as the business develops. A new kind of data architecture is therefore needed to solve this fundamental problem.
Here is an approach that has been validated in a real production environment. Its main ideas are:
1 Establish a unified, manageable data platform to prevent data islands, improve data sharing, and allow for performance monitoring and improvement.
2 The data platform provides mechanisms, not policies. Policy is decided on the app side: each app determines the upper-level data architecture of its own domain. Any kind of e-commerce app can then "grow out" of the data platform, with the schema freed up so that development can be code first.
3 Provide data sandboxes and multi-version backtracking, so that data can be cloned quickly and damage contained quickly.
4 Provide an abstraction of the basic e-commerce data architecture.

Based on these considerations, we arrived at the following design and implementation.
The data schemas bound to different processes and nodes are not the same.
For example, different types of customers (ordinary customers, account managers, and so on) go through different business processes on the platform, which produce different types of orders, and different types of orders produce different settlement data models.
Dynamic schemas are therefore a key design direction. Otherwise the data schema is bound to become very complex and hard to extend and maintain, and the growing number of tables also brings performance problems.
An Internet system is nothing more than: APP = function orchestration + process orchestration + business rules + data orchestration
So the basic design idea is to separate the static from the dynamic:
The static part is the system metadata, which provides process orchestration, function orchestration, business rules, and metadata management for dynamic tables.
The dynamic part is the dynamic tables, which hold the final business data. The business data can be encrypted as a whole (because it is JSON), and it is accessed through data middleware (such as Mycat). The implementation is as follows:
1 The table space is a MySQL 5.7+ physical table. Every table space has the same fields:

Id | header | payload | creator | Createtime

2 We define a physical table (a metadata table, part of the static side) that manages the table spaces. Once a table space is registered in this table, a physical table is generated automatically based on the layout in 1.
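Steps 1 and 2 can be sketched as follows. This is a minimal sketch, not the original implementation: the `TableSpaceRegistry` class, the function names, and the column types are assumptions (the article gives only the five field names), and the DDL is returned as a string rather than executed against MySQL.

```python
import re

# Column types are assumptions; the article only names the five fields.
TABLE_SPACE_DDL = """CREATE TABLE `{name}` (
  `Id` BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  `header` JSON NOT NULL,
  `payload` JSON NULL,
  `creator` VARCHAR(64) NOT NULL,
  `Createtime` DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB"""

# Names are interpolated into DDL, so only plain identifiers are accepted.
_VALID_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]{0,63}$")


def table_space_ddl(name: str) -> str:
    """Return the CREATE TABLE statement for one table space."""
    if not _VALID_NAME.match(name):
        raise ValueError(f"invalid table space name: {name!r}")
    return TABLE_SPACE_DDL.format(name=name)


class TableSpaceRegistry:
    """In-memory stand-in for the static metadata table of step 2."""

    def __init__(self):
        self._spaces = {}

    def register(self, name: str) -> str:
        """Register a table space and return the DDL that would create it."""
        if name in self._spaces:
            raise ValueError(f"table space already registered: {name}")
        ddl = table_space_ddl(name)
        self._spaces[name] = ddl
        return ddl

    def names(self):
        """List the registered table spaces in alphabetical order."""
        return sorted(self._spaces)
```

In a real deployment the returned statement would be executed by the data API server at registration time, so that every table space ends up with the same layout.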
3 We define a physical table (a metadata table, part of the static side) that manages the dynamic table schemas. It stores, as a JSON string, the json-schema that describes the structure of each dynamic table, and virtual columns are generated dynamically from this schema (this is a capability of MySQL 5.7+ itself; see the JSON functions in the official MySQL documentation).
4 When we write data to a newly created table space, we fill in the header and payload fields according to the structure described by the json-schema, and use json-schema-validator to verify that the format is correct. The form is as follows:
Header: {"table": "A", "sandbox": "Test", "Version": "Timestamp", "snapshot": "tag name", "SchemaName": "Aschema"}
payload: {"id": "1", "Name": "Leo", "Price": 78.89} // Business data
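The write path of steps 3 and 4 might look like the sketch below. Everything named here is hypothetical: `A_SCHEMA` is an invented schema that happens to match the payload above, `validate` covers only required keys and primitive types and stands in for a real json-schema validator, and the `v_` column prefix is an assumption made to avoid a clash with the physical `Id` column.

```python
import json
import time

# Hypothetical json-schema for virtual table "A", matching the payload above.
A_SCHEMA = {
    "type": "object",
    "required": ["id", "Name", "Price"],
    "properties": {
        "id": {"type": "string"},
        "Name": {"type": "string"},
        "Price": {"type": "number"},
    },
}

_PY_TYPES = {"string": str, "number": (int, float), "integer": int}
_SQL_TYPES = {"string": "VARCHAR(255)", "number": "DECIMAL(18,4)", "integer": "BIGINT"}


def validate(payload: dict, schema: dict) -> None:
    """Check required keys and primitive types only (a small JSON Schema subset)."""
    for key in schema.get("required", []):
        if key not in payload:
            raise ValueError(f"missing required field: {key}")
    for key, value in payload.items():
        spec = schema["properties"].get(key)
        if spec is None:
            continue  # undeclared fields are allowed, as in JSON Schema by default
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, _PY_TYPES[spec["type"]]):
            raise ValueError(f"wrong type for field: {key}")


def build_record(table, sandbox, schema_name, payload, schema, creator,
                 version=None, snapshot=None):
    """Validate the payload and return the row to insert into a table space."""
    validate(payload, schema)
    header = {
        "table": table,
        "sandbox": sandbox,
        "Version": version if version is not None else time.time_ns(),
        "snapshot": snapshot,
        "SchemaName": schema_name,
    }
    return {"header": json.dumps(header), "payload": json.dumps(payload), "creator": creator}


def virtual_column_ddl(table_space: str, schema: dict) -> list:
    """One ALTER TABLE per schema property, using MySQL 5.7 generated columns."""
    stmts = []
    for name, spec in schema["properties"].items():
        stmts.append(
            f"ALTER TABLE `{table_space}` ADD COLUMN `v_{name}` {_SQL_TYPES[spec['type']]} "
            f"GENERATED ALWAYS AS (JSON_UNQUOTE(JSON_EXTRACT(`payload`, '$.{name}'))) VIRTUAL"
        )
    return stmts
```

Calling `build_record("A", "Test", "Aschema", {"id": "1", "Name": "Leo", "Price": 78.89}, A_SCHEMA, "leo")` yields a row whose `header` and `payload` are JSON strings ready to be stored in the two JSON columns.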
5 An upper-level query looks like this: DataContext.select().from("a").where("id=1"); (Jinq or jOOQ is recommended)

Based on the metadata, the virtual table name given to DataContext is resolved to all of the table spaces that hold it. DataContext then builds a UNION ALL statement dynamically and submits it to the MySQL server for execution.
6 The underlying SQL is roughly: select id, name, price from [table space name] where table="a" and id=1 union all [the same query against each other table space] where table="a" ...
We agree on the following: 1) a virtual table A can be stored in n table spaces; 2) a table space can hold n virtual tables; 3) the underlying DataContext is transparent to the upper layer, which supplies only the virtual table name, the columns to fetch, and the conditions, and gets the data back without noticing any difference.
The relationship between virtual tables and table spaces is maintained by a static metadata table.
The purpose is that a virtual table can be spread across different table spaces, even table spaces in different databases. The data is hashed to form a distributed table, which gives better performance.
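The lookup and hashing described in steps 5 and 6 can be illustrated with a small sketch. The mapping `VIRTUAL_TABLE_SPACES`, the table space names, and both function names are assumptions for illustration; the SQL filters on the JSON columns directly, and `%s` is the placeholder style used by common Python MySQL drivers.

```python
import hashlib

# Hypothetical content of the static metadata table:
# virtual table name -> table spaces that hold its rows.
VIRTUAL_TABLE_SPACES = {
    "a": ["ts_goods_01", "ts_goods_02"],
}


def route(virtual_table: str, virtual_id: str) -> str:
    """Pick the table space for a write by hashing the virtual table ID."""
    spaces = VIRTUAL_TABLE_SPACES[virtual_table]
    digest = hashlib.sha1(virtual_id.encode("utf-8")).digest()
    return spaces[int.from_bytes(digest[:4], "big") % len(spaces)]


def build_union_sql(virtual_table: str, columns: list, virtual_id: str):
    """Return (sql, params) for a lookup by virtual ID across all table spaces."""
    # Column and table space names are interpolated, so they must come from
    # trusted schema metadata, never from user input.
    select_list = ", ".join(
        f"JSON_UNQUOTE(JSON_EXTRACT(`payload`, '$.{c}')) AS `{c}`" for c in columns
    )
    branch = (
        "SELECT {cols} FROM `{space}` "
        "WHERE JSON_UNQUOTE(JSON_EXTRACT(`header`, '$.table')) = %s "
        "AND JSON_UNQUOTE(JSON_EXTRACT(`payload`, '$.id')) = %s"
    )
    parts, params = [], []
    for space in VIRTUAL_TABLE_SPACES[virtual_table]:
        parts.append(branch.format(cols=select_list, space=space))
        params.extend([virtual_table, virtual_id])
    return " UNION ALL ".join(parts), params
```

The caller only supplies the virtual table name, the columns, and the condition, for example `build_union_sql("a", ["id", "Name", "Price"], "1")`; which table spaces are involved stays hidden behind the metadata lookup.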
In addition, upper-level code accesses the data source through the data middleware, which shards tables automatically and further improves performance.
7 Together this forms a virtual table system.
8 We call the id inside the payload JSON the virtual table ID. It must not be confused with the Id of the table space (physical table).
9 We record the version in the header. If there are N records with virtual table id=1, the upper-level DataAPI sees only the one with the largest version (this process is called version collapse).
10 To roll back data, we only need to specify a version, query the data of that version, and insert it into the virtual table again. The full history of the data is preserved.
11 A table space (physical table) can hold many virtual tables with different schemas. In this way the dynamic table schema frees us from fixed schemas, but note that:
1) Because of Mycat, the amount of data in a table space does not adversely affect performance.
2) DataAPI should be able to build and submit joined queries across static tables and dynamic virtual tables (supported by MySQL).
3) We are forced to wrap the underlying data architecture in DataAPI, so that upper-level code sees no difference between static and dynamic tables and the underlying data architecture is transparent to it.
4) Provide a management API that maintains metadata dynamically in order to create dynamic tables.
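Version collapse and rollback (steps 9 and 10) are easy to show on in-memory records. The sketch below assumes that headers and payloads are already parsed into dictionaries and that versions are comparable integers such as timestamps; the function names are invented for illustration.

```python
def collapse(records):
    """Keep, for each (table, virtual id), only the record with the largest version."""
    latest = {}
    for rec in records:
        key = (rec["header"]["table"], rec["payload"]["id"])
        best = latest.get(key)
        if best is None or rec["header"]["Version"] > best["header"]["Version"]:
            latest[key] = rec
    return list(latest.values())


def rollback(records, table, virtual_id, to_version, new_version):
    """Re-insert the payload of an old version as the newest record.

    Nothing is overwritten or removed, so the full history stays available.
    """
    match = next(
        (
            r for r in records
            if r["header"]["table"] == table
            and r["payload"]["id"] == virtual_id
            and r["header"]["Version"] == to_version
        ),
        None,
    )
    if match is None:
        raise LookupError(f"no version {to_version} for {table}/{virtual_id}")
    restored = {
        "header": dict(match["header"], Version=new_version),
        "payload": dict(match["payload"]),
    }
    records.append(restored)
    return restored
```

Because a rollback is just another insert with a newer version, rolling back a rollback works the same way.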
12 How to modify and delete dynamic table records:
1) Modify: instead of changing the stored data in the physical table, add a new record that contains the modified data and write op=modify in the header. This is how business data is updated.
2) Delete: add an empty record for the virtual ID with op=del in the header, for example:
header: {"table": "A", "Changeset": "Test", "version": "Timestamp", "snapshot": "tag name", "SchemaName": "Aschema", "id": "1", "op": "del"}
payload: null // Business data
null is a special value indicating that the record has been deleted.
At query time, if the payload for id=1 is null, the record is treated as nonexistent and the query returns NULL.
13 Sandbox: a changeset is the sandbox concept.
We agree that if sandbox=test, the data is for testing.
If it is release, the data is for the production environment.
So we preset these sandbox names:
A: dev, development data
B: test, data for testing
C: release, official data

We can think of this as a virtual partitioning of the data.
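Steps 12 and 13 combine naturally: every change is an appended record, a null payload acts as a tombstone, and the sandbox name partitions what a reader sees. The sketch below is an illustration under assumptions: every header carries the virtual ID under `id` (as in the delete example), the sandbox is stored under the key `sandbox`, and the function names are hypothetical.

```python
def _append(records, table, virtual_id, sandbox, version, op, payload):
    """Append one change record; existing records are never updated in place."""
    records.append({
        "header": {"table": table, "sandbox": sandbox, "Version": version,
                   "id": virtual_id, "op": op},
        "payload": payload,
    })


def modify(records, table, virtual_id, sandbox, version, payload):
    """Write new business data for a virtual row as a new version."""
    _append(records, table, virtual_id, sandbox, version, "modify", payload)


def delete(records, table, virtual_id, sandbox, version):
    """Write a tombstone: a record whose payload is null."""
    _append(records, table, virtual_id, sandbox, version, "del", None)


def read(records, table, virtual_id, sandbox="release"):
    """Return the current payload of one virtual row in one sandbox, or None."""
    candidates = [
        r for r in records
        if r["header"]["table"] == table
        and r["header"]["sandbox"] == sandbox
        and r["header"]["id"] == virtual_id
    ]
    if not candidates:
        return None
    newest = max(candidates, key=lambda r: r["header"]["Version"])
    # A tombstone has payload None, so a deleted row also reads as None.
    return newest["payload"]
```

Data written under `test` is invisible to a reader of `release`, which is what makes the sandbox behave like a virtual partition.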
An important byproduct: if we prepare a dataset under test that will later be used in production, then once testing is done we only need to switch its changeset to release. The dataset takes effect immediately and production can use it right away.

The static table side also covers the workflow part, the business metadata part, the tag system part, and the permission management data model. These models are stable and belong to the metadata. The dynamic side consists of the models that the different apps define.
With this, a limited underlying data model can map onto an ever-changing business data architecture.
1 Application management static tables: tables with the app prefix, responsible for managing product channels
2 Workflow-related static tables: tables with the workflow prefix, responsible for defining and managing workflow instances
3 Workflow form static table: the Workflowform table, which defines workflow-node-related forms and rule bindings
4 Non-workflow form static table: the Normalform table, which defines ordinary forms that are not bound to a workflow
5 Business data metadata static tables: provide metadata support for business measurement units, cities, subways, and more
6 Table space management static table: used to register table spaces; the DataAPI server then creates the specific physical table
7 Dynamic table schema management static table: registers dynamic tables to table spaces and registers their schemas; DataAPI then creates the dynamic table
8 Rights management static tables: application permission definition and assignment
9 Label static table: for managing defined punctuation and cursor tables
Preset dynamic tables: the json-schema of a preset dynamic table cannot be deleted or modified. It exists as the root of the business model.
All subsequent versions evolve from this root.
The tables in each table space are all defined according to business schemas, and the right to define them is given to the app: dynamic tables are defined in app code, which is what code first means here.
A dynamic table can have multiple schemas registered in a table space, and schemas are versioned too.
This means that the column definitions of a virtual table can be extended dynamically, that only the latest schema is in effect, and that a schema can be rolled back.
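Schema versioning can be sketched the same way as data versioning. The `SchemaRegistry` class below is a hypothetical in-memory stand-in for the static schema management table; it assumes version numbers simply count registrations, and it is append-only so that the root schema is never altered or removed.

```python
import copy


class SchemaRegistry:
    """In-memory stand-in for the static table that manages dynamic table schemas."""

    def __init__(self):
        # table name -> list of schemas; list index + 1 is the version number
        self._history = {}

    def register(self, table, schema):
        """Append a new schema version and return its version number."""
        versions = self._history.setdefault(table, [])
        versions.append(copy.deepcopy(schema))
        return len(versions)

    def current(self, table):
        """Return the schema in effect, which is always the latest one."""
        return copy.deepcopy(self._history[table][-1])

    def history(self, table):
        """Return every registered version, oldest first."""
        return copy.deepcopy(self._history[table])

    def rollback(self, table, version):
        """Make an earlier version current again by re-registering it as the newest."""
        versions = self._history[table]
        if not 1 <= version <= len(versions):
            raise LookupError(f"no schema version {version} for table {table}")
        versions.append(copy.deepcopy(versions[version - 1]))
        return len(versions)
```

Adding a column is then a matter of registering a schema with one more property, and undoing it is a rollback to the previous version number.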
We agree that the definition of a dynamic table is equivalent to the definition of its json-schema; that is, a dynamic table corresponds to a fixed json-schema. The table spaces are:
1 Transaction principal table space
2 Calculation rule definition table space
3 Shopping cart table space
4 Order table space
5 Financial table space
6 Commodity table space
In short, the flexible underlying data architecture turns the underlying data API service into infrastructure on which other business apps can expand and grow.
