Cloud computing design patterns (14): Materialized View pattern

Source: Internet
Author: User


Generate prepopulated views over the data in one or more data stores when the data is formatted in a way that does not favor the required query operations. This pattern can help support efficient querying and data extraction, and improve application performance.

Background and problems


When storing data, developers and data administrators usually focus on how the data is stored rather than on how it is read. The chosen storage format is usually closely related to the format of the data, the requirements for managing data size and data integrity, and the kind of store in use. For example, when using a NoSQL document store, the data is usually represented as a series of aggregates, each of which contains all the information for that entity.

However, this may have a negative impact on queries. When a query needs only a subset of the data from some entities, such as a summary of orders for several customers without all of the order details, it must still extract all of the data for the relevant entities to obtain the required information.
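The cost described above can be illustrated with a small sketch. The order documents and field names below are hypothetical, and a plain Python list stands in for a document store. To produce a per-customer summary, the query has to walk every complete aggregate, including fields it never uses.

```python
# Hypothetical order aggregates, shaped the way a document store might hold them.
orders = [
    {"order_id": 1, "customer": "alice", "shipping_city": "Oslo",
     "items": [{"sku": "tv", "qty": 1, "price": 400},
               {"sku": "cable", "qty": 2, "price": 10}]},
    {"order_id": 2, "customer": "bob", "shipping_city": "Lima",
     "items": [{"sku": "radio", "qty": 1, "price": 50}]},
    {"order_id": 3, "customer": "alice", "shipping_city": "Oslo",
     "items": [{"sku": "cable", "qty": 1, "price": 10}]},
]


def order_totals_by_customer(docs):
    """Compute a per-customer order total by scanning every full aggregate."""
    totals = {}
    for doc in docs:  # each document is read in full, unused fields included
        amount = sum(item["qty"] * item["price"] for item in doc["items"])
        totals[doc["customer"]] = totals.get(doc["customer"], 0) + amount
    return totals


summary = order_totals_by_customer(orders)
```

With a real store, every call to a function like this would repeat the full scan, which is the work a prepopulated view is meant to avoid.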

Solution


To support efficient querying, a common solution is to generate, in advance, a view that materializes the data in the format best suited to the required result set. The Materialized View pattern describes generating prepopulated views of data in environments where the source data is not in a format suitable for querying, where generating a suitable query is difficult, or where query performance is poor because of the nature of the data or the data store.

These materialized views contain only the data required by a query, so applications can quickly obtain the information they need. In addition to joining tables or combining data entities, a materialized view can include the current values of calculated columns or data items, the results of combining values or executing transformations on the data items, and values specified as part of the query. A materialized view can even be optimized for a single query.

The key point is that a materialized view and the data it contains are completely disposable, because the view can be entirely rebuilt from the source data stores. A materialized view is never updated directly by an application, so it is effectively a specialized cache.

When the source data for the view changes, the view must be updated to include the new information. This can happen automatically on an appropriate schedule, or when the system detects a change to the original data. In other cases, you may need to regenerate the view manually.
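One way to picture a view as a disposable, read-only cache is the minimal sketch below. The `MaterializedView` class, the `count_orders` builder, and the sample data are all hypothetical names introduced for illustration. The view exposes its rows through a read-only mapping, and the only way to change it is to rebuild it from the source.

```python
from types import MappingProxyType

# Hypothetical source data; in practice this would live in a data store.
source_orders = [
    {"customer": "alice"},
    {"customer": "bob"},
    {"customer": "alice"},
]


def count_orders():
    """Build function: derive the view contents from the source data."""
    counts = {}
    for order in source_orders:
        counts[order["customer"]] = counts.get(order["customer"], 0) + 1
    return counts


class MaterializedView:
    """A read-only snapshot that can always be rebuilt from its source."""

    def __init__(self, build):
        self._build = build
        self._rows = {}
        self.rebuild()

    def rebuild(self):
        # Discard the old contents entirely; nothing in the view is authoritative.
        self._rows = self._build()

    @property
    def rows(self):
        # MappingProxyType rejects item assignment, so callers cannot edit the view.
        return MappingProxyType(self._rows)


view = MaterializedView(count_orders)
```

Because the snapshot is only a copy, it stays unchanged when `source_orders` is modified until `rebuild()` is called again.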
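The three update triggers mentioned here (a change notification, a schedule, and a manual rebuild) can be combined in one small sketch. Everything below is an assumption made for illustration: the `RefreshingView` class, its method names, and the choice to check for staleness lazily at read time rather than with a background scheduler.

```python
import time


class RefreshingView:
    """Sketch of a view refreshed on change events, on a schedule, or manually."""

    def __init__(self, build, max_age_seconds, clock=time.monotonic):
        self._build = build              # function that derives the view from the source
        self._max_age = max_age_seconds  # schedule: maximum tolerated age of the view
        self._clock = clock              # injectable clock, which simplifies testing
        self._dirty = False
        self.refresh()

    def refresh(self):
        """Manual trigger: regenerate the view from the source data now."""
        self._data = self._build()
        self._built_at = self._clock()
        self._dirty = False

    def on_source_changed(self):
        """Event trigger: the source reports a change, so mark the view stale."""
        self._dirty = True

    def read(self):
        """Return the view, regenerating it first if it is stale or too old."""
        expired = self._clock() - self._built_at >= self._max_age
        if self._dirty or expired:
            self.refresh()
        return self._data


stock = {"tv": 3, "cable": 10}
stock_view = RefreshingView(lambda: dict(stock), max_age_seconds=3600)
```

Marking the view stale instead of rebuilding on every event is one way to limit overhead when the source changes often, at the price of doing the rebuild on the next read.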

Figure 1 shows an example of how the Materialized View pattern may be used.

Figure 1 - The Materialized View pattern

Problems and precautions


Consider the following points when deciding how to implement this pattern:
• Consider how and when the view will be updated. Ideally, it is regenerated in response to an event indicating a change to the source data, although this may lead to excessive overhead if the source data changes rapidly. Alternatively, consider using a scheduled task, an external trigger, or a manual action to start regeneration of the view.
• In some systems, such as those that use the Event Sourcing pattern to maintain a store of only the events that modified the data, materialized views may be necessary. Prepopulating views by examining all events to determine the current state may be the only way to obtain information from the event store. In other cases, weigh the advantages that a materialized view would offer. Materialized views are usually tailored to one query or a small number of queries. If many queries must be supported, maintaining the materialized views may result in unacceptable storage capacity requirements and storage costs.
• Consider the impact on data consistency when the view is generated, and when it is updated if this happens on a schedule. If the source data is changing at the point when the view is generated, the copy of the data in the view may not be completely consistent with the original data.
• Consider where you will store the view. It does not have to be located in the same store or partition as the original data. It may be a subset merged from several different partitions.
• If the view is transient and is used only to improve query performance or scalability by reflecting the current state of the data, it can be stored in a cache or in a less reliable location. It can be rebuilt if it is lost.
• When defining a materialized view, maximize its value by adding data items or columns that are computed or transformed from existing data items, from values passed in the query, or from combinations of these where appropriate.
• If the storage mechanism supports it, consider indexing the materialized view to further improve performance. Most relational databases support indexing for views, as do big data solutions based on Apache Hadoop.
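The Event Sourcing case in the list above can be sketched as a projection that replays the event log to derive the current state. The event types, field names, and cart example below are hypothetical and are not taken from any particular event store.

```python
# Hypothetical append-only event log: the store holds changes, not current state.
events = [
    {"type": "ItemAdded", "cart": "c1", "sku": "tv", "qty": 1},
    {"type": "ItemAdded", "cart": "c1", "sku": "cable", "qty": 2},
    {"type": "ItemRemoved", "cart": "c1", "sku": "cable", "qty": 1},
    {"type": "ItemAdded", "cart": "c2", "sku": "tv", "qty": 1},
]


def project_cart_contents(event_log):
    """Replay every event in order to build a view of current cart contents."""
    view = {}
    for event in event_log:
        if event["type"] == "ItemAdded":
            delta = event["qty"]
        elif event["type"] == "ItemRemoved":
            delta = -event["qty"]
        else:
            continue  # event types this view does not care about are skipped
        cart = view.setdefault(event["cart"], {})
        new_qty = cart.get(event["sku"], 0) + delta
        if new_qty > 0:
            cart[event["sku"]] = new_qty
        else:
            cart.pop(event["sku"], None)  # drop items whose quantity reaches zero
    return view


cart_view = project_cart_contents(events)
```

Without such a projection, answering "what is in cart c1 right now?" would require replaying the log on every query.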

When to use this pattern


This pattern is well suited to:
• Creating materialized views over data that is difficult to query directly, or where queries must be very complex to extract data that is stored in a normalized, semi-structured, or unstructured way.
• Creating temporary views that can significantly improve query performance, or that can act directly as source views or data transfer objects (DTOs) for the UI, for reporting, or for display.
• Supporting occasionally connected or disconnected scenarios, in which a connection to the data store is not always available. The view may be cached locally in this case.
• Simplifying queries and exposing data for experimentation in a way that does not require knowledge of the source data format. For example, you can join different tables in one or more databases, or one or more domains in NoSQL stores, and then format the data to suit its eventual use.
• Providing access to specific subsets of the source data that, for security or privacy reasons, should not be generally accessible, open to modification, or fully exposed to users.
• Bridging the gap when using different data stores based on their individual capabilities. For example, you can use a cloud store that is efficient for writing as the reference data store, and a relational database that offers good query and read performance to hold the materialized views.

This pattern may not be suitable in the following situations:
• The source data is simple and easy to query.
• The source data changes very quickly, or can be accessed without using a view. The processing overhead of creating views may be avoidable in these cases.
• Consistency is a high priority. The views may not always be fully consistent with the original data.

Example


Figure 2 shows an example of using the Materialized View pattern. Data in the Order, OrderItem, and Customer tables, held in separate partitions in a Microsoft Azure storage account, is combined to generate a view containing the total sales value for each product in the Electronics category, together with a count of the customers who purchased each item.

Figure 2 - Using the Materialized View pattern to generate a summary of sales


Creating this materialized view requires complex queries. However, by exposing the query result as a materialized view, users can easily obtain the results and use them directly or include them in another query. The view is likely to be used in a reporting system or dashboard, so it can be updated on a schedule, for example once a week.
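The sales summary can be approximated with a short sketch. It deliberately swaps Azure table storage for an in-memory SQLite database, so the table layout, column names, and sample rows are assumptions rather than the schema of the original example. SQLite has no native materialized views, so the view is emulated by storing the query result in an ordinary table that is dropped and recreated on each rebuild.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical source tables and sample rows (not the original Azure schema).
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
CREATE TABLE order_item (order_id INTEGER, product TEXT, category TEXT,
                         quantity INTEGER, unit_price INTEGER);
INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bo');
INSERT INTO orders VALUES (10, 1), (11, 2), (12, 1);
INSERT INTO order_item VALUES
    (10, 'Phone',  'Electronics', 1, 300),
    (11, 'Phone',  'Electronics', 2, 300),
    (11, 'Laptop', 'Electronics', 1, 900),
    (12, 'Phone',  'Electronics', 1, 300),
    (12, 'Desk',   'Furniture',   1, 150);
""")


def rebuild_sales_view(db):
    """Regenerate the emulated materialized view from the source tables."""
    db.executescript("""
    DROP TABLE IF EXISTS electronics_sales;
    CREATE TABLE electronics_sales AS
    SELECT oi.product                       AS product,
           SUM(oi.quantity * oi.unit_price) AS total_sales,
           COUNT(DISTINCT c.customer_id)    AS customer_count
    FROM order_item AS oi
    JOIN orders   AS o ON o.order_id = oi.order_id
    JOIN customer AS c ON c.customer_id = o.customer_id
    WHERE oi.category = 'Electronics'
    GROUP BY oi.product;
    """)


rebuild_sales_view(conn)

# Consumers read the precomputed table instead of running the join themselves.
rows = conn.execute(
    "SELECT product, total_sales, customer_count "
    "FROM electronics_sales ORDER BY product"
).fetchall()
```

A weekly job could call `rebuild_sales_view` again to refresh the table, and a reporting query only ever touches `electronics_sales`.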

Note:

Although this example uses Azure table storage, many relational database management systems also provide native support for materialized views.

MSDN: http://msdn.microsoft.com/en-us/library/dn589782.aspx
