Deployment and implementation of big data should be tied to specific application scenarios. Enterprise big data storage and processing can be illustrated with the story of the "Three Little Pigs" and their houses of straw, wood, and bricks. The story maps well onto the trade-off in a data storage environment between the level of protection (integrity and reliability) and the cost of delivering the service.
Financial data, external reports, and regulatory compliance data must be stored and processed in the brick-house environment. This data requires a reliable hardware infrastructure and must stay consistent with its original source. Financial data is commonly used by several functional departments in an enterprise, for example for product and service pricing decisions, sales performance analysis, and the calculation of compensation and incentives for key employees and management.
A well-designed wooden-house environment ensures the durability of stored data. This environment is application-specific and is not designed for enterprise-wide use or cross-functional data sharing. It typically holds transformed data and often includes a large number of marketing data marts. Only the functions that are actually needed, such as data transformation, reconciliation, and lineage, are provided, and only for specific business purposes. Compared with the brick house, the wooden house is inherently cheaper and faster to deliver.
Finally, there is the straw house. The straw house refers to converting, grouping, and summarizing data ad hoc, at the point in time when the data is needed. Data may stay in the format of the original source, and almost no data structure is required; the format can be adjusted as needed. Although a straw-house design cannot easily be replicated or scaled up, it suits ad hoc, non-recurring business problems. This solution places low demands on data reconciliation and replication.
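The three environments described above can be sketched as a simple classification rule. The following is only an illustration: the `Dataset` fields, the `Tier` names, and the decision order in `choose_tier` are hypothetical, and a real rule set would come from the enterprise's own data governance policy.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    STRAW = "straw"   # ad hoc, source-format data, low cost
    WOOD = "wood"     # application-specific, durable, not enterprise-wide
    BRICK = "brick"   # highest integrity and reliability, highest cost


@dataclass(frozen=True)
class Dataset:
    # All field names are assumptions made for this sketch.
    name: str
    regulated: bool                  # financial, external reporting, or compliance data
    shared_across_departments: bool  # used by several functional departments
    recurring_use: bool              # supports a repeated business process


def choose_tier(ds: Dataset) -> Tier:
    """Pick the cheapest environment that still meets the data's needs."""
    if ds.regulated or ds.shared_across_departments:
        return Tier.BRICK
    if ds.recurring_use:
        return Tier.WOOD
    return Tier.STRAW


if __name__ == "__main__":
    examples = [
        Dataset("quarterly_revenue", True, True, True),
        Dataset("campaign_data_mart", False, False, True),
        Dataset("one_off_survey_export", False, False, False),
    ]
    for ds in examples:
        print(ds.name, "->", choose_tier(ds).value)
```

The rule checks the strictest requirement first, so a dataset is never placed in a weaker environment than its most demanding use calls for.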
The "Three Little Pigs" analogy is intuitive, but the specific solution should follow the data governance policy. Business departments want low-cost solutions quickly, as long as those solutions can do the job, while IT departments need dependable solutions in order to provide sound and reliable services. This is an inherent tension in most discussions between business and IT departments.
Because it is quick to deploy, inexpensive, and cheap to abandon if it fails, the straw-house solution has received a great deal of attention. In the new economy, and especially in self-service environments, users' recognition of the value of data (including big data) is what drives the rapid growth of data labs and exploration environments. It is therefore not surprising that business departments choose fast, low-cost solutions.
However, the cost to the IT department is astonishing when a straw-house solution has to be upgraded to a wooden-house or brick-house environment. The business asks, "Why can't they use the solution we designed in two weeks?" They can. However, building a brick house, or even a wooden house, on the foundation of a straw house will not work. Trying to deliver wooden-house and brick-house solutions from a straw-house design wastes a large share of the IT budget.
The main challenge lies in the data governance policies and processes that identify how important the data is. When a "creative" solution designed in the straw-house environment needs to be migrated to a more stable environment, data management must be involved, and whoever decides on the target environment (straw, wooden, or brick house) needs to fully understand how important the data is to its downstream users.
Three Environments for Enterprises to Store Big Data