"Entity analysis and Information quality": 2.1.7 Evolution of information quality

Source: Internet
Author: User

Although information quality has only recently emerged as a discipline in its own right, its scope and significance have evolved through several stages: data cleansing, prevention, information as a product, and information as a corporate asset.

Problem recognition: data cleansing phase

In the early 1990s, through the combined efforts of Inmon (1992), Kimball et al. (1998), and others, the data warehouse movement became popular, and most of the concepts and current practices of information quality grew out of it. Most organizations did not appreciate how poor the quality of their operational data stores was, or how inconsistent the data were between them, until they tried to integrate them into a single data repository.

It was at this point that organizations began to realize that much of their data was inaccurate, incomplete, inconsistent, and generally of poor quality. The data may have been that way at the source, or it may have been damaged in the process of merging the sources. Redman (1998) outlines the magnitude of the problem and, more importantly, the negative operational and strategic effects that poor information quality has on an organization.

Recognition of the negative impact of poor information quality on business operations gave rise to a new industry built around "cleaning dirty data" for data warehouses (English, 1999; Brackett, 1996). During this period, information quality work focused on data cleaning, sometimes called data hygiene or data cleansing. Most of the attention in the data cleansing phase went to standardizing data from different sources as part of the ETL process, which not only allowed the data to be consolidated into a single data warehouse but also made queries against it more convenient and meaningful. As Lindsey (2008) describes, a manufacturer found that spelling variations of the color beige in its product database prevented the query condition "color equals beige" from returning meaningful results.
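The beige problem can be illustrated with a minimal sketch of a standardization step. The variant spellings, the product rows, and the function names below are all hypothetical and chosen only for illustration; a real variant table would come from profiling the actual source data.

```python
# Hypothetical table mapping each canonical color to the spellings
# observed for it in the source systems (illustrative values only).
CANONICAL_COLORS = {
    "beige": {"beige", "biege", "bge", "beig"},
}

# Invert the table so each variant points at its canonical form.
VARIANT_TO_CANONICAL = {
    variant: canonical
    for canonical, variants in CANONICAL_COLORS.items()
    for variant in variants
}


def standardize_color(raw: str) -> str:
    """Trim, lowercase, and map known variants to the canonical spelling."""
    key = raw.strip().lower()
    return VARIANT_TO_CANONICAL.get(key, key)


def query_by_color(rows, color):
    """Return the SKUs whose color field equals the requested color exactly."""
    return [row["sku"] for row in rows if row["color"] == color]


# Made-up product rows with inconsistent spellings of beige.
products = [
    {"sku": "A1", "color": "Beige"},
    {"sku": "A2", "color": "BIEGE "},
    {"sku": "A3", "color": "bge"},
    {"sku": "A4", "color": "red"},
]

# Before standardization the exact-match query finds nothing.
raw_hits = query_by_color(products, "beige")

# After standardization the same query finds all three beige products.
cleaned = [{**row, "color": standardize_color(row["color"])} for row in products]
clean_hits = query_by_color(cleaned, "beige")

print(raw_hits)
print(clean_hits)
```

In an ETL pipeline, a step like `standardize_color` would sit in the transform stage, so that the warehouse only ever stores the canonical spelling.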

In Figure 2.7, the area above the horizontal line shows the conceptual model of the enterprise, that is, the information architecture that determines how data models and database schemas are designed. The area below the line shows the implementation model of an operational information system, which acquires data from sources, processes it, and produces outputs, one of which is the data warehouse.


Figure 2.7 First stage: Data cleansing

Root cause detection: prevention phase

Figure 2.8 shows the next step in the evolution of information quality, which began to borrow principles from manufacturing quality management. The focus shifted to finding the root causes of information quality problems and preventing defective data from entering the data warehouse in the first place. It was also in the prevention phase that organizations began to realize that standardization alone does not give them correct data, so they needed to pay more attention to data accuracy.
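One way to picture prevention is a validation gate that checks each record before it is loaded, rather than cleaning it afterwards. The sketch below assumes a hypothetical order record with `customer_id`, `quantity`, and `order_date` fields, and the rules are invented for illustration. Rules like these catch only some inaccuracies; a value can pass every check and still be wrong.

```python
from datetime import date


def validate_order(record: dict) -> list[str]:
    """Return a list of rule violations; an empty list means the record passes."""
    errors = []
    if not record.get("customer_id"):
        errors.append("customer_id is missing")
    qty = record.get("quantity")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(qty, bool) or not isinstance(qty, int) or qty <= 0:
        errors.append("quantity must be a positive integer")
    order_date = record.get("order_date")
    if not isinstance(order_date, date) or order_date > date.today():
        errors.append("order_date must be a date not in the future")
    return errors


def load(records):
    """Split records into those accepted for loading and those rejected with reasons."""
    accepted, rejected = [], []
    for rec in records:
        errors = validate_order(rec)
        if errors:
            rejected.append((rec, errors))
        else:
            accepted.append(rec)
    return accepted, rejected


# Made-up input: one clean record and one that breaks every rule.
incoming = [
    {"customer_id": "C1", "quantity": 2, "order_date": date(2020, 1, 1)},
    {"customer_id": "", "quantity": 0, "order_date": None},
]
accepted, rejected = load(incoming)
for rec, errors in rejected:
    print(rec, errors)
```

Keeping the rejection reasons alongside each rejected record is what makes root cause analysis possible: recurring reasons point back to the source process that needs fixing.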


Figure 2.8 Phase II: root cause detection

Information as a product stage

One of the most important turning points in information quality was the more mature view that information is a product of an information system, not a by-product (Huang, Lee, Wang, 1999). Under this "information manufacturing" paradigm, data sources can be thought of as raw materials, processing as the production process, and the output as the finished product. The principles of total quality management (TQM) can then be applied to an information system, giving total data quality management (TDQM) (Wang, Kon, 1998).

In addition to adopting some TQM principles (such as product management), the information-as-a-product view also focuses on the users (customers) of the information. During the cleansing and prevention phases, attention centered on the condition of the data as measured along dimensions such as accuracy, completeness, and consistency. In the information product phase, we begin to look at the product from the user's point of view and link the objective assessment of the data to the user's assessment of the product's value. For example, suppose we raise the completeness of a column in a data table from 80% to 90%. By internal information quality measures this is an improvement. However, if that column feeds a report and the users of the report do not believe the report has gained any value, then from a product perspective the quality of the information has not improved.
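The internal measurement in this example can be sketched as follows. The definition used here (the share of rows with a non-empty value), the `phone` column, and the sample rows are assumptions made for illustration. Note that the function captures only the objective side of the assessment; it says nothing about whether users of a downstream report perceive any added value.

```python
def completeness(rows, column):
    """Fraction of rows whose value in `column` is neither None nor an empty string."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(column) not in (None, ""))
    return filled / len(rows)


# Made-up table of ten rows: eight filled phone numbers, two missing.
before = [{"phone": p} for p in ["555-0100"] * 8 + [None, ""]]

# The same table after one missing value has been filled in.
after = [{"phone": p} for p in ["555-0100"] * 9 + [None]]

before_score = completeness(before, "phone")
after_score = completeness(after, "phone")

print(f"before: {before_score:.0%}, after: {after_score:.0%}")
```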

Figure 2.9 shows information quality from an information product perspective. It outlines the entire information production process, including all MC3 stakeholders: managers, collectors, custodians, and consumers. The concept and practice of data governance (DG) also emerged at this stage.


Figure 2.9 Phase III: Information as a product phase

Information as an asset

Now that information is increasingly seen as an enterprise asset, information quality is entering a new phase. In this phase, information quality moves from its original reactive role to a proactive one in the enterprise. In the problem recognition, prevention, and product phases, information quality largely played a reactive role, with its methods and practices designed and built around the existing systems and information architecture.

Figure 2.10 shows that the enterprise asset phase reaches into the modeling layer, illustrating that information quality is increasingly seen as a key component of the information architecture. A well-known principle of software development is that the earlier a problem is found, the cheaper it is to correct. The same principle applies to information: quality problems are best addressed in the information architecture. As Deming (1986) argued, quality should be built into the product and throughout the production process. Another aspect of this phase is the focus on master data management (MDM), which attempts to establish a system of record (SOR), or single point of truth (SPOT), for the values of key entity attributes such as customer name, address, and product code.


Figure 2.10 Phase IV: Information as an asset phase
