Converting OLTP data to provide acceptable performance in an OLAP system requires the following procedures to be performed:
Merging data
You must be able to incorporate all relevant information about a specific project (product, customer, employee) from multiple OLTP systems into an OLAP system. The merge process must address the problem of coding differences between different OLTP systems. For example, a system might assign an ID to each employee, while the other system has no employee ID. The merge process must be able to match the common employee data in both systems, which can be achieved by comparing the employee name and address. The merge process must also be able to convert data stored in each OLTP system using different data types into a single data type that is used in an OLAP system. You must also select which columns in your OLTP system are independent of the OLAP system and exclude these columns from the merge process.
Systems that provide input data for OLAP systems are not limited to traditional, centrally located OLTP systems. Important information can be stored in multiple legacy locations, and in some cases even relatively small data sources, such as Microsoft Excel spreadsheets, can be stored on file shares.
Clean up data
When you merge OLTP data into a data warehouse, you provide an opportunity to clean up your data. You may find that different OLTP systems spell items differently, or that the merge process may not overwrite previously unknown spelling errors. Other inconsistencies may also be found, such as different addresses for the same store, employee, or customer. These inconsistencies must be identified before the data can be loaded into the data warehouse for use by the OLAP system.
Aggregate data
The OLTP data records all transaction details. OLAP queries often need to summarize data or data that has been aggregated in some way. For example, if the database contains only summary rows that display sales per day or per hour for each product, the query retrieves only the monthly sales totals for each product in the past year, much faster than having to scan the detailed records for each transaction in the past year.
The degree of aggregation of data in a data warehouse depends on many design factors, such as the speed requirements of OLAP queries and the granularity required for analysis. For example, if you aggregate sales details into a daily summary instead of an hourly rollup, OLAP queries will run faster, but you can do this only if you don't need to analyze sales per hour.
Organize data into cubes
The organization of relational OLTP data increases the difficulty and time consuming of analysis processing. When you move OLTP data into a data warehouse, you must transform its structure so that it can better support decision support analysis. The process of building a data warehouse involves reorganizing the OLTP data stored in the relational table into the data stored in the cube.
Transformation phase and Data Warehouse components
The process of making data available through an OLAP application typically passes through the following three phases:
Extract data from an OLTP or legacy data source into an intermediate storage area.
Converts data to a format that is more appropriate for use in an OLAP system. This involves operations such as data cleanup and aggregation.
Loads data into a data warehouse or data mart.
The process of fetching data from OLTP and legacy data sources and converting it to a warehouse server is known as the ETL process, which is usually run periodically, such as once a week or once a month.
Once data is loaded into the data warehouse, an important function of OLAP systems is to provide decision makers with the convenience of accessing and analyzing data in data warehouses and data marts.
Data source
OLTP databases and other legacy data sources that contain data that must be converted into OLAP data in the Data Warehouse and data mart.
Intermediate data storage
A combination of data stores and the process of storing, cleaning, and converting OLTP data to useful OLAP data.
Warehouse Server
A warehouse server is a computer that runs a relational database that contains data warehouse and data mart data, and a server that runs OLAP data.
Business Intelligence
A suite of tools and applications for querying OLAP data and providing reports and information to enterprise decision makers.
Meta data
A structural model of data and applications in different OLAP components. Metadata describes objects such as tables, data warehouses, and cubes in a data mart in an OLTP database, and also records which applications reference different block of records.