Business intelligence (BI) is a complete solution that applies data warehousing, online analytical processing (OLAP), and data mining to commercial activities. Data is collected from different data sources and, after extraction, transformation, and loading (ETL), is sent to a data warehouse or data mart. Appropriate query and analysis tools, data mining tools, and OLAP tools then process this information and turn it into decision-making knowledge, which is finally presented to users in support of technical services and decision-making.
Four key technologies of business intelligence
The supporting technologies of business intelligence mainly include ETL (data extraction, transformation, and loading) technology, data warehouse and data mart technology, OLAP technology, data mining technology, and data publishing and presentation technology.
1. Data Warehouse Technology
To implement BI, an enterprise must first collect useful data from different data sources inside and outside the enterprise, such as customer relationship management (CRM), supply chain management (SCM), and enterprise resource planning (ERP) systems and other application systems, and then convert and merge that data. This requires data warehouse and data mart technologies.
A data warehouse is a collection of data gathered from multiple data sources and stored in a consistent manner. W. H. Inmon, one of the founders of data warehousing, defines it as follows: "A data warehouse is a subject-oriented, integrated, non-volatile, and time-variant collection of data in support of management's decision-making." Constructing a data warehouse requires data cleaning, data extraction and conversion, data integration, and data loading. The data is first cleaned to ensure its correctness, then extracted and converted into the form the data warehouse requires, and finally loaded into the data warehouse.
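The cleaning, extraction, conversion, and loading steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the two source extracts (`crm_rows`, `erp_rows`), their field names, and the `customer_sales` table are hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3

# Hypothetical source extracts; field names and values are invented for illustration.
crm_rows = [
    {"cust_id": "C001", "name": " Alice ", "amount": "120.50"},
    {"cust_id": "C002", "name": "Bob", "amount": ""},  # missing amount
]
erp_rows = [
    {"customer": "c003", "full_name": "Carol", "total": "75"},
    {"customer": "C001", "full_name": "Alice", "total": "30.00"},
]


def clean(record):
    """Return a normalized (customer_id, name, amount) tuple, or None if unusable."""
    customer_id = record["customer_id"].strip().upper()
    name = record["name"].strip()
    try:
        amount = float(record["amount"])
    except ValueError:
        return None  # drop rows whose amount cannot be parsed
    return (customer_id, name, amount)


def transform(crm, erp):
    """Map both source layouts onto one common layout, then clean each row."""
    unified = [
        {"customer_id": r["cust_id"], "name": r["name"], "amount": r["amount"]}
        for r in crm
    ] + [
        {"customer_id": r["customer"], "name": r["full_name"], "amount": r["total"]}
        for r in erp
    ]
    cleaned = (clean(r) for r in unified)
    return [row for row in cleaned if row is not None]


def load(rows, conn):
    """Load the transformed rows into the stand-in warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer_sales "
        "(customer_id TEXT, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO customer_sales VALUES (?, ?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")
load(transform(crm_rows, erp_rows), warehouse)
totals = warehouse.execute(
    "SELECT customer_id, SUM(amount) FROM customer_sales "
    "GROUP BY customer_id ORDER BY customer_id"
).fetchall()
print(totals)
```

Here the row with an empty amount is discarded during cleaning, and the two records for the same customer from different sources are integrated under one normalized identifier.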
A data warehouse is a semantically consistent data store that serves as the physical implementation of a decision-support data model and stores the information required for strategic decision-making. The usual data models for data warehouses are the star schema and the snowflake schema. The star schema is the most common: a large central table (the fact table) holds the bulk of the data without redundancy, and each dimension has a small accompanying table (a dimension table). In the snowflake schema, some dimension tables are normalized, so their data is further decomposed into additional tables, and the resulting schema diagram resembles a snowflake. Research on data warehouses focuses on schema design, data cleaning and data conversion, and the import and update methods used in data integration.
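As a rough illustration of the star layout, the sketch below builds one central table and two small dimension tables in an in-memory SQLite database. The table names (`fact_sales`, `dim_product`, `dim_date`), columns, and rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: one small table per dimension.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);

    -- Central table: holds the measures plus one key per dimension.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id INTEGER REFERENCES dim_date(date_id),
        units INTEGER,
        revenue REAL
    );

    INSERT INTO dim_product VALUES
        (1, 'Laptop', 'Hardware'), (2, 'Mouse', 'Hardware'), (3, 'Office Suite', 'Software');
    INSERT INTO dim_date VALUES (10, 2023, 1), (11, 2023, 2);
    INSERT INTO fact_sales VALUES
        (1, 10, 2, 2000.0), (2, 10, 10, 200.0), (3, 11, 5, 500.0), (1, 11, 1, 1000.0);
    """
)

# A typical query joins the central table to its dimensions and aggregates a measure.
revenue_by_category = conn.execute(
    """
    SELECT p.category, d.quarter, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_date AS d ON d.date_id = f.date_id
    GROUP BY p.category, d.quarter
    ORDER BY p.category, d.quarter
    """
).fetchall()
print(revenue_by_category)
```

In a snowflake variant of this sketch, `category` would move out of `dim_product` into its own table, and `dim_product` would reference it by key.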
A data warehouse is usually an enterprise-level application, so its scope and the investment it requires are large, and some enterprises cannot afford it. Such enterprises instead want to build a customized subset of a data warehouse for their own applications in the key departments that need it most. This demand gave rise to the data mart. A data mart focuses on selected subjects and is department-wide in scope. Depending on the source of their data, data marts are divided into two types: independent and dependent. In an independent data mart, data comes from one or more operational systems or external information providers, or is generated locally within a specific department or region. The data in a dependent data mart comes directly from the enterprise data warehouse.
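A dependent data mart can be pictured as a subject-specific subset fed directly from the warehouse. The following is a minimal sketch of that idea, assuming a hypothetical `warehouse_sales` table and a retail department that needs only its own rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Stand-in for the enterprise data warehouse (hypothetical table and rows).
    CREATE TABLE warehouse_sales (region TEXT, department TEXT, product TEXT, revenue REAL);
    INSERT INTO warehouse_sales VALUES
        ('North', 'Retail', 'Laptop', 2000.0),
        ('North', 'Wholesale', 'Laptop', 9000.0),
        ('South', 'Retail', 'Mouse', 200.0),
        ('South', 'Retail', 'Laptop', 1000.0);

    -- Dependent data mart: populated directly from the warehouse for one department.
    CREATE TABLE mart_retail_sales AS
        SELECT region, product, revenue
        FROM warehouse_sales
        WHERE department = 'Retail';
    """
)

mart_rows = conn.execute(
    "SELECT region, product, revenue FROM mart_retail_sales ORDER BY region, product"
).fetchall()
print(mart_rows)
```

An independent data mart would instead be loaded from operational systems or external providers, with no warehouse table as its source.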
2. Online Analytical Processing (OLAP) Technology
Online Analytical Processing (OLAP), also known as multidimensional analysis, was proposed by E. F. Codd in 1993. It analyzes and presents the data in a data warehouse across multiple dimensions. It gives analysts, managers, and executives fast, consistent, and interactive access, from multiple perspectives, to information that has been converted from raw data, that users can truly understand, and that truly reflects the characteristics of the enterprise, so that they gain a deeper understanding of the data. Its core concept is the "dimension", so OLAP can also be described as a collection of multidimensional data analysis tools.
OLAP analysis presupposes that a data warehouse has already been built; the complex query capabilities, data comparison, data extraction, and reporting of OLAP can then be used for exploratory data analysis. It is called exploratory because, once the relevant data has been selected, the user can analyze it at different granularities and obtain different kinds of knowledge and results through operations such as slice (selecting a two-dimensional subset of the data), dice (selecting a three-dimensional subset of the data), drill-up (moving to a higher, more aggregated level of the data), drill-down (expanding the data to a lower, more detailed level), and rotation (viewing the data from a different orientation). Research on OLAP focuses mainly on query optimization for ROLAP (OLAP based on relational databases) and MOLAP (OLAP based on multidimensional data organization), with the aim of reducing storage space and improving system performance.
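The operations listed above can be illustrated on a tiny in-memory cube. This is a sketch only: the cube contents, the three dimensions (quarter, region, product), and the helper `total_by` are hypothetical, and real OLAP engines implement these operations very differently.

```python
from collections import defaultdict

# Hypothetical cube with three dimensions: (quarter, region, product) -> sales.
cube = {
    ("2023-Q1", "North", "Laptop"): 100,
    ("2023-Q1", "South", "Laptop"): 80,
    ("2023-Q2", "North", "Laptop"): 120,
    ("2023-Q2", "North", "Mouse"): 30,
    ("2023-Q2", "South", "Mouse"): 20,
    ("2024-Q1", "North", "Laptop"): 150,
}


def total_by(cells, key_func):
    """Sum the sales measure over all cells that share the same group key."""
    totals = defaultdict(int)
    for coords, sales in cells.items():
        totals[key_func(coords)] += sales
    return dict(totals)


# Slice: fix one dimension (quarter), leaving a two-dimensional region x product view.
slice_q2 = {(r, p): v for (q, r, p), v in cube.items() if q == "2023-Q2"}

# Dice: restrict each dimension to a subset, leaving a smaller three-dimensional cube.
dice = {
    (q, r, p): v
    for (q, r, p), v in cube.items()
    if q.startswith("2023") and r == "North" and p in {"Laptop", "Mouse"}
}

# Drill-up: move from quarters to the coarser year level.
by_year = total_by(cube, lambda c: c[0][:4])

# Drill-down: return to the finer quarter level for more detail.
by_quarter = total_by(cube, lambda c: c[0])

# Rotation: swap the axes of the two-dimensional slice.
rotated = {(p, r): v for (r, p), v in slice_q2.items()}

print(by_year)
print(by_quarter)
```

Drill-up and drill-down use the same aggregation helper with a coarser or finer grouping key, which is one simple way to express moving along a dimension hierarchy.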
3. Data Mining Technology
Unlike OLAP's exploratory data analysis, data mining analyzes existing data in databases and data warehouses according to predefined rules, identifying and extracting hidden patterns and interesting knowledge to give decision makers a basis for their decisions. The task of data mining is to discover patterns in data. There are many kinds of patterns, which can be divided into two categories by function: predictive patterns and descriptive patterns.
A predictive pattern determines a result from the values of data items, and the data used to mine predictive patterns has known results. A descriptive pattern describes regularities in the data or groups the data by similarity; it cannot be used directly for prediction. In practical applications, patterns are further divided into classification, regression, time series, clustering, association, and sequence patterns. Specific algorithms include market basket analysis, cluster detection, neural networks, decision trees, genetic algorithms, link analysis, case-based reasoning, rough sets, and various statistical models.
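As one concrete case of a descriptive (association) pattern, the sketch below computes support and confidence for simple one-to-one rules in the style of market basket analysis. The transactions and the two thresholds are invented for illustration, and practical algorithms prune the search space far more carefully than this brute-force loop.

```python
from itertools import combinations

# Hypothetical transactions (market baskets).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)


def confidence(antecedent, consequent):
    """Share of transactions containing the antecedent that also contain the consequent."""
    return support(antecedent | consequent) / support(antecedent)


items = sorted(set().union(*transactions))
rules = []
for a, b in combinations(items, 2):
    for lhs, rhs in ((a, b), (b, a)):
        s = support({lhs, rhs})
        c = confidence({lhs}, {rhs})
        if s >= 0.4 and c >= 0.7:  # thresholds chosen arbitrarily for the example
            rules.append((lhs, rhs, round(s, 2), round(c, 2)))

for lhs, rhs, s, c in rules:
    print(f"{lhs} -> {rhs}: support={s}, confidence={c}")
```

Each retained rule describes a regularity in the baskets; on its own it says which items tend to appear together rather than predicting an outcome for a new record.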
The difference and connection between OLAP and data mining are as follows. OLAP focuses on interaction with users, fast response, and multidimensional views of data. Data mining focuses on automatically discovering patterns and useful information hidden in data, although it allows users to guide the process. OLAP analysis results can supply information that serves as a basis for data mining, and data mining can extend the depth of OLAP analysis and discover more complex and detailed information that OLAP cannot. Data mining research focuses on the new problems that arise when data mining algorithms and techniques are applied to new data types and application environments, such as mining unstructured data, standardization of data mining languages, and visual data mining.
4. BI Presentation and Publishing Technology
To present the analyzed data to users intuitively and concisely, query and reporting tools are usually used for presentation and publishing. Increasingly, however, analysis results are presented visually, which requires information visualization technology.
Information visualization refers to displaying the complex relationships, latent information, and development trends within raw data in forms that people can easily recognize, such as graphics, images, and virtual reality, so that information resources can be put to better use. With the spread of web applications, business intelligence solutions can provide web-based application services, which widens the reach of business intelligence information publishing. A web-based business intelligence solution requires some basic components, including web-based business intelligence servers, session management services, file management services, scheduling, distribution, and notification services, load balancing services, and application services.