Objective
According to "Data Warehousing Trends for the CIO, 2011-2012" [1], published by the market research and analysis firm Gartner, appliance (all-in-one machine) technology has become a future market hotspot in the data warehouse field. The IBM Smart Analytics System and IBM Netezza are the two heavyweight appliances that IBM is promoting, and both have attracted considerable market attention. This article briefly describes the architectural features of the two appliances, then compares the load performance, compression ratio, and query performance of the Smart Analytics System and Netezza using the TPC-H benchmark data set (300 GB of data) and its test queries (Q1-Q22). Readers should come away with a preliminary understanding of the performance profiles of the IBM Smart Analytics System and IBM Netezza.
Smart Analytics System Architecture
The IBM Smart Analytics System (hereinafter referred to as ISAS) is a family of preconfigured, pre-integrated systems that combine hardware, software, and services. It provides data warehouse solutions suited to different customer scenarios: an appliance built around a customer's specific needs on which that customer can deploy data warehouse applications. The ISAS family includes the 5600 series on the IBM System x platform, the 7700 series on the IBM Power (P series) platform, and the 9600 series on the IBM System z platform. Each series comes in several configuration sizes, and customers can expand a system according to their own needs.
Taking the ISAS 7700 series as an example, the configuration sizes include Extra Small, Small, Medium, Large, and Extra Large, among others. The ISAS 7700 series is built from six types of modules: the Foundation (base) module, the User module, the Data module, the Failover module, the Warehouse Applications module, and the Business Intelligence (BI) module. Customers can choose which modules they need, and how many of each, based on the size of their data and their usage scenario. The following uses the ISAS 7700 S (Small) configuration, shown in Figure 1 below, to give a simple description of the ISAS architecture.
Figure 1. ISAS 7700 S System Architecture
In the ISAS 7700 S configuration, the Warehouse Applications module and the Business Intelligence module are optional, and customers can decide whether to integrate them as needed. The Foundation (base) module and the Failover module are required in the S configuration. There are two Data modules, and more can be added as the amount of data grows.
The Foundation module contains the management node and the administration node. The management node is used to deploy the ISAS cluster and to carry out later maintenance and management tasks. The administration node holds the first database partition of the partitioned database environment and acts as the coordinator node: it stores the system catalog tables, stores single-partition tables, and receives queries from applications.
Each Data module consists of one data node and four DS3524 external storage units. The system resources within a module (CPU, memory, I/O, network, and external storage) are designed to be balanced, so that no single resource becomes a performance bottleneck.
The Failover module contains a standby node. When a node fails, the standby node accesses the failed node's external storage through the SAN storage network and takes over the programs and tasks that were running on the failed node.
The Warehouse Applications module contains a warehouse application node, a warehouse OLAP node, and one DS3524 external storage unit. The warehouse application node is used to deploy data warehouse applications, while the warehouse OLAP node is used mainly for OLAP applications such as data mining and cubing services. The attached storage is used to create the file systems these nodes require, and the two nodes act as backups for each other.
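The failover behavior described above can be sketched as a small model. This is only an illustrative sketch, not IBM's implementation: the `Node` class, the `fail_over` function, and all node, storage, and workload names are hypothetical. It assumes a layout of two data nodes with four DS3524 units each and one standby node, as in the S configuration described in the text.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """A cluster node in this toy model (all names are hypothetical)."""
    name: str
    role: str  # "data" or "standby" in this sketch
    storage: list[str] = field(default_factory=list)    # external storage units it serves
    workloads: list[str] = field(default_factory=list)  # programs and tasks it runs
    failed: bool = False


def fail_over(failed: Node, standby: Node) -> None:
    """Hand the failed node's storage and workloads to the standby node."""
    if standby.role != "standby":
        raise ValueError(f"{standby.name} is not a standby node")
    if standby.workloads:
        raise RuntimeError(f"{standby.name} is already covering another node")
    failed.failed = True
    # The standby reaches the failed node's external storage over the SAN,
    # so ownership moves while the data stays where it is.
    standby.storage, failed.storage = failed.storage, []
    standby.workloads, failed.workloads = failed.workloads, []


def build_small_config() -> dict[str, Node]:
    """Two data nodes with four DS3524 units each, plus one standby node."""
    nodes = {}
    for i in (1, 2):
        nodes[f"data{i}"] = Node(
            name=f"data{i}",
            role="data",
            storage=[f"ds3524-{i}{unit}" for unit in "abcd"],
            workloads=[f"db-partitions-data{i}"],  # hypothetical workload label
        )
    nodes["standby"] = Node(name="standby", role="standby")
    return nodes


if __name__ == "__main__":
    cluster = build_small_config()
    fail_over(cluster["data1"], cluster["standby"])
    print(cluster["standby"].storage)
    print(cluster["standby"].workloads)
```

In this model a single standby node can cover only one failed node at a time, which is a simplifying assumption of the sketch rather than a statement about the product.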
The SAN switches in the architecture connect the data warehouse nodes (administration node, data nodes, and standby node) to the external storage devices. Juniper 10 Gbps Ethernet fiber switches connect all nodes in the cluster, mainly for FCM (Fast Communications Manager) communication between database partitions on different nodes in the DB2 partitioned database environment.
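As a rough sketch of the connectivity just described, the mapping from node role to attached network can be written down directly. The role strings and the `networks_for` function are hypothetical labels chosen for illustration. The sketch encodes only what the text states: it makes no claim about how the warehouse application module's own storage is attached.

```python
# Roles the text lists as connected to external storage through the SAN switches.
SAN_ROLES = {"administration", "data", "standby"}

# Every node role mentioned in the architecture description (labels are hypothetical).
ALL_ROLES = SAN_ROLES | {"management", "user", "warehouse_application", "warehouse_olap"}


def networks_for(role: str) -> set[str]:
    """Return the networks a node of the given role is attached to in this model."""
    if role not in ALL_ROLES:
        raise ValueError(f"unknown node role: {role}")
    # All cluster nodes share the 10 Gbps Ethernet network, which carries FCM traffic.
    networks = {"ethernet_10g"}
    if role in SAN_ROLES:
        networks.add("san")
    return networks


if __name__ == "__main__":
    for role in sorted(ALL_ROLES):
        print(role, sorted(networks_for(role)))
```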