Introduction
Today's applications often need to combine information from many different sources and in many different formats. As a result, application developers often have to call a large number of different APIs and protocols to retrieve information from each source and then incorporate that information into the application.
WebSphere Information Integrator accelerates application deployment in such scenarios by providing a real-time, SQL-based interface to heterogeneous data sources. These include relational systems such as DB2® Universal Database™ (DB2 UDB), non-relational data sources such as text documents and unstructured data, and emerging technologies such as XML repositories and data accessed through Web services. Information Integrator meets the market requirement for fast access to disparate data by transparently managing relational and non-relational data and bringing it together in a single virtual location. Figure 1 illustrates the WebSphere Information Integrator environment:
Figure 1. WebSphere Information Integrator provides a single SQL API to access different distributed data
WebSphere Information Integrator builds the database federation within the DB2 UDB database engine by storing metadata about the federated data sources in its own catalogs. Because every data source in the federation continues to operate autonomously, keeping the federated database consistent with its data sources is a considerable challenge. Schema definition changes, server and network failures, and password expirations can occur at any time. Any of these events can make a data source inaccessible or leave catalog entries invalid, and applications that access those data sources through the federated database may be interrupted.
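The kind of drift described above can be illustrated with a minimal Python sketch. It assumes the federated catalog caches one table definition per remote object and that the current definitions can be read back from the source. The names `TableDef` and `find_inconsistencies` are hypothetical and are not part of WebSphere Information Integrator; they only show what "catalog no longer matches the source" means in practice.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class TableDef:
    """Hypothetical cached definition of one remote table."""
    name: str
    columns: Tuple[Tuple[str, str], ...]  # ordered (column name, type) pairs


def find_inconsistencies(
    catalog: Dict[str, TableDef],
    source: Optional[Dict[str, TableDef]],
) -> List[str]:
    """Compare cached catalog entries with what the source reports now.

    `source` is None when the data source could not be reached at all,
    for example after a server or network failure or an expired password.
    """
    if source is None:
        return ["source unreachable"]

    problems = []
    for name, cached in catalog.items():
        live = source.get(name)
        if live is None:
            # The object was dropped or renamed at the source.
            problems.append(f"{name}: missing at source")
        elif live.columns != cached.columns:
            # The schema definition changed after the catalog entry was made.
            problems.append(f"{name}: column definitions differ")
    return problems
```

An empty result means every cached definition still matches its source; anything else is a candidate for an administrator warning.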
Autonomic functions that detect such inconsistencies and let the integration environment manage itself can therefore reduce the complexity of data management. Combined with an environment that allows true virtualization, Information Integrator automation delivers an on demand solution that leverages resources across people, processes, and information.
This article first describes the basic principles of a federated database system and presents a scenario that shows the diversity and scalability of such a system. It then examines the inconsistencies that can arise at run time between the federated database and its data sources. Finally, it looks at federated health monitoring, the new autonomic functionality provided in WebSphere Information Integrator V8.2, which is implemented in the DB2 UDB health monitoring component. This feature warns the system administrator of inconsistencies between the federated database catalogs and the federated data sources, recommends corrective actions, and sends troubleshooting notifications on a regular basis.
Fundamentals of the federated system
A WebSphere Information Integrator federated system includes the following components:
DB2 UDB Engine
An Information Integrator instance, which provides access to distributed data without moving it to a central location
One or more data sources
Clients (users and applications)
The federated system is created by installing Information Integrator on the DB2 UDB engine and configuring it to register one or more heterogeneous data sources. Users of the federated database system can run distributed queries against data stored anywhere in the federated system, regardless of where the data resides and regardless of the SQL dialect used by the data source. Figure 2 illustrates the architecture of the federated system.
Figure 2. Architecture of a federated system configuration
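The idea of one SQL interface over several separately stored data sets can be sketched in Python. This is only an analogy: it uses SQLite's `ATTACH DATABASE` from the standard `sqlite3` module as a stand-in for registering data sources, and it does not use or reproduce the Information Integrator configuration steps. The database names `sales` and `crm`, the tables, and the helper functions are all assumptions made for illustration.

```python
import sqlite3


def build_federation() -> sqlite3.Connection:
    """Open one connection and attach two independent in-memory databases.

    Each attached database plays the role of a separate data source.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS sales")
    conn.execute("ATTACH DATABASE ':memory:' AS crm")

    conn.execute("CREATE TABLE sales.orders (customer_id INTEGER, total REAL)")
    conn.execute("CREATE TABLE crm.customers (id INTEGER, name TEXT)")

    conn.executemany(
        "INSERT INTO sales.orders VALUES (?, ?)",
        [(1, 40.0), (1, 2.5), (2, 10.0)],
    )
    conn.executemany(
        "INSERT INTO crm.customers VALUES (?, ?)",
        [(1, "Ada"), (2, "Grace")],
    )
    return conn


def totals_by_customer(conn: sqlite3.Connection) -> list:
    """Run a single query that joins tables from both attached databases."""
    cursor = conn.execute(
        "SELECT c.name, SUM(o.total) "
        "FROM crm.customers AS c "
        "JOIN sales.orders AS o ON o.customer_id = c.id "
        "GROUP BY c.name ORDER BY c.name"
    )
    return cursor.fetchall()
```

The client issues one query and does not need to know which database holds which table. A real federated system additionally has to bridge different SQL dialects and remote connections, which this sketch does not attempt.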