How to Integrate PureData System for Analytics and InfoSphere Streams


Efficiently loading massive data into Netezza using Streams operators

InfoSphere Streams is a high-performance computing platform that supports continuous, extremely fast analysis of massive streaming data from multiple sources. A Netezza appliance can load these data sets and store them for analysis with PureData System for Analytics. This scalable, massively parallel system enables clients to perform complex analysis of massive data.

However, the default ODBC operator provided by the Streams 2.0 standard database toolkit is not sufficient to take full advantage of a high-performance load between the two systems. Instead, you need to use the bulk load utility nzload, which is provided by the Netezza client. This article describes how to build C++ primitive operators that use this utility, and how to invoke these operators from the Streams Processing Language (SPL).

Prepare Netezza and Streams environments

We want to analyze the interconnection between Streams and Netezza, especially the use of the high-performance load utility to load data from Streams into a Netezza database. Our test environment consists of a PureData System for Analytics N1001-010 (powered by Netezza technology) and Streams 2.0, which is installed on a single server. The default communication port of the Netezza appliance is 5480, and we use this port to establish the connection. Figure 1 shows our test environment.

Figure 1. Netezza/Streams connection

Preparing Netezza

One of the advantages of Netezza is its simplicity: you do not need to deal with the underlying database layout, such as buffer pools and table space design. To prepare Netezza, you only need to create a database with "CREATE DATABASE <DB-NAME>" and then define the table into which the data from Streams will be loaded:

"CREATE TABLE <TABLE-NAME>
      (col1 integer,
      col2 char(<length>),
      col3 timestamp)"
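Once the Netezza client is installed, the same statements can be issued non-interactively. The following is only a minimal sketch, assuming the nzsql options -host, -u, -pw, -d and -c and a connection to the system database for CREATE DATABASE. The host name, credentials, the names STREAMSDB and STREAMS_DATA, and the char(10) length are hypothetical placeholders, not values from this article.

```shell
# Hypothetical connection settings; override them through the environment.
NZ_HOST="${NZ_HOST:-netezza.example.com}"
NZ_USER="${NZ_USER:-admin}"
NZ_PASSWORD="${NZ_PASSWORD:-password}"
NZ_DB="${NZ_DB:-STREAMSDB}"          # hypothetical database name
NZ_TABLE="${NZ_TABLE:-STREAMS_DATA}" # hypothetical table name

# Print the CREATE TABLE statement for the table named in $1.
# char(10) is an assumed length; the article does not state one.
create_table_sql() {
    printf 'CREATE TABLE %s (col1 integer, col2 char(10), col3 timestamp);\n' "$1"
}

# Run one SQL statement ($2) against database $1 through nzsql.
run_sql() {
    nzsql -host "$NZ_HOST" -u "$NZ_USER" -pw "$NZ_PASSWORD" -d "$1" -c "$2"
}

# Only talk to Netezza when the client is actually on the PATH.
if command -v nzsql >/dev/null 2>&1; then
    run_sql system "CREATE DATABASE $NZ_DB;"
    run_sql "$NZ_DB" "$(create_table_sql "$NZ_TABLE")"
else
    echo "nzsql not found; statement that would be run:"
    create_table_sql "$NZ_TABLE"
fi
```

Keeping the statement in a function makes it easy to review the SQL before anything is sent to the appliance.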

Preparing Streams

To connect Netezza and Streams and to take advantage of the speed of the high-performance Netezza load utility, you need to install the Netezza client on the Streams server. Download the Netezza client software: navigate to Information Management > IBM Netezza NPS Software and Clients > nps_7.0.0 > Linux.

To install the Netezza client:

Copy the downloaded Netezza client installation package to the Streams server (we used the Linux installation package V6.0.5P6).

Log on to the Streams server as the root user and change to the directory to which you copied the installation package.

Unpack the installation package with gunzip nz-linuxclient-v6.0.5.p6.tar.gz followed by tar -xvf nz-linuxclient-v6.0.5.p6.tar. The package contains the following directories:

* webadmin/ (Netezza online administration client)

* linux64/ (contains the 64-bit ODBC driver)

* linux/ (contains the Netezza client and the 32-bit Netezza driver)

Change to the linux directory, which is used to install the Netezza client, and extract the client (using ./unpack).
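The unpacking steps above can be sketched as a small helper. This is an illustration only: the function name unpack_package and the /tmp paths are assumptions, and the sketch streams the archive through gunzip -c instead of decompressing it in place, so the original .tar.gz file is kept.

```shell
# Unpack a gzip-compressed tar archive into a target directory.
# $1: path to the .tar.gz package, $2: directory to extract into.
unpack_package() {
    archive=$1
    dest=$2
    [ -f "$archive" ] || { echo "missing archive: $archive" >&2; return 1; }
    mkdir -p "$dest" || return 1
    # gunzip -c writes the tar stream to stdout and leaves the archive untouched.
    gunzip -c "$archive" | tar -xvf - -C "$dest"
}

# Package name as used in this article; the /tmp locations are assumptions.
PKG="${PKG:-/tmp/nz-linuxclient-v6.0.5.p6.tar.gz}"
if [ -f "$PKG" ]; then
    unpack_package "$PKG" /tmp/nzclient && ls /tmp/nzclient
    # Next step, as root: cd /tmp/nzclient/linux && ./unpack
fi
```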

Switch to the installation path and change to the bin directory (for example, using cd /usr/local/nz/bin). Then try to display the help pages of the Netezza client interfaces, such as the Netezza terminal (nzsql) and the Netezza load utility (nzload):

nzsql -h

nzload -h

Both will be used in Streams.

If any libraries are missing, you can add their path to the LD_LIBRARY_PATH environment variable (export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:{new_path}). Be sure to use the 32-bit libraries instead of the 64-bit ones. Use echo $LD_LIBRARY_PATH to check that you set the path correctly.
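To see which libraries are actually missing, the output of ldd can be filtered for "not found" entries. The sketch below assumes the standard Linux tools file, ldd and awk. The helper name missing_libs is hypothetical, and NZ_BIN may need to point to a different file if nzsql is a wrapper script rather than a binary in your installation.

```shell
# Print the names of shared libraries that ldd reports as "not found".
# Reads ldd output on stdin, so it can be tried without a real binary.
missing_libs() {
    awk '/not found/ { print $1 }'
}

NZ_BIN="${NZ_BIN:-/usr/local/nz/bin/nzsql}"   # install path used in this article
if [ -x "$NZ_BIN" ]; then
    file "$NZ_BIN"                 # reports whether the file is a 32-bit or 64-bit ELF binary
    ldd "$NZ_BIN" | missing_libs   # empty output means every library was resolved
fi
echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-<unset>}"
```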

If the problem with missing libraries persists after you have set LD_LIBRARY_PATH and verified that you are using the 32-bit versions (for example, libssl.so.4 is still missing), find the installed version of the library (for example, libssl.so.0.9.8) and create a symbolic link that uses the missing version number: ln -s libssl.so.0.9.8 libssl.so.4, and likewise ln -s libcrypto.so.0.9.8 with the name of the missing libcrypto library.
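The symbolic-link workaround can be wrapped in a helper that refuses to create a dangling link. This is a sketch under assumptions: the function name link_library is hypothetical, and /usr/lib in the usage comment is only a guess at where the 32-bit libraries live on your system.

```shell
# Create a symbolic link named after the missing library that points to
# the installed version, inside the given library directory.
# $1: directory, $2: installed file name, $3: missing file name.
link_library() {
    (
        cd "$1" || exit 1
        [ -e "$2" ] || { echo "not installed: $1/$2" >&2; exit 1; }
        # Leave an existing file or link alone; otherwise create the link.
        [ -e "$3" ] || ln -s "$2" "$3"
    )
}

# Example with the names from this article (needs root privileges;
# /usr/lib is an assumed location):
# link_library /usr/lib libssl.so.0.9.8 libssl.so.4
```

The subshell keeps the cd from changing the working directory of the calling shell.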

The help pages for the Netezza terminal (nzsql) and the Netezza load utility (nzload) should now be available with the following commands:

/usr/local/nz/bin/nzsql -h

This is nzsql, the IBM Netezza SQL interactive terminal.

Usage: nzsql [options] [security options] [dbname [username] [password]] ...

nzload -h

Usage: nzload [-h|-rev] [<options>]

Options: ...
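With the client working, a bulk load of a delimited text file can be sketched as follows. This is an illustration, not the operator built later in this article: it assumes the nzload options -host, -u, -pw, -db, -t, -df and -delim, and the function name load_file, the NZLOAD override and all values in the usage comment are hypothetical.

```shell
NZLOAD="${NZLOAD:-nzload}"               # command to run; can be replaced for a dry run
NZ_USER="${NZ_USER:-admin}"              # hypothetical credentials
NZ_PASSWORD="${NZ_PASSWORD:-password}"

# Load a delimited text file into a Netezza table.
# $1: host, $2: database, $3: table, $4: data file, $5: field delimiter.
load_file() {
    "$NZLOAD" -host "$1" -u "$NZ_USER" -pw "$NZ_PASSWORD" \
        -db "$2" -t "$3" -df "$4" -delim "$5"
}

# Example call (all values are placeholders):
# load_file netezza.example.com STREAMSDB STREAMS_DATA /tmp/data.csv ','
```

Setting NZLOAD=echo before the call prints the arguments instead of starting a load, which is a cheap way to check the command line first.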

The Netezza communication port (5480) must be open between the Netezza server and the InfoSphere Streams server.
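Whether the port is reachable from the Streams server can be checked before any load is attempted. The sketch below assumes that bash and the coreutils timeout command are available. The function name port_open and the CHECK_HOST variable are hypothetical.

```shell
# Return 0 if a TCP connection to host $1 on port $2 succeeds within 3 seconds.
# Relies on bash's /dev/tcp pseudo-device, so bash is invoked explicitly.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Set CHECK_HOST to the name of your Netezza server to run the check.
CHECK_HOST="${CHECK_HOST:-}"
if [ -n "$CHECK_HOST" ]; then
    if port_open "$CHECK_HOST" 5480; then
        echo "port 5480 on $CHECK_HOST is reachable"
    else
        echo "port 5480 on $CHECK_HOST is not reachable"
    fi
fi
```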

Additionally, you can add the directory that contains the Netezza client binaries to the PATH environment variable, so that the Netezza client applications are available everywhere in your shell: export PATH=$PATH:{path of Netezza client binaries}.
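If the export line ends up in a shell profile, it is convenient to avoid adding the same directory twice. The following is a minimal sketch; the function name add_to_path is hypothetical and /usr/local/nz/bin is the installation path used in this article.

```shell
# Append a directory to PATH only if it is not already listed.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                        # already present, nothing to do
        *) PATH="$PATH:$1"; export PATH ;;
    esac
}

add_to_path /usr/local/nz/bin

# Report where the shell now finds the client tools, if they are installed.
for tool in nzsql nzload; do
    command -v "$tool" || echo "$tool not found on PATH"
done
```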
