Hadoop Tutorial: Applying Smartbi to Big Data Analysis on Hadoop
Source: Internet
Author: User
Keywords: big data analysis, JDBC
Big data is currently one of the hottest topics. Although many vendors have announced big data products, in practice Hadoop has become the de facto standard for big data processing: Internet companies such as Facebook, Baidu, and Alibaba all use Hadoop. Even commercial database vendors such as IBM, Oracle, SAP, Teradata, and Microsoft have adopted Hadoop. Kingbase (Renda Jincang) also integrates Hadoop into its big data solutions.
The popularity of Hadoop stems from its sound system architecture, which lets it store and process petabytes (PB) of data cheaply, efficiently, and reliably. Hive is a data warehouse platform built on Hadoop for storing and processing massive structured data. It keeps the data in the Hadoop file system (HDFS) rather than in a database, but provides database-like mechanisms for storing and processing it, and uses HQL, a SQL-like language, to manage and process the data automatically. We can think of the structured data in Hive as tables, although the data is actually distributed across HDFS. After parsing and transforming an HQL statement, Hive generates a series of MapReduce tasks on Hadoop and runs them to carry out the data processing.
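To make the last point concrete, here is a minimal in-memory sketch of the map and reduce phases that a grouping query is conceptually split into. It is not Hive's generated code and does not use the Hadoop API; the query, the table name `visits`, the column `city`, and the sample rows are all hypothetical and chosen only for illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Conceptual sketch only: mimics, in memory, the two phases behind a query such as
//   SELECT city, COUNT(*) FROM visits GROUP BY city
// (hypothetical table and column). Real Hive jobs run these phases across a cluster.
class GroupByCountSketch {

    // Map phase: emit a (key, 1) pair for every input row.
    static List<Map.Entry<String, Integer>> map(List<String> cityColumn) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String city : cityColumn) {
            pairs.add(new SimpleEntry<>(city, 1));
        }
        return pairs;
    }

    // Shuffle and reduce phase: group the pairs by key and sum their counts.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("Beijing", "Shanghai", "Beijing");
        System.out.println(reduce(map(rows))); // prints {Beijing=2, Shanghai=1}
    }
}
```

In a real cluster the map output is partitioned by key and sent to different reducers, which is what allows the same logic to scale to data spread across HDFS.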
Traditionally, business intelligence (BI) and reporting platforms have been built on relational databases. Even when the data warehouse uses Hadoop, Hadoop serves only as a computation tool, and the results are written to a database such as Oracle or DB2 for the BI software to query. As the Hadoop/Hive ecosystem matures, however, BI tools can connect to Hive directly through JDBC and query the data. This way, ordinary users, not just professional IT engineers, can benefit from Hadoop.
Smartbi, a leading BI platform in China, supports Hadoop/Hive well. Smartbi combined with Hadoop/Hive has already been used successfully in the telecom industry.
The following steps describe how to connect Smartbi to Hadoop/Hive and query data.
1. First, set up the system environment (this example uses Ubuntu 12.04, JDK 7, hadoop-1.2.1, and hive-0.11.0) and load the data. Then start the Hive service.
2. Copy the Hive JDBC driver files to the Smartbi lib directory (.\smartbi\WEB-INF\lib).
3. Start Smartbi and create a Hadoop JDBC connection in System Management. The driver class is org.apache.hadoop.hive.jdbc.HiveDriver and the connection string is jdbc:hive://ip:port/default. Test the connection, and save it once the test passes.
4. Write Hive SQL statements in Smartbi's dataset definition feature.
5. Click the "Preview" button to view the query results.
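Before configuring Smartbi, it can help to confirm outside the BI tool that the driver class and connection string from step 3 work. The sketch below is one way to do that with plain JDBC. The class name `HiveJdbcCheck`, the host `127.0.0.1`, the port `10000`, and the empty user name and password are assumptions for illustration; substitute the values for your own Hive service. It needs the same Hive JDBC driver files from step 2 on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Minimal connectivity check; names and connection values are illustrative assumptions.
class HiveJdbcCheck {

    // Driver class named in step 3.
    static final String DRIVER = "org.apache.hadoop.hive.jdbc.HiveDriver";

    // Builds a connection string in the jdbc:hive://ip:port/default form from step 3.
    static String buildUrl(String host, int port, String database) {
        return "jdbc:hive://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        // Assumed host and port; replace with your Hive server's address.
        String url = buildUrl("127.0.0.1", 10000, "default");

        try {
            // Load the driver explicitly so a missing jar is reported clearly.
            Class.forName(DRIVER);
        } catch (ClassNotFoundException e) {
            System.err.println("Hive JDBC driver not found on the classpath: " + e.getMessage());
            return;
        }

        // Assumed empty credentials; adjust if your Hive service requires them.
        try (Connection conn = DriverManager.getConnection(url, "", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SHOW TABLES")) {
            // List the tables in the database to confirm that queries run.
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        } catch (SQLException e) {
            System.err.println("Could not query Hive at " + url + ": " + e.getMessage());
        }
    }
}
```

If this check lists your tables, the same driver class and connection string should be usable in the Smartbi connection dialog.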