Hadoop connector maxim: to each vendor its own

Source: Internet
Author: User

How hot is Hadoop? A series of moves across the industry gives the answer. Mainstream database vendors, including Oracle, Microsoft and Sybase, have released Hadoop connector products that make it easier for users to move data between traditional relational databases and the open source distributed processing system.

These vendors see Hadoop connector software as an important part of their "big data management" strategy, and they are not alone. Data warehouse providers such as Teradata and Hewlett-Packard's Vertica have launched similar Hadoop products, as have data integration software vendors such as Informatica and Talend. Startups like Hortonworks, Cloudera and MapR also play a very important role in the ecosystem.

Rod Cope, technical director of OpenLogic, has extensive experience with Hadoop and advises users to consider their application scenarios and their data before adopting a Hadoop connector. Cope describes his company as using a combination of Hadoop and HBase, a column-oriented NoSQL database, as part of OpenLogic's main business: helping customers audit software applications to verify that the embedded open source code they use complies with the relevant licenses.

OpenLogic has not yet deployed any connector software, but Cope is quite curious about the technology, which he believes could be used to move frequently accessed data from a relational database into HBase.

But Cope cautions that Hadoop connector software does not solve every problem, and interested users need to pay attention to how fast data can be loaded. When dealing with big data, people tend to pay less attention to performance benchmarks than they used to, yet if loading data into Hadoop takes a long time, a connector is of little use. In that case the problem lies not in Hadoop but in the data source you are loading from.

Ventana Research analyst David Menninger says that the Hadoop Distributed File System (HDFS) and the database products built on it give users a very good data management and analysis option as an alternative to traditional relational databases and data warehouses. The data in question can be large volumes of machine-generated data, such as web search logs, social media information, cell phone call records, and other unstructured data.

Menninger points out that a typical use case for Hadoop connector software is one in which an enterprise uses Hadoop to extract a small amount of structured analytical information from a large volume of unstructured source data, then transfers it to a relational database for further analysis with BI tools.
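The scenario Menninger describes can be sketched in a few lines. This is only an illustration under stated assumptions: plain Python stands in for the Hadoop job, an in-memory SQLite database stands in for the relational target, and the log format, the `action_counts` table and the function names are all hypothetical.

```python
import sqlite3
from collections import Counter

# Hypothetical raw log lines; in practice these would sit in HDFS.
RAW_LOGS = [
    "2012-03-01 10:00:01 user=alice action=search q=hadoop",
    "2012-03-01 10:00:05 user=bob action=click item=42",
    "garbled line without fields",
    "2012-03-01 10:01:12 user=alice action=click item=7",
    "2012-03-01 10:02:30 user=carol action=search q=hbase",
]

def summarize(lines):
    """Reduce raw lines to sorted (action, count) pairs.

    Lines that carry no action= field are skipped.
    """
    counts = Counter()
    for line in lines:
        for token in line.split():
            if token.startswith("action="):
                counts[token[len("action="):]] += 1
                break
    return sorted(counts.items())

def load_summary(conn, rows):
    """Write the small structured summary into a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS action_counts "
        "(action TEXT PRIMARY KEY, n INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO action_counts (action, n) VALUES (?, ?)", rows
    )
    conn.commit()

# SQLite is an assumption here; any relational database would play this role.
conn = sqlite3.connect(":memory:")
load_summary(conn, summarize(RAW_LOGS))

# From this point on, ordinary SQL (and BI tools built on it) can take over.
report = conn.execute(
    "SELECT action, n FROM action_counts ORDER BY n DESC, action"
).fetchall()
print(report)
```

The point of the sketch is the shape of the flow: the bulky raw data stays where it is, and only the compact summary crosses over to the relational side.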

"At the moment users are putting information into relational databases mainly because it is not easy to build reports on Hadoop data sources," says Menninger. "The industry already has sophisticated reporting and analysis systems, but of course they are built for relational data."

Such data transfer is not necessarily a one-time affair. Perhaps you first count the occurrences of an event and later want to count how often two events occurred together. You can go back to the source data and process it again, which is why people do not delete unstructured data; it can stay in Hadoop.

In addition, Hadoop provides a better environment for advanced analytics and data mining applications than SQL databases do. For example, you can analyze customer service call logs and social media posts to find out what customers care about and how a product is regarded. That is very hard to do in SQL, but the results can be transferred to a relational database or data warehouse via a Hadoop connector.
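A minimal sketch of that two-pass idea, assuming the retained raw data can be read as a list of sessions, each holding the set of events seen in one visit. The session data and function names are hypothetical, and plain Python again stands in for a Hadoop job.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sessions: the set of events observed during one user visit.
SESSIONS = [
    {"search", "click"},
    {"search"},
    {"search", "click", "purchase"},
    {"click", "purchase"},
]

def event_counts(sessions):
    """First pass: how many sessions contain each event."""
    counts = Counter()
    for events in sessions:
        counts.update(events)
    return counts

def pair_counts(sessions):
    """Second pass over the same raw data: how often two events co-occur."""
    counts = Counter()
    for events in sessions:
        # Sorting makes each pair appear in one canonical order.
        counts.update(combinations(sorted(events), 2))
    return counts

first_pass = event_counts(SESSIONS)
second_pass = pair_counts(SESSIONS)
print(first_pass)
print(second_pass)
```

The second question cannot be answered from the first pass's totals alone, which is why the raw sessions have to be kept around.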

Cameron Befus, vice president of Tynt Multimedia, said the company uses Hadoop to provide analytics services for more than 500,000 users. Tynt also uses the open source MySQL database as back-end support. So far, Befus has not seen the need to deploy a Hadoop connector. "We do move data, but it is usually straightforward," he said. "We import files from Hadoop directly into MySQL. It might be easier with a connector, but it is not a problem for us."

But IT analysts believe that as Hadoop grows in popularity, such connector software will be used more often. Analysts such as Menninger believe companies want to bring Hadoop-based analysis into the broader business environment, which is also driving the development of connector technology. What matters when we look at big data? The key question is what the data can tell me. Users want to build a bridge between unstructured, streaming data and meaningful, highly structured data that can be analyzed to find the source of a problem.

(Responsible editor: The good of the Legacy)
