A Technical Implementation Scheme for Extracting Valuable Information from Big Data



The scheme has 5 steps:

1. Collect files via FTP.

2. Load the files into HDFS.

3. Use Hive to select data from HDFS.

4. Use DataStage or Informatica to load the selected data into the warehouse.

5. The data ends up in a Sybase IQ database.
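
As a rough illustration of steps 1 and 2, the sketch below downloads files over FTP with Python's standard `ftplib` and then copies them into HDFS with the `hdfs dfs -put` shell command. The host, credentials, directory names, and function names are all hypothetical, and it assumes the remote directory contains only plain files and that the `hdfs` client is on the PATH. It is one possible way to wire the steps together, not the original author's implementation.

```python
import subprocess
from ftplib import FTP
from pathlib import Path


def collect_via_ftp(host, user, password, remote_dir, local_dir):
    """Step 1: download every file in remote_dir into local_dir.

    Assumes remote_dir holds only plain files (no subdirectories).
    Returns the list of local paths that were written.
    """
    local_dir = Path(local_dir)
    local_dir.mkdir(parents=True, exist_ok=True)
    downloaded = []
    with FTP(host) as ftp:
        ftp.login(user, password)
        ftp.cwd(remote_dir)
        for name in ftp.nlst():
            target = local_dir / name
            with open(target, "wb") as fh:
                ftp.retrbinary("RETR " + name, fh.write)
            downloaded.append(target)
    return downloaded


def build_hdfs_put_command(local_path, hdfs_dir):
    """Step 2: build the `hdfs dfs -put` command for one local file."""
    return ["hdfs", "dfs", "-put", str(local_path), hdfs_dir]


def load_into_hdfs(local_paths, hdfs_dir):
    """Run the put command for each file; raises CalledProcessError on failure."""
    for path in local_paths:
        subprocess.run(build_hdfs_put_command(path, hdfs_dir), check=True)
```

A call such as `load_into_hdfs(collect_via_ftp("ftp.example.com", "user", "secret", "/logs", "staging"), "/data/raw")` would chain the two steps, with every argument here being a placeholder.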


Notes:

1. You do not have to use FTP to collect the files; any method works as long as it can collect a massive volume of files.

2. The collected source files must be massive: either a very large number of files or a huge amount of content. Otherwise it cannot be called big data.

3. This scheme mainly uses HDFS from Hadoop; no MapReduce code is written by hand.

4. The MapReduce work is actually done for you by Hive.

5. Hive is used because anyone who knows SQL can use it, so the learning cost is low. Most enterprises, especially older ones, have many developers who know SQL.

6. DataStage is an IBM product. I was not satisfied with it, so it has now been replaced with Informatica.

7. IBM's products are sold very cheaply, but the maintenance fees are very high. They are not open source, so you have to go to IBM for maintenance, which is why I have always disliked them.

8. Besides the expensive maintenance, adding nodes to IBM products is not cheap either. Some of the company's servers have now been switched to HP.

9. You do not have to choose Sybase IQ, but the company's choice has caused no big problems: queries are very fast, and updates and inserts have not felt slow so far. It is a column-store database, and it costs much less than Oracle.
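
To make note 4 more concrete, here is a minimal pure-Python sketch of the map, shuffle, and reduce phases that a `GROUP BY` count conceptually breaks down into. It only mimics the idea on an in-memory list; it is not how Hive actually plans or executes a query, and the table name, column name, and function names are hypothetical.

```python
from collections import defaultdict

# Roughly the query Hive would be given (hypothetical table and column):
#   SELECT keyword, COUNT(*) FROM search_log GROUP BY keyword;


def map_phase(rows):
    """Emit one (keyword, 1) pair per input row."""
    for row in rows:
        yield row["keyword"], 1


def shuffle_phase(pairs):
    """Group the emitted values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values of each key to get the per-keyword count."""
    return {key: sum(values) for key, values in groups.items()}


def count_by_keyword(rows):
    """Chain the three phases over an in-memory list of rows."""
    return reduce_phase(shuffle_phase(map_phase(rows)))
```

With Hive, the developer writes only the one-line query in the comment; the phases above are the kind of work that no longer has to be coded by hand.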


Application Scenarios:

For example, suppose your site records a large volume of user searches. You can load these log files into HDFS, use Hive to count how many times each keyword was searched, and finally load each keyword and its count into IQ. Then, by querying IQ directly, you can see which keywords have attracted the most searches recently.
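
The scenario can be tried out locally on a small sample before running it on the cluster. The sketch below assumes a hypothetical tab-separated log format (`timestamp`, `user`, `keyword`), counts the keywords, and writes `keyword,count` rows of the kind that could then be bulk-loaded into the warehouse. The log layout and function names are assumptions made for illustration only.

```python
import csv
from collections import Counter


def count_keywords(lines):
    """Count searches per keyword from tab-separated log lines.

    Assumed layout per line: timestamp <TAB> user <TAB> keyword.
    Lines with fewer than three fields or an empty keyword are skipped.
    """
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or not fields[2]:
            continue
        counts[fields[2]] += 1
    return counts


def write_load_file(counts, out):
    """Write keyword,count rows, most searched first (ties sorted by keyword)."""
    writer = csv.writer(out, lineterminator="\n")
    for keyword, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        writer.writerow([keyword, n])
```

Sorting by descending count puts the keywords that attract the most attention at the top of the output, which is the same ordering you would ask for when querying the warehouse.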


This article originates from Ouyida3's CSDN blog.

2015.3.18
