Win or win? Pig vs Hive

Source: Internet
Author: User

From: http://www.aptibook.com/Articles/Pig-and-hive-advantages-disadvantages-features


This article compares the features of Pig and Hive.
Developers usually choose the technology that best meets their business needs. In the Hadoop ecosystem, Pig and Hive are very similar and can produce almost the same results. But which one is better suited to a particular business scenario? Some comparisons between Pig and Hive are listed here.

Pig and Hive:
Language type:
Pig is a procedural data flow language. A procedural program is written step by step, so you can control and optimize each step.
Hive is more like SQL, so it is a declarative language. You specify what needs to be done rather than how to do it. Hive relies on its own optimizer, so it is difficult to tune a Hive query by hand.
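The difference between the two styles can be sketched in plain Python. This is only an analogy, not Pig or Hive code: the step-by-step function stands in for a Pig script, and an in-memory SQLite query stands in for a Hive query. The sample data, the table name and the function names are all hypothetical.

```python
import sqlite3

# Hypothetical sample data: (visitor, page, seconds_on_page)
visits = [
    ("alice", "home", 12),
    ("bob", "home", 3),
    ("alice", "docs", 40),
    ("carol", "docs", 25),
    ("bob", "docs", 7),
]

def procedural_totals(rows, min_seconds):
    """Procedural style: every step is a named intermediate you can inspect."""
    # Step 1: filter out short visits.
    filtered = [r for r in rows if r[2] >= min_seconds]
    # Step 2: group the remaining visits by page.
    grouped = {}
    for _visitor, page, seconds in filtered:
        grouped.setdefault(page, []).append(seconds)
    # Step 3: aggregate each group.
    return {page: sum(values) for page, values in grouped.items()}

def declarative_totals(rows, min_seconds):
    """Declarative style: state the result and let the engine plan the work."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (visitor TEXT, page TEXT, seconds INTEGER)")
    conn.executemany("INSERT INTO visits VALUES (?, ?, ?)", rows)
    cur = conn.execute(
        "SELECT page, SUM(seconds) FROM visits WHERE seconds >= ? GROUP BY page",
        (min_seconds,),
    )
    result = dict(cur.fetchall())
    conn.close()
    return result
```

Both functions return the same totals. In the first one you decide the order of the filter, group and aggregate steps yourself; in the second one that order is left to the query engine.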
Ease of use:
Pig has its own syntax, which takes additional time to learn.
Hive is more like SQL, so developers who already know SQL will be more comfortable using Hive.
General scenarios:
We recommend that programmers use Pig, mainly because it is computationally efficient. Pig is more suitable when you have a large number of joins and filters.
Hive is used more for analysis. It follows Hadoop and data warehouse conventions, and it is generally geared toward generating reports. Hive remains a good choice if you have few joins and filters. With many joins, however, Hive performance may degrade.
Data type:
Pig can efficiently process structured and unstructured data.
Hive can efficiently process structured data.
Intermediate results:
Pig uses variables to represent data. To keep an intermediate result, you assign it to a variable and reference it later.
Hive uses tables to represent data, so storing intermediate results is harder: you need to create a table and populate it from other tables. As a result, a complex query may require hundreds of lines of code.
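The two ways of holding an intermediate result can be sketched as follows. Again this is a Python analogy rather than Pig or Hive code: a plain variable plays the role of a Pig variable, and an in-memory SQLite table plays the role of a Hive table. The data, table names and function names are hypothetical.

```python
import sqlite3

# Hypothetical sample data: (customer, amount)
orders = [("alice", 30), ("bob", 5), ("alice", 20), ("carol", 50)]

def big_spenders_with_variables(rows, threshold):
    # The intermediate result lives in an ordinary variable.
    totals = {}
    for customer, amount in rows:
        totals[customer] = totals.get(customer, 0) + amount
    # The variable is simply referenced again in the next step.
    return sorted(c for c, total in totals.items() if total >= threshold)

def big_spenders_with_tables(rows, threshold):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    # The intermediate result needs its own table,
    # which is then filled from another table.
    conn.execute("CREATE TABLE customer_totals (customer TEXT, total INTEGER)")
    conn.execute(
        "INSERT INTO customer_totals "
        "SELECT customer, SUM(amount) FROM orders GROUP BY customer"
    )
    cur = conn.execute(
        "SELECT customer FROM customer_totals WHERE total >= ? ORDER BY customer",
        (threshold,),
    )
    names = [row[0] for row in cur.fetchall()]
    conn.close()
    return names
```

The table-based version needs an extra create statement and an extra insert statement for a single intermediate step, which is why long pipelines written this way tend to grow quickly.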
Debugging method:
Pig can be debugged in local mode.
Debugging Hive in local mode is complicated and time-consuming.
Extensibility:
Writing a user-defined function (UDF) in Pig is easy.
Writing a UDF in Hive is relatively troublesome.
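For readers who have not met UDFs before, the idea can be sketched with Python's built-in sqlite3 module, which lets a query call a function you wrote yourself. This shows neither Pig's nor Hive's registration mechanism; SQLite is only a stand-in, and normalize_page and the hits table are hypothetical names.

```python
import sqlite3

def normalize_page(name):
    """Hypothetical UDF: trim whitespace and lower-case a page name."""
    return name.strip().lower()

def count_by_normalized_page(pages):
    conn = sqlite3.connect(":memory:")
    # Register the Python function so that SQL can call it by name.
    conn.create_function("normalize_page", 1, normalize_page)
    conn.execute("CREATE TABLE hits (page TEXT)")
    conn.executemany("INSERT INTO hits VALUES (?)", [(p,) for p in pages])
    cur = conn.execute(
        "SELECT normalize_page(page) AS p, COUNT(*) "
        "FROM hits GROUP BY p ORDER BY p"
    )
    result = cur.fetchall()
    conn.close()
    return result
```

Calling count_by_normalized_page with page names that differ only in case or surrounding spaces groups them under one cleaned name. How much work the registration step takes is the part that differs between tools.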
Maintainability:
Pig code is harder to maintain than Hive code.
Hive is relatively simple to maintain.
Durability:
Pig may not retain the value of a variable. You need to re-run the Pig code to obtain the value again.
In Hive, an external table still exists even after you exit the current session, because the table still points to the files in HDFS.
Development time:
Pig development takes more time and requires more familiarity with Pig's dependencies.
Hive development takes little time because it uses SQL-like statements.
Compatibility:
Moving between an RDBMS and Pig is somewhat complicated because Pig syntax is completely different from SQL.
Most RDBMS SQL statements can be executed in Hive, and only a few must be modified.
Data volume:
Pig is efficient at processing big data.
Hive sometimes suffers from memory leaks and unreliable performance. However, some parameters can be tuned to mitigate these problems.
Backing from major companies:
Pig: Yahoo, Twitter, LinkedIn
Hive: Facebook
