A brief introduction to Hue

Welcome to my independent blog:
http://blog.ywheel.cn/post/2016/05/29/hue_introduction/

Some time ago I gave my colleagues an introductory training session on Hue, and I am writing it up here along the way. This article gives a brief introduction to Hue and to how to contribute code to it.

What is Hue?

Hue = Hadoop User Experience

Hue is an open-source UI system for Apache Hadoop. It evolved from Cloudera Desktop, which Cloudera eventually contributed to the Apache Hadoop community, and it is built on the Python web framework Django.

With Hue we can interact with a Hadoop cluster from a web console in the browser to analyze and process data, for example by manipulating data on HDFS, running MapReduce jobs, executing Hive SQL statements, browsing HBase databases, and so on.

Hue Links

  • Site: http://gethue.com/
  • GitHub: https://github.com/cloudera/hue
  • Reviews: https://review.cloudera.org

Core Functions

  • SQL editors, with support for Hive, Impala, MySQL, Oracle, PostgreSQL, SparkSQL, Solr SQL, Phoenix ...
  • Various charts for the Solr search engine
  • Friendly interfaces for Spark and Hadoop
  • Support for the Apache Oozie scheduling system: workflows can be edited and viewed

Hue offers these features in a more user-friendly form than the interfaces provided by the individual components of the Hadoop ecosystem, but scenarios that require debugging may still call for the native systems in order to dig deeper into the cause of an error.

When viewing an Oozie workflow in Hue, it used to be easy to see the DAG of the whole workflow, but the DAG graph has been removed in the latest version, and you can see only the list of actions in the workflow and the transitions between them. If you want to see the DAG, you can still view it in Oozie's native web interface.

Hue Login

If you have set up your own Hue, you can create a new user with the administrator account and then sign in as the new user, as shown in the following figure:

You can also use the official Hue live demo to get a taste. If you have not set up a big data platform or installed Hue, you can try the demo first. Clicking "Play with the live Demo now!" takes you to "My Documents" in Hue:

HDFS File Browsing

Hue makes it easy to browse directories and files in HDFS, and to create, copy, delete, and download files and directories and modify their permissions.

HDFS implements a permissions model for files and directories that is similar to that of POSIX systems. Each file and directory has one owner and one group. A file or directory has separate permissions for its owner, for other users in the same group, and for all other users. However, user identity is external to HDFS itself: HDFS does not provide the ability to create user identities, create groups, or process user credentials. When you access HDFS through Hue, HDFS simply checks permissions against the user name and groups of the Hue user.
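To make the owner/group/other rule concrete, here is a minimal sketch of such a check in Python. It is an illustration of the POSIX-style model described above, not HDFS's actual implementation; the `Inode` class, the `is_allowed` function, and the user and group names are all hypothetical, and details such as the superuser and ACLs are left out.

```python
from dataclasses import dataclass

# Permission bits, as in POSIX mode strings (r = 4, w = 2, x = 1).
READ, WRITE, EXECUTE = 4, 2, 1


@dataclass
class Inode:
    """Hypothetical stand-in for a file or directory entry."""
    owner: str
    group: str
    mode: int  # octal mode, e.g. 0o640 for rw-r-----


def is_allowed(inode, user, groups, wanted):
    """Check `wanted` bits against exactly one permission class.

    The owner class applies if the user is the owner, otherwise the
    group class if the user belongs to the file's group, otherwise
    the "other" class.
    """
    if user == inode.owner:
        bits = (inode.mode >> 6) & 7
    elif inode.group in groups:
        bits = (inode.mode >> 3) & 7
    else:
        bits = inode.mode & 7
    return (bits & wanted) == wanted


report = Inode(owner="alice", group="analysts", mode=0o640)
print(is_allowed(report, "alice", [], READ | WRITE))   # owner class
print(is_allowed(report, "bob", ["analysts"], WRITE))  # group class
print(is_allowed(report, "carol", ["guests"], READ))   # other class
```

The check only ever receives a user name and a list of groups, which mirrors the point above: the identities themselves come from outside the file system.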

In the live demo, click "File Browser" to enter the HDFS home directory:

PS: The file upload feature is disabled in the live demo.

Job Browsing

Click "Job Browser" to see the list of jobs. You can filter jobs by state by clicking "Success", "Running", "failed" and "stop" in the upper-right corner:

In practice we found that when the cluster (CDH 5.2) is configured with HA and the active ResourceManager fails over automatically (for example, the ResourceManager on NN1 is active and the one on NN2 is standby; when NN1 fails, the ResourceManager on NN2 transitions to the active state), Hue's Job Browser no longer displays correctly. It works correctly again only after the fault has been repaired and the ResourceManager on NN1 has been switched back to the active state. I do not know whether this issue has been fixed in later releases.

Hive Query

Hue's Beeswax app provides a friendly and convenient Hive query function: you can select among Hive databases, write HQL statements, submit query jobs, and see the log of the running query job at the bottom of the interface. Once the results are available, simple chart analysis is also provided.

Click "Data Browsers" and then "Metastore Tables" to see the databases in Hive, the tables in each database, and the metadata and other information for individual tables.

Oozie Workflow Editor

Hue also integrates well with Oozie: you can create and edit bundles, coordinators, and workflows in Hue. An introduction to Oozie can be found on its official website. The following figure shows a new workflow being created in Hue, where you can drag different components directly onto nodes in the DAG and set the flow logic for each action.
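The flow logic of an action corresponds to the `ok` and `error` transitions in an Oozie workflow definition. As a rough sketch, the Python below parses a small workflow XML and lists those transitions, which is essentially the "action list plus jumps" view mentioned earlier. The workflow name, action names, and paths are made up for illustration; only the element names and the `uri:oozie:workflow:0.5` namespace follow the Oozie workflow schema.

```python
import xml.etree.ElementTree as ET

# Namespace of the Oozie workflow schema (version 0.5).
NS = {"wf": "uri:oozie:workflow:0.5"}

# Hypothetical two-action workflow; names and paths are illustrative only.
WORKFLOW_XML = """
<workflow-app name="demo-wf" xmlns="uri:oozie:workflow:0.5">
  <start to="prepare"/>
  <action name="prepare">
    <fs><mkdir path="${nameNode}/tmp/demo"/></fs>
    <ok to="cleanup"/>
    <error to="fail"/>
  </action>
  <action name="cleanup">
    <fs><delete path="${nameNode}/tmp/demo"/></fs>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Workflow failed</message>
  </kill>
  <end name="end"/>
</workflow-app>
"""


def transitions(xml_text):
    """Return (from_node, to_node, kind) tuples for the start node and each action."""
    root = ET.fromstring(xml_text.strip())
    edges = [("start", root.find("wf:start", NS).get("to"), "start")]
    for action in root.findall("wf:action", NS):
        name = action.get("name")
        edges.append((name, action.find("wf:ok", NS).get("to"), "ok"))
        edges.append((name, action.find("wf:error", NS).get("to"), "error"))
    return edges


for source, target, kind in transitions(WORKFLOW_XML):
    print(f"{source} --{kind}--> {target}")
```

Decision and fork/join nodes are ignored here; a real workflow with branching would need those handled as well.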

Of course, bundles, coordinators, and workflows can also be submitted to Oozie from the command line. Whether a workflow was created in Hue or submitted from the command line, you can see its run status in Hue:

However, workflows submitted from the command line cannot be edited in Hue. Submitting from the command line with configuration files ensures that the versions running in the production and test environments are consistent, whereas the Hue interface makes editing easy but carries the risk of manual errors in production. Each approach has its pros and cons.

Contribution

When I was preparing the training material for my colleagues, I went to Hue's GitHub page to look for information. Where it lists the main features of Hue, the original text on GitHub was this:

Our main database happens to be PostgreSQL, so "Postgresl" looked strange to me. After a Google search I found that PostgreSQL goes by two names, PostgreSQL and Postgres, and the name on the official website is still PostgreSQL. Whether or not "Postgresl" has some story behind it, "PostgreSQL" is certainly correct. So I went to find out how to submit a code change to Hue. The wiki page "Contribute to Hue" can be found on GitHub. Hue has its own JIRA and Review Board, but the page also says that the Hue project gladly welcomes any patches or pull requests!

So I opened an issue and a pull request for Hue on GitHub. A few days later the pull request was accepted and merged, and the commit could be seen on the master branch.

Here are the steps, recorded for reference:

1. Fork the Hue project, for example to ywheel/hue.
2. Create a new branch; do not commit changes on the master branch. For example, I created the fix-postgresql-spelling branch.
3. Pull the code down, make the change, and commit it to the fix-postgresql-spelling branch.
4. Create an issue in the Hue project, describing the problem clearly, and submit it.
5. Click "Pull Request" and select the target project and branch, for example the master branch of cloudera/hue.
6. Fill in the comment, mentioning the issue you created, and create the pull request.
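The local part of these steps (creating the branch, committing, and pushing to the fork) might look roughly like the git commands that the following Python sketch prints. Nothing is executed; the function only builds the command lines. The fork URL is derived from the ywheel/hue example above, and the clone directory and commit message are assumptions made for illustration. Forking, creating the issue, and opening the pull request happen on the GitHub website and are not covered.

```python
import shlex


def contribution_commands(fork_url, clone_dir, branch, message):
    """Build the git commands for the local steps as argument lists.

    This is a dry run: the commands are returned, not executed.
    """
    return [
        # Pull the code of your fork down.
        ["git", "clone", fork_url, clone_dir],
        # Work on a dedicated branch rather than on master.
        ["git", "-C", clone_dir, "checkout", "-b", branch],
        # Commit the edited tracked files.
        ["git", "-C", clone_dir, "commit", "-a", "-m", message],
        # Publish the branch to the fork so a pull request can be opened.
        ["git", "-C", clone_dir, "push", "origin", branch],
    ]


commands = contribution_commands(
    fork_url="https://github.com/ywheel/hue.git",  # assumed fork URL
    clone_dir="hue",
    branch="fix-postgresql-spelling",
    message="Fix PostgreSQL spelling",  # hypothetical commit message
)
for command in commands:
    print(shlex.join(command))
```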

The next step is to wait: wait for the commit to be reviewed and merged into the master branch, wait for your own name to appear among the contributors, and then everything is done!

PS: Although fixing the spelling of a single word is almost embarrassingly small, it is a good start. I hope that in the near future I can truly contribute code to open source projects (especially those in the popular big data ecosystem). Keep going!
