The way to learning big data: start learning with the sandbox

Source: Internet
Author: User
Tags: hortonworks, sqoop

At first ...

When I first came across the concept of big data, it just sounded impressive and sparked my interest. At the time I had no idea what it actually did or what it was for; even now my understanding is still vague, but it is certainly much better than when I started.

So the learning process was difficult and sometimes very slow, but I felt this would be very useful in the future. My initial understanding of big data began with the book "Big Data Times"; many of its concepts and predictions struck me as remarkable.

But as some of its claims were gradually confirmed by things in daily life, I came to accept the book's contents. I still think it is worth reading.

In China the technology still seems relatively new, and not many people appear to work on it. Because of this, learning materials are scarce and the learning curve is steeper, but that is no reason for us to give up, right?

Platform management tools

Needless to say, learning more takes serious effort. After interning at a company for some time, I feel that one of the main difficulties for beginners is setting up a platform.

So let's look at two of the more popular platform management tools:

HDP, CDH

What I use at the company is HDP, so I will mainly talk about HDP.

What is HDP?

HDP stands for Hortonworks Data Platform.

The Hortonworks Data Platform is an open-source data platform based on Apache Hadoop, providing services such as big data storage, data processing, and analytics. The platform is designed to handle data from multiple sources and in multiple formats, and to make processing it simple and cost-effective. HDP also offers an open, stable, and highly scalable foundation that makes it easier to integrate Apache Hadoop data flows with existing data architectures. The platform bundles a variety of Apache Hadoop projects, including the Hadoop Distributed File System (HDFS), MapReduce, Pig, Hive, HBase, ZooKeeper, and various other components, making the Hadoop platform easier to manage, more open, and more scalable.

The official website address is: http://zh.hortonworks.com/

Architecture of the HDP


Installation and use of the Hortonworks Sandbox:

The official site describes the Hortonworks Sandbox as a way to try out the latest HDP features and functionality.

It can be installed in a VM, which makes it very convenient for us to learn big-data-related content.

Download: http://zh.hortonworks.com/downloads/#sandbox

The installation method is very simple: with the corresponding virtual machine software, you just import the image directly.
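If you prefer the command line over the VirtualBox GUI, the import can be sketched roughly like this (this assumes VirtualBox; the file name and VM name below are examples, not the exact ones from the download page — check what you actually downloaded and what `VBoxManage list vms` reports):

```shell
# Import the downloaded sandbox appliance (example file name)
VBoxManage import HDP_2.5_virtualbox.ova

# Optionally adjust the VM's memory before first boot (value in MB);
# the VM name must match the one shown by `VBoxManage list vms`
VBoxManage modifyvm "Hortonworks Sandbox HDP 2.5" --memory 8192
```

The same import is available in the GUI via File > Import Appliance, which is what "direct import" refers to.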

Note: my laptop has 12 GB of memory, and the minimum memory required by HDP 2.5 is 8 GB. If you don't have enough memory, you can choose a lower version of the sandbox.

After installation, turn on the virtual machine.

The startup process may take a long time, so wait patiently.


Open the browser and enter http://127.0.0.1:8888/.

When you land on the page, you can view more information by opening the advanced options view.

The lower right corner has the following content:

* Services are disabled by default. To enable a service, you need to log in as an Ambari admin.

The Ambari admin password can be set by following this tutorial.

This requires us to log in over SSH, set the password of the admin account, and then use that admin account to log in to the virtual machine.

Log in with an SSH tool using the address 127.0.0.1 and port 2222.

You can also use the browser to log in:

Enter 127.0.0.1:4200 in the browser to access a web shell.

User name: root

Password: hadoop

On first login you are required to change the password, and the new password must be fairly complex; a simple one may be rejected. (But in my testing, once you are logged in you can later run passwd root and change it to any password you want.)

Then run the ambari-admin-password-reset command to set the Ambari admin account password.
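The whole flow above can be sketched as a terminal session (this requires the sandbox VM to be running; prompts and exact output will vary):

```shell
# Connect to the sandbox's SSH port, which is forwarded to the host on 2222
ssh root@127.0.0.1 -p 2222
# Default password: hadoop -- you will be forced to change it on first login

# (Optional) later, change root's password to anything you like
passwd root

# Set the password for the Ambari "admin" account
ambari-admin-password-reset
```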

After the change, enter 127.0.0.1:8080 in the browser and log in with the admin account.


The introduction of Ambari is as follows:


Apache Ambari is a web-based tool that supports the provisioning, management, and monitoring of Apache Hadoop clusters. Ambari currently supports most Hadoop components, including HDFS, MapReduce, Hive, Pig, HBase, ZooKeeper, Sqoop, and HCatalog, and allows them to be managed centrally. It is also considered one of the top 5 Hadoop management tools.
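Besides the web UI, Ambari exposes a REST API, which becomes handy once you start scripting against the cluster. A minimal sketch, assuming the sandbox is running and the admin password has been set as above (replace `yourpassword` accordingly; the cluster name `Sandbox` is an assumption — read the real name from the first call's output):

```shell
# List the clusters this Ambari instance manages
curl -s -u admin:yourpassword http://127.0.0.1:8080/api/v1/clusters

# List the services of the cluster (cluster name taken from the previous call)
curl -s -u admin:yourpassword http://127.0.0.1:8080/api/v1/clusters/Sandbox/services
```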

We'll use it to learn in the future!

