1. Python Big Data Application - Deploying Hadoop

Source: Internet
Author: User

Python Big Data App Introduction

Introduction: The mainstream storage and analysis platform in industry today is the open-source ecosystem built around Hadoop, with MapReduce as Hadoop's model for parallel computation over data sets. Besides MapReduce jobs written in Java, Hadoop also supports the streaming interface, which lets you write MapReduce tasks in any scripting language that reads standard input and writes standard output; its advantage is simple, flexible development.
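To make the streaming model concrete, here is a minimal word-count job in Python; the tab-separated key/value convention and the mapper/reducer split are what Hadoop streaming expects, while the function names and file layout are illustrative, not part of any Hadoop API.

```python
#!/usr/bin/env python
"""Minimal Hadoop-streaming word count (a sketch; names are illustrative)."""
import sys
from itertools import groupby


def mapper(lines):
    # Emit one (word, 1) pair per word; streaming passes records as text lines.
    for line in lines:
        for word in line.split():
            yield word, 1


def reducer(pairs):
    # Streaming sorts mapper output by key before the reduce phase,
    # so all pairs for the same word arrive adjacent to each other.
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)


if __name__ == "__main__":
    if sys.argv[1:] == ["map"]:
        for word, count in mapper(sys.stdin):
            sys.stdout.write("%s\t%d\n" % (word, count))
    else:  # reduce phase: parse the tab-separated mapper output
        pairs = (line.rstrip("\n").split("\t") for line in sys.stdin)
        for word, total in reducer((w, int(c)) for w, c in pairs):
            sys.stdout.write("%s\t%d\n" % (word, total))
```

Such a script would be submitted through the streaming jar shipped with Hadoop (for Hadoop 1.x, typically under contrib/streaming/ in the installation directory), passing the script as both -mapper and -reducer; the exact jar path depends on the Hadoop version.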

Hadoop Environment Deployment

1. The master host must be able to log in to every slave host without a password; that is, configure SSH public-key authentication for the account.
2. Install a JDK environment on the master host.
3. Install Hadoop on the master host.
3.1. Download Hadoop and extract it to /usr/local.
3.2. Set the Java environment variable in hadoop-env.sh:
export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.41.x86_64
3.3. Modify core-site.xml (Hadoop's core configuration file):
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/tmp/hadoop-${user.name}</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.1.1:9000</value>
  </property>
</configuration>
3.4. Modify hdfs-site.xml (configuration entries for Hadoop's HDFS component):
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/data/tmp/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/data/hdfs/data</value>
  </property>
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>4096</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
3.5. Modify mapred-site.xml (properties of the MapReduce component, including the JobTracker and TaskTrackers):
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>192.168.1.1:9001</value>
  </property>
</configuration>
3.6. Modify the masters and slaves configuration files.

Masters file

192.168.1.1

Slaves file

192.168.1.1
192.168.1.2
192.168.1.3
4. Slave host configuration.
4.1. Install the same JDK environment as on the master host, keeping the target path identical.
4.2. Copy the Hadoop environment configured on the master host to each slave host.
5. Configure the firewall.

Master Host

iptables -I INPUT -s 192.168.1.0/24 -p tcp --dport 50030 -j ACCEPT
iptables -I INPUT -s 192.168.1.0/24 -p tcp --dport 50070 -j ACCEPT
iptables -I INPUT -s 192.168.1.0/24 -p tcp --dport 9000 -j ACCEPT
iptables -I INPUT -s 192.168.1.0/24 -p tcp --dport 9001 -j ACCEPT

Slave host

iptables -I INPUT -s 192.168.1.0/24 -p tcp --dport 50075 -j ACCEPT
iptables -I INPUT -s 192.168.1.0/24 -p tcp --dport 50060 -j ACCEPT
iptables -I INPUT -s 192.168.1.1 -p tcp --dport 50010 -j ACCEPT
6. Test the results.
6.1. Execute the start command on the master host (from the installation directory):
./bin/start-all.sh

If the output shows each daemon (NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker) starting without errors, the cluster started successfully.

6.2. Run the bundled MapReduce sample on the master host:
./bin/hadoop jar hadoop-examples-1.2.1.jar pi 10 100

If the job runs to completion and prints an estimated value of Pi, the configuration is working.
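The sample job estimates Pi by sampling points and counting how many fall inside a quarter-circle; the real Hadoop example distributes a quasi-Monte Carlo sequence across map tasks, but the underlying idea can be sketched in plain Python (the function name and fixed seed are my own choices for reproducibility):

```python
import random


def estimate_pi(samples, seed=0):
    """Monte Carlo estimate of Pi: the fraction of random points in the
    unit square that land inside the unit quarter-circle approximates Pi/4."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / samples
```

In the Hadoop job, `pi 10 100` means 10 map tasks with 100 samples each; each mapper counts its own hits and a single reducer combines the counts, exactly as `inside` is accumulated here.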

7. Addendum: visit the management pages provided by Hadoop.

MapReduce management address: http://192.168.1.1:50030

HDFS management address: http://192.168.1.1:50070
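After deployment, it can be handy to check these pages from a script rather than a browser. A small sketch, assuming the same master address and ports as above (the helper names are illustrative, and only Python's standard library is used):

```python
import urllib.request

# Web-UI ports opened in the firewall rules above.
ADMIN_PORTS = {"mapreduce": 50030, "hdfs": 50070}


def admin_urls(master_host):
    """Build the web-UI URLs exposed by the JobTracker and the NameNode."""
    return {name: "http://%s:%d/" % (master_host, port)
            for name, port in ADMIN_PORTS.items()}


def is_up(url, timeout=3):
    """Return True if the page answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False
```

For example, `is_up(admin_urls("192.168.1.1")["hdfs"])` should return True once the NameNode web UI is reachable.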
