Using Java and Hadoop to explore the world of Big Data

Source: Internet
Author: User

What is Big Data

1 PB = 1024 TB

7123913827189 TB

Redis

Shared nothing

HDFS

Pros: Ideal for storing large files

TFS

HDFS Architecture

NameNode: the manager of the entire file system. There is only one, and the DataNodes sit under it.

Its metadata is stored as an image file (fsimage) and an edit log (edits).

The Secondary NameNode periodically merges the edit log into the image file.
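The merge of the edit log into the image file can be pictured with a toy model. The sketch below is only an illustration: the class name CheckpointSketch, the Edit type and the two operations "ADD" and "DELETE" are hypothetical, and the real fsimage and edits files are binary formats with many more operation types.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy model of a checkpoint: replay logged changes on top of a namespace image.
final class CheckpointSketch {
    // One logged namespace change (hypothetical, simplified).
    static final class Edit {
        final String op;   // "ADD" or "DELETE"
        final String path; // file path affected by the change
        Edit(String op, String path) {
            this.op = op;
            this.path = path;
        }
    }

    // Returns a new image; the input image is left untouched.
    static Set<String> checkpoint(Set<String> image, List<Edit> edits) {
        Set<String> merged = new TreeSet<>(image);
        for (Edit e : edits) {
            if ("ADD".equals(e.op)) {
                merged.add(e.path);
            } else if ("DELETE".equals(e.op)) {
                merged.remove(e.path);
            } else {
                throw new IllegalArgumentException("unknown op: " + e.op);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Set<String> image = new TreeSet<>(Arrays.asList("/a.txt", "/b.txt"));
        List<Edit> edits = Arrays.asList(
                new Edit("ADD", "/c.txt"),
                new Edit("DELETE", "/b.txt"));
        System.out.println(checkpoint(image, edits)); // prints [/a.txt, /c.txt]
    }
}
```

After a merge like this, the edit log can start again from empty, which keeps it short and makes a NameNode restart faster.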

DataNode: responsible for storing the data.

File contents are organized in fixed-size blocks as the basic unit. The default block size is 64 MB in Hadoop 1.x (128 MB in Hadoop 2.x).
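To make the block arithmetic concrete, here is a small sketch that computes how many blocks a file occupies. The class name BlockMath and the 200 MB file are made up for illustration; only the 64 MB block size comes from the notes above.

```java
// Sketch: number of fixed-size blocks needed to hold a file.
final class BlockMath {
    static final long MB = 1024L * 1024L;

    // Ceiling division: the last block may be only partially filled.
    static long blockCount(long fileSizeBytes, long blockSizeBytes) {
        if (fileSizeBytes < 0 || blockSizeBytes <= 0) {
            throw new IllegalArgumentException("file size must be >= 0 and block size > 0");
        }
        return (fileSizeBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    public static void main(String[] args) {
        long blockSize = 64 * MB;  // block size quoted in the notes
        long fileSize = 200 * MB;  // hypothetical file
        System.out.println(blockCount(fileSize, blockSize)); // prints 4
    }
}
```

A 200 MB file therefore needs three full 64 MB blocks plus one partial block.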

MapReduce

JobTracker: mainly responsible for resource monitoring and job scheduling.

TaskTracker

Slots are divided into map slots and reduce slots.

Task

Map Task and Reduce Task

Configuring a single-node, pseudo-distributed Hadoop environment

1. Edit ~/.bashrc

export HADOOP_HOME=/usr/local/hadoop   # Hadoop installation path
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin

After saving, make the settings take effect:

source ~/.bashrc

./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar grep input output 'dfs[a-z.]+'
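For readers who want to see what the bundled grep example computes, the following plain-Java sketch counts how often each string matching a regular expression occurs in some input lines. It assumes the pattern is dfs[a-z.]+ as in Hadoop's stock example; the class name GrepCountSketch and the sample lines are hypothetical, and the real job runs distributed and sorts its output by count, which this sketch does not do.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Single-process sketch of "count every string that matches a regex".
final class GrepCountSketch {
    static Map<String, Integer> countMatches(Iterable<String> lines, String regex) {
        Pattern pattern = Pattern.compile(regex);
        Map<String, Integer> counts = new TreeMap<>(); // sorted by matched string
        for (String line : lines) {
            Matcher m = pattern.matcher(line);
            while (m.find()) {
                counts.merge(m.group(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "<name>dfs.replication</name>",
                "<name>dfs.namenode.name.dir</name>",
                "dfs.replication again");
        // prints {dfs.namenode.name.dir=1, dfs.replication=2}
        System.out.println(countMatches(lines, "dfs[a-z.]+"));
    }
}
```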

Data locality

Move the computation, not the data.

Requirement: find which account has the most money.

long[] moneys = ...; // moneys = 56789778687687
long max = 0L;
for (long m : moneys) {
    if (m > max) {
        max = m;
    }
}

MapReduce

Map 1     Map 2      Map 4
1233      4223423    423432
1000      800        1200       (local maximum of each map task)

1200                            (reduce: overall maximum)
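The same idea can be sketched in plain Java without any Hadoop dependency: each "map" call scans only its own split and emits one local maximum, and the "reduce" call picks the largest of those. The class name MaxMapReduceSketch and the sample balances are hypothetical, chosen so that the local maxima are 1000, 800 and 1200 as in the illustration above. A real job would instead use Hadoop's Mapper and Reducer classes and run the map tasks on the nodes that hold the data.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch of "local maximum per split, then global maximum".
final class MaxMapReduceSketch {
    // "Map" step: scan one split and return its largest value.
    static long mapTask(long[] split) {
        if (split.length == 0) {
            throw new IllegalArgumentException("split must not be empty");
        }
        long max = split[0];
        for (long v : split) {
            if (v > max) {
                max = v;
            }
        }
        return max;
    }

    // "Reduce" step: combine the local maxima into the overall maximum.
    static long reduceTask(List<Long> localMaxima) {
        if (localMaxima.isEmpty()) {
            throw new IllegalArgumentException("need at least one local maximum");
        }
        long max = localMaxima.get(0);
        for (long v : localMaxima) {
            if (v > max) {
                max = v;
            }
        }
        return max;
    }

    static long run(long[][] splits) {
        List<Long> localMaxima = new ArrayList<>();
        for (long[] split : splits) {
            localMaxima.add(mapTask(split)); // only one number leaves each split
        }
        return reduceTask(localMaxima);
    }

    public static void main(String[] args) {
        long[][] splits = {
                {300, 1000, 20},  // local max 1000
                {800, 15},        // local max 800
                {1200, 7, 450}    // local max 1200
        };
        System.out.println(run(splits)); // prints 1200
    }
}
```

Only one number per split travels to the reduce step, which is why the computation can move to the data instead of the other way round.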

./etc/hadoop/core-site.xml

<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
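The structure of this file is simply a list of name/value pairs. As a rough illustration, the sketch below reads such pairs using only the JDK's XML parser. The class name SiteXmlSketch is hypothetical, and this is not how Hadoop itself loads its configuration (Hadoop has its own Configuration class); it only shows the shape of the data.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Sketch: extract <name>/<value> pairs from a Hadoop-style configuration XML.
final class SiteXmlSketch {
    static Map<String, String> readProperties(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList props = doc.getElementsByTagName("property");
            Map<String, String> result = new LinkedHashMap<>(); // keeps file order
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                String name = p.getElementsByTagName("name").item(0).getTextContent().trim();
                String value = p.getElementsByTagName("value").item(0).getTextContent().trim();
                result.put(name, value);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse configuration XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<configuration>"
                + "<property><name>hadoop.tmp.dir</name>"
                + "<value>file:/usr/local/hadoop/tmp</value></property>"
                + "<property><name>fs.defaultFS</name>"
                + "<value>hdfs://localhost:9000</value></property>"
                + "</configuration>";
        // prints {hadoop.tmp.dir=file:/usr/local/hadoop/tmp, fs.defaultFS=hdfs://localhost:9000}
        System.out.println(readProperties(xml));
    }
}
```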

./etc/hadoop/hdfs-site.xml

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/data</value>
    </property>
</configuration>

Youku Java video collection: http://i.youku.com/i/UMTI4MTEzNTA0MA==?spm=a2hww.20023042.uerCenter.5~5!2~A
