Find out whether the package exists by piping a search through grep: sudo apt-cache search ssh | grep ssh
To install it: sudo apt-get install xxxxx
After installing SSH, generate a key file by running: ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
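To log in without a password afterwards, the public key still has to be appended to the authorized keys. A minimal sketch, assuming the default key path from the command above:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
ssh localhost    # should now connect without prompting for a password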
Finally, configure the three files core-site.xml, hdfs-site.xml, and mapred-site.xml in the /soft/hadoop/etc/hadoop directory.
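For reference, a minimal core-site.xml sketch for a pseudo-distributed setup (the hdfs://localhost:8020 address is an assumption; substitute your own name node host and port):
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:8020</value>
  </property>
</configuration>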
-----------------------------------------------------
View listening ports: netstat -lnpt or netstat -plut. View all ports: netstat -ano
--------------------------------------------------------------
Put files into HDFS with: hadoop fs -put xxxx /xxxx/xxxxx/xxx
Put a file on the cluster: hadoop --config /soft/hadoop/etc/hadoop_cluster fs -put /home/ubuntu/hello.txt /user/ubuntu/data/
Download a file from the cluster: hadoop --config /soft/hadoop/etc/hadoop_cluster fs -get /user/ubuntu/data/hello.txt bb.txt
Check the health of a file: hdfs --config /soft/hadoop/etc/hadoop_cluster fsck /user/ubuntu/data/hello.txt
Remote copy via scp: scp -r /xxx/x
Format the file system: hdfs --config /soft/hadoop/etc/hadoop_cluster namenode -format
touch creates an empty file.
Log in to another virtual machine with ssh s2. Running ssh s2 ls ~ lists the remote home directory one entry per line; piping it through xargs, as in
ssh s2 ls ~ | xargs, prints the entries on a single line.
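A quick local demonstration of what xargs does to multi-line output (printf here just simulates the ls output):
printf 'a.txt\nb.txt\nc.txt\n'           # three lines
printf 'a.txt\nb.txt\nc.txt\n' | xargs   # one line: a.txt b.txt c.txt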
View cluster status (recursive listing): hadoop --config /soft/hadoop/etc/hadoop_cluster fs -lsr /
Putting a file on the cluster is hadoop --config /soft/hadoop/etc/hadoop_cluster fs -put xxxxx followed by the destination path.
View processes: ssh s2 jps. ps -af also lists processes. To kill a process: kill -9 followed by the process ID.
su root: switch to the root user.
--------------------------------------------------
HDFS concepts: NameNode & DataNode
NameNode: stores the image file (fsimage) + edit log on local disk, along with data node information, but no block locations; block locations are rebuilt from DataNode reports when the cluster starts.
DataNode: worker node; stores and retrieves blocks and periodically sends its block list to the NameNode.
Under /usr/local/sbin, switch to the root user with su to create scripts; write whatever execution script you need there.
Modify the block size (the default is 128 MB):
It is set in [hdfs-site.xml]:
dfs.blocksize = 8m sets the block size to 8 MB.
1. Test method: put a file > 8 MB into HDFS and view the block size in the web UI (see the sketch below).
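A minimal hdfs-site.xml sketch for this test (8m is the 8 MB value from above):
<property>
  <name>dfs.blocksize</name>
  <value>8m</value>
</property>
Besides the web UI, the block count can also be checked with fsck; big.txt here is a hypothetical file already uploaded:
hdfs fsck /user/ubuntu/data/big.txt -files -blocks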
---------------------------------------------------------
Hadoop: a reliable, scalable, distributed computing framework; open-source software.
Four modules:
1. common (hadoop-common-xxx.jar)
2. HDFS
3. MapReduce
4. YARN
Fully distributed Hadoop:
1. HDFS ---> NameNode, DataNode, SecondaryNameNode (auxiliary name node)
2. YARN ---> ResourceManager (resource manager), NodeManager (node manager)
---------------------------------------------------
Configure a static IP by editing the interfaces file under /etc/network: sudo nano interfaces
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).
# The loopback network interface
auto lo
iface lo inet loopback
# The primary network interface
auto eth0
iface eth0 inet static (changed from dhcp to use a static IP)
address 192.168.92.148 (the client's IP)
netmask 255.255.255.0 (the client's netmask)
gateway 192.168.92.2 (the NAT gateway address)
dns-nameservers 192.168.92.2
Finally, restart networking: sudo /etc/init.d/networking restart
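A quick check that the change took effect, using the interface name and addresses configured above:
ifconfig eth0             # should show address 192.168.92.148
ping -c 3 192.168.92.2    # the NAT gateway should answer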
-------------------------------------------------
Client shutdown commands:
1. sudo poweroff
2. sudo shutdown -h 0
3. sudo halt
------------------------------
Configure text mode
Go into /boot/grub and have a look.
Then cd /etc/default and run gedit grub.
Below #GRUB_CMDLINE_LINUX_DEFAULT="quiet", write GRUB_CMDLINE_LINUX_DEFAULT="text".
Under # Uncomment to disable graphical terminal (grub-pc only):
uncomment GRUB_TERMINAL=console.
After the change, run sudo update-grub, then restart with sudo reboot.
-----------------------------------------
Starting the nodes:
hadoop-daemon.sh start namenode            // run on the name node server to start the name node
hadoop-daemons.sh start datanode           // run once; starts the data node on every configured worker
hadoop-daemon.sh start secondarynamenode   // start the secondary name node
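As a shortcut, the stock scripts can start or stop all three HDFS daemons in one go:
start-dfs.sh    # starts the name node, all data nodes, and the secondary name node
stop-dfs.sh     # stops them again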
-------------------------------------------------------
hdfs getconf looks up node configuration information. For example, hdfs getconf -namenodes shows that the name node is running on client s1.
-----------------------------------------------------------------
Four modules:
1. common
   hadoop-common-xxx.jar
   core-site.xml
   core-default.xml
2. HDFS
   hdfs-site.xml
   hdfs-default.xml
3. MapReduce
   mapred-site.xml
   mapred-default.xml
4. YARN
   yarn-site.xml
   yarn-default.xml
----------------------------------
Common ports:
1. namenode           RPC // 8020    WebUI // 50070
2. datanode           data transfer // 50010    IPC // 50020    WebUI // 50075
3. 2NN (secondary namenode)           WebUI // 50090
4. historyserver      WebUI // 19888
5. resourcemanager    WebUI // 8088
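To confirm a daemon is actually listening on its port, the netstat commands from earlier can be reused (run on the host of the daemon in question):
netstat -lnpt | grep 50070    # name node web UI
netstat -lnpt | grep 8020     # name node RPC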
--------------------------------------
dfs.hosts: determines which nodes may connect to the namenode (include list)
dfs.hosts.exclude: determines which nodes may not connect (exclude list)
dfs.hosts / dfs.hosts.exclude combinations (0 = not listed, 1 = listed):
0 0 // cannot connect
0 1 // cannot connect
1 0 // can connect
1 1 // can connect and will be decommissioned (retired)
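A sketch of wiring these up in hdfs-site.xml; the include/exclude file paths are hypothetical (list one hostname per line in each file):
<property>
  <name>dfs.hosts</name>
  <value>/soft/hadoop/etc/hadoop/include.txt</value>
</property>
<property>
  <name>dfs.hosts.exclude</name>
  <value>/soft/hadoop/etc/hadoop/exclude.txt</value>
</property>
After editing the files, make the name node reread them: hdfs dfsadmin -refreshNodes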
---------------------------------------------
Safe mode
1. When the namenode starts, it merges the image and the edit log into a new image, and generates a new edit log.
2. The whole cluster is in safe mode during this time; clients can only read.
3. Check whether the namenode is in safe mode (see the usage sketch after this list):
hdfs dfsadmin -safemode get    // view safe mode status
hdfs dfsadmin -safemode enter  // enter safe mode
hdfs dfsadmin -safemode leave  // leave safe mode
hdfs dfsadmin -safemode wait   // wait for safe mode to end
4. Manually save the namespace: hdfs dfsadmin -saveNamespace
5. Manually fetch the image file: hdfs dfsadmin -fetchImage
6. Save metadata (saved under HADOOP_HOME: hadoop/logs/): hdfs dfsadmin -metasave xxx.dsds
7. start-balancer.sh: starts the balancer, which spreads the cluster's data more evenly across nodes and improves overall cluster performance (generally we run the balancer after adding nodes).
8. hadoop fs -count: count files and directories.
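A small usage sketch for -safemode wait: block a script until the name node leaves safe mode before writing (the file and paths are the sample ones used earlier):
hdfs dfsadmin -safemode wait && hadoop fs -put /home/ubuntu/hello.txt /user/ubuntu/data/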
--------------------------------------------------
Hadoop snapshots: a snapshot saves the current state of a directory. By default a directory cannot have snapshots created on it; you must first run hdfs dfsadmin -allowSnapshot /user/ubuntu/data, i.e. allowSnapshot followed by the path where you want snapshots. Once snapshots are allowed, create one with hadoop fs -createSnapshot /user/ubuntu/data snap-1, where snap-1 is the name of the snapshot you created. To view snapshots: hadoop fs -ls -R /user/ubuntu/data/.snapshot/. Note that you cannot disallow snapshots on a directory while it still has snapshots. (A restore example follows the list below.)
1. Create a snapshot:    hadoop fs [-createSnapshot <snapshotDir> [<snapshotName>]]
2. Delete a snapshot:    hadoop fs [-deleteSnapshot <snapshotDir> <snapshotName>]
3. Rename a snapshot:    hadoop fs [-renameSnapshot <snapshotDir> <oldName> <newName>]
4. Allow snapshots on a directory:    hdfs dfsadmin [-allowSnapshot <snapshotDir>]
5. Disallow snapshots on a directory: hdfs dfsadmin [-disallowSnapshot <snapshotDir>]
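A sketch of the main practical use, restoring an accidentally deleted file from a snapshot (assumes snap-1 was created as above and contained hello.txt):
hadoop fs -rm /user/ubuntu/data/hello.txt
hadoop fs -cp /user/ubuntu/data/.snapshot/snap-1/hello.txt /user/ubuntu/data/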
------------------------------------------
Trash (recycle bin)
1. The default interval is 0, which means the trash is disabled.
2. Set the trash retention time in [core-site.xml]: fs.trash.interval=1 // in minutes
3. Files deleted through shell commands go into the trash.
4. Each user has their own trash directory, namely /user/ubuntu/.Trash
5. Programmatic deletion does not go through the trash; the file is deleted immediately. You can call the moveToTrash() method; if it returns false, the trash is disabled or the file is already in the trash.
Trash: Hadoop's trash is off by default; the time unit is minutes; it corresponds to the .Trash directory under the current user's folder. rm moves files into this directory.
[core-site.xml]
<property>
  <name>fs.trash.interval</name>
  <value>30</value>
</property>
Recovering files: move them out of the .Trash directory: hadoop fs -mv /user/ubuntu/.Trash/xx/x/x data/ (see the round-trip sketch below)
Empty the trash: hadoop fs -expunge
Test-delete the trash itself: hadoop fs -rm -r /user/ubuntu/.Trash
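A short round trip through the trash, using the paths from the examples above (.Trash/Current is where HDFS keeps the most recent deletions, under the file's original absolute path):
hadoop fs -rm /user/ubuntu/data/hello.txt                                                  # moved into the trash
hadoop fs -mv /user/ubuntu/.Trash/Current/user/ubuntu/data/hello.txt /user/ubuntu/data/    # recovered
hadoop fs -rm -skipTrash /user/ubuntu/data/hello.txt                                       # bypasses the trash, deletes immediately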
-----------------------------------
Quota
1. Directory (name) quota: hdfs dfsadmin -setQuota N /dir // N > 0; N = 1 means an empty directory into which nothing can be placed
2. Space quota: hdfs dfsadmin -setSpaceQuota
hadoop fs === hdfs dfs // file system operation commands
-clrSpaceQuota // clear the space quota
-clrQuota // clear the directory quota
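A usage sketch on the sample directory from earlier (the quota values are arbitrary examples):
hdfs dfsadmin -setQuota 10 /user/ubuntu/data        # at most 10 names in the tree, the directory itself included
hdfs dfsadmin -setSpaceQuota 1g /user/ubuntu/data   # at most 1 GB, counted after replication
hadoop fs -count -q /user/ubuntu/data               # show both quotas and current usage
hdfs dfsadmin -clrQuota /user/ubuntu/data
hdfs dfsadmin -clrSpaceQuota /user/ubuntu/data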
---------------------------------------------------
oiv views the contents of an image file: -i is the input file, -o is the output file, -p XML selects the XML processor.
How to do it: hdfs oiv -i fsimage_000000000000000054 -o ~/a.xml -p XML
View an edits_xxx edit log file: hdfs oev -i xxx_edit -o xxx.xml -p XML
Is the image file here, in /hadoop/dfs/name/current? e.g.:
cat fsimage_0000000000000054
bg %n resumes stopped job n so the program runs in the background.
-----------------------------------------------------------
Refresh nodes: hdfs dfsadmin -refreshNodes
-----------------------------------------
Hadoop Fragmented Notes