Knowledge System:
I. Linux Basics
II. Background Knowledge and Origins of Hadoop
III. Building the Hadoop Environment
IV. The Architecture of Apache Hadoop
V. HDFS
VI. MapReduce
VII. MapReduce Programming Cases
VIII. NoSQL Database: HBase
IX. Data Analysis Engine: Hive
X. Data Analysis Engine: Pig
XI. Data Acquisition Engines: Sqoop and Flume
XII. Integrated Management Tool: HUE
XIII. Hadoop HA Implementation and HDFS Federation
XIV. NoSQL Database: Redis
XV. Real-Time Processing Framework: Apache Storm
Chapter 1: Linux Basics
I. The Linux Lab Environment
(*) Version: RedHat 7.4 64-bit, which ships with a netcat server (used later to test Spark Streaming)
(*) VM: VMware 12
(*) Type: RedHat Linux 7, 64-bit
(*) Network adapter: Host-only mode
(*) 5 virtual machines: install the JDK, configure the hostname, turn off the firewall
192.168.157.11 bigdata11
192.168.157.12 bigdata12
192.168.157.13 bigdata13
192.168.157.14 bigdata14
192.168.157.15 bigdata15
II. Configuring Linux and the Linux Directory Structure
1. Getting to know Linux
2. Turn off the firewall
View the firewall status:          systemctl status firewalld.service
Stop the firewall:                 systemctl stop firewalld.service
Disable the firewall (permanent):  systemctl disable firewalld.service
3. Set the hostname (config file: /etc/hosts)
vi /etc/hosts
192.168.157.11 bigdata11
III. The vi Editor: Linux's Equivalent of Notepad
Three modes:
1. Edit mode: press i ---> enter insert mode; press : (colon) ---> enter command mode
2. Insert mode: press ESC to return to edit mode
3. Command mode:
(*) w: save
(*) q: quit
(*) wq: save and quit
(*) show line numbers: set number
    hide line numbers: set nonumber
(*) line wrap: set wrap
    no wrap:   set nowrap
IV. File and Directory Commands (important: the HDFS operations later work the same way, which is very convenient)
(*) ls: list files and directories
    -l: list file details
    -a: list all files in the current directory, including hidden files
    Hidden file: .bash_profile ---> sets environment variables: JAVA_HOME, HADOOP_HOME
    Hidden directory: .ssh ----> used to configure passwordless login for Hadoop and Spark
    List files in Hadoop: hdfs dfs -ls /
(*) pwd: show the current directory
    /root ---> the root user's home directory (shortcut: ~)
(*) mkdir: create a directory
    -p: if the parent directory does not exist, create it first
    Convention: mkdir /root/tools    ---> all installation packages
                mkdir /root/training ---> installation directory
    Create a directory in Hadoop: hdfs dfs -mkdir /aaa
(*) cd: change directory
(*) touch: create an empty file
    echo: create a file with content ----> more common use: view environment variables
        echo $JAVA_HOME
(*) cat, tac: display the contents of a text file
    cat starts from the first line; tac starts from the last line
    Example: cat a.txt
    View a file's contents in Hadoop: hdfs dfs -cat /a.txt
(*) cp: copy a file or directory
    cp a.txt data.txt
    Copy data in Hadoop: hdfs dfs -cp /a.txt /b.txt
(*) rm: delete a file
    -r: also delete all files in the directory (recursive)
    -f: force removal of files or directories
    rm -rf a.txt
    Delete files in Hadoop: hdfs dfs -rmr /a.txt
(*) kill: kill a process
    Parameters: -9 force kill
                -3 print a thread dump (see the case study below)
(*) tar: package and compress files
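The file, directory, and tar commands above can be exercised end to end. A minimal sketch (the mktemp sandbox directory and the file names are illustrative additions, not from the notes, which work under /root):

```shell
# Run the file/directory commands from this section in a throwaway directory.
WORK=$(mktemp -d)          # sandbox directory (illustrative; the notes use /root)
cd "$WORK"

mkdir -p tools/training    # -p also creates the missing parent directory "tools"
touch a.txt                # empty file
echo "hello" > b.txt       # file with content
cp b.txt data.txt          # copy a file
tar -zcf pkg.tar.gz a.txt b.txt   # package and gzip-compress two files
rm -rf a.txt               # force-delete (-r matters when removing directories)
tar -ztf pkg.tar.gz        # list archive contents without extracting
```

With tar, -c creates an archive, -x extracts, -t lists; -z adds gzip compression in all three cases.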
V. Linux Permission Management (HDFS permissions in Hadoop are very similar)
1. Permission types: r read
                     w write
                     x execute
2. Use the ls -l or ll command to view permissions
VI. Installing Common Software: Installing the JDK (which also demonstrates the tar command)
tar -zxvf jdk-8u144-linux-x64.tar.gz -C ~/training/
Set the environment variables: vi ~/.bash_profile
    JAVA_HOME=/root/training/jdk1.8.0_144
    export JAVA_HOME
    PATH=$JAVA_HOME/bin:$PATH
    export PATH
Apply the environment variables: source ~/.bash_profile
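The same pattern can be sketched with a throwaway profile file: source reads the file into the current shell, so the variables take effect without logging out. DEMO_HOME, /opt/demo, and the file name are illustrative stand-ins for the JAVA_HOME lines in ~/.bash_profile:

```shell
# Write a profile fragment, then "source" it so its variables take effect
# in the current shell session.
WORK=$(mktemp -d)
cat > "$WORK/profile_demo" <<'EOF'
DEMO_HOME=/opt/demo
export DEMO_HOME
PATH=$DEMO_HOME/bin:$PATH
export PATH
EOF
source "$WORK/profile_demo"   # reload without logging out and back in
echo "$DEMO_HOME"
```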
VII. Case Study: Deadlock Analysis in Java ---> introduces a tool (the kill -3 command)
Objective: use performance diagnostics to find the deadlock
Java provides a very powerful performance diagnostic tool: the thread dump (text output)
Use the jps command to view Java processes
1. Linux: kill -3 PID (PID is the Java process number)
2. Windows: press Ctrl+Break (Fn+B)
Java deadlock code:
package DeadLock;

public class DeadLock {
    final Object lockA = new Object();
    final Object lockB = new Object();

    public static void main(String[] args) {
        DeadLock dl = new DeadLock();
        dl.startLock();
    }

    public void startLock() {
        ThreadA a = new ThreadA(lockA, lockB);
        ThreadB b = new ThreadB(lockA, lockB);
        a.start();
        b.start();
    }
}

// Thread A takes lock A first, then tries to take lock B
class ThreadA extends Thread {
    private Object lockA = null;
    private Object lockB = null;

    public ThreadA(Object a, Object b) {
        this.lockA = a;
        this.lockB = b;
    }

    public void run() {
        synchronized (lockA) {
            System.out.println("*** Thread A ***: Lock A");
            try {
                Thread.sleep(3000);
            } catch (Exception e) {
                // ignore
            }
            synchronized (lockB) {
                System.out.println("*** Thread A ***: Lock B");
            }
        }
        System.out.println("*** Thread A ***: Finished");
    }
}

// Thread B takes lock B first, then tries to take lock A ---> deadlock
class ThreadB extends Thread {
    private Object lockA = null;
    private Object lockB = null;

    public ThreadB(Object a, Object b) {
        this.lockA = a;
        this.lockB = b;
    }

    public void run() {
        synchronized (lockB) {
            System.out.println("*** Thread B ***: Lock B");
            try {
                Thread.sleep(3000);
            } catch (Exception e) {
                // ignore
            }
            synchronized (lockA) {
                System.out.println("*** Thread B ***: Lock A");
            }
        }
        System.out.println("*** Thread B ***: Finished");
    }
}
Big Data Learning (1): Linux Basics