Nutch 1.3 distributed mode: terminal session walkthrough

The session below formats HDFS, starts the Hadoop 0.20.2 daemons, verifies the cluster with dfsadmin and jps, uploads a seed URL list, and runs a Nutch crawl that finishes with no URLs fetched.


kaiwii@master:~/Nutch-1.2/bin$ ./hadoop namenode -format
11/08/13 19:52:20 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = master/127.0.1.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
Re-format filesystem in /home/kaiwii/tmp/hadoop-kaiwii/dfs/name ? (Y or N) y
11/08/13 19:52:23 INFO namenode.FSNamesystem: fsOwner=kaiwii,kaiwii,adm,dialout,cdrom,floppy,audio,dip,video,plugdev,fuse,lpadmin,admin
11/08/13 19:52:23 INFO namenode.FSNamesystem: supergroup=supergroup
11/08/13 19:52:23 INFO namenode.FSNamesystem: isPermissionEnabled=true
11/08/13 19:52:23 INFO common.Storage: Image file of size 96 saved in 0 seconds.
11/08/13 19:52:23 INFO common.Storage: Storage directory /home/kaiwii/tmp/hadoop-kaiwii/dfs/name has been successfully formatted.
11/08/13 19:52:23 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/127.0.1.1
************************************************************/
kaiwii@master:~/Nutch-1.2/bin$ ./start-all.sh
starting namenode, logging to /home/kaiwii/hadoop-0.20.2/logs/hadoop-kaiwii-namenode-master.out
localhost: starting datanode, logging to /home/kaiwii/hadoop-0.20.2/logs/hadoop-kaiwii-datanode-master.out
localhost: starting secondarynamenode, logging to /home/kaiwii/hadoop-0.20.2/logs/hadoop-kaiwii-secondarynamenode-master.out
starting jobtracker, logging to /home/kaiwii/hadoop-0.20.2/logs/hadoop-kaiwii-jobtracker-master.out
localhost: starting tasktracker, logging to /home/kaiwii/hadoop-0.20.2/logs/hadoop-kaiwii-tasktracker-master.out
kaiwii@master:~/Nutch-1.2/bin$ ./hadoop dfsadmin -report
Configured Capacity: 0 (0 KB)
Present Capacity: 0 (0 KB)
DFS Remaining: 0 (0 KB)
DFS Used: 0 (0 KB)
DFS Used%: %
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 0 (0 total, 0 dead)

kaiwii@master:~/Nutch-1.2/bin$ ./hadoop dfsadmin -report
Configured Capacity: 0 (0 KB)
Present Capacity: 0 (0 KB)
DFS Remaining: 0 (0 KB)
DFS Used: 0 (0 KB)
DFS Used%: %
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 0 (0 total, 0 dead)

kaiwii@master:~/Nutch-1.2/bin$ ./hadoop dfsadmin -report
Configured Capacity: 0 (0 KB)
Present Capacity: 0 (0 KB)
DFS Remaining: 0 (0 KB)
DFS Used: 0 (0 KB)
DFS Used%: %
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 0 (0 total, 0 dead)

kaiwii@master:~/Nutch-1.2/bin$ jps
11910 SecondaryNameNode
12305 Jps
11973 JobTracker
11737 NameNode
12048 TaskTracker
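Note that no DataNode process appears in the jps listing, which matches the empty dfsadmin reports above. When a DataNode refuses to start after `hadoop namenode -format` has been re-run, the usual culprit on Hadoop 0.20.x is an "Incompatible namespaceIDs" error in the datanode log. A minimal, self-contained sketch of the check (the sample log line and the /tmp path are illustrative stand-ins; on a real node you would grep the actual datanode log under $HADOOP_HOME/logs):

```shell
# Stand-in for the real datanode log file (path and contents are
# hypothetical; they mimic the error a reformatted namenode provokes):
printf '%s\n' 'ERROR datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /home/kaiwii/tmp/hadoop-kaiwii/dfs/data' > /tmp/datanode-sample.log

# A non-zero match count means the datanode's storage directory still
# holds data stamped with the namespaceID of the old namenode:
grep -c 'Incompatible namespaceIDs' /tmp/datanode-sample.log   # prints 1
```

If the grep matches on a real cluster, the datanode's storage directory must be cleared (or its namespaceID edited) before it can register with the freshly formatted namenode.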
kaiwii@master:~/Nutch-1.2/bin$ ./hadoop dfsadmin -report
Configured Capacity: 0 (0 KB)
Present Capacity: 0 (0 KB)
DFS Remaining: 0 (0 KB)
DFS Used: 0 (0 KB)
DFS Used%: %
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 0 (0 total, 0 dead)

kaiwii@master:~/Nutch-1.2/bin$ ./stop-all.sh
stopping jobtracker
localhost: stopping tasktracker
stopping namenode
localhost: no datanode to stop
localhost: stopping secondarynamenode
kaiwii@master:~/Nutch-1.2/bin$ ./hadoop namenode -format
11/08/13 20:02:30 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = master/127.0.1.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
11/08/13 20:02:30 INFO namenode.FSNamesystem: fsOwner=kaiwii,kaiwii,adm,dialout,cdrom,floppy,audio,dip,video,plugdev,fuse,lpadmin,admin
11/08/13 20:02:30 INFO namenode.FSNamesystem: supergroup=supergroup
11/08/13 20:02:30 INFO namenode.FSNamesystem: isPermissionEnabled=true
11/08/13 20:02:30 INFO common.Storage: Image file of size 96 saved in 0 seconds.
11/08/13 20:02:30 INFO common.Storage: Storage directory /home/kaiwii/tmp/hadoop-kaiwii/dfs/name has been successfully formatted.
11/08/13 20:02:30 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/127.0.1.1
************************************************************/
kaiwii@master:~/Nutch-1.2/bin$ ./start-all.sh
starting namenode, logging to /home/kaiwii/hadoop-0.20.2/logs/hadoop-kaiwii-namenode-master.out
localhost: starting datanode, logging to /home/kaiwii/hadoop-0.20.2/logs/hadoop-kaiwii-datanode-master.out
localhost: starting secondarynamenode, logging to /home/kaiwii/hadoop-0.20.2/logs/hadoop-kaiwii-secondarynamenode-master.out
starting jobtracker, logging to /home/kaiwii/hadoop-0.20.2/logs/hadoop-kaiwii-jobtracker-master.out
localhost: starting tasktracker, logging to /home/kaiwii/hadoop-0.20.2/logs/hadoop-kaiwii-tasktracker-master.out
kaiwii@master:~/Nutch-1.2/bin$ ./hadoop dfsadmin -report
Configured Capacity: 0 (0 KB)
Present Capacity: 0 (0 KB)
DFS Remaining: 0 (0 KB)
DFS Used: 0 (0 KB)
DFS Used%: %
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 0 (0 total, 0 dead)

kaiwii@master:~/Nutch-1.2/bin$ ./hadoop dfsadmin -report
Configured Capacity: 20368445440 (18.97 GB)
Present Capacity: 13561008128 (12.63 GB)
DFS Remaining: 13560983552 (12.63 GB)
DFS Used: 24576 (24 KB)
DFS Used%: 0%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)

Name: 127.0.0.1:50010
Decommission Status : Normal
Configured Capacity: 20368445440 (18.97 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 6807437312 (6.34 GB)
DFS Remaining: 13560983552 (12.63 GB)
DFS Used%: 0%
DFS Remaining%: 66.58%
Last contact: Sat Aug 13 20:07:32 PDT 2011
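The DataNode has now registered, so the report shows real capacity. Each figure is raw bytes followed by a human-readable annotation; the conversion Hadoop applies is binary (1 GB = 1024^3 bytes), which can be reproduced directly:

```shell
# Reproduce the "(18.97 GB)" annotation from the configured capacity
# of 20368445440 bytes using binary (1024-based) units:
awk 'BEGIN { printf "%.2f GB\n", 20368445440 / (1024 * 1024 * 1024) }'   # prints 18.97 GB
```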

kaiwii@master:~/Nutch-1.2/bin$ ./hadoop dfsadmin -report
Configured Capacity: 20368445440 (18.97 GB)
Present Capacity: 13561008143 (12.63 GB)
DFS Remaining: 13560983552 (12.63 GB)
DFS Used: 24591 (24.01 KB)
DFS Used%: 0%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)

Name: 127.0.0.1:50010
Decommission Status : Normal
Configured Capacity: 20368445440 (18.97 GB)
DFS Used: 24591 (24.01 KB)
Non DFS Used: 6807437297 (6.34 GB)
DFS Remaining: 13560983552 (12.63 GB)
DFS Used%: 0%
DFS Remaining%: 66.58%
Last contact: Sat Aug 13 20:07:35 PDT 2011

kaiwii@master:~/Nutch-1.2/bin$ ./hadoop dfs -copyFromLocal ../urls
kaiwii@master:~/Nutch-1.2/bin$ ./hadoop dfs -lsr
-rw-r--r--   1 kaiwii supergroup         18 2011-08-13 20:08 /user/kaiwii/urls
kaiwii@master:~/Nutch-1.2/bin$ nutch crawl urls -dir crawler -depth 3 -topN 10
bash: nutch: command not found
kaiwii@master:~/Nutch-1.2/bin$ ./nutch crawl urls -dir crawler -depth 3 -topN 10
11/08/13 20:09:30 INFO crawl.Crawl: crawl started in: crawler
11/08/13 20:09:30 INFO crawl.Crawl: rootUrlDir = urls
11/08/13 20:09:30 INFO crawl.Crawl: threads = 10
11/08/13 20:09:30 INFO crawl.Crawl: depth = 3
11/08/13 20:09:30 INFO crawl.Crawl: indexer = lucene
11/08/13 20:09:30 INFO crawl.Crawl: topN = 10
11/08/13 20:09:30 INFO crawl.Injector: starting at 20:09:30
11/08/13 20:09:30 INFO crawl.Injector: crawlDb: crawler/crawldb
11/08/13 20:09:30 INFO crawl.Injector: urlDir: urls
11/08/13 20:09:30 INFO crawl.Injector: Converting injected urls to crawl db entries.
11/08/13 20:09:30 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
11/08/13 20:09:36 INFO mapred.FileInputFormat: Total input paths to process : 1
11/08/13 20:09:43 INFO mapred.JobClient: Running job: job_201108132007_0001
11/08/13 20:09:44 INFO mapred.JobClient:  map 0% reduce 0%
11/08/13 20:10:30 INFO mapred.JobClient:  map 100% reduce 0%
11/08/13 20:10:52 INFO mapred.JobClient:  map 100% reduce 100%
11/08/13 20:10:55 INFO mapred.JobClient: Job complete: job_201108132007_0001
11/08/13 20:10:57 INFO mapred.JobClient: Counters: 18
11/08/13 20:10:57 INFO mapred.JobClient:   Job Counters
11/08/13 20:10:57 INFO mapred.JobClient:     Launched reduce tasks=1
11/08/13 20:10:57 INFO mapred.JobClient:     Launched map tasks=2
11/08/13 20:10:57 INFO mapred.JobClient:     Data-local map tasks=2
11/08/13 20:10:57 INFO mapred.JobClient:   FileSystemCounters
11/08/13 20:10:57 INFO mapred.JobClient:     FILE_BYTES_READ=6
11/08/13 20:10:57 INFO mapred.JobClient:     HDFS_BYTES_READ=28
11/08/13 20:10:57 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=82
11/08/13 20:10:57 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=86
11/08/13 20:10:57 INFO mapred.JobClient:   Map-Reduce Framework
11/08/13 20:10:57 INFO mapred.JobClient:     Reduce input groups=0
11/08/13 20:10:57 INFO mapred.JobClient:     Combine output records=0
11/08/13 20:10:57 INFO mapred.JobClient:     Map input records=1
11/08/13 20:10:57 INFO mapred.JobClient:     Reduce shuffle bytes=6
11/08/13 20:10:57 INFO mapred.JobClient:     Reduce output records=0
11/08/13 20:10:57 INFO mapred.JobClient:     Spilled Records=0
11/08/13 20:10:57 INFO mapred.JobClient:     Map output bytes=0
11/08/13 20:10:57 INFO mapred.JobClient:     Map input bytes=18
11/08/13 20:10:57 INFO mapred.JobClient:     Combine input records=0
11/08/13 20:10:57 INFO mapred.JobClient:     Map output records=0
11/08/13 20:10:57 INFO mapred.JobClient:     Reduce input records=0
11/08/13 20:10:57 INFO crawl.Injector: Merging injected urls into crawl db.
11/08/13 20:10:57 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
11/08/13 20:11:03 INFO mapred.FileInputFormat: Total input paths to process : 1
11/08/13 20:11:05 INFO mapred.JobClient: Running job: job_201108132007_0002
11/08/13 20:11:07 INFO mapred.JobClient:  map 0% reduce 0%
11/08/13 20:11:33 INFO mapred.JobClient:  map 100% reduce 0%
11/08/13 20:11:48 INFO mapred.JobClient:  map 100% reduce 100%
11/08/13 20:11:50 INFO mapred.JobClient: Job complete: job_201108132007_0002
11/08/13 20:11:50 INFO mapred.JobClient: Counters: 18
11/08/13 20:11:50 INFO mapred.JobClient:   Job Counters
11/08/13 20:11:50 INFO mapred.JobClient:     Launched reduce tasks=1
11/08/13 20:11:50 INFO mapred.JobClient:     Launched map tasks=1
11/08/13 20:11:50 INFO mapred.JobClient:     Data-local map tasks=1
11/08/13 20:11:50 INFO mapred.JobClient:   FileSystemCounters
11/08/13 20:11:50 INFO mapred.JobClient:     FILE_BYTES_READ=6
11/08/13 20:11:50 INFO mapred.JobClient:     HDFS_BYTES_READ=86
11/08/13 20:11:50 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=44
11/08/13 20:11:50 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=215
11/08/13 20:11:50 INFO mapred.JobClient:   Map-Reduce Framework
11/08/13 20:11:50 INFO mapred.JobClient:     Reduce input groups=0
11/08/13 20:11:50 INFO mapred.JobClient:     Combine output records=0
11/08/13 20:11:50 INFO mapred.JobClient:     Map input records=0
11/08/13 20:11:50 INFO mapred.JobClient:     Reduce shuffle bytes=0
11/08/13 20:11:50 INFO mapred.JobClient:     Reduce output records=0
11/08/13 20:11:50 INFO mapred.JobClient:     Spilled Records=0
11/08/13 20:11:50 INFO mapred.JobClient:     Map output bytes=0
11/08/13 20:11:50 INFO mapred.JobClient:     Map input bytes=0
11/08/13 20:11:50 INFO mapred.JobClient:     Combine input records=0
11/08/13 20:11:50 INFO mapred.JobClient:     Map output records=0
11/08/13 20:11:50 INFO mapred.JobClient:     Reduce input records=0
11/08/13 20:11:50 INFO crawl.Injector: finished at 20:11:50, elapsed: 00:02:20
11/08/13 20:11:50 INFO crawl.Generator: Generator: starting at 20:11:50
11/08/13 20:11:50 INFO crawl.Generator: Generator: Selecting best-scoring urls due for fetch.
11/08/13 20:11:50 INFO crawl.Generator: Generator: filtering: true
11/08/13 20:11:50 INFO crawl.Generator: Generator: normalizing: true
11/08/13 20:11:50 INFO crawl.Generator: Generator: topN: 10
11/08/13 20:11:50 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
11/08/13 20:11:55 INFO mapred.FileInputFormat: Total input paths to process : 1
11/08/13 20:11:57 INFO mapred.JobClient: Running job: job_201108132007_0003
11/08/13 20:11:58 INFO mapred.JobClient:  map 0% reduce 0%
11/08/13 20:12:19 INFO mapred.JobClient:  map 100% reduce 0%
11/08/13 20:12:29 INFO mapred.JobClient:  map 100% reduce 100%
11/08/13 20:12:31 INFO mapred.JobClient: Job complete: job_201108132007_0003
11/08/13 20:12:31 INFO mapred.JobClient: Counters: 17
11/08/13 20:12:31 INFO mapred.JobClient:   Job Counters
11/08/13 20:12:31 INFO mapred.JobClient:     Launched reduce tasks=1
11/08/13 20:12:31 INFO mapred.JobClient:     Launched map tasks=1
11/08/13 20:12:31 INFO mapred.JobClient:     Data-local map tasks=1
11/08/13 20:12:31 INFO mapred.JobClient:   FileSystemCounters
11/08/13 20:12:31 INFO mapred.JobClient:     FILE_BYTES_READ=6
11/08/13 20:12:31 INFO mapred.JobClient:     HDFS_BYTES_READ=86
11/08/13 20:12:31 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=44
11/08/13 20:12:31 INFO mapred.JobClient:   Map-Reduce Framework
11/08/13 20:12:31 INFO mapred.JobClient:     Reduce input groups=0
11/08/13 20:12:31 INFO mapred.JobClient:     Combine output records=0
11/08/13 20:12:31 INFO mapred.JobClient:     Map input records=0
11/08/13 20:12:31 INFO mapred.JobClient:     Reduce shuffle bytes=0
11/08/13 20:12:31 INFO mapred.JobClient:     Reduce output records=0
11/08/13 20:12:31 INFO mapred.JobClient:     Spilled Records=0
11/08/13 20:12:31 INFO mapred.JobClient:     Map output bytes=0
11/08/13 20:12:31 INFO mapred.JobClient:     Map input bytes=0
11/08/13 20:12:31 INFO mapred.JobClient:     Combine input records=0
11/08/13 20:12:31 INFO mapred.JobClient:     Map output records=0
11/08/13 20:12:31 INFO mapred.JobClient:     Reduce input records=0
11/08/13 20:12:31 WARN crawl.Generator: Generator: 0 records selected for fetching, exiting ...
11/08/13 20:12:31 INFO crawl.Crawl: Stopping at depth=0 - no more URLs to fetch.
11/08/13 20:12:31 WARN crawl.Crawl: No URLs to fetch - check your seed list and URL filters.
11/08/13 20:12:31 INFO crawl.Crawl: crawl finished: crawler
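The inject job's counters explain the empty crawl: Map input records = 1 but Map output records = 0, so the single seed URL was read and then rejected during injection, almost certainly by the URL filters, which is why the generator later selects 0 records. Nutch's crawl-urlfilter.txt (or regex-urlfilter.txt) only admits URLs matching one of its '+' regex rules. A self-contained simulation of that accept/reject decision (the rule and both URLs are illustrative, not taken from this session):

```shell
# Hypothetical '+' rule from conf/crawl-urlfilter.txt, restricted to
# one domain, applied to two candidate seed URLs:
rule='^http://([a-z0-9]*\.)*example\.com/'

echo 'http://www.example.com/' | grep -Ec "$rule"           # prints 1: kept
echo 'http://www.another.org/' | grep -Ec "$rule" || true   # prints 0: filtered out
```

A seed that prints 0 against every '+' rule never reaches the crawldb, reproducing exactly the "No URLs to fetch" ending above; the fix is to widen the rule or correct the seed URL.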
kaiwii@master:~/Nutch-1.2/bin$

 
