Class Node represents a web page in the graph. Its basic information includes the number of inlinks, the number of outlinks, the inlink score, and the metadata. The outlink score is obtained by dividing the inlink score by the number of outlinks.
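The division in the last sentence can be sketched as follows. This is a minimal illustration, not Nutch's actual Node class: the name NodeSketch, the field names, and the handling of a page with zero outlinks (the inlink score is returned unchanged) are assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; NodeSketch is a hypothetical name, not the Nutch class.
final class NodeSketch {
    private final int numInlinks;      // number of links pointing to this page
    private final int numOutlinks;     // number of links leaving this page
    private final float inlinkScore;   // score accumulated from inlinks
    private final Map<String, String> metadata = new HashMap<>();

    NodeSketch(int numInlinks, int numOutlinks, float inlinkScore) {
        this.numInlinks = numInlinks;
        this.numOutlinks = numOutlinks;
        this.inlinkScore = inlinkScore;
    }

    int numInlinks() {
        return numInlinks;
    }

    Map<String, String> metadata() {
        return metadata;
    }

    // Outlink score: the inlink score divided by the number of outlinks.
    // Assumption: with no outlinks, the inlink score is returned as is,
    // which avoids a division by zero.
    float outlinkScore() {
        return numOutlinks > 0 ? inlinkScore / numOutlinks : inlinkScore;
    }

    public static void main(String[] args) {
        NodeSketch page = new NodeSketch(4, 2, 3.0f);
        System.out.println(page.outlinkScore()); // prints 1.5
    }
}
```

Each outlink of a page therefore carries an equal share of that page's inlink score.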
Class LinkDatum represents a link in the web graph.
Class LinkNode consists of two parts: a link and its Node.
Class LoopSet is the value type stored in the loop database.
The web graph is generated from the parse-data and crawl-fetch data and consists of three parts: the outlink database, the inlink database, and the node database.
Let W be the directory of the web graph. Then:
the outlink database is located in the directory W/outlinks/current;
the old outlink database is located in the directory W/outlinks/old;
the inlink database is located in the directory W/inlinks;
the node database is located in the directory W/nodes;
the loop database is located in the directory W/loops;
the route database is located in the directory W/routes;
the link dump database is located in the directory W/linkdump.
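The directory layout listed above can be captured in a small helper. This is only a sketch: the class name WebGraphLayoutSketch, the map keys, and the example base path crawl/webgraph are hypothetical. Only the subdirectory names come from the list.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; not part of Nutch.
final class WebGraphLayoutSketch {

    // Maps a descriptive label (hypothetical) to its directory under the
    // web graph directory W. Insertion order follows the list in the text.
    static Map<String, Path> layout(Path w) {
        Map<String, Path> dirs = new LinkedHashMap<>();
        dirs.put("outlinks", w.resolve("outlinks").resolve("current"));
        dirs.put("oldOutlinks", w.resolve("outlinks").resolve("old"));
        dirs.put("inlinks", w.resolve("inlinks"));
        dirs.put("nodes", w.resolve("nodes"));
        dirs.put("loops", w.resolve("loops"));
        dirs.put("routes", w.resolve("routes"));
        dirs.put("linkdump", w.resolve("linkdump"));
        return dirs;
    }

    public static void main(String[] args) {
        // "crawl/webgraph" is an assumed example location for W.
        Path w = Paths.get("crawl", "webgraph");
        layout(w).forEach((name, dir) -> System.out.println(name + " -> " + dir));
    }
}
```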
The outlink database is a MapFile whose values are LinkDatum.
The inlink database is a MapFile whose values are LinkDatum.
The node database is a MapFile whose values are Node.
The loop database is a MapFile with Text keys and LoopSet values.
The link dump database is a MapFile: the key is a link (Text), and the value is LinkNodes, which represents the inlink information for that link.
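To make the key and value shape of the link dump database concrete, here is a small in-memory stand-in. It does not use the Hadoop MapFile API. A TreeMap plays the role of the sorted key-to-value file, and the names LinkDumpSketch and InlinkEntry, along with the fields of InlinkEntry, are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

// Illustrative sketch only; an in-memory stand-in, not Nutch or Hadoop code.
final class LinkDumpSketch {

    // Hypothetical stand-in for one piece of inlink information:
    // the URL of the linking page and that page's inlink score.
    static final class InlinkEntry {
        final String url;
        final float inlinkScore;

        InlinkEntry(String url, float inlinkScore) {
            this.url = url;
            this.inlinkScore = inlinkScore;
        }
    }

    // Key: the link as text. Value: the inlink information for that link.
    // TreeMap keeps keys sorted, as a MapFile keeps its keys sorted on disk.
    private final TreeMap<String, List<InlinkEntry>> entries = new TreeMap<>();

    void addInlink(String target, String from, float inlinkScore) {
        entries.computeIfAbsent(target, k -> new ArrayList<>())
               .add(new InlinkEntry(from, inlinkScore));
    }

    // Returns the recorded inlinks for a link, or an empty list if none exist.
    List<InlinkEntry> inlinksOf(String target) {
        List<InlinkEntry> found = entries.get(target);
        return found == null ? Collections.<InlinkEntry>emptyList() : found;
    }

    public static void main(String[] args) {
        LinkDumpSketch dump = new LinkDumpSketch();
        dump.addInlink("http://example.com/", "http://a.example/", 0.5f);
        dump.addInlink("http://example.com/", "http://b.example/", 0.25f);
        System.out.println(dump.inlinksOf("http://example.com/").size()); // prints 2
    }
}
```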
Basic types and storage structures of the web graph in Nutch