When a description (summary) is extracted automatically, certain features of the text directly affect the quality of the result: keywords that appear in the title, word frequency, word position, sentence length, sentence structure, and typesetting characteristics. These features drive the selection of summary sentences and the organization of the summary, so each of them needs to be understood and analyzed carefully.
(1) Word frequency
Word frequency refers to how often a word appears in the text. Jcy stresses that the words worth indexing are usually medium-frequency words: high-frequency words mostly reflect the grammatical structure of sentences, and low-frequency words are not suitable as a reference. Likewise, the keywords ("important words") that matter most for the summary appear frequently among the content words, yet fall in the medium-frequency band of the article as a whole. Counting how many of these medium-frequency words a sentence contains gives a weight for that sentence, which is then used to determine the summary candidate sentences.
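The weighting idea above can be sketched in a few lines of Python. This is only an illustration: the function names and the `min_count` and `max_count` thresholds that define the "medium-frequency" band are assumptions, not values given by the article.

```python
import re
from collections import Counter


def medium_frequency_words(text, min_count=2, max_count=4):
    """Return words whose counts fall inside an assumed medium-frequency band."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Words above max_count are treated as grammatical (high-frequency) words,
    # words below min_count as low-frequency words; both are dropped.
    return {w for w, c in counts.items() if min_count <= c <= max_count}


def sentence_weight(sentence, important_words):
    """Weight a sentence by the number of important words it contains."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return sum(1 for t in tokens if t in important_words)


text = "the cat sat. the cat ran. the dog ran. the end of the story"
important = medium_frequency_words(text)
weight = sentence_weight("The cat ran home.", important)
```

In practice the band would have to be tuned to the length of the document; fixed counts are used here only to keep the sketch short.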
(2) Title
The title is an important expression of the text's content, and headings reflect the main content of the text to varying degrees. The vocabulary in the title is therefore important material for the summary, because its keywords are usually closely related to the original content and to the topic under discussion. After the function words in the title are removed, the remaining keywords can be used as the "important words" for summary sentences.
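A minimal sketch of this step, assuming a small hand-written list of English function words (the `FUNCTION_WORDS` set and both function names are hypothetical; a real system would use a fuller stop-word list for its own language):

```python
import re

# Assumed, deliberately tiny list of function words to exclude from the title.
FUNCTION_WORDS = {"a", "an", "the", "of", "in", "on", "for", "to", "and", "or", "with", "by"}


def title_keywords(title):
    """Return the title's words with function words removed."""
    tokens = re.findall(r"[a-z0-9]+", title.lower())
    return {t for t in tokens if t not in FUNCTION_WORDS}


def title_overlap(sentence, title):
    """Count how many title keywords occur in the sentence."""
    tokens = set(re.findall(r"[a-z0-9]+", sentence.lower()))
    return len(title_keywords(title) & tokens)


title = "The Role of Word Frequency in Automatic Summarization"
keywords = title_keywords(title)
overlap = title_overlap("Word frequency matters.", title)
```

The overlap count can be added to a sentence's weight, so that sentences sharing vocabulary with the title rank higher.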
(3) Indicator words
Some words and phrases signal that the sentence they introduce summarizes the content of the text. Typical indicator phrases include "This article discusses", "The purpose of this article", and "To sum up". The sentence that follows such a phrase often gives a highly condensed statement of the subject of the document, so these sentences are very likely to be chosen as summary candidates.
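Detecting indicator phrases can be as simple as a substring check. The sketch below uses the three phrases quoted above; the `bonus` value and the function names are assumptions made for illustration.

```python
# The three example phrases come from the text; a real list would be longer.
INDICATOR_PHRASES = (
    "this article discusses",
    "the purpose of this article",
    "to sum up",
)


def has_indicator(sentence):
    """Return True if the sentence contains one of the indicator phrases."""
    lowered = sentence.lower()
    return any(phrase in lowered for phrase in INDICATOR_PHRASES)


def indicator_bonus(sentence, bonus=2.0):
    """Return an assumed extra weight for sentences with an indicator phrase."""
    return bonus if has_indicator(sentence) else 0.0


summary_like = "To sum up, position and frequency both matter."
ordinary = "The weather was fine that day."
```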
(4) Location
Sentences contribute differently to the theme of the article and of the paragraph depending on where they appear. Jcy has done some research on this, and we think that the first sentence of a paragraph is the topic sentence with a probability of 85%, and the last sentence with a probability of 7%. Sentences in these positions are therefore very likely to be summary sentences, and automatic summarization should raise the weights of sentences in these special positions.
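One way to turn these figures into weights is shown below. The 0.85 and 0.07 values are the probabilities quoted above; spreading the remaining 0.08 evenly over the middle sentences is an assumption of this sketch, as is the function name.

```python
def position_weight(index, sentence_count):
    """Weight a sentence by its position within a paragraph (0-based index)."""
    if not 0 <= index < sentence_count:
        raise ValueError("index must be within the paragraph")
    if index == 0:
        return 0.85  # first sentence: probability quoted in the text
    if index == sentence_count - 1:
        return 0.07  # last sentence: probability quoted in the text
    # Assumption: the remaining 8% is shared equally by the middle sentences.
    return 0.08 / (sentence_count - 2)


weights = [position_weight(i, 5) for i in range(5)]
```

For a five-sentence paragraph this gives the first sentence by far the largest weight, with the three middle sentences sharing what is left.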
(5) Syntactic structure
Articles contain a variety of sentence forms, such as declarative, interrogative, and exclamatory sentences, but the main topic of an article is usually expressed in declarative sentences, so the summary should also be composed of declarative sentences. When choosing summary sentences, extract declarative sentences as far as possible and keep interrogative and exclamatory sentences out of the summary.
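A rough way to apply this filter is to classify sentences by their closing punctuation. This is a heuristic sketch, not a syntactic analysis, and the function names are hypothetical; full-width marks are included because they are common in Chinese text.

```python
def sentence_type(sentence):
    """Classify a sentence by its final punctuation mark (heuristic)."""
    stripped = sentence.strip()
    if stripped.endswith(("?", "？")):
        return "interrogative"
    if stripped.endswith(("!", "！")):
        return "exclamatory"
    return "declarative"


def keep_declarative(sentences):
    """Drop interrogative and exclamatory sentences from the candidates."""
    return [s for s in sentences if sentence_type(s) == "declarative"]


candidates = [
    "Word frequency affects sentence weight.",
    "Why does position matter?",
    "What a result!",
]
kept = keep_declarative(candidates)
```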
(6) Sentence length
A summary should be short and precise: it conveys the main content of the article in a small amount of text. When choosing summary sentences, prefer shorter sentences; overly long sentences are usually not suitable for the summary.
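The preference for shorter sentences could be expressed as a multiplier on the sentence weight. In this sketch the `max_words` limit of 25 and the linear penalty are assumptions; the article gives no concrete threshold.

```python
def length_factor(sentence, max_words=25):
    """Return 1.0 for sentences within the limit and less for longer ones."""
    word_count = len(sentence.split())
    if word_count == 0:
        return 0.0  # an empty sentence cannot be a summary sentence
    if word_count <= max_words:
        return 1.0
    # Longer sentences are scaled down in proportion to their length.
    return max_words / word_count


short_factor = length_factor("Short sentences make good summaries.")
long_factor = length_factor(" ".join(["word"] * 50))
```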
(7) Page layout features
As web design software improves, higher demands are also placed on the typesetting of machine-readable documents. Editors often highlight the subject matter of a document through special formatting, such as enlarging the font size, switching to bold or special fonts, underlining, centering the text, adding labels, increasing the indent, or adding shadows, borders, and hyperlinks. When the weights of words or sentences are determined, these formatting features should be taken into account and the weights increased appropriately.
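For HTML pages, emphasized words can be collected with the standard library's `html.parser`. The set of tags treated as emphasis (`EMPHASIS_TAGS`) and the class and function names are assumptions of this sketch; formatting applied through CSS would not be detected.

```python
from html.parser import HTMLParser

# Assumed set of tags that editors use to highlight important text.
EMPHASIS_TAGS = {"b", "strong", "em", "u", "h1", "h2", "h3", "a"}


class EmphasisCollector(HTMLParser):
    """Collect the words that appear inside emphasis tags."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # how many emphasis tags are currently open
        self.emphasized = []  # words seen while depth > 0

    def handle_starttag(self, tag, attrs):
        if tag in EMPHASIS_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in EMPHASIS_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.emphasized.extend(data.lower().split())


def emphasized_words(html):
    """Return the set of words found inside emphasis tags."""
    parser = EmphasisCollector()
    parser.feed(html)
    parser.close()
    return set(parser.emphasized)


page = "<p>Plain text with <b>bold keyword</b> and <a href='#'>link</a>.</p>"
highlighted = emphasized_words(page)
```

Words in the returned set can then be given a larger weight than the same words in plain body text.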
On-site optimization accounts for a very large share of an entire SEO project. External links can only add the icing on the cake once internal optimization is in place. Jcy's point of view: search engine optimization means working out how to build the site well, following the rules of the search engines, and avoiding anything that offends the search algorithms.
This article was contributed by the webmaster of Hangzhou SEO (www.seojcy.cn); please credit the source when reproducing it.