The first thing a search engine does with a site is extract its text. SEO practitioners should make that extraction as easy as possible by optimizing the site's code: this speeds up spider crawling and raises the proportion of real content on each page. Cutting down on HTML formatting code also makes it easier for the engine to extract keywords.
Common code optimizations include:
1. Use CSS to define fonts, colors, sizes, and page layout. Many websites that already use CSS still apply inline style attributes to set fonts and sizes in the visible text. This is redundant code.
2. Use external files. CSS and JavaScript can live in external files and be pulled in with a single line of code in the page's HTML. Yet the source of many sites still contains large chunks of CSS and JavaScript, often placed at the very top of the HTML, which slows the spider's crawl of the page content.
3. Reduce or delete comments. Comments in code are only hints for programmers or designers; they do nothing for users or search engines.
4. Reduce tables, especially nested tables. Now that pages are laid out with CSS, tables are needed far less often. Sometimes a table really is the right way to present data, so there is no need to avoid tables entirely; just avoid multi-level nested tables, which generate a lot of useless code.
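The four points above can be turned into a rough page audit. The following is only a minimal sketch using Python's standard `html.parser` module; the class name `MarkupAudit`, the function `audit`, and the choice of indicators are assumptions made for illustration, not something a search engine prescribes.

```python
from html.parser import HTMLParser


class MarkupAudit(HTMLParser):
    """Collects rough indicators of redundant markup in one HTML page."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.inline_style_attrs = 0  # tags carrying a style="..." attribute (point 1)
        self.embedded_chars = 0      # characters inside <script>/<style> bodies (point 2)
        self.comment_chars = 0       # characters inside <!-- ... --> (point 3)
        self.max_table_depth = 0     # deepest level of nested <table> (point 4)
        self.visible_chars = 0       # non-whitespace text outside script/style
        self._table_depth = 0
        self._in_embedded = None     # name of the open <script>/<style> tag, if any

    def handle_starttag(self, tag, attrs):
        if any(name == "style" for name, _ in attrs):
            self.inline_style_attrs += 1
        if tag in ("script", "style"):
            self._in_embedded = tag
        elif tag == "table":
            self._table_depth += 1
            self.max_table_depth = max(self.max_table_depth, self._table_depth)

    def handle_endtag(self, tag):
        if tag == self._in_embedded:
            self._in_embedded = None
        elif tag == "table" and self._table_depth > 0:
            self._table_depth -= 1

    def handle_comment(self, data):
        self.comment_chars += len(data)

    def handle_data(self, data):
        if self._in_embedded:
            self.embedded_chars += len(data)
        else:
            # Count only non-whitespace characters as visible text.
            self.visible_chars += len("".join(data.split()))


def audit(html):
    """Parse an HTML string and return the populated MarkupAudit object."""
    parser = MarkupAudit()
    parser.feed(html)
    parser.close()
    return parser
```

Calling `audit(page_source)` on a page's HTML string and comparing `visible_chars` with `embedded_chars` and `comment_chars` gives a crude sense of how much of the file is real content; a `max_table_depth` above 1 points to nested tables.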
In its study of web page code optimization, Shijiazhuang SEO concluded that HTML files are best kept below 100 KB, with the number of links on a page also kept under 100. Baidu's current recommendation is that HTML files not exceed 128 KB. In practice, search engines have no trouble crawling much larger files, even ones of one or two megabytes.
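These figures are easy to check automatically. The sketch below assumes that 1 KB means 1024 bytes, that the page is served as UTF-8, and that a "link" is an `<a>` tag with an `href` attribute; the names `check_page` and `LinkCounter` are hypothetical helpers, not part of any SEO tool.

```python
from html.parser import HTMLParser

SOFT_LIMIT_BYTES = 100 * 1024   # the 100 KB guideline (assuming 1 KB = 1024 bytes)
BAIDU_LIMIT_BYTES = 128 * 1024  # the 128 KB recommendation attributed to Baidu
MAX_LINKS = 100                 # the link ceiling mentioned in the text


class LinkCounter(HTMLParser):
    """Counts <a> tags that carry an href attribute."""

    def __init__(self):
        super().__init__()
        self.links = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a" and any(name == "href" for name, _ in attrs):
            self.links += 1


def check_page(html, encoding="utf-8"):
    """Report the page's byte size and link count against the guidelines."""
    size = len(html.encode(encoding))
    counter = LinkCounter()
    counter.feed(html)
    counter.close()
    return {
        "bytes": size,
        "links": counter.links,
        "under_100kb": size < SOFT_LIMIT_BYTES,
        "within_128kb": size <= BAIDU_LIMIT_BYTES,
        "links_under_100": counter.links < MAX_LINKS,
    }
```

The size here is measured on the decoded HTML string re-encoded as UTF-8, so it may differ slightly from the size of the file on disk if the site uses another encoding.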
Still, the smaller the file, the better. Although search engines can crawl large files, they may index only part of a file rather than all of it; indexing the whole file is unnecessary and wastes considerable resources. If a file is too large and padded with redundant code, the real content may be pushed beyond the portion that actually gets indexed, and then the loss outweighs the gain.