What functions should SEO-based log analysis software provide?
Today is yet another day in a row of going to bed after midnight.
Lately I have been staying up late every day, thinking about how to build an SEO-based log analysis tool.
When a user enters a website address in the browser's address bar, the web server returns the requested page and also records other data about the request, such as the user's browser, IP address, and operating system. It even records whether the user typed the URL directly or arrived by following a link from another page. These records are the most basic and most important data, and much web data mining work starts from them.
Anyone who has read web server logs (from Apache or IIS, for example) knows that a single page view produces not one log line but many. Look closely and you will find that the web server writes one record for every file (images, JavaScript files, and so on) that the requested page includes. Together, these records make up the raw log file.
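To make the fields described above concrete, here is a minimal sketch of parsing one log line. It assumes the Apache "combined" log format (IIS logs are laid out differently), and the sample line, the `COMBINED` pattern name, and the `parse_line` helper are made up for illustration.

```python
import re

# Assumption: the server writes the Apache "combined" log format.
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None

# Hypothetical sample line, not taken from a real log.
sample = (
    '203.0.113.7 - - [10/Oct/2008:13:55:36 +0800] '
    '"GET /blog/seo.html HTTP/1.1" 200 2326 '
    '"http://www.google.com/search?q=log+analysis" '
    '"Mozilla/5.0 (Windows; U; Windows NT 5.1) Firefox/3.0"'
)

record = parse_line(sample)
print(record["ip"], record["path"], record["referrer"])
```

The referrer field is what tells you whether the visitor followed a link from another page: it holds the linking URL, or "-" when the browser sent none (for example, when the address was typed directly).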
Log analysis is an essential basic skill for SEO. The information most relevant to SEO is the record of search engine visits and the traffic that search engines bring. Most website owners in China currently analyze their web logs with open-source tools such as AWStats. The popularity of AWStats owes a great deal to recommendations from log analysis enthusiasts such as chelong.
Although I also use AWStats and similar tools, log analysis tools dedicated to SEO are still rare. Moreover, because AWStats is written in Perl and uses its own data file format, it is not easy to add SEO-based log analysis functions by modifying AWStats.
So what functions should SEO-based log analysis provide? This is the question I have been asking myself over the past few days.
So far, the following three parts have been implemented:
1. Extract page access records from the raw log file (dropping records for .js, .css, .jpg, and similar files). For details, see: "raw log -> page log"
2. Deduplicate the extracted page access records with a Bloom filter to obtain the unique access records. For details, see: "page log -> sitemap"
3. From the extracted page access records, extract the search keywords of visits that came from Google and Baidu. For details, see: "page log -> search keyword analysis"
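The three steps above can be sketched roughly as follows. This is not the implementation the linked posts describe, only an illustration under stated assumptions: the extension list, the Bloom filter sizing, and all function names are hypothetical, and the keyword step assumes Google passes the query in `q` and Baidu in `wd` or `word`.

```python
import hashlib
from urllib.parse import urlsplit, parse_qs

# Assumption: these extensions mark static resources rather than pages.
STATIC_EXTENSIONS = (".js", ".css", ".jpg", ".jpeg", ".gif", ".png", ".ico")

def is_page_request(path):
    """Step 1: keep only requests that are not static resources."""
    return not urlsplit(path).path.lower().endswith(STATIC_EXTENSIONS)

class BloomFilter:
    """Step 2: a small Bloom filter (false positives possible, no false negatives)."""

    def __init__(self, size_bits=1 << 20, num_hashes=4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item):
        # Derive several bit positions by salting a single hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

def unique_pages(paths):
    """Yield each page path the first time it is seen."""
    seen = BloomFilter()
    for path in paths:
        if is_page_request(path) and path not in seen:
            seen.add(path)
            yield path

def search_keyword(referrer):
    """Step 3: return the search phrase from a Google or Baidu referrer, else None."""
    parts = urlsplit(referrer)
    host = parts.netloc.lower()
    query = parse_qs(parts.query)
    if "google." in host:
        key = "q"
    elif "baidu." in host:
        key = "wd" if "wd" in query else "word"
    else:
        return None
    values = query.get(key)
    return values[0] if values else None

pages = list(unique_pages(
    ["/a.html", "/x.js", "/a.html", "/b.html", "/img/logo.JPG"]))
print(pages)
print(search_keyword("http://www.google.com/search?q=log+analysis&hl=en"))
```

A Bloom filter keeps memory use fixed no matter how many URLs the log contains, at the cost of occasionally treating a new URL as already seen. For a small site, a plain `set` would be exact and simpler. Note also that `parse_qs` decodes queries as UTF-8 by default, so referrers that encode the keyword in another character set (GBK, for example) would need the `encoding` argument.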
Planned functions:
Common functions: remove meaningless requests (such as JPG and GIF files) and analyze the real page requests
Multi-day logs: spider visit charts
Single-day log: the number of spider visits and the time periods in which spiders visit (this is very important, because it lets you estimate how often the search engine updates)
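As a rough illustration of the single-day spider statistics, the sketch below counts spider visits per hour. It assumes timestamps in the Apache log format and that spiders can be recognized by the substrings "Googlebot" and "Baiduspider" in the user agent. The sample records and function names are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Assumption: these user-agent substrings identify the spiders of interest.
SPIDER_TOKENS = {"googlebot": "Google", "baiduspider": "Baidu"}

def spider_name(agent):
    """Return the search engine name for a spider user agent, else None."""
    lowered = agent.lower()
    for token, name in SPIDER_TOKENS.items():
        if token in lowered:
            return name
    return None

def visits_by_hour(records):
    """Count spider visits per (engine, hour) from (timestamp, agent) pairs."""
    counts = Counter()
    for timestamp, agent in records:
        name = spider_name(agent)
        if name is None:
            continue  # an ordinary visitor, not a spider
        when = datetime.strptime(timestamp, "%d/%b/%Y:%H:%M:%S %z")
        counts[(name, when.hour)] += 1
    return counts

# Hypothetical records, as they might come out of a parsed log.
records = [
    ("10/Oct/2008:13:55:36 +0800",
     "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"),
    ("10/Oct/2008:13:58:01 +0800",
     "Baiduspider+(+http://www.baidu.com/search/spider.htm)"),
    ("10/Oct/2008:14:02:10 +0800",
     "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"),
    ("10/Oct/2008:14:05:44 +0800",
     "Mozilla/5.0 (Windows; U; Windows NT 5.1) Firefox/3.0"),
]

hourly = visits_by_hour(records)
for (engine, hour), count in sorted(hourly.items()):
    print(engine, hour, count)
```

Grouping by `when.date()` instead of `when.hour` would give the per-day totals needed for a multi-day spider chart.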