Baidu crawled our content, but in the links it serves to users who search and click through, it rewrote the URL (the second directory after the domain name). This produced a large number of 404s. Negotiating with Baidu got nowhere, so we had no choice but to fix it ourselves.
1. Requirement:
Redirect the URL that Baidu hands out to the correct URL:
http://g.perofu.com.cn/x/222/1112345.html --301--> http://g.perofu.com.cn/x/111/1112345.html
Replace the directory after /x/ with the first three digits of the article ID, no matter what number Baidu put in that directory.
2. Site URL rule for articles:
http://g.perofu.com.cn/x/{article ID without its last 4 digits}/{article ID}.html
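The rule above can be sketched in Python as a small helper. This is only an illustration and not part of the nginx setup: the function name `canonical_url` is hypothetical, and it assumes article IDs are digit strings longer than four digits.

```python
BASE = "http://g.perofu.com.cn"  # site root taken from the example URLs above


def canonical_url(article_id: str) -> str:
    """Build the article URL; the directory is the ID without its last 4 digits."""
    # Assumption: IDs are all digits and longer than 4 characters.
    if not article_id.isdigit() or len(article_id) <= 4:
        raise ValueError(f"unexpected article ID: {article_id!r}")
    directory = article_id[:-4]  # e.g. "1112345" -> "111"
    return f"{BASE}/x/{directory}/{article_id}.html"


print(canonical_url("1112345"))
```

For the ID 1112345 the directory is 111, which matches the redirect target in the requirement.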
3. Incorrect configuration:
# This location only rewrites internally and proxies to the upstream. The URL seen by the client does not change; the data is still returned, but this hurts SEO.
location ~ '^/x/([\d]{3})/([\d]{3})([\d]{4})\.html$' {
    rewrite '^/x/([\d]{3})/([\d]{3})([\d]{4})\.html$' /wap/x/$2/$2$3.html break;
    proxy_set_header Host 'g.perofu.com.cn';
    proxy_next_upstream http_502 http_504 error timeout invalid_header;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_redirect off;
    proxy_connect_timeout 10;
    proxy_read_timeout 60;
    proxy_pass http://WAPATS;
}
4. Correct configuration:
(The files do not all live in the same application, so this is a bit fiddly and may look messy; focus mainly on the location blocks.)
Client --->
   |
This machine (public address: g.perofu.com.cn; the location fixes the URL, then proxies upstream to ATS)
   |
Other machines
(On the other machines g.perofu.com.cn is only a virtual host, so a location has to be configured there to pin down the file path; otherwise the request returns 404. /data/www/web/3g/ is the home directory of g.perofu.com.cn, and it has no location other than /. A location must be added with its root set to /data/www/web/3g/wap/. The earlier approach added /wap in the rewrite, but a 301 only rewrites the URL, so when the request reaches the other machines the file cannot be found because one directory level is missing.)
# http://g.perofu.com.cn/x/222/1112345.html ---> http://g.perofu.com.cn/x/111/1112345.html
# The upstream backend needs an extra location with root set to /data1/www/web/3g/wap/, otherwise it returns 404
location ~ '^/x/([\d]{3})/([\d]{3})([\d]{4})\.html$' {
    # Save the captures first: the rewrite below runs its own regex and overwrites $1..$3
    set $dir1 $1;
    set $dir2 $2;
    set $file3 $3;
    # Redirect only when the directory differs from the first three digits of the article ID
    if ($dir1 != $dir2) {
        rewrite ^/x/(.*) http://g.perofu.com.cn/x/$dir2/$dir2$file3.html permanent;
    }
    proxy_set_header Host 'g.perofu.com.cn';
    proxy_next_upstream http_502 http_504 error timeout invalid_header;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_redirect off;
    proxy_connect_timeout 10;
    proxy_read_timeout 60;
    proxy_pass http://WAPATS;
}
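To sanity-check the redirect logic before reloading nginx, the decision made by this location can be mimicked in Python. This is a rough sketch that reuses the same regular expression; the function name `decide` and its return values are made up for illustration, and it does not model the proxy headers or the upstream.

```python
import re

# Same pattern as the nginx location: /x/<3 digits>/<3 digits><4 digits>.html
ARTICLE_RE = re.compile(r"^/x/(\d{3})/(\d{3})(\d{4})\.html$")


def decide(path: str):
    """Return ("301", target), ("proxy", path) or ("no-match", path)."""
    m = ARTICLE_RE.match(path)
    if m is None:
        return ("no-match", path)  # would be handled by some other location
    dir1, dir2, file3 = m.groups()
    if dir1 != dir2:
        # Directory differs from the first three digits of the ID: redirect
        return ("301", f"http://g.perofu.com.cn/x/{dir2}/{dir2}{file3}.html")
    return ("proxy", path)  # already canonical, pass on to the upstream


print(decide("/x/222/1112345.html"))
print(decide("/x/111/1112345.html"))
```

A URL whose directory already matches the article ID falls through to the proxy branch, so the 301 cannot loop.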
# Other machines
server {
    server_name g.perofu.com.cn;
    set $adddir '/3g';
    root /data/www/web$adddir;
    access_log /data/nginx/logs/3g.access.log Tpynormal;
    # g.pconline.com.cn, 301: this location must be added; see the front-end location ~ '^/x/([\d]{3})/([\d]{3})([\d]{4})\.html$'
    location ~ '^/x/([\d]{3})/([\d]{3})([\d]{4})\.html$' {
        root /data/www/web/3g/wap/;
    }
}
Baidu unilaterally modified the site's URLs, resulting in a large number of 404s