To be honest, I did not set out to write a post with this title. My goal was to cache Yupoo API query results, and along the way I came across a reference method (Caching Google Earth with Squid). So yes, I am playing the clickbait-title game this once.
This reference has been widely circulated and was even mentioned on Digg; I do not know where the original source is.
But... if you follow its instructions, it does not work!
Anyway, let's talk about my needs first.
Recently Yupoo has been very slow, and many of my API requests could not complete. I guessed that either the other side was limiting the number of connections from a single IP, or Yupoo had hit another round of traffic bottlenecks. After I contacted Zola at Yupoo, it was confirmed that their load was too high and that connections were not being limited. So I decided to do some caching on my side.
Because I already use a Squid proxy to work around the cross-domain restriction on Ajax API calls, the natural place to do this was Squid's configuration file.
The request address of the Yupoo API is www.yupoo.com/api/rest/?method=xx&xxxxxxx ...
We all know Squid caches static files automatically, but how do you get it to cache a dynamic page like this one? I searched Google and found the blog article on caching Google Earth mentioned above.
His approach was:
acl QUERY urlpath_regex cgi-bin \? intranet
acl forcecache url_regex -i kh.google keyhole.com
no_cache allow forcecache
no_cache deny QUERY
# ----
refresh_pattern -i kh.google 1440 20% 10080 override-expire override-lastmod reload-into-ims ignore-reload
refresh_pattern -i keyhole.com 1440 20% 10080 override-expire override-lastmod reload-into-ims ignore-reload
The idea is to use no_cache allow together with refresh_pattern to set up caching rules that force Google Earth requests to be cached.
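For readers unfamiliar with the directive, a refresh_pattern line has the shape "refresh_pattern [-i] regex min percent max [options]", where min and max are given in minutes. The following is only a small Python sketch for reading such a line; the function name parse_refresh_pattern is made up for illustration and is not part of Squid.

```python
def parse_refresh_pattern(line):
    """Split a refresh_pattern line into its fields (illustrative helper, not Squid code)."""
    tokens = line.split()
    if not tokens or tokens[0].lower() != "refresh_pattern":
        raise ValueError("not a refresh_pattern line")
    tokens = tokens[1:]

    # The optional -i flag makes the regex case-insensitive.
    case_insensitive = bool(tokens) and tokens[0] == "-i"
    if case_insensitive:
        tokens = tokens[1:]

    if len(tokens) < 4:
        raise ValueError("expected: regex min percent max [options]")

    regex, min_minutes, percent, max_minutes = tokens[:4]
    return {
        "regex": regex,
        "case_insensitive": case_insensitive,
        "min_minutes": int(min_minutes),
        "percent": int(percent.rstrip("%")),
        "max_minutes": int(max_minutes),
        "options": tokens[4:],
    }


rule = parse_refresh_pattern(
    "refresh_pattern -i kh.google 1440 20% 10080 "
    "override-expire override-lastmod reload-into-ims ignore-reload"
)
# min and max are in minutes, so convert them to days for readability.
print(rule["min_minutes"] / 1440, "day(s) minimum,", rule["max_minutes"] / 1440, "day(s) maximum")
```

So 1440 and 10080 in the rules above mean one day and seven days respectively.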
Once this article came out, people naturally tried to verify it, but nobody succeeded, and the original author has gone completely silent ... it was mentioned on the Squid mailing list as well. (If you came here because of the title, don't worry, keep reading; you will not leave empty-handed.)
I did not think much of it and assumed those people simply lacked the skill. First I tried adapting the rules to cache the Yupoo API.
acl QUERY urlpath_regex cgi-bin \?
acl forcecache url_regex -i yupoo.com
no_cache allow forcecache
no_cache deny QUERY
refresh_pattern -i yupoo.com 1440 50% 10080 override-expire override-lastmod reload-into-ims ignore-reload
Damn, sure enough, it was useless: the access log was still a pile of TCP_MISS entries.
So I went through the documentation and searched for information, and found that a Squid bug was to blame, though it had already been fixed (strictly speaking, by a patch that extends functionality).
My Squid is 2.6.13; I went through the source code, and the patch is indeed in there.
Solving this problem requires a couple of refresh_pattern extension options (ignore-no-cache ignore-private) that are not mentioned in the Squid documentation or configuration examples. It seems Squid's documentation has not kept up with the times.
Now for the problem itself.
First, look at the HTTP headers returned by the Yupoo API (the cache-related part):
Cache-Control: no-cache, must-revalidate
Pragma: no-cache
These two lines control the browser's caching behavior and tell it not to cache the response. Squid follows the RFC as well, so under normal circumstances it will not cache these pages. override-expire override-lastmod reload-into-ims ignore-reload cannot deal with them.
That patch targets exactly these two headers, Cache-Control: no-cache and Pragma: no-cache.
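To make the header check concrete, here is a minimal Python sketch of the kind of test a cache applies before storing a response. It is a simplification under my own assumptions, not Squid's actual logic: the function name forbids_caching is made up, and it only looks at the no-cache, no-store and private directives plus Pragma: no-cache.

```python
def forbids_caching(headers):
    """Return True if the response headers ask caches not to store or reuse the response.

    Simplified illustration only; a real cache follows many more rules.
    """
    normalized = {name.lower(): value for name, value in headers.items()}

    # Cache-Control is a comma-separated list of directives, some with "=value".
    directives = {
        part.strip().lower().split("=", 1)[0]
        for part in normalized.get("cache-control", "").split(",")
        if part.strip()
    }
    pragma = {
        part.strip().lower()
        for part in normalized.get("pragma", "").split(",")
        if part.strip()
    }
    return bool(directives & {"no-cache", "no-store", "private"}) or "no-cache" in pragma


# The two cache-related headers returned by the Yupoo API.
yupoo_headers = {"Cache-Control": "no-cache, must-revalidate", "Pragma": "no-cache"}
# A hypothetical cache-friendly response, for contrast.
friendly_headers = {"Cache-Control": "max-age=3600"}

print(forbids_caching(yupoo_headers))     # True
print(forbids_caching(friendly_headers))  # False
```

ignore-no-cache and ignore-private tell Squid to skip this kind of check for URLs that match the pattern.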
So the refresh_pattern line should be rewritten as:
refresh_pattern -i yupoo.com 1440 50% 10080 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private
With that done, run squid -k reconfigure and look at access.log. This time it finally showed
TCP_HIT/200 and TCP_MEM_HIT/200, which means the caching rules really do work. I was so excited, 555~~~~
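If you would rather count than eyeball the log, here is a rough Python sketch that tallies result codes. It assumes Squid's native access.log format, where the fourth whitespace-separated field is code/status. The sample lines, timestamps and addresses are invented for illustration and are not from my real log.

```python
from collections import Counter


def count_result_codes(lines):
    """Tally result codes such as TCP_MISS or TCP_HIT from native-format log lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        code = fields[3].split("/", 1)[0]  # "TCP_HIT/200" -> "TCP_HIT"
        counts[code] += 1
    return counts


def hit_ratio(counts):
    """Fraction of requests whose result code contains HIT."""
    total = sum(counts.values())
    hits = sum(n for code, n in counts.items() if "HIT" in code)
    return hits / total if total else 0.0


# Made-up sample lines in the native log format.
sample_log = [
    "1207170001.101    120 192.0.2.10 TCP_MISS/200 1532 GET http://www.yupoo.com/api/rest/?method=xx - DIRECT/192.0.2.80 text/xml",
    "1207170002.202      3 192.0.2.10 TCP_HIT/200 1540 GET http://www.yupoo.com/api/rest/?method=xx - NONE/- text/xml",
    "1207170003.303      1 192.0.2.10 TCP_MEM_HIT/200 1540 GET http://www.yupoo.com/api/rest/?method=xx - NONE/- text/xml",
]

counts = count_result_codes(sample_log)
print(dict(counts))
print(round(hit_ratio(counts), 2))
```

To run it against a real log, pass an open file object instead of sample_log; the path to access.log depends on your installation.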
====================
Addendum:
Then I looked at the HTTP headers from the Google Earth server kh1.google.com. There were only
Expires: Wed, April 2008 20:56:20 GMT
Last-Modified: Fri, Dec 04:58:08 GMT
so it seems it would work even without ignore-no-cache ignore-private. Perhaps the original author simply wrote the pattern wrong:
kh.google should have been kh..google.
Finally, the correct configuration for caching Google Earth/Maps should be:
acl QUERY urlpath_regex cgi-bin \? intranet
acl forcecache url_regex -i kh..google mt..google mapgoogle.mapabc keyhole.com
no_cache allow forcecache
no_cache deny QUERY
# ----
refresh_pattern -i kh..google 1440 20% 10080 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private
refresh_pattern -i mt..google 1440 20% 10080 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private
refresh_pattern -i mapgoogle.mapabc 1440 20% 10080 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private
refresh_pattern -i keyhole.com 1440 20% 10080 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private
Notes:
khX.google.com is the image server for Google Earth.
mtX.google.com is the image server for Google Maps.
mapgoogle.mapabc.com is the image server for Google Ditu.
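To see why the exact pattern matters for host names of the form khX.google.com, here is a quick Python check. It uses Python's re module as a stand-in for Squid's regex matching, which I assume behaves the same for patterns this simple (an unanchored, case-insensitive search). The host mt0.google.com is just a made-up instance of mtX.google.com.

```python
import re

# In a regex, "." matches exactly one arbitrary character.
too_short = re.compile(r"kh.google", re.IGNORECASE)   # kh + 1 char + google
two_dots = re.compile(r"kh..google", re.IGNORECASE)   # kh + 2 chars + google
maps = re.compile(r"mt..google", re.IGNORECASE)

earth_url = "http://kh1.google.com/"
maps_url = "http://mt0.google.com/"  # hypothetical instance of mtX.google.com

# "kh1.google" has two characters ("1" and ".") between "kh" and "google",
# so the single-dot pattern cannot match it.
print(too_short.search(earth_url) is not None)  # False
print(two_dots.search(earth_url) is not None)   # True
print(maps.search(maps_url) is not None)        # True
```

If the pattern never matches the URL, none of the refresh_pattern options on that line are applied to the request.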
Http://nukq.malmam.com/archives/16