Understanding the HTTP Vary Header and Caching Proxies (Squid, etc.)

By Mark Klich on September 25, 2010 12:10 AM
I never paid much attention to the HTTP Vary header. In fact, I've been fortunate enough to avoid it for this long and never really had to care much about it. Well, it turns out that when you're configuring a high-performance reverse proxy, understanding the Vary header and what it means for your reverse proxy caching policies is absolutely crucial.
Here's an interesting problem I recently solved that dealt with Squid, Apache, and that elusive Vary response header...
1-The Vary Basics
Popular caching proxies, like Squid, usually generate a hash of the request from a number of inputs, including the URI and the request headers named by the Vary response header. When a caching proxy receives a request for a resource, it gathers these inputs, generates a hash, then checks its cache to see if it already has a resource sitting on disk, or in memory, that matches the computed hash. This is how Squid, and other caching proxies, fundamentally know if they have a cache HIT or MISS (that is, can Squid return the content it has cached, or does it need to revalidate the request against the destination server?).
With that in mind, you can probably see how the Vary header is quite important when a caching proxy is looking for a cache HIT or MISS. The Vary header is a way for the web-server to tell any intermediaries (caching proxies) which request headers they should use, in addition to the URI, to figure out whether a cached copy of the requested resource can be reused. Sample Vary headers include:
Vary: Accept-Encoding
Vary: Accept-Encoding,User-Agent
Vary: X-Some-Custom-Header,Host
Vary: *
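To make the hashing idea concrete, here is a minimal Python sketch of a Vary-aware cache key. It is an illustration only, not how Squid actually stores or looks up objects: the function names parse_vary and cache_key, the choice of SHA-256, and treating a missing request header as an empty string are all assumptions made for the example.

```python
import hashlib


def parse_vary(vary_value):
    """Split a Vary header value into lower-cased request-header names."""
    return [name.strip().lower() for name in vary_value.split(",") if name.strip()]


def cache_key(uri, request_headers, vary_value):
    """Hash the URI plus the request headers named by Vary (hypothetical scheme).

    Returns None for 'Vary: *', since such a response cannot be matched
    to a later request without revalidation.
    """
    fields = parse_vary(vary_value)
    if "*" in fields:
        return None
    # Header names are case-insensitive, so normalise them before lookup.
    headers = {name.lower(): value for name, value in request_headers.items()}
    parts = [uri]
    for name in sorted(fields):
        # Assumption: a header the client did not send counts as an empty value.
        parts.append(name + "=" + headers.get(name, ""))
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


firefox = {"Host": "app.kolich.local", "User-Agent": "Firefox"}
chrome = {"Host": "app.kolich.local", "User-Agent": "Chrome"}

for vary in ("Accept-Encoding", "Accept-Encoding,User-Agent"):
    same = cache_key("/path/big.json", firefox, vary) == cache_key("/path/big.json", chrome, vary)
    print(vary, "-> same key for both browsers:", same)
```

Under these assumptions, two clients that differ only in User-Agent share a key when the response varies on Accept-Encoding alone, and get different keys once User-Agent is listed in Vary.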
According to the HTTP spec, "the Vary field value indicates the set of request-header fields that fully determines, while the response is fresh, whether a cache is permitted to use the response to reply to a subsequent request without revalidation." Yep, that's pretty important (I discovered this the hard way).
2-The Caching Problem
I configured Squid to act as a round-robin load balancer and caching proxy, sitting in front of about four Apache web-servers. Each Apache web-server was running a copy of my web-application, which I intended to have Squid cache where possible. Certain requests were for large JSON objects, and I explicitly configured Squid to cache requests ending in .json for 24 hours.
I opened a web-browser and visited a URL I expected to be cached:
GET /path/big.json HTTP/1.1
Host: app.kolich.local
User-Agent: Firefox

HTTP/1.0 200 OK
Date: Fri, 24 Sep 2010 23:09:32 GMT
Content-Type: application/json;charset=UTF-8
Content-Language: en-US
Vary: Accept-Encoding,User-Agent
Age: 1235
X-Cache: HIT from cache.kolich.local
X-Cache-Lookup: HIT from cache.kolich.local:80
Content-Length: 25090
Connection: close
OK, looks good. I opened a 2nd web-browser on a different machine (hint: with a different User-Agent) and tried again. This time, notice the X-Cache: MISS...
GET /path/big.json HTTP/1.1
Host: app.kolich.local
User-Agent: Chrome

HTTP/1.0 200 OK
Date: Fri, 24 Sep 2010 23:11:45 GMT
Content-Type: application/json;charset=UTF-8
Content-Language: en-US
Vary: Accept-Encoding,User-Agent
Age: 4
X-Cache: MISS from cache.kolich.local
X-Cache-Lookup: MISS from cache.kolich.local:80
Content-Length: 25090
Connection: close
Wow, look at that. I requested exactly the same resource, just from a different browser, and I saw a cache MISS. This is obviously not what I want; I need the same cached resource to be served up from the cache regardless of who's making the request. If left alone, this is only caching a response per User-Agent, not globally per resource.
3-The Solution: Check Your Vary Headers
Remember how I said the contents of the Vary header are important to caching proxies? Although each request was for exactly the same resource, Squid determined that they were very different as far as its cache was concerned. How did this happen? Well, take a peek at the Vary response header:
Vary: Accept-Encoding,User-Agent
This tells Squid that the request URI, the Accept-Encoding request header, and the User-Agent request header should be included in a hash when determining if an object is available in its cache, or not. Obviously, any reasonable hash of (URI, Accept-Encoding, "Firefox") should not match the hash of (URI, Accept-Encoding, "Chrome"). Hence why Squid seemed to think the requests were for different objects!
To fix this, I located the source of the annoying "User-Agent" addition to my Vary response header, which happened to come from Apache's very own mod_deflate module. The recommended mod_deflate configuration involves appending "User-Agent" to the Vary response header on any response that is not flagged with the dont-vary environment variable. I don't really see why this is necessary, but the Apache folks seemed to think this was important. Here are the relevant lines from the Apache suggested mod_deflate configuration:
SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png|ico)$ no-gzip dont-vary
Header append Vary User-Agent env=!dont-vary
In any event, I removed the 2nd line above, restarted Apache, and Squid began caching beautifully regardless of which client issued the request. Essentially, I told Squid to stop caring about the User-Agent by removing "User-Agent" from my Vary response header, and problem solved! The joys of HTTP.
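If you want to check your own Vary headers before they surprise you, a small Python helper along these lines could flag suspicious values. It is only a sketch: the function name vary_warnings and the HIGH_VARIANCE_HEADERS set are hypothetical, and which headers count as "too variable" is a judgment call rather than anything defined by the HTTP spec.

```python
# Assumption: request headers that tend to take many distinct values across
# clients. This list is illustrative, not taken from any specification.
HIGH_VARIANCE_HEADERS = {"user-agent", "cookie", "referer"}


def vary_warnings(vary_value):
    """Return human-readable warnings for a Vary response header value."""
    warnings = []
    for raw in vary_value.split(","):
        name = raw.strip().lower()
        if not name:
            continue
        if name == "*":
            warnings.append("Vary: * prevents reuse of the response without revalidation")
        elif name in HIGH_VARIANCE_HEADERS:
            warnings.append("Vary on %s may split the cache per client" % name)
    return warnings


for value in ("Accept-Encoding,User-Agent", "Accept-Encoding"):
    print(value, "->", vary_warnings(value))
```

The Vary value can come from anywhere, for example copied out of a response like the ones shown above. An empty list means nothing in the header matched the illustrative checks, not that the caching policy is necessarily correct.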