RE: [squid-users] Caching downloaded files from dynamic pages

From: Amos Jeffries <squid3@dont-contact.us>
Date: Tue, 25 Sep 2007 17:17:45 +1200 (NZST)

> Thank you for an answer and for the tip, however, that didn't get me far.
>
> I enabled DynamicContent with the instructions, so my squid.conf looks
> like this:
>
> ---
> #caching Fujitsu-Siemens
> acl fujitsusiemens dstdomain .fujitsu-siemens.com
> cache allow fujitsusiemens
> #stop everything else
> hierarchy_stoplist cgi-bin ?
> acl QUERY urlpath_regex cgi-bin \?
> cache deny QUERY
>
> ---
>
> And even though my store.log shows that it actually saved something
> (filesize matches, 1263443 bytes), I cannot find the file, and all
> downloads from Fujitsu-Siemens are coming from their site, not from
> squid (= slow and eating bandwidth).
>

If any part of the query string changes between requests, that can
currently still stop squid from identifying two URIs as having identical
content. Both get fetched and stored separately. You will need a custom
helper to get around that.
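
A minimal sketch of the normalisation step such a helper would perform,
assuming the only difference between two download URLs is a handful of
per-request query parameters. The parameter names in VOLATILE_PARAMS and
the function names are hypothetical, and the request/reply line format
squid expects from a helper depends on the squid version and helper
interface, so only the URL handling is shown here:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical names of parameters that change on every request.
# Inspect the real URLs before relying on any of these.
VOLATILE_PARAMS = {"sessionid", "uid", "token"}

def normalise(url):
    """Return url with volatile query parameters dropped and the rest sorted."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k.lower() not in VOLATILE_PARAMS]
    kept.sort()  # parameter order should not produce different keys
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path,
                       urlencode(kept), ""))

def replies(lines):
    """Yield one normalised URL per input line.

    Assumes the URL is the first whitespace-separated field on each line.
    """
    for line in lines:
        fields = line.split()
        if fields:
            yield normalise(fields[0])
```

A helper process would feed sys.stdin to replies() and write each result
back unbuffered, wrapped in whatever reply syntax the squid version in use
requires.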

Also, try running the cacheability engine against their site to see
whether they are making their content uncacheable somehow. There are also
some refresh_pattern settings that force certain types of content to be
kept longer than is usually good, or than the owners want.
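
As a rough illustration of the kind of check a cacheability tool makes,
here is a sketch that looks at a response's headers and lists common
reasons a shared cache might refuse to store or reuse it. It approximates
a few general HTTP caching rules, it is not what squid itself implements,
and the function name is made up for this example:

```python
def uncacheable_reasons(headers):
    """List likely reasons a shared cache would not store or reuse a response."""
    h = {k.lower(): v for k, v in headers.items()}
    # Split Cache-Control into bare directive names, e.g. "max-age=0" -> "max-age".
    directives = {d.strip().lower().split("=")[0]
                  for d in h.get("cache-control", "").split(",") if d.strip()}
    reasons = []
    for d in ("no-store", "private", "no-cache"):
        if d in directives:
            reasons.append("Cache-Control: " + d)
    if "no-cache" in h.get("pragma", "").lower():
        reasons.append("Pragma: no-cache")
    if h.get("vary", "").strip() == "*":
        reasons.append("Vary: *")
    has_freshness = ("expires" in h or "max-age" in directives
                     or "s-maxage" in directives)
    if not has_freshness and "last-modified" not in h:
        reasons.append("no Expires, max-age or Last-Modified")
    return reasons
```

Pass it the headers from one of the file download responses as a dict. An
empty list only means none of these particular checks fired.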

access.log, not store.log, is where you want to look for HIT/MISS details
to see if you are saving anything useful. store.log can only tell you
that something _can_ be cached, not whether it's generating any HITs.
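
To get an overview rather than reading the log by eye, something like the
following sketch can tally results per URL. It assumes the native
access.log layout seen in the quoted lines further down (result code in
the fourth field, URL in the seventh) and treats any result code
containing "HIT" as a hit. The function name is just for illustration:

```python
from collections import Counter

def tally(lines):
    """Count hits and non-hits per URL from native-format access.log lines."""
    hits, other = Counter(), Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or truncated lines
        result = fields[3].split("/")[0]  # e.g. TCP_MISS from TCP_MISS/200
        url = fields[6]
        if "HIT" in result:
            hits[url] += 1
        else:
            other[url] += 1
    return hits, other
```

Call it with an open file object for access.log. Note that the logged URLs
in this thread end at the "?", so requests with different query strings
are counted under the same URL.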

Amos

> Snap from store.log:
> ---
> 1190668401.656 RELEASE -1 FFFFFFFF BBB9597D11CBE2D16B47A526F9F7589E 200
> 1190668079 -1 -1 application/octet-stream 1263443/1263443
> GET
> http://support.fujitsu-siemens.com/download/FileDownload/fileDownload.aspx?
> ---
>
> More good hints or ideas what I should try?
>
> Alexander
>
> -----Original Message-----
> From: Amos Jeffries [mailto:squid3@treenet.co.nz]
> Sent: 24. syyskuuta 2007 0:57
> To: Alexander K
> Cc: squid-users@squid-cache.org
> Subject: Re: [squid-users] Caching downloaded files from dynamic pages
>
>> I work at a PC repair service company. We of course often download
>> drivers, utilities, updates etc. from manufacturer web pages
>> (Acer, Lenovo, Fujitsu-Siemens, etc).
>>
>> Since we are on a limited connection and want to speed up downloads,
>> we want to use squid to cache those files, so that each file only
>> needs to be downloaded once. We are building a server for this, so
>> CPU/RAM/disk won't be a problem.
>>
>> I have had a quick test with Squid and have found it working well.
>>
>> However, I am having a problem with manufacturers that do not use
>> hard-coded http/ftp URLs, but dynamic pages which "stream" the file
>> to the browser.
>>
>> One problematic case is Fujitsu-Siemens.
>>
>> When I am downloading drivers from Lenovo, they work fine, since they
>> are using "normal" urls, for example:
>> http://download.boulder.ibm.com/ibmdl/pub/pc/pccbbs/mobiles/1rg807ww.exe
>>
>> But I am unable to cache downloaded drivers and software from
>> Fujitsu-Siemens pages.
>>
>> For example, when downloading this file:
>>
>> http://support.fujitsu-siemens.com/Download/ShowDescription.asp?SoftwareGUID=50AD6EEC-53F0-4B6E-9C13-53E2CB51D36B&OSID=DD13C337-8EFF-4CFB-A589-72971D7BCBCE&Status=True&Component=Flash%20Bios%20for%20AMILO%20Pro%20V8210
>>
>> It will show up in squid access.log like this:
>>
>> --
>> 1190578174.875 1422 192.168.11.5 TCP_MISS/200 22284 GET
>> http://support.fujitsu-siemens.com/Download/ShowDescription.asp? -
>> DIRECT/80.70.172.14 text/html
>> 1190578176.921 765 192.168.11.5 TCP_MISS/302 646 POST
>> http://support.fujitsu-siemens.com/Download/Download.asp -
>> DIRECT/80.70.172.14 text/html
>> 1190578177.093 172 192.168.11.5 TCP_MISS/302 1090 GET
>> http://support.fujitsu-siemens.com/Download/StreamFileToBrowser.asp? -
>> DIRECT/80.70.172.14 text/html
>> 1190578184.078 6985 192.168.11.5 TCP_MISS/200 1263991 GET
>> http://support.fujitsu-siemens.com/download/FileDownload/fileDownload.aspx?
>> - DIRECT/80.70.172.14 application/octet-stream
>> --
>>
>> I cannot figure out how to tell Squid to cache that content. :( Is
>> that possible at all?
>>
>> Happy for assistance,
>> Alexander
>>
>
> This should help a little, although if they add a UID to the query
> string, content may still be fetched directly.
>
> http://wiki.squid-cache.org/ConfigExamples/DynamicContent
>
> Amos
>
>
>
>
Received on Mon Sep 24 2007 - 23:18:03 MDT
