Re: Some ideas (was: bad ftp URLs make cache messy)

From: Jan Klabacka <jkl@dont-contact.us>
Date: Wed, 12 Mar 1997 10:28:20 +0100

Someone sometimes wrote:
>
> I'm using squid 1.1.8, and when errors occur on ftp URL transfers, the cache
> gets messy, keeping the mis-transferred URL in its cache.
>
> I went through the list digests and found out about the client program that
> allows forcing a reload in the cache... I tried... it worked... but it's a
> one-shot process, as you have to do that for each bad URL.
>
> So this can't be a solution when you are facing several dozen users
> complaining about why they can't retrieve URLs (that someone else messed up in
> the cache).
>
> You may track down specific errors in log files and then write some kind of
> robot to reload the messy ftp URLs... even though it would be cleaner to just
> remove them from the cache!
>
> Is there a better solution to this problem?
> Or should it be on the ToDo list of the Squid group?
>

I have some comments (also related to similar behaviour of squid in
other respects) and also some ideas.

Why can't squid check (and remember) the file size on the ftp server
and, after such an interruption (or any erroneous interruption, or
maybe even after every finished ftp transfer), just remove the
incomplete file from the cache (it is just rubbish anyway)? I.e., why
use a robot when it is squid that writes the log, and thus knows about
the incomplete file and can remove it with just unlink() and some
small change in its files (or simply by not committing the details of
the fetched file to the log)?
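
To make the idea concrete, here is a minimal sketch in Python. It is
not squid's code, only an illustration: finalize_transfer and
expected_size are names I made up, and I assume the size announced by
the ftp server is available when the transfer ends.

```python
import os


def finalize_transfer(path, expected_size):
    """Keep a fetched file only if its size matches the announced size.

    path          -- hypothetical location of the cached object on disk
    expected_size -- size reported by the server, or None if unknown

    Returns True if the file was kept, False if it was removed.
    """
    actual_size = os.path.getsize(path)
    if expected_size is not None and actual_size != expected_size:
        # Incomplete object: remove it instead of serving it later.
        os.unlink(path)
        return False
    # Sizes agree, or the server gave no size to compare against.
    return True
```

When the server reports no size there is nothing to compare, so the
sketch keeps the file; a real cache would need some other signal in
that case.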

Anyway, I am not using the ftp cache at all (although it speeds
things up), because of the strange directory listings (not the worst
problem), but also because it adds an extension to files that have
none (for instance READMEs, so that with Netscape they can only be
saved to disk). This is really strange.

Another idea (I am probably not using the proper terms, but I hope it
is understandable): if an error happens while fetching an object
(mostly a nonexistent one) and squid reports an error like "requested
URL cannot be retrieved" (I do not remember it exactly), the cache
remembers this for a reasonably long time (configurable, I guess).
But very often it is some intermittent error on the Net, and an
immediate retry is usually successful. Question: why does squid not
re-check such URLs the next several times they are requested (this
could also be configurable)? I do not know whether it would help in
general to be able to make this check several times and only then
remember the negative response (maybe, in that case, for a longer
time). It applies not only to nonexistent objects on servers, but
also to servers which are down (and maybe the behaviour should even
be different for each of the two cases). Also, what will happen if a
parent squid remembers negative responses under different rules than
its child? Maybe I am totally missing some configuration options?
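
What I have in mind could be sketched like this in Python. It is only
an illustration of the idea, not how squid works: the class
NegativeCache and the parameters max_failures and ttl are made up, and
both would be the "configurable" values mentioned above.

```python
import time


class NegativeCache:
    """Remember a failure only after several consecutive failed fetches."""

    def __init__(self, max_failures=3, ttl=300.0, clock=time.monotonic):
        self.max_failures = max_failures  # failures before we stop retrying
        self.ttl = ttl                    # seconds a negative entry is kept
        self.clock = clock                # injectable so it can be tested
        self._failures = {}               # url -> consecutive failure count
        self._blocked = {}                # url -> expiry time of the entry

    def should_fetch(self, url):
        """Return True if the URL should be tried against the server."""
        expires = self._blocked.get(url)
        if expires is None:
            return True
        if self.clock() >= expires:
            # Negative entry expired: forget it and start counting again.
            del self._blocked[url]
            self._failures.pop(url, None)
            return True
        return False

    def record_failure(self, url):
        """Count a failed fetch; cache the negative result at the limit."""
        count = self._failures.get(url, 0) + 1
        self._failures[url] = count
        if count >= self.max_failures:
            self._blocked[url] = self.clock() + self.ttl

    def record_success(self, url):
        """A successful fetch clears any failure history for the URL."""
        self._failures.pop(url, None)
        self._blocked.pop(url, None)
```

With max_failures=3 the first two errors are retried on the next
request, and only the third one is remembered for ttl seconds.
Different limits for "object not found" and "server down" would simply
be two instances with different settings.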

Very often, while fetching an http URL, a message like "Transfer
interrupted" is shown, but the transfer continues without problems.
Subsequent retrievals show the same error message, but the page is
still displayed without problems. It seems to me that it is a cached
response from the server (I mean the "Transfer interrupted" message),
but why then is the whole page shown OK? I know I should give the URL
where it happens, but I do not remember it (I did not think I would
ever need to make a note).

Yet another idea: the anonymizing option in squid 1.1.8 always
removes the type of browser. I know that it is some security risk to
reveal the type of client, but:
1. it sometimes breaks the functionality of the client (there are
servers which behave differently with different clients)
2. it is not the client but squid, on behalf of the client, that
contacts the server, i.e. both the IP address and the hostname are
hidden from the world
3. anonymizing, as far as I can see, has a different purpose than
hiding client types (and thus breaking functionality). It should
first of all hide the site where the request came from.
I think it should at least be configurable which headers are removed
during anonymizing (at least at compilation time, something similar
to the mime-types).
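
A configurable list could look roughly like this Python sketch. The
function name anonymize and the default list of headers are my own
assumptions for illustration, not what squid actually strips.

```python
def anonymize(headers, strip=("From", "Referer")):
    """Return a copy of the request headers without the listed ones.

    headers -- dict mapping header name to value
    strip   -- header names to remove (hypothetical default list);
               names are compared case-insensitively
    """
    blocked = {name.lower() for name in strip}
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in blocked
    }
```

Because User-Agent is not in the list, servers that behave
differently for different clients still see the browser type, while
the headers that point back to the requesting site are dropped.
Adding "User-Agent" to strip gives back the current behaviour.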

Regards

Jan Klabacka
Received on Wed Mar 12 1997 - 01:55:26 MST
