On Thu, 26 Oct 2000, Henrik Nordstrom wrote:
> Junfeng Yang wrote:
> >
> > Hi, dear Squid guy, I have a question about Squid. how do you deal with
> > the MD5 collision? for example, two urls may have the same message digest.
>
> Then they both "fight" for the cache. The last requested is the one that
> will be in the cache. If it happens to be another object in the MD5
> digest slot when an object is stored then the other object is evicted
> from the cache.
... and, as log analysis shows, MD5 collisions for URLs are
practically impossible:
Date: Tue, 27 Jul 1999 17:09:44 -0600 (MDT)
From: Alex Rousskov <rousskov@ircache.net>
To: adrian@creative.net.au
Cc: squid-dev@ircache.net
Subject: Re: Collisions in URLs using MD5
On Wed, 28 Jul 1999 adrian@creative.net.au wrote:
> Does anyone have figures for collisions of URL names when md5'ed ?
> I'm curious to know what it is like in the real world ..
I did some experiments in June 1997 using URLs from our SV cache. I
varied the length/size of an MD5 digest (in bytes) and varied the number
of days in the access log.
trace length, number of number of MD5 collisions for
days unique URLs a given URL digest length
4 5 6 16
------------- ----------- ------ --- --- ---
1 375066 13 0 0 0
5 1494774 257 1 0 0
10 2619168 817 2 0 0
Thus, for six byte and longer URL digests, there were no collisions in
the given set. A four byte URL digest gives negligible number of
collisions (817 or 0.04% for a 10 day trace). The standard MD5 digest
length is 16 bytes.
Alex.
-- To unsubscribe, see http://www.squid-cache.org/mailing-lists.htmlReceived on Fri Oct 27 2000 - 07:39:32 MDT
This archive was generated by hypermail pre-2.1.9 : Tue Dec 09 2003 - 16:56:00 MST