Re: Squid memory footprint (was: MemPools rewrite) from Andres Kroonmaa on 2000-11-03 (squid-dev)

From: Andres Kroonmaa <andre@dont-contact.us>
Date: Fri, 3 Nov 2000 11:47:26 +0200

On 2 Nov 2000, at 23:58, Henrik Nordstrom <hno@hem.passagen.se> wrote:

> Andres Kroonmaa wrote:
> >
> > I wonder how can you eliminate StoreEntry? IMHO it contains crucial
> > information that allows squid to skip disk accesses. Moving parts
> > of this data into squidfs doesn't seem to change much in ram usage.
> > Moving this crucial information onto disks implies enormous performance
> > penalty, doesn't it?
>
> Enormous is perhaps a bit too strong a word here. It very much depends on
> the hit ratio on the fs meta data to be able to look up that a file does
> not exist.

I expect the FS metadata hit rate to be low: it is increasingly easy to build
boxes that have a disk/RAM ratio of 200 or more (200G disk, 1G RAM).
There's been lots of talk about reiserfs, which by all means sounds good,
but I've not seen any mention of it being supported on any other OS but
Linux or BSD. Even if we can implement it within squid, I see several
cases that make me worry about dependence on disk access for lookups.

ICP should never touch disks to return hit/miss. Two boxes with equal
load and an average hit rate of 30% would see about 3 times more ICP traffic
than actual fetches from peers. We'd rather not let that traffic touch disks.
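
A rough back-of-the-envelope check of that ratio. This is only a sketch under two simplifying assumptions that are not measurements: every local miss triggers one ICP query to the peer, and the peer answers with a hit at roughly its own 30% hit rate. The function name is made up for illustration.

```python
def icp_queries_per_peer_fetch(peer_hit_rate):
    """ICP queries sent per object actually fetched from the peer.

    Assumes one query per local miss and that a fraction `peer_hit_rate`
    of those queries come back as hits that are then fetched.
    """
    if not 0 < peer_hit_rate <= 1:
        raise ValueError("peer_hit_rate must be in (0, 1]")
    return 1.0 / peer_hit_rate


ratio = icp_queries_per_peer_fetch(0.30)
print(f"{ratio:.1f} ICP queries per peer fetch")
```

Under these assumptions the ratio comes out a little above 3, and every one of those queries would be a lookup that must not cost a disk access.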

OK, use digests for ICP. But what about digest generation? We currently do it
quite rarely, because it has considerable CPU overhead. ICP is wanted when
instant knowledge is needed between peers that may be part of a load-sharing
setup. The delay between digest generations is not acceptable there. Can we add
objects to the digest as they arrive? Where do we keep it between reloads?
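
To make the "add objects as they arrive" question concrete, here is a minimal sketch that assumes the digest is a Bloom filter over MD5 keys. The class name, the filter size, the key format and the way bit positions are derived are illustrative assumptions, not Squid's actual code.

```python
import hashlib
import struct


class Digest:
    """Minimal Bloom filter keyed by MD5; sizes here are illustrative."""

    def __init__(self, nbits):
        self.nbits = nbits
        self.bits = bytearray((nbits + 7) // 8)

    def _positions(self, key):
        # Split the 16-byte MD5 into four 32-bit words, one bit position each.
        md5 = hashlib.md5(key.encode("utf-8")).digest()
        return [w % self.nbits for w in struct.unpack(">4I", md5)]

    def add(self, key):
        # Adding is cheap (four bit sets), so it can be done per arriving object.
        for p in self._positions(key):
            self.bits[p >> 3] |= 1 << (p & 7)

    def might_contain(self, key):
        # False means definitely absent; True may be a false positive.
        return all(self.bits[p >> 3] & (1 << (p & 7))
                   for p in self._positions(key))


digest = Digest(nbits=1 << 16)
digest.add("GET http://example.com/a")
print(digest.might_contain("GET http://example.com/a"))
```

In this sketch incremental adds are trivial, but a plain Bloom filter cannot remove keys, so evicted objects would linger as false hits until the next full rebuild. The `bits` array is a flat byte buffer, so it could in principle be written out and reloaded across restarts.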

What do we generate a digest from at startup? At first it sounds like we
need to access/stat every object on disk to get the MD5 key for the digest.
ICP shouldn't return hits for stale objects, so object timestamps are
needed during an ICP request. Refcounts and timestamps are also needed for
the replacement policies to work.
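
As an illustration of why this data wants to stay in RAM, here is a minimal sketch of an in-core index that can answer an ICP query without any disk access. The field names and the freshness rule are hypothetical and much simpler than a real StoreEntry or Squid's refresh logic.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class IndexEntry:
    timestamp: float   # when the object was stored or last validated
    expires: float     # absolute time after which the object is stale
    refcount: int = 0  # reference count used by the replacement policy


index = {}  # MD5 key (bytes) -> IndexEntry, held entirely in RAM


def key_for(url):
    """MD5 key for a URL (hypothetical key format)."""
    return hashlib.md5(url.encode("utf-8")).digest()


def icp_lookup(url, now):
    """Answer HIT only for objects that are present and still fresh."""
    entry = index.get(key_for(url))
    if entry is None or now >= entry.expires:
        return "MISS"
    return "HIT"


index[key_for("http://example.com/a")] = IndexEntry(timestamp=1000.0,
                                                    expires=2000.0)
print(icp_lookup("http://example.com/a", now=1500.0))
print(icp_lookup("http://example.com/a", now=2500.0))
```

The first lookup finds a fresh entry and the second finds a stale one. If the timestamps lived only on disk, the second answer could not be given correctly without a disk read.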

A long time ago Squid moved from disk-based (a la Apache) lookups to RAM-based
lookups. Now it seems we are moving towards disk-based lookups again, although
with much more efficient disk access, in order to reduce RAM usage.
Still, disk performance hasn't changed much, and if we make squid
performance depend too much on disks, then I'm really worried.

> But yes, it is a penalty from not keeping an in-core index. But
> fortunately it is a penalty that can be addressed.

I'm worried, probably because I don't see how you would address it.
But I'm curious to know.

------------------------------------
Andres Kroonmaa <andre@online.ee>
Delfi Online
Tel: 6501 731, Fax: 6501 708
Pärnu mnt. 158, Tallinn,
11317 Estonia
Received on Fri Nov 03 2000 - 02:50:07 MST