Re: hierarchy_stoplist & cache digests

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Thu, 13 May 1999 00:52:04 +0200

Alex Rousskov wrote:

> Sounds reasonable to me. Only _positively_ cached responses should be
> digested. Unfortunately, StoreEntry does not have a "HTTP status" flag
> so we cannot digest 200 responses only without adding more flags to
> StoreEntry (and your patch does add a flag). Ideally, what objects to
> digest should be controlled via squid.conf.

In my opinion _positively_ includes that we can accept an IMS query on
that object, which currently limits us to status 200 objects. So perhaps
a simple flag is sufficient (let's call that flag "standard cachable
object" for the rest of this discussion).

We can then build onto this with an at-store-time ACL-style check
controlling whether the object classifies as "standard cachable object"
or not, to allow for user-configurable flexibility.

Then, if we want configuration changes to take effect before objects are
fully recycled, we can update the flag on swapins, refreshes or any other
occasion when the object's full metadata is known.
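
A rough sketch of what such a store-time check could look like. All
names here (obj_info, cachable_rule, is_standard_cachable) are made up
for illustration and are not existing Squid code, and the rule table is
only a stand-in for a real ACL list:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical object metadata known at store time (or on swapin). */
struct obj_info {
    int http_status;            /* e.g. 200, 404 */
    const char *content_type;   /* NULL if unknown */
    long content_length;        /* -1 if unknown */
};

/* Hypothetical admin-configurable rule, standing in for an ACL entry. */
struct cachable_rule {
    const char *type_prefix;    /* NULL matches any content type */
    long max_length;            /* -1 means no length limit */
};

/* Returns 1 if the object classifies as "standard cachable object". */
static int
is_standard_cachable(const struct obj_info *o,
                     const struct cachable_rule *rules, size_t nrules)
{
    size_t i;
    assert(o != NULL);
    /* Only status 200 objects can answer an IMS query. */
    if (o->http_status != 200)
        return 0;
    for (i = 0; i < nrules; i++) {
        const struct cachable_rule *r = &rules[i];
        if (r->type_prefix != NULL) {
            if (o->content_type == NULL)
                continue;
            if (strncmp(o->content_type, r->type_prefix,
                        strlen(r->type_prefix)) != 0)
                continue;
        }
        /* An unknown length (-1) never exceeds the limit. */
        if (r->max_length >= 0 && o->content_length > r->max_length)
            continue;
        return 1;               /* first matching rule allows the object */
    }
    return 0;                   /* no rule matched */
}
```

The same function would be called again on swapins and refreshes, so
the flag follows the current configuration.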

This also reminds me of the old issue of refresh patterns, in-memory
timestamps and cache digests. In my opinion only one object timestamp
needs to be kept in memory: when the object needs to be refreshed next.
All else can be read from the swap file when needed, and the in-memory
timestamp can be updated on swapins to allow for changes in squid.conf.
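
To illustrate, the single timestamp could be derived roughly like this.
This is a simplified sketch, not the actual refresh code: refresh_rule
and compute_fresh_until are hypothetical names, and the rule is assumed
to carry the usual min / percent / max values of a refresh pattern:

```c
#include <assert.h>
#include <time.h>

/* Hypothetical refresh pattern: min and max in seconds, pct in percent. */
struct refresh_rule {
    time_t min;
    int pct;
    time_t max;
};

/*
 * Collapse Date, Last-Modified and Expires into one "fresh until" value.
 * lastmod and expires are -1 when the header was absent; date is assumed
 * to be always known (the caller can substitute the time of receipt).
 */
static time_t
compute_fresh_until(time_t date, time_t lastmod, time_t expires,
                    const struct refresh_rule *r)
{
    time_t ttl;
    assert(r != NULL && r->min <= r->max);
    /* An explicit Expires header wins over the heuristics. */
    if (expires >= 0)
        return expires;
    ttl = r->min;
    if (lastmod >= 0 && lastmod <= date) {
        /* Percentage of the object's age at the time it was fetched. */
        ttl = (time_t) ((double) (date - lastmod) * r->pct / 100.0);
    }
    if (ttl < r->min)
        ttl = r->min;
    if (ttl > r->max)
        ttl = r->max;
    return date + ttl;
}
```

Recomputing this on swapin, when the three original timestamps are read
from the swap file anyway, is what lets squid.conf changes propagate.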

Summary:
* Add one flag telling whether this is a normal cachable object which it
is reasonable to report as a hit to peers (both ICP and digest), provided
that it also is fresh of course. The flag should be configurable with an
ACL-like check (preferably having access to object information like
content-type and/or length besides the URL and other "standard"
information).
* Remove the current 3 object timestamps (Date, Last-Modified, Expires)
and replace them with a single "Fresh Until" timestamp based on refresh
patterns.

Both these new values ("normal cachable" flag, and "fresh until"
timestamp) are estimated values based on the settings when the object
was last touched. They are updated each time the object is hit, to allow
for more rapid propagation of configuration changes to the cache policy.

Objects which are both fresh enough and classified as "normal cachable"
get digested, and are reported as ICP hits.
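
In code the decision then boils down to something like the following
(a sketch only; mem_entry, ENTRY_NORMAL_CACHABLE and report_as_hit are
invented names, not fields or functions from the current source):

```c
#include <assert.h>
#include <time.h>

#define ENTRY_NORMAL_CACHABLE 0x01      /* hypothetical flag bit */

/* Hypothetical in-memory entry carrying only the two proposed values. */
struct mem_entry {
    unsigned flags;
    time_t fresh_until;
};

/* Returns 1 if the entry should be digested / reported as an ICP hit. */
static int
report_as_hit(const struct mem_entry *e, time_t now)
{
    assert(e != NULL);
    return (e->flags & ENTRY_NORMAL_CACHABLE) != 0
        && now < e->fresh_until;
}
```

When building a digest, "now" could be replaced by a time somewhat in
the future, so that objects about to go stale are left out.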

This should provide a simple framework which saves memory (2 timestamps,
or 8 bytes less per object on most platforms), makes digesting a lighter
task and allows for detailed control of what gets digested or reported
as ICP hits.

Drawbacks:
* Not entirely obvious how to upgrade an existing cache without either
doing a rebuild from disk, or temporarily resetting "fresh until" to the
default TTL as used when building digests.
* Needs changes here and there in the code.

On the positive side, the on-disk store format is flexible enough to
easily allow the change while preserving old objects.

Hmm.. thinking a bit further on what can be removed from memory: We do
not really need the "last referenced" timestamp either. The LRU list is
sufficient, and can be preserved by writing out the swap index in LRU
list order whenever we write a clean swap index.
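
A small sketch of the idea, using a made-up list type (lru_node,
lru_list and both functions are illustrative only, and an int key and a
plain array stand in for the store key and the swap index file). Writing
the entries oldest first means a reader that puts each record at the
head of its list ends up with the same order:

```c
#include <assert.h>
#include <stddef.h>

struct lru_node {
    int key;                    /* stands in for the store key */
    struct lru_node *prev;      /* towards most recently used */
    struct lru_node *next;      /* towards least recently used */
};

struct lru_list {
    struct lru_node *head;      /* most recently used */
    struct lru_node *tail;      /* least recently used */
};

/* Insert a node as the most recently used entry. */
static void
lru_push_head(struct lru_list *l, struct lru_node *n)
{
    n->prev = NULL;
    n->next = l->head;
    if (l->head != NULL)
        l->head->prev = n;
    else
        l->tail = n;
    l->head = n;
}

/* Write keys from least to most recently used; returns the count. */
static size_t
write_index_in_lru_order(const struct lru_list *l, int *out, size_t cap)
{
    size_t n = 0;
    const struct lru_node *p;
    assert(l != NULL && out != NULL);
    for (p = l->tail; p != NULL && n < cap; p = p->prev)
        out[n++] = p->key;
    return n;
}
```

On startup, reading the clean index front to back and calling
lru_push_head() for each record rebuilds the list without any stored
reference times.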

/Henrik
Received on Tue Jul 29 2003 - 13:15:58 MDT
