Nigel Metheringham writes:
>
> I'd like a list of things to watch so that I can tell when our cluster of
> cache machines is running out of steam. What points, other than general
> response time, which tends to depend very much on what you are looking
> at, show a cache system that is just getting more hits than it can cope
> with?
Hmm - this is an ugly question :)
What we have here:
A script runs every 5 minutes and fetches the cache-info page, then pulls
out figures such as the number of disk objects. It is the same script that
http://ircache.nlanr.net/Cache/Statistics/Vitals/ runs, but with all sorts
of extra code for graphing.
We also run a script called response-times over the access.log files. It
does the following (I didn't write these scripts - Duane did, but he
considers them "non-release-code", so don't bother him with requests! This
is just giving credit where it's due):
tcp-svc-time-hist.pl < ~/access.log > svc-time-hist
system("~/response-times.gnuplot | ppmchange rgb:0/0/0 rgb:ffff/ffff/ffff rgb:0303/0303/0303 rgb:0/0/0 | ppmtogif > /home/httpd/html/cache/vitals/response-time-$DATE.gif 2>/dev/null");
tcp-svc-time-hist.pl looks like this:
--------------
#!/usr/bin/perl
# $Id: tcp-svc-time-hist.pl,v 1.1 1997/05/16 04:56:13 wessels Exp $
# Make a log-based histogram of HTTP request service times.
#
# perl tcp-svc-time-hist.pl < /usr/local/squid/logs/access.log > hist
# gnuplot
# > set logscale x
# > plot 'hist' using 1:4 with lines
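$SF = 10;    # scale factor: bins per unit of ln(milliseconds)
while (<>) {
    @F = split;
    # field 4 is the result code (TCP_HIT, TCP_MISS, ...); field 2 is elapsed ms
    next unless ($F[3] =~ /^TCP_/);
    next if ($F[1] == 0);    # log(0) is undefined
    $bin = int( $SF * log($F[1]) + 0.5);
    $H[$bin]++;
}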
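$sum1 = 0;
$sum2 = 0;
# total number of requests counted
for ($i=0; $i<=$#H; $i++) {
    $sum1 += $H[$i];
}
# output columns: seconds, count, fraction of requests, cumulative fraction
for ($i=0; $i<=$#H; $i++) {
    $sum2 += $H[$i];
    printf "%14.5f %9d %10.5f %10.5f\n",
        (exp($i/$SF)/1000),
        $H[$i],
        $H[$i]/$sum1,
        $sum2/$sum1;
}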
# print the median
$sum3 = 0;
for ($i=0; $i<=$#H; $i++) {
    $sum3 += $H[$i];
    $S[$i] = $sum3;
    last if ($sum3 / $sum1 >= 0.5);
}
$X = $S[$i-1];
$Z = $S[$i];
$Y = $sum2 / 2;
print "#i=$i\n";
print "#X=$X\n";
print "#Z=$Z\n";
print "#Y=$Y\n";
die if ($Y < $X);
die if ($Y > $Z);
$B = ($i -1) + ($Y-$X) / ($Z - $X);
print "#B=$B\n";
printf "# median is %f seconds\n", exp($B/$SF) / 1000;
---------------------
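To make the binning in the script above a bit more concrete, here is a
small sketch of just that step. It is not part of Duane's script: the sub
names bin_for_ms and seconds_for_bin and the sample times are made up for
illustration. It only shows how an elapsed time in milliseconds lands in
a bin, and what time the first output column reports for that bin.

```perl
#!/usr/bin/perl
# Sketch of the log binning used by tcp-svc-time-hist.pl.
# The sub names and sample values are illustrative, not from the script.
use strict;
use warnings;

my $SF = 10;    # same scale factor as the script

# Map an elapsed time in milliseconds to its histogram bin.
sub bin_for_ms {
    my ($ms) = @_;
    return int($SF * log($ms) + 0.5);
}

# Map a bin index back to the time it stands for, in seconds
# (this is the value the script prints in its first column).
sub seconds_for_bin {
    my ($bin) = @_;
    return exp($bin / $SF) / 1000;
}

for my $ms (1, 50, 250, 1000, 30000) {
    my $bin = bin_for_ms($ms);
    printf "%6d ms -> bin %3d -> %.3f s\n", $ms, $bin, seconds_for_bin($bin);
}
```

With $SF = 10, moving up one bin multiplies the time by exp(0.1), roughly
1.105, so the bins stay evenly spaced on the log-scaled x axis of the plot.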
response-times.gnuplot looks like this:
------------------
#!/usr/bin/gnuplot
set term pbm small color
#set size 0.88,0.88
set xlabel 'Time transaction took (seconds)'
set ylabel 'Number of documents'
set logscale x
plot 'svc-time-hist' using 1:2 title 'cache' with lines
------------------
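For watching a single number rather than a graph, something like the
following could pull the median out of the svc-time-hist file each time
it is regenerated. This is only a sketch and is not one of the scripts
above: the sub name median_from_lines and the 1.0 second limit are
assumptions for illustration, and the right limit depends entirely on
your own traffic. It relies only on the "# median is ... seconds" line
that tcp-svc-time-hist.pl prints at the end of its output.

```perl
#!/usr/bin/perl
# Sketch: read the median from tcp-svc-time-hist.pl output and flag slow periods.
# The sub name and the threshold are hypothetical, chosen for illustration.
use strict;
use warnings;

# Return the median (in seconds) found in the given output lines,
# or undef if there is no "# median is ..." line.
sub median_from_lines {
    for my $line (@_) {
        return $1 if $line =~ /^# median is ([\d.]+) seconds/;
    }
    return;
}

my $LIMIT = 1.0;    # hypothetical threshold in seconds; tune for your site

if (@ARGV) {
    open(my $fh, '<', $ARGV[0]) or die "cannot open $ARGV[0]: $!";
    my $median = median_from_lines(<$fh>);
    close($fh);
    die "no median line found\n" unless defined $median;
    printf "median service time %.3f s%s\n", $median,
        $median > $LIMIT ? " (above limit)" : "";
}
```

Run it as "perl check-median.pl svc-time-hist" (the file name
check-median.pl is also just an example) after the histogram step.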
Received on Mon Jun 30 1997 - 02:11:40 MDT