Re: [squid-users] Squid slows under load

From: J. Pilfold-Bagwell <jpb_at_bordengrammar.kent.sch.uk>
Date: Sat, 05 Mar 2011 16:46:59 +0000

Hi again,

Went into work the next day and carried on, only to find that one of the
SSDs was spewing CRC failures into the log. I don't recall seeing any
of these in the previous days, and the logs didn't contain any entries
prior to my email, so I assume that was a new problem.

I've pulled the SSDs and replaced them with 160GB SATA II disks. I then
ran curl-loader with a 150-client load, and the resulting 5 minute stats
dump is posted below.

sample_start_time = 1299342747.28318 (Sat, 05 Mar 2011 16:32:27 GMT)
sample_end_time = 1299343047.45842 (Sat, 05 Mar 2011 16:37:27 GMT)
client_http.requests = 115.309931/sec
client_http.hits = 0.003333/sec
client_http.errors = 0.000000/sec
client_http.kbytes_in = 25.411849/sec
client_http.kbytes_out = 195.575242/sec
client_http.all_median_svc_time = 0.012346 seconds
client_http.miss_median_svc_time = 0.013086 seconds
client_http.nm_median_svc_time = 0.000000 seconds
client_http.nh_median_svc_time = 0.000000 seconds
client_http.hit_median_svc_time = 0.000000 seconds
server.all.requests = 0.299982/sec
server.all.errors = 0.000000/sec
server.all.kbytes_in = 1.573241/sec
server.all.kbytes_out = 0.333314/sec
server.http.requests = 0.266651/sec
server.http.errors = 0.000000/sec
server.http.kbytes_in = 1.103269/sec
server.http.kbytes_out = 0.253319/sec
server.ftp.requests = 0.000000/sec
server.ftp.errors = 0.000000/sec
server.ftp.kbytes_in = 0.000000/sec
server.ftp.kbytes_out = 0.000000/sec
server.other.requests = 0.033331/sec
server.other.errors = 0.000000/sec
server.other.kbytes_in = 0.469973/sec
server.other.kbytes_out = 0.076662/sec
icp.pkts_sent = 0.000000/sec
icp.pkts_recv = 0.000000/sec
icp.queries_sent = 0.000000/sec
icp.replies_sent = 0.000000/sec
icp.queries_recv = 0.000000/sec
icp.replies_recv = 0.000000/sec
icp.replies_queued = 0.000000/sec
icp.query_timeouts = 0.000000/sec
icp.kbytes_sent = 0.000000/sec
icp.kbytes_recv = 0.000000/sec
icp.q_kbytes_sent = 0.000000/sec
icp.r_kbytes_sent = 0.000000/sec
icp.q_kbytes_recv = 0.000000/sec
icp.r_kbytes_recv = 0.000000/sec
icp.query_median_svc_time = 0.000000 seconds
icp.reply_median_svc_time = 0.000000 seconds
dns.median_svc_time = 4.177065 seconds
unlink.requests = 0.000000/sec
page_faults = 0.000000/sec
select_loops = 202.268185/sec
select_fds = 498.770865/sec
average_select_fd_period = 0.002005/fd
median_select_fds = 0.000000
swap.outs = 0.063330/sec
swap.ins = 0.000000/sec
swap.files_cleaned = 0.000000/sec
aborted_requests = 0.049997/sec
syscalls.polls = 202.268185/sec
syscalls.disk.opens = 0.063330/sec
syscalls.disk.closes = 0.063330/sec
syscalls.disk.reads = 0.000000/sec
syscalls.disk.writes = 0.406643/sec
syscalls.disk.seeks = 0.000000/sec
syscalls.disk.unlinks = 0.000000/sec
syscalls.sock.accepts = 220.630446/sec
syscalls.sock.sockets = 99.554185/sec
syscalls.sock.connects = 0.216654/sec
syscalls.sock.binds = 99.554185/sec
syscalls.sock.closes = 209.474431/sec
syscalls.sock.reads = 147.911360/sec
syscalls.sock.writes = 242.719155/sec
syscalls.sock.recvfroms = 2.516520/sec
syscalls.sock.sendtos = 0.079995/sec
cpu_time = 8.350000 seconds
wall_time = 300.017524 seconds
cpu_usage = 2.783171%
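
As a rough cross-check of the dump above, here is a minimal Python sketch
that parses the "name = value" lines and derives a few ratios. The function
names and the 1 second DNS warning threshold are illustrative assumptions,
not anything Squid defines; only the counter names and values come from the
dump itself.

```python
import re

# A few lines copied from the 5 minute dump above.
# Paste the complete dump here for real use.
STATS = """\
client_http.requests = 115.309931/sec
client_http.hits = 0.003333/sec
server.all.requests = 0.299982/sec
dns.median_svc_time = 4.177065 seconds
cpu_usage = 2.783171%
"""

# Matches "counter.name = 123.456" and ignores the trailing unit.
LINE_RE = re.compile(r"^\s*([\w.]+)\s*=\s*(-?\d+(?:\.\d+)?)")


def parse_stats(text):
    """Return {counter_name: float} for every 'name = number' line."""
    stats = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            stats[match.group(1)] = float(match.group(2))
    return stats


def summarise(stats, dns_warn_seconds=1.0):
    """Derive simple ratios; dns_warn_seconds is an arbitrary threshold."""
    requests = stats["client_http.requests"]
    return {
        # Fraction of client requests served from cache.
        "hit_ratio": stats["client_http.hits"] / requests if requests else 0.0,
        # Fraction of client requests that caused a fetch from a server.
        "server_fetch_ratio": (
            stats["server.all.requests"] / requests if requests else 0.0
        ),
        # True when the median DNS lookup exceeds the chosen threshold.
        "dns_slow": stats["dns.median_svc_time"] > dns_warn_seconds,
    }


if __name__ == "__main__":
    for key, value in summarise(parse_stats(STATS)).items():
        print(key, value)
```

With the figures above, both ratios come out well under 1%, and the DNS
median of about 4.2 seconds is flagged. The counters alone do not say why
so few requests were hits or server fetches, so treat this as a pointer for
further digging rather than a diagnosis.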

Whether replacing the SSDs gives a real-world cure will be seen on Monday.

Thanks again,

Julian

On 03/03/11 17:59, Pieter De Wit wrote:
> Hi Julian,
>
> The one stat that I can't see here is disk access. I know you said
> that you have SSDs, but what are the disk stats for your logging
> volume and the squid volume? If you totally bypass the proxy, does it
> improve? (Could it be that the squid server is getting shaped?)
>
> Cheers,
>
> Pieter
>
> On 4/03/2011 06:46, Julian Pilfold-Bagwell wrote:
>> Hi All,
>>
>> I've been having some problems with Squid and Dansguardian for a
>> while now and despite lots of time on Google, haven't found a solution.
>>
>> The problem started a week or so back when I noticed that squid was
>> slowing. A quick look through the logs showed it was running out of
>> file descriptors so I upped the level to take account. The server
>> was ancient so I bought in an HP Proliant DL120 (dual Pentium 2.80GHz
>> G6950 CPU & 4GB of RAM). At the same time, I bought in 2 x 60GB SSD
>> drives to use as cache space with the system on a RAID 1 array with
>> 160GB SATA II disks.
>>
>> On this, I installed Ubuntu server 10.04.2 LTS with Squid 2.7 (from
>> apt) and Dansguardian 2.10.1.1. The kernel version is
>> 2.6.32-24-server and the server authenticates via a Samba PDC (v
>> 3.5.6) using OpenLDAP/Winbind. The Samba version on the proxy
>> machine is v 3.4.7 as supplied from the Ubuntu repo.
>>
>> This however also seems to run out of steam. My first thought was
>> that it may have been running out of RAM so I ran htop. Both CPUs
>> were topping out at 20% and out of the 4GB of RAM, 1.3GB was used.
>> Next I checked the load on the NIC and found that it was running on
>> average 400kB/s, with the odd burst at 5MB/s. As the load increased,
>> web pages were taking up to 30-45 seconds to load. I bypassed
>> Dansguardian and went in on 3128 with no change in performance.
>>
>> Following the recommendations on other sites discovered via Google, I
>> tuned and tweaked settings with no real benefit and I can't see that
>> I changed anything to cause it to happen. The log files look fine, I
>> have 10000 file descriptors available and cachemgr shows plenty of
>> spares. There are 50% more NTLM authenticators than are in use at any
>> given time.
>>
>> The config file for Squid is shown below. I have had the number of
>> authenticators set to 400 as I have 350 users but the number in use
>> still peaked at around 50. If I've been a numpty and done something
>> glaringly obvious, I'd be grateful if someone could point it out. If
>> not, ask for info and I'll provide it.
>>
>> Thanks,
>>
>> Jools
>>
>>
>> ## Squid.conf
>> ## Start with authentication for clients
>>
>> auth_param ntlm program /usr/bin/ntlm_auth --helper-protocol=squid-2.5-ntlmssp
>> auth_param ntlm_param children 100
>> auth_param ntlm keep_alive on
>>
>> auth_param basic program /usr/bin/ntlm_auth --helper-protocol=squid-2.5-basic
>> auth_param basic children 100
>> auth_param basic realm Squid proxy-caching web server
>> auth_param basic credentialsttl 2 hours
>>
>> ## Access Control Lists for filter bypass ##
>> acl realtek dstdomain .realtek.com.tw
>> acl tes dstdomain .tes.co.uk
>> acl glogster dstdomain .glogster.com
>> acl adobe-installer dstdomain .adobe.com # allow installs from adobe download manager
>> acl actihealth dstdomain .actihealth.com .actihealth.net # Allow direct access for PE dept activity monitors
>> acl spybotupdates dstdomain .safer-networking.org .spybotupdates.com # Allow updates for Spybot S&D
>> acl sims-update dstdomain .kcn.org.uk .capitaes.co.uk .capitasolus.co.uk .sims.co.uk # Allow SIMS to update itself directly
>> acl kcc dstdomain .kenttrustweb.org.uk # Fix problem with county
>> acl frenchconference dstdomain flashmeeting.e2bn.net
>> acl emsonline dstdomain .emsonline.kent.gov.uk
>> acl clamav dstdomain .db.gb.clamav.net
>> acl ubuntu dstdomain .ubuntu.com .warwick.ac.uk
>> acl windowsupdate dstdomain windowsupdate.microsoft.com
>> acl windowsupdate dstdomain .update.microsoft.com
>> acl windowsupdate dstdomain download.windowsupdate.com
>> acl windowsupdate dstdomain redir.metaservices.microsoft.com
>> acl windowsupdate dstdomain images.metaservices.microsoft.com
>> acl windowsupdate dstdomain c.microsoft.com
>> acl windowsupdate dstdomain www.download.windowsupdate.com
>> acl windowsupdate dstdomain wustat.windows.com
>> acl windowsupdate dstdomain crl.microsoft.com
>> acl windowsupdate dstdomain sls.microsoft.com
>> acl windowsupdate dstdomain productactivation.one.microsoft.com
>> acl windowsupdate dstdomain ntservicepack.microsoft.com
>> acl windowsupdate dstdomain download.adobe.com
>> acl comodo dstdomain download.comodo.com
>> acl simsb2b dstdomain emsonline.kent.gov.uk
>> acl powerman dstdomain pmstats.org
>> acl ability dstdomain ability.com
>> acl fulston dstdomain fulstonmanor.kent.sch.uk
>> acl httpsproxy dstdomain .retiredsanta.com .atunnel.com .btunnel.com .ctunnel.com .dtunnel.com .ztunnel.com .partyaccount.com
>>
>> ## Access Control for filtered users ##
>> acl all src 0.0.0.0/0.0.0.0
>> acl manager proto cache_object
>> acl localhost src 127.0.0.1/255.255.255.255
>> acl to_localhost dst 127.0.0.0/8
>> acl SSL_ports port 443
>> acl ntlm_users proxy_auth REQUIRED
>>
>> acl SSL_ports port 443 # https
>> acl SSL_ports port 563 # snews
>> acl SSL_ports port 873 # rsync
>> acl Safe_ports port 80 # http
>> acl Safe_ports port 21 # ftp
>> acl Safe_ports port 443 # https
>> acl Safe_ports port 70 # gopher
>> acl Safe_ports port 210 # wais
>> acl Safe_ports port 1025-65535 # unregistered ports
>> acl Safe_ports port 280 # http-mgmt
>> acl Safe_ports port 488 # gss-http
>> acl Safe_ports port 591 # filemaker
>> acl Safe_ports port 777 # multiling http
>> acl Safe_ports port 631 # cups
>> acl Safe_ports port 873 # rsync
>> acl Safe_ports port 901 # SWAT
>> acl purge method PURGE
>> acl CONNECT method CONNECT
>>
>> ## Allow/Deny Lists ##
>> http_access allow manager localhost
>> http_access deny manager
>> http_access allow purge localhost
>> http_access deny purge
>> http_access deny !Safe_ports
>> http_access deny CONNECT !SSL_ports
>>
>> http_access allow emsonline
>> http_access allow clamav
>> http_access allow realtek
>> http_access allow ubuntu
>> http_access allow tes
>> http_access allow glogster
>> http_access allow kcc
>> http_access allow fulston
>> http_access allow comodo
>> http_access allow ability
>> http_access allow powerman
>> http_access allow windowsupdate
>> http_access allow simsb2b
>> http_access allow adobe-installer
>> http_access allow actihealth
>> http_access allow spybotupdates
>> http_access allow sims-update
>> http_access allow frenchconference
>> http_access allow ntlm_users
>> http_access deny httpsproxy
>> http_access allow localhost
>> http_access deny all
>> icp_access deny all
>>
>> ## Cache Settings ##
>> log_fqdn off
>> half_closed_clients off
>> maximum_object_size 1024 KB
>> cache_access_log none
>> cache_store_log none
>> http_port 3128
>> redirect_children 750
>> hierarchy_stoplist cgi-bin ?
>> cache_mem 128 MB
>> memory_replacement_policy lru
>> cache_replacement_policy lru
>> cache_dir ufs /fastcache1 15000 16 256
>> cache_dir ufs /fastcache2 15000 16 256
>> refresh_pattern ^ftp: 1440 20% 10080
>> refresh_pattern ^gopher: 1440 0% 1440
>> refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
>> refresh_pattern (Release|Package(.gz)*)$ 0 20% 2880
>> refresh_pattern . 0 20% 4320
>> acl shoutcast rep_header X-HTTP09-First-Line ^ICY.[0-9]
>> upgrade_http0.9 deny shoutcast
>> acl apache rep_header Server ^Apache
>> broken_vary_encoding allow apache
>> extension_methods REPORT MERGE MKACTIVITY CHECKOUT
>> cache_effective_user proxy
>> ## Hash out effective group as it stops access to winbind privileged pipe and breaks authentication - jpb
>> # cache_effective_group proxy
>> max_filedescriptors 10000
>> dns_nameservers 172.20.0.253 172.31.49.46 172.31.81.46
>> hosts_file /etc/hosts
>> coredump_dir /var/spool/squid
>>
>
>
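
Two details in the quoted config may be worth a second look: the line
"auth_param ntlm_param children 100" names a scheme "ntlm_param" where the
neighbouring lines use "ntlm", and "acl SSL_ports port 443" appears twice.
Either could simply be a slip made when pasting into the email. Below is a
minimal Python sketch of a check for both. The function names are
hypothetical, the scheme list (basic, digest, ntlm, negotiate) is what
Squid 2.7 is understood to accept, and treating everything after "#" as a
comment is a simplification that will misread patterns containing "#".

```python
from collections import Counter

# Authentication schemes assumed to be valid for auth_param in Squid 2.7.
KNOWN_AUTH_SCHEMES = {"basic", "digest", "ntlm", "negotiate"}

# A few lines copied from the quoted squid.conf (quote markers removed).
SAMPLE = """\
auth_param ntlm program /usr/bin/ntlm_auth --helper-protocol=squid-2.5-ntlmssp
auth_param ntlm_param children 100
auth_param ntlm keep_alive on
acl SSL_ports port 443
acl SSL_ports port 443 # https
acl SSL_ports port 563 # snews
"""


def strip_comment(line):
    """Drop everything from the first '#' onwards (a simplification)."""
    return line.split("#", 1)[0].strip()


def lint(conf_text):
    """Return warnings for unknown auth schemes and repeated acl lines."""
    warnings = []
    seen_acls = Counter()
    for number, raw in enumerate(conf_text.splitlines(), start=1):
        words = strip_comment(raw).split()
        if not words:
            continue
        if (
            words[0] == "auth_param"
            and len(words) > 1
            and words[1] not in KNOWN_AUTH_SCHEMES
        ):
            warnings.append(
                f"line {number}: unknown auth_param scheme '{words[1]}'"
            )
        if words[0] == "acl":
            key = tuple(words)
            seen_acls[key] += 1
            # Report only the first repeat of an identical acl line.
            if seen_acls[key] == 2:
                warnings.append(
                    f"line {number}: duplicate acl entry '{' '.join(words)}'"
                )
    return warnings


if __name__ == "__main__":
    for warning in lint(SAMPLE):
        print(warning)
```

To check the real file, read it from wherever squid.conf lives on the proxy
and pass the text to lint(). A repeated acl value is probably harmless,
while an unrecognised auth_param scheme is the one to confirm against the
live config.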

-- 
Julian Pilfold-Bagwell,
Network Manager,
Borden Grammar School,
Avenue of Remembrance,
Sittingbourne,
Kent,
ME10 4DB.
Tel: 01795 424192
Received on Sat Mar 05 2011 - 16:47:09 MST