re-sending and adding -dev list
performance drops going from 3.0 -> 3.1 -> 3.2, and in addition squid 3.2
scales poorly (it only reaches about 2x single-worker performance at 4
workers, and drops off again after that)
this means that I actually get better performance on 3.0 than on
3.2, even with multiple workers
David Lang
On Mon, 21 Mar 2011, david_at_lang.hm wrote:
> Date: Mon, 21 Mar 2011 19:26:38 -0700 (PDT)
> From: david_at_lang.hm
> To: squid-users_at_squid-cache.org
> Subject: [squid-users] squid 3.2.0.5 smp scaling issues
>
> test setup
>
> box A running apache and ab
>
> test against local IP address >13000 requests/sec
>
> box B running squid, 8 2.3 GHz Opteron cores with 16G ram
>
> non acl/cache-peer related lines in the config are (including typos from me
> manually entering this)
>
> http_port 8000
> icp_port 0
> visible_hostname gromit1
> cache_effective_user proxy
> cache_effective_group proxy
> append_domain .invalid.server.name
> pid_filename /var/run/squid.pid
> cache_dir null /tmp
> client_db off
> cache_access_log syslog squid
> cache_log /var/log/squid/cache.log
> cache_store_log none
> coredump_dir none
> no_cache deny all
>
>
> results when requesting short html page
> squid 3.0.STABLE12 4200 requests/sec
> squid 3.1.11 2100 requests/sec
> squid 3.2.0.5 1 worker 1400 requests/sec
> squid 3.2.0.5 2 workers 2100 requests/sec
> squid 3.2.0.5 3 workers 2500 requests/sec
> squid 3.2.0.5 4 workers 2900 requests/sec
> squid 3.2.0.5 5 workers 2900 requests/sec
> squid 3.2.0.5 6 workers 2500 requests/sec
> squid 3.2.0.5 7 workers 2000 requests/sec
> squid 3.2.0.5 8 workers 1900 requests/sec
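
To make the scaling in the table above easier to read, here is a minimal Python sketch that turns the quoted 3.2.0.5 numbers into speedup and per-worker efficiency. The names RESULTS_32, BASELINE_30 and scaling_table are made up for this illustration; the figures themselves are copied from the table.

```python
# Requests/sec for squid 3.2.0.5, keyed by worker count (from the table above).
RESULTS_32 = {1: 1400, 2: 2100, 3: 2500, 4: 2900,
              5: 2900, 6: 2500, 7: 2000, 8: 1900}
# Requests/sec for squid 3.0.STABLE12 (single process), same test.
BASELINE_30 = 4200


def scaling_table(results):
    """Return (workers, req/s, speedup vs 1 worker, efficiency) rows."""
    base = results[1]
    rows = []
    for workers in sorted(results):
        speedup = results[workers] / base
        # Efficiency is 1.0 when throughput grows linearly with workers.
        efficiency = speedup / workers
        rows.append((workers, results[workers], speedup, efficiency))
    return rows


for workers, rps, speedup, efficiency in scaling_table(RESULTS_32):
    print(f"{workers} workers: {rps} req/s, "
          f"{speedup:.2f}x speedup, {efficiency:.0%} efficiency, "
          f"{rps / BASELINE_30:.0%} of 3.0 throughput")
```

Speedup tops out at roughly 2.07x with 4 workers (2900 / 1400), and even that best case is only about 69% of what 3.0.STABLE12 does with a single process.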
>
> in all these tests the squid process was using 100% of the cpu
>
> I tried it pulling a larger file (100K instead of <50 bytes), on the thought
> that this may be bottlenecking on accepting the connections and that with
> something that took more time to service the connections it could do better.
> however, what I found is that with 8 workers all 8 were using <50% of the CPU
> at 1000 requests/sec
>
> local machine would do 7000 requests/sec to itself
>
> 1 worker 500 requests/sec
> 2 workers 957 requests/sec
>
> from there it remained about 1000 requests/sec with the cpu utilization
> slowly dropping off (but not dropping as fast as it should with the number of
> cores available)
>
> so it looks like there is some significant bottleneck in version 3.2 that
> makes the SMP support fairly ineffective.
>
>
> in reading the wiki page at wiki.squid-cache.org/Features/SmpScale I see you
> worrying about fairness between workers. If you have put in code to try and
> ensure fairness, you may want to remove it and see what happens to
> performance. what you are describing on that page in terms of fairness is
> what I would expect from a 'first-come-first-served' approach to multiple
> processes grabbing new connections. The worker that last ran is hot in the
> cache and so has an 'unfair' advantage in noticing and processing the new
> request, but as that worker gets busier, it will be spending more time
> servicing the request and the other processes will get more of a chance to
> grab the new connection, so it will appear unfair under light load, but
> become more fair under heavy load.
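
For readers who want to see the 'first-come-first-served' pattern described above, here is a minimal Python sketch of it, assuming a POSIX system with fork(). It is not how squid 3.2 hands out connections; it only illustrates several processes blocking in accept() on one shared listening socket and letting the kernel pick a winner. The function name run_prefork_demo and its parameters are hypothetical.

```python
import os
import signal
import socket
from collections import Counter


def run_prefork_demo(num_workers=3, num_requests=12):
    """Fork workers that all accept() on one shared listening socket.

    Returns (worker_pids, served_by), where served_by lists the PID of
    the worker that answered each request, in request order.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(64)
    port = listener.getsockname()[1]

    worker_pids = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:
            # Worker: no fairness logic at all, whoever wakes up in
            # accept() first gets the connection.
            try:
                while True:
                    conn, _ = listener.accept()
                    conn.sendall(str(os.getpid()).encode("ascii"))
                    conn.close()
            finally:
                os._exit(0)
        worker_pids.append(pid)

    served_by = []
    try:
        for _ in range(num_requests):
            with socket.create_connection(("127.0.0.1", port), timeout=5) as c:
                served_by.append(int(c.recv(64)))
    finally:
        # The parent never accepts; it only cleans up the workers.
        for pid in worker_pids:
            os.kill(pid, signal.SIGKILL)
            os.waitpid(pid, 0)
        listener.close()
    return worker_pids, served_by


if __name__ == "__main__":
    pids, served = run_prefork_demo()
    print("requests served per worker pid:", dict(Counter(served)))
```

With sequential requests like these, the split between workers is up to the kernel and may well be lopsided, which is the light-load 'unfairness' described above.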
>
> David Lang
>
Received on Sun Mar 27 2011 - 06:02:51 MDT