Application slow but CPU is at 40% max

I have a strange situation on a production server: connections get queued, but the CPU is only at 40%. The database also runs fine at 30% CPU.

Some more history as requested in the comments:

  • During peak hours the site gets around 20,000 visitors an hour.
  • The site is a WebForms application with a lot of AJAX/POSTs.
  • The site uses a lot of user-generated content.
  • We measure the performance of the site with a test page that hits the database and the web services used by the site. This page gets served within a second under normal load. We define the application as slow when the request takes more than 4 seconds.
  • From the measurements we can see that the connection time is fast, but the processing time is long.
  • We can’t pinpoint the slow response to a single request; the site runs fine during normal hours but gets slow during peak hours.
  • We had a problem where the site was CPU bound (running at 100%); we fixed that.
  • We also had problems with exceptions making the AppDomain restart; we fixed that too.
  • During peak hours I look at the performance counters. We see 600 current connections with 500 queued connections.
  • At peak times the CPU is around 40% (which makes me think that it is not CPU bound).
  • Physical memory is around 60% used.
  • At peak times the database server CPU is around 30% (which makes me think it is not database bound).

My conclusion is that something else is stopping the server from handling the requests faster. Possible suspects:

  • Deadlocks (!syncblk only gives one lock)
  • Disk I/O (checked via Sysinternals Process Explorer: 3.5 MB/s)
  • Garbage collection (10~15% during peaks)
  • Network I/O (connect time still low)

To find out what the process is doing, I created two minidumps, 20 seconds apart. This is the output of the first:

CPU utilization 6%
Worker Thread: Total: 95 Running: 72 Idle: 23 MaxLimit: 200 MinLimit: 100
Work Request in Queue: 1
Number of Timers: 64

and the output of the second:

CPU utilization 9%
Worker Thread: Total: 111 Running: 111 Idle: 0 MaxLimit: 200 MinLimit: 100
Work Request in Queue: 1589

As you can see, there are a lot of requests in the queue.

Question 1: What does it mean that there are 1589 requests in the queue? Does it mean something is blocking?

The !threadpool list contains mostly these entries:

Unknown Function: 6a2aa293 Context: 01cd1558
AsyncTimerCallbackCompletion TimerInfo@023a2cb0

If I dig deeper into the AsyncTimerCallbackCompletion entries:

!dumpheap -type TimerCallback

Then I look at the objects in the TimerCallback and most of them are of types:


Question 2: Does it make sense that those objects have a timer, and so many of them? Should I prevent this, and if so, how?

Main question: am I missing any obvious reason why I’m queueing connections without maxing out the CPU?

I succeeded in making a crash dump during a peak. Analyzing it with DebugDiag gave me this warning:

Detected possible blocking or leaked critical section at webengine!g_AppDomainLock owned by thread 65 in Hang Dump.dmp
Impact of this lock
25.00% of threads blocked
(Threads 11 20 29 30 31 32 33 39 40 41 42 74 75 76 77 78 79 80 81 82 83)

The following functions are trying to enter this critical section

The following module(s) are involved with this critical section
\\?\C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\webengine.dll from Microsoft Corporation

A quick Google search doesn’t give me any results. Does anybody have a clue?

asked Nov 19 ’10 at 16:30
Did you try and measure the speed from Firebug? see which part loads the longest.. then start from there. – Arief Iman Santoso Nov 19 ’10 at 16:32
This is extremely difficult to diagnose using the spotty information you provided. Is there a reason you started by looking at crash dumps? Is your ASP.NET application crashing? If so, why do you classify this as a performance problem? – Dan Esparza Nov 19 ’10 at 18:00

3 Answers

Too many ASP.NET queued requests will destroy performance. There are a very limited number of request threads.

Try to free up those threads by processing slow parts of your pages asynchronously or do anything else you can to bring down page execution times.
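To see why a limited number of request threads caps throughput even when the CPU is mostly idle, here is a rough back-of-the-envelope sketch in Python (used only for illustration; the site itself is ASP.NET). It plugs in figures from the question: 111 running worker threads, the 4-second "slow" threshold, and the queue growing from 1 to 1589 requests in 20 seconds. The function names are made up, and treating every request as holding a thread for the full 4 seconds is an assumption.

```python
def max_throughput(threads: int, seconds_per_request: float) -> float:
    """Upper bound on requests/second when each request holds one thread."""
    return threads / seconds_per_request


def queue_growth_rate(queued_before: int, queued_after: int, interval_s: float) -> float:
    """Average number of requests per second added to the queue."""
    return (queued_after - queued_before) / interval_s


# Figures taken from the question; 4 s per request is an assumed worst case.
ceiling = max_throughput(threads=111, seconds_per_request=4.0)
growth = queue_growth_rate(queued_before=1, queued_after=1589, interval_s=20.0)

print(f"throughput ceiling: {ceiling:.2f} req/s")
print(f"queue growth:       {growth:.1f} req/s")
```

If threads spend most of those seconds waiting on a database or web service rather than computing, the ceiling is set by the wait time and the thread count, not by the CPU, which would be consistent with a low CPU reading and a growing queue.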

answered Nov 20 ’10 at 15:06
Yes, I understand. However, I don’t understand why it’s not processing the requests faster, as the CPU is not maxed out. – wasigh Nov 20 ’10 at 15:58
My money is on the network / database round-trips. Can you put stopwatch code around each of these requests? – realworldcoder Nov 20 ’10 at 16:11
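The "stopwatch code" idea from the comment above could look roughly like the following sketch. It is written in Python for illustration only; in the actual ASP.NET site the same pattern would wrap each database or web service call. The `stopwatch` helper, the `timings` dictionary, and `fake_database_call` are hypothetical names, and the 50 ms sleep merely stands in for a real round-trip.

```python
import time
from contextlib import contextmanager

timings = {}  # label -> list of elapsed seconds


@contextmanager
def stopwatch(label):
    """Record how long the wrapped block takes, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings.setdefault(label, []).append(time.perf_counter() - start)


def fake_database_call():
    # Hypothetical stand-in for a real query or web service call.
    time.sleep(0.05)


with stopwatch("database"):
    fake_database_call()

for label, samples in timings.items():
    print(f"{label}: {len(samples)} call(s), slowest {max(samples):.3f}s")
```

Collecting the samples per label makes it possible to compare normal hours with peak hours and see which round-trip grows.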

I’m with realworldcoder: IIS works by having Worker Processes handle the incoming requests. If the requests get stacked up, as it appears is happening, then performance takes a nose dive.

There are several possible things to do/check for.

  1. Fire up Activity Monitor on the SQL Server. You want to see what queries are taking the longest and, depending on the results, make changes to reduce their execution time. Long queries can cause the thread the page is executing under to block, reducing the number of connections you can support.
  2. Look at the number of queries, and the time they take to execute, for these page/AJAX calls. I’ve seen pages with dozens of unnecessary queries that get executed for an AJAX call simply because .NET executes the entire page cycle even when only a particular method needed to be run. You might split those calls into regular web handlers (.ashx); that way you can better control exactly what happens.
  3. Consider increasing the number of worker processes IIS has to handle incoming requests. The default for a new app pool is 1 process with 20 threads. This is usually enough to handle tons of requests; however, if the requests are blocking due to waiting on the DB server or some other resource it can cause the pipeline to stack up. Bear in mind that this can have either a positive or negative impact to both performance and regular functioning of your application. So do some research then test, test, test.
  4. Consider reducing or eliminating your usage of session. Either way, look at its memory usage and potentially add more RAM to your web server. Session data is serialized and deserialized for every page load (including AJAX calls) regardless of whether the data is used or not. Depending on what you are storing in session, it can have a serious negative impact on your site. If you aren’t using it, then make sure it’s completely turned off in your web.config. Note that these issues only get worse if you store session off of the web server, as you then become bound to the speed of the network when a page retrieves and stores it.
  5. Look at the site’s performance counters around JIT (Just-In-Time) compiling. This should be nearly non-existent. I’ve seen sites brought to their knees by massive amounts of JIT. Once those pages were recoded to eliminate it, the sites started flying again.
  6. Look at different caching strategies (I don’t consider session a real caching solution). Perhaps there are things that you constantly request that you don’t really need to constantly pull out of the DB server. A friend of mine has a site where they cache entire web pages as physical files for dynamic content, including their discussion groups. This has radically increased their performance; but it is a major architectural change.

The above are just a couple things to look at. You basically need to get further into the details to find out exactly what is going on and most of the regular performance counters aren’t going to give you that clarity.
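As one concrete illustration of the caching point in item 6, here is a minimal time-to-live cache sketch in Python (illustrative only, not ASP.NET code). `TtlCache`, `load_front_page`, and the 30-second lifetime are all assumptions made up for the example; the point is only that repeated requests for the same content can skip the database until the entry expires.

```python
import time


class TtlCache:
    """Keep loaded values for a fixed number of seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no call to the loader
        value = loader()
        self._entries[key] = (now + self._ttl, value)
        return value


stats = {"db_hits": 0}


def load_front_page():
    # Hypothetical stand-in for an expensive database query.
    stats["db_hits"] += 1
    return "<html>front page</html>"


cache = TtlCache(ttl_seconds=30)
for _ in range(1000):
    cache.get_or_load("front-page", load_front_page)

print(f"1000 requests, {stats['db_hits']} database hit(s)")
```

This sketch is not thread-safe; a real web application would need locking or a cache provided by the framework.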

answered Nov 23 ’10 at 15:44
Chris Lively

up vote 0 down vote accepted

The worker processes handling the queue were the real dealbreaker. It is probably connected with the website calling web services on the same host, thus creating a kind of deadlock.

I changed the machine.config to the following:

        minIoThreads="50" />

By default, this processModel is set to autoConfig="true".

With the new config the webserver is handling the requests fast enough to not get queued.
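The "kind of deadlock" described in this answer, where a request needs a second thread from the same pool to finish, can be reproduced in miniature. The sketch below uses Python's `ThreadPoolExecutor` as an analogy and is not ASP.NET's thread pool; `web_service`, `run`, the pool sizes, and the timings are all assumptions chosen for the demonstration and are not the settings from the answer.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def web_service():
    # Hypothetical stand-in for the web service hosted on the same server.
    time.sleep(0.1)
    return "ok"


def run(workers, concurrent_pages, timeout_s=0.6):
    """Serve pages that each call a web service handled by the same pool.

    Assumes workers >= concurrent_pages so every page can get a thread.
    """
    pool = ThreadPoolExecutor(max_workers=workers)
    all_pages_started = threading.Barrier(concurrent_pages)

    def page():
        # Wait until every page request is holding a worker thread.
        all_pages_started.wait()
        inner = pool.submit(web_service)  # needs another thread from the pool
        try:
            return inner.result(timeout=timeout_s)
        except FutureTimeout:
            return "timed out"

    futures = [pool.submit(page) for _ in range(concurrent_pages)]
    results = [f.result() for f in futures]
    pool.shutdown()
    return results


print("4 workers:", run(workers=4, concurrent_pages=4))
print("8 workers:", run(workers=8, concurrent_pages=4))
```

With four workers and four pages in flight, every thread is held by a page waiting for a nested call that has no thread left to run on, so the pages should time out while the CPU stays idle. With eight workers the nested calls get threads and the pages complete, which mirrors the effect of raising the thread limits.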

answered Dec 10 ’10 at 13:05