This section has been adapted from Microsoft's Patterns and Practices, Chapter 15, "Measuring .NET Application Performance", at http://msdn2.microsoft.com/en-US/library/ms998579.aspx#scalenetchapt15_topic11.
To determine whether your application meets its performance objectives and to help identify bottlenecks, you need to measure your application's performance and collect metrics. Beyond the information contained in the Intercept performance events, further performance metrics may be collected with counters. The metrics of particular interest tend to be response time, throughput, and resource utilization (how much CPU, memory, disk I/O, and network bandwidth your application consumes while performing its tasks).
The following section is divided into technology-focused sections that show you what to measure and how to measure it for each of the core Microsoft® .NET technologies including ASP.NET, Web services, Enterprise Services, .NET Remoting, interoperability, and ADO.NET data access.
Measuring shows how your application's performance stands in relation to your defined performance goals and helps you identify the bottlenecks that affect it. It also tells you whether your application is moving toward or away from those goals. Defining what you will measure (your metrics) and defining the objectives for each metric is a critical part of your testing plan.
Performance objectives include response time or latency, throughput, and resource utilization.
You identify resource utilization costs in terms of server and network resources. The primary resources are CPU, memory, disk I/O, and network I/O. You can identify the resource cost on a per-operation basis; operations might include browsing a product catalog, adding items to a shopping cart, or placing an order. You can measure resource costs for a given user load, or you can average resource costs when the application is tested using a given workload profile.
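As a rough illustration of collecting these metrics programmatically during a test run, the following C# sketch samples a few of the counters discussed below at one-second intervals using System.Diagnostics.PerformanceCounter. The "_Total" instance, the one-second interval, and the 30-second duration are illustrative choices, not requirements.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

class ResourceCostSampler
{
    static void Main()
    {
        // Counters from the table below; "_Total" aggregates all processor and disk instances.
        var cpu    = new PerformanceCounter("Processor", "% Processor Time", "_Total");
        var memory = new PerformanceCounter("Memory", "Available MBytes");
        var disk   = new PerformanceCounter("PhysicalDisk", "Avg. Disk Queue Length", "_Total");

        // The first NextValue() call on a rate counter returns 0; prime before sampling.
        cpu.NextValue();
        disk.NextValue();

        for (int i = 0; i < 30; i++)            // sample once per second for 30 seconds
        {
            Thread.Sleep(1000);
            Console.WriteLine("CPU {0,5:F1}%  Available {1,6:F0} MB  Disk queue {2,4:F1}",
                cpu.NextValue(), memory.NextValue(), disk.NextValue());
        }
    }
}
```

In a real test you would typically log these samples alongside the load-tool output so that resource cost can be correlated with the applied user load.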
When you measure how many system resources your application consumes, pay particular attention to the counters in the following table; a short worked check of two of the table's thresholds follows it.
Metric | Performance Object | Counter | Threshold | Significance |
Processor | Processor | % Processor Time | The general figure for the threshold limit for processors is 85 percent. | This counter is the primary indicator of processor activity. High values may not necessarily be bad. However, if other processor-related counters, such as % Privileged Time or Processor Queue Length, are increasing linearly, high CPU utilization may be worth investigating. |
Processor | Processor | % Privileged Time | A figure that is consistently over 75 percent indicates a bottleneck. | This counter indicates the percentage of time a thread runs in privileged mode. When your application calls operating system functions (for example, to perform file or network I/O or to allocate memory), these operating system functions are executed in privileged mode. |
Processor | Processor | % Interrupt Time | Depends on processor. | This counter indicates the percentage of time the processor spends receiving and servicing hardware interrupts. This value is an indirect indicator of the activity of devices that generate interrupts, such as network adapters. A dramatic increase in this counter indicates potential hardware problems. |
Processor | System | Processor Queue Length | An average value consistently higher than 2 indicates a bottleneck. | If there are more tasks ready to run than there are processors, threads queue up. The processor queue is the collection of threads that are ready but not able to be executed by the processor because another active thread is currently executing. A sustained or recurring queue of more than two threads is a clear indication of a processor bottleneck. You may get more throughput by reducing parallelism in those cases. You can use this counter in conjunction with the Processor\% Processor Time counter to determine if your application can benefit from more CPUs. There is a single queue for processor time, even on multiprocessor computers. Therefore, in a multiprocessor computer, divide the Processor Queue Length (PQL) value by the number of processors servicing the workload. If the CPU is very busy (90 percent and higher utilization) and the PQL average is consistently higher than 2 per processor, you may have a processor bottleneck that could benefit from additional CPUs. Or, you could reduce the number of threads and queue more at the application level. This will cause less context switching, and less context switching is good for reducing CPU load. The common reason for a PQL of 2 or higher with low CPU utilization is that requests for processor time arrive randomly and threads demand irregular amounts of time from the processor. This means that the processor is not a bottleneck but that it is your threading logic that needs to be improved. |
Processor | System | Context Switches/sec | As a general rule, context switching rates of less than 5,000 per second per processor are not worth worrying about. If context switching rates exceed 15,000 per second per processor, then there is a constraint. | Context switching happens when a higher priority thread preempts a lower priority thread that is currently running or when a high priority thread blocks. High levels of context switching can occur when many threads share the same priority level. This often indicates that there are too many threads competing for the processors on the system. If you do not see much processor utilization and you see very low levels of context switching, it could indicate that threads are blocked. |
Memory | Memory | Available MBytes | A consistent value of less than 20 to 25 percent of installed RAM is an indication of insufficient memory. | This indicates the amount of physical memory available to processes running on the computer. Note that this counter displays the last observed value only. It is not an average. |
Memory | Memory | Page Reads/sec | Sustained values of more than five indicate a large number of page faults for read requests. | This counter indicates that the working set of your process is too large for the physical memory and that it is paging to disk. It shows the number of read operations, without regard to the number of pages retrieved in each operation. Higher values indicate a memory bottleneck. If a low rate of page-read operations coincides with high values for Physical Disk\% Disk Time and Physical Disk\Avg. Disk Queue Length, there could be a disk bottleneck. If an increase in queue length is not accompanied by a decrease in the pages-read rate, a memory shortage exists. |
Memory | Memory | Pages/sec | Sustained values higher than five indicate a bottleneck. | This counter indicates the rate at which pages are read from or written to disk to resolve hard page faults. Multiply the values of the Physical Disk\Avg. Disk sec/Transfer and Memory\Pages/sec counters. If the product of these counters exceeds 0.1, paging is taking more than 10 percent of disk access time, which indicates that you need more RAM. |
Memory | Memory | Pool Non-paged Bytes | Watch the value of Memory\Pool Non-paged Bytes for an increase of 10 percent or more from its value at system startup. | If there is an increase of 10 percent or more from its value at startup, a serious leak is potentially developing. |
Memory | Server | Pool Non-paged Failures | Regular nonzero values indicate a bottleneck. | This counter indicates the number of times allocations from the non-paged pool have failed. It indicates that the computer's physical memory is too small. The non-paged pool contains pages from a process's virtual address space that are not to be swapped out to the page file on disk, such as a process' kernel object table. The availability of the non-paged pool determines how many processes, threads, and other such objects can be created. When allocations from the non-paged pool fail, this can be due to a memory leak in a process, particularly if processor usage has not increased accordingly. |
Memory | Server | Pool Paged Failures | No specific value. | This counter indicates the number of times allocations from the paged pool have failed. This counter indicates that the computer's physical memory or page file is too small. |
Memory | Server | Pool Non-paged Peak | No specific value. | This is the maximum number of bytes in the non-paged pool that the server has had in use at any one point. It indicates how much physical memory the computer should have. Because the non-paged pool must be resident, and because there has to be some memory left over for other operations, you might quadruple it to get the actual physical memory you should have for the system. |
Memory | Memory | Cache Bytes | No specific value. | Monitors the size of cache under different load conditions. This counter displays the size of the static files cache. By default, this counter uses approximately 50 percent of available memory, but decreases if the available memory shrinks, which affects system performance. |
Memory | Memory | Cache Faults/sec | No specific value. | This counter indicates how often the operating system looks for data in the file system cache but fails to find it. This value should be as low as possible. The cache is independent of data location but is heavily dependent on data density within the set of pages. A high rate of cache faults can indicate insufficient memory or could also denote poorly localized data. |
Memory | Cache | MDL Read Hits % | The higher this value, the better the performance of the file system cache. Values should preferably be as close to 100 percent as possible. | This counter provides the percentage of Memory Descriptor List (MDL) Read requests to the file system cache, where the cache returns the object directly rather than requiring a read from the hard disk. |
Disk I/O | PhysicalDisk | Avg. Disk Queue Length | Should not be higher than the number of spindles plus two. | This counter indicates the average number of both read and write requests that were queued for the selected disk during the sample interval. |
Disk I/O | PhysicalDisk | Avg. Disk Read Queue Length | Should be less than two. | This counter indicates the average number of read requests that were queued for the selected disk during the sample interval. |
Disk I/O | PhysicalDisk | Avg. Disk Write Queue Length | Should be less than two. | This counter indicates the average number of write requests that were queued for the selected disk during the sample interval. |
Disk I/O | PhysicalDisk | Avg. Disk sec/Read | No specific value. | This counter indicates the average time, in seconds, of a read of data from the disk. |
Disk I/O | PhysicalDisk | Avg. Disk sec/Transfer | Should not be more than 18 milliseconds. | This counter indicates the time, in seconds, of the average disk transfer. This may indicate a large amount of disk fragmentation, slow disks, or disk failures. Multiply the values of the Physical Disk\Avg. Disk sec/Transfer and Memory\Pages/sec counters. If the product of these counters exceeds 0.1, paging is taking more than 10 percent of disk access time, so you need more RAM. |
Disk I/O | PhysicalDisk | Disk Writes/sec | Depends on manufacturer's specification. | This counter indicates the rate of write operations on the disk. |
Network I/O | Network Interface | Bytes Total/sec | Sustained values of more than 80 percent of network bandwidth. | This counter indicates the rate at which bytes are sent and received over each network adapter. This counter helps you know whether the traffic at your network adapter is saturated and if you need to add another network adapter. How quickly you can identify a problem depends on the type of network you have as well as whether you share bandwidth with other applications. |
Network I/O | Network Interface | Bytes Received/sec | No specific value. | This counter indicates the rate at which bytes are received over each network adapter. You can calculate the rate of incoming data as a part of total bandwidth. This will help you know that you need to optimize on the incoming data from the client or that you need to add another network adapter to handle the incoming traffic. |
Network I/O | Network Interface | Bytes Sent/sec | No specific value. | This counter indicates the rate at which bytes are sent over each network adapter. You can calculate the rate of incoming data as a part of total bandwidth. This will help you know that you need to optimize on the data being sent to the client or you need to add another network adapter to handle the outbound traffic. |
Network I/O | Server | Bytes Total/sec | Value should not be more than 50 percent of network capacity. | This counter indicates the number of bytes sent and received over the network. Higher values indicate network bandwidth as the bottleneck. If the sum of Bytes Total/sec for all servers is roughly equal to the maximum transfer rates of your network, you may need to segment the network. |
Network I/O | Protocol_Object (can be TCP, UDP, NetBEUI, NWLink IPX, NWLink NetBIOS, NWLink SPX, or other protocol layer performance objects) | Protocol-related counters | Application-specific. | Protocol-related counters help you narrow down the traffic to various protocols because you might be using one or more protocols in your network. You may want to identify which protocol is consuming the network bandwidth disproportionately. |
Network I/O | Processor | % Interrupt Time | Depends on processor. | This counter indicates the percentage of time the processor spends receiving and servicing hardware interrupts. This value is an indirect indicator of the activity of devices that generate interrupts, such as network adapters. |
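Two of the rules of thumb in the table above lend themselves to a quick programmatic check. The sketch below, offered as an illustration rather than a definitive test, divides System\Processor Queue Length by the processor count and multiplies Memory\Pages/sec by PhysicalDisk\Avg. Disk sec/Transfer to apply the "paging should not take more than 10 percent of disk access time" rule; the 90 percent CPU figure and single one-second sample are assumptions for brevity.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

class ThresholdChecks
{
    static void Main()
    {
        var queue = new PerformanceCounter("System", "Processor Queue Length");
        var cpu   = new PerformanceCounter("Processor", "% Processor Time", "_Total");
        var pages = new PerformanceCounter("Memory", "Pages/sec");
        var xfer  = new PerformanceCounter("PhysicalDisk", "Avg. Disk sec/Transfer", "_Total");

        // Prime the rate counters, then wait one interval before taking real readings.
        cpu.NextValue(); pages.NextValue(); xfer.NextValue();
        Thread.Sleep(1000);

        float queuePerCpu = queue.NextValue() / Environment.ProcessorCount;
        float cpuBusy     = cpu.NextValue();
        float pagingShare = pages.NextValue() * xfer.NextValue();   // fraction of disk time spent paging

        if (cpuBusy >= 90 && queuePerCpu > 2)
            Console.WriteLine("Possible processor bottleneck: queue {0:F1} per CPU at {1:F0}% busy.",
                queuePerCpu, cpuBusy);

        if (pagingShare > 0.1)
            Console.WriteLine("Paging is taking more than 10% of disk access time ({0:P0}); consider more RAM.",
                pagingShare);
    }
}
```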
This section describes what you need to measure in relation to the CLR and managed code and how you capture the key metrics. This applies to all managed code, regardless of the type of assembly (for example, an ASP.NET application, Web service, serviced component, or data access component). When measuring processes running under the CLR, the key points to look for are listed in the following table; a short monitoring sketch follows it.
Metric | Performance Object | Counter | Threshold | Significance |
Memory | Process | Private Bytes | The threshold depends on your application and on settings in the Machine.config file. The default for ASP.NET is 60 percent of available physical RAM or 800 MB, whichever is smaller. | This counter indicates the current number of bytes allocated to this process that cannot be shared with other processes. This counter is used for identifying memory leaks. |
Memory | .NET CLR Memory | % Time in GC | This counter should average about 5 percent for most applications when the CPU is 70 percent busy, with occasional peaks. As the CPU load increases, so does the percentage of time spent performing garbage collection. Keep this in mind when you measure the CPU. | This counter indicates the percentage of elapsed time spent performing a garbage collection since the last garbage collection cycle. The most common cause of a high value is making too many allocations, which may be the case if you are allocating on a per-request basis for ASP.NET applications. You need to study the allocation profile for your application if this counter shows a higher value. |
Memory | .NET CLR Memory | # Bytes in all Heaps | No specific value. | This counter is the sum of four other counters — Gen 0 Heap Size, Gen 1 Heap Size, Gen 2 Heap Size, and Large Object Heap Size. The value of this counter will always be less than the value of Process\Private Bytes, which also includes the native memory allocated for the process by the operating system. Private Bytes - # Bytes in all Heaps is the number of bytes allocated for unmanaged objects. This counter reflects the memory usage by managed resources. |
Memory | .NET CLR Memory | # Gen 0 Collections | No specific value. | This counter indicates the number of times the generation 0 objects are garbage-collected from the start of the application. Objects that survive the collection are promoted to Generation 1. You can observe the memory allocation pattern of your application by plotting the values of this counter over time. |
Memory | .NET CLR Memory | # Gen 1 Collections | One-tenth the value of # Gen 0 Collections | This counter indicates the number of times the generation 1 objects are garbage-collected from the start of the application. |
Memory | .NET CLR Memory | # Gen 2 Collections | One-tenth the value of # Gen 1 Collections | This counter indicates the number of times the generation 2 objects are garbage-collected from the start of the application. The generation 2 heap is the costliest to maintain for an application. Whenever there is a generation 2 collection, it suspends all the application threads. You should profile the allocation pattern for your application and minimize the objects in generation 2 heap. |
Memory | .NET CLR Memory | # of Pinned Objects | No specific value. | When .NET-based applications use unmanaged code, these objects are pinned in memory. That is, they cannot move around because the pointers to them would become invalid. These can be measured by this counter. You can also pin objects explicitly in managed code, such as reusable buffers used for I/O calls. Too many pinned objects affect the performance of the garbage collector because they restrict its ability to move objects and organize memory efficiently. |
Memory | .NET CLR Memory | Large Object Heap Size | No specific value. | The large object heap size shows the amount of memory consumed by objects whose size is greater than 85 KB. If the difference between # Bytes in All Heaps and Large Object Heap Size is small, most of the memory is being used up by large objects. The large object heap cannot be compacted after collection and may become heavily fragmented over a period of time. You should investigate your memory allocation profile if you see large numbers here. |
Working Set | Process | Working Set | No specific value. | The working set is the set of memory pages currently loaded in RAM. If the system has sufficient memory, it can maintain enough space in the working set so that it does not need to perform the disk operations. However, if there is insufficient memory, the system tries to reduce the working set by taking away the memory from the processes which results in an increase in page faults. When the rate of page faults rises, the system tries to increase the working set of the process. If you observe wide fluctuations in the working set, it might indicate a memory shortage. Higher values in the working set may also be due to multiple assemblies in your application. You can improve the working set by using assemblies shared in the global assembly cache. |
Exceptions | .NET CLR Exceptions | # of Exceptions Thrown / sec | This counter value should be less than 5 percent of Request/sec for the ASP.NET application. If you see more than 1 request in 20 throw an exception, you should pay closer attention to it. | This counter indicates the total number of exceptions generated per second in managed code. Exceptions are very costly and can severely degrade your application performance. You should investigate your code for application logic that uses exceptions for normal processing behavior. Response.Redirect, Server.Transfer, and Response.End all cause a ThreadAbortException in ASP.NET applications. |
Contention | .NET CLR LocksAndThreads | Contention Rate / sec | No specific value. | This counter displays the rate at which the runtime attempts to acquire a managed lock without success. Sustained nonzero values may be a cause for concern. You may want to run dedicated tests for a particular piece of code to identify the contention rate for that code path. |
Contention | .NET CLR LocksAndThreads | Current Queue Length | No specific value. | This counter displays the last recorded number of threads currently waiting to acquire a managed lock in an application. You may want to run dedicated tests for a particular piece of code to identify the average queue length for the particular code path. This helps you identify inefficient synchronization mechanisms. |
Threading | .NET CLR LocksAndThreads | # of current physical Threads | No specific value. | This counter indicates the number of native operating system threads currently owned by the CLR that act as underlying threads for .NET thread objects. This gives you an idea of how many threads are actually spawned by your application. This counter can be monitored along with System\Context Switches/sec. A high rate of context switches almost certainly means that you have spawned a higher than optimal number of threads for your process. If you want to analyze which threads are causing the context switches, you can analyze the Thread\Context Switches/sec counter for all threads in a process and then make a dump of the process stack to identify the actual threads by comparing the thread IDs from the test data with the information available from the dump. |
Threading | Thread | % Processor Time | No specific value. | This counter gives you an idea of which thread is actually taking the most processor time. If you see idle CPU and low throughput, threads could be waiting or deadlocked. You can take a stack dump of the process and compare the thread IDs from test data with the dump information to identify threads that are waiting or blocked. |
Threading | Thread | Context Switches/sec | No specific value. | The counter needs to be investigated when the System\Context Switches/sec counter shows a high value. The counter helps in identifying which threads are actually causing the high context switching rates. |
Threading | Thread | Thread State | No specific value. | This counter tells you the state of a particular thread at a given instant. You need to monitor this counter when you suspect that a particular thread is consuming most of the processor resources. |
Code Access Security | .NET CLR Security | Total RunTime Checks | No specific value. | This counter displays the total number of runtime code access security checks performed since the start of the application. This counter used together with the Stack Walk Depth counter is indicative of the performance penalty that your code incurs for security checks. |
Code Access Security | .NET CLR Security | Stack Walk Depth | No specific value. | This counter displays the depth of the stack during that last runtime code access security check. This counter is not an average. It just displays the last observed value. |
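Most .NET CLR counters are exposed per process, with the instance name equal to the process name without the .exe extension (for example, w3wp for the IIS 6.0 worker process). The following minimal sketch reads % Time in GC, # Bytes in all Heaps, and Process\Private Bytes for the current process; the 10 percent warning threshold is an illustrative choice, stricter than the roughly 5 percent average suggested above.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

class ClrMemoryMonitor
{
    static void Main()
    {
        // CLR counters are per process; the instance name is the process name.
        string instance = Process.GetCurrentProcess().ProcessName;

        var timeInGc  = new PerformanceCounter(".NET CLR Memory", "% Time in GC", instance);
        var heapBytes = new PerformanceCounter(".NET CLR Memory", "# Bytes in all Heaps", instance);
        var privBytes = new PerformanceCounter("Process", "Private Bytes", instance);

        timeInGc.NextValue();                     // prime before reading
        Thread.Sleep(1000);

        float gc      = timeInGc.NextValue();
        float managed = heapBytes.NextValue();
        float priv    = privBytes.NextValue();

        Console.WriteLine("% Time in GC: {0:F1}", gc);
        Console.WriteLine("Managed heaps: {0:F1} MB, unmanaged (Private Bytes - heaps): {1:F1} MB",
            managed / (1024 * 1024), (priv - managed) / (1024 * 1024));

        if (gc > 10)                              // illustrative threshold only
            Console.WriteLine("High GC time - review the application's allocation profile.");
    }
}
```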
To effectively determine ASP.NET performance, you need to measure the counters in the following table. You measure ASP.NET performance primarily by using system performance counters; a brief monitoring sketch follows the table.
Metric | Performance Object | Counter | Threshold | Significance |
Throughput | ASP.NET Applications | Requests/Sec | Depends on your business requirements. | The throughput of the ASP.NET application on the server. It is one of the primary indicators that help you measure the cost of deploying your system at the necessary capacity. |
Throughput | Web Service | ISAPI Extension Requests/sec | Depends on your business requirements. | The rate of ISAPI extension requests that are simultaneously being processed by the Web service. This counter is not affected by the ASP.NET worker process restart count, although the ASP.NET Applications\Requests/Sec counter is. To measure the throughput cost in terms of the amount of system resources that your requests consume, you need to measure CPU utilization, memory consumption, and the amount of disk and network I/O. This also helps in measuring the cost of the hardware needed to achieve a given level of performance. For more information about how to measure resource costs, see 'System Resources' earlier in this chapter. |
Requests | ASP.NET | Requests Current | No specific value. | The number of requests currently handled by the ASP.NET ISAPI. This includes those that are queued, executing, or waiting to be written to the client. ASP.NET begins to reject requests when this counter exceeds the requestQueueLimit defined in the processModel configuration section. If ASP.NET\Requests Current is greater than zero and no responses have been received from the ASP.NET worker process for a duration greater than the limit specified by <processModel responseDeadlockInterval=/>, the process is terminated and a new process is started. |
Requests | ASP.NET Applications | Requests Executing | No specific value. | The number of requests currently executing. This counter is incremented when the HttpRuntime begins to process the request and is decremented after the HttpRuntime finishes the request. |
Requests | ASP.NET Applications | Requests Timed Out | No specific value. | The number of requests that have timed out. You need to investigate the cause of request timeouts. One possible reason is contention between various threads. A good instrumentation strategy helps capture the problem in the log. To investigate further, you can debug and analyze the process using a run-time debugger such as WinDbg. |
Queues | ASP.NET | Requests Queued | No specific value. | The number of requests currently queued. In IIS 6.0, this indicates a shortage of worker threads. Requests are rejected when ASP.NET\Requests Current exceeds the requestQueueLimit (default = 5000) attribute of the <processModel> element defined in the Machine.config file. This can happen when the server is under very heavy load. The queue between IIS and ASP.NET is a named pipe through which the request is sent from one process to the other. When running in IIS 6.0, there is a queue where requests are posted to the managed thread pool from native code. There is also a queue for each virtual directory. You should examine ASP.NET Applications\Requests In Application Queue and ASP.NET\Requests Queued when investigating performance issues. |
Queues | ASP.NET Applications | Requests In Application Queue | No specific value. | There is a separate queue that is maintained for each virtual directory. The limit for this queue is defined by the appRequestQueueLimit attribute for <httpRunTime> element in Machine.config. When the queue limit is reached the request is rejected with a 'Server too busy' error. |
Queues | ASP.NET | Requests Rejected | No specific value. | The number of requests rejected because the request queue was full. ASP.NET worker process starts rejecting requests when ASP.NET\Requests Current exceeds the requestQueueLimit defined in the processModel configuration section. The default value for requestQueueLimit is 5000. |
Queues | ASP.NET | Requests Wait Time | 1,000 milliseconds. The average request should be close to zero milliseconds waiting in queue. | The number of milliseconds the most recent request was waiting in the named pipe queue between the IIS and the ASP.NET worker process. This does not include any time spent in the queue for a virtual directory hosting the Web application. |
Response Time and Latency | | TTFB | Depends on your business requirements. | This is the time interval between sending a request to the server and receiving the first byte of the response. The value varies depending on network bandwidth and server load. Use client tools such as ACT to obtain this metric. |
Response Time and Latency | | TTLB | Depends on your business requirements. | This is the time interval between sending a request to the server and receiving the last byte of the response. Again, the value varies depending upon network bandwidth and server load. Use client tools such as ACT to obtain this metric. |
Response Time and Latency | ASP.NET | Request Execution Time | The value is based on your business requirements. | This is the number of milliseconds taken to execute the last request. The execution time begins when the HttpContext for the request is created, and stops before the response is sent to IIS. Assuming that user code does not call HttpResponse.Flush, this implies that execution time stops before sending any bytes to IIS, or to the client. |
Cache Utilization | ASP.NET Applications | Cache Total Entries | No specific value. | The current number of entries in the cache, which includes both user and internal entries. ASP.NET uses the cache to store objects that are expensive to create, including configuration objects and preserved assembly entries. |
Cache Utilization | ASP.NET Applications | Cache Total Hit Ratio | With sufficient RAM, you should normally record a high (greater than 80 percent) cache hit ratio. | This counter shows the ratio for the total number of internal and user hits on the cache. |
Cache Utilization | ASP.NET Applications | Cache Total Turnover Rate | No specific value. | The number of additions and removals to and from the cache per second (both user and internal.) A high turnover rate indicates that items are being quickly added and removed, which can impact performance. |
Cache Utilization | ASP.NET Applications | Cache API Hit Ratio | No specific value. | Ratio of cache hits to misses of objects called from user code. A low ratio can indicate inefficient use of caching techniques. |
Cache Utilization | ASP.NET Applications | Cache API Turnover Rate | No specific value. | The number of additions and removals to and from the output cache per second. A high turnover rate indicates that items are being quickly added and removed, which can impact performance. |
Cache Utilization | ASP.NET Applications | Output Cache Entries | No specific value. | The number of entries in the output cache. You need to measure the ASP.NET Applications\ Output Cache Hit Ratio counter to verify the hit rate to the cache entries. If the hit rate is low, you need to identify the cache entries and reconsider your caching mechanism. |
Cache Utilization | ASP.NET Applications | Output Cache Hit Ratio | No specific value. | The total hit-to-miss ratio of output cache requests. |
Cache Utilization | ASP.NET Applications | Output Cache Turnover Rate | No specific value. | The number of additions and removals to the output cache per second. A high turnover rate indicates that items are being quickly added and removed, which can impact performance. |
Errors and Exceptions | ASP.NET Applications | Errors Total/sec | No specific value. | The total number of exceptions generated during preprocessing, parsing, compilation, and run-time processing of a request. A high value can severely affect your application performance. This may render all other results invalid. |
Errors and Exceptions | ASP.NET Applications | Errors During Execution | No specific value. | The total number of errors that have occurred during the processing of requests. |
Errors and Exceptions | ASP.NET Applications | Errors Unhandled During Execution/sec | No specific value. | The total number of unhandled exceptions per second at run time. |
Loading | .NET CLR Loading | Current appdomains | The value should be the same as the number of Web applications plus one. The additional one is the default application domain loaded by the ASP.NET worker process. | The current number of application domains loaded in the process. |
Loading | .NET CLR Loading | Current Assemblies | No specific value. | The current number of assemblies loaded in the process. ASP.NET Web pages (.aspx files) and user controls (.ascx files) are 'batch compiled' by default, which typically results in one to three assemblies, depending on the number of dependencies. Excessive memory consumption may be caused by an unusually high number of loaded assemblies. You should try to minimize the number of Web pages and user controls without compromising the efficiency of workflow. Assemblies cannot be unloaded from an application domain. To prevent excessive memory consumption, the application domain is unloaded when the number of recompilations (.aspx, .ascx, .asax) exceeds the limit specified by <compilation numRecompilesBeforeAppRestart=/>. Note: If the <%@ page debug=%> attribute is set to true, or if <compilation debug=/> is set to true, batch compilation is disabled. |
Loading | .NET CLR Loading | Bytes in Loader Heap | No specific value. | This counter displays the current size (in bytes) of committed memory across all application domains. Committed memory is the physical memory for which space has been reserved in the paging file on disk. |
Worker Process Restarts | ASP.NET | Worker Process Restarts | Depends on your business requirements. | The number of times the ASP.NET worker process has restarted (the Web application is recycled along with it). |
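If you want to watch the ASP.NET counters above from a monitoring utility rather than System Monitor, a sketch along the following lines can be used. The "__Total__" instance aggregates all ASP.NET applications on the server; note that the version-independent category names used here ("ASP.NET" and "ASP.NET Applications") may need to be replaced with the versioned names on some machines (for example, "ASP.NET Apps v2.0.50727"), and the five-second polling interval is an arbitrary choice.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

class AspNetMonitor
{
    static void Main()
    {
        // "__Total__" aggregates every ASP.NET application on the server.
        var requestsPerSec  = new PerformanceCounter("ASP.NET Applications", "Requests/Sec", "__Total__");
        var requestsQueued  = new PerformanceCounter("ASP.NET", "Requests Queued");
        var requestsCurrent = new PerformanceCounter("ASP.NET", "Requests Current");

        requestsPerSec.NextValue();               // prime the rate counter
        while (true)
        {
            Thread.Sleep(5000);
            Console.WriteLine("Requests/sec {0,7:F1}  Queued {1,5:F0}  Current {2,5:F0}",
                requestsPerSec.NextValue(), requestsQueued.NextValue(), requestsCurrent.NextValue());
        }
    }
}
```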
This section describes what you need to do to measure ASP.NET Web service performance and how you capture the key metrics. To measure Web service performance, you can use many of the same counters used to measure ASP.NET application performance. To effectively determine Web service performance, you need to measure the counters in the following table; a short client-side latency sketch follows it.
Metric | Performance Object | Counter | Threshold | Significance |
Throughput | ASP.NET Applications | Requests/Sec | Depends on your business requirements. | The throughput of the ASP.NET application on the server. It is one of the primary indicators that help you measure the cost of deploying your system at the necessary capacity. |
Throughput | Web Service | ISAPI Extension Requests/sec | Depends on your business requirements. | The rate of ISAPI extension requests that are simultaneously being processed by the Web service. This counter is not affected by the ASP.NET worker process restart count, although the ASP.NET Applications\Requests/Sec counter is. To measure the throughput cost in terms of the amount of system resources that your requests consume, you need to measure CPU utilization, memory consumption, and the amount of disk and network I/O. This also helps in measuring the cost of the hardware needed to achieve a given level of performance. For more information about how to measure resource costs, see 'System Resources' earlier in this chapter. |
Requests | ASP.NET | Requests Current | No specific value. | The number of requests currently handled by the ASP.NET ISAPI. This includes those that are queued, executing, or waiting to be written to the client. ASP.NET begins to reject requests when this counter exceeds the requestQueueLimit defined in the processModel configuration section. If ASP.NET\Requests Current is greater than zero and no responses have been received from the ASP.NET worker process for a duration greater than the limit specified by <processModel responseDeadlockInterval=/>, the process is terminated and a new process is started. |
Requests | ASP.NET Applications | Requests Executing | No specific value. | The number of requests currently executing. This counter is incremented when the HttpRuntime begins to process the request and is decremented after the HttpRuntime finishes the request. |
Requests | ASP.NET Applications | Requests Timed Out | No specific value. | The number of requests that have timed out. You need to investigate the cause of request timeouts. One possible reason is contention between various threads. A good instrumentation strategy helps capture the problem in the log. To investigate further, you can debug and analyze the process using a run-time debugger such as WinDbg. |
Queues | ASP.NET | Requests Queued | No specific value. | The number of requests currently queued. In IIS 6.0, this indicates a shortage of worker threads. Requests are rejected when ASP.NET\Requests Current exceeds the requestQueueLimit (default = 5000) attribute of the <processModel> element defined in the Machine.config file. This can happen when the server is under very heavy load. The queue between IIS and ASP.NET is a named pipe through which the request is sent from one process to the other. When running in IIS 6.0, there is a queue where requests are posted to the managed thread pool from native code. There is also a queue for each virtual directory. You should examine ASP.NET Applications\Requests In Application Queue and ASP.NET\Requests Queued when investigating performance issues. |
Queues | ASP.NET Applications | Requests In Application Queue | No specific value. | There is a separate queue that is maintained for each virtual directory. The limit for this queue is defined by the appRequestQueueLimit attribute for <httpRunTime> element in Machine.config. When the queue limit is reached the request is rejected with a 'Server too busy' error. |
Queues | ASP.NET | Requests Rejected | No specific value. | The number of requests rejected because the request queue was full. ASP.NET worker process starts rejecting requests when ASP.NET\Requests Current exceeds the requestQueueLimit defined in the processModel configuration section. The default value for requestQueueLimit is 5000. |
Queues | ASP.NET | Requests Wait Time | 1,000 milliseconds. The average request should be close to zero milliseconds waiting in queue. | The number of milliseconds the most recent request was waiting in the named pipe queue between the IIS and the ASP.NET worker process. This does not include any time spent in the queue for a virtual directory hosting the Web application. |
Response Time and Latency | | TTFB | Depends on your business requirements. | This is the time interval between sending a request to the server and receiving the first byte of the response. The value varies depending on network bandwidth and server load. Use client tools such as ACT to obtain this metric. |
Response Time and Latency | | TTLB | Depends on your business requirements. | This is the time interval between sending a request to the server and receiving the last byte of the response. Again, the value varies depending upon network bandwidth and server load. Use client tools such as ACT to obtain this metric. |
Response Time and Latency | ASP.NET | Request Execution Time | The value is based on your business requirements. | This is the number of milliseconds taken to execute the last request. The execution time begins when the HttpContext for the request is created, and stops before the response is sent to IIS. Assuming that user code does not call HttpResponse.Flush, this implies that execution time stops before sending any bytes to IIS, or to the client. |
Cache Utilization | ASP.NET Applications | Cache Total Entries | No specific value. | The current number of entries in the cache, which includes both user and internal entries. ASP.NET uses the cache to store objects that are expensive to create, including configuration objects and preserved assembly entries. |
Cache Utilization | ASP.NET Applications | Cache Total Hit Ratio | With sufficient RAM, you should normally record a high (greater than 80 percent) cache hit ratio. | This counter shows the ratio for the total number of internal and user hits on the cache. |
Cache Utilization | ASP.NET Applications | Cache Total Turnover Rate | No specific value. | The number of additions and removals to and from the cache per second (both user and internal.) A high turnover rate indicates that items are being quickly added and removed, which can impact performance. |
Cache Utilization | ASP.NET Applications | Cache API Hit Ratio | No specific value. | Ratio of cache hits to misses of objects called from user code. A low ratio can indicate inefficient use of caching techniques. |
Cache Utilization | ASP.NET Applications | Cache API Turnover Rate | No specific value. | The number of additions and removals to and from the output cache per second. A high turnover rate indicates that items are being quickly added and removed, which can impact performance. |
Cache Utilization | ASP.NET Applications | Output Cache Entries | No specific value. | The number of entries in the output cache. You need to measure the ASP.NET Applications\ Output Cache Hit Ratio counter to verify the hit rate to the cache entries. If the hit rate is low, you need to identify the cache entries and reconsider your caching mechanism. |
Cache Utilization | ASP.NET Applications | Output Cache Hit Ratio | No specific value. | The total hit-to-miss ratio of output cache requests. |
Cache Utilization | ASP.NET Applications | Output Cache Turnover Rate | No specific value. | The number of additions and removals to the output cache per second. A high turnover rate indicates that items are being quickly added and removed, which can impact performance. |
Errors and Exceptions | ASP.NET Applications | Errors Total/sec | No specific value. | The total number of exceptions generated during preprocessing, parsing, compilation, and run-time processing of a request. A high value can severely affect your application performance. This may render all other results invalid. |
Errors and Exceptions | ASP.NET Applications | Errors During Execution | No specific value. | The total number of errors that have occurred during the processing of requests. |
Errors and Exceptions | ASP.NET Applications | Errors Unhandled During Execution/sec | No specific value. | The total number of unhandled exceptions per second at run time. |
Loading | .NET CLR Loading | Current appdomains | The value should be the same as the number of Web applications plus one. The additional one is the default application domain loaded by the ASP.NET worker process. | The current number of application domains loaded in the process. |
Loading | .NET CLR Loading | Current Assemblies | No specific value. | The current number of assemblies loaded in the process. ASP.NET Web pages (.ASPX files) and user controls (.ascx files) are 'batch compiled' by default, which typically results in one to three assemblies, depending on the number of dependencies. Excessive memory consumption may be caused by an unusually high number of loaded assemblies. You should try to minimize the number of Web pages and user controls without compromising the efficiency of workflow. Assemblies cannot be unloaded from an application domain. To prevent excessive memory consumption, the application domain is unloaded when the number of recompilations (.ASPX, .ascx, .asax) exceeds the limit specified by <compilation numRecompilesBeforeAppRestart=/>. Note If the <%@ page debug=%> attribute is set to true, or if <compilation debug=/> is set to true, batch compilation is disabled. |
Loading | .NET CLR Loading | Bytes in Loader Heap | No specific value. | This counter displays the current size (in bytes) of committed memory across all application domains. Committed memory is the physical memory for which space has been reserved in the paging file on disk. |
Worker Process Restarts | ASP.NET | Worker Process Restarts | Depends on your business requirements. | The number of times the ASP.NET worker process has restarted (the Web application is recycled along with it). |
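TTFB and TTLB in the table above are measured from the client, not from a server counter. If a dedicated load tool such as ACT is not available, a rough client-side measurement can be taken along these lines; the URL is a placeholder for one of your own Web service endpoints, and the drain loop simply reads the response to completion.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Net;

class LatencyProbe
{
    static void Main()
    {
        string url = "http://localhost/MyService/Service.asmx?op=GetQuote";   // placeholder URL

        Stopwatch watch = Stopwatch.StartNew();
        var request = (HttpWebRequest)WebRequest.Create(url);

        using (var response = (HttpWebResponse)request.GetResponse())
        using (Stream body = response.GetResponseStream())
        {
            body.ReadByte();                                      // first byte of the response body
            long ttfb = watch.ElapsedMilliseconds;

            var buffer = new byte[8192];
            while (body.Read(buffer, 0, buffer.Length) > 0) { }   // drain the rest of the response
            long ttlb = watch.ElapsedMilliseconds;

            Console.WriteLine("TTFB: {0} ms, TTLB: {1} ms", ttfb, ttlb);
        }
    }
}
```

A single request gives only a rough figure; for meaningful numbers, repeat the measurement under load and average the results.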
This section describes what you need to do to measure .NET Remoting performance and how you capture the key metrics. To effectively determine .NET Remoting performance, you need to measure the counters in the following table; a short throughput calculation follows it.
Metric | Performance Object | Counter | Threshold | Significance |
Throughput | .NET CLR Remoting | Remote Calls/sec | No specific value. | Measures the current rate of incoming .NET Remoting requests. More than one remote call may be required to complete a single operation. You need to divide this counter by the number of remote calls required to complete a single operation; this gives you the rate of operations completed per second. You might need to instrument your code to observe the request execution time. |
Throughput | ASP.NET Applications | Requests/Sec | No specific value. | If your remote component is hosted in IIS, you can measure the throughput by observing this counter. You need to divide this counter by the number of requests required to complete a single operation; this gives you the rate of operations completed per second. |
Contention | .NET CLR LocksAndThreads | Contention Rate / sec | No specific value. | This counter displays the rate at which the runtime attempts to acquire a managed lock without success. Sustained nonzero values may be a cause for concern. You may want to run dedicated tests for a particular piece of code to identify the contention rate for that code path. |
Contention | .NET CLR LocksAndThreads | Current Queue Length | No specific value. | This counter displays the last recorded number of threads currently waiting to acquire a managed lock in an application. You may want to run dedicated tests for a particular piece of code to identify the average queue length for the particular code path. This helps you identify inefficient synchronization mechanisms. |
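As a worked example of the throughput note above: if Remote Calls/sec averages 400 during a test and a single business operation takes, say, four remote calls, the application is completing roughly 100 operations per second. A minimal sketch of that calculation follows; the calls-per-operation figure is an assumed input taken from your own design, and the counter is read for the current process.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

class RemotingThroughput
{
    static void Main()
    {
        // Assumed from your own design: how many remote calls one business operation requires.
        const int callsPerOperation = 4;

        string instance = Process.GetCurrentProcess().ProcessName;   // CLR counters are per process
        var remoteCalls = new PerformanceCounter(".NET CLR Remoting", "Remote Calls/sec", instance);

        remoteCalls.NextValue();                  // prime the rate counter
        Thread.Sleep(1000);

        float callsPerSec = remoteCalls.NextValue();
        Console.WriteLine("Remote calls/sec: {0:F1}, operations/sec: {1:F1}",
            callsPerSec, callsPerSec / callsPerOperation);
    }
}
```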
This section describes what you need to do to measure interoperability performance and how to capture the key metrics. You can use the measuring techniques discussed in this section to measure P/Invoke and COM interop performance. To effectively determine interop performance, you need to measure the counters in the following table; a short sketch for sampling the interop counters follows it.
Metric | Performance Object | Counter | Threshold | Significance | |
Processor | Processor | % Processor Time | The general figure for the threshold limit for processors is 85 percent. | This counter is the primary indicator of processor activity. High values may not necessarily be bad. However, if other processor-related counters, such as % Privileged Time or Processor Queue Length, are increasing linearly, high CPU utilization may be worth investigating. | |
Processor | Processor | % Privileged Time | A figure that is consistently over 75 percent indicates a bottleneck. | This counter indicates the percentage of time a thread runs in privileged mode. When your application calls operating system functions (for example, to perform file or network I/O or to allocate memory), these operating system functions are executed in privileged mode. | |
Processor | Processor | % Interrupt Time | Depends on processor. | This counter indicates the percentage of time the processor spends receiving and servicing hardware interrupts. This value is an indirect indicator of the activity of devices that generate interrupts, such as network adapters. A dramatic increase in this counter indicates potential hardware problems. | |
Processor | System | Processor Queue Length | An average value consistently higher than 2 indicates a bottleneck. | If there are more tasks ready to run than there are processors, threads queue up. The processor queue is the collection of threads that are ready but not able to be executed by the processor because another active thread is currently executing. A sustained or recurring queue of more than two threads is a clear indication of a processor bottleneck. You may get more throughput by reducing parallelism in those cases. You can use this counter in conjunction with the Processor\% Processor Time counter to determine if your application can benefit from more CPUs. There is a single queue for processor time, even on multiprocessor computers. Therefore, in a multiprocessor computer, divide the Processor Queue Length (PQL) value by the number of processors servicing the workload. If the CPU is very busy (90 percent and higher utilization) and the PQL average is consistently higher than 2 per processor, you may have a processor bottleneck that could benefit from additional CPUs. Or, you could reduce the number of threads and queue more at the application level. This will cause less context switching, and less context switching is good for reducing CPU load. The common reason for a PQL of 2 or higher with low CPU utilization is that requests for processor time arrive randomly and threads demand irregular amounts of time from the processor. This means that the processor is not a bottleneck but that it is your threading logic that needs to be improved. | |
Processor | System | Context Switches/sec | As a general rule, context switching rates of less than 5,000 per second per processor are not worth worrying about. If context switching rates exceed 15,000 per second per processor, then there is a constraint. | Context switching happens when a higher priority thread preempts a lower priority thread that is currently running or when a high priority thread blocks. High levels of context switching can occur when many threads share the same priority level. This often indicates that there are too many threads competing for the processors on the system. If you do not see much processor utilization and you see very low levels of context switching, it could indicate that threads are blocked. | |
Memory | .NET CLR Memory | % Time in GC | This counter should average about 5 percent for most applications when the CPU is 70 percent busy, with occasional peaks. As the CPU load increases, so does the percentage of time spent performing garbage collection. Keep this in mind when you measure the CPU. | This counter indicates the percentage of elapsed time spent performing a garbage collection since the last garbage collection cycle. The most common cause of a high value is making too many allocations, which may be the case if you are allocating on a per-request basis for ASP.NET applications. You need to study the allocation profile for your application if this counter shows a higher value. | |
Memory | .NET CLR Memory | # Bytes in all Heaps | No specific value. | This counter is the sum of four other counters — Gen 0 Heap Size, Gen 1 Heap Size, Gen 2 Heap Size, and Large Object Heap Size. The value of this counter will always be less than the value of Process\Private Bytes, which also includes the native memory allocated for the process by the operating system. Private Bytes - # Bytes in all Heaps is the number of bytes allocated for unmanaged objects. This counter reflects the memory usage by managed resources. | |
Memory | .NET CLR Memory | # Gen 0 Collections | No specific value. | This counter indicates the number of times the generation 0 objects are garbage-collected from the start of the application. Objects that survive the collection are promoted to Generation 1. You can observe the memory allocation pattern of your application by plotting the values of this counter over time. | |
Memory | .NET CLR Memory | # Gen 1 Collections | One-tenth the value of # Gen 0 Collections | This counter indicates the number of times the generation 1 objects are garbage-collected from the start of the application. | |
Memory | .NET CLR Memory | # Gen 2 Collections | One-tenth the value of # Gen 1 Collections | This counter indicates the number of times the generation 2 objects are garbage-collected from the start of the application. The generation 2 heap is the costliest to maintain for an application. Whenever there is a generation 2 collection, it suspends all the application threads. You should profile the allocation pattern for your application and minimize the objects in generation 2 heap. | |
Memory | .NET CLR Memory | # of Pinned Objects | No specific value. | When .NET-based applications use unmanaged code, these objects are pinned in memory. That is, they cannot move around because the pointers to them would become invalid. These can be measured by this counter. You can also pin objects explicitly in managed code, such as reusable buffers used for I/O calls. Too many pinned objects affect the performance of the garbage collector because they restrict its ability to move objects and organize memory efficiently. | |
Memory | .NET CLR Memory | Large Object Heap Size | No specific value. | The large object heap size shows the amount of memory consumed by objects whose size is greater than 85 KB. If the difference between # Bytes in All Heaps and Large Object Heap Size is small, most of the memory is being used up by large objects. The large object heap cannot be compacted after collection and may become heavily fragmented over a period of time. You should investigate your memory allocation profile if you see large numbers here. | |
Chattiness of Marshaled Interfaces | .NET CLR Interop | # of marshalling | No specific value. | This tells you the number of transitions from managed to unmanaged code and back again. If this number is high, determine whether you can redesign this part of the application to reduce the number of transitions needed. | |
Chattiness of Marshaled Interfaces | .NET CLR Interop | # of Stubs | No specific value. | Displays the current number of stubs that the CLR has created. |
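The chattiness counters above can be sampled before and after a block of interop-heavy code to see how much marshaling it causes. A hedged sketch follows; the GetEnvironmentVariable P/Invoke is used purely as a stand-in for whatever unmanaged call your application actually makes, and the loop count of 1,000 is arbitrary.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;
using System.Text;

class InteropChattiness
{
    // Example P/Invoke declaration with string marshaling; any unmanaged call would do.
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    static extern uint GetEnvironmentVariable(string lpName, StringBuilder lpBuffer, uint nSize);

    static void Main()
    {
        string instance = Process.GetCurrentProcess().ProcessName;   // CLR interop counters are per process
        var marshalling = new PerformanceCounter(".NET CLR Interop", "# of marshalling", instance);
        var stubs       = new PerformanceCounter(".NET CLR Interop", "# of Stubs", instance);

        float before = marshalling.NextValue();
        var buffer = new StringBuilder(1024);

        for (int i = 0; i < 1000; i++)            // the interop-heavy code under measurement
        {
            GetEnvironmentVariable("PATH", buffer, (uint)buffer.Capacity);
        }

        Console.WriteLine("Marshaling operations during the loop: {0:F0}",
            marshalling.NextValue() - before);
        Console.WriteLine("Interop stubs created so far: {0:F0}", stubs.NextValue());
    }
}
```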
This section describes what you need to do to measure ADO.NET data access performance and how you capture the key metrics. To effectively determine ADO.NET data access performance, you need to measure the counters in the following table; a short sketch for reading the SqlClient pooling counters follows it.
Metric | Performance Object | Counter | Threshold | Significance |
Connection Pooling - SqlConnection | .NET CLR Data | SqlClient : Current # connection pools | No specific value. | Current number of pools associated with the process. |
Connection Pooling - SqlConnection | .NET CLR Data | SqlClient: Current # pooled connections | No specific value. | Current number of connections in all pools associated with the process. |
Connection Pooling - SqlConnection | .NET CLR Data | SqlClient: Peak # pooled connections | No specific value. | The highest number of connections in all pools since the process started. |
Connection Pooling - SqlConnection | .NET CLR Data | SqlClient: Total # failed connects | No specific value. | The total number of connection open attempts that have failed for any reason. |
Indexes | SQL Server: Access Methods | Index Searches/sec | No specific value. | Number of index searches. Index searches are used to start range scans, single index record fetches, and to reposition within an index. |
Indexes | SQL Server: Access Methods | Full Scans/sec | No specific value. | The rate of full table or full index scans. Lower numbers are better. |
Cache | SQL Server: Cache Manager | Cache Hit Ratio | No specific value. | Ratio between cache hits and lookups. |
Cache | SQL Server: Cache Manager | Cache Use Counts/sec | No specific value. | Times each type of cache object has been used. |
Cache | SQL Server: Memory Manager | SQL Cache Memory (KB) | No specific value. | Total amount of dynamic memory the server is using for the dynamic SQL cache. |
Cache | Memory | Cache Faults/sec | No specific value. | This counter indicates how often the operating system looks for data in the file system cache but fails to find it. This value should be as small as possible. A high rate of cache faults may indicate insufficient memory or poorly organized or heavily fragmented disks. |
Transactions | SQL Server: Databases | Transactions/sec | No specific value. | Number of transactions started for the database. This is the primary indicator of database throughput. |
Transactions | SQL Server: Databases | Active Transactions | No specific value. | Number of active transactions for the database. |
Locks | SQL Server: Locks | Lock Requests/sec | No specific value. | Number of new locks and lock conversions requested from the lock manager. |
Locks | SQL Server: Locks | Lock Timeouts/sec | No specific value. | Number of lock requests that timed out. This includes internal requests for NOWAIT locks. |
Locks | SQL Server: Locks | Lock Waits/sec | No specific value. | Number of lock requests that could not be satisfied immediately and required the caller to wait before being granted the lock. |
Locks | SQL Server: Locks | Number of Deadlocks/sec | No specific value. | Number of lock requests that resulted in a deadlock. A typical reason for this could be interference between long-running queries and multiple row updates. This number has to be very low. Deadlocks translate into significant extra work because a deadlock implies that there must be a retry or compensating action at some higher level of the business logic. High values indicate that there is scope to improve the design of your queries and the way you manage transaction isolation levels. |
Locks | SQL Server: Locks | Average Wait Time (ms) | No specific value. | The average amount of wait time (milliseconds) for each lock request that resulted in a wait. |
Locks | SQL Server: Latches | Average Latch Wait Time (ms) | No specific value. | The average amount of time that a request for a latch had to wait within the database. Latches are lightweight, short-term row locks, so higher numbers indicate contention for resources. |
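The SqlClient counters in the .NET CLR Data object are exposed per instance, and the instance naming scheme varies between Framework versions, so the safest approach in a quick check is to enumerate the instances rather than hard-code one. A minimal sketch follows; it assumes the ".NET CLR Data" category is present on the machine, which is the case once an application has used SqlClient.

```csharp
using System;
using System.Diagnostics;

class SqlPoolCounters
{
    static void Main()
    {
        var category = new PerformanceCounterCategory(".NET CLR Data");

        // Instance names vary (per-process instances plus a global one), so list whatever is there.
        foreach (string instance in category.GetInstanceNames())
        {
            var pools       = new PerformanceCounter(".NET CLR Data", "SqlClient: Current # connection pools", instance);
            var connections = new PerformanceCounter(".NET CLR Data", "SqlClient: Current # pooled connections", instance);
            var failed      = new PerformanceCounter(".NET CLR Data", "SqlClient: Total # failed connects", instance);

            Console.WriteLine("{0}: pools {1:F0}, pooled connections {2:F0}, failed connects {3:F0}",
                instance, pools.NextValue(), connections.NextValue(), failed.NextValue());
        }
    }
}
```

A steadily climbing pooled-connections figure with a low Peak # pooled connections is usually a sign that connections are not being closed and returned to the pool promptly.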