This topic contains information about analyzing blockages during each phase of an image installation. For more information, see Optimizing Performance and Scalability.

In This Topic

Analyzing Blockages in Each Phase of Installation
- PXE Boot Phase
- TFTP Download Phase
- Image Apply Phase
Using Performance Monitoring

Analyzing Blockages in Each Phase of Installation

PXE Boot Phase

The Pre-Boot Execution Environment (PXE) boot phase encompasses the initial boot performed by the client computer. This includes obtaining an IP address lease, locating a valid Windows Deployment Services server, and downloading a network boot program (NBP) by using Trivial File Transfer Protocol (TFTP). The amount of data transferred over the network during this phase is minimal, and the end-to-end operation typically succeeds in a matter of seconds.

Given the speed at which operations in this phase are completed, you have few options when it comes to performance tuning. The Windows Deployment Services PXE server can handle several hundred requests per second in sustained throughput. Slight performance decreases can occur if the domain controller is located across a high-latency network link or is overloaded. In larger environments, consider locating Dynamic Host Configuration Protocol (DHCP) and Windows Deployment Services roles on separate physical computers.

TFTP Download Phase

The TFTP download phase of the installation process is when the boot image is downloaded to the client computer. Performance in this phase is tied directly to the following factors (in order of importance):

Latency between the client computer and the server (measured by the average response time between the server and the client)
Size of the boot image
Note: Increasing the boot image size causes TFTP download times to increase and reduces reliability. Typically, the longer it takes to download the boot image, the more likely it is that something could go wrong.
TFTP block size
Other network conditions (such as workload, the quality of the hardware that is installed, and electromagnetic noise considerations)

Diagnosing TFTP Download Performance Problems

The simplest way to diagnose long download times (observed from the client computer as a progress bar below an IP address) is to look at the average response time between the client and the server it is downloading from. To do this, in Windows PE, open the Command Prompt window, type ping <server’s IP address>, and then note the average latency measured. The output will look similar to the following, where the average latency is less than 1 millisecond (which is good):

C:\Windows\system32>ping 10.197.160.93
Pinging 10.197.160.93 with 32 bytes of data:
Reply from 10.197.160.93: bytes=32 time=2ms TTL=60
Reply from 10.197.160.93: bytes=32 time<1ms TTL=60
Reply from 10.197.160.93: bytes=32 time<1ms TTL=60
Reply from 10.197.160.93: bytes=32 time<1ms TTL=60
Ping statistics for 10.197.160.93:
	Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
	Minimum = 0ms, Maximum = 2ms, Average = 0ms

High round-trip time values indicate latency on the network, which is an indicator that TFTP download performance will be poor. To improve this performance, consider doing one or more of the following:

Use a Windows Deployment Services server that is closer to each client.
Remove stress and load from the network segment.
If the client connects to the server after multiple network hops, use the output from the tracert command to identify the latent segment, and consider rerouting TFTP traffic to avoid the hop.

You can also diagnose TFTP download performance problems by examining a network trace of the download activity. Generally, the best practice is to obtain this trace from the client and server simultaneously to assess exactly where the blockage is occurring (server, client, or network). To do this, add a client and a third computer to a hub, start network traces from the server and the third computer, and then boot the client computer from the network.

Addressing TFTP Download Performance Problems

In the preceding example, the average latency is less than 1 millisecond, which is good. If the average latency between the client and the server is longer than 5 milliseconds, TFTP performance will be seriously degraded.
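The latency check described above can be automated. The following is a minimal sketch, assuming the ping output has the Windows format shown earlier in this topic; the function names and the constant are hypothetical, and the 5 millisecond threshold is the one stated in the preceding paragraph.

```python
import re

# Threshold from the text: an average above 5 ms seriously degrades TFTP.
DEGRADED_THRESHOLD_MS = 5

def average_latency_ms(ping_output: str) -> int:
    """Return the 'Average = Nms' value from Windows ping output."""
    match = re.search(r"Average\s*=\s*(\d+)\s*ms", ping_output)
    if match is None:
        raise ValueError("no 'Average = Nms' value found in ping output")
    return int(match.group(1))

def tftp_latency_is_acceptable(ping_output: str) -> bool:
    """True when the average round-trip time does not exceed the threshold."""
    return average_latency_ms(ping_output) <= DEGRADED_THRESHOLD_MS

# Summary lines copied from the sample ping output in this topic.
SAMPLE = """Ping statistics for 10.197.160.93:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 2ms, Average = 0ms
"""

if __name__ == "__main__":
    print(average_latency_ms(SAMPLE), tftp_latency_is_acceptable(SAMPLE))
```

Note that ping reports whole milliseconds, so an average of 0 ms here means "less than 1 millisecond".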

You may be able to decrease the impact of latency on TFTP download times by increasing the TFTP block size. This means that more data will be sent each time, which cuts down on the number of round-trips. For instructions, see How to Modify the BCD Store Using Bcdedit.
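To see why a larger block size helps, consider a rough model in which the server sends one block and then waits for an acknowledgement before sending the next, so each block costs one round trip. This is a simplified sketch under that assumption; it ignores transmission time, retries, and any windowing the server may use, and the function names are hypothetical.

```python
def tftp_round_trips(image_bytes: int, block_size: int) -> int:
    """Number of data blocks (one round trip each) in a lock-step transfer.

    A TFTP transfer ends with a block shorter than block_size, so an image
    whose size is an exact multiple of block_size needs one extra, empty block.
    """
    if image_bytes < 0 or block_size <= 0:
        raise ValueError("image_bytes must be >= 0 and block_size must be > 0")
    return image_bytes // block_size + 1

def estimated_download_seconds(image_bytes: int, block_size: int, rtt_ms: float) -> float:
    """Latency-only estimate: round trips multiplied by the round-trip time."""
    return tftp_round_trips(image_bytes, block_size) * rtt_ms / 1000.0

if __name__ == "__main__":
    image = 200 * 1024 * 1024  # hypothetical 200 MB boot image
    for block in (512, 1456):  # illustrative block sizes, not recommendations
        print(block, round(estimated_download_seconds(image, block, rtt_ms=5), 1))
```

In this model, download time scales inversely with block size and linearly with latency, which is why both factors appear near the top of the list of performance factors.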

Reducing the size of the boot image can also speed up TFTP downloads. To accomplish this, do the following:

Use the tools in the Windows Automated Installation Kit (AIK) to create a custom boot image that contains the Windows Setup binary files and Microsoft Windows Preinstallation Environment (Windows PE). For instructions, see Creating Images. Ensure this image has been prepared by using PEIMG.exe /prep. For more information, see PEImg Command-Line Options (http://go.microsoft.com/fwlink/?LinkId=120707).
Ensure that the Windows image (.wim) file that contains the boot image does not contain extra space. A best practice is to use the ImageX /export command to export your boot image to a "clean" .wim file before adding the image to the Windows Deployment Services server. For more information, see ImageX Command-Line Options (http://go.microsoft.com/fwlink/?LinkId=120708).
Ensure that the .wim file that contains the boot image is using the maximum compression format, LZX. To check the compression type, run Imagex /info ImageFile <ImageNumber|ImageName>. For more information, see ImageX Command-Line Options (http://go.microsoft.com/fwlink/?LinkId=120708).
In situations where a server is overburdened, use PXE boot referrals to direct booting clients to different PXE servers for TFTP downloads. For more information, see Managing Network Boot Programs.
Alter your physical network topology by doing one or more of the following:
- Add a PXE server closer to the client computer.
- Move the client computer closer to the PXE server.
- Repair the existing network infrastructure (in the case of high-packet loss).
- Upgrade to better cabling (Cat 5e is recommended).
- Check the condition of the switches between the client computer and the PXE server to ensure that packets are not being dropped.

Image Apply Phase

The image apply phase of the installation process involves transferring an install image from the Windows Deployment Services server to the client. This transfer occurs through either Server Message Block (SMB) or multicasting and is the most time-consuming part of the installation process.

Diagnosing Performance Problems in the Image Apply Phase

To begin, test several client computers on your network, and compare the performance with the test results outlined in the "Performance and Scalability Expectations" section in Optimizing Performance and Scalability. You can also enable logging to gather information. For more information, see the "Windows Deployment Services Client Logs" section in Logging and Tracing. If there are substantial variances between the expected results and your results, you probably have a performance blockage. To troubleshoot common blockages, ask yourself the following questions:

Do performance problems occur only at certain times of the day? This may indicate a scalability problem that is probably caused by an overused network or an overburdened server.
Do performance problems occur only for clients on a particular subnet or network location? If so, determine whether there is a network issue on that segment.
Do performance problems occur only for clients that access a particular server? If so, check the server’s performance statistics as well as the network segment that connects the clients to the server to see whether the server is overused.

Performance problems that occur across a larger group of computers generally indicate either a concurrency problem (scalability) or a blockage in the network or server. To investigate, measure the amount of time it takes to download a file (of approximately the same size as the install image) from the server to the client, in Windows PE. Or try to download the install image after it has been placed in a shared folder on the server. If the time it takes to download a large file exceeds the expectations, you should analyze the switch utilization and observe other network metrics to identify the network conditions that are impacting download times.
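The download-time measurement can be scripted. The sketch below is one possible approach, assuming Python is available on a test client on the same network segment (it is not part of a default Windows PE image); the function names are hypothetical, and the path is supplied on the command line.

```python
import sys
import time

def measure_read_throughput(path: str, chunk_size: int = 1024 * 1024):
    """Read the file at `path` to the end; return (bytes_read, elapsed_seconds)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as source:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start

def megabytes_per_second(total_bytes: int, seconds: float) -> float:
    """Convert a byte count and a duration into MB/s (1 MB = 1024 * 1024 bytes)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return total_bytes / (1024 * 1024) / seconds

if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the path of a large file on a server share as the first argument.
    size, elapsed = measure_read_throughput(sys.argv[1])
    print(f"{size} bytes in {elapsed:.1f} s = {megabytes_per_second(size, elapsed):.1f} MB/s")
```

Run the measurement at least twice and against a file that is not already cached on the client; otherwise the result reflects local caching rather than the network or the server.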

If you suspect that the server is the blockage, use the steps in the Using Performance Monitoring section later in this topic to identify the root cause of the blockage.

Addressing Performance Problems in the Image Apply Phase

Performance problems in this phase are generally caused by network congestion, or inadequate resources on the server or client. If network congestion is the issue, consider doing the following:

Creating more bandwidth on the network. This may mean upgrading your network infrastructure to support greater bandwidth and higher throughput. For example, it might mean moving from 100 Mbps to 1 Gbps, upgrading cabling, replacing hubs with routers or switches, or reducing the number of clients that can access a particular network segment simultaneously.
Adding additional Windows Deployment Services servers to the network to handle the network demand. This means segmenting network infrastructure so that smaller groups of clients are answered by each server.
Balancing the server load by adding dedicated image servers. For more information, see Storing and Replicating Images Using DFS.
Reducing image size. Because larger images mean longer installation times and greater network strain, you should consider creating images that contain minimum customization, drivers, and applications; or consider creating specialized images for each department, hardware type, or function. For more information, see the "Reducing the Size of Images" section in the Servicing Images topic.
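A back-of-the-envelope estimate can show how bandwidth, image size, and the number of simultaneous clients interact. The sketch below assumes unicast transfers that share the server's link equally; the function name and the default efficiency factor are hypothetical and should be replaced with values measured on your own network.

```python
def estimated_transfer_seconds(image_bytes: int,
                               link_bits_per_second: float,
                               concurrent_clients: int = 1,
                               efficiency: float = 0.7) -> float:
    """Rough per-client transfer time when clients share one link equally.

    efficiency is an assumed fraction of the nominal link speed that is
    actually usable after protocol overhead; it is not a measured value.
    """
    if link_bits_per_second <= 0 or concurrent_clients < 1 or not 0 < efficiency <= 1:
        raise ValueError("invalid link speed, client count, or efficiency")
    usable_bits_per_client = link_bits_per_second * efficiency / concurrent_clients
    return image_bytes * 8 / usable_bits_per_client

if __name__ == "__main__":
    image = 4 * 1024 ** 3  # hypothetical 4 GB install image
    for label, speed in (("100 Mbps", 100e6), ("1 Gbps", 1e9)):
        minutes = estimated_transfer_seconds(image, speed, concurrent_clients=10) / 60
        print(f"{label}: about {minutes:.0f} minutes per client")
```

The estimate scales linearly with image size and client count, which is why reducing image size and spreading clients across servers both appear in the list above.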

Most Windows Deployment Services server blockages occur because of inadequate bandwidth (at the network adapter), slow disk subsystems, or insufficient available physical memory. To identify the source of the blockage, use the information in the next section, Using Performance Monitoring. Typical causes of performance problems on individual client computers include the following:

Problems with the physical network connection between the client computer and the network topology
Problems with the switching equipment
A bad disk controller interface on the client computer
A bad network adapter on the client computer
Insufficient RAM on the client computer (512 MB of RAM is the minimum requirement for Windows Vista)
Poorly performing system drivers

Using Performance Monitoring

You can use Windows Reliability and Performance Monitor to diagnose performance problems with Windows Deployment Services. Note, however, that this is not a complete solution. Because most performance and scalability issues in Windows Deployment Services are network related, network analysis tools may be of greater use. Nevertheless, Windows Reliability and Performance Monitor can be a powerful and quick tool for identifying resource issues on services associated with Windows Deployment Services.

The following are the most useful counters for diagnosing Windows Deployment Services performance. To open Reliability and Performance Monitor, click Start, type Performance in the Start Search box, and then press ENTER. To add these counters, expand Monitoring Tools, click Performance Monitor, and then click the green plus sign (+) in the right pane. In Available Counters, scroll to the counter you want to add, and then click Add. Review the following information to maximize your server's performance.

Network Interface (Bytes Sent/sec)
PhysicalDisk (Avg. Disk sec/Read, Avg. Disk sec/Write, and Current Disk Queue Length). These disk counters highlight the current disk activity. The Avg. Disk sec/Read and Avg. Disk sec/Write counters should generally stay below 10 milliseconds, and the maximum should not exceed 50 milliseconds. Values outside these thresholds indicate that the disk subsystem cannot keep up with the demands that are being placed on the server. The Current Disk Queue Length counter indicates the backlog of pending input/output (I/O) requests. Ideally, this value stays at or near zero.
Process (Page Faults/sec). Page faults occur when there is not enough physical memory on the server to meet the server's demands. When this occurs, the server has to copy memory from the physical RAM to a swap file on the hard disk drive, and then make room to enable the requested memory allocation to complete. This is a very expensive operation because this swap requires a series of reads and writes on the hard disk drive, and this process must be completed before the operation that caused the fault can resume. On servers where there is not enough memory, page faults can occur frequently, which significantly reduces the amount of processor time that is available to complete any other operations. If there are significant time periods with a lot of page fault activity, you should consider adding memory to the server.
Processor (% Processor Time). You can tell from the % Processor Time counter whether there is enough processing power on the server to meet the demands being placed on it. If you see that processor utilization is high, use this counter for each individual process to determine the cause of the degraded performance. If the Windows Deployment Services server is configured to work with File Replication Service (FRS), and the Distributed File System Replication (DFSR) service is consuming a significant portion of processor time, you should consider increasing the boot configuration data (BCD) refresh interval to reduce the number of changes that FRS has to propagate between servers. If the server has multiple server roles, you may want to configure the roles so that they are better distributed across multiple servers.

A strong correlation between network utilization and disk reads (and disk throughput) indicates that the network card may be the cause of slower image deployments. In this case, if you are not concerned with disk throughput, consider upgrading the network infrastructure to support Gigabit Ethernet, or refactoring the Windows Deployment Services server infrastructure so that it is spread across multiple servers.
WDS Multicast Server (all counters). The following list describes all of the counters for multicasting.
- Active Clients. This counter shows the clients that are currently connected to a multicast session.
- Active Contents. Contents refers to the data that is being transmitted. When a client connects to a namespace, a “content” is created. The content is then removed if clients are not active in the content for 5 minutes or longer. You can have multiple contents for a single namespace if there are multiple network cards on the server.
- Active Namespaces. This counter is essentially equivalent to a multicast transmission. A namespace is the underlying object that gets created when you create a multicast transmission.
- Incoming Packets/Second (in Bytes). This counter shows the sum of all incoming data packets (per second) from all multicast sessions.
- Outgoing Packets/Second (in Bytes). This counter shows the sum of all outgoing data packets (per second) from all multicast sessions.
- Total Data Packets. This counter shows the total number of data packets sent by the multicast server.
- Total Master Client Switches. This counter shows the total number of times that the master client has been changed in a transmission. Note that the master client is the slowest client in a transmission; that is, the client that is not capable of installing any faster, whereas the other clients may be able to install at a faster rate.
- Total NACK Packets. A NACK packet is a negative acknowledgement. This counter shows the total number of NACK packets received from client computers.
- Total Repair Packets. This counter shows the total number of repair packets sent by the server. Note that the server sends repair packets in response to NACK packets. If the number in this counter is high, relative to the Total Data Packets counter, this indicates that packet loss is occurring between the clients and the server. Ideally, the ratio of total data packets to total repair packets should be greater than 100:1.
- Total Slowdown Request. Clients send slowdown requests when the server is sending data faster than the client can handle it. This is usually caused by slow disk performance on the clients, or by other resource pressure (such as insufficient memory, high CPU utilization, and so on).
WDS TFTP Server (all counters). The following list describes the two counters for TFTP.
- Active Requests. This counter shows the number of active TFTP transfers on the server.
- Transfer Rate/Second (in Bytes). This counter shows the total amount of data that the TFTP server is sending out per second.
WDS Server (all counters). The following list describes the counters for the Windows Deployment Services server.
- Active Requests. This counter shows the number of currently active requests on the Windows Deployment Services server, including remote procedure calls (RPCs) to the server and multicast requests.
- Processed/Second. This counter shows the number of requests processed in the last second.
- Requests/Second. This counter shows the number of requests received in the last second.
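The numeric thresholds mentioned in the counter descriptions above can be expressed as simple checks. The following sketch assumes you have already collected the counter values (for example, by exporting them from a data collector set); the function names and category labels are hypothetical.

```python
def classify_disk_latency(avg_disk_seconds: float) -> str:
    """Classify an Avg. Disk sec/Read or Avg. Disk sec/Write sample.

    The counters report seconds; the thresholds in the text are
    10 milliseconds (typical) and 50 milliseconds (maximum).
    """
    milliseconds = avg_disk_seconds * 1000
    if milliseconds < 10:
        return "ok"
    if milliseconds <= 50:
        return "elevated"
    return "bottleneck"

def data_to_repair_ratio(total_data_packets: int, total_repair_packets: int) -> float:
    """Ratio of Total Data Packets to Total Repair Packets."""
    if total_repair_packets == 0:
        return float("inf")  # no repairs were needed at all
    return total_data_packets / total_repair_packets

def packet_loss_suspected(total_data_packets: int, total_repair_packets: int) -> bool:
    """True when the ratio is not greater than the 100:1 guideline."""
    return data_to_repair_ratio(total_data_packets, total_repair_packets) <= 100

if __name__ == "__main__":
    print(classify_disk_latency(0.004))       # sample value in seconds
    print(packet_loss_suspected(100000, 10))  # sample counter values
```

Treat the results as a starting point for investigation rather than a verdict; a single sample outside a threshold is less meaningful than a sustained trend.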

For more information about Reliability and Performance Monitor, see http://go.microsoft.com/fwlink/?LinkID=110854.

For information about how to view these counters, see the following Microsoft TechNet articles:

Add Counters Dialog Box (http://go.microsoft.com/fwlink/?LinkId=105531)
Creating Data Collector Sets (http://go.microsoft.com/fwlink/?LinkID=55157)

Troubleshooting Performance Problems