This topic contains information about analyzing blockages during each phase of an image installation. For more information, see Optimizing Performance and Scalability.
Analyzing Blockages in Each Phase of Installation
PXE Boot Phase
The Pre-Boot Execution Environment (PXE) boot phase encompasses the initial boot performed by the client computer. This includes obtaining an IP address lease, locating a valid Windows Deployment Services server, and downloading a network boot program (NBP) by using Trivial File Transfer Protocol (TFTP). The amount of data transferred over the network during this phase is minimal, and the end-to-end operation typically succeeds in a matter of seconds.
Because operations in this phase complete so quickly, there are few options for performance tuning. The Windows Deployment Services PXE server can handle several hundred requests per second in sustained throughput. Slight performance decreases can occur if the domain controller is overloaded or is reached across a high-latency network link. In larger environments, consider placing the Dynamic Host Configuration Protocol (DHCP) and Windows Deployment Services roles on separate physical computers.
TFTP Download Phase
The TFTP download phase of the installation process is when the boot image is downloaded to the client computer. Performance in this phase is tied directly to the following factors (in order of importance):
- Latency between the client computer and the
server (measured by the average response time between the server
and the client)
- Size of the boot image. Increasing the boot
image size increases TFTP download times and reduces reliability,
because the longer it takes to download the boot image, the more
likely it is that something could go wrong.
- TFTP block size
- Other network conditions (such as workload,
the quality of the hardware that is installed, and electromagnetic
Diagnosing TFTP Download Performance Problems
The simplest way to diagnose long download times (observed from the client computer as a progress bar below an IP address) is to look at the average response time between the client and the server it is downloading from. To do this, in Windows PE, open the Command Prompt window, type ping <server’s IP address>, and then note the average latency measured. The output will look similar to the following, where the average latency is less than 1 millisecond (which is good):
C:\Windows\system32>ping 10.197.160.93

Pinging 10.197.160.93 with 32 bytes of data:
Reply from 10.197.160.93: bytes=32 time=2ms TTL=60
Reply from 10.197.160.93: bytes=32 time<1ms TTL=60
Reply from 10.197.160.93: bytes=32 time<1ms TTL=60
Reply from 10.197.160.93: bytes=32 time<1ms TTL=60

Ping statistics for 10.197.160.93:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 2ms, Average = 0ms
High round-trip time values indicate latency on the network, which is an indicator that TFTP download performance will be poor. To improve this performance, consider doing one or more of the following:
- Use a Windows Deployment Services server that
is closer to each client.
- Reduce stress and load on the network.
- If the client connects to the server after
multiple network hops, use the output from the tracert
command to identify the latent segment, and consider rerouting TFTP
traffic to avoid the hop.
You can also diagnose TFTP download performance problems by examining a network trace of the download activity. Generally, the best practice is to obtain this trace from the client and server simultaneously to assess exactly where the blockage is occurring (server, client, or network). To do this, add a client and a third computer to a hub, start network traces from the server and the third computer, and then boot the client computer from the network.
Addressing TFTP Download Performance Problems
In the preceding example, the average latency is less than 1 millisecond, which is good. If the average latency between the client and the server is longer than 5 milliseconds, TFTP performance will be seriously degraded.
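The check described above can be automated. The following is a minimal Python sketch, not part of Windows Deployment Services, that extracts the average round-trip time from Windows ping output and compares it with the 5 millisecond guideline. The function names and the constant are illustrative assumptions, and the sample text is the statistics block from the preceding example.

```python
import re

# Guideline from this topic: above 5 ms average latency, TFTP performance
# is seriously degraded. The constant name is an assumption for illustration.
LATENCY_THRESHOLD_MS = 5


def average_latency_ms(ping_output: str) -> int:
    """Return the 'Average = Nms' value reported by the Windows ping command."""
    match = re.search(r"Average = (\d+)ms", ping_output)
    if match is None:
        raise ValueError("no average round-trip time found in ping output")
    return int(match.group(1))


def tftp_latency_is_acceptable(ping_output: str) -> bool:
    """True when the average latency does not exceed the 5 ms guideline."""
    return average_latency_ms(ping_output) <= LATENCY_THRESHOLD_MS


# Statistics block copied from the ping example earlier in this topic.
SAMPLE = """Ping statistics for 10.197.160.93:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 2ms, Average = 0ms"""

if __name__ == "__main__":
    print(average_latency_ms(SAMPLE), tftp_latency_is_acceptable(SAMPLE))
```

Because ping rounds the average to whole milliseconds, a result of 0 corresponds to the "less than 1 millisecond" case described in the text.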
You may be able to decrease the impact of latency on TFTP download times by increasing the TFTP block size. This means that more data will be sent each time, which cuts down on the number of round-trips. For instructions, see How to Modify the BCD Store Using Bcdedit.
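To see why block size matters, the latency cost of a download can be estimated by counting round trips. The following Python sketch assumes the basic lock-step TFTP behavior, in which each data block is acknowledged before the next one is sent, and it ignores transmission time and any windowing the server may apply. The image size, block sizes, and latency used in the example are hypothetical values, not Windows Deployment Services defaults.

```python
def tftp_round_trips(image_bytes: int, block_size: int) -> int:
    """Number of data blocks, and so acknowledged round trips, in a transfer."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    # A TFTP transfer ends with a block shorter than block_size, so an image
    # whose size is an exact multiple of block_size needs one extra, empty block.
    return image_bytes // block_size + 1


def latency_cost_seconds(image_bytes: int, block_size: int, rtt_ms: float) -> float:
    """Time spent waiting on round trips alone (a lower bound on download time)."""
    return tftp_round_trips(image_bytes, block_size) * rtt_ms / 1000.0


if __name__ == "__main__":
    image = 200 * 1024 * 1024          # hypothetical 200 MB boot image
    for block in (512, 4096):          # hypothetical block sizes
        trips = tftp_round_trips(image, block)
        cost = latency_cost_seconds(image, block, rtt_ms=5)
        print(f"block={block}: {trips} round trips, about {cost:.0f} s of latency")
```

In this model, the time lost to latency falls roughly in proportion to the increase in block size, which is why a larger block size helps most on high-latency links.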
Reducing the size of the boot image can also speed up TFTP downloads. To accomplish this, do the following:
- Use the tools in the Windows Automated
Installation Kit (AIK) to create a custom boot image that contains
the Windows Setup binary files and Microsoft Windows
Preinstallation Environment (Windows PE). For instructions,
Images. Ensure this image has been prepared by using
PEIMG.exe /prep. For more information, see PEImg
Command-Line Options (http://go.microsoft.com/fwlink/?LinkId=120707).
- Ensure that the Windows image (.wim) file
that contains the boot image does not contain extra space. A best
practice is to use the ImageX /export command to export your
boot image to a "clean" .wim file before adding the image to the
Windows Deployment Services server. For more information, see
ImageX Command-Line Options (http://go.microsoft.com/fwlink/?LinkId=120708).
- Ensure that the .wim file that contains the
boot image is using the maximum compression format, LZX. To check
the compression type, run ImageX /info ImageFile
<ImageNumber|ImageName>. For more information, see ImageX
Command-Line Options (http://go.microsoft.com/fwlink/?LinkId=120708).
- In situations where a server is overburdened,
use PXE boot referrals to direct booting clients to different PXE
servers for TFTP downloads. For more information, see Managing Network Boot
- Alter your physical network topology by doing
one or more of the following:
- Add a PXE server closer to the client computer.
- Move the client computer closer to the PXE
- Repair the existing network infrastructure
(in the case of high-packet loss).
- Upgrade to better cabling (Cat 5e is
- Check the condition of the switches between
the client computer and the PXE server to ensure that packets are
not being dropped.
Image Apply Phase
The image apply phase of the installation process involves transferring an install image from the Windows Deployment Services server to the client. This transfer occurs through either Server Message Block (SMB) or multicasting and is the most time-consuming part of the installation process.
Diagnosing Performance Problems in the Image Apply Phase
To begin, test several client computers on your network, and compare the performance with the test results outlined in the "Performance and Scalability Expectations" section in Optimizing Performance and Scalability. You can also enable logging to gather information. For more information, see the "Windows Deployment Services Client Logs" section in Logging and Tracing. If there are substantial variances between the expected results and your results, you probably have a performance blockage. To troubleshoot common blockages, ask yourself the following questions:
- Do performance problems occur only at
certain times of the day? This may indicate a scalability
problem that is probably caused by an overused network or an
- Do performance problems occur only for
clients on a particular subnet or network location? If so,
determine whether there is a network issue on that segment.
- Do performance problems occur only for
clients that access a particular server? If so, check the
server’s performance statistics as well as the network segment that
connects the clients to the server to see whether the server is
Performance problems that occur across a larger group of computers generally indicate either a concurrency problem (scalability) or a blockage in the network or server. To investigate, in Windows PE, measure the amount of time it takes to download a file (of approximately the same size as the install image) from the server to the client. Alternatively, place the install image in a shared folder on the server and download it from there. If the download takes longer than expected, analyze the switch utilization and observe other network metrics to identify the network conditions that are affecting download times.
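The timing measurement described above can be sketched as follows. This is an illustrative Python example only: a stock Windows PE image does not include Python, so it assumes a client environment where Python is available, and the function name is hypothetical. The demonstration copies a temporary local file, and in practice the source would be a file on a shared folder on the server.

```python
import os
import shutil
import tempfile
import time


def measure_copy(source_path: str, dest_path: str):
    """Copy a file and return (elapsed seconds, bytes copied, megabits/second)."""
    start = time.perf_counter()
    shutil.copyfile(source_path, dest_path)
    elapsed = time.perf_counter() - start
    size = os.path.getsize(dest_path)
    # Guard against a zero reading from the timer on very small files.
    mbps = (size * 8 / 1_000_000) / elapsed if elapsed > 0 else float("inf")
    return elapsed, size, mbps


if __name__ == "__main__":
    # Local stand-in for the test; replace src with a path on the server share.
    with tempfile.TemporaryDirectory() as folder:
        src = os.path.join(folder, "sample.bin")
        dst = os.path.join(folder, "copy.bin")
        with open(src, "wb") as handle:
            handle.write(b"\0" * (8 * 1024 * 1024))
        elapsed, size, mbps = measure_copy(src, dst)
        print(f"{size} bytes in {elapsed:.3f} s ({mbps:.1f} Mbit/s)")
```

Comparing the measured rate across several clients and times of day helps separate a network-wide problem from a problem on a single segment or server.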
If you suspect that the server is the blockage, use the steps in the Using Performance Monitoring section later in this chapter to identify the root cause of the blockage.
Addressing Performance Problems in the Image Apply Phase
Performance problems in this phase are generally caused by network congestion, or inadequate resources on the server or client. If network congestion is the issue, consider doing the following:
- Creating more bandwidth on the
network. This may mean upgrading your network infrastructure to
support greater bandwidth and higher throughput. For example, it
might mean moving from 100 Mbps to 1 Gbps, upgrading cabling, replacing
hubs with routers or switches, or reducing the number of clients
that can access a particular network segment simultaneously.
- Adding additional Windows Deployment
Services servers to the network to handle the network demand.
This means segmenting network infrastructure so that smaller groups
of clients are answered by each server.
- Balancing the server load by adding
dedicated image servers. For more information, see Storing and Replicating
Images Using DFS.
- Reducing image size. Because larger
images mean longer installation times and greater network strain,
you should consider creating images that contain minimum
customization, drivers, and applications; or consider creating
specialized images for each department, hardware type, or function.
For more information, see the "Reducing the Size of Images" section
in the Servicing
Most Windows Deployment Services server blockages occur because of inadequate bandwidth (at the network adapter), slow disk subsystems, or insufficient available physical memory. To identify the source of the blockage, use the information in the next section, Using Performance Monitoring. Typical causes of performance problems on individual client computers include the following:
- Problems with the physical network connection
between the client computer and the network topology
- Problems with the switching equipment
- A bad disk controller interface on the client
- A bad network adapter on the client
- Insufficient RAM on the client computer (512
MB of RAM is the minimum requirement for Windows Vista)
- Poorly performing system drivers
Using Performance Monitoring
You can use Windows Reliability and Performance Monitor to diagnose performance problems with Windows Deployment Services. Note, however, that this is not a complete solution. Because most performance and scalability issues in Windows Deployment Services are network related, network analysis tools may be of greater use. Nevertheless, Windows Reliability and Performance Monitor can be a powerful and quick tool for identifying resource issues on services associated with Windows Deployment Services.
The following are the most useful counters for diagnosing Windows Deployment Services performance. To open Reliability and Performance Monitor, click Start, type Performance in the Start Search box, and then press ENTER. To add these counters, expand Monitoring Tools, click Performance Monitor, and then click the green plus sign (+) in the right pane. In Available Counters, scroll to the counter you want to add, and then click Add. Review the following information to maximize your server's performance.
- Network Interface (Bytes Sent/sec)
- PhysicalDisk (Avg. Disk sec/Read, Avg.
Disk sec/Write, and Current Disk Queue Length). These disk
counters highlight the current disk activity. The Avg. Disk
sec/Read and the Avg. Disk sec/Write counter should generally take
less than 10 milliseconds, and the maximum should not exceed 50
milliseconds. Anything outside these thresholds indicates that
the disk subsystem is too slow to respond to the demands
that are being placed on the server. The Current Disk Queue Length
counter indicates the backlog of pending input/output (I/O)
requests. As you might expect, you do not want to see much here, if
- Process (Page Faults/sec). Page faults
occur when there is not enough physical memory on the server to
meet the server's demands. When this occurs, the server has to copy
memory from the physical RAM to a swap file on the hard disk drive,
and then make room to enable the requested memory allocation to
complete. This is a very expensive operation because this swap
requires a series of reads and writes on the hard disk drive, and
this process must be completed before the operation that caused the
fault can resume. On servers where there is not enough memory, page
faults can occur frequently, which significantly reduces the amount
of processor time that is available to complete any other
operations. If there are significant time periods with a lot of
page fault activity, you should consider adding memory to the
- Processor (% Processor Time). You can
tell from the % Processor Time counter whether there is enough
processing power on the server to meet the demands being placed on
it. If you see that processor utilization is high, use this counter
for each individual process to determine the cause of the degraded
performance. If the Windows Deployment Services server is
configured to work with File Replication Service (FRS), and the
Distributed File System Replication (DFSR) service is consuming a
significant portion of processor time, you should consider
increasing the boot configuration data (BCD) refresh interval to
reduce the number of changes that FRS has to propagate between
servers. If the server has multiple server roles, you may want to
configure the roles so that they are better distributed across
A strong correlation between network utilization and disk reads (and disk throughput) indicates that the network adapter may be what is limiting image deployment speed. In this case, if you are not concerned with disk throughput, consider upgrading the network infrastructure to support Gigabit Ethernet, or restructuring the Windows Deployment Services infrastructure so that the load is spread across multiple servers.
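One way to quantify this correlation is to export samples of the Network Interface and PhysicalDisk counters and compute a Pearson correlation coefficient. The following Python sketch shows the calculation. The sample values are hypothetical, and the 0.8 cutoff for "strong" is an assumption for illustration, not a documented threshold.

```python
import math

# Assumed cutoff for calling a correlation "strong"; not a documented value.
STRONG_CORRELATION = 0.8


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sample series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 samples")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def is_strongly_correlated(xs, ys) -> bool:
    return pearson(xs, ys) >= STRONG_CORRELATION


if __name__ == "__main__":
    # Hypothetical samples taken at the same instants (bytes per second).
    network_bytes_sent = [40e6, 80e6, 110e6, 115e6, 60e6]
    disk_read_bytes = [42e6, 83e6, 112e6, 118e6, 61e6]
    print(round(pearson(network_bytes_sent, disk_read_bytes), 3),
          is_strongly_correlated(network_bytes_sent, disk_read_bytes))
```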
- WDS Multicast Server (all counters).
The following list describes all of the counters for multicasting.
- Active Clients. This counter shows the
clients that are currently connected to a multicast session.
- Active Contents. Contents refers to the data
that is being transmitted. When a client connects to a namespace, a
“content” is created. The content is then removed if clients are
not active in the content for 5 minutes or longer. You can have
multiple contents for a single namespace if there are multiple
network cards on the server.
- Active Namespaces. This counter is
essentially equivalent to a multicast transmission. A namespace is
the underlying object that gets created when you create a multicast
- Incoming Packets/Second (in Bytes). This
counter shows the sum of all incoming data packets (per second)
from all multicast sessions.
- Outgoing Packets/Second (in Bytes). This
counter shows the sum of all outgoing data packets (per second)
from all multicast sessions.
- Total Data Packets. This counter shows the
total number of data packets sent by the multicast server.
- Total Master Client Switches. This counter
shows the total number of times that the master client has been
changed in a transmission. Note that the master client is the
slowest client in a transmission — that is, the client that is not
capable of installing any faster, whereas the other clients may be
able to install at a faster rate.
- Total NACK Packets. A NACK packet is a
negative acknowledgement. This counter shows the total number of
NACK packets received from client computers.
- Total Repair Packets. This counter shows the
total number of repair packets sent by the server. Note that the
server sends repair packets in response to NACK packets. If the
number in this counter is high, relative to the Total Data Packets
counter, this indicates that packet loss is occurring between the
clients and the server. Ideally, the ratio of total data packets to
total repair packets should be greater than 100:1.
- Total Slowdown Request. Clients send slowdown
requests when the server is sending data faster than the client can
handle it. This is usually caused by slow disk performance on the
clients, or by other resource pressure (such as insufficient
memory, high CPU utilization, and so on).
- WDS TFTP Server (all counters). The
following list describes the two counters for TFTP.
- Active Requests. This counter shows the
number of active TFTP transfers on the server.
- Transfer Rate/Second (in Bytes). This counter
shows the total amount of data that the TFTP server is sending out
- WDS Server (all counters). The
following list describes the counters for the Windows Deployment
- Active Requests. This counter shows the
number of currently active requests on the Windows Deployment
Services server, including remote procedure calls (RPCs) to the
server and multicast requests.
- Processed/Second. This counter shows the
number of requests processed in the last second.
- Requests/Second. This counter shows the
number of requests received in the last second.
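Two of the guidelines in the counter descriptions above are numeric and can be checked against exported counter samples. The following Python sketch applies the disk latency guideline (average under 10 milliseconds, maximum not above 50 milliseconds) and the multicast guideline (ratio of data packets to repair packets greater than 100:1). The function names are illustrative assumptions, and the sketch assumes that the Avg. Disk sec/Read and Avg. Disk sec/Write samples are expressed in seconds.

```python
def disk_latency_ok(samples_seconds) -> bool:
    """Check Avg. Disk sec/Read or sec/Write samples against the guideline."""
    if not samples_seconds:
        raise ValueError("at least one sample is required")
    average = sum(samples_seconds) / len(samples_seconds)
    peak = max(samples_seconds)
    # Guideline from this topic: average under 10 ms, maximum not above 50 ms.
    return average < 0.010 and peak <= 0.050


def repair_ratio_ok(total_data_packets: int, total_repair_packets: int) -> bool:
    """Check that data packets outnumber repair packets by more than 100:1."""
    if total_repair_packets == 0:
        return True  # no repairs were needed, so no packet loss was reported
    return total_data_packets / total_repair_packets > 100


if __name__ == "__main__":
    # Hypothetical counter readings for illustration.
    print(disk_latency_ok([0.004, 0.006, 0.012, 0.005]))
    print(repair_ratio_ok(total_data_packets=500_000, total_repair_packets=12_000))
```

A failing repair ratio points to packet loss between the clients and the server, while a failing disk latency check points to the server's disk subsystem.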
For more information about Reliability and Performance Monitor, see http://go.microsoft.com/fwlink/?LinkID=110854.
For information about how to view these counters, see the following Microsoft TechNet articles: