Occasionally, Windows HPC Server 2008 will start ignoring PXE boot requests, even from 'authorized' compute nodes. This happens when your PXE Boot option is set to respond only to known clients.
When this happens, the compute node's NIC fails to obtain a DHCP address or PXE boot image, times out, and continues through the boot order you've set in the BIOS. This causes the deployment to fail.
This generally happens immediately after importing nodes via XML.
The workarounds:
1. Reboot the Head Node. Simple but disruptive. This fixes the issue, for now.
2. OR, restart the HPC SDM Store service. I recommend doing this via Services.msc: highlight the HPC SDM Store service and click Restart. This will stop and restart any dependent services as well.
Now your authorized Compute Nodes should PXE boot successfully.
I have also seen cases where the nodes are stuck in a loop, 'Waiting for Authorization' from the Head Node. This happens in cases where the PXE Boot option is set to all hosts, not known hosts. The workaround in this case is the same. Bounce the SDM Store service.
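The service restart in workaround 2 can also be done from an elevated PowerShell prompt. A sketch, assuming the service's display name is 'HPC SDM Store Service' (verify the exact name in Services.msc on your Head Node); -Force restarts dependent services as well:

```shell
# Restart the SDM Store service and anything that depends on it
Restart-Service -DisplayName "HPC SDM Store Service" -Force
```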
Tuesday, October 20, 2009
Monday, September 28, 2009
Multicasting Errors
Occasionally, I see image deployments fail to Multicast, and revert to Unicasting. I have seen this behavior appear after upgrading Server 2008 from SP1 to SP2, but I have also seen it without the upgrade.
Troubleshooting the problem
The server provisioning log shows only the generic error "Multicast Failed with Exit Code 1" before Unicasting of the image begins via Robocopy. This doesn't tell you much. If you log into the Compute Node Console, you can watch the Multicast copy (or failure) happen. You can also see it in your deployment logs, found in \SpoolDir\DeploymentLogs\
The deployment log should show the Multicasting happening right after your successful Diskpart script. It should look like this:
09/16/09 15:21:33: COMMAND: CcpImg.exe -ipaddress:192.168.0.1 -namespace:"YOUR-SERVERNAME-CCP" -user:"*******" -filesrc:"Images\baseoobe.wim" -filedst:"%INSTALLDRIVE%\baseoobe.wim" -password:"*******"
Occasionally, the WDS Multicast Namespace will get randomly deleted. When this happens, the client shows this error, with usage details:
09/16/09 15:21:31: **** Command execution finished, sending result ****
09/16/09 15:21:32: **** Result sent to server ****
09/16/09 15:21:33: COMMAND: CcpImg.exe -ipaddress:192.168.0.1 -namespace:"" -user:"*******" -filesrc:"Images\baseoobe.wim" -filedst:"%INSTALLDRIVE%\baseoobe.wim" -password:"*******"
usage:
CcpImg -unattendSrc: -unattendDest: -user: -password:
CcpImg -wimSrc: -extractDir:
CcpImg -ipAddress: -fileSrc: -fileDst: -nameSpace: -user: -password:
09/16/09 15:21:34: **** Command execution finished, sending result ****
09/16/09 15:21:34: **** Result sent to server ****
09/16/09 15:21:39: COMMAND: robocopy "Z:\Images" "C:\" "baseoobe.wim" /R:5 /W:5
NOTE that the namespace is blank, represented by two double quotes.
The workaround - Reinstall WDS.
This does not require a reinstall of HPC.
1. From an administrative command prompt, run this command:
Servermanagercmd -remove WDS
2. Reboot the Head Node.
3. Log in again, launch another administrative command prompt, and run:
Servermanagercmd -install WDS-Transport
4. Reboot the Head Node.
Your multicasting issues should be fixed.
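To confirm the role state before and after the reinstall, you can query Server Manager from the same administrative command prompt (a quick sanity check; output formatting varies by OS version):

```shell
rem List installed roles/features and filter for the WDS entries
Servermanagercmd -query | findstr /i "WDS"
```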
Tuesday, July 21, 2009
HPC 2008 SP1 - Installation Caveats
SP1 is here
Windows HPC 2008 SP1 (version 2.1.1703) shipped on July 7, 2009. I have installed it on several HPC clusters and have the following tips for anyone considering installing the patch.
Download the patch here.
Close all HPC management components and run the patch on your Head Node. You will be warned that the system may require a reboot. This is not entirely true. The system WILL REBOOT without any warning upon applying the patch.
To deploy the latest version of HPC bits to the Compute Nodes you must re-deploy them. For most installs, this means simply re-imaging them.
To verify that you are running a consistent HPC version on your cluster, use the Column Chooser in Node Management to add 'version' to your view. This is a sortable column.
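The same check can be attempted from HPC PowerShell. This is a sketch only; confirm the property names your snap-in actually exposes with Get-HpcNode | Get-Member before relying on it:

```shell
# List each node with its reported HPC version, sorted for easy comparison.
# Property names (NetBiosName, Version) are assumptions - verify with Get-Member.
Get-HpcNode | Sort-Object Version | Format-Table NetBiosName, Version
```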
Note: If you are monitoring your HPC cluster with System Center Operations Manager 2007, there is an upgraded management pack available as well. Download here.
Wednesday, July 8, 2009
Customize your HPC Deployment Boot Image
Customizing driver support for Boot and Install images in HPC is quite easy. You just associate the drivers with an Image and they become available automatically. But the method used by the HPC tools to load the drivers in the Boot image tends to load them late in the boot process, via the drvload command.
What if your driver is not compatible with drvload?
I have come across instances where loading INF files via drvload prompts for a reboot to take effect. This breaks your HPC node deployment. You still have a need to get your chipset or network drivers loaded. What do you do?
Customize your Boot.WIM file manually
Using tools included in the Windows Automated Installation Kit (WAIK), you can inject drivers into your Boot.wim. Install WAIK and add its install directory to your system path.
NOTE: This customization must be done to the Boot.WIM that is provided by HPC after your Head Node is built and configured. You can't copy the Boot.WIM file from one Head Node to another, as there are network calls in each that are hard coded to call home to the Head Node that created it.
Follow the instructions included with WAIK to mount your boot.wim image.
Copy boot.wim from your data\boot\x64 share. Place it in a temp directory on a machine with WAIK installed. Create a new directory in the same folder with the name 'mount'. Then mount the image for Read/Write.
Imagex /mountrw boot.wim 1 mount
Inject drivers using PEIMG.
peimg /inf=C:\Temp\Drivers\*.inf mount\windows (substitute your actual driver location path).
Dismount the image and commit changes.
Imagex /unmount /commit mount
Copy the image back to your head node deployment share, overwriting boot.wim. Deploy your compute nodes.
Now your custom drivers will load along with Windows and your deployment will no longer be held up by reboot prompts.
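Put together, the whole injection sequence looks something like this batch sketch. Paths are examples only; it assumes the WAIK tools (imagex, peimg) are on your PATH and that C:\Temp\Drivers holds your INF driver packages:

```shell
@echo off
rem Sketch of the boot.wim driver-injection steps above.

cd /d C:\Temp
mkdir mount

rem 1. Mount the first image in boot.wim for read/write
imagex /mountrw boot.wim 1 mount

rem 2. Inject the drivers into the offline image
peimg /inf=C:\Temp\Drivers\*.inf mount\windows

rem 3. Commit changes and unmount
imagex /unmount /commit mount
```

Then copy the committed boot.wim back to the Head Node deployment share as described above.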
Tuesday, June 23, 2009
Speed up your Multicast Deployment
Have you noticed that HPC Server 2008, in the default configuration, uses less than 10% of the bandwidth (on a Gigabit NIC) when sending multicast images on the private network? Just look at the network utilization in Perfmon during a deployment to see what I mean.
Punch it up
Here is an undocumented tweak that can increase the network utilization and speed deployment.
Registry Disclaimer:
If you edit your registry without backing it up, you could cause worldwide famine, gaps in the space/time continuum and potentially catch an STD.
On the Head Node, edit this Registry value:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\WDSServer\Providers\WDSMC\Profiles\Custom\TpCacheSize
Change it FROM 1190 TO 11190 (Decimal).
Then restart the WDS Server service.
You may need to experiment with other values between these numbers to prevent swamping your private network.
This tweak is undocumented and unsupported, so your mileage may vary. But my 3GB Compute Node WIM Image now multicasts in under 1 minute. That cut 10 minutes off my total deployment time!
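If you prefer the command line, the same change can be made with reg.exe. A sketch; back up the key first, and note it assumes TpCacheSize lives as a value under the Profiles\Custom key shown above:

```shell
rem Set TpCacheSize to 11190 (decimal) on the Head Node
reg add "HKLM\SYSTEM\CurrentControlSet\Services\WDSServer\Providers\WDSMC\Profiles\Custom" /v TpCacheSize /t REG_DWORD /d 11190 /f

rem Restart the WDS service so the new cache size takes effect
net stop WDSServer
net start WDSServer
```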
Nodes Stuck in Draining
Stuck Node?
Occasionally when taking nodes offline, they will go into a Draining state and stay there. No amount of Canceling Operations will help. Here is what to do when this happens.
Get the Hotfix
See this Microsoft KB. Apply the patch to your Head Node. NOTE: Some users have reported forced reboots (without warning) of the Head Node when applying this patch. Apply it only when you can afford a reboot.
Force the Node Offline with PowerShell
PS> Set-HPCNodeState -force -state "offline" -name "Cluster1-Node08"
To force multiple nodes offline in one command, use a wildcard "*" character, if your naming convention allows.
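For example, using the node naming convention from this post:

```shell
# Force every node matching the pattern offline in one command
Set-HpcNodeState -Force -State Offline -Name "Cluster1-Node*"
```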
If that fails
You can also delete the node with PowerShell. To delete the node:
PS> Remove-HpcNode -Name "Cluster1-Node08"
Then wait for it to check in again (as Unknown) and assign it the normal template. It shouldn't force a re-image of the node.
Sunday, June 21, 2009
Internode Connectivity Diagnostic Failures
Welcome to the inaugural posting of HPCMonkey! I hope these bits of experience prove helpful in managing your Windows HPC System.
Host Name Management
Did you know that in the current version of Microsoft Windows HPC Server 2008, all Host Name resolution is managed via the Hosts file? Really. Take a look on one of your Head Nodes or Compute Nodes. Look in C:\Windows\System32\Drivers\etc\Hosts. Open with Notepad or Wordpad.
The Implications
When the Head Node (HN) needs to communicate with a Compute Node (CN), it refers to its hosts file first, rather than using your internal DNS server, to look up the IP address. Generally this works fine, as the HN keeps a fairly current copy of the Hosts file. In the case where a CN needs to communicate with another CN, it too will refer to its own hosts file, rather than your internal DNS server. If this file is outdated, communication failures will occur, even if your DNS is up to date.
The Hard Learned Lesson
This hosts file is only updated about 10 minutes after all Provisioning activities are completed. So, if you are in the middle of provisioning say 100 nodes and 50 are complete, don't bother trying any diagnostics such as Internode Connectivity or MPI Ping-Pong. The tests will fail.
How to avoid this quirk in the future
Wait. Wait until all provisioning activities are complete, with nodes either going into an offline (successful deployment) state or into Unknown (failed deployment) state. Then wait another 10 minutes for the updated hosts file to be propagated to all CNs. Then you can start your diagnostic tests.
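One quick way to sanity-check that the updated file actually reached the Compute Nodes is to grep the hosts file cluster-wide with clusrun. A sketch; it assumes clusrun runs against all nodes by default on your cluster, and "CN-NAME" is a placeholder for one of your own node names:

```shell
rem Show each node's hosts entry for a given Compute Node name
clusrun findstr /i "CN-NAME" C:\Windows\System32\drivers\etc\hosts
```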
What will the future hold for HPC?
One would hope that a more robust and responsive hostname management system will be put into place, such as enabling DNS Services on the Head Node and allowing it to manage all hostname resolution within the cluster.