Occasionally Windows HPC Server 2008 will start ignoring PXE boot requests, even from 'authorized' compute nodes. I have seen this when the PXE Boot option is set to respond only to known clients.
When this happens, the NIC on the compute node fails to obtain a DHCP address or PXE boot image. It then times out and the node continues through the boot order you've set in the BIOS, which causes the deployment to fail.
This generally happens immediately after importing nodes via XML.
The workarounds:
1. Reboot the Head Node. Simple but disruptive. This fixes the issue, for now.
2. Or, restart the HPC SDM Store service. I recommend doing this via Services.msc: highlight the HPC SDM Store service and click Restart. This will stop and restart any dependent services as well.
Now your authorized Compute Nodes should PXE boot successfully.
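If you would rather script workaround 2 than click through Services.msc, here is a minimal sketch of one way to do it. It is not an official procedure. It assumes the short service name is HpcSdm (check yours with sc query or in the service's Properties dialog), and the function names are my own. It defaults to a dry run that only prints the commands.

```python
import subprocess

# Assumption: "HpcSdm" is the short service name behind the "HPC SDM Store"
# entry in Services.msc. Confirm it on your head node before relying on it.
SDM_SERVICE = "HpcSdm"


def build_restart_commands(service_name):
    """Return the stop and start commands for a Windows service."""
    return [
        # /y answers "yes" to the prompt about stopping dependent services.
        ["net", "stop", service_name, "/y"],
        ["net", "start", service_name],
    ]


def restart_service(service_name=SDM_SERVICE, dry_run=True):
    """Print the commands, and run them as well when dry_run is False."""
    for command in build_restart_commands(service_name):
        print(" ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError if net.exe reports a failure.
            subprocess.run(command, check=True)
```

To actually restart the service, call restart_service(dry_run=False) from an elevated prompt on the Head Node. Unlike the Restart button in Services.msc, net start only starts the named service, so any dependent services that net stop took down may need to be started again by hand.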
I have also seen cases where the nodes are stuck in a loop, 'Waiting for Authorization' from the Head Node. This happens when the PXE Boot option is set to respond to all hosts rather than only known hosts. The workaround in this case is the same: restart the HPC SDM Store service.
Tuesday, October 20, 2009