Tuesday, September 22, 2015

How to get WDS PXE boot on Windows Server 2012 R2 to work on VMware ESXi 6

After struggling to get Windows Deployment Services (WDS) working on VMware ESXi 6, I thought I should share a simple tip.

A virtual client doing a PXE boot on ESXi 6 (or 5.5) against a default WDS installation on Windows Server 2012 R2 does not work out of the box. You have to do a little magic to get it working.

Here is what happens when a VM tries to network boot from a WDS server using the standard boot.wim that is included on the Windows Server 2012 R2 ISO/DVD:
Error while obtaining an IP address from the DHCP server

As you can see, the VM does not get an IP address from the DHCP server, although it clearly had one earlier, otherwise it could not have gotten this far. So the VM does a network boot (PXE with F12), loads the boot image, and then fails.
This happens when the VM runs on ESXi 6 (or 5.5) and uses the VMXNET3 NIC. The reason is that the VMXNET3 driver is not included in the default boot.wim boot image. If you set up your VM with an E1000 or E1000E NIC, this is not a problem, as the drivers for those NICs are included.
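To see quickly which adapter type a VM uses, you can look at its .vmx file, where each NIC normally has a line such as ethernet0.virtualDev = "vmxnet3". The following is a minimal Python sketch of that check. The function names and the sample .vmx content are made up for illustration, and it assumes the adapter type is written out explicitly in the file.

```python
import re

# Matches lines such as: ethernet0.virtualDev = "vmxnet3"
NIC_LINE = re.compile(
    r'^\s*(ethernet\d+)\.virtualDev\s*=\s*"([^"]*)"\s*$', re.IGNORECASE
)


def nic_types(vmx_text):
    """Map each adapter name (e.g. 'ethernet0') to its virtualDev value."""
    found = {}
    for line in vmx_text.splitlines():
        match = NIC_LINE.match(line)
        if match:
            found[match.group(1).lower()] = match.group(2).lower()
    return found


def needs_vmxnet3_driver(vmx_text):
    """True if any adapter in the .vmx text is a VMXNET3 NIC."""
    return "vmxnet3" in nic_types(vmx_text).values()


# Hypothetical .vmx excerpt, for illustration only.
sample = '''
displayName = "pxe-client"
ethernet0.present = "TRUE"
ethernet0.virtualDev = "vmxnet3"
ethernet1.virtualDev = "e1000e"
'''

print(nic_types(sample))
print(needs_vmxnet3_driver(sample))
```

If needs_vmxnet3_driver returns True for a VM, the boot image needs the VMXNET3 driver before that VM can get past the DHCP step.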

So how do you add the driver?
This is very simple, but it takes a few steps.
1. First you need to find a VM with VMware Tools installed. In my case, the WDS server itself is a Windows Server 2012 R2 VM.
2. Second, you need to add the VMXNET3 driver to a driver group, which you should already have created. I just copied all the drivers from C:\Windows\System32\DriverStore from my WDS server for a start. That is fine, but VMXNET3 is not included there. So right-click the Drivers folder in the WDS console and choose "Add Driver Package...":

3. Browse to C:\Program Files\Common Files\VMWare\drivers\vmxnet3, choose the vmxnet3ndis6.inf file and click Next twice:

4. Finally, add it to the driver group of your choice.

5. Then you need to add the VMXNET3 driver to the boot image. In the WDS console, right-click your boot image and click "Add Driver Packages to Image...":

6. Click Next.
7. On this step, click the "Search for package" button:

8. Near the bottom of the list you should find vmxnet3ndis6[x64]. Unselect all the others and choose only the vmxnet3ndis6 driver package. It is a bad idea to add unnecessary drivers to the boot image: adding all the drivers from this screen will cause the PXE boot process to end in a BSOD.

9. Click Next twice and you are done!

Now you can PXE boot your VM and it will get its IP address from DHCP without problems.
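If you would rather script this than click through the console, the same driver can be injected into a boot.wim offline with DISM. This is a different route from the WDS console steps above, not what I did. The Python sketch below only builds the three DISM command lines (mount, add driver, unmount with commit) and does not run them. The boot.wim path, the mount directory and the function name are assumptions, and index 2 is assumed to be the Windows Setup image, as in a stock boot.wim. An image serviced this way would still have to be added to or replaced in WDS afterwards.

```python
def dism_add_driver_commands(wim_path, mount_dir, driver_inf, index=2):
    """Build (but do not run) the DISM calls: mount, add driver, commit."""
    return [
        # Mount the chosen image index from the WIM into an empty folder.
        ["dism", "/Mount-Image", f"/ImageFile:{wim_path}",
         f"/Index:{index}", f"/MountDir:{mount_dir}"],
        # Inject the single driver package into the mounted image.
        ["dism", f"/Image:{mount_dir}", "/Add-Driver", f"/Driver:{driver_inf}"],
        # Unmount and write the change back into the WIM.
        ["dism", "/Unmount-Image", f"/MountDir:{mount_dir}", "/Commit"],
    ]


# Assumed locations: adjust to your own server.
commands = dism_add_driver_commands(
    wim_path=r"C:\RemoteInstall\Boot\x64\Images\boot.wim",
    mount_dir=r"C:\Mount",
    driver_inf=r"C:\Program Files\Common Files\VMWare\drivers\vmxnet3\vmxnet3ndis6.inf",
)

for command in commands:
    print(command)
```

Each list could be handed to subprocess.run from an elevated prompt on the server, one at a time, checking the result of each call before moving on to the next.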

Monday, April 13, 2015

DirectAccess - Connecting hangs forever

For those who struggle with the problem:

Sometimes when a Windows 8 client starts to connect to DirectAccess from the outside, you notice that the GUI shows "Connecting" and never connects.

But if you try to reach your internal network resources, you'll notice that you can indeed connect to them. How come?

There are most likely two causes for this, but first some background.
The icon you see is actually the DCA (DirectAccess Connectivity Assistant), which is built into Windows 8 and later. On Windows 7, you had to install it manually or deploy it with SCCM.

The DCA is not really needed unless you need to support OTP on Windows 7. You could do without it, but of course that makes troubleshooting the clients much harder.

Alright, so DA is working, but the DCA is not.

Let us look at what the DCA checks. As the name indicates, it is purely a cosmetic connectivity check, unless you are using OTP. One of the things the DA wizard does when you set up DA is to specify, in Step 1, what the DCA should check for.
If you look at this screen:

You will see that the DA wizard adds directaccess-WebProbeHost.yourinternaldomain.local and creates an A record in DNS for it. This record points to the same IP as your web probe address (which is NOT your NLS server).

So if your Windows 8 client hangs forever while connecting to DA, it means that one of the tests listed here is failing.

In one case, a customer of mine had shut down the DA server and the IIS server hosting the web probe site for an extended period due to internal maintenance. That meant more than 7 days, which is the default scavenging interval if you have enabled scavenging in Windows Server 2008/2012. The directaccess-WebProbeHost record has a lower TTL than this, so the A record was automatically deleted.

I manually created a static A record in DNS for directaccess-WebProbeHost to make sure that this never happens again.
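A quick way to catch a missing record is to resolve the name from an internal machine and compare the answer with the address you expect. The following is a minimal Python sketch of that idea. The function names and the 10.0.0.10 address are hypothetical, and the resolver is passed in as a parameter so the logic can be tried without touching a real DNS server.

```python
import socket


def resolve_a_record(hostname, resolver=socket.gethostbyname):
    """Return the IPv4 address for hostname, or None if the lookup fails."""
    try:
        return resolver(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


def probe_record_status(hostname, expected_ip, resolver=socket.gethostbyname):
    """Classify the record as 'ok', 'mismatch' or 'missing'."""
    address = resolve_a_record(hostname, resolver)
    if address is None:
        return "missing"
    if address != expected_ip:
        return "mismatch"
    return "ok"


def scavenged(hostname):
    """Stand-in resolver that behaves as if the A record was deleted."""
    raise socket.gaierror("name not found")


# Hypothetical address; the host name is the one the DA wizard creates.
name = "directaccess-WebProbeHost.yourinternaldomain.local"
print(probe_record_status(name, "10.0.0.10", resolver=scavenged))
print(probe_record_status(name, "10.0.0.10", resolver=lambda host: "10.0.0.10"))
```

Leaving out the resolver argument makes the functions use the machine's normal DNS lookup, so the check is only meaningful when run from somewhere that can see your internal DNS zone.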

Another reason might be that you have enabled an external load balancer (ELB) such as BIG-IP, Kemp or something similar. See the excellent blog of Richard Hicks for this scenario: http://directaccess.richardhicks.com/2014/08/12/directaccess-clients-in-connecting-state-when-using-external-load-balancer/