why you shouldn’t decommission exchange 2003 in the middle of the day

After reading the post title I know what you’re thinking.  I won’t try to justify the ill-advised nature of this decision.  Suffice it to say that if I only scheduled production changes for after-hours, I’d either fall hopelessly behind or never sleep.

We have these two old Exchange 2003 servers configured in a cluster.  Let’s call the hostname for this cluster OldMailServer.bruteforce.local.  We already migrated all mailboxes to our 2010 cluster (NewMailServer.bruteforce.local) many months ago.  We were fully aware that many network devices, including scanners, were still pointing to OldMailServer.bruteforce.local or its IP address (2003 IP).  I decided on the following steps to complete the decommission of Exchange 2003.

The plan:

  1. Re-point OldMailServer.bruteforce.local in DNS to 2010 IP.
  2. Assign a new temporary IP address to the 2003 cluster.
  3. Since devices are still sending mail to 2003 IP, assign it as a secondary IP on the 2010 CAS array.
  4. Add a static ARP table entry for 2003 IP in our data center switches, since it is being shared by two CAS servers in a Windows NLB cluster.
  5. Stop the 2003 cluster group (basically shut down the Exchange services).
  6. Wait a week and then uninstall Exchange 2003.
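In hindsight, a quick check after step 1 would have been cheap. Here is a minimal sketch of one, assuming Python is available on an admin box. The function name `resolves_to` and the address 10.0.0.20 (standing in for 2010 IP) are made up for illustration. The resolver is passed in as a parameter so the logic can be tried without touching real DNS.

```python
import socket


def resolves_to(hostname, expected_ip, resolver=socket.gethostbyname_ex):
    """Return True if expected_ip is among the IPv4 addresses hostname resolves to.

    resolver must behave like socket.gethostbyname_ex and return
    (canonical_name, alias_list, ip_address_list).
    """
    try:
        _name, _aliases, addresses = resolver(hostname)
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        return False
    return expected_ip in addresses


if __name__ == "__main__":
    # 10.0.0.20 is a placeholder for 2010 IP, not a real address from our network.
    print(resolves_to("OldMailServer.bruteforce.local", "10.0.0.20"))
```

This only checks the forward (A record) lookup. It says nothing about PTR records or ARP, which is where things actually went sideways.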

How did this blow up in our faces?

We completed the first 5 steps and the calls started hitting our homicidal Help Desk.  There were two problems being reported:

  1. Many users were getting username / password prompts from Outlook purporting to be from OldMailServer.bruteforce.local!!  Yes, the server that is OFF.  The only place that host name exists now is in DNS.
  2. No devices configured to send mail to 2003 IP were able to reach it.

Here we go.  It took a couple of hours but we eventually got this ironed out.

So what went wrong with our poor innocent coworkers?

The cause of the username / password prompts from OldMailServer appears to have been some sort of reverse DNS lookup done by Outlook.  When I pointed OldMailServer.bruteforce.local to 2010 IP, I foolishly allowed it to update the PTR record.  So 2010 IP ended up with two PTR records.  If one were conversing with NewMailServer, and one were so inclined as to do a reverse DNS lookup on its IP, the reply might be NewMailServer or it might be OldMailServer.  I’m not exactly sure what Outlook was doing here, but the solution was to kill the PTR record pointing to OldMailServer.  Can someone explain this to me?
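For what it’s worth, spotting an IP with more than one PTR record is easy to script once you have the records in hand. This is a rough sketch, assuming you have already exported the reverse zone into (IP, host name) pairs by whatever means you prefer. The function name and the 10.0.0.x addresses are made up for illustration.

```python
from collections import defaultdict


def ips_with_multiple_ptrs(ptr_records):
    """Return {ip: sorted host names} for every IP that has more than one PTR name.

    ptr_records is an iterable of (ip, hostname) pairs. Names are compared
    case-insensitively and without a trailing dot.
    """
    names_by_ip = defaultdict(set)
    for ip, hostname in ptr_records:
        names_by_ip[ip].add(hostname.rstrip(".").lower())
    return {ip: sorted(names) for ip, names in names_by_ip.items() if len(names) > 1}


if __name__ == "__main__":
    # Placeholder data: 10.0.0.20 stands in for 2010 IP.
    records = [
        ("10.0.0.20", "NewMailServer.bruteforce.local."),
        ("10.0.0.20", "OldMailServer.bruteforce.local."),
        ("10.0.0.21", "SomeOtherServer.bruteforce.local."),
    ]
    print(ips_with_multiple_ptrs(records))
```

Anything this returns is a candidate for the same confusion we hit: a reverse lookup on that IP can come back with either name.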

So what went wrong with the scanners?

Now why was nothing able to hit NewMailServer using the 2003 IP, you ask?  If you remember, we had configured this as a secondary IP on the 2010 CAS array.  The CAS array is configured to listen and accept traffic on all IPs.  OK, great.  The problem was with step 4 in our plan.  The dreaded static ARP entry.  I noticed that when I added the secondary IP in Windows Network Load Balancing Manager, it showed a new MAC address for it (MAC 2).  I asked our Network Engineer to edit the static ARP entry for 2003 IP to point to MAC 2.  This had the effect of making 2003 IP virtually unreachable.

Why I now hate Windows Network Load Balancing (more than before)

I found a server in the data center which could successfully ping 2003 IP.  I checked its ARP table and observed that 2003 IP was associated with MAC 1.  This is the MAC address of 2010 IP, the first IP configured in NLB.  Huh?  OK, let’s change the static ARP entry in the switches to point 2003 IP to MAC 1.  Now both 2003 IP and 2010 IP have static ARP entries pointing to MAC 1.  Guess what?  It works.  Thank you Microsoft.  I should send a handwritten letter to the Windows Server product manager to thank him for this useless new MAC address created for NO REASON.
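The check I did by eyeballing the ARP table can be scripted. Below is a small sketch that parses the text printed by `arp -a` on a Windows box and tells you whether two IPs map to the same MAC. The function names, the sample IPs, and the sample MAC addresses are all invented for illustration, and the parser assumes the usual three-column layout (Internet Address, Physical Address, Type).

```python
import re

# Matches rows such as "  10.0.0.20   00-11-22-33-44-55   dynamic".
ARP_ROW = re.compile(
    r"^\s*(\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"([0-9A-Fa-f]{2}(?:-[0-9A-Fa-f]{2}){5})\s+(\w+)\s*$"
)


def parse_arp_table(text):
    """Return {ip: mac} from Windows-style `arp -a` output, with MACs lowercased."""
    table = {}
    for line in text.splitlines():
        match = ARP_ROW.match(line)
        if match:
            ip, mac, _entry_type = match.groups()
            table[ip] = mac.lower()
    return table


def share_mac(table, ip_a, ip_b):
    """True only if both IPs are present in the table and map to the same MAC."""
    return ip_a in table and ip_b in table and table[ip_a] == table[ip_b]


# Invented sample: 10.0.0.10 plays 2003 IP and 10.0.0.20 plays 2010 IP.
SAMPLE = """
Interface: 10.0.0.50 --- 0xb
  Internet Address      Physical Address      Type
  10.0.0.10             00-11-22-33-44-55     dynamic
  10.0.0.20             00-11-22-33-44-55     dynamic
  10.0.0.1              AA-BB-CC-DD-EE-FF     dynamic
"""

if __name__ == "__main__":
    table = parse_arp_table(SAMPLE)
    print(share_mac(table, "10.0.0.10", "10.0.0.20"))
```

Run against a host that can actually reach the address, this shows which MAC the secondary IP really answers on before anyone touches a static ARP entry.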

What did I learn?

Two things, really:

  1. If you add additional IPs to Windows NLB, they just use the original MAC.  The new MAC addresses created for them are meaningless.  Software developers call this a bug.
  2. Don’t turn off your old Exchange server in the middle of the day.  Strike that second one.  I looked at my calendar and you wouldn’t believe what I’ve scheduled for this week.
