We recently had an issue with a customer running Lync Server 2013 on-premises and Office 365 for unified messaging (voicemail and auto attendants). One morning we got a call that UM was broken…
Here are some symptoms:
As I researched these further, the various results from Dr. Google showed that it could be any number of things. One thing was clear: secure communication couldn’t happen between Lync and Exchange Online.
The Lync Front End server threw this error:
The Lync Edge Server (the one that actually federates and communicates to Exchange Online) threw this error:
Hmm. Password hash maybe? Well, we do run Directory Sync and Password Sync, so I wondered if something there freaked out.
As we researched more and walked through the timing of the issue, something started to make more sense. We were testing a migration strategy with a third party. We had turned Password Sync (not Directory Sync) off for a period of time, and we had turned it back on at about the same time the problem started. I thought there might be a connection.
Sure enough, the problem started about the same time we turned on Password Sync. We turned it back on at approximately 6:30 p.m. Central time. It turns out that the UM integration broke somewhere between 6:18 p.m. and 6:21 p.m. The timing was right, but it ended up being a red herring.
I looked through those messages a little more. Did you notice the “date skewed” there? Also, notice the event log on the Edge – EventID 14619. The issue was time.
The reason UM integration broke somewhere between 6:18 p.m. and 6:21 p.m. is that the Edge Server rebooted. When it rebooted, it lost its mind. It came up with -0600 as its time zone (which is correct), but the time was five hours off. We’d rebooted that server dozens of times, so why did it choose this particular day to go stupid? Who knows. Yay, technology!
When I logged on to the Lync Edge, it was 9:31 p.m. Central time, but the Edge thought it was 4:31 p.m.
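To put a number on that, here is a minimal Python sketch of the skew check. The calendar date is a placeholder (only the clock times come from this incident), and the five-minute tolerance is an assumption for illustration, not a documented Lync or Exchange Online limit.

```python
from datetime import datetime, timedelta

def clock_skew(local: datetime, reference: datetime) -> timedelta:
    """Return how far the local clock is from the reference (negative = behind)."""
    return local - reference

def is_skewed(local: datetime, reference: datetime,
              tolerance: timedelta = timedelta(minutes=5)) -> bool:
    """True when the absolute skew exceeds the tolerance.

    The 5-minute default is an assumed value for illustration only.
    """
    return abs(clock_skew(local, reference)) > tolerance

# Placeholder date; the times are the ones observed on the Edge.
reference_time = datetime(2015, 1, 1, 21, 31)  # actual time: 9:31 p.m. Central
edge_time = datetime(2015, 1, 1, 16, 31)       # what the Edge reported: 4:31 p.m.

skew = clock_skew(edge_time, reference_time)
print("Edge is off by", abs(skew), "- skewed:", is_skewed(edge_time, reference_time))
```

A five-hour gap is far beyond any reasonable tolerance, so anything that validates timestamps on a secured connection would be expected to reject it.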
I reset NTP and re-synchronized time. I then restarted the Lync Edge “Access Edge” Service so that federation with Exchange Online could come back online.
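I didn't write down the exact commands, so treat the following as a rough sketch of one plausible sequence rather than a record of what I ran. It assumes a Windows host with `w32tm` and `net` available, and it assumes the Access Edge service is registered as `RTCSRV`; check the service name on your own Edge before using it. The helper names (`resync_commands`, `run_all`) are made up for this example, and it defaults to a dry run that only prints the commands.

```python
import subprocess

def resync_commands(service_name: str = "RTCSRV") -> list[list[str]]:
    """Build the commands to re-sync time and bounce the Access Edge service.

    service_name is an assumption; confirm it on the Edge server first.
    """
    return [
        ["w32tm", "/resync", "/rediscover"],  # force a re-sync against the time source
        ["w32tm", "/query", "/status"],       # show the resulting sync status
        ["net", "stop", service_name, "/y"],  # stop the service (and dependents)
        ["net", "start", service_name],       # start it so federation can reconnect
    ]

def run_all(commands: list[list[str]], dry_run: bool = True) -> None:
    """Print each command, and execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if a command fails
            subprocess.run(cmd, check=True)

if __name__ == "__main__":
    run_all(resync_commands())  # dry run: prints the commands only
```

Run it from an elevated prompt with `dry_run=False` only after confirming that the printed commands match your environment.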
Everything now correctly showed 9:33 p.m. as the time. You can see that clearly in the event logs as well. Notice the time “jump” in the logs between 4:21:04 p.m. and 9:33:25 p.m.
Now everything works perfectly. Lync On-prem to Exchange Online UM integration is back to normal.
So we did have a password hash issue, but it wasn’t at all what we thought. Because of the date skew on the Edge, our on-premises Lync would not securely federate with Exchange Online (“exap.um”). The routing was all fine, and the initial communication happened but immediately failed, which is why the missed call email was delivered.