5 November 2013

Adding another Front End Server to the Enterprise Pool

Just recently I added another 2013 Front End Server to an Enterprise Pool and was subsequently presented with some interesting issues.

After adding the new Front End to the Topology and running setup, I moved on to migrating some test users and saw the following error:



Distributed Component Object Model (DCOM) operation SetMoveResourceData failed??

I checked replication, and all Lync servers were replicating.
Checking the servers in the Lync Topology, I found that my new Front End server wasn't in showroom condition, see below:



The green check on the 3rd line shows replication is OK, but the status is not OK.
So now I am wondering which services are not running. Off to Front End number 3 to check the Lync services. As suspected, there are services that aren't running.



Turns out both the Call Park Service and the Response Group Service won't start. Perhaps there is a hint as to why in the Event Log...



...and I found this clue. It seems that FE3 is unable to connect to the Back End SQL database. Note also the little inner exception reference (I saw the same thing in the DCOM error).
"Login failed for user" was the real gem.

Checking with the SQL admin, I got confirmation that FE3 was trying to connect to the Back End SQL database and failing the login about once every second (SQL guy not happy).

OK, so how come FE3 doesn't have permissions on the Back End SQL server?
It seems that my temporary SQL Admin rights were not in place when I added FE3 to the Topology, and publishing the Topology is when the SQL permissions are set. Interestingly, there were no errors when I published the Topology.

How to fix these permissions? Either get the unhappy SQL admin to re-publish the Topology with his SQL Admin account, or get temporary SQL Admin permissions and DIY.

...yes, and voila



4 November 2013

OAuth certificate missing

Issue
Whilst deploying Lync Enterprise Edition with 3 Front End Servers I came across an interesting issue. FE 1 was fine, but when I fired up FE 2 and got to the certificate wizard the OAuth certificate was missing.
One thing you will notice if there is no OAuth certificate is that the Lync Front End Service won't start. OK, so where is the cert?

Found a good blog explaining the purpose of OAuth here (thanks Doug)
So the first thing was to see if the Front End Servers were replicating, and indeed they were, BUT no OAuth.

Checking the Cert Manager through MMC shows that the cert isn't in the personal store. Adding it there manually didn't help me much either...

Seems that it needs to be put there by the replication process.

I decided to move along (against my best judgement and the clock) and add the default cert to FE 2 and then come back to OAuth. I re-ran setup Step 1 and Step 2 and rebooted the server, and after that there was still no OAuth.

Retracing my steps I noticed that the internal DNS records had not been added yet.

Resolution
You must add a DNS record for the Front End Pool FQDN with all the individual Front End Server IP addresses. Ensure that when you run nslookup against the pool FQDN, all the Front End IPs are returned. If an IP is missing from DNS you won't get the OAuth certificate.
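
The nslookup check can also be scripted. The following is only a rough Python sketch, not part of the original fix: the function names `resolve_ipv4` and `missing_pool_ips` are made up for illustration, and it assumes the pool is published as IPv4 A records. Note that `socket.getaddrinfo` uses the operating system resolver, so a hosts file entry or local cache can mask what the DNS server actually holds.

```python
import socket


def resolve_ipv4(fqdn):
    """Return the set of IPv4 addresses the system resolver returns for fqdn."""
    infos = socket.getaddrinfo(fqdn, None, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, (address, port)).
    return {info[4][0] for info in infos}


def missing_pool_ips(resolved, expected):
    """Return the expected Front End IPs that are absent from the resolved set."""
    return sorted(set(expected) - set(resolved))
```

On a machine that uses the internal DNS servers, you would call something like `missing_pool_ips(resolve_ipv4("pool01.example.local"), ["10.0.0.11", "10.0.0.12", "10.0.0.13"])`, where the pool name and addresses are placeholders for your own. An empty list means every Front End IP is returned for the pool FQDN.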

Below is an error I got in the event log.

The replication of certificates from the central management store to ...2013, Replica Replicator Agent will continuously attempt to retry the replication. While this condition persists, the certificates on the local machine will not be updated.

Object reference not set to an instance of an object.
at Microsoft.Rtc.Management.Common.Certificates.CertUtils.GetKeyFileName

1 November 2013

Edge Server won't Replicate

So "Lync Rescue" continues to keep me busy...

Issue
Got asked to look at why a Lync Edge Server won't replicate.

Cause
It seems that the external IP addresses of the Edge Server are resolving to the Edge Server's internal FQDN.

Resolution
Correct DNS: the only DNS entry for the Edge Server should be the internal-facing IP address resolving to the Edge Server's FQDN.
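
If you have an export of the A records from the internal zone, a quick filter can flag the offending entries. This is a minimal Python sketch under assumptions: the function name `stray_edge_records` is hypothetical, and the records are assumed to be available as simple (name, IP) pairs. How you obtain them from your DNS server is up to you.

```python
def stray_edge_records(records, edge_fqdn, internal_ip):
    """Return (name, ip) pairs where the Edge FQDN points at anything
    other than the internal-facing IP address."""
    # Compare names case-insensitively and ignore a trailing dot.
    edge = edge_fqdn.rstrip(".").lower()
    return [
        (name, ip)
        for name, ip in records
        if name.rstrip(".").lower() == edge and ip != internal_ip
    ]
```

Any pair returned is a record that should be removed or corrected, so that only the internal-facing IP remains for the Edge Server's FQDN.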