11 December 2013

Can't call UM - Event ID 44022

Issue

Unable to call UM

Evidence

Every time a call is made to the UM server, the following error is logged on the Lync Front End server:
Log Name:      Lync Server
Source:        LS Exchange Unified Messaging Routing
Date:          10/12/2013 09:27:05 PM
Event ID:      44022
Task Category: (1040)
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      lyncFE.lynclab.local
Description:
An attempt to route to an Exchange UM server failed.
The attempt failed with response code 504: exchum.lynclab.local.
Request Target: [DefaultUM@exchum.company.local], Call Id: [9f9a3df33d2544a9a22ff2c17feedcb5].
Failure occurrences: 4, since 10/12/2013 08:36:55 PM.
Cause: An attempt to route to an Exchange UM server failed because the UM server was unable to process the request or did not respond within the allotted time.
Resolution:
Check this server is correctly configured to point to the appropriate Exchange UM server. Also check whether the Exchange UM server is up and whether it in turn is also properly configured.

Solution

Make sure the UM dial plan is configured to use TLS (and not TCP).
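A minimal sketch of how that setting could be checked from the Exchange Management Shell on the UM server. The dial plan name DefaultUM is taken from the request target in the event above and is only an assumption about your environment; the line that changes configuration is commented out on purpose.

```powershell
# List each UM dial plan with its VoIP security mode.
# "Unsecured" means SIP over TCP; "SIPSecured" and "Secured" use TLS.
Get-UMDialPlan | Format-List Name, VoIPSecurity

# Assumed dial plan name; uncomment to switch the dial plan to TLS signalling.
# Set-UMDialPlan -Identity "DefaultUM" -VoIPSecurity SIPSecured
```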

5 November 2013

Adding another Front End Server to the Enterprise Pool

Just recently I added another 2013 Front End Server to an Enterprise Pool and was subsequently presented with some interesting issues.

After adding the new Front End to the Topology and running the setup I moved on to migrating some test users and saw the following error:



Distributed Component Object Model (DCOM) operation SetMoveResourceData failed??

I checked the replication and all Lync servers were replicating.
Checking the Lync Topology services, I found that my new Front End server wasn't in showroom condition, see below:



The green check on the 3rd line shows replication OK but status not OK.
So now I am wondering which services are not running; off to Front End number 3 to check the Lync services. As suspected, there are services that aren't running.



Turns out both the Call Park Service and the Response Group Service won't start. Perhaps there is a hint as to why in the Event Log..



..and I found this clue. Seems that FE3 is unable to connect to the backend SQL. Also note the little inner exception reference (I saw that in the DCOM error too).
"Login failed for user" was the real gem.

Checking with the SQL admin I got confirmation that FE3 was trying to connect to the Backend SQL database and failing the login about once every second (SQL guy not happy).

OK, so how come FE3 doesn't have the permissions on the Backend SQL server?
Seems that my temporary SQL Admin rights were not in place when I added FE3 to the Topology, and publishing the Topology is when the SQL permissions are set. Interestingly, there were no errors when I published the Topology.

How to fix these permissions? Either get the unhappy SQL admin to re-publish the Topology with his SQL Admin account, or get temporary SQL Admin permissions and DIY.

...yes, and voila
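
The two checks used in this post (replication, then services) can be sketched in the Lync Server Management Shell roughly as follows. The server name fe3.lynclab.local is a placeholder I have assumed for the new Front End.

```powershell
# Replication status for every replica; UpToDate should be True for all of them.
Get-CsManagementStoreReplicationStatus |
    Format-Table ReplicaFqdn, UpToDate, LastStatusReport -AutoSize

# Assumed FQDN of the new Front End; lists the Lync services that are not running.
$frontEnd = "fe3.lynclab.local"
Get-CsWindowsService -ComputerName $frontEnd |
    Where-Object { $_.Status -ne "Running" } |
    Format-Table Name, Status -AutoSize
```

Replication reporting True while services are stopped is exactly the "replication OK but status not OK" situation described above.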



4 November 2013

OAuth certificate missing

Issue
Whilst deploying Lync Enterprise Edition with 3 Front End Servers I came across an interesting issue. FE 1 was fine, but when I fired up FE 2 and got to the certificate wizard the OAuth certificate was missing.
One thing you will notice if there is no OAuth certificate is that the Lync Front End Service won't start. OK, so where is the cert?

Found a good blog explaining the purpose of OAuth here (thanks Doug)
So the first thing was to see if the Front End Servers were replicating, and indeed they were, BUT no OAuth.

Checking the Cert Manager through MMC shows that the cert isn't in the personal store. Adding it there manually didn't help me much either...

Seems that it needs to be put there by the replication process.

I decided to move along (against my better judgement and the clock) and add the default cert to FE 2 and then come back to OAuth. Re-ran setup Step 1 and Step 2, rebooted the server, and after that still no OAuth.

Retracing my steps I noticed that the internal DNS records had not been added yet.

Resolution
You must add the Front End Pool FQDN to DNS with all the individual Front End Server IP addresses. Ensure that when you run nslookup all the Front End IPs are found. If an IP is missing from DNS you won't get the OAuth certificate.

Below is an error I got in the event log.

The replication of certificates from the central management store to ...2013, Replica Replicator Agent will continuously attempt to retry the replication. While this condition persists, the certificates on the local machine will not be updated

Object reference not set to an instance of an object.
at Microsoft.Rtc.Management.Common.Certificates.CertUtils.GetKeyFileName
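
As a rough sketch, the DNS check from the Resolution could be scripted like this on a Front End. The pool FQDN and the three IP addresses are placeholders I have assumed; replace them with the values from your own topology.

```powershell
# Assumed pool FQDN and Front End addresses; replace with your own.
$poolFqdn    = "lyncpool.lynclab.local"
$expectedIps = @("10.0.0.11", "10.0.0.12", "10.0.0.13")

# Resolve the pool name and collect the addresses DNS returns.
$resolved = [System.Net.Dns]::GetHostAddresses($poolFqdn) |
    ForEach-Object { $_.IPAddressToString }

# Report any Front End address that is missing from the pool record.
$missing = $expectedIps | Where-Object { $resolved -notcontains $_ }
if ($missing) {
    "Missing from DNS: " + ($missing -join ", ")
} else {
    "All Front End IPs resolve for $poolFqdn"
}

# In the Lync Server Management Shell: shows the OAuth certificate once it has replicated.
Get-CsCertificate -Type OAuthTokenIssuer
```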

1 November 2013

Edge Server won't Replicate

So "Lync Rescue" continues to keep me busy...

Issue
Got asked to look at why a Lync Edge Server won't replicate.

Cause
Seems that the external IP addresses of the Edge Server are resolving to the Edge Server's internal FQDN.

Resolution
Correct DNS: the only DNS entry for the Edge Server should be the internal-facing IP address resolving to the Edge Server's FQDN.

16 October 2013

Publishing Topology Completed with Errors - Microsoft.Rtc.Applications.TestBot

Issue
Publishing Topology Completed with Errors - Microsoft.Rtc.Applications.TestBot

What is TestBot?
TestBot is the little fella that runs the Audio Test Service.

You can see the TestBot SIP address with the Lync Server Management Shell command:
Get-CsAudioTestServiceApplication

OwnerUrn: urn:application:testbot
SipAddress: sip:RtcApplication-0e0e407a-6283-4c93-99fa-c0e252b8af09@lynclab.local

OK, so that has also revealed the problem in my case: the SIP domain was renamed from .local to .com and it seems that TestBot now has an invalid SIP address.

I probably would have noticed other issues had I tried using the Audio Test Service, as that SIP address would be unreachable.

Solution
We will need to correct the TestBot SIP address in ADSI Edit.
Using the identity (from Get-CsAudioTestServiceApplication) we can search for the object.

Identity: CN={0e0e407a-6283-4c93-99fa-c0e252b8af09},CN=Application Contacts,CN=RTC Service,CN=Services,CN=Configuration,DC=contoso,DC=local


  1. Open ADSI Edit.
  2. Right-click ADSI Edit and select Connect To, then select Configuration from the "Select a well known Naming Context" list and click OK.
  3. Browse to the node CN={46577062-9cae-404b-b89c-a3d39511e4cc},CN=Application Contacts,CN=RTC Service,CN=Services,CN=Configuration,DC=lynclab,DC=local, then right-click the node and select Properties.
  4. Choose the msRTCSIP-PrimaryUserAddress attribute and change the domain part of the value to the new SIP domain, e.g.: sip:RtcApplication-0e0e407a-6283-4c93-99fa-c0e252b8af09@lynclab.com
  5. Log on to your Lync Front End server and restart the Audio Test Service. Once the service has restarted, try a test call from the Lync client.
You should no longer have issues publishing the Topology either.

NOTE
The CN attributes, GUIDs and domain names used are from my lab, so please replace them with your own.
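
To spot this kind of stale address before editing anything, a small sketch along these lines could be run in the Lync Server Management Shell. It only reads configuration; the DomainIsValid property name is my own label and not something Lync returns.

```powershell
# SIP domains currently defined in the deployment.
$sipDomains = Get-CsSipDomain | ForEach-Object { $_.Name }

# Show each Audio Test Service contact and flag addresses
# whose domain part is no longer one of the SIP domains.
Get-CsAudioTestServiceApplication | ForEach-Object {
    $domain = ($_.SipAddress -split "@")[-1]
    New-Object PSObject -Property @{
        Identity      = $_.Identity
        SipAddress    = $_.SipAddress
        DomainIsValid = ($sipDomains -contains $domain)
    }
} | Format-List
```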

10 October 2013

Call loop between Lync and Gateway

As part of ongoing expansion we recently added another DDI range. Most of the new range would be dormant initially. Of course I wanted to catch all the calls to unassigned numbers and send them off to an RGS.

Here is where the plot thickens...
This seemed to work well in Lync 2010, now I find myself on 2013 and see a few interesting new features...like trunk-to-trunk routing.

So I ring the unassigned number and, lo and behold, the gateway sends the call to Lync. Lync sees the number as unassigned and does a trunk-to-trunk transfer, sending the call back to the gateway. You know the rest of it, right? The gateway sends the call to the Telco, who sends the call back to the gateway and then back to Lync, and before we know it all the SIP channels between Lync and the gateway are saturated.

The caller simply hears what seems to be a longer than usual set up time and then the call fails.

So by now I am thinking, why isn't my unassigned number config working...
A similar issue on TechNet leads me to believe that the issue is resolved in the Lync 2013 July update, here

Not in my case though...

Further down the same TechNet page I came across a reference to Chris Norman's article on Call Park retrieval issues here (nice article Chris!)

I also came across this blog http://d1it.wordpress.com/2013/05/14/pstn-to-lync-2013-unassigned-number/ (again nice work MLamontagne)

Turns out the gist of the matter is that the trunk-to-trunk transfer is engaging prior to the call reaching the Unassigned number config.


The Solution
As per Chris' findings, remove all the PSTN usage records from the Trunk configuration, which in effect prevents the trunk-to-trunk transfer from engaging.

If you need trunk-to-trunk routing, a separate route should be built for the purpose.

Removed the PSTN usages from the trunk configuration, and voila!
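
A hedged sketch of what that looks like in the Lync Server Management Shell. The trunk identity and the usage name "National" are placeholders I have assumed, and the line that changes configuration is commented out.

```powershell
# Show which PSTN usages are attached to each trunk configuration.
Get-CsTrunkConfiguration | Format-List Identity, PstnUsages

# Assumed trunk identity and usage name; uncomment to detach one usage from the trunk.
# Set-CsTrunkConfiguration -Identity "Service:PstnGateway:gw1.lynclab.local" -PstnUsages @{Remove="National"}
```

With no usages left on the trunk, Lync has nothing to match for inter-trunk routing, so the call falls through to the unassigned number handling.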

30 September 2013

Event ID 31196 LS Response Group Service

Issue
Today I found a Warning that was filling my Lync 2010 Front End event log: Event ID 31196, every 2 minutes.



Collaboration platform has failed to provision the data.

Collaboration platform has failed to provision the data because of the following exception:
Exception: Microsoft.Rtc.Collaboration.ProvisioningFailureException:Cannot read contacts from Active Directory: Active Directory server "adbdc01.mydomain.co.nz" is not available. Try again later.

Solution
Restart the Lync Response Group Service...
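
From the Lync Server Management Shell on the Front End, the restart could be sketched as below. RTCRGS is, to my knowledge, the Windows service name behind the Response Group Service; treat it as an assumption and confirm it with Get-CsWindowsService first.

```powershell
# RTCRGS is assumed to be the service name of the Response Group Service.
Stop-CsWindowsService -Name RTCRGS
Start-CsWindowsService -Name RTCRGS

# Confirm the service came back up.
Get-CsWindowsService -Name RTCRGS
```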

29 September 2013

SIP 503 after Migrating Gateway to Lync 2013 pool

Yesterday I was cutting over a SIP Gateway from a 2010 pool to a newly built 2013 pool. Migrating the gateways and Trunk configuration is really simple.

What surprised me, though, is that once I had made the required config changes and published the Topology I was unable to make or receive calls.

Looking at the SBC Gateway I found the error SIP 503 Service Unavailable for each attempted call setup. OK..

I fired up Wireshark on the Mediation Server and found that the call was being sent to the Mediation Server but was being ignored... What's going on here?

Checked connectivity and ports, all good.

Let's look at an outbound call to see if that reveals anything.
An attempt to make the call fails immediately. This usually suggests that the call routing isn't even engaging, which is often a symptom of a call routing problem.

Since I never changed any call routing, I just confirmed that the Gateway is correctly associated with the route I was using. All good.

Another culprit is often that the gateway is marked as down. A quick look at the Event Log confirmed this: Event ID 46046 states that the gateway has been marked as down. Once the gateway is down, Lync won't even try to send the call to the gateway, hence the immediate failure.

Usually restarting the Lync Mediation Service will reset the communication to the Gateway. But in my case this didn't happen. The gateway stays "down".

After much head scratching I noticed that the Mediation Server wasn't able to resolve the Lync Edge Pool name. Since it's recommended that the Edge Servers are not domain joined, their DNS records need to be added manually.

Once the Mediation Server was able to resolve the Edge Pool Name the calls started working.

Go figure...
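
A quick way to test that name resolution from the Mediation Server, as a sketch. The Edge pool name edgepool.lynclab.local is a placeholder I have assumed.

```powershell
# Assumed Edge pool FQDN; replace with the internal Edge pool name from your topology.
$edgePool = "edgepool.lynclab.local"

try {
    # Throws if the name cannot be resolved from this server.
    $addresses = [System.Net.Dns]::GetHostAddresses($edgePool)
    "$edgePool resolves to: " + (($addresses | ForEach-Object { $_.IPAddressToString }) -join ", ")
}
catch {
    "Cannot resolve $edgePool from $env:COMPUTERNAME, check the DNS record"
}
```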

12 August 2013

Troubleshooting: Lync on Prem - UM in Cloud

Got asked to solve an issue where voicemail for users in a Lync on-prem, UM in cloud environment had mysteriously stopped working.

The FE event log had this to say

Running Lync Traces showed the following

Followed by the Error

So as Exchange UM is in the O365 cloud and the event says to check that, off I went.

Steps to check O365 UM configuration with Lync on premise
I'll start with checking the Lync setup
1. What DNS is required?
         SRV record: _sipfederationtls._tcp.lynclab.co.nz, port 5061, target Edge FQDN
         A record: Edge FQDN
2. Ensure Access Edge Configuration is correct

3. Check the Hosting Provider

4. Check the UM Contact Object

5. Make sure that a test user is enabled for Hosted VM

6. Check the Hosted Voicemail Policy 

7. Make sure that the Edge Server is replicating
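
Steps 2 to 7 above can be sketched as read-only commands in the Lync Server Management Shell. The test user address is a placeholder I have assumed, and the property lists are just the ones I would look at first.

```powershell
# 2. Access Edge configuration (federation needs to be allowed).
Get-CsAccessEdgeConfiguration | Format-List AllowFederatedUsers, EnablePartnerDiscovery, RoutingMethod

# 3. Hosting provider entry for Exchange Online.
Get-CsHostingProvider | Format-List Identity, ProxyFqdn, Enabled, EnabledSharedAddressSpace, HostsOCSUsers, VerificationLevel

# 4. UM contact objects.
Get-CsExUmContact | Format-List Identity, SipAddress, DisplayNumber

# 5. Assumed test user; HostedVoiceMail should be True.
Get-CsUser -Identity "sip:testuser@lynclab.co.nz" | Format-List SipAddress, HostedVoiceMail

# 6. Hosted voicemail policy; note the Organization value for the O365 check.
Get-CsHostedVoicemailPolicy | Format-List Identity, Destination, Organization

# 7. Edge replication.
Get-CsManagementStoreReplicationStatus | Format-Table ReplicaFqdn, UpToDate -AutoSize
```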

Now let's check the O365 configuration
1. Check that the UM Dial Plan is set up. Now this is really simple, there is absolutely no trick at all. Just remember that there won't be an IP gateway. That's it!

2. Check that the Authoritative domain in O365 matches the Organization configured in CsHostedVoicemailPolicy by running PowerShell remotely against the O365 deployment.
From PowerShell...

$cred = Get-Credential
$s = New-PSSession -ConfigurationName Microsoft.Exchange -ConnectionUri https://ps.outlook.com/powershell -Credential $cred -Authentication Basic -AllowRedirection

$importresults = Import-PSSession $s

Get-AcceptedDomain

Note the results.
In my case this is where the problem was: somehow, mysteriously, the Authoritative Domain had changed.

Update
I have now seen this elsewhere; when I have the opportunity to investigate why, I'll come back and post an update.



6 August 2013

When publishing our Lync Topology

When publishing our Lync Topology I'm getting the following error on the Enabling Topology step:
Error: Found multiple objects with identity "lyncFE01.lynclab.local,McxInternal" in Active Directory.
 Details
 Type: ActiveDirectoryException
  Stack Trace
at Microsoft.Rtc.Management.Deployment.Core.CompatTrustedService.GetTrustedService(ADSession session, ADObjectId containerId, String fqdn, String serviceType)
at Microsoft.Rtc.Management.Deployment.Core.CompatTrustedService.Create()
at Microsoft.Rtc.Management.Deployment.Roles.WebServices.GlobalActivate(IService service, Computer computer)
at Microsoft.Rtc.Management.Deployment.Core.Service.GlobalActivate(Computer computer)
at Microsoft.Rtc.Management.Internal.Utilities.LogWriter.InvokeAndLog[T](Action`1 action, T arg)
8/6/2013 2:41:59 PM Error
Error: An error occurred: "Microsoft.Rtc.Management.Deployment.ActiveDirectoryException" "Found multiple objects with identity "lyncFE01.lynclab.local.McxInternal" in Active Directory."



Solution
1) Run Test-CsTopology -Report C:\temp\testtopology.html
2) Export the relevant AD container to a text file: Ldifde -f c:\temp\addif.txt -s DC_FQDN -d "CN=RTC Service, CN=Services, CN=Configuration, DC=lynclab, DC=local"
3) Find the duplicate entries in the text file, then delete them from AD using ADSI Edit.
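
One possible way to narrow down step 3 is to search the export for the service type named in the error and print some surrounding lines, so the owning entries can be compared side by side. The context sizes below are arbitrary choices of mine.

```powershell
# Path matches the ldifde export in step 2; the pattern is the service type from the error.
# -Context 15,5 prints 15 lines before and 5 lines after each hit.
Select-String -Path C:\temp\addif.txt -Pattern "McxInternal" -Context 15,5
```

Two hits that carry the same server FQDN are the candidates to clean up in ADSI Edit.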

IP Change for Gateway\SBC

Issue
Change of IP on the SIP SBC causes one-way speech for outbound calls. The SDP shows the internal IP on call setup.

Solution
Force the deployment to use fixed addresses in the topology and publish. Then remove this setting (remember to visit the PSTN gateways tab) and publish again.

You should have no IPs set for PSTN when running

Get-CsNetworkInterface

Lync cannot verify that the server is trusted

Problem
You get the error message "Lync cannot verify that the server is trusted for your sign-in address"




Cause
When Lync Communicator discovers the Lync FE to log on to, it uses the SRV record _sipinternaltls._tcp.SIPDOMAIN.com. If the associated server FQDN resolves to a server that doesn't match the SIP domain, this error is presented. E.g. the record below is for DNS zone xxx.co.nz and the SIP domain is xxx.co.nz, but the target host is a .local FQDN.


Solution
Add an A record for the FE Server in xxx.co.nz, so that it matches the SIP domain and the DNS zone, then edit the SRV record to point to this record.
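
The comparison the client makes can be approximated with a small helper. This is only a sketch of the idea (a suffix match between SRV target and SIP domain); the function name is hypothetical and the real client logic may differ in detail.

```powershell
# To see the SRV target itself, from any domain PC:
#   nslookup -type=SRV _sipinternaltls._tcp.xxx.co.nz

# Hypothetical helper: True when the SRV target host sits inside the SIP domain.
function Test-SrvTargetMatchesSipDomain {
    param([string]$TargetHost, [string]$SipDomain)
    return $TargetHost.ToLower().EndsWith("." + $SipDomain.ToLower())
}

# A .local target for a .co.nz SIP domain triggers the warning.
Test-SrvTargetMatchesSipDomain -TargetHost "lyncFE.lynclab.local" -SipDomain "xxx.co.nz"   # False
# A target inside the SIP domain passes.
Test-SrvTargetMatchesSipDomain -TargetHost "lyncfe.xxx.co.nz" -SipDomain "xxx.co.nz"       # True
```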

Credential Prompt

Problem
You get a second prompt for credentials when logging in, with the following text:
"Type your user name and password to connect for retrieving response groups"



Solution
The Lync share needs to have read/write permissions on itself and its containing folders. Corrected in the Advanced Sharing tab as below.


Lync Control Panel

Problem
Can't connect to the Lync Server Control Panel directly, but https://FQDN/cscp works

Solution
Something I didn't realize is that the Control Panel uses the DNS _sipinternal SRV Record. This is the record in the DNS branch matching the SIP Domain and not necessarily the branch that matches the internal DNS naming.

Certificate Authentication Problem

Problem
Lync cannot verify that the server is trusted for your sign-in address. Connect anyway?
Cause
Lync Client 2013 has an additional safety check: the user's SIP domain is compared with the FQDN of the Lync server when the user tries to connect.


In most environments, the SIP domain is different from the Active Directory domain.

Solution
HKEY_CURRENT_USER\Software\Policies\Microsoft\Office\15.0\Lync

Here you need to add or modify the String Value TrustModelData.
In this value, add the server listed in the warning,
e.g. lyncpool.lynclab.local
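
For a single PC the same change can be sketched in PowerShell, run as the affected user. The key path and value name are the ones given above; the server name is the lab example and should be replaced with the one from your warning.

```powershell
# Policy key for the Lync 2013 client (per user).
$key = "HKCU:\Software\Policies\Microsoft\Office\15.0\Lync"

# Create the key if it does not exist yet.
if (-not (Test-Path $key)) {
    New-Item -Path $key -Force | Out-Null
}

# Add or overwrite the TrustModelData string with the server from the warning.
Set-ItemProperty -Path $key -Name "TrustModelData" -Value "lyncpool.lynclab.local"

# Read the value back to confirm.
Get-ItemProperty -Path $key -Name "TrustModelData"
```

For more than a handful of clients the same value is probably better pushed through Group Policy.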

Computer clock

Problem
Communicator can't sign in and reports:
Cannot sign in to Communicator because your computer clock is not set correctly...

Solution
This is caused by a time difference between the Lync\OCS server and the clients. I think the maximum threshold for the time difference is around 10 minutes. Correct this and you should be sorted.
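
To measure the difference from the client side, something like the following could be used. The server name is the lab Front End from earlier posts and is an assumption here.

```powershell
# Prints the clock offset between this PC and the server three times.
w32tm /stripchart /computer:lyncFE.lynclab.local /samples:3 /dataonly
```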

problem verifying certificate

Problem
When trying to sign in to Lync you get the following error:
There was a problem verifying the certificate


Solution
It's either a certificate trust issue or a mismatch between the DNS name and the certificate that you have issued.
The PC or device you are using to log on to Lync needs to trust the certificate chain from which you generated the Lync certificate(s), and the DNS records used to locate and connect to the Lync server need to match the name(s) on the certificate.

In my case I was using Manual Login and pointed to the IP address, which was obviously not in the certificate :p

Web Conferencing: Target Principal Name is incorrect

Problem
When accessing the meet URL from outside the corporate network you get the error: Server error 500 - Target Principal Name is incorrect

Cause
When you tickle the TMG rule the traffic is redirected to the Lync FE; however, the host name of the requested URL [e.g. https://webconf.lynclab.co.nz/meet/john.bravo/9c6gsa] needs to be in the internal FE cert...
Solution
Simply update the cert. The internal cert will need the web conferencing name (webconf.lynclab.co.nz in the example above); when you run through the cert wizard on the FE it will auto-populate the cert accordingly.

Can't change Meet URL

Problem
Unable to change the default Meet URL. Get a red X and the OK button is greyed out.


Solution
Firstly, let me say that I prefer adding a URL/meet rather than a meet.URL, since I don't need to add additional SANs to my cert. This is the reason why this issue has come up. In any event...
Topology Builder will allow you to add a Meet and Dialin URL that actually conflicts with the External Web Services (it shouldn't let you...)


It does, however, give you an error if you try to make it the default or try to remove another meet URL that is different from the External Web Services URL.
Ultimately the simple URLs and the External Web Services URLs need to be different.

Meet URL fails

Problem
Meet URL fails

Solution
Ensure that the URL is added in TMG under Published Sites.
The TMG test rule will fail as it requires additional switches to be valid.
In my deployment we had multiple Edge Servers and sites. Make sure that the meet URL is reachable across all sites; remember that the URL will be directed to the FE based on where the user is homed.

MCX Forbidden

Problem
Can't connect to Lync MCX service. Http Error 403 Forbidden, Lyncdiscover Http Authentication Test failed when testing https://<LyncWebService FQDN>/Mcx/McxService.svc

Also get Authentication Test failed from http://www.testocsconnectivity.com/

Solution
The error was in the TMG rule


The error here says that the Credentials for the request to the site were deleted. It also explains how no delegation is set and user authentication isn't enabled. Of course this needs to be enabled!!!

IE Security

Problem
The default install of Windows 2008 Internet Explorer security blocks just about every page.

Solution
From Server Manager, turn off IE Enhanced Security Configuration (IE ESC) as seen below

Frequent invalid SIP requests

Problem
Partners receiving a large number of errors in the Edge Server event log like below





Solution
The cause seems to be Lync still sending discovery packets every 10 minutes.
If federation with the partner is allowed, add their SIP domain to the allowed list; if it is blocked, add the SIP domain to the blocked list.
This will be followed by a final event entry stating that the problem has been resolved.
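
In the Lync Server Management Shell that could look roughly like this. The domain partner.example.com is a placeholder I have assumed, and both change lines are commented out so only one is picked deliberately.

```powershell
# Review what is currently on each list.
Get-CsAllowedDomain | Format-Table Identity -AutoSize
Get-CsBlockedDomain | Format-Table Identity -AutoSize

# Assumed partner domain; uncomment ONE of these depending on whether federation is wanted.
# New-CsAllowedDomain -Identity "partner.example.com"
# New-CsBlockedDomain -Identity "partner.example.com"
```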


Schema State check has failed.

Problem
Schema State check has failed. 

Solution
Both instances were linked back to DNS.
To prove that AD was healthy I ran the Prepare AD components directly from the DC (that works as usual)..which confirms that a DNS validation issue is present.
So what's going on with DNS?

Firstly, an nslookup on the Lync box reveals that the default DNS server is unknown; adding a PTR record for the DNS server solves that.

Secondly, the installer queries the SRV record used for contacting the PDC in Active Directory. This SRV record is:
_ldap._tcp.pdc._msdcs.DnsDomainName
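
To check that record from the Lync box, a query like the one below can be used. lynclab.local stands in for DnsDomainName and is an assumption about your AD domain.

```powershell
# Should return the host name and port (389) of the PDC emulator.
nslookup -type=SRV _ldap._tcp.pdc._msdcs.lynclab.local
```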

UM Badmail

Problem
Actually this is more of a where is it than an issue ;-)
Where is the voicemail stored in UM before sending to Exchange? This includes the bad voicemail folder

Solution
C:\Program Files\Microsoft\Exchange Server\V14\UnifiedMessaging\...

Forcing Join Conference from Browser

Problem
A foreign user is sent a Lync Online Meeting request. If the invited user has Lync installed but doesn't have federation capabilities, the join conference URL will fail (since it calls the local Lync client to connect).

Solution
Force the conference invite URL to launch the Web and Lync Attendee options and not the local Communicator client (if present). Just append this to the URL: "?sl=1"


Lync Communicator Mobile won't log in

Problem
Lync Communicator Mobile won't log in

Error Message

Server unavailable at this time

Solution

On the Sign In page you enter your SIP login name and password. However, you also need to go to More Details (iOS and WM7) or Options (Android) and add your user name.

I have found that the user name for WM7 needs to be Domain\User Name. This format also works on Android and iOS, although there simply adding the user name works too.


PSTN Conferencing Error: "Sorry, I can't seem to connect you to your meeting..."

Problem
While trying to call in to a conference from an external PSTN connection you hear the error "Sorry, I can't seem to connect you to your meeting..."

Error Message
S4 traces in Snooper revealed a "foreign gateway" IP address being called by the Mediation Server.

Solution
The default gateway in Topology Builder was an old (decommissioned) SIP connection (aka the "foreign gateway"). Changed that to the gateway I was actually using to call out on - solved!


Application Server keeps stopping

Problem
ApplicationServer (includes Call Park Service) Starts and then stops within seconds

Error Message
ErrorCode=-2146893022 
FailureReason=IncorrectNameInRemoteCertificate 
LocalEndpoint=127.0.0.1:62233 
RemoteEndpoint=127.0.0.1:5075 
RemoteCertificate=<null>

Solution
#1 Make sure the entry 127.0.0.1 localhost exists in the hosts file
#2 For an EE Server you need to add both the pool FQDN and the server FQDN as SANs in the default certificate.
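
Check #1 can be sketched as a one-off test on the server. The hosts file path is the standard Windows location; the messages are my own wording.

```powershell
# Standard Windows hosts file location.
$hostsFile = Join-Path $env:SystemRoot "System32\drivers\etc\hosts"

# Look for an uncommented "127.0.0.1 localhost" entry.
$entry = Select-String -Path $hostsFile -Pattern '^\s*127\.0\.0\.1\s+localhost\b'
if ($entry) {
    "localhost entry found"
} else {
    "localhost entry missing from $hostsFile"
}
```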