We recently noticed most of our DMZ-based OpsMgr agents were not connecting to their gateway server. On the agent we saw the following event:
Event Type: Error
Event Source: OpsMgr Connector
Event Category: None
Event ID: 20070
Computer: <Computer>
Description: The OpsMgr Connector connected to <domain>, but the connection was closed immediately after authentication occurred. The most likely cause of this error is that the agent is not authorized to communicate with the server, or the server has not received configuration. Check the event log on the server for the presence of 20000 events, indicating that agents which are not approved are attempting to connect.
On the gateway server an event was also being logged. We then worked through the usual checks:
- Is TCP 5723 open to the gateway server?
- Restart the HealthService
- Is agent in Pending Management?
- Restart the HealthService
- Wait 5 minutes
- Restart the HealthService
All to no avail. We found a TechNet post (http://blogs.technet.com/operationsmgr/archive/2009/02/17/opsmgr-2007-port-requirements-for-scom-agents-in-a-dmz.aspx) that suggested opening ports 88 and 389 from the agent to the RMS. This did not make sense to us, since some agents were working. So we used Netmon 3.3 to trace the client while the HealthService started. It never used any port other than 5723.
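For the first item in the checklist above, a quick way to confirm that TCP 5723 is reachable from an agent is a plain socket connect. This is only a minimal sketch in Python, assuming Python is available on the agent; the function name and the timeout value are our own choices for illustration and are not part of OpsMgr.

```python
import socket


def can_connect(host: str, port: int = 5723, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    5723 is the port the agent uses to reach the gateway server.
    The timeout is an arbitrary illustrative value.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Calling can_connect("<GatewayServer>") from the agent returns True when the port is open. A successful connect only proves the network path; as we found, the agent can still be dropped right after authentication.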
We even enabled verbose diagnostic tracing (http://support.microsoft.com/kb/942864) and reviewed the logs. We could see where the 20070 event was being generated, but nothing else of interest:
5412.5956::02/19/2010-10:46:56.978 [Common] [] [Verbose] :Common::EventLogUtil::LogEvent{EventLogUtil_cpp311}Logging error event 20070 with args "<servername>", "NULL", "NULL", "NULL", "NULL", "NULL", "NULL", "NULL", "NULL"
5412.5956::02/19/2010-10:46:56.978 [Common] [] [Information] :Common::EventLogUtil::LogEvent{EventLogUtil_cpp397}Logging event 20070 from source "OpsMgr Connector" with severity Error and description "The OpsMgr Connector connected to <GatewayServer>, but the connection was closed immediately after authentication occurred. The most likely cause of this error is that the agent is not authorized to communicate with the server, or the server has not received configuration. Check the event log on the server for the presence of 20000 events, indicating that agents which are not approved are attempting to connect.".
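The formatted trace output is large, so it helps to pull out only the lines that log a given event. The following is a rough sketch in Python, assuming the trace has already been converted to text and that the lines look like the two samples above; the regular expression and the function name are our own and are based only on those two lines.

```python
import re
from typing import Iterable, List

# Matches both "Logging error event 20070 ..." and "Logging event 20070 ...",
# the two phrasings seen in the sample trace lines.
EVENT_RE = re.compile(r"Logging (?:error )?event (\d+)\b")


def find_event_lines(lines: Iterable[str], event_id: int) -> List[str]:
    """Return the trace lines that report logging the given event ID."""
    wanted = str(event_id)
    hits = []
    for line in lines:
        match = EVENT_RE.search(line)
        # Compare the whole captured number so 2007 does not match 20070.
        if match and match.group(1) == wanted:
            hits.append(line.rstrip("\n"))
    return hits
```

Passing an open text file and 20070 returns just the lines where the connector logged the error, which makes it easier to read the surrounding trace entries by timestamp.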
Solution…
We finally had to call Microsoft. After about 30 minutes of troubleshooting, the engineer noticed that the OpsMgrConnector.Config.xml file in the C:\Program Files\System Center Operations Manager 2007\Health Service State\Connector Configuration Cache\<MgmtGrpName> folder on the gateway server had last been modified several weeks earlier. He had us rename the Health Service State folder under C:\Program Files\System Center Operations Manager 2007 on the gateway server and restart the HealthService. After this, a new Health Service State folder was created and the new OpsMgrConnector.Config.xml had a much more current last-modified date. We then restarted the HealthService on the agents and they reported in to the gateway server correctly.
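The telltale sign was the stale last-modified date on OpsMgrConnector.Config.xml, and that is easy to check with a script. Below is a minimal sketch in Python, assuming Python is available on the gateway server. The function names and the one-day threshold are our own assumptions for illustration; Microsoft did not give us a specific age at which the file counts as stale.

```python
import os
import time
from typing import Optional

SECONDS_PER_DAY = 86400.0


def days_since_modified(path: str, now: Optional[float] = None) -> float:
    """Return how many days ago the file at path was last modified."""
    if now is None:
        now = time.time()
    return (now - os.path.getmtime(path)) / SECONDS_PER_DAY


def is_stale(path: str, max_age_days: float = 1.0,
             now: Optional[float] = None) -> bool:
    """Return True if the file is older than max_age_days.

    max_age_days is a hypothetical threshold chosen for this example.
    """
    return days_since_modified(path, now) > max_age_days
```

On the gateway server you would point is_stale at the OpsMgrConnector.Config.xml path given above, with <MgmtGrpName> replaced by your management group name. If it reports a file that is weeks old while agents are logging event 20070, renaming the Health Service State folder and restarting the HealthService is worth trying.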
