I was doing some performance analysis today (again), and on one server I found that the CPU load regularly went to 100% and at times stayed there for a while.
I happened to have a Microsoft Premier Field Engineer looking over my shoulder at the time, and he said something to the effect of „oh, that’s probably the Operations Manager monitoring“. Normally one would assume that monitoring is a quiet little task sitting in the background, doing its work unnoticed until there is something to report. In this case, however, it soon turned out that the „oh“ was justified: the „System Center Operations Manager“ was often at the top of the CPU% list, and often above 50%.
We could start talking about the various options for restricting SCOM monitoring activity, but instead I decided to use a procedure which had cured a problem on another node of the same cluster:
- Stop the „System Center Management“ service
- Delete all files in „C:\Program Files\System Center Operations Manager\Agent\Health Service State“
- Restart the „System Center Management“ service
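For anyone who has to repeat this on several nodes, the three steps above can be sketched in Python. This is only a rough sketch under a few assumptions: it passes the service's display name to the standard Windows `net stop` / `net start` commands, it has to be run from an elevated prompt, and it removes subfolders inside the state folder as well as plain files (the steps above only say "files"). The function names `run_net`, `clear_directory` and `reset_agent_cache` are made up for this example.

```python
import shutil
import subprocess
from pathlib import Path

# Names taken from the steps above; adjust them if your agent is installed elsewhere.
SERVICE_NAME = "System Center Management"
STATE_DIR = Path(
    r"C:\Program Files\System Center Operations Manager\Agent\Health Service State"
)


def run_net(action, service):
    """Run 'net stop <service>' or 'net start <service>' and fail loudly on error."""
    subprocess.run(["net", action, service], check=True)


def clear_directory(directory):
    """Delete everything inside 'directory' but keep the directory itself.

    Assumption: subfolders are removed too, not only plain files.
    Returns the number of top-level entries that were removed.
    """
    removed = 0
    for entry in Path(directory).iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
        removed += 1
    return removed


def reset_agent_cache(state_dir=STATE_DIR, service=SERVICE_NAME, control=run_net):
    """Stop the service, empty the state folder, then start the service again."""
    control("stop", service)
    try:
        return clear_directory(state_dir)
    finally:
        # Try to start the service again even if deleting failed part-way.
        control("start", service)
```

Calling `reset_agent_cache()` with no arguments uses the names from the list above. The `control` parameter is only there so that the stop and start calls can be swapped out when trying the function against a scratch folder.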
And it did bring some relief. The CPU load was only very rarely at 100%, and the monitoring service was never above 50%, at least not in the few hours I continued to watch it.
Ultimately, SCOM will have to be configured so that it does not take so many resources on the monitored machines, but for now we have one more bottleneck widened a little.
Once again, one could ask whether the SCOM team/guy received a warning message that the SCOM agent had been stopped, as I would expect, but that is a question for another day.
