Archive for category ‘op5’

Lazy admin style – monitoring and self-repairing of automatically started Windows services

09 June 2011

Hi all!

It’s been a while since we had a post here about monitoring; the last one was back in ’09. Like that post, this one is in English because of the large international monitoring community that can make use of it, but it can be translated successfully with Google Translate :)

Basically, I’m getting lazy. I want to monitor Windows services, and I don’t want to log onto every server to check which services may need to be monitored (sometimes I don’t even get that info from the Windows Server guys, so I might miss something every once in a while). Also, if a service stops, it’d be smooth if I didn’t have to start that service myself, wouldn’t it? Now here comes a problem: what if there’s a service that is set to automatic but doesn’t start? And never starts, because it isn’t supposed to? It sure does sound stupid, but remember, this is a Microsoft platform; it isn’t made to be logical. Remember “Performance Logs”? Or, currently in 2008/2008R2, the Windows Software Protection service?

There are a few scripts already out there that check automatic services, and really, you don’t need a script for it if you have the NSClient++ agent on your Windows server, since it has that function built in, including an exclude function (-c CheckServiceState -a CheckAll exclude= ). This doesn’t, however, attempt to start the service again. There are also scripts that attempt to start services, but they don’t include an exclude function.
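To make the NSClient++ call above a bit more concrete, here is a minimal sketch in Python of how the check_nrpe command line could be put together with one exclude= argument per service. Note the assumptions: the path to check_nrpe, the host name winserver01 and the helper name build_check_command are all made up for illustration, and the two excluded service names are just examples. This is not the plugin from this post.

```python
import shlex

# Assumed location of check_nrpe; adjust to wherever your plugins live.
DEFAULT_CHECK_NRPE = "/usr/lib/nagios/plugins/check_nrpe"


def build_check_command(host, excluded, check_nrpe=DEFAULT_CHECK_NRPE):
    """Return the argument list for a CheckServiceState/CheckAll query.

    host     -- the Windows server running NSClient++ (hypothetical name below)
    excluded -- service names that should not be checked
    """
    args = [check_nrpe, "-H", host, "-c", "CheckServiceState", "-a", "CheckAll"]
    # One exclude=<service> argument per service we want to ignore.
    args += ["exclude=" + name for name in excluded]
    return args


if __name__ == "__main__":
    # Illustrative values only.
    cmd = build_check_command("winserver01", ["sppsvc", "SysmonLog"])
    print(" ".join(shlex.quote(part) for part in cmd))
```

Keeping the command as a list of arguments rather than one long string means it can be handed straight to subprocess.run without worrying about shell quoting.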

The solution!

In Nagios we have event handlers that can deal with issues caught by a check, which is nice. However, I decided not to use an event handler for my script, basically because I found it so easy to include the repair step in the check script itself and base it on the first output of check_nrpe … -c CheckServiceState. That is nice because it means we only have to ask the Windows server once. Did I mention that I love to reduce load as well? The only arguments we pass to the check are the services to be excluded. I’m not a fancy programmer, so the script looks really ugly and messy, but it works like a charm. I can sit back and watch my OP5 Ninja interface detect a failed service, only to see the check come back from a soft alert with all services running before a notification has been sent out. Really sweet! The check always enters a soft Critical state if a service has been detected as not running, even if it was successfully restarted. This is because I want to be able to track down services that frequently stop and are automatically repaired in the Alert history log. As soon as I have time I will post the plugin to Nagios Exchange, but until then just give me your e-mail and I can send it to you.
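For those who want a feel for the logic before getting the plugin, the flow described above can be sketched roughly like this in Python. This is not the actual script. It assumes the check output lists stopped services as comma-separated "Name: stopped" fragments, and the restart step is passed in as a hypothetical callable (restart), since how the service gets started is up to you. The function names are made up for this example.

```python
import re

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def stopped_services(output):
    """Return the service names reported as stopped.

    Assumption: stopped services appear as 'Name: stopped' fragments,
    separated by commas, in the text returned by the check.
    """
    return re.findall(r"([^,:\s][^,:]*?):\s*stopped", output)


def check_and_repair(output, restart):
    """Try to restart every stopped service and build the plugin result.

    restart is a hypothetical callable taking a service name and
    returning True if the service was started successfully.
    """
    stopped = stopped_services(output)
    if not stopped:
        return OK, "OK: all automatic services are running"

    results = {name: restart(name) for name in stopped}
    repaired = [name for name, ok in results.items() if ok]
    failed = [name for name, ok in results.items() if not ok]

    parts = []
    if repaired:
        parts.append("restarted: " + ", ".join(repaired))
    if failed:
        parts.append("still stopped: " + ", ".join(failed))
    # Critical even after a successful restart, so the event is logged.
    return CRITICAL, "CRITICAL: " + "; ".join(parts)


if __name__ == "__main__":
    # Pretend every restart succeeds; a real restart step goes here.
    code, message = check_and_repair("Spooler: stopped", lambda name: True)
    print(message)
    raise SystemExit(code)
```

The plugin itself only returns Critical. Whether that ends up as a soft or a hard state is decided by Nagios through max_check_attempts, so the next check finds the repaired service running and the state recovers before a notification goes out.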

This is how it can look in Ninja while in OK state:

I love it and I hope you will find it useful as well :)

Terminalserver Admin – Seize your peace of mind!

07 December 2009

Dear readers!

This blog post is written in English because of the large monitoring community, but it can also be found here in Swedish (with credits to Google :)).

Monitoring a terminal server is a challenge for most monitoring systems. Fair enough, resources are very easy to monitor, but that doesn’t really tell you whether it is possible to log on to the terminal server. At IXX, we have taken the monitoring of our terminal servers one step further and developed a plugin for our op5 monitoring system that performs user actions and also measures the time they take, producing graphs that can be evaluated at a later date.

So what does it do?
It’s so simple it’s unbelievable. It creates a new session on the terminal server, launches a few applications and then logs out the user. What could possibly be easier than this? :)

And you get something like this;

 

Actually, it’s a bit more complicated than this. Not only does it require an additional server to launch the RDP/ICA session from, since you don’t really want an X server or the load it creates on your monitoring server, it also requires proper GPO settings. The plugin was originally created to use RDP; however, modifying it to use ICA is as easy as replacing a binary and updating the GPO settings.
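The timing part is the easiest bit to illustrate. Below is a minimal Python sketch, not our plugin, of how each step of a session could be timed and reported as Nagios performance data so that graphs can be drawn from it. The step names (login, apps, logout) and the no-op actions are placeholders; the real work of opening the RDP/ICA session is left out entirely.

```python
import time


def timed_steps(steps, clock=time.perf_counter):
    """Run each (label, action) pair in order and return {label: seconds}."""
    timings = {}
    for label, action in steps:
        start = clock()
        action()
        timings[label] = clock() - start
    return timings


def plugin_output(timings):
    """Format timings as a status line followed by Nagios performance data."""
    perfdata = " ".join(
        "{}={:.3f}s".format(label, seconds) for label, seconds in timings.items()
    )
    total = sum(timings.values())
    return "OK: session completed in {:.3f}s | {}".format(total, perfdata)


if __name__ == "__main__":
    # Placeholder actions; a real check would drive the session here.
    steps = [
        ("login", lambda: None),
        ("apps", lambda: None),
        ("logout", lambda: None),
    ]
    print(plugin_output(timed_steps(steps)))
```

Everything after the pipe character is picked up as performance data, one label=value pair per step, which is what the graphs are built from.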

Contact us for further information regarding this and everything else that is monitoring related!

op5 – http://www.op5.com

nagios – http://www.nagios.org

cacti – http://www.cacti.net

