Does Monitoring Really Suck?

I’ve been seeing the phrase “monitoring sucks” lately. Recently, Kris Buytaert organized a “monitoring sucks” hackathon after FOSDEM, and in a similar vein Cliff Moon, the CTO of Boundary (a monitoring service provider), also posted a “Why monitoring sucks – for now” article.

Having worked with OpenNMS for the last decade, I really can’t share the sentiment that things suck. I spent the decade before that as a consultant working with products like HP’s OpenView, Micromuse NetCool, Concord Network Health and BMC’s PATROL, so we set out with OpenNMS to build the best tool for consultants like me: something that combines the functions of all of these products under one umbrella, with the ability to quickly and easily expand that functionality as needed. That’s why you’ll hear me refer to OpenNMS as a network management application platform instead of just an application.

OpenNMS has been addressing a lot of the concerns raised in Mr. Moon’s article for years now. Unlike point products that focus on data collection or service monitoring or trending, OpenNMS does all of them in one package. It also includes functions, such as inventory, that aren’t usually addressed in a monitoring solution. With easy, API-level integration with trouble ticketing systems (Request Tracker, OTRS, Jira, etc.) and configuration tools like RANCID, OpenNMS can be easily expanded as a given network environment grows.

We realized a long time ago that traditional alerting mechanisms were broken, so in addition to such staples as “high” and “low” thresholding, we added “relative” and “absolute” options to better detect anomalies. The built-in alarms subsystem allows complex automations to be created, and the event translator does a great job of enriching basic events with information such as customer impact. Finally, with 1.10 we’ve resurrected and improved the OpenNMS integration with Drools, which allows extremely complex analysis to be built into the system to streamline alerting. This is a key feature that led Juniper to license OpenNMS as part of their JunOS Space management product.
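
To make the four threshold styles a little more concrete, here is a minimal Python sketch. It is not OpenNMS code or configuration. The function names, the sample values, and the reading of “relative” as a percentage change between consecutive samples and “absolute” as a fixed-size change between consecutive samples are all assumptions made for illustration.

```python
# Illustrative sketch only: this is NOT OpenNMS code. Every name, limit,
# and sample value below is a hypothetical chosen for the example.

def high_exceeded(value, limit):
    """Classic 'high' threshold: fires when a sample reaches the limit."""
    return value >= limit


def low_exceeded(value, limit):
    """Classic 'low' threshold: fires when a sample drops to the limit."""
    return value <= limit


def relative_change_exceeded(previous, current, ratio):
    """Assumed 'relative' check: fires when the change between two
    consecutive samples is at least `ratio` times the previous sample."""
    if previous == 0:
        return False  # no meaningful baseline to compare against
    return abs(current - previous) / abs(previous) >= ratio


def absolute_change_exceeded(previous, current, delta):
    """Assumed 'absolute' check: fires when two consecutive samples
    differ by at least `delta`, regardless of their magnitude."""
    return abs(current - previous) >= delta


def find_jumps(samples, ratio, delta):
    """Return the (previous, current) pairs flagged by either change check."""
    flagged = []
    for previous, current in zip(samples, samples[1:]):
        if (relative_change_exceeded(previous, current, ratio)
                or absolute_change_exceeded(previous, current, delta)):
            flagged.append((previous, current))
    return flagged


# Hypothetical link utilization samples, in percent.
utilization = [40, 42, 41, 80, 82]

# A 'high' threshold of 90 never fires on this data.
high_alerts = [v for v in utilization if high_exceeded(v, 90)]

# The change-based checks flag the sudden jump between two samples.
jumps = find_jumps(utilization, ratio=0.5, delta=30)
```

In this made-up data set the utilization never reaches the high limit of 90, so a plain “high” threshold stays silent, while the change-based checks pick out the sudden jump from 41 to 80. That is the kind of anomaly static limits tend to miss.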

But I have to ask myself, if OpenNMS is so cool at solving management problems, why do people still think things suck? I can think of two reasons, although I’m sure that there are many more.

The first is that OpenNMS is written in Java, and a lot of those in the “devops” world either have no Java experience or are prejudiced against it. The second is that OpenNMS is a seriously complex platform, and unlike some of the point products mentioned, it really does take an investment of time to get the most out of it.

I can’t do much about the former issue, and history seems to have demonstrated that if people are prejudiced enough against a better solution they will eventually get left behind. I’m not saying that Java is great or even that Java is better than other options, but in many cases OpenNMS is better than the options and if Java is what’s keeping you away from it, then that’s a shame.

But the second issue I can address, and we hope to do so this year in a number of ways. The best way to help people climb the learning curve with OpenNMS is through education, and we even delayed the release of OpenNMS 1.10 in order to get the documentation to a much higher level than it has been in the past. Also this year we are holding a couple of users conferences focused on real-world and real-time solutions, as well as increasing the number of our training courses. Finally, I hope to put together some videos to jumpstart those interested in coming up to speed with the platform.

So if you think monitoring sucks, please check out OpenNMS. Perhaps we can change your mind.

3 thoughts on “Does Monitoring Really Suck?”

  1. Tarus,

    Some OpenNMS people had pinged me before the event asking if they were welcome. Sadly, they never showed up. I had hoped they would show us some of the current state of OpenNMS and tell us #monitoringdoesntsuck

    Sadly, I also missed the OpenNMS presentations at FOSDEM... #etoocrowded

    The thing with OpenNMS is not Java. If you’ve read my blog you’d notice I do a lot of Java (and so do a lot of other people involved in devops), and I’ve seen OpenNMS being disqualified as an alternative in Java shops. It’s most probably the lack of an externally visible community.

    I keep running into the one guy on a yearly basis talking about OpenNMS; sadly, he fails to convince me. Yes, you have frequent conferences, but I’ve actually never met anyone who talked about being there (at least in Europe).

    As you correctly state, there is the complexity of OpenNMS. Even more so with open source software: if it is complex, people look at the tools their peers use and at what they can pick their friends’ brains about. Combine that with the lack of an awesome community…

    So I welcome you folks to propose talks or tutorials for any of the conferences I’m involved in (e.g. loadays.org), and I’ll gladly try to help you grow that community…

    But till then ..

    #monitoringsucks 🙂

  2. Kris – I definitely hope that my post didn’t come across as criticizing your efforts. That wasn’t the point. I had hoped to make FOSDEM this year, but Ronny and Jeff (who did the presentation) had to head back to Fulda for a client project.

    Your blog is in my RSS reader.

    We’re announcing the OpenNMS Users Conference – Europe schedule this week. Hope you can make it.

  3. Kris, I’m sorry that our little contingent missed the hackathon. Unfortunately it couldn’t be avoided.

    I’ll echo Tarus’ sentiment in hoping that you can make it to the OUCE.
