Order of the Green Polo: Requiescat In Pace

One of the first “group chat” technologies I was ever exposed to was Internet Relay Chat (IRC). This allowed a group of people to get together in areas called “channels” to discuss pretty much anything they felt like discussing. The service had to be hosted somewhere, and for most open source projects that was Freenode.

You might have seen that recently Freenode was taken over by new management, and the policies this new management implemented didn’t sit well with most Freenode users. In the grand open source tradition, most everyone left and went to other IRC servers, most notably Libera Chat.

In May of 2002 when I became the sole maintainer of OpenNMS, there was exactly one person who was dedicated full time to the project – me. What kept me going was the community I found on IRC, in both the #opennms channel and the local Linux users group channel, #trilug.

It was the people on IRC who supported me until I could grow the business to the point of bringing on more people. I still have strong friendships with many of them.

I was reminded of those early days as we migrated #opennms to Libera Chat. At the moment there are only 12 members logged in, and most of those are olde skoool OpenNMS people. I haven’t used IRC much since we switched to Mattermost (we host a server at chat.opennms.com) and with it a “bridge” to bring IRC conversations into the main Mattermost channel. Most people moved to use Mattermost as their primary client, but of course there were a few holdouts (Hi Alex!).

While I was reminiscing, I was also reminded of the Order of the Green Polo (OGP). When David, Matt and I started The OpenNMS Group in 2004, interest in OpenNMS was growing, and there was a core of those folks on IRC who were very active in contributing to the project. I was trying to think of some way to recognize them.

At that time, business casual, at least for men, consisted of a polo shirt and khaki slacks. Vendors often gifted polo shirts with their logos/logotypes on them to clients, and a number of open source projects sold them to raise money. We sold a white one and a black one, and I thought, hey, perhaps I can pick another color and use that to identify the special contributors to OpenNMS.

Green has always been associated with OpenNMS. In network monitoring, green symbolizes that everything is awesome. We even named one of our professional services products the “Greenlight Project”. Plus I really like green as a color.

Then the question became “what shade of green?” For some reason I thought of Tiger Woods who, by this time, late 2004, had won the prestigious Masters golf tournament three times (and would again the next spring). The winner of that tournament gets a “hunter green” jacket, and so I decided that hunter green would be the color.

Also, for some unknown reason, I saw an article about a British knighthood called “The Order of the Garter”. I combined the two and thus “The Order of the Green Polo” was born.

It was awesome.

People who had been active in contributing to OpenNMS became even more active when I recognized them with the OGP honor. They contributed code and helped us with supporting our community, as well as adding a lot to the direction of the project. We started having annual developer conferences called “Dev-Jam” and OGP members got to attend for free so we could spend some face to face time with each other. I considered these men in the OGP to be my brothers.

As OpenNMS grew, we looked to the OGP for recruitment. It was through the OGP that Alejandro came to the US from Venezuela and now leads our support and services team (if OpenNMS went away tomorrow, getting him and his spouse here would have made it all worth it). When you hired an OGP member, you were basically paying them to do something they wanted to do for free. Think of it as eating an ice cream sundae and finding money at the bottom.

But that growth was actually something that led to the decline of the OGP. When we hired everyone that wanted a job with us, the role of the OGP declined. Dev-Jam was open to anyone, but it was mandatory for OpenNMS employees. Not all employees were OGP even though they were full-time contributors, so there was often pressure to induct new employees into the Order. And, most importantly, as we aged many OGP members moved on to other things. Hey, it happens, and it doesn’t reflect poorly on their past contributions.

We had a special mailing list for the OGP, but instead of discussing OpenNMS governance it basically became a “happy birthday” list (speaking of which, Happy Birthday Antonio!). When OpenNMS was acquired by NantHealth, we had to merge our mail systems and in the process the OGP list was deactivated. I don’t think many people noticed.

Recently it was brought to my attention that associating OpenNMS with the Masters golf tournament through the OGP could have negative connotations. The Masters is hosted by the Augusta National Golf Club and there have been controversies around their membership policies and views on race. It was suggested that we rename the OGP to something else.

One quick solution would be to just change the shade of green to, perhaps, a “stoplight” green. But this got me to thinking that the same logic used to associate the color with racism could apply to the whole “Order of” as well, since that was based on a British knighthood which, much like Augusta, is mainly all male. Plus the British don’t have the best track record when it comes to colonialism, etc.

I think it is time for something totally new, so I’ve decided to retire the Order of the Green Polo. The members of the OGP are all male, and I’m extremely excited that as we’ve grown our company and project we have been able to greatly improve our diversity, and I would love to come up with something that can embrace everyone who has a love of OpenNMS and wants to contribute to it, be that through code, documentation, the community, &tc.

OpenNMS has changed greatly over the past two decades, and it has become harder to contribute to a project that has grown exponentially in complexity. As part of my role as the Chief Evangelist of OpenNMS, I want to change that and come up with easier ways for people to improve the OpenNMS platform, and I need to come up with a new program to recognize those who contribute (and if you want to skip that part and get right to the job thingie, we’re hiring, but don’t skip that part).

To those of you who were in the Order of the Green Polo, thank you so much for helping us make OpenNMS what it is today. I’m not sure if it would exist without you. And even without the OGP mailing list, I plan to remember your birthdays.

What’s Old Is New Again

Today we launched a new look for OpenNMS, a rebranding effort that has been going on for the better part of a year. It represents a lot more than just a new logo and new colors. While OpenNMS has been around for over two decades now, it is also quite different from when it started. A tremendous amount of work has gone into the project over the past couple of years, and if you looked at using it even just a short while ago you will be surprised at what has changed.

New OpenNMS Logo

One of the best analogies I can come up with to talk about the “new” OpenNMS concerns cars. I like cars, especially Mercedes, and when I was in college I usually drove an older Mercedes sedan. I enjoyed bringing them back to their former glory (and old, somewhat beaten down cars were all I could afford), and so I might start by redoing the brake system, overhauling the engine, etc.

When I would run out of money, which was often, sometimes I’d have to sell a car. Prospective buyers would often complain that the paint wasn’t perfect or there was an issue with the interior. I’d point out that you could hop in this car right now and drive it across the country and never worry about breaking down, but they seemed focused on how it looked. Cosmetics are usually the last thing you focus on during a restoration, but it tends to be the first thing people see.

This is very much like OpenNMS. For over a decade we’ve been focused on the internals of the platform, and luckily we are now in a position to focus on how it looks.

Please don’t misunderstand: application usability is important, much more important than, say, the paint job on a car, but in order to provide the best user experience we had to start by working under the hood.

For example, from the beginning OpenNMS has contained multiple “daemons” that control various aspects of the platform. Originally this was very monolithic, and thus any small change to one of them would often require restarting the whole application.

OpenNMS is now based on a Karaf runtime which provides a modular way of managing the various features within the application. It comes with a shell that can allow even non-Java programmers access to both high and low level parts of the platform, and to make changes without restarting the whole thing. Features can be enabled and disabled on the fly, and it is easy to test the behavior of OpenNMS against a particular device without having to set up a special test environment or pore through pages of logs.

Another great aspect of OpenNMS is that much of the internal messaging can now take place through a broker such as Kafka. While this increases the stability and flexibility of the platform, users can also create custom consumers for the huge amounts of information OpenNMS is able to collect. For very large networks this creates the option to use that data outside of the platform itself, giving end users a high level of custom observability.

The monolithic nature of OpenNMS has also been improved. The addition of “Minions” to provide monitoring at the edge of the network creates numerous monitoring solutions where there were none before. You can now reach into isolated or private networks, or monitor the performance of applications from various locations seamlessly. The “Sentinel” project allows the various processes within OpenNMS to be spread out over multiple devices with the aim to have virtually unlimited scale.

APM Example World Map

And I haven’t even started on the ability of OpenNMS to monitor tremendous amounts of telemetry data and to analyze it with tools such as “Nephron” or our foray into artificial intelligence with ALEC.

So much has changed with OpenNMS, much of it recently, that it was time for that new coat of paint. It was time for people to notice both the new look of OpenNMS at the surface and the new OpenNMS under the covers.

One thing that hasn’t changed is that OpenNMS is still 100% open source. All of these amazing features are available to anyone under an OSI approved open source license. Plus we leverage and integrate with best-in-class open source tools such as Grafana for visualization and Cassandra (using Newts) for storing time series data.

Our new logo is a stylized gyroscope. For centuries the gyroscope has represented a way to maintain orientation in the most chaotic of situations. In much the same way, OpenNMS helps you maintain the orientation of your IT infrastructure which, let’s admit it, plays a huge role in the success of your enterprise.

Where the car analogy falls apart is that while the paint job is usually the end of a restoration, this new look for OpenNMS is just the beginning of a new chapter in the history of the project. Our goal is to create a platform where monitoring just happens. We’re not there yet, but check out the latest OpenNMS and we hope you’ll agree we are getting closer.

OpenNMS Resources

Getting started with OpenNMS can be a little daunting, so I thought I’d group together some of the best places to start.

When OpenNMS began 20+ years ago, the main communication channel was a group of mailing lists. For real time interaction we added an “#opennms” IRC channel on Freenode as well. As new technology came along we eagerly adopted it: hosting forums, creating a FAQ with FAQ-o-matic, building a wiki, writing blogs, etc.

The problem became that we had too many resources. Many weren’t updated and thus might host obsolete information, and it was hard for new users to find what they wanted. So a couple of years ago we decided to focus on just two main places for community information.

We adopted Discourse to serve as our “asynchronous” communication platform. It is hosted at opennms.discourse.group, and the goal is to migrate all of the information that used to reside on sites like FAQs and wikis into one place. In as much as our community has a group memory, this is it, and we try to keep the information on this site as up to date as possible. While there is still some information left in places like our wiki, the goal is to move it all to Discourse and thus it is a great place to start.

I also want to call your attention to “OpenNMS on the Horizon (OOH)”. This is a weekly update of everything OpenNMS, and it is a good way to keep up with all the work going on with the platform since a lot of the changes being made aren’t immediately obvious.

While we’ve been happy with Discourse, sometimes you just want to interact with someone in real time. For that we created chat.opennms.com. This is an instance of Mattermost that we host to provide a Slack-like experience for our community. It basically replaces the IRC channel, but there is also a bridge between IRC and MM so that posts are shared between the two. I am “sortova” on Mattermost.

When you create an account on our Mattermost instance you will be added to a channel called “Town Square”. Every Mattermost instance has to have a default channel, and this is ours. Note that we use Town Square as a social channel. People will post things that may be of interest to anyone with an interest in OpenNMS, usually something humorous. As I write this there are over 1300 people who have signed up on Town Square.

For OpenNMS questions you will want to join the channel “OpenNMS Discussion”. This is the main place to interact with our community, and as long as you ask smart questions you are likely to get help with any OpenNMS issues you are facing. The second most popular channel is “OpenNMS Development” for those interested in working with the code directly. The Minion and Compass applications also have their own channels.

Another channel is “Write the Docs”. Many years ago we decided to make documentation a key part of OpenNMS development. While I have never read any software documentation that couldn’t be improved, I am pretty proud of the work the documentation team has put into ours. Which brings me to yet another source of OpenNMS information: the official documentation.

Hosted at docs.opennms.org, our documentation is managed just like our application code. It is written in AsciiDoc and published using Antora. The documentation is versioned just like our Horizon releases, but usually whenever I need to look something up I go directly to the development branch. The admin guide tends to have the most useful information, but there are guides for other aspects of OpenNMS as well.

The one downside of our docs is that they tend to be more reference guides than “how-to” articles. I am hoping to correct that in the future but in the meantime I did create a series of “OpenNMS 101” videos on YouTube.

They mirror some of our in-person training classes, and while they are getting out of date I plan to update them real soon (we are in the process of getting ready for a new release with lots of changes so I don’t want to do them and have to re-do them soon after). Unfortunately YouTube doesn’t allow you to version videos so I’m going to have to figure out how to name them.

Speaking of changes, we document almost everything that changes in OpenNMS in our Jira instance at issues.opennms.org. Every code change that gets submitted should have a corresponding Jira issue, and it is also a place where our users can open bug reports and feature requests. As you might expect, if you need to open a bug report please be as detailed as possible. The first thing we will try to do is recreate it, so information such as the version of OpenNMS you are running, what operating system you are using and the steps that cause the problem is welcome.
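If it helps, here is a minimal sketch of a shell function that prints a skeleton for that kind of report. The function name and the "Expected result" and "Actual result" headings are my own illustration, not a format Jira requires, and the OpenNMS version is passed in by hand because the sketch makes no attempt to detect it.

```shell
# Print a plain-text skeleton for a bug report.
# $1 is the OpenNMS version, supplied by hand; this sketch does not try to detect it.
bug_report_skeleton() {
    printf 'OpenNMS version: %s\n' "${1:-unknown}"
    # uname -sr reports the kernel name and release of the machine running the script
    printf 'Operating system: %s\n' "$(uname -sr)"
    printf 'Steps to reproduce:\n  1. \n  2. \n'
    printf 'Expected result:\n'
    printf 'Actual result:\n'
}
```

Running `bug_report_skeleton 26.0.1` on the OpenNMS server and pasting the filled-in output into the issue description covers the basics.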

If you would like us to add a feature, you can add a Feature Request, and if you want us to improve an existing feature you can add an Enhancement Request. Note that I think you have to have an account to access some of the public issues on the system. We are working to remove that requirement as we wish to be as transparent as possible, but I don’t think we’ve been able to get it to work just yet. I just attempted to visit a random issue and it did load but it was missing a lot of information that shows up when I go to that link while authenticated, such as the left menu and the Git Integration. You will need an account to open or comment on issues. There is no charge to open an account, of course.

Speaking of git, there is one last resource I need to bring up: the code. We host our code on Github, and we’ve separated out many of our projects to make it easier to manage. The main OpenNMS application is under “opennms” (naturally) but other projects such as our machine learning feature, ALEC, have their own repository.

While it was not my intent to delve into all things git on this post, I did want to point out that in the top level directory of the “opennms” project we have two scripts, makerpm.sh and makedeb.sh, that you can use to easily build your own OpenNMS packages. I have a video queued up to go over this in detail, but to build RPMs all you’ll need is a base CentOS/RHEL install, and the packages “git” (of course), “expect”, “rpm-build” and “rsync”. You’ll also need a Java 8 JDK. While we run on Java 11, at the moment we don’t build using it (if you check out the latest OOH you’ll see we are working on it). Then you can run makerpm.sh and watch the magic happen. Note the first build takes a long time because you have to download all of the maven dependencies, but subsequent builds should be faster.
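Before kicking off that first long build, it can save time to confirm the prerequisites are actually on the machine. The following is a rough sketch and not part of the OpenNMS repository: the function names are made up, and it checks for commands rather than packages, on the assumption that “rpm-build” provides rpmbuild and the JDK provides javac. It does not verify that the JDK is version 8.

```shell
# Print the name of each command from the argument list that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Commands assumed to come from the packages listed above
# (rpm-build ships rpmbuild; a JDK ships javac).
REQUIRED_TOOLS="git expect rpmbuild rsync javac"

# Succeed only when every required command is available.
check_build_host() {
    missing=$(missing_tools $REQUIRED_TOOLS)
    if [ -n "$missing" ]; then
        echo "missing build prerequisites:" $missing >&2
        return 1
    fi
    echo "all build prerequisites found"
}
```

From the top level directory of the checkout, something like `check_build_host && ./makerpm.sh` would then only start the build when everything is in place.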

To summarize:

For normal community interaction, start with Discourse and use Mattermost for real time interaction.

For reference, check out our documentation and our YouTube channel.

For code issues, look toward our Jira instance and our Github repository.

OpenNMS is a powerful monitoring platform with a steep learning curve, but we are here to help. Our community is pretty welcoming and we hope to see you there soon.

Thoughts on Security and Open Source Software

Due to the recent supply-chain attack on Solarwinds products, I wanted to put down a few thoughts on the role of open source software and security. It is kind of a rambling post and I’ll probably lose all three of my readers by the end, but I found it interesting to think about how we got here in the first place.

I got my first computer, a TRS-80, as a Christmas present in 1978 from my parents.

Tarus and his TRS-80

As far as I know, these are the only known pictures of it, lifted from my high school yearbook.

Now, I know what you are thinking: Dude, looking that good how did you find the time off your social calendar to play with computers? Listen, if you love something, you make the time.

(grin)

Unlike today, I pretty much knew about all of the software that ran on that system. This was before “open source” (and before a lot of things) but since the most common programming language was BASIC, the main way to get software was to type in the program listing from a magazine or book. Thus it was “source available” at least, and that’s how I learned to type as well as being introduced to the “syntax error”. That cassette deck in the picture was the original way to store and retrieve programs, but if you were willing to spend about the same amount as the computer cost you could buy an external floppy drive. The very first program I bought on a floppy was from this little company called Microsoft, and it was their version of the Colossal Cave Adventure. Being Microsoft it came on a specially formatted floppy that tried to prevent access to the code or the ability to copy it.

And that was pretty much the way of the future, with huge fortunes being built on proprietary software. But still, for the most part you were aware of what was running on your particular system. You could trust the software that ran on your system as much as you could trust the company providing it.

Then along comes the Internet, the World Wide Web and browsers. At first, browsers didn’t do much dynamically. They would reach out and return static content, but then people started to want more from their browsing experience and along came Java applets, Flash and JavaScript. Now when you visit a website it can be hard to tell if you are getting tonight’s television listings or unknowingly mining Bitcoin. You are no longer in charge of the software that you run on your computer, and that can make it hard to make judgements about security.

I run a number of browsers on my computer but my default is Firefox. Firefox has a cool plugin called NoScript (and there are probably similar solutions for other browsers). NoScript is an extension that lets the user choose what JavaScript code is executed by the browser when visiting a page. A word of warning: the moment you install NoScript, you will break the Internet until you allow at least some JavaScript to run. It is rare to visit a site without JavaScript, and with NoScript I can audit what gets executed. I especially like this for visiting sensitive sites like banks or my health insurance provider.

Speaking of which, I just filed a grievance with Anthem. We recently switched health insurance companies and I noticed that when I go to the login page they are sending information to companies like Google, Microsoft (bing.com) and Facebook. Why?

Blocked JavaScript on the Anthem Website

I pretty much know the reason. Anthem didn’t build their own website, they probably hired a marketing company to do it, or at least part of it, and that’s just the way things are done now. You send information to those sites in order to get analytics on who is visiting your site, and while I’m fine with it when I’m thinking about buying a car, I am not okay with it coming from my insurance company or my bank. There are certain laws governing such privacy, with more coming every day, and there are consequences for violating them. They are supposed to get back to me in 30 days to let me know what they are sending, and if it is personal information, even if it is just an IP Address, it could be a violation.

I bring this up in part to complain but mainly to illustrate how hard it is to be “secure” with modern software. You would think you could trust a well known insurance company to know better, but it looks like you can’t.

Which brings us back to Solarwinds.

Full disclosure: I am heavily involved in the open source network monitoring platform OpenNMS. While we don’t compete head to head with Solarwinds products (our platform is designed for people with at least a moderate amount of skill with using enterprise software while Solarwinds is more “pointy-clicky”) we have had a number of former Solarwinds users switch to our solution so we can be considered competitors in that fashion. I don’t believe we have ever lost a deal to Solarwinds, at least one in which our sales team was involved.

Now, I wouldn’t wish what happened to Solarwinds on my worst enemy, especially since the exploit impacted a large number of US Government sites and that does affect me personally. But I have to point out the irony of a company known for criticizing open source software, specifically on security, letting this happen to their product. Take this post from one of their forums. While I wasn’t able to find out if the author worked at Solarwinds or not, they compare open source to “eating from a dirty fork”.

Seriously.

But is open source really more secure? Yes, but in order to explain that I have to talk about types of security issues.

Security issues can be divided into “unintentional”, i.e. bugs, and “intentional”, someone actively trying to manipulate the software. While all software but the most simple suffers from bugs, what happened to the Solarwinds supply chain was definitely intentional.

When it comes to unintentional security issues, the main argument against open source is that since the code is available to anyone, a bad actor could exploit a security weakness and no one would know. They don’t have to tell anyone about it. There is some validity to the argument but in my experience security issues in open source code tend to be found by conscientious people who duly report them. Even with OpenNMS we have had our share of issues, and I’d like to talk about two of them.

The first comes from back in 2015, and it involved a Java serialization bug in the Apache commons library. The affected library was in use by a large number of applications, but it turns out OpenNMS was used as a reference to demonstrate the exploit. While there was nothing funny about a remote code execution vulnerability, I did find it amusing that they discovered it with OpenNMS running on Windows. Yes, you can get OpenNMS to run on Windows, but it is definitely not easy so I have to admire them for getting it to work.

I really didn’t admire them for releasing the issue without contacting us first. Sending an email to “security” at “opennms.org” gets seen by a lot of people and we take security extremely seriously. We immediately issued a work around (which was to make sure the firewall blocked the port that allowed the exploit) and implemented the upgraded library when it became available. One reason we didn’t see it previously is that most OpenNMS users tend to run it on Linux and it is just a good security practice to block all but needed ports via the firewall.

The second one is more recent. A researcher found a JEXL vulnerability in Newts, which is a time series database project we maintain. They reached out to us first, and not only did we realize that the issue was present in Newts, it was also present in OpenNMS. The development team rapidly released a fix and we did a full disclosure, giving due credit to the reporter.

In my experience that is the more common case within open source. Someone finds the issue, either through experimentation or by examining the code, they communicate it to the maintainers and it gets fixed. The issue is then communicated to the community at large. I believe that is the main reason open source is more secure than closed source.

With respect to proprietary software, it doesn’t appear that having the code hidden really helps. I was unable to find a comprehensive list of zero-day Windows exploits but there seem to be a lot of them. I don’t mean to imply that Windows is exceptionally buggy but it is a common and huge application and that complexity lends itself to bugs. Also, I’m not sure if the code is truly hidden. I’m certain that someone, somewhere, outside of Microsoft has a copy of at least some of the code. Since that code isn’t freely available, they probably have it for less than noble reasons, and one cannot expect any security issues they find to be reported in order to be fixed.

There seems to be this misunderstanding that proprietary code must somehow be “better” than open source code. Trust me, in my day I’ve seen some seriously crappy code sold at high prices under the banner of proprietary enterprise software. I knew of one company that wrote up a bunch of fancy bash scripts (not that there is anything wrong with fancy bash scripts) and then distributed them encrypted. The product shipped with a compiled program that would spawn a shell, decrypt the script, execute it and then kill the shell.

Also, at OpenNMS we rely heavily on unit tests. When a feature is developed the person writing the code also creates code to “test” the feature to make sure it works. When we compile OpenNMS the tests are run to make sure the changes being made didn’t break anything that used to work. Currently we have over 8000 of these tests. I was talking to a person about this who worked for a proprietary software company and he said, “oh, we tried that, but it was too hard.”
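To make the idea concrete, here is a toy illustration in shell. It is not taken from the OpenNMS test suite, and both function names are hypothetical. The first function is the “feature” and the second is the code that tests it.

```shell
# Hypothetical feature: succeed only for a valid TCP/UDP port number (1-65535).
is_valid_port() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

# The matching test: run on every build so a later change cannot silently break the feature.
test_is_valid_port() {
    is_valid_port 162 || return 1        # a typical port should pass
    is_valid_port 65535 || return 1      # upper bound should pass
    ! is_valid_port 0 || return 1        # zero is out of range
    ! is_valid_port 65536 || return 1    # above the upper bound
    ! is_valid_port 80a || return 1      # non-numeric input
    echo "is_valid_port: all checks passed"
}
```

If someone later “simplifies” is_valid_port and accidentally lets port 0 through, test_is_valid_port fails the next time it is run, and the mistake is caught before it ships.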

Finally, I want to get back to that other type of security issue, the “intentional” one. To my understanding, someone was able to get access to the servers that built and distributed Solarwinds products, and they added in malware that let them compromise target networks when they upgraded their applications. Any way you look at it, it was just sloppy security, but I think the reason it went on for so long undetected is that the whole proprietary process for distributing the software was limited to so few people it was easy to miss. These kinds of attacks happen in open source projects, too, they just get caught much faster.

That is the beauty of being able to see the code. You have the choice to build your own packages if you want, and you can examine code changes to your heart’s content.

We host OpenNMS at Github. If you check out the code you could run something like:

git tag --list

to see a list of release tags. As I write this the latest released version of Horizon is 26.0.1. To see what changed from 26.0.0 I can run

git log --no-merges opennms-26.0.0-1..opennms-26.0.1-1

If you want, there is even a script to run a “release report” which will give you all of the Jira issues referenced between the two versions:

git-release-report opennms-26.0.0-1 opennms-26.0.1-1

While that doesn’t guarantee the lack of malicious code, it does put the control back into your hands and the hands of many others. If something did manage to slip in, I’m sure we’d catch it long before it got released to our users.
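If you find yourself comparing releases often, the git commands can be wrapped in a couple of small helpers. This is only a sketch: the function names are mine, and it assumes you are inside a checkout where both tags exist.

```shell
# List non-merge commits reachable from the newer tag but not from the older one.
commits_between() {
    git log --no-merges --oneline "$1..$2"
}

# Summarize which files changed between the two tags, and by how much.
files_between() {
    git diff --stat "$1" "$2"
}
```

For example, `commits_between opennms-26.0.0-1 opennms-26.0.1-1` lists the individual changes, and `files_between` with the same two tags shows where in the tree they landed, which is a reasonable starting point for deciding what to read more closely.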

Security is not easy, and as with many hard things the burden is eased the more people who help out. In general open source software is just naturally better at this than proprietary software.

There are only a few people on this planet who have the knowledge to review every line of code on a modern computer and understand it, and that is with the most basic software installed. You have to trust someone and for my peace of mind nothing beats the open source community and the software they create.

It Was Twenty Years Ago Today …

On March 30th, 2000, the OpenNMS Project was registered on Sourceforge. While the project actually started sometime in the summer of 1999, this was the first time OpenNMS code had been made public so we’ve always treated this day as the birth date of the OpenNMS project.

Wow.

OpenNMS Entry on Sourceforge

Now I wasn’t around back then. I didn’t join the project until September of 2001. When I took over the project in May of 2002 I didn’t really think I could keep it alive for twenty years.

Seriously. I wasn’t then nor am I now a Java programmer. I just had a feeling that there was something of value in OpenNMS, something worth saving, and I was willing to give it a shot. Now OpenNMS is considered indispensable at some of the world’s largest companies, and we are undergoing a period of explosive growth and change that should cement the future of OpenNMS for another twenty years.

What really kept OpenNMS alive was its community. In the beginning, when I was working from home using a slow satellite connection, OpenNMS was kept alive by people on the IRC channel, people like DJ and Mike who are still involved in the project today. A year or so later I was able to convince my business partner and good friend David to join me, and together we recruited a real Java programmer in Matt. Matt is no longer involved in the project (people leaving your project is one of the hardest things to get used to in open source) but his contributions in those early days were important. Several years after that we were joined by Ben and Jeff, who are still with us today, and through slow and steady steps the company grew alongside the project. They were followed by even more amazing people that make up the team today (I really want to name every single one of them but I’m afraid I’ll miss one and they’ll be rightfully upset).

I can’t really overstate how little responsibility I bear for the success of OpenNMS. My only talent is getting amazing people to work with me, and then I just try to remove any obstacles that get in their way. I get some recognition as “The Mouth of OpenNMS” but most of the time I just stand on the shoulders of giants and enjoy the view.

Meridian 2018

It is hard to believe that our first release of OpenNMS Meridian was over three years ago.

Meridian Logo

We were struggling with trying to balance the needs of a support organization with the open source desire to “release early, release often”. How do you deal with wanting to be as cutting edge as possible while also supporting customers who really need a stable platform? We did have a “development” release, but no one really used it.

Our answer was to model OpenNMS on Red Hat, the most successful open source company in existence. While Red Hat has hundreds of products, their main offering is Red Hat Enterprise Linux (RHEL). This is derived, in large part, from the Fedora Linux distribution. New things hit Fedora first and, once vetted, make their way into RHEL.

We decided to do the same thing with OpenNMS. OpenNMS was split into two main branches: Horizon and Meridian. Horizon was the Fedora equivalent, while Meridian was modeled on RHEL.

This has been very successful. While we were averaging a new major OpenNMS release every 18 months, now we do three or four Horizon releases per year. Tons of new features are hitting Horizon, from the ability to deal with telemetry data, to new correlation features that condense alarms into “situations” based on unsupervised machine learning, to the first steps toward a microservices architecture.

We do our best to release code as production-ready as possible. Our users are very creative and use OpenNMS in unique ways. Offering rapid Horizon releases allows us to find and fix issues quickly and work out how to best implement new functionality.

But what about our users who are more interested in stability than the “new shiny”? They needed a system that was rock solid and easy to maintain. That’s why we created Meridian. Meridian lags Horizon on features but by the time a feature hits Meridian, it has been tested thoroughly and can immediately be deployed into production.

There is one major Meridian release a year, with usually three or four point updates. Anyone who has ever upgraded OpenNMS understands that dealing with configuration file changes can be problematic. With Meridian, moving from one point release to another rarely changes configuration, so upgrades can happen in minutes and users can rest assured that their systems are up to date and secure. Each Meridian release is supported for three years.

There is a cost associated with using Meridian. Similar to RHEL, it is offered as a subscription. While still 100% open source, you pay a fee to access the update servers, and the idea is that you are paying for the effort it takes to refine Horizon into Meridian and get the most stable version of OpenNMS possible. We are so convinced that Meridian is worth it that it is available without having to buy a support contract. Meridian users get access to OpenNMS Connect, which is a forum for asking questions about using Meridian.

It seems like it was just yesterday that we did this but it has now been over three years. That means support will sunset on Meridian 2015 at the end of the year. Never fear, the latest releases are just as stable and even more feature rich.

The main feature in Meridian 2018 is support for the OpenNMS Minion. The Minion is a stateless application that allows for remote distribution of OpenNMS functionality. For example, I used to run an OpenNMS instance at my house to monitor my devices. Now I just have a Minion. Even though my network is not reachable from our production OpenNMS instance, the Minion allows me to test service availability, as well as collect data and traps, and then forward them on to the main application. The Minion itself is stateless – it connects to a messaging broker on the OpenNMS server in order to get its list of tasks.

A Minion is defined by its “Location”. You can have multiple Minions for a given location and they will access the broker via a “competitive consumer queue”. This way, if a particular Minion goes down, another can do the work. By default OpenNMS ships with ActiveMQ as the broker, but it is also possible to use an external Kafka instance. Kafka can be clustered for both load balancing and reliability, and the combination of a Kafka cluster and multiple Minions can make the number of devices OpenNMS monitors virtually limitless (we are working on a proof of concept for one user with over 8 million discrete devices).

There are a number of other features in Meridian 2018, so check out the release notes for more details. It is an exciting addition to the OpenNMS product line.

Dealing with Docker Interfaces

We run a lot of instances of OpenNMS (‘natch) and lately we’ve seen issues with disk space being used up faster than expected.

We tracked the issue down to Docker. If Docker is running on a machine, SNMP will discover a Docker interface, usually labelled “docker0”. When that instance is stopped and restarted, or another Docker instance is created, another interface will be created. This will create a lot of RRD files of limited usefulness, so here is how to address it.

First, we want to tell OpenNMS not to discover those interfaces in the first place. This is done using a “policy” in the foreign source definition for the devices in question. Here is what it looks like in the webUI:

Skip Docker Interfaces Policy

The “SNMP Interface Policy” will match on various fields in the snmpinterface table in the database, which includes ifDescr. The regular expression will match any ifDescr that starts with the string “docker” and it will not persist (add) it to the database. This policy has only one parameter, so either “Match All Parameters” or “Match Any Parameter” will work.
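If you want to sanity-check the pattern before saving the policy, you can try it against a few sample ifDescr values. This is only a sketch: the interface names below are made up for illustration, and grep -E is standing in for whatever matcher OpenNMS uses internally, which I am assuming behaves the same way for a simple anchored pattern like this one.

```shell
#!/bin/sh
# Hypothetical ifDescr values; substitute names seen on your own devices.
sample_ifdescrs() {
    printf '%s\n' lo eth0 docker0 docker_gwbridge br-docker veth1a2b3c
}

# Print only the names that start with "docker".
# The pattern is the policy value without its leading "~", which (as far as
# I know) only marks the value as a regular expression.
matches_docker_policy() {
    grep -E '^docker.*$'
}

sample_ifdescrs | matches_docker_policy
```

Note that a name like “br-docker” is not matched, because the pattern is anchored to the start of the string.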

If you want to use the command line, or have a lot of custom foreign source definitions, you can paste this into the proper file:

   <policies>
      <policy name="Ignore Docker interfaces" class="org.opennms.netmgt.provision.persist.policies.MatchingSnmpInterfacePolicy">
         <parameter key="action" value="DO_NOT_PERSIST"/>
         <parameter key="ifDescr" value="~^docker.*$"/>
         <parameter key="matchBehavior" value="ALL_PARAMETERS"/>
      </policy>
   </policies>

This will not deal with any existing interfaces, however. For that there are two steps: delete the interfaces from the database and delete them from the file system.

For the database, with OpenNMS stopped, access PostgreSQL (usually with psql -U opennms opennms) and run:

delete from ipinterface where snmpinterfaceid in (select id from snmpinterface where snmpifdescr like 'docker%');

and restart OpenNMS.

For the filesystem, navigate to where your RRDs are stored (usually /opt/opennms/share/rrd/snmp) and run:

find . -type d -name "docker*" -prune -exec rm -r {} \;

That should get rid of existing Docker interfaces, free up disk space and prevent new Docker interfaces from being discovered.
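Since the find command above deletes data, it may be worth listing what it would remove before running it. Here is a minimal sketch of such a preview. The function name is my own invention, and the default path is simply the usual RRD location mentioned above, so adjust it if your installation differs.

```shell
#!/bin/sh
# List (without deleting) every directory under the given root whose name
# starts with "docker". -prune keeps find from descending into a match.
list_docker_rrd_dirs() {
    find "$1" -type d -name 'docker*' -prune -print
}

# Default root is the usual RRD location; pass another path as the first
# argument if your RRDs live elsewhere.
rrd_root="${1:-/opt/opennms/share/rrd/snmp}"
if [ -d "$rrd_root" ]; then
    list_docker_rrd_dirs "$rrd_root"
fi
```

If the list contains only Docker interface directories, the delete command can be run with some confidence.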

Horizon™ Version 20 Released

Just a heads up that version 20 of Horizon has been released.

Since version 20 coincides with the 20th anniversary of the film The Fifth Element, we’ve decided to use characters from that movie as codenames for this release. Version 20.0.0 is called “Leeloo”.

This release continues our commitment to rapid releases in the Horizon product line, and is mainly focused on bug fixes, small enhancements and code cleanup. We have removed all use of Castor for the parsing of XML files and replaced it with JAXB, and a number of deprecated events have been removed from the system.

Probably the biggest new feature is a topology provider that can be used to create custom maps. The Asset Topology Provider generates a GraphML topology based on node metadata including asset fields.

You can read the announcement and, for more information, check out the release notes.

New Meridian® Releases Available

Just a quick note to point out that new Meridian releases are now available: 2015.1.5 and 2016.1.5.

For those who aren’t aware, Meridian is a subscription-based version of OpenNMS built to complement Horizon, the cutting edge release. You can think of Meridian as our Red Hat Enterprise Linux to Horizon’s Fedora. There is one major Meridian release per year and each major release is supported for three years.

Before the Meridian/Horizon split it was taking us 18 months or so to do a new major release of OpenNMS. Now we do three to four Horizon major releases a year.

About half of our revenue comes from support contracts, so we had to be extra careful when doing a release, and even then many of our customers were reluctant to upgrade because the process could be involved. This was bad for two main reasons: often they wouldn’t get bug fixes, which meant an increase in support tickets, and more importantly they might miss security updates.

Updates to Meridian, within a major release, are dead simple. This is the process I used yesterday to upgrade our production instance of OpenNMS.

First, I made a backup of the /opt/opennms/etc and /opt/opennms/jetty-webapps/opennms directories. The first is out of habit since configuration files shouldn’t change between point releases, but the second is to preserve any customizations made to the webUI. I modify the main OpenNMS page to include a “weather widget” and that customization gets removed on upgrades. Most users won’t have an issue but just in case I like having a backup.

Next, I stopped OpenNMS and ran yum install opennms, which downloaded and installed the new release. The final step was to run /opt/opennms/bin/install -dis to ensure the database is up to date.

And that’s it. In my case, I copy the index.jsp from my backup to restore the weather information, but otherwise you just restart OpenNMS. The process takes minutes and is basically as fast as your Internet connection.
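For anyone who would rather script it, the steps above could be sketched roughly as follows. Treat this as an illustration and not a supported procedure: the backup location is hypothetical, the run helper and the DRY_RUN switch are my own additions, and I am assuming OpenNMS is stopped and started with the /opt/opennms/bin/opennms control script (your system may use a service manager instead).

```shell
#!/bin/sh
# Sketch of the point-release upgrade described above.
# Set DRY_RUN=1 to print each command instead of running it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

upgrade_meridian() {
    # Hypothetical backup location; pick one with enough free space.
    backup_dir="${1:-/root/opennms-backup-$(date +%Y%m%d)}"

    # 1. Back up the configuration and the webUI (including any customizations).
    run mkdir -p "$backup_dir" || return 1
    run cp -a /opt/opennms/etc "$backup_dir/etc" || return 1
    run cp -a /opt/opennms/jetty-webapps/opennms "$backup_dir/webapp" || return 1

    # 2. Stop OpenNMS (assumed control script) and install the new release.
    run /opt/opennms/bin/opennms stop || return 1
    run yum install opennms || return 1

    # 3. Make sure the database is up to date, then start OpenNMS again.
    run /opt/opennms/bin/install -dis || return 1
    run /opt/opennms/bin/opennms start
}
```

Running it as DRY_RUN=1 upgrade_meridian /some/backup/dir prints the commands without touching the system, which is a reasonable first step. Restoring any webUI customizations from the backup is left as a manual step.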

If you have a Meridian subscription, be sure to upgrade as soon as you are able, and if you don’t, what are you waiting for? (grin)

Ulf: My Favorite Open Source Animal

Over at opensource.com they asked “What’s your favorite open source animal?” Hands down, it’s Ulf.

OpenNMS Kiwi: Ulf

When I was at FOSDEM this year, we were often asked about the origin of having a kiwi as our mascot. Kiwis are mainly associated with New Zealand, and OpenNMS is not from New Zealand. But Ulf is.

Every year we have a developer’s conference called “Dev Jam”. Back in 2010, a man named Craig Miskell came from NZ and brought along a plush toy kiwi. He gave it to a group of people who had come from Germany, since he had come the furthest east for the conference and they had come the furthest west. They named him “Ulf”.

There was no conscious decision to make Ulf our mascot, it just happened organically. People in the project started treating him as a “traveling gnome”, setting up a wiki page to track some of the places he’s been, and he even has his own Twitter account.

I lost him once. We had a holiday party a few years ago and Ulf went missing. We thought he had been left in a limo, so I dutifully sought out a replacement. I found one for US$9, but of course shipping from NZ was an additional US$80, so I bought two. I later found Ulf hiding in the pocket of a formal overcoat I rarely wear (but had the night of the party) so now we have a random array of individual Ulfs.

Anyway, Ulf manages to represent OpenNMS often, from stickers to holiday cards and keychains. I love the fact that he just kind of happened; we didn’t make a conscious decision to use him in marketing. If you happen to come across OpenNMS at conferences like FOSDEM, be sure to stop by and say “hi”.