OpenNMS Horizon 17.1 Released

As some of you may have noticed, Horizon 17.1 has been released. As Horizon 17 will form the basis for Meridian 2016, I’m extremely happy to see how much progress has been made on fixing issues.

Be sure to check out the Release Notes.

Horizon is our rapid release version, and its goal is to get all the cool new features out as soon as they are ready. In this case, right about release we discovered an annoying but easy to fix bug with provisioning, so if you plan to run Horizon 17.1 you should also apply this patch.

Have fun, and we hope you find OpenNMS useful.

Bug

  • [NMS-4108] – Bad suggestions in install guide
  • [NMS-7152] – Enlinkd Topology Plugin fails to create LLDP links for mismatched link port descriptions.
  • [NMS-7820] – snmp-graph.properties.d files with –vertical-label="verticalLabel" in config
  • [NMS-7866] – Incorrect host in Location header when creating resources via ReST
  • [NMS-7910] – config-tester is broken
  • [NMS-7953] – Opsboard and Opspanel use wrong logo
  • [NMS-7966] – Unable to generate eventconf if a MIB (improperly?) uses a TC to define a TC
  • [NMS-7988] – container features.xml still references jersey 1.18 when it should reference 1.19.
  • [NMS-8000] – Topology-UI shows CDP links not correct
  • [NMS-8014] – Backshift graphs show dates in UTC instead of the browser's timezone
  • [NMS-8017] – StrafePing: Unexpected exception while polling PollableService
  • [NMS-8023] – Grafana Box did not work anymore
  • [NMS-8026] – Constant Thread Locking on Enlinkd
  • [NMS-8029] – Wrong use of opennms.web.base-url
  • [NMS-8038] – NRTG with newts – get StringIndexOutOfBoundsException
  • [NMS-8051] – The newts script only works if cassandra runs on localhost
  • [NMS-8054] – AlarmPersisterIT test is empty
  • [NMS-8064] – The 'newts init' script does work when authentication is enabled in Cassandra
  • [NMS-8065] – ReST Regression in Alarms/Events
  • [NMS-8066] – Newts only uses a single thread when writing to Cassandra
  • [NMS-8073] – User Restriction Filters: mapping class for roles to groups does not work
  • [NMS-8074] – The "Remove From Focus" button intermittently fails
  • [NMS-8079] – The OnmsDaoContainer does not update its cache correctly, leading to a NumberFormatException
  • [NMS-8084] – File not found exception for interfaceSTP-box.jsp on SNMP interface page
  • [NMS-8097] – Installation Guide Debian Bug Version 17.0.0
  • [NMS-8100] – Unable to complete creation of scheduled reports
  • [NMS-8103] – NPE when persisting data with Newts
  • [NMS-8104] – init script checkXmlFiles() fails to pick up errors
  • [NMS-8106] – INFO-severity syslog-derived events end up unmatched
  • [NMS-8109] – Memory leak when using the BSFDetector
  • [NMS-8112] – init script "configtest" exit value is always 1
  • [NMS-8116] – Heat map Alarms/Categories do not show all categories
  • [NMS-8119] – WS-MAN has broken ForeignSourceConfigRestService and the requisitions UI doesn't work.
  • [NMS-8123] – Removing ops boards via the configuration UI does not update the table
  • [NMS-8126] – JNA ping code reuses buffer causing inconsistent reads of packet contents
  • [NMS-8133] – Synchronizing a requisition fails
  • [NMS-8147] – Add all the services declared on Collectd and Pollerd configuration as available services on /opennms/rest/foreignSourceConfig/services

Enhancement

  • [NMS-7123] – Expose SNMP4J 2.x noGetBulk and allowSnmpV2cInV1 capabilities
  • [NMS-7978] – Add threshold comments and whitespace changes to match how the OpenNMS web GUI generates XML files
  • [NMS-8005] – Add support for using NRTG via Ajax calls
  • [NMS-8024] – Add support for OSGi-based Ticketing plugins
  • [NMS-8028] – Add event definition for postfix syslog message TLS disabled
  • [NMS-8030] – Improve the SNMP data collection config parsing to give more flexibility to the users
  • [NMS-8042] – set Up severities for RADLAN-MIB.events.xml
  • [NMS-8068] – Add support for marshalling NorthboundAlarms to XML
  • [NMS-8071] – Event definition file for JUNIPER-IVE-MIB
  • [NMS-8120] – Fixed a paragraph in the "Automatic Discovery" provisioning chapter
  • [NMS-8156] – Upgrade Angular Backend for the Requisitions UI

Review: Angel Sensor Fitness Tracker

Angel Sensor represents everything that’s wrong with the technology industry today.

TL;DR; Two years ago, Angel Sensor ran an Indiegogo campaign to create an “open sensor for health and fitness”. They implied that the software would be open source. I finally got mine this week and it is total bollocks. Not only is the software not open source, the app that goes with it is barely an app. There is little communication from the vendor to the community, and while the hardware is solid, it is too expensive to manufacture so the “classic” model is obsolete on delivery. Don’t deal with this company.

Okay, so I like metrics. I work on an open source project to monitor anything you reach over the network. I have a weather station at my house and a temperature sensor in my workshop. I am very eager to gather information about what’s going on in my body, and while companies like Fitbit make great products for that purpose, I distrust sending this most personal data to a third party.

So a couple of years ago I did a search on “open source fitness tracking” and came across Angel Sensor. This company claimed that they were going to create an open health platform where the software would be open source, so I bought an Angel Sensor wristband and eagerly awaited its arrival.

And waited. And waited.

Two years later, it finally arrived and it is a total disappointment.

First, the good.

The packaging is nice.

Angel Sensor Box

The band itself is in its own compartment, and on the side of the box is a little drawer that you can pull out containing the accessories:

Angel Sensor What's In the Box

You get the band, a small instruction booklet, a charging cradle and seven flexible clasps (of various lengths) that help hold the band to your wrist.

I picked out a clasp and pretty soon had it on my wrist:

Angel Sensor On Wrist

Although heavier than I would have expected, it felt comfortable, something I could wear 24/7. My LG Urbane watch is slightly thicker but overall a bit lighter:

Angel Sensor vs. LG Urbane

The instruction booklet says to charge the band fully before using, and this is where the problems started. The “classic” uses a charging cradle. At both ends of the band (where the clasp connects) are metal studs. You insert one set of studs (marked on the band) into the cradle to charge it. The problem is that there is nothing in the charger to really grab on to the studs, so in my case it kept losing the connection and charging would stop. I had to prop it up at an angle in order to keep a connection, and even then I wasn’t sure about it staying in place.

Which is a shame, since the band itself is rather stylish. While there is no screen, there are two white LEDs on each side of the band that can glow and pulse to let you know something is going on. I kept having to keep an eye on the LEDs to make sure the thing was still charging.

Note that all this is moot since the classic proved too hard to produce. The new unit is called the M1. The M1 is thicker and you lose water resistance, which I think is an important feature. While I don’t plan to dive with a fitness tracker, I might wash dishes while wearing it, so the ability to be submerged in liquid for a small amount of time is a requirement. The M1 does use a standard microUSB charging connector so that is a plus in its favor.

Summary to this point: solid hardware design, although now obsolete, with a major flaw in the charger.

My real disappointment set in with the software.

I knew something was wrong when they announced on their blog that the first app released would be iOS only. Now I don’t have a problem with people leading with the iOS version, it is a huge market, but when your market differentiation is based on being “open” one would assume that an Android version would be first to encourage more contribution. Alas, the Android version seems to be more of an afterthought. You want to see it? Here it is:

Angel Sensor Android App Screen

Yup. You’re looking at it. No menu, no explanation, just four values.

To get to this point, I downloaded the app from Google Play, launched it, and then paired it with the band. You do this by tapping on the band’s button once, which will cause it to vibrate. The sensor will then show up on the app’s screen and you can connect to it. Note that you have to do this every time you launch the app, or at least I did. The sensor will be identified by a number and a MAC address.

On the main screen you get what I assume to be heart rate, body temperature, number of steps and some unknown value represented in units of “g”. No history, no way to, say, gather and export collected data, no way to even change the temperature units from Celsius to Fahrenheit. The title bar does show connection strength and battery life, but the only other thing is a tab for firmware updates. From what I can tell, it doesn’t actually check anywhere for firmware updates, but it would give you the ability to install one should it be released.

Of course, the app doesn’t tell you the current firmware version, and there doesn’t seem to be a download section anywhere on the Angel website (the support section is just a duplicate of the FAQ section).

Oh, and if you turn it sideways, you get some graphs.

Angel Sensor Android App graphs

No explanation of what is actually being graphed, but it does wiggle around a bit. I did find a Youtube video that suggests the top two graphs are heart related and the bottom one is motion, and the IOS app shown in the video has more features, but since I don’t have a modern iPhone I couldn’t check it out.

Since I couldn’t believe this was it, I kept searching for software related to the Angel Sensor. I did manage to find some code on Github (which appears to be an SDK for accessing the API and perhaps the sad Android app). Of course, the License file is incomplete and doesn’t really say under what license the software is published. Once again people, access to source code doesn’t make it open source.

And this is the biggest flaw with Angel Sensor and one reason I have such an issue with the current technology environment. You have people like Paul Graham crowing about creating income inequality and that has resulted in a new start up life cycle. You come up with an idea (that hasn’t changed) but now you wrap it a bunch of buzzwords, like “open”, “IoT” and “mobile”, even if you don’t really understand what they mean.

Next, you raise some money through crowdfunding, often underestimating the amount needed so your “funding” can be a success. Now, remember, your audience isn’t the poor saps who decided to give you money – you want to show viability to VCs who can deliver the money you’ll actually need. You use the initial cash to build just enough product to either get a lot of investment or get acquired by someone wanting to cut a year or so off developing their own product. This is considered “success” in some circles.

(sigh)

This whole thing is a shame since there is a market for a truly open mobile health platform. I’m not insisting that the hardware be open (I’m not going to build my own silicon molds) but all the software, including the firmware, should be available under an OSI-approved license. Then you need to focus on building a community, and that really requires good communication.

Angel Sensor sucks at communication.

In addition to the total lack of documentation, even their blog fails miserably. In the last half of 2015 there were a total of three posts, the last one from 3 November. I used to be able to get a rapid response from a person named Io Salant, but when I wrote to them a few months ago asking for a status I got:

Io is on a well-deserved leave. Your email has been forwarded to hello@angelsensor.com
The team is quite busy, but will make every attempt to get to your email as quickly as possible.

Of course, never heard back. So my now “classic” Angel Sensor is destined for eBay.

In summary, this is another case of a crowdfunded effort done by people in over their heads with no real desire to create something that lasts but to make money at the expense of their customers. I wouldn’t trust Angel Sensor to feed my dog, much less monitor my health, and I can only hope someone with actual ability will come out with a truly open personal medical platform.

Add a Weather Widget to OpenNMS Home Screen

I was recently at a client site where I met a man named Jeremy Ford. He’s sharp as a knife and even though, at the time, he was new to OpenNMS, he had already hacked a few neat things into the system (open source FTW).

Weathermap on OpenNMS Home Page

One of those was the addition of a weathermap to the OpenNMS home page. He has graciously put the code up on Github.

The code is a script that will generate a JSP file in the OpenNMS “includes” directory. All you have to do then is to add a reference to it in the main index.jsp file.

For those of you who don’t know or who have never poked around, under the $OPENNMS_HOME directory should be a directory called jetty-webapps. That is the web root directory for the Jetty servlet container that ships with OpenNMS.

Under that directory you’ll find a subdirectory for opennms. When you surf to http://[my OpenNMS Server]:8980/opennms that is the directory you are visiting. In it is an index.jsp file that serves as the main page.

If you are familiar with HTML, the JSP file is very similar. It can contain references to Java code, but a lot of it is straight HTML. The file is kept simple on purpose, with each of the three columns on the main page indicated by comments. The part you will need to change is the third column:

<!-- Right Column -->
        <div class="col-md-3" id="index-contentright">
                <!-- weather box -->
                <jsp:include page="/includes/weather.jsp" flush="false" />

Feel free to look around. If you ever wanted to rearrange the OpenNMS Home page, this is a good place to start.

Now, I used to like poking around with these files since they would update automatically, but later versions of OpenNMS (which contain later versions of Jetty) seem to require a restart. If you get an error, restart OpenNMS and see if it goes away.

Now the weather.jsp file gets generated by Jeremy’s python script. In order to get that to work you’ll need to do two things. The most important is to get an API key from Weather Underground. It is a pretty easy process, but be aware that you can only do 500 queries a day without paying. The second thing you’ll need to do is edit the three URLs in the script and change the location. It is currently set to “CA/San_Francisco” but I was able to change it to “NC/Pittsboro” and it “just worked”.

Finally, you’ll need to set the script up to run via cron. I’m not sure how frequently Weather Underground updates the data, but a 10 minute interval seems to work well. That’s only 144 queries a day, so you could easily double it and still be within your limit.

[IMPORTANT UPDATE: Jeremy pointed out that the script actually does three queries, not just one, so instead of doing 144 queries a day, it’s 432. Still leaves some room with 10 minute queries but you don’t want to increase the frequency too much.]

Thanks to Jeremy for taking the time to share this. Remember, once you get it working, if you upgrade OpenNMS you’ll need to edit index.jsp and add it back, but that should be the only change needed.

Dev-Jam 2016 Dates Announced

Yay! We have settled on dates for the eleventh (!) OpenNMS Dev-Jam Conference.

Dev-Jam 2015 Group Picture

Once again we will descend on the campus of the University of Minnesota for a week of fun, fellowship and hacking on OpenNMS and all things open source.

Anyone is welcome to attend, although I must stress that this is aimed at developers and it is highly unstructured. Despite that, we get a ton of things done and have a lot of fun doing it (and I’m not just saying that, there’s videos).

We stay at Yudof Hall on campus, and while that can scare older folks I want to point out the accommodation is quite nice and I’ve been told they they have recently refurbished the dorm. If you want to stay on campus the cost is US$1500 for the week which includes all meals.

If you prefer hotels, there are several nearby, and you can come to the conference for US$800.

Registration is now open and space is limited. If you think you want to come but aren’t sure, let me know and I’ll try to save you a space. We’ve sold out the last two years.

Oh, sponsorships are available as well for $2500. You will help us bring someone deserving to Dev Jam who wouldn’t ordinarily get to attend, and you’ll get your logo and link on www.opennms.org for a year.

Dev Jam!

OmniROM 6.0

For the last few days it has been hard to remain true to my free and open source roots. I guess I’ve been spoiled lately with almost everything I try out “just working”, but it wasn’t so with my upgrade to OmniROM 6.0 on my Nexus 6 (shamu).

I’ve been a big fan of OmniROM since it came out, and I base my phone purchases on what handsets are officially supported. While I tend not to rush to upgrade to the latest and greatest, once the official nightlies switched to Android “Marshmallow” I decided to make the jump.

Now there are a couple of tools that I can’t live without when playing with my phone. They are the Team Win Recovery Project (TWRP) and Titanium Backup. The first lets you create easy to restore complete backups and the latter allows you to restore application status even if you factory reset your device, which I had to do.

[NOTE: I should also mention that I rely on Chainfire’s SuperSU for root. It took me awhile to find a link for it I trust.]

When I tried the first 6.0 nightlies, all I did was sideload the ROM, wipe the caches, and reboot. I liked the new “OMNI” splash screen but once the phone booted, the error “Unfortunately process com.android.phone has stopped” popped up and couldn’t be cleared. Some investigation suggested a factory reset would fix the issue, but since I didn’t want to go through the hassle of restoring all of my applications I decided to just restore OmniROM 5.1 and wait to see if a later build would fix it.

Well, this weekend we got a dose of winter weather and I ended up home bound for several days, so I decided to give it another shot. I sideloaded the latest 6.0 nightly and sure enough, the same error occurred. So I did a factory reset and, voilà, the problem went away.

Now all I had to do was reload all 100+ apps. (sigh)

I started by installing the “pico” GApps package from Open GApps and in case you were wondering, the Nexus 6 uses a 32-bit ARM processor.

I guess I really shouldn’t complain, as doing a fresh install once in awhile can clean out a bunch of kruft that I’ve installed over the past year or so, but I’ve come expect OmniROM upgrades to be pretty easy.

One of the first things I installed from the Play store was the “K-9 Mail” application. Unfortunately, it kept having problems connecting to my personal mail server (the work one was fine). The sync would error with “SocketTimeoutException: fai”. So I rebooted back to Omni 5.1 and things seemed to work okay (although I did see that error when trying to sync some of the folders). Back I went to 6.0 (see where TWRP would come in handy here?) and I noticed that when I disabled Wi-Fi, it worked fine.

As I was trying to sleep last night it hit me – I bet it has something to do with IPv6. We use true IPv6 at the office, but not to our external corporate mail server, which would explain why a server in the office would fail but the other one work. At home I’m on Centurylink DSL and they don’t offer it (well, they offer 6rd which is IPv6 encapsulated over IPv4 but not only is it not “true” IPv6 you have to pay extra for a static IP to get it to work). I use a Hurricane Electric tunnel and apparently Marshmallow utilizes a different IPv6 stack and thus has issues trying to retrieve data from my mail server when using that protocol.

(sigh)

I tried turning off IPv6 on Android. It’s not easy and I couldn’t get any of the suggestions to work. Then I found a post that suggested it was the MTU, so I reduced the MTU to 1280 and still no love.

So I turned off the HE tunnel. Bam! K-9 started working fine.

For now I’ve just decided to leave IPv6 off. While I think we need to migrate there sooner rather than later, there is nothing I absolutely have to have IPv6 for at the moment and I think as bandwidth increases, having to tunnel will start to cause performance issues. Normal traffic, such as using rsync, seems to be faster without IPv6.

That experience cost me about two days, but at the moment I’m running the latest OmniROM and I’m pretty happy with it. The one open issue I have is that the AOSP keyboard crashes if you try to swipe (gesture type) but I just installed the Google Keyboard and now it works without issue.

I have to say that there were some moments when I was very close to installing the Google factory image back on my Nexus 6. It’s funny, but the ability to shake the phone to dismiss an alarm is kind of a critical app with me. Since the last time I checked it wasn’t an available option on the Google ROM, I was willing to stick it out a little longer and figure out my issues with OmniROM.

Heh, freedom.

OpenNMS at Scale

So, yes, the gang from OpenNMS will be at the SCaLE conference this weekend (I will not be there, unfortunately, due to a self-imposed conference hiatus this year). It should be a great time, and we are happy to be a Gold Sponsor.

But this post is not about that. This is about how Horizon 17 and data collection can scale. You can come by the booth at SCaLE and learn more about it, but here is the overview.

When OpenNMS first started, we leveraged the great application RRDTool for storing performance data. When we discovered a java port called JRobin, OpenNMS was modified to support that storage strategy as well.

Using a Round Robin database has a number of advantages. First, it’s compact. Once the file containing the RRD database is created, it never grows. Second, we used RRDTool to also graph the data.

However, there were problems. Many users had a need to store the raw collected data. RRDTool uses consolidation functions to store a time-series average. But the biggest issue was that writing lots of files required really fast hard drives. The more data you wanted to store, the greater your investment in disk arrays. Ultimately, you would hit a wall, which would require you to either reduce your data collection or partition out the data across multiple systems.

No more. With Horizon 17 OpenNMS fully supports a time-series database called Newts. Newts is built on Cassandra, and even a small Cassandra cluster can handle tens of thousands of inserts a second. Need more performance? Just add more nodes. Works across geographically distributed systems as well, so you get built-in high availability (something that was very difficult with RRDTool).

Just before Christmas I got to visit a customer on the Eastern Shore of Maryland. You wouldn’t think that location would be a hotbed of technical excellence, but it is rare that I get to work with such a quick team.

They brought me up for a “Getting to Know You” project. This is a two day engagement where we get to kick the tires on OpenNMS to see if it is a good fit. They had been using Zenoss Core (the free version) and they hit a wall. The features they wanted were all in the “enterprise” paid version and the free version just wouldn’t meet their needs. OpenNMS did, and being truly open source it fit their philosophy (and budget) much better.

This was a fun trip for me because they had already done most of the work. They had OpenNMS installed and monitoring their network, and they just needed me to help out on some interesting use cases.

One of their issues was the need to store a lot of performance data, and since I was eager to play with the Newts integration we decided to test it out.

In order to enable Newts, first you need a Cassandra cluster. It turns out that ScyllaDB works as well (more on that a bit later). If you are looking at the Newts website you can ignore the instructions on installing it as it it built directly into OpenNMS.

Another thing built in to OpenNMS is a new graphing library called Backshift. Since OpenNMS relied on RRDTool for graphing, a new data visualization tool was needed. Backshift leverages the RRDTool graphing syntax so your pre-defined graphs will work automatically. Note that some options, such as CANVAS colors, have not been implemented yet.

To switch to newts, in the opennms.properties file you’ll find a section:

###### Time Series Strategy ####
# Use this property to set the strategy used to persist and retrieve time series metrics:
# Supported values are:
#   rrd (default)
#   newts

org.opennms.timeseries.strategy=newts

Note: “rrd” strategy can refer to either JRobin or RRDTool, with JRobin as the default. This is set in rrd-configuration.properties.

The next section determines what will render the graphs.

###### Graphing #####
# Use this property to set the graph rendering engine type.  If set to 'auto', attempt
# to choose the appropriate backend depending on org.opennms.timeseries.strategy above.
# Supported values are:
#   auto (default)
#   png
#   placeholder
#   backshift
org.opennms.web.graphs.engine=auto

If you are using Newts, the “auto” setting will utilize Backshift but here is where you could set Backshift as the renderer even if you want to use an RRD strategy. You should try it out. It’s cool.

Finally, we come to the settings for Newts:

###### Newts #####
# Use these properties to configure persistence using Newts
# Note that Newts must be enabled using the 'org.opennms.timeseries.strategy' property
# for these to take effect.
#
org.opennms.newts.config.hostname=10.110.4.30,10.110.4.32
#org.opennms.newts.config.keyspace=newts

There are a lot of settings and most of those are described in the documentation, but in this case I wanted to demonstrate that you can point OpenNMS to multiple Cassandra instances. You can also set different keyspace names which allows multiple instances of OpenNMS to talk to the same Cassandra cluster and not share data.

From the “fine” documentation, they also recommend that you store the data based on the foreign source by setting this variable:

org.opennms.rrd.storeByForeignSource=true

I would recommend this if you are using provisiond and requisitions. If you are currently doing auto-discovery, then it may be better to reference it by nodeid, which is the default.

I want to point out two other values that will need to be increased from the defaults: org.opennms.newts.config.ring_buffer_size and org.opennms.newts.config.cache.max_entries. For this system they were both set to 1048576. The ring buffer is especially important since should it fill up, samples will be discarded.

So, how did it go? Well, after fixing a bug with the ring buffer, everything went well. That bug is one reason that features like this aren’t immediately included in Meridian. Luckily we were working with a client who was willing to let us investigate and correct the issue. By the time it hits Meridian 2016, it will be completely ready for production.

If you enable the OpenNMS-JVM service on your OpenNMS node, the system will automatically collected Newts performance data (assuming Newts is enabled). OpenNMS will also collect performance data from the Cassandra cluster including both general Cassandra metrics as well as Newts specific ones.

This system is connected to a two node Cassandra cluster and managing 3.8K inserts/sec.

Newts Samples Inserted

If I’m doing the math correctly, since we collect values once every 300 seconds (5 minutes) by default, that’s 1.15 million data points, and the system isn’t even working hard.

OpenNMS will also collect on ring buffer information, and I took a screen shot to demonstrate Backshift, which displays the data point as you mouse over it.

Newts Ring Buffer

Horizon 17 ships with a load testing program. For this cluster:

[root@nms stress]# java -jar target/newts-stress-jar-with-dependencies.jar INSERT -B 16 -n 32 -r 100 -m 1 -H cluster
-- Meters ----------------------------------------------------------------------
org.opennms.newts.stress.InsertDispatcher.samples
             count = 10512100
         mean rate = 51989.68 events/second
     1-minute rate = 51906.38 events/second
     5-minute rate = 38806.02 events/second
    15-minute rate = 31232.98 events/second

so there is plenty of room to grow. Need something faster? Just add more nodes. Or, you can switch to ScyllaDB which is a port of Cassandra written in C. When run against a four node ScyllaDB cluster the results were:

[root@nms stress]# java -jar target/newts-stress-jar-with-dependencies.jar INSERT -B 16 -n 32 -r 100 -m 1 -H cluster
-- Meters ----------------------------------------------------------------------
org.opennms.newts.stress.InsertDispatcher.samples
             count = 10512100
         mean rate = 89073.32 events/second
     1-minute rate = 88048.48 events/second
     5-minute rate = 85217.92 events/second
    15-minute rate = 84110.52 events/second

Unfortunately I do not have statistics for a four node Cassandra cluster to compare it directly with ScyllaDB.

Of course the Newts data directly fits in with the OpenNMS Grafana integration.

Grafana Inserts per Second

Which brings me to one down side of this storage strategy. It’s fast, which means it isn’t compact. On this system the disk space is growing at about 4GB/day, which would be 1.5TB/year.

Grafana Disk Space

If you consider that the data is replicated across Cassandra nodes, you would need that amount of space on each one. Since the availability of multi-Terabyte drives is pretty common, this shouldn’t be a problem, but be sure to ask yourself if all the data you are collecting is really necessary. Just because you can collect the data doesn’t mean you should.

OpenNMS is finally to the point where the storing of performance data is no longer an issue. You are more likely to hit limits with the collector, which in part is going to be driven by the speed of the network. I’ve been in large data centers with hundreds of thousands of interfaces all with sub-millisecond latency. On that network, OpenNMS could collect on hundreds of millions of data points. On a network with lots of remote equipment, however, timeouts and delays will impact how much data OpenNMS could collect.

But with a little creativity, even that goes away. Think about it – with a common, decentralized data storage system like Cassandra, you could have multiple OpenNMS instances all talking to the same data store. If you have them share a common database, you can use collectd filters to spread data collection out over any number of machines. While this would take planning, it is doable today.

What about tomorrow? Well, Horizon 18 will introduce the OpenNMS Minion code. Minions will allow OpenNMS to scale horizontally and can be managed directly from OpenNMS – no configuration tricks needed. This will truly position OpenNMS for the Internet of Things.

UPDATE: Alejandro pointed out the following:

There is a property in OpenNMS for Newts that doesn’t appear in opennms.properties called heartbeat org.opennms.newts.query.heartbeat, which expects a duration in milliseconds. By default is 450000 (i.e. 1.5 x 5min) and it is used when no heartbeat is specified. Should generally be 1.5x your biggest collection interval.

If you were using 10 minute polls, set it to 15 min (1.5 x 10min), and then you will see graphs. To do this, add the following to opennms.properties and restart OpenNMS:

org.opennms.newts.query.heartbeat=900000

Avoiding the Sad Graph of Software Death

Seth recently sent me to an interesting article by Gregory Brown discussing a “death spiral” often faced by software projects when issues and feature requests start to out pace the ability to close them.

Sad Graph of Death

Now Seth is pretty much in charge of managing our Jira instance, which is key to managing the progress of OpenNMS software development. He decided to look at our record:

OpenNMS Issues Graph

[UPDATE: Logged into Jira to get a lot more issues on the graph]

Not bad, not bad at all.

A lot of our ability to keep up with issues comes from our project’s investment in using the tool. It is very easy to let things slide, resulting the the first graph above and causing a project to possibly declare “issue bankruptcy“. Since all of this information is public for OpenNMS, it is important to keep it up to date and while we never have enough time for all the things we need to do, we make time for this.

I think it speaks volumes for Seth and the rest team that OpenNMS issues are managed so well. In part it comes naturally from “the open source way” since projects should be as transparent as possible, and managing issues is a key part of that.

The Inverter: Episode 58 – Nappy Hue Year

It’s a new year, and that means a new Bad Voltage.

Let’s hope the Intro is not an indication of things to come. Worst … intro … ever. Seriously, just jump to the 3 minute mark. You’ll be glad you did.

Okay, brand new year and that means predictions, where I predict that Jeremy will once again win. Yes, his entries aren’t all that strong, but he always wins.

The way the game works is that each member of the BV team must make two predictions, with bonus predictions available as well.

Jeremy’s Predictions:

  • This is the year that some sort of Artificial Intelligence (AI) or Virtual Reality (VR) device goes mainstream. I’m not sure if Mycroft or Echo counts as an AI device, but after playing with the Samsung Gear VR I made the prediction that VR would really take off this year. He specifically stated that the device in question would not be the Oculus Rift.
  • Apple will have a down year, meaning that gross revenues will be lower this year than in 2015. Hrm, I’ve been thinking this might happen but I’m not sure this is the year. In the show they brought up the prospect of Apple making a television, and if that happens I would expect enough fans to rush out and buy it that Apple’s revenues would increase considerably. But without a new product line, I think there is a good chance this could happen.
  • Bonus: a device with a bendable display will become popular. There are devices out there with bendable displays, but nothing much outside of CES. We’ll see.

Bryan’s Predictions:

  • Canonical pulls out of the phone/tablet business. While the Ubuntu phone hasn’t been a huge success, it is the vehicle for exploring the idea of turning a handset-sized device into the only computer you use (i.e. you connect it up to a keyboard and screen to make a “desktop”). I can’t really see Shuttleworth giving this up, but in a mobile market that is pretty much owned by Apple and Android, this probably makes good business sense.
  • In a repeat from last year, Bryan predicts that ChromeOS will run Android apps natively, i.e. any app you can get from the Google Play store will run on Chrome without any special tricks. Is the second time the charm?
  • Bonus: Wayland will not ship as the default replacement for X on any major distro. Probably a safe bet.

Jono’s Predictions:

  • The VR Project Morpheus on Playstation will be more popular than Oculus Rift. Another VR prediction, and it is hard to argue with his logic. Sony already has a large user base with its Playstation 4 console, and if this product can actually make it to market with a decent price point, you can expect a lot of adoption. Contrast that to the Oculus Rift, whose user base is still unknown, plus an estimated price tag of US$600 and the need for a high end graphics computer, and Morpheus has a strong chance to own the market. Making it to market and the overall user experience will still determine if this is a winner or a dud.
  • Part of Canonical will be sold off. Considering that Canonical has a number of branches, from its mobile division, the desktop and the cloud, the company might be stretched a little thin to focus on all of them. Plus, Shuttleworth has been bank-rolling this endeavor for awhile now and he may want to cash some of it out. Moving the cloud part of the company to separate entity makes the most sense, but I’m not feeling that this will happen this year.
  • Bonus: a crowdfunding campaign will pass US$200MM. The current record crowdfunding campaign is for the video game Star Citizen, which has passed US$100MM, so Jono is betting that something will come along that is twice as successful. As I’ve started to sour on crowdfunding, as have others I know, it would have to be something pretty spectacular.

Stuart’s Predictions:

  • People will stop carrying cash. Well, duh. It is rare that I have more than a couple of dollars on me at any time. Now, this is different when I travel, but around town I pay for everything with a credit card. I get the one bill every month and I can track my purchases. Heck, even my favorite BBQ joint takes cards now (despite what Google says). Not sure how they will score this one.
  • Microsoft will open source the Microsoft Edge browser. Hrm – Microsoft has been embracing open source more and more lately, so this isn’t out of the realm of possibility. If I were a betting man I’d bet against it, but it could happen.
  • Bonus: he was going to originally bet that Canonical would get out of the phone business, but since Bryan beat him to it he went with smaller phones would outsell larger phones in 2016. It’s going to be hard to measure, but he gets this right if phones 5 inches and smaller move more units than phones bigger than that. I don’t know – I love my Nexus 6 and I think once you get used to a larger phone it is hard to go back, but we’ll see.

The gang seemed pretty much in agreement this year. No one joined me in the prediction that a large “cloud” vendor would have a significant security issue, but both Jono and Jeremy mentioned VR.

The next segment was on a product called the “Coin“. This is a device that is supposed to replace all of the credit cards in your wallet. Intriguing, but it has one serious flaw – it doesn’t work everywhere. If you can’t be sure it will work, then you end up having to carry some spare cards, and that defeats the whole purpose. Coin’s website “onlycoin.com” seems to imply that Coin is the only thing you need, but even they admit there are problems.

It also doesn’t seem to support some of the newer technologies, such as “Chip and PIN” (which isn’t exactly new). This means that Coin is probably dead on arrival. Jeremy brought up a competitor called Plastc, but that product isn’t out yet, so the fact that Coin is shipping gives it an advantage.

I don’t carry that many cards to begin with, so I have little interest in this. I’d rather see NFC pay technologies take off since I usually have my phone with me. I need more help with my “rewards” cards such as for grocery stores, and there are already apps for that, like Stocard. I don’t see either of these things taking off, but I give the edge to Plastc over Coin.

Note: Stocard is pretty awesome. It is dead easy to add cards and they have an Android Wear integration so I don’t even need to take the phone out of my pocket.

The last segment was an interview with Jorge Castro (the guy from Canonical’s Juju project and not the actor from Lost). Juju is an “orchestration” application, and while focused on the Cloud I can’t help but group it with Chef, Puppet and Ansible (a friend of mine who used to work on Juju just moved to Ansible). Chef has “recipes” and Juju has “charms”.

I don’t do this level of system administration (we are leaning toward using Ansible at OpenNMS just ’cause I love Red Hat) thus much of the discussion was lost on me (lost, get it?). I couldn’t help but think of my favorite naming scheme, however, which comes from the now defunct Sorcerer Linux distribution. In it, software packages were called “spells” and you would install applications using the command “cast”. The repository of all the software packages was called the “grimoire”.

Awesome.

The show closed with a reminder that the next BV would be Live Voltage at the SCaLE conference. I’ve seen these guys get wound up in front of 50 people, so I can’t imagine what will happen in front of nearly 1000 people. They have lots of prizes to give away as well, so be there. I can’t make it but I hope there is a live stream and a Twitter feed like the last Live Voltage show so I can at least follow along. I can’t promise it will be good, but I can promise it will be memorable.

So, overall not a great show but not bad. I don’t like the title, and if you listen to the Outro you might agree with me that “Huge Bag Full of Nickels” would have been a better one.

Annual LinuxQuestions Poll

Just a quick note that the annual LinuxQuestions “Member’s Choice” poll is out. While I don’t believe OpenNMS is known to many of the members of that site, if you feel like showing it a little love, please register and vote.

http://www.linuxquestions.org/questions/2015-linuxquestions-org-members-choice-awards-117/network-monitoring-application-of-the-year-4175562720/

Many thanks to Jeremy Garcia for maintaining that site and including OpenNMS.