Jack Hughes over at Tech Teapot posted a story about how nice it is to have RAID on your server when a drive fails, and this got me thinking about our own servers.
For the record, I’m a security and backup nut. The Dell server that hosts opennms.org and this blog has six drives. Five are configured as RAID-5 and the remaining drive is a hot standby. There is even a spare drive sitting on top of the server so that someone can quickly replace a drive if one fails. Even though I figure the chances of two drives failing in quick succession are pretty slim, why take chances? (grin)
Also, “RAID is not backup”, so we rsync that server every night as well.
But Jack’s article made me realize that I had not taken the time to actually monitor the status of our RAID. Since OpenNMS is our monitoring tool of choice, I decided to write up how to go about it and I placed “Monitoring a Dell PowerEdge Expandable RAID Controller 3/Di” on the wiki.
Now I have the “afaRAID” service up and running and I’ll get notified if anything goes wrong. For as much sleep as I’ve lost over the last six years worrying about OpenNMS, this step will at least let me sleep a little easier.