Zenoss: how the big four should do monitoring…

The biggest benefit to both open source producers and consumers is the community.  The Zenoss community is its greatest strength, as we learned in podcast number 21 back on 3/13/2010.  The tool is being used by some large corporate customers right alongside an army of small businesses.  If you do not have a staff experienced with Zenoss, or a large enough staff to properly roll it out, Zenoss has the ability to support any size company, for a fair fee of course.  Unlike Groundwork, which is based on Nagios, Zenoss is a completely distinct product.  Zenoss operates as a blended company: it delivers an open source, free-to-use Core product and offers additional support features through its Enterprise version for an additional fee.

So how was Zenoss to use?  Well, if you actually read the documentation and watch the videos, the tool is straightforward, relatively easy to use and quick to get up and running.  After the normal initial learning curve with the UI, you can start to really get to the meat and value of what the product has to offer.  Your mileage may vary, but I started to get the hang of it after watching the videos and spending about fifteen or twenty unfocused hours on it.  As has been my experience with most software, the more you know about this type of software, the easier it will be for you to get up to speed.

Let me state this again: watching the videos helped immensely, so at least start there if you do not want to read the manuals either before or while you are getting this set up.  The UI for Zenoss was the hardest item for me to learn.  While chatting with Matt and Mark from Zenoss, they assured us they understood it was a problem and that we should expect big changes in this area within the next few releases.  Once you have decoded how to work in the app, it really starts to make sense.  I could start to see the logic in what started out as chaos.

Once you have enough data to work with (I started with just a few days' worth), the tool starts to get interesting.  Creating custom reports and alerts is so easy that I could easily see people ending up with report overload.  For reports, you tell the tool the server or group of servers you want to report on.  Then you tell it which of the available metrics you want to report on and how to lay out the report.  The tool is all Ajax/web GUI based, it works smoothly, and it really is just that easy.

One of the neatest features in Zenoss is the way they handle alerting.  You have the option as a user to set up your own alerts.  Alerts can also be set up for groups, as in most systems.  Why is that something neat?  I have been in very few IT shops where the team members I worked with didn’t each have their own pet systems or applications.  Allowing each of them to set up the extra alerting they want, on a one-by-one basis, is one of the many signs that experienced operational engineers built this system.  There are other little things that support personnel will pick up on that just make you stop and say “WOW, someone really thought of that feature.”  It is these little differences that, as individual items, do not seem like a lot, but as a collective you will quickly learn to love about this tool.

The next big thing with Zenoss is what they call ZenPaks.  ZenPaks are groups of scripts and small applications that add functionality, like a plug-in in Firefox.  This is where the strength of the community really comes in.  I am running an ESXi server at home on a Core i7 machine I built.  While I love the server, VMware has intentionally encumbered several of the features that normal ESX has.  One of those is in the area of monitoring.  VMware intentionally built the system with no SNMP-based agent built in.  With most systems, this means you are just out of luck for checking anything other than whether the machine is up and you have connectivity to it.  With Zenoss, there is very likely a ZenPak for that.  If a ZenPak does not already exist, there is a group of people in the community who love challenges and are eager to help you create a ZenPak for that.  This level of support is really helping the Zenoss team and community to set themselves apart.

So what didn’t I like about the product?  The UI takes serious effort to master.  The tutorials and hours of videos are a tremendous help while the Zenoss team works to make it more intuitive.  The other issue is the limited support for using SSH.  It is another area we were assured is being addressed, but it took me considerable effort to figure out the first time I tried.  By contrast, SNMP-based discovery worked perfectly, assuming that all of your machines are using the same read and write community strings or user name and password.  The last minor issue is that several of the services I have running on my test machines were either misidentified, causing a failure after discovery, or missing completely.  This is easy to fix for small environments of fewer than 50 servers, and it won’t take you a long time to correct.  Another feature I missed that would help is the import feature as a way to add systems to your installation.

Once you have this tool up and running, you really do start forgiving the pain it put you through to get there.  Creating reports and using the event correlation features start to pay off quickly.  The ZenPaks will help you keep things monitored without having to write something custom.  All in all, this is definitely a solid, scalable and flexible system for monitoring.  I suggest that you download the VM and give it a try.