Recently I began playing around with free and open source tools that could replace some of the closed source tools I have used to build and manage systems. I specifically wanted to be able to build servers up to a base level so that I could test out tools like Hudson, several different Source Code Management solutions, and different application server platforms. I also run some systems in my home network that I find myself rebuilding often so I can be on the latest and greatest versions of my favorite distros. In the past, this has meant building a new machine, installing and configuring the software, then swapping out the old machine for the new one. That last step is almost always followed by several hours, if not days, of tweaking and trying to remember the settings I learned about the last time that made thing 1 work better with thing 2. I was able to improve efficiency at work with traditional vendor-model (closed source) software, but until recently I had not found the open source alternatives that compelling. Considering that this is my hobby as well as my job, it has always just seemed faster and easier to set my home systems up more or less by hand. Now that is all about to change, and I am ready to be a Puppet Master.
Puppet, as described on the Puppet website, is:
“an open source data center automation and configuration management framework. Puppet provides system administrators with a simplified platform that allows for consistent, transparent, and flexible systems management.”
You may be wondering: why not use Launchpad, FAI or Spacewalk? I was looking for simple tools that do just what I require. Puppet was also a much easier and faster setup. Just reading the documentation for those solutions took me longer than installing and configuring Puppet did. I am sure there are a few features I am missing, like the ability to install an OS instance, but I do not do that too often and was just looking for a tool to manage my systems for this round of changes. I have started playing with Novell’s open source Baracus (http://baracus-project.org/Site/Baracus.html) for doing OS installs and updates. I will leave that discussion for an upcoming article.
Before you download and start playing with Puppet, I strongly suggest setting up a Source Code Management system like Git or Mercurial, assuming you don’t already have one running. I didn’t know which one I wanted to use so I downloaded the SCM machine from the guys at
There is nothing worse than accidentally deleting a configuration line you spent hours searching for to solve a major issue. Except, that is, saving the errant file into your Puppet repository, logging off, and going home for the day expecting it to replicate while you sleep and solve a problem. While you sleep, the file replicates and services restart, putting you back where you began or making things even worse the next morning. If you have the file backed up and/or version controlled, you can get back to where you thought you were when you left for home. The last thing is a tool like Etckeeper. Etckeeper is a great little tool the folks at Puppet turned me on to. I have made several attempts in the past to put my /etc/ directory under version control. If you don’t keep up with it or remember to commit your changes before an update or upgrade to software, you can often lose your most current configurations. Etckeeper has hooks into apt, yum and pacman-g2 that allow it to use your favorite SCM tool, as long as it is not svn or cvs, to check in any changed files in /etc before the packages that contain them are installed or removed. I have only been using it for a couple of weeks and have already tested out its ability to correct my mistakes twice.
I use Ubuntu servers at home and SuSE servers at work. I have chosen Ubuntu at home so I can keep up my ability to flip between Debian based and RPM based systems. One of the things I like most about Debian systems is the apt-get/dpkg package management system. With only a few commands I had Puppet installed on the server and a client ready to receive files. Once they were installed, it took another fifteen to thirty minutes to get them talking and to put my first few files under Puppet management. I have now set up a few RPM based machines using YUM instead of just RPM. It gave me a very similar experience, and the Puppet software just worked. That shows the level of polish and unity the Puppet project brings to the things it considers complete.
Having had such good luck with the initial setup, I decided to try setting up the Puppet Dashboard. It was at this point that things came to a grinding halt. The software installs well enough. It’s figuring out how to configure it that seems to be impossible based on the documentation provided. It’s all written in Ruby, and while I could probably read the code and figure out what I am doing wrong, how many other people are going to do that? In the corporate world, probably no one will attempt this while at their day job. The almost complete lack of documentation on this part of the tool should have been a warning to me. Instead I spent several hours trying to figure out what I had done wrong. (At the time of this writing I still haven’t.) The good news is that this is a relatively new part of the package, and the dashboard not running isn’t a showstopper.
Configuring several additional servers took about three steps per server. There was no rebooting, and only the services with related configurations needed to be restarted. After a few more hours of work I had the files I update most often replicating across the network. I then imported my DNS and DHCP related files and set them up to be managed with Puppet. The great part of this addition is that I can make changes to these files and, once they are updated, Puppet automatically restarts the appropriate services.
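The restart-on-change behavior can be sketched in a Puppet manifest roughly as follows. This is a minimal illustration, not my actual configuration: the class name dns, the module file path, and the bind9 package and service names are assumptions based on how Debian and Ubuntu have typically packaged BIND, so adjust them for your own distro and release.

```puppet
# Hypothetical "dns" class. Names and paths are assumptions
# (Debian/Ubuntu-style BIND packaging), not taken from a real setup.
class dns {
  package { 'bind9':
    ensure => installed,
  }

  # The managed file is copied from the Puppet server. Whenever its
  # content changes on a client, the notify relationship tells the
  # service resource below to restart.
  file { '/etc/bind/named.conf.local':
    ensure  => file,
    owner   => 'root',
    group   => 'bind',
    mode    => '0644',
    source  => 'puppet:///modules/dns/named.conf.local',
    require => Package['bind9'],
    notify  => Service['bind9'],
  }

  service { 'bind9':
    ensure => running,
    enable => true,
  }
}
```

The notify line is what ties the file to the service. The same relationship can be written from the other side by putting subscribe => File['/etc/bind/named.conf.local'] on the service instead.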
Puppet is extremely flexible about how it is configured and where you can put the related files, so I decided that I would place them into a “special” directory. I then wrote scripts to help me remember the SCM commands to check the files in periodically. This now gives me exactly what I have been wanting to do for years. Etckeeper puts copies of my files into the SCM in a directory for each server. Puppet then updates the files, and with each update, etckeeper backs up the current file to the SCM.
Here is a drawing of a basic network similar to what I have in my home network.
The Puppet software lets you design systems with a base configuration and then group the servers that are similar in purpose and apply additional configurations. So in the picture above, the Dark Blue lines represent the base configuration. The Red Arrow between the Puppet Server and Mail Server represents the Mail server specific settings. If you have multiple servers that perform the same basic function, like DNS/DHCP above, you can even use templates to change a file, or a set of files, based on server specific settings. Puppet then fills in the server specific parts of the configuration according to what you tell it and which server the configuration is being applied to. There are no real limits to what you can control and push out with Puppet, as long as it’s in a file and goes in a standard place. This could easily be used to manage things like Web Sites, but it probably isn’t the best solution for replicating your file server’s data, even though it could do that.
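A layout like the one described above might be sketched in a manifest along these lines. Everything named here is hypothetical and chosen only for illustration: the class names, the example.lan host names, the packages, and the dhcpd.conf path (which has moved between distro releases).

```puppet
# Base configuration applied to every server (the dark blue lines).
class base {
  package { 'ntp':
    ensure => installed,
  }

  file { '/etc/motd':
    ensure  => file,
    content => "This server is managed by Puppet.\n",
  }
}

# Settings specific to the mail server (the red arrow).
class mail {
  package { 'postfix':
    ensure => installed,
  }
}

# A templated file: dhcpd.conf.erb is an ERB template stored in the
# (hypothetical) dhcp module, so each server gets its own rendered copy.
class dhcp {
  file { '/etc/dhcp/dhcpd.conf':
    ensure  => file,
    content => template('dhcp/dhcpd.conf.erb'),
  }
}

# Every node includes the base class, then the classes for its role.
node 'mail.example.lan' {
  include base
  include mail
}

node 'dns1.example.lan', 'dns2.example.lan' {
  include base
  include dhcp
}
```

The template is evaluated separately for each client, which is how two DNS/DHCP servers can share one class and still end up with different files.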
As you are setting Puppet up, one of the most interesting steps is that you are required to create certificates between the client servers and the Puppet Master Server. If managed properly, this gives you a relatively secure and reliable way to know that the server you are configuring is meant to be that type of server with that type of configuration. The documentation repeats in several places that while you can “mis-configure” the Puppet Master to accept any computer’s certificate on request, you should not. If you ignore this advice, anyone could request a key and receive any server’s configuration. For example, I could stand up a copy of the company website, get its configuration, and then forge a whole new site.
Another cool feature is that the rollout to servers is staggered to keep them from all trying to pull the same file or files as soon as you update them. To most new Admins this seems like overkill, and if you are on a Gigabit network then it probably is. If, however, you are on a mixed network with servers both local and remote, you may not have the bandwidth for a remote site to pull all of the files associated with changes to every server at that location at once. Once configured, Puppet will also track the progress of the changes.
If you are looking for an easy way to manage the configurations of 5 or 50,000 servers, this tool makes it simple. The configuration of the tool is simple, the template language/format is great, and their future plans are only going to make it all better. All in all, we here at
have to give this a “Go Install.”