Automation – Can too much of a good thing be bad?

Senior systems administrators on any platform know that automation is the single fastest way to improve the effectiveness of their team. Scripts provide stability and repeatability, and they reduce the time spent on frequently repeated tasks. If done correctly, automation will make everything more stable and manageable.

However, scripts for managing systems can be a double-edged sword. On one hand, they make a team highly efficient. They can help junior admins perform far above their experience level and free senior admins up to investigate more difficult problems. On the other hand, they can lead to a loss of knowledge. The knowledge it took to create the scripts becomes locked inside of them. So what do you do to strike the proper balance? How can you keep the knowledge fresh in everyone's mind while still automating? What steps can be taken to avoid knowledge erosion and, worse, the brain drain or vacuum that is left when people leave?

The  first thing to remember is that there is no one thing that can be done  to answer these questions.  Here we will provide you with some tips and  ideas we have found to be useful and effective.  This is a short list  and we hope that it will inspire you to think about what might work for  you and your company. 

The first item is well-documented scripts and procedures. Taking 5 minutes to write up what you were thinking when you wrote the script can save you days trying to figure it out later. As more object-oriented scripting languages like Python, Ruby and Perl take hold, it becomes easier to break complex scripts down into smaller, more digestible chunks. These smaller chunks, in keeping with the core ideas behind Linux, should do one thing and do it well. The names of the functions should describe what they do. For instance, a function called createNewSSHKeys should probably create new SSH keys. This, combined with an explanation inside the function of what you were trying to do, will help you and others manage your scripts. When you get really good at this way of thinking, people should be able to take your function calls and write a manual procedure that could replace your automation. If that is your goal, it makes sense to start with a well-documented procedure that you can compare against when you're done scripting. It is unlikely that every procedure step will match a function or series of function calls. Getting everything close does count, though.
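As a minimal sketch of this idea, here is what the createNewSSHKeys example might look like in Python. The function names, the choice of an ed25519 key, and the empty passphrase are all assumptions made for illustration; they are not taken from any real script of ours. Each function does one thing, is named for what it does, and records the manual command it stands in for.

```python
import subprocess
from pathlib import Path


def build_ssh_keygen_command(key_path: Path, comment: str) -> list[str]:
    """Return the ssh-keygen command line that creates a new key pair.

    Manual equivalent (illustrative choices: ed25519, empty passphrase):
        ssh-keygen -t ed25519 -N "" -C <comment> -f <key_path>
    """
    return [
        "ssh-keygen",
        "-t", "ed25519",      # key type
        "-N", "",             # empty passphrase, suitable only for unattended use
        "-C", comment,        # comment stored in the public key
        "-f", str(key_path),  # where the private key is written
    ]


def public_key_path(key_path: Path) -> Path:
    """Return where ssh-keygen writes the public half: <key_path>.pub."""
    return Path(str(key_path) + ".pub")


def create_new_ssh_keys(key_path: Path, comment: str) -> Path:
    """Create a new SSH key pair and return the path to the public key.

    Why: written so that an admin without this script can follow the
    docstrings and perform the same steps by hand.
    """
    key_path.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_ssh_keygen_command(key_path, comment), check=True)
    return public_key_path(key_path)
```

Calling create_new_ssh_keys(Path("keys/deploy_key"), "deploy@example") would run the command and return the path keys/deploy_key.pub. Reading the three docstrings top to bottom gives roughly the manual procedure the script replaces.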

As much as self-documenting scripts help, documenting the configuration files for your scripts can also keep things fresh in people's minds. At the very least, if done correctly, it will give them a breadcrumb trail to follow to see whether what they think is being set is actually set. We recently began testing Puppet, an automated way to manage server configuration files and other admin-related tasks. The configuration files for Puppet are a great example. They allow you to use a combination of intelligent names and comments to inform the person reading the file what will be changed. They also include a description of where to look to verify that the changes are being made correctly. This means that I don't need to know Ruby, the language Puppet is written in, to figure out how it works or what it's going to do. The configuration file itself tells me everything I need to know. When you write your own script, the time it takes to do this may not be warranted. So at the very least, make sure that you have comments that tell people what the configuration settings mean and where to look for the output they produce.
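For a home-grown script, one lightweight way to leave that breadcrumb trail is to keep the explanation next to each setting. The sketch below is Python, not Puppet syntax, and everything in it is hypothetical: the Setting class, the imaginary NTP setup script, the host name and the paths exist only to show the shape of the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Setting:
    value: str
    changes: str      # what the script changes when it applies this setting
    verify_with: str  # where to look to confirm the change took effect


# Hypothetical settings for an imaginary NTP setup script.
SETTINGS = {
    "ntp_server": Setting(
        value="ntp.example.com",
        changes="the server line in /etc/ntp.conf",
        verify_with="grep '^server' /etc/ntp.conf",
    ),
    "log_dir": Setting(
        value="/var/log/ntp-setup",
        changes="the directory where this script writes its run log",
        verify_with="ls -l /var/log/ntp-setup",
    ),
}


def describe_settings(settings: dict[str, Setting]) -> list[str]:
    """Return one human-readable line per setting, sorted by name."""
    return [
        f"{name} = {s.value}: changes {s.changes}; verify with: {s.verify_with}"
        for name, s in sorted(settings.items())
    ]
```

Printing the result of describe_settings(SETTINGS) gives anyone on the team a plain-language summary of what will be set and where to check it, without reading the rest of the script.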

Try to keep everyone's skills sharp so they are ready to slice through problems as they arise. This means internal training. One of the things we have participated in on a regular basis is a short one-hour refresher put on by the subject matter experts (SMEs) for each of the technologies we use. Doing this accomplishes a few different things at once. It helps the SME keep their documentation current. It gives the SME an opportunity to share changes they want to make or have made in the environment. And it gives everyone supporting the environment a chance to ask questions about the technology when there is no pressure. When possible, annual reviews of each area that a team supports go a long way toward elevating the team's ability to be as productive as possible.

While you can never completely prevent brain drain when a team member leaves, the steps above, if done correctly, can go a long way. Having been on the receiving end of such a transition more than once, we have found that the better these steps are followed, the better we feel about taking on the responsibility. Another side effect of these approaches, and others in the same spirit, is that they allow people to migrate from one SME area to another. This helps people stay fresh and keeps them from becoming bored and complacent. The more driven your team is to solve business problems, the more profitable you will be.