The goal of everyone in IT should always be to deliver quality solutions that meet or exceed a given set of business needs. The key word in that statement isn't "deliver", it's "quality". Most management is concerned with delivery and with being able to put a "check mark" next to the job to show that it is complete. But if delivery is your main focus, you are missing one of the fundamental rules of development, which is quality. If you're a dot-com, then delivering early and often, with plenty of bugs, may work for you. Think back to the early days of Twitter, when with every refresh you saw the Fail Whale. Most businesses will not succeed for long if this is management's approach to development. If you work for a "normal" enterprise and your systems' stability were to match Twitter's in those early days, where do you think that would leave you? Most likely, you would start having conversations with management about your future with the enterprise. A normal business is not going to get too many "Fail Whale" opportunities before customers start looking elsewhere. Even the founders of Twitter did not stand for it for too long. (After some research, I came across an article that explained how Twitter evolved.) So how do you mitigate the risk of succumbing to a "Fail Whale" culture? This can be accomplished through the implementation of Change Management.
You may be wondering what Change Management is and what it will do for your company. For the purposes of this discussion, it is simply a documented process meant to enforce a set of standards for systems going into a production environment. These systems can be anything from the latest OS fix pack to a brand-new technology implementation like IP telephony. Anything that can have an impact on a business and its IT systems should flow through this process on its way to production. The process will normally have several mandatory rules. For instance, the system being implemented must attain a level of testing sufficient for the category of the change. It should also have optional rules that are situational, like performance testing for new implementations and for major changes to existing systems. Most importantly, it must have an escalation path that can short-cut the rules in cases where an emergency fix is needed to restore a system. All in all, it should exist to help you manage the chaos of day-to-day life in the IT world, and to build the confidence that both your management and your users have in IT.
The purpose of change management is not to stop changes from happening, but to make sure they are done as safely and effectively as possible. The tricky thing about Change Management is measuring the process's success. Every company will have slightly different ideas about what a successful change means. No matter how successful you are at putting changes in, by any definition, you will never be perfect. Almost every release will have some number of defects that everyone hopes are minor or affect only a very small group of outlying users. While change management cannot prevent these issues, it can and should use testing to identify bugs and drive their mitigation to the point where they are deemed manageable. When an unmanageable issue arises, or a team is found to be insufficiently prepared, the change should be delayed until the issue in question can be addressed.
The most successful change management process is one in which a small number of changes are stopped, backed out, or delayed for sound reasons each year. While you may think this is counterintuitive, it is healthy. If no change is ever stopped, you are actually weakening the process as a whole. It also reinforces the fear that being stopped will be perceived as a failure, even though it should not be. If you continually rubber-stamp changes without stopping any from going into production, then the perception of the process changes over time from one of respect to one of annoyance. Over time, a rubber-stamping process will make the people who use it complacent. I have even seen a failing project blame the Change Management process for not stopping the team from making the mistake. If you rubber-stamp for too long, the IT group may actually be better off without the process. Explaining that you don't have a process is far easier than having to admit that the one you have does not work. We are not suggesting that anyone try to remove an in-place process, as even the worst change management processes can be corrected.
The best processes that we have worked with have kept it simple. The basics just work in this situation. If you have nothing, then start off with five basic questions that everyone must answer satisfactorily to get a change into production.
- Have you tested the change as thoroughly as is feasible? Also, where is the test plan outlining what you did in the test phase?
- Have you communicated the upcoming change to all of the interested parties affected by it, normally through e-mail or meetings?
- What defects still exist and how will you be mitigating them?
- Do you have a back-out plan, and have you tested it to the best of your abilities?
- How many teams/people will be affected?
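As a rough illustration, the five questions above could be captured as a simple gate that lists what still blocks a change. This is only a sketch under stated assumptions: the `ChangeRequest` record, its field names, and the `blocking_issues` function are hypothetical and not part of any particular change management tool.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ChangeRequest:
    """Hypothetical record of the answers to the five starter questions."""
    title: str
    test_plan: str = ""                  # where the test plan lives; empty means none
    stakeholders_notified: bool = False  # e-mail or meetings have gone out
    known_defects: Dict[str, str] = field(default_factory=dict)  # defect -> mitigation
    backout_plan_tested: bool = False
    teams_affected: Optional[int] = None  # None means nobody has counted yet


def blocking_issues(change: ChangeRequest) -> List[str]:
    """Return the reasons this change is not ready; an empty list means it can proceed."""
    issues = []
    if not change.test_plan.strip():
        issues.append("no test plan provided")
    if not change.stakeholders_notified:
        issues.append("interested parties have not been notified")
    for defect, mitigation in change.known_defects.items():
        if not mitigation.strip():
            issues.append(f"no mitigation for known defect: {defect}")
    if not change.backout_plan_tested:
        issues.append("back-out plan missing or untested")
    if change.teams_affected is None:
        issues.append("number of affected teams/people is unknown")
    return issues


# Example data is invented for illustration only.
ready = ChangeRequest(
    title="OS fix pack",
    test_plan="wiki/fixpack-test-plan",
    stakeholders_notified=True,
    known_defects={"slow login after reboot": "restart the auth service"},
    backout_plan_tested=True,
    teams_affected=3,
)
unready = ChangeRequest(title="IP telephony rollout")

for change in (ready, unready):
    problems = blocking_issues(change)
    print(change.title, "->", "approved" if not problems else problems)
```

Returning the list of reasons, rather than a bare yes or no, gives the review board something factual to point to when a change is delayed.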
Be sure to document the process for any contractors or consultants that you may have since they, most of all, should be following it. From here, you will need to refine and enhance your process. Add questions as new situations come up and remove old ones that no longer apply. Then, you will need to create some kind of review board to provide the checks and balances. In small companies, this may be one or two people, but in larger corporations it is generally suggested to have at least one representative from each area of IT. For example, the manager of the system administrators, the manager of the DBAs, and the CTO or Director of Operations could make up the change review board. No matter how you organize this group, remember that you need a heavy. This is someone who is not worried about having friends after the meeting. In IT there is generally at least one person interested in the position, so look for them and take them up on it. This person must act on facts and not emotion.
So now that you have defined a process, how do you convince people to work through and follow it? The simple way out is to make it a mandate from upper management. The best way is to base people's bonuses on how well they have followed the process and on maintaining a high success rate.
How do you convince Change Managers and Change Advisory Board members to stop the changes they think pose a risk or that come from an unprepared team? Make sure everyone understands what a mistake costs the company. This includes both Business and IT resources. Add a total cost section to your outage or incident reporting process. Determine agreed-upon ways of accounting for costs between the Business and IT. Do you know what an hour of downtime costs you for each of your major systems? If you do, are you looking at the soft costs, like Sally in HR waiting for the PeopleSoft HR system to come back up so she can input everyone's annual bonus? What about the lost productivity of the IT employees who had to work on the issue? Don't forget about the doubling effect of lost opportunity cost for the time that people were unable to use the system or had to support it. If IT spends 30 FTE (full-time equivalent) hours fixing a problem, that time is not being spent on the other things they were going to do, like patching the system or fixing a bug in their code. As the son of a marketing executive, I can tell you that marketing people understand the costs of doing business better than the majority of the IT people I have worked with. Making money is the glue that holds every organization together, even the non-profit ones. Hurt the money stream or the image of the company with website issues and the costs will add up quickly. Twitter had the benefit of Ashton Kutcher and Oprah to recover from the PR issues caused by its downtime. Chances are most companies won't get that lucky.
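To make the "total cost" idea concrete, here is a minimal sketch of how an incident report might add up hard and soft costs. Everything in it is an assumption for illustration: the `outage_cost` function, its parameters, and all of the dollar figures are invented, and the "doubling effect" is modeled simply by counting the IT repair hours a second time as displaced planned work. Only the 30 FTE hours and the four-hour outage come from this article.

```python
def outage_cost(hours_down, revenue_per_hour, idle_staff, idle_hourly_rate,
                it_fte_hours, it_hourly_rate):
    """Rough total cost of one outage, broken out by category (hypothetical model)."""
    lost_revenue = hours_down * revenue_per_hour
    # Soft cost: business users who sit idle while the system is down.
    idle_business_time = hours_down * idle_staff * idle_hourly_rate
    # Labor spent restoring service.
    it_repair_labor = it_fte_hours * it_hourly_rate
    # Assumed "doubling effect": the same hours were taken away from planned
    # work (patching, bug fixes), so they are counted once more.
    it_opportunity_cost = it_fte_hours * it_hourly_rate
    breakdown = {
        "lost_revenue": lost_revenue,
        "idle_business_time": idle_business_time,
        "it_repair_labor": it_repair_labor,
        "it_opportunity_cost": it_opportunity_cost,
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown


# Invented example figures: a four-hour outage that took 30 FTE hours to fix.
report = outage_cost(
    hours_down=4,
    revenue_per_hour=25_000,
    idle_staff=50,
    idle_hourly_rate=40,
    it_fte_hours=30,
    it_hourly_rate=60,
)
for category, dollars in report.items():
    print(f"{category:>22}: ${dollars:,}")
```

The exact categories and rates matter less than having the Business and IT agree on them ahead of time, so that the number in the incident report is not disputed after the fact.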
Let me say this again: stopping changes is healthy, good, and normal when the change isn't ready or hasn't been tested properly. No company can afford to shoot itself in the foot with changes of any size. The loss of confidence from a major outage caused by a change is far worse than a delay of the shiny new feature. There is always a fear that you will be marked a failure if you miss a date or have to back out a change. Put in a change that causes a four-hour outage and costs hundreds of thousands of dollars, and you are guaranteed to be marked a failure. Each time a change misses a date or is delayed because of a reasonable concern, you start to reduce the angst around being stopped. While it may irritate the business folks pushing the change, ask them whether they can afford the loss of confidence that a failed change will cost them. Over time, you will free the group evaluating changes to say "No" without feeling like they have destroyed someone's career. When that happens, accountability and long-term cost savings become the focus for the whole IT organization.
So when you implement Change Management and stop bad changes from going to production, you can do nothing but win. Twitter got away with badly managed changes because it had major hype and even larger celebrity promoters. Chances are you and your company won't get that kind of help. So do the right thing and protect your company by standing strong against less-than-ideal or, worse yet, buggy, untested changes to your production environments.
We have told you where we stand on Change Management. What does your company do?
- Does your company have a Change Management process?
- If so, are you happy with the Change Management processes you have?
- Do you have a success story for your Change Management process?
- How do you promote your change process to people in IT? What about to Business folks?
- What's the worst failure you have seen that should have been stopped and wasn't because of fear?