Since we have an expectation that “things just work,” the visibility of incident management can take center stage, and as a result it is often described as a “high-value process.” The real challenge is that we view value in this manner at all. When we take a more objective look at this definition, we see that we should want to avoid incidents at all costs rather than celebrate that we are great at resolving them.
In its simplest description, an incident is something no longer working as it was designed. This characterization alone should tell us that this is the opposite of value add. The trouble is that culturally we “love the hero,” and incident managers can be seen as those who restore service when we need it most.
Because of this need to ensure service is restored as quickly as possible, many of the support people outside of the actual incident become very hands-off in an effort not to slow things down with too many hands working to help. This led me to the statement:
“Just because you are not an incident manager doesn’t mean that you can’t help improve the process.”
Think about that for a moment: everyone has some part to play in reducing the impact an incident has on our business. Here’s just a sample:
Service Desk Analyst
Depending on how our organization is set up, the service desk analysts may not be managing the incidents themselves. However, they are the piece of IT that faces the business, so what they do during an incident is important. While they are likely capturing the escalations, this is a good time to also capture some knowledge about the service that is impacted. We may know that a capability is unavailable, but does IT truly understand the business impact? Gathering further information from the business will allow us as IT to better understand the impact and improve communications. After the incident, details such as these are important in a post-mortem so that, if we need to adjust our responses, we can do so based on the impact the business is reporting.
IT Operations Manager
In many organizations, infrastructure monitoring is done “by operations for operations,” as we just haven’t tied it into service management for one reason or another. It would make sense to correlate these alerts into real-time incidents, so why isn’t this being done? While doing this would allow us to identify issues before the business sees the impact, in reality the alerting mechanism is often set up as an afterthought to the incident process. Understanding and identifying the right thresholds will enable the ops team to determine which alerts are going to improve service delivery. This will not only improve response time on incidents, but this proactive approach will also improve the performance of the service from the business’s perspective.
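To make the idea of tying alert thresholds to incidents a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the Threshold record, the correlate function, the "cpu_percent" metric name, and the sample readings are all hypothetical and do not come from any particular monitoring tool. The rule it assumes is simple: open an incident only once a metric has breached its limit several samples in a row, so a single spike does not page anyone.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    metric: str       # hypothetical metric name, e.g. "cpu_percent"
    limit: float      # a sample above this value counts as a breach
    consecutive: int  # breaches in a row required before opening an incident


def correlate(samples, threshold):
    """Return the sample indexes at which an incident would be opened.

    An incident is opened when `threshold.consecutive` samples in a row
    exceed the limit. Further breaches in the same run do not open
    another incident; the count resets once a sample is back in range.
    """
    opened = []
    run = 0
    for index, value in enumerate(samples):
        if value > threshold.limit:
            run += 1
            if run == threshold.consecutive:
                opened.append(index)
        else:
            run = 0
    return opened


# Illustrative data only: two short spikes, then a sustained breach.
cpu = Threshold(metric="cpu_percent", limit=90.0, consecutive=3)
readings = [50, 95, 96, 40, 91, 92, 93, 97, 60]
incident_points = correlate(readings, cpu)
print(incident_points)
```

In this sketch the first spike never reaches three breaches in a row, so it is ignored; only the sustained breach would become an incident. Which limit and how many consecutive samples make sense is exactly the conversation the ops team should be having with the business.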
IT Application Manager
We have all been involved in an incident that was escalated to networks because we all know that “this must be a networks issue.” One of the many challenges for incidents as they apply to application-level issues is that the symptoms could point to many things. From an application management perspective, having a solid knowledge repository of issues allows the incident manager or even the service desk to ask better questions in the event of an issue. Rather than saying that the users are not able to see module x on the application, they would be able to look up previous issues to see that, when an issue with module x arises, we need to ‘check the following three items’ to better determine a cause for the issue. Remember that knowledge is power. As an application manager, keep in mind that no matter what the issue is, you have an insight into the application that no one else might. Even when the issue is not an application one, getting involved in the incident process will ensure that we have a full 360-degree picture of the environment and make corrections from a big-picture view.
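As a rough illustration of the ‘check the following three items’ idea, a knowledge repository can start out as little more than a lookup from a component name to the checks learned from past incidents. The sketch below is hypothetical throughout: KNOWLEDGE_BASE, suggest_checks, and the three checks listed for module x are placeholders made up for the example, not recommendations for any real application.

```python
# Hypothetical knowledge base: a component name mapped to the checks
# that past incidents taught us to run first. The entries are placeholders.
KNOWLEDGE_BASE = {
    "module x": [
        "Confirm the module x service is running",
        "Check the most recent deployment for module x",
        "Verify the database connection used by module x",
    ],
}


def suggest_checks(report, knowledge_base=KNOWLEDGE_BASE):
    """Return the checks for every known component named in the report."""
    text = report.lower()
    checks = []
    for component, items in knowledge_base.items():
        if component in text:
            checks.extend(items)
    return checks


checks = suggest_checks("Users are not able to see Module X on the application")
for step in checks:
    print("-", step)
```

Even something this simple lets the service desk ask three targeted questions instead of passing along “module x is down.” A report that mentions no known component returns an empty list, which is itself a signal that the repository has a gap to fill after the incident.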
If you were to look into any incident and think of ways that it could have been resolved more quickly, you would find that the wasted time almost always comes down to a lack of communication. Something is broken, and people are already aware of it. There is no point in keeping this a secret. Get everyone involved and communicate as a single unit. Do not assume that others are already aware, or that this is not their issue. Talk as a service delivery organization and those are the results you will deliver.
Everyone plays a part in incident management, big or small, whether by dealing with escalations, managing events, getting involved, or improving communications. Start to think about what you can do, not only to improve your incident management process, but also to improve your overall delivery of services.