Although incident management processes share similarities from company to company, many factors determine whether a given process is effective and mature. We created this incident management process website to promote best practices that help you build a process that works for your team and company.
An incident is an event that is not part of the standard operation of a service and that causes an interruption to, or a reduction in, the quality of that service. The goal of an established incident management process is to return the service to normal functionality quickly while minimizing the impact on the business. The process fundamentals include making sure your team creates incident tickets for all issues, assigns ticket priorities, escalates to the appropriate resolver groups as needed, and follows up with the customer before closing the Help Desk incident ticket. As a new Help Desk manager, you must audit the incident management process to ensure that incident priority is set correctly, ticket classification categories are functional, and escalated ticket queues are being managed appropriately.
Objectives and Purpose of an Incident Management Process
It is important to state the objective and purpose of your incident management procedure. An example of a purpose statement is “Incident management is the process to handle all incidents involving IT Personnel in a consistent, timely, professional, and cost-effective manner.” Example objectives for your incident management procedure include:
- To resolve an incident as quickly and efficiently as possible
- To ensure client satisfaction with the quality of support
- To provide a consistent and repeatable process for incidents
- To ensure the process is beneficial for the Information Technology department, while minimizing the bureaucratic impact on the customer and support communities
- To supply accurate and timely information pertaining to incidents
- To use common processes and tools for customer support that provide:
  - Usability and responsiveness to enable quick call entry
  - Measurements to understand workload
  - Continuous review and improvement of the current tools and processes
  - Links into other defined and approved processes
Incident Management Process Best Practices
Establish a good foundation by documenting the foundational incident management rules for your department. Below are some general incident management process best practices.
Incident Owner – Identifying the incident ticket owner is important to ensure that all activities occur in a timely manner. These activities include monitoring, tracking, and communicating status updates to both customers and Help Desk staff. All communication with the customer will be documented in the incident ticket. Typically the Help Desk Agent will be identified as the Incident Owner for all incident tickets they create.
Incident Tickets – All contacts and interactions with the customer must be documented in an incident ticket. If it is not documented, then it did not happen.
Incident Priority – The incident priority or severity should be set by using an incident priority matrix. Prioritizing incident tickets ensures that the highest priority incidents receive the most immediate attention, so normal operations are restored as quickly as possible in priority order.
New Incidents – If the customer is contacting the Help Desk about a new issue, the Help Desk Agent will create a new incident ticket and will fill out all appropriate ticket fields.
Existing Incidents – If the customer is contacting the Help Desk about an existing issue, the Help Desk Agent will search for existing tickets and will provide the user with a status update. The incident ticket must be updated with a summary of the interaction.
Escalation Queue Management – If the Help Desk is unable to resolve an incident, the Help Desk Agent will assign the incident ticket to the appropriate Escalation Queue for the escalated work team. The manager of the escalation queue to which the incident has been assigned will ensure the appropriate resources are monitoring the queue for newly assigned incident tickets. A member of the escalation queue will acknowledge the incident ticket and identify themselves as the assignee of the incident ticket. If the issue or customer information required to perform the resolution activities is missing, or if the ticket was assigned to the wrong escalation group, the assignee or Escalation Queue manager will assign the ticket back to the Help Desk queue with a documented reason for the reassignment.
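As a rough illustration of the escalation hand-off described above, the following Python sketch models a queue member either accepting a ticket or returning it to the Help Desk with a documented reason. The `Ticket` fields, the `REQUIRED_FIELDS` tuple, and the queue names are hypothetical and are not taken from any particular ticketing tool.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical fields a resolver group needs before it can start work.
REQUIRED_FIELDS = ("issue", "customer_contact")


@dataclass
class Ticket:
    issue: str = ""
    customer_contact: str = ""
    queue: str = "help_desk"
    assignee: Optional[str] = None
    notes: List[str] = field(default_factory=list)


def escalate(ticket: Ticket, queue: str) -> None:
    """Help Desk Agent assigns the ticket to an escalation queue."""
    ticket.queue = queue
    ticket.assignee = None


def acknowledge(ticket: Ticket, member: str, wrong_group: bool = False) -> bool:
    """Accept the ticket, or send it back to the Help Desk with a reason."""
    missing = [name for name in REQUIRED_FIELDS if not getattr(ticket, name)]
    if wrong_group or missing:
        if wrong_group:
            reason = f"Returned by {member}: assigned to the wrong escalation group"
        else:
            reason = f"Returned by {member}: missing {', '.join(missing)}"
        ticket.notes.append(reason)  # the reassignment reason is always documented
        ticket.queue = "help_desk"
        ticket.assignee = None
        return False
    ticket.assignee = member
    return True
```

Returning a boolean keeps the example small; a real tool would more likely record the reassignment as a ticket history event.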
Incident resolution – The incident ticket should be resolved when the service has been restored to standard operation, which may be a permanent fix or a temporary workaround. Incidents should not be moved to a status of “resolved” until service has been restored. Check out our post on MTTR occurrence improvement areas to focus on.
Incident closure – Incidents should not be moved to a status of “closed” until the incident resolution has been confirmed with the customer. The Help Desk should have an incident closure process for cases where the Help Desk Agent is unable to make contact with the customer after multiple attempts. We recommend that Help Desk staff attempt to contact the customer three (3) times by two (2) different methods (phone and e-mail) over a period of at least five (5) business days before moving the incident to a “closed” status.
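The recommended closure rule for unreachable customers can be sketched as a small check. This is one possible reading of the rule, under two assumptions: the five business days are counted from the first contact attempt to the day closure is considered, and holidays are ignored. The function names and the `(date, method)` tuple format are hypothetical.

```python
from datetime import date, timedelta


def business_days_between(start: date, end: date) -> int:
    """Count weekdays after `start` up to and including `end` (holidays ignored)."""
    days = 0
    current = start
    while current < end:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            days += 1
    return days


def may_close_unconfirmed(attempts, today: date) -> bool:
    """attempts is a list of (date, method) tuples, e.g. (date(2024, 3, 4), "phone")."""
    if len(attempts) < 3:
        return False  # fewer than three contact attempts
    if len({method for _, method in attempts}) < 2:
        return False  # fewer than two different contact methods
    first_attempt = min(day for day, _ in attempts)
    return business_days_between(first_attempt, today) >= 5


attempts = [
    (date(2024, 3, 4), "phone"),
    (date(2024, 3, 6), "e-mail"),
    (date(2024, 3, 8), "phone"),
]
print(may_close_unconfirmed(attempts, today=date(2024, 3, 11)))
```

If your organization observes holidays, the weekday test would need a holiday calendar as well.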
Incident reopen – An incident in a “closed” status should never be reopened. If the incident was not resolved, a new incident ticket should be opened and related to the previous incident.
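The resolution, closure, and reopen rules together describe a simple ticket lifecycle. The sketch below is a minimal illustration of that lifecycle, assuming three status names (“open”, “resolved”, “closed”) and a hypothetical `Incident` class. It only models closure after customer confirmation.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Optional

_next_id = count(1)


@dataclass
class Incident:
    summary: str
    status: str = "open"
    related_to: Optional[int] = None
    id: int = field(default_factory=lambda: next(_next_id))


def resolve(incident: Incident, service_restored: bool) -> None:
    """Move to 'resolved' only once service is restored (fix or workaround)."""
    if incident.status != "open":
        raise ValueError(f"cannot resolve an incident in status {incident.status!r}")
    if not service_restored:
        raise ValueError("service must be restored before resolving")
    incident.status = "resolved"


def close(incident: Incident, customer_confirmed: bool) -> None:
    """Move to 'closed' only after the customer confirms the resolution."""
    if incident.status != "resolved":
        raise ValueError("only resolved incidents can be closed")
    if not customer_confirmed:
        raise ValueError("resolution must be confirmed with the customer")
    incident.status = "closed"


def open_follow_up(closed: Incident, summary: str) -> Incident:
    """Closed incidents are never reopened; create a linked incident instead."""
    if closed.status != "closed":
        raise ValueError("follow-ups are only for closed incidents")
    return Incident(summary=summary, related_to=closed.id)
```

Because no function ever moves a ticket out of “closed”, the no-reopen rule is enforced by construction rather than by convention.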
Root cause – All priority 1 or critical incidents should have a problem management investigation ticket opened for a root cause analysis.
Because of Help Desk resource constraints, not all incidents can be worked on simultaneously. Incident tickets need to be prioritized based on impact and urgency. Impact is the potential financial, brand, or security damage the incident causes to the business before it can be resolved. Urgency is how quickly incident resolution is required.
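An incident priority matrix can be expressed as a lookup from impact and urgency to a priority. The sketch below assumes a three-level scale and a particular assignment of cells. Both are illustrative choices, so substitute the levels and cell values your organization has agreed on.

```python
from enum import IntEnum


class Level(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical 3x3 matrix: (impact, urgency) -> priority label.
PRIORITY_MATRIX = {
    (Level.HIGH, Level.HIGH): "critical",
    (Level.HIGH, Level.MEDIUM): "high",
    (Level.MEDIUM, Level.HIGH): "high",
    (Level.HIGH, Level.LOW): "medium",
    (Level.MEDIUM, Level.MEDIUM): "medium",
    (Level.LOW, Level.HIGH): "medium",
    (Level.MEDIUM, Level.LOW): "low",
    (Level.LOW, Level.MEDIUM): "low",
    (Level.LOW, Level.LOW): "low",
}

# Most urgent first; used to order the work queue.
PRIORITY_ORDER = ["critical", "high", "medium", "low"]


def incident_priority(impact: Level, urgency: Level) -> str:
    """Look up the priority for a given impact and urgency."""
    return PRIORITY_MATRIX[(impact, urgency)]


def work_order(tickets):
    """Sort tickets so the highest priority comes first; ties keep arrival order."""
    return sorted(
        tickets,
        key=lambda t: PRIORITY_ORDER.index(incident_priority(t["impact"], t["urgency"])),
    )
```

Keeping the matrix as data rather than nested conditionals makes it easy to audit whether priorities are being set the way the documented matrix says they should be.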
Major Incident Management
A major incident is an incident that demands a response and a level of resource engagement well beyond the routine incident management process. A major incident management procedure should therefore be designed to coordinate the response and accelerate recovery, returning the IT service to a normal state as quickly as possible. Typically, a major incident is assigned a critical priority based on an incident priority matrix of impact and urgency, although some major incidents may be assigned a high priority. Learn how to improve stability and availability in your environment by reducing the frequency and duration of Major Incidents at your company. Read Major Incident Management Best Practices.
Incident Management Communication Plan
A proactive Help Desk team will have an Incident Management Communication Plan in place to follow when a service outage occurs. It is important to develop a well-thought-out plan in advance of an outage, detailing how people will be initially notified, what information they need, when status updates will be communicated, and what resolution steps occur when a service has been restored.
Customers of a service, technical service support staff, and service owners rely on the Incident Management team for the latest status of a service outage and recovery. Incident Management communication is typically handled in a coordinated effort via email, text messages, voicemail, web portal messages, and phone bridges. Incident Management communication reduces call volume to the Help Desk, allows the business to adjust its work activities, facilitates greater collaboration to resolve the incident, and keeps the leadership team informed of the status.
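One way to make a communication plan concrete is to write it down as data and derive the status update schedule from it. The sketch below is only an illustration: the `CommunicationPlan` class, the channel choices, and the 30-minute update interval are all assumptions, not recommendations from this guide.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Tuple


@dataclass(frozen=True)
class CommunicationPlan:
    audiences: Tuple[str, ...]
    initial_channels: Tuple[str, ...]  # how people are first notified
    update_channels: Tuple[str, ...]   # where status updates are posted
    update_interval: timedelta         # how often status updates go out


# Hypothetical plan; the interval and channel choices are illustrative.
OUTAGE_PLAN = CommunicationPlan(
    audiences=("customers", "support staff", "service owners", "leadership"),
    initial_channels=("email", "text message", "web portal"),
    update_channels=("email", "web portal"),
    update_interval=timedelta(minutes=30),
)


def update_schedule(
    plan: CommunicationPlan, outage_start: datetime, restored_at: datetime
) -> List[datetime]:
    """Times at which status updates were due between outage start and restoration."""
    times = []
    due = outage_start + plan.update_interval
    while due < restored_at:
        times.append(due)
        due += plan.update_interval
    return times
```

Comparing the schedule this produces with the updates actually sent during an outage is a simple way to check whether the plan was followed.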
Post Incident Review
A post-incident review (PIR) is an evaluation of the incident response for major, critical, and high priority incidents. It is typically initiated once the incident has been resolved and reviews the incident from start to finish. The goal is to determine whether the incident could have been handled better. Consistently performing post-incident reviews is a great way to continuously improve the incident handling process.
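The follow-up rules in this guide (a root cause investigation for critical incidents, and a post-incident review for major, critical, and high priority incidents) can be summarized in a short helper. This is a minimal sketch, and the priority labels and the `required_follow_ups` name are assumptions made for illustration.

```python
from typing import List


def required_follow_ups(priority: str, is_major: bool = False) -> List[str]:
    """Return the follow-up activities for a resolved incident.

    The priority labels ("critical", "high", "medium", "low") are assumed names.
    """
    follow_ups = []
    if priority == "critical":
        # Critical (priority 1) incidents get a problem management ticket.
        follow_ups.append("problem investigation (root cause analysis)")
    if is_major or priority in ("critical", "high"):
        follow_ups.append("post-incident review")
    return follow_ups


print(required_follow_ups("high"))
```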