Incident, Problem, and Known-Error Relationships

An error in the infrastructure causes a disruption in service. This disruption prompts an end user to call the Service Desk, and an incident record is created. If the incident cannot be resolved, or if it indicates an underlying problem that needs to be addressed, a problem record is opened and associated with the incident. Once technical staff investigate the problem and find the root cause, the status of the problem record is changed to “known error” and a known-error record is created in the known-error database. The problem record and the known-error record should be linked.

Typically, a known error requires that a workaround be provided to the Service Desk so that it can restore service while the error is being resolved. Once the known-error record is created, Problem Management creates a Request for Change (RFC) and forwards it to the Change Manager. Once the change has been implemented and verified, the RFC can be closed, which results in a cascading closure of the associated problem records and any associated incident records that may still be open. Known-error records should remain in place and be removed only when the underlying technology is no longer in use.
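The record relationships and the cascading closure described above can be sketched in code. This is only an illustrative model, not the schema of any particular service management tool: the class names, field names, and the `close` and `record_root_cause` methods are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Status(Enum):
    OPEN = "open"
    KNOWN_ERROR = "known error"
    CLOSED = "closed"


@dataclass
class Incident:
    incident_id: str
    status: Status = Status.OPEN


@dataclass
class KnownError:
    known_error_id: str
    workaround: str
    # Hypothetical flag: set only when the underlying technology is retired.
    retired: bool = False


@dataclass
class Problem:
    problem_id: str
    incidents: List[Incident] = field(default_factory=list)
    known_error: Optional[KnownError] = None
    status: Status = Status.OPEN

    def record_root_cause(self, known_error: KnownError) -> None:
        """Link the known-error record and move the problem to 'known error'."""
        self.known_error = known_error
        self.status = Status.KNOWN_ERROR


@dataclass
class RFC:
    rfc_id: str
    problem: Problem
    status: Status = Status.OPEN

    def close(self, change_verified: bool) -> None:
        """Close the RFC and cascade to the problem and its open incidents."""
        if not change_verified:
            raise ValueError("the change must be implemented and verified first")
        self.status = Status.CLOSED
        self.problem.status = Status.CLOSED
        for incident in self.problem.incidents:
            incident.status = Status.CLOSED
        # The known-error record is deliberately left untouched here.
```

Note that closing the RFC never alters the known-error record; in this sketch it stays linked to the problem so that the workaround remains available for as long as the technology is in use.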

Incident records and problem records are logically and physically different records. This is true even though they may reside in the same application. This is important for a number of reasons.

The goals of Incident Management and Problem Management are different and drive different behavior. Incident records provide a record of the business impact, while problem records provide a record of the IT costs involved in providing stability. As a result, reporting requirements differ between the two. Much of the inadequacy of current reporting techniques stems from the failure of organizations to distinguish between these types of records.

Consequently, an incident can never be said to turn into a problem. What can happen is that an incident causes a problem record to be created, which initiates the Problem Management process. Opening a problem record ensures that a decision will be made about allocating resources and investing downtime to investigate and potentially resolve the root causes of errors in the infrastructure. Incident records should be opened and closed in step with the end user's experience, whereas problem records should be opened to initiate a decision and can be kept open as long as necessary.
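One way to picture the point that an incident raises a problem rather than becoming one is a minimal sketch with two separate record types. Everything here is hypothetical and chosen for illustration (the names `IncidentRecord`, `ProblemRecord`, `raise_problem`, and the identifier formats); a real tool may model this differently.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class IncidentRecord:
    incident_id: str
    summary: str
    is_open: bool = True
    # A link only: the incident record is never converted into a problem.
    problem_id: Optional[str] = None

    def restore_service(self) -> None:
        """Close the incident once the end user's service is restored."""
        self.is_open = False


@dataclass
class ProblemRecord:
    problem_id: str
    summary: str
    incident_ids: List[str] = field(default_factory=list)
    # Stays open until a decision on the root cause has been made and acted on.
    is_open: bool = True


def raise_problem(incident: IncidentRecord, problem_id: str) -> ProblemRecord:
    """Create a new, separate problem record and link it to the incident."""
    problem = ProblemRecord(problem_id, incident.summary, [incident.incident_id])
    incident.problem_id = problem_id
    return problem


# Usage: the incident closes with the user's experience, the problem does not.
incident = IncidentRecord("INC-7", "mail service unavailable")
problem = raise_problem(incident, "PRB-3")
incident.restore_service()
```

After the last line, the incident is closed while the problem record remains open, which mirrors the two independent lifecycles described above.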