Mean Time to Repair (MTTR)

Revision as of 18:28, 19 May 2020

Mean time to repair (MTTR) is a basic measure of the maintainability of repairable items. It represents the average time required to repair a failed component or device. Expressed mathematically, it is the total corrective maintenance time for failures divided by the total number of corrective maintenance actions for failures during a given period of time. It generally does not include lead time for parts that are not readily available or other Administrative or Logistic Downtime (ALDT).

In fault-tolerant design, MTTR is usually considered to also include the time the fault is latent (the time from when the failure occurs until it is detected). If a latent fault goes undetected until an independent failure occurs, the system may not be able to recover.

MTTR is often part of a maintenance contract. If mean time between failures is equal, a system with an MTTR of 24 hours is generally more valuable than one with an MTTR of 7 days, because its Operational Availability is higher. In the context of a maintenance contract, however, it is important to distinguish which interval MTTR measures: the mean time from the point at which the failure is first discovered until the point at which the equipment returns to operation (usually termed "mean time to recovery"), or only the elapsed time from the point at which repairs actually begin until the point at which the equipment returns to operation (usually termed "mean time to repair"). For example, a system with a service contract guaranteeing a mean time to "repair" of 24 hours, but with additional part lead times, administrative delays, and technician transportation delays adding up to a mean of 6 days, would not be any more attractive than another system with a service contract guaranteeing a mean time to "recovery" of 7 days.[1]
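The availability comparison and the contract example can be checked with a few lines of arithmetic. The sketch below assumes the common steady-state approximation in which availability is mean uptime divided by mean uptime plus mean downtime, and it uses a hypothetical MTBF of 1,000 hours for both systems. The function name and all figures other than the 24-hour, 6-day, and 7-day values are illustrative assumptions, not taken from the cited source.

```python
def availability(mtbf_hours: float, mean_downtime_hours: float) -> float:
    """Approximate steady-state availability as uptime / (uptime + downtime)."""
    return mtbf_hours / (mtbf_hours + mean_downtime_hours)


# Hypothetical value, assumed equal for both systems being compared.
MTBF_HOURS = 1000.0

# Same MTBF, different mean downtime per failure: 24 hours versus 7 days.
availability_24h = availability(MTBF_HOURS, 24.0)
availability_7d = availability(MTBF_HOURS, 7 * 24.0)

# Contract A guarantees a mean time to "repair" of 24 hours, but parts,
# administrative, and travel delays add a mean of 6 days before repair starts.
contract_a_downtime_hours = 24.0 + 6 * 24.0

# Contract B guarantees a mean time to "recovery" of 7 days, delays included.
contract_b_downtime_hours = 7 * 24.0

print(f"Availability with 24 h downtime: {availability_24h:.4f}")
print(f"Availability with 7 d downtime:  {availability_7d:.4f}")
print(f"Contract A downtime: {contract_a_downtime_hours} h")
print(f"Contract B downtime: {contract_b_downtime_hours} h")
```

Under these assumptions both contracts yield the same 168 hours of mean downtime per failure, which is why the "repair" guarantee is no more attractive than the "recovery" guarantee.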


The Importance of Mean Time to Repair (MTTR)[2]

Because MTTR ostensibly measures how long business-critical systems are out of service, it’s a powerful predictor of the impact an IT incident will have on the organization’s bottom line. The higher an IT team’s MTTR, the greater the risk that the organization will experience significant downtime when IT incidents occur, potentially leading to business disruptions, customer dissatisfaction and loss of revenue.

Technological failures are inevitable. Understanding MTTR gives organizations an idea of how quickly and efficiently they can expect to respond to these failures and return business operations to normal. On the whole, lower MTTR values are a sign of a healthy computing environment and an effective IT function.


Mean Time to Repair (MTTR) Formula[3]

MTTR is calculated by dividing the total unplanned maintenance time spent on an asset by the total number of failures that asset experienced over a specific period. It is most commonly expressed in hours. The MTTR calculation assumes that:

  • Tasks are performed sequentially
  • Tasks are performed by appropriately trained personnel

MTTR = total unplanned maintenance time / total number of failures

For example, if you have spent 50 hours on unplanned maintenance for an asset that has broken down eight times over the course of a year, the mean time to repair is 50 / 8 = 6.25 hours. What counts as a world-class MTTR depends on several factors, such as the type of asset, its criticality, and its age. However, a good rule of thumb is an MTTR of under five hours.
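The worked example can be reproduced with a small helper. This is a minimal sketch: the function name and the zero-failure check are assumptions added for illustration and do not come from the cited source.

```python
def mean_time_to_repair(total_unplanned_hours: float, failure_count: int) -> float:
    """Return MTTR in hours: unplanned maintenance time divided by failures."""
    if failure_count <= 0:
        # MTTR is undefined when no failures occurred in the period.
        raise ValueError("failure_count must be a positive integer")
    return total_unplanned_hours / failure_count


# 50 hours of unplanned maintenance across eight breakdowns in one year.
mttr_hours = mean_time_to_repair(50, 8)
print(mttr_hours)  # 6.25

# Compare against the five-hour rule of thumb mentioned in the text.
print(mttr_hours < 5)  # False
```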


The limitations of Mean Time to Repair (MTTR)[4]

Mean time to repair is not always the same as the duration of the system outage itself. In some cases, repairs start within minutes of a product failure or system outage. In other cases, there is a lag between when the issue occurs, when it is detected, and when repairs begin.

This metric is most useful for tracking how quickly maintenance staff can repair an issue. It is not meant to identify problems with your system alerts or pre-repair delays, both of which are also important factors when assessing the successes and failures of your incident management programs.
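One way to see why repair time alone hides alerting problems and pre-repair delays is to split each outage into its phases. The sketch below assumes each incident record carries four timestamps. The `Incident` class, its field names, and the sample data are all hypothetical and serve only to illustrate the distinction.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Incident:
    """Hypothetical incident record with one timestamp per phase boundary."""
    failed_at: datetime
    detected_at: datetime
    repair_started_at: datetime
    restored_at: datetime


def mean_hours(deltas: list[timedelta]) -> float:
    """Average a list of durations and express the result in hours."""
    return mean(d.total_seconds() for d in deltas) / 3600


def summarize(incidents: list[Incident]) -> dict[str, float]:
    """Break the mean outage into detection lag, pre-repair delay, and repair."""
    return {
        "detection_lag": mean_hours([i.detected_at - i.failed_at for i in incidents]),
        "pre_repair_delay": mean_hours(
            [i.repair_started_at - i.detected_at for i in incidents]
        ),
        "repair_time": mean_hours(
            [i.restored_at - i.repair_started_at for i in incidents]
        ),
        "total_outage": mean_hours([i.restored_at - i.failed_at for i in incidents]),
    }


# Illustrative sample data; the start time is arbitrary.
start = datetime(2024, 1, 1, 8, 0)
incidents = [
    Incident(start, start + timedelta(minutes=30),
             start + timedelta(hours=2), start + timedelta(hours=5)),
    Incident(start, start + timedelta(minutes=90),
             start + timedelta(hours=4), start + timedelta(hours=9)),
]

summary = summarize(incidents)
for phase, hours in summary.items():
    print(f"{phase}: {hours:.1f} h")
```

With this sample data the mean repair time is 4 hours while the mean outage is 7 hours. The remaining 3 hours come from detection lag and pre-repair delay, which a repair-only MTTR does not capture.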

  1. "What is Mean Time to Repair (MTTR)?" Wikipedia
  2. "Why is MTTR important?" Splunk
  3. "How to calculate MTTR" Fiix
  4. "The limitations of Mean Time to Repair (MTTR)" Atlassian