Problem Recognition in a Failure Mapping (FM) Environment
After being proactive, that is, heading off problems before they occur, the most desirable characteristic for an industrial enterprise is the ability to respond quickly when problems begin to develop. Said another way, the objective is a finely tuned system of reactive response that does not allow problems to go unaddressed for long periods.
There is some value in placing this discussion in the context of how things work in the “real world”.
First, in a conventional (or non-Failure Mapping) environment, problems are frequently recognized in a top-down manner. The problem becomes apparent because of a highly visible downturn in overall performance. For example, production may be below demand, or maintenance demands may be outstripping available resources. In other words, the organization has begun to scream for help. Only after these highly visible signals appear do individuals begin to drill down into the functions, systems, sub-systems and finally components to identify the source of the chronic problems.
Second, most systems designed to recognize and respond to problems are tuned to ignore “onesie” and “twosie” problems. Typically, reliability engineering organizations are not staffed to respond to each and every little problem. More frequently, there are some accepted, but unwritten, rules that the reliability engineering organizations respond only to problems with the following characteristics:
• Problems are chronic with sufficient severity and frequency to justify the expenditure of the resources needed to correct them.
• Problems have a “bona-fide” solution that is certain to provide the Return-on-Investment required to justify the resource investment.
In short, reliability organizations do not go “chasing their tail” every time something fails. As a result, responsiveness to problems is frequently delayed in non-FM environments.
On the other hand, the manner in which data is collected and analyzed in an FM environment provides an opportunity for organizations to respond much more quickly and accurately. In an FM environment, all repairs are closed by identifying the Failure Mode in the form of component and condition. In addition, components are linked back to sub-systems, then to systems, and finally to the specific individual (engineer) who is accountable for overseeing each system or equipment type.
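The record structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any particular maintenance system: the class names, the lookup tables and the sample values ("mechanical seal", "cooling water pumps", "engineer_a") are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FailureMode:
    component: str  # what failed, e.g. "mechanical seal"
    condition: str  # how it failed, e.g. "leaking"


@dataclass(frozen=True)
class ComponentLocation:
    sub_system: str
    system: str


@dataclass(frozen=True)
class RepairRecord:
    closed_on: date
    failure_mode: FailureMode


# Hypothetical lookup tables: component -> sub-system/system, system -> engineer.
COMPONENT_HIERARCHY = {
    "mechanical seal": ComponentLocation("seal assembly", "cooling water pumps"),
}
SYSTEM_OWNER = {
    "cooling water pumps": "engineer_a",
}


def accountable_engineer(record: RepairRecord) -> str:
    """Follow the links component -> system -> accountable engineer."""
    location = COMPONENT_HIERARCHY[record.failure_mode.component]
    return SYSTEM_OWNER[location.system]
```

Because every closed repair carries a component and a condition, any single record can be traced to the person who should hear about it without a manual drill-down.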
With data structured in this manner, it is possible to establish a set of rules that triggers an alert to the accountable individual, but does so only after some “alert level” has been reached. For instance, to avoid applying scarce resources to “onesie” or “twosie” events (random failures), the alert level can be set to issue an alert only if a specific number of failures occur in a specific period of time.
For example, if two of any specific component fail in a single week, it might be appropriate to issue an alert to the accountable individual. Or if three of any specific failure mode occur in any three- or four-week period, that might justify an alert.
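A rule of this kind, "N failures within a given period", might be sketched as follows. The function name `failure_alerts`, the tuple layout of the repair records and the use of a rolling window (rather than a calendar week) are assumptions made for illustration. The sketch groups by failure mode (component plus condition); a component-level rule would group by component alone.

```python
from collections import defaultdict
from datetime import date, timedelta


def failure_alerts(repairs, threshold, window_days):
    """Return the failure modes that reached the alert level.

    repairs: iterable of (closed_on, component, condition) tuples.
    A failure mode is flagged when at least `threshold` of its failures
    fall inside any rolling window of `window_days` days.
    """
    dates_by_mode = defaultdict(list)
    for closed_on, component, condition in repairs:
        dates_by_mode[(component, condition)].append(closed_on)

    window = timedelta(days=window_days)
    alerts = set()
    for mode, dates in dates_by_mode.items():
        dates.sort()
        # Compare each failure with the one (threshold - 1) places later;
        # if both fit in the window, so does everything between them.
        for i in range(len(dates) - threshold + 1):
            if dates[i + threshold - 1] - dates[i] < window:
                alerts.add(mode)
                break
    return alerts


repairs = [
    (date(2024, 3, 4), "bearing", "overheated"),
    (date(2024, 3, 7), "bearing", "overheated"),
    (date(2024, 3, 5), "mechanical seal", "leaking"),
    (date(2024, 4, 20), "mechanical seal", "leaking"),
]
print(failure_alerts(repairs, threshold=2, window_days=7))
```

With these sample records, only the bearing failures cross the two-in-a-week level. The two seal failures are more than six weeks apart, so they are treated as isolated events and raise no alert.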
Clearly, if several of some common component fail in the same manner within a limited period of time, it should be taken as a signal that something is wrong. There might be a manufacturing defect, or the components might have been generally misapplied. In any case, this kind of alert system keeps organizations from seeming unresponsive while also keeping them from spending all their scarce resources in a reactive manner.
Failure Mapping frees organizations from having to depend on highly visible (and embarrassing) signals, like production or resource shortages, to trigger responses. Rather than having to be found by “drilling down”, chronic problems reveal themselves through the content and functionality of the information system.
This article introduces the following subjects:
• Responsive maintenance
• The role of Failure Mapping in a responsive environment
• Top-down or highly visible and embarrassing failure identification
• Pattern recognition
Readers are invited to comment on these or other closely related topics.
- Dan Daley's blog