Recurring Failures Need the Right Model

Repeat failure rule: If the same failure keeps returning, stop running the same root-cause meeting. Match the model to the problem.

5 Whys

Best for

Single-event failures with a clear starting point.

1. Start with what failed.

2. Ask why until the system cause shows up.

3. Stop before it becomes a blame exercise.

4. Write the corrective action before the meeting ends.

Watch for "operator error" as a lazy finish line.

Fishbone

Best for

Failures with several possible causes across people, process, equipment, and environment.

1. Map the suspects wider than the broken part.

2. Separate causes from guesses.

3. Confirm the branch with evidence.

4. Assign one owner to test the leading cause.

Watch for a pretty diagram with no data behind it.

FMEA

Use when

You need to prevent critical failures before they happen.

1. List failure modes.

2. Score severity, frequency, detectability.

3. Act on the highest-risk items first.

4. Turn the top risks into PM changes.

Rule: Don't run FMEA on equipment that cannot hurt you.
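
The FMEA scoring and ranking steps can be sketched in a few lines of Python. This is a minimal illustration, not a standard implementation. It assumes each factor is scored from 1 to 10 and that the three scores are multiplied into a single risk number (often called a risk priority number, or RPN), which is a common convention but not something the steps above prescribe. The asset names and scores are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    asset: str
    mode: str
    severity: int       # assumed scale: 1 (negligible) to 10 (catastrophic)
    frequency: int      # assumed scale: 1 (rare) to 10 (constant)
    detectability: int  # assumed scale: 1 (always caught) to 10 (never caught)

    @property
    def rpn(self) -> int:
        # Assumption: risk is the product of the three scores.
        return self.severity * self.frequency * self.detectability


def rank_by_risk(modes):
    """Return failure modes sorted from highest to lowest risk number."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Hypothetical failure modes for illustration only.
modes = [
    FailureMode("Pump P-101", "seal leak", 7, 5, 4),
    FailureMode("Conveyor C-3", "belt mistracking", 4, 8, 2),
    FailureMode("Boiler B-1", "low-water cutoff fails", 10, 2, 8),
]

for m in rank_by_risk(modes):
    print(f"{m.rpn:4d}  {m.asset}: {m.mode}")
```

In this made-up data the boiler failure ranks first even though it is the rarest, because its severity and poor detectability dominate the product. That is the kind of item the top of the list should surface.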

Pareto

Use when

Everything feels urgent and the team keeps chasing noise.

1. Pull repeat work orders.

2. Rank by frequency or cost.

3. Attack the few assets causing most pain.

4. Review the top offenders every month.

Rule: Dirty CMMS codes create confident bad decisions.
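
The Pareto steps can be sketched as a short ranking routine. This is an illustrative sketch under stated assumptions: work orders are plain dictionaries with an `asset` key (a hypothetical export shape, not any particular CMMS format), and the cut-off for "the few assets" is an 80% cumulative share, which is a conventional choice rather than a rule from the text. Passing a different `weight` function ranks by cost instead of count.

```python
from collections import Counter


def pareto(work_orders, weight=lambda wo: 1, threshold=0.8):
    """Rank assets by total weight and return the top offenders.

    Returns (asset, total, cumulative_share) rows, highest first, stopping
    once the cumulative share reaches `threshold`.
    """
    totals = Counter()
    for wo in work_orders:
        totals[wo["asset"]] += weight(wo)

    grand_total = sum(totals.values())
    if grand_total == 0:
        return []

    rows, running = [], 0
    for asset, total in totals.most_common():
        running += total
        rows.append((asset, total, running / grand_total))
        if running / grand_total >= threshold:
            break
    return rows


# Hypothetical repeat work orders for illustration only.
work_orders = (
    [{"asset": "Filler F-2"}] * 6
    + [{"asset": "Capper K-1"}] * 2
    + [{"asset": "Labeler L-4"}, {"asset": "Palletizer PZ-1"}]
)

for asset, count, share in pareto(work_orders):
    print(f"{asset}: {count} work orders, cumulative {share:.0%}")
```

To rank by cost, each work order would need a cost field (for example a hypothetical `cost` key) and the call would become `pareto(work_orders, weight=lambda wo: wo["cost"])`. Either way, the output is only as trustworthy as the asset codes on the work orders.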

Fault Tree

Use when

The failure needs multiple conditions to line up.

1. Start with the failure event.

2. Map AND / OR cause paths.

3. Find the condition stack that creates failure.

4. Remove one condition so the stack breaks.

Rule: This is a team analysis, not a form for one tech.
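
The AND / OR logic in the fault tree steps can be made concrete with a small sketch. It assumes a tree built from nested tuples, where a string is a basic condition and a gate is `("AND", children)` or `("OR", children)`. The helper names and the motor example are hypothetical. The code only answers one narrow question: which single condition, if removed, stops the top event even when everything else is present.

```python
def AND(*children):
    return ("AND", children)


def OR(*children):
    return ("OR", children)


def occurs(node, present):
    """True if the event at `node` occurs given the set of present conditions."""
    if isinstance(node, str):
        return node in present
    gate, children = node
    results = [occurs(child, present) for child in children]
    return all(results) if gate == "AND" else any(results)


def basic_conditions(node):
    """Collect every basic condition named anywhere in the tree."""
    if isinstance(node, str):
        return {node}
    _, children = node
    return set().union(*(basic_conditions(child) for child in children))


def single_breakers(tree):
    """Conditions whose removal alone prevents the top event."""
    everything = basic_conditions(tree)
    return sorted(c for c in everything if not occurs(tree, everything - {c}))


# Hypothetical top event: motor burnout.
tree = AND(
    "high ambient temperature",
    OR("blocked cooling fan", "clogged air filter"),
    "thermal overload relay fails",
)

print(single_breakers(tree))
```

In this example, clearing the fan alone does not break the stack because the clogged filter sits on a parallel OR path. Fixing the overload relay or controlling ambient temperature does. Deciding which conditions belong in the tree is still the team's job; the code only checks the logic once the tree is drawn.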

RCM

Use when

You need the right strategy for how an asset actually fails.

1. Name the function.

2. Trace failure, effect, and consequence.

3. Choose PM, monitoring, redesign, or run-to-failure.

4. Document why that strategy fits the asset.

Rule: Start with critical assets. Full RCM is heavy.