Audits vs. Leader Standard Work

5S audits, standard work audits, and for that matter ISO-900x audits, are a frequent source of questions in various online discussion forums. At the same time, the topic of “leader standard work” comes up frequently, as it did in a recent question / comment on “Walking the Gemba.”

I think the topic is worth exploring a bit.

Let’s start with audits.

Typically the purpose of an audit is to check compliance with a standard. The auditor has a checklist of some kind that defines various levels of compliance. He evaluates the current situation against the checklist, and produces a score, a report of discrepancies, or a pass/fail evaluation of some kind.

So, for example, a typical 5S audit would assign various criteria in each of the 5 ‘S’ words, and assign a 1-5 scale against each of them. Periodically, the person responsible for 5S will come into the work area, do an audit, and post the score. Often there is a campaign to “get to level 3” or something.

Although there are fewer boilerplate checklists out there, “standard work audits” tend to be pretty similar, at least the ones I have seen.

Further up the scale is something like an ISO 900x audit, or a “Class-A MRP II” audit or a corporate “lean assessment.” These are often done by outside agencies to certify the organization. There is a lot of work up front to pass the audit, a plaque goes on the wall, and everybody is happy.

So what’s the problem? (this is turning into one of my favorite questions)

The key is in the difference between a “check” and a “countermeasure.”

A countermeasure is a change or adjustment to the system itself so that the root cause of a problem, or at least its effect, is eliminated.

Audits, on the other hand, actually change nothing about the underlying system. All they do is assess the current state against some (presumed) standard.

Yet so many organizations try to use “audits” as a means to alter the system.

What an audit is good for (if it is planned and performed well – a big assumption) is to CHECK to see if the other things you are doing are working. But, by itself, it is “management by measurement.” People will do what they must to pass an audit (if it even matters that much to them), then go back to what they were doing before.

Leader Standard Work operates at a much lower level of granularity, and looks for different things. Think of the analogy in a previous post about cost accounting:

When dieting, standard cost accounting would advise you to weigh yourself once a week to see if you’re losing weight. Lean accounting would measure your calorie intake and your exercise and then attempt to adjust them until you achieve the desired outcome.

So, to paraphrase, audits are weighing yourself once a week (or once a quarter!) to see if you are losing weight. Leader standard work, on the other hand, is a process to continuously verify that the calorie intake is as specified, and the exercise is as specified, while those things are being done.

That, in turn, implies that there is a daily plan for calorie intake, and a daily plan for exercise. Without those specifications, there is nothing to check.

Leader standard work defines what the leader will check, when it will be checked, and how it will be checked. It also defines how the leader will respond if there is a problem.

He is looking for solid evidence of control.

Are things going as planned?

Is anything disrupting the work cycles or flow of material?

Are quality checks being made as specified?

And, in my opinion, the most important: Are problems being handled correctly, or worked around?

This is important because a culture of working around problems is one in which problems are routinely hidden, often without malice and with the best of intentions. But hidden problems remain, come up again tomorrow, and become part of the routine, adding a little waste, a little friction, making the system a little worse every day.

The typical effort to “pass an audit” reinforces this – it actually hides problems, and the auditor’s job is to ferret them out. This is the exact opposite of the kind of problem transparency we need.

It is human nature to work around problems, and it is the default behavior, everywhere. It takes constant leader vigilance, coaching, and response to prevent it.
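The four elements of leader standard work described above (what is checked, when, how, and the response to a problem) can be written down as a small data structure. The following is only a minimal sketch; the class name `StandardCheck`, the function `run_checks`, and the example checks are all hypothetical illustrations, not a description of any particular company's leader standard work.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class StandardCheck:
    """One line of a leader's standard work (hypothetical structure)."""
    what: str       # the condition the leader verifies
    when: time      # when during the shift the check is made
    how: str        # how the leader verifies it
    response: str   # what the leader does if the check fails


def run_checks(checks, observations):
    """Return (what, response) for every check that did not pass.

    `observations` maps a check's `what` to True (going as planned)
    or False. A missing observation counts as a failure, on the
    assumption that an unverified condition is not evidence of control.
    """
    return [
        (c.what, c.response)
        for c in sorted(checks, key=lambda c: c.when)
        if not observations.get(c.what, False)
    ]


# Illustrative checks only; a real list comes from the area's own plan.
checks = [
    StandardCheck("quality checks made as specified", time(10, 0),
                  "review the check sheet at the station",
                  "stop and find out what prevented the check"),
    StandardCheck("work cycle on plan", time(8, 30),
                  "compare actual vs. planned output on the board",
                  "go and see what disrupted the cycle"),
]

observations = {"work cycle on plan": True}
abnormal = run_checks(checks, observations)
```

Note that the sketch only works if there is a plan to check against: without the `what` and the expected condition, there is nothing to verify.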

Is This a Problem – Part 2

Last week I posted a story of a failed freezer, ruined food, and a customer support experience that could be summed up as “That’s how we do it.” I invited comments and asked:

“Is this a problem?”

And when I say “problem” I mean, is this a “problem” from the standpoint of the company’s internal process?

There are some interesting comments, some about the internal culture of the company, others about the support process itself.

But I promised to offer my thoughts, so here they are.

The key question is “What did they intend to happen?” While we can speculate, unless we have the process documentation or are otherwise privy to that internal information, we really don’t know what they intended in this case.

Let’s assume, for the sake of argument, that Frank’s experience was exactly as the company intended it to be. Then, from the point of view of their internal process, there is no problem.

“Wait a minute!” I can hear, “Nobody wants a customer to never buy the product again.”

And here is my point. We don’t know. This company may be perfectly willing to accept that consequence, i.e. “fire the customer” to preserve their warranty cost structure. They certainly would not be the first. Whether that is good business or not is a totally separate issue. The question is “Did they produce this result on purpose, as a logical, foreseeable outcome of the process as they designed it?” If the answer is “Yes, they did” (and only they can know), then there is no problem. It might be bad business, but the process is working just fine. (I acknowledge that “bad business practices” can result in unintended results – like bankruptcy. But my point is the results are the outcome of a process, and the process is the result of a decision, even if that decision was to “not care.”)

The key point here is that only after there is clarity about what should happen can the process itself even be addressed. Until the intended result is clear, there is no way to see if the process works or not.

Was there a problem here? I don’t know. But this is what I would like people to take away from this little story.

Whenever something in your company seems “not right” ask this really powerful clarifying question:

“Did (or would) we do this on purpose?”
If the answer is anything other than an unqualified yes then it is likely you have a problem.

Here is a tougher position: If something was unpleasant for your customer, and you don’t intend to fix it, then embrace the truth that you did do it on purpose. Take responsibility for your decisions, look in the mirror, and say “We meant to do it exactly that way, and will do it the same way next time.” If you can’t stomach that, then go back to the first question.

Here is an extra credit question for this little case study in customer support.
What, exactly, did the customer want here?

Systematic Problem Solving

If I were to look at the experience of the organization profiled in the last three posts, “A Systematic Approach to Part Shortages,” I believe their biggest breakthrough was cultural. By applying the “morning market” as a process of managing problems, they began a shift from a reactive organization to a problem solving culture.

I can cite two other data points which suggest that when an organization starts managing problem solving in a systematic way, their performance begins to steadily improve. Even managing problem solving a little bit better results in much more consistent improvement and less backsliding. Of course my personal experience is only anecdotal. That is certainly true by the time you read it here as I try to filter things. But consider this: The key difference with Toyota’s approach that Steven Spear pointed out in “Decoding the DNA of the Toyota Production System” (as well as his PhD dissertation) was that the application of rules of flow triggered problem solving activity whenever there was a gap between expected and actual process or expected and actual outcome.

Does this mean you should go out and implement morning markets everywhere? Again, based on my single company data point, no. That doesn’t work any more than attaching kanban cards to all of your parts and calling it a pull system. It is not about the white boards, it is not even about reviewing the problems every day. Reactive organizations do that too. In most cases in the Big Company that implemented morning markets everywhere, that is what happened – they morphed into another format for the Same Old Stuff.

Here are some of the things my example organization did that I think contributed to success in their cultural shift.

  • They separated “containment” and “countermeasure” as two separate and distinct responses. This was to make sure that they all understood “containment” is what is done immediately to re-start production while preserving safety and quality. “Containment” was not a countermeasure as it rarely (if ever) actually addressed the root cause. It only isolated the effect of the problem as far upstream as possible. The rule of thumb was simple:
    • Containment nearly always adds time, cost, resources, etc.
    • A true countermeasure nearly always removes not only the containment, but reduces time, cost, resources.
  • They didn’t use the meetings to discuss solutions. They only addressed two things:
    • A quick update on the status of ongoing problem solving.
    • A quick overview of new problems from yesterday.

I think this is an important point because too many meetings get bogged down with people talking about problems, and speculating what the causes are. That is completely non-productive.

  • The actual people working on problems attended the meeting. I cannot over-emphasize how important this is. They did not send a single representative. Each person with expected activity reported his or her progress over the last 24 hours. It is difficult to stand in front of a group and say “I didn’t do anything.”
  • They blocked out time to work on problems. I probably should have put this one first. The manufacturing engineers and other professional problem solvers agreed not to schedule anything else for at least two hours every morning. This time was dedicated to working on the shop floor to understand the problem, and physically experiment with solutions. There was a lot of resistance to this. But over a couple of months it became close to the norm. It helped a lot because it started to drive the group to consider where they really spent their time vs. what they needed to get done. There was no doubt in the past that solving these problems was important, but it was never urgent. Nobody was ever asked why they weren’t working on a shop floor issue. That had been a “when I am done with everything else” activity. Gradually the group developed a stronger sense of the shop floor as their customer.
  • They didn’t assign problems until someone was available to work on them. This came a little later, when the problem-solvers were missing deadlines. The practice had been to assign a responsible person in the morning when the problem was first reviewed. Realistically a person can work on one problem at a time, and perhaps work on another when waiting for something. They established a priority list. The priority was set primarily by manufacturing. When a problem-solver became available, the next item on the list was assigned. Once a problem was assigned, nothing would over-ride that assignment except a safety issue or a defect that had actually escaped the plant and reached a customer.
  • They got everyone formal training on problem solving with heavy emphasis on true root cause. People were expected to follow the method.
  • A problem was not cleared from the board until a long term countermeasure had been implemented and verified as working.
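The assignment rule in the list above (keep a priority list, hand out the next problem only when a problem-solver becomes available, and clear a problem only after a verified countermeasure) behaves like a pull system for problems. Here is a minimal sketch of that rule. The `ProblemBoard` class, its method names, and the example problems are hypothetical; it simplifies to exactly one problem per person, and it leaves out the override for safety issues and escaped defects.

```python
import heapq


class ProblemBoard:
    """Hypothetical sketch of a pull-based problem assignment board."""

    def __init__(self):
        self._queue = []     # heap of (priority, sequence, problem)
        self._seq = 0        # tie-breaker: earlier problems come first
        self.assigned = {}   # problem-solver -> problem in progress

    def add(self, problem, priority):
        """Put a problem on the priority list (lower number = more urgent).

        In the story, the priority is set primarily by manufacturing.
        """
        heapq.heappush(self._queue, (priority, self._seq, problem))
        self._seq += 1

    def solver_available(self, solver):
        """Assign the next problem only when someone is free to work on it."""
        if solver in self.assigned:
            raise ValueError(f"{solver} is already working on a problem")
        if not self._queue:
            return None
        _, _, problem = heapq.heappop(self._queue)
        self.assigned[solver] = problem
        return problem

    def clear(self, solver, countermeasure_verified):
        """Clear a problem only after a verified long-term countermeasure."""
        if solver not in self.assigned or not countermeasure_verified:
            return False
        del self.assigned[solver]
        return True


board = ProblemBoard()
board.add("intermittent torque-gun fault", priority=2)
board.add("missing fastener at station 4", priority=1)

first = board.solver_available("engineer_a")
cleared = board.clear("engineer_a", countermeasure_verified=False)
```

In this example `first` is the priority-1 problem, and `cleared` is `False` because containment alone does not take a problem off the board.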

By blocking out time, they were able to establish some kind of expectation for productivity. After that, if problems were accumulating faster than they were being cleared, they knew they had a methods or resource issue. The same was true for their other tasks which were worked on during the rest of the day.

This was the start of establishing a form of standard work for the problem solvers.
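The check described above, whether problems are accumulating faster than they are being cleared, is simple arithmetic on two daily counts. A minimal sketch, with hypothetical function names and made-up numbers purely for illustration:

```python
def backlog_history(opened_per_day, cleared_per_day, starting_backlog=0):
    """Return the open-problem count at the end of each day."""
    if len(opened_per_day) != len(cleared_per_day):
        raise ValueError("need one opened and one cleared count per day")
    backlog = starting_backlog
    history = []
    for opened, cleared in zip(opened_per_day, cleared_per_day):
        backlog += opened - cleared
        history.append(backlog)
    return history


def is_accumulating(opened_per_day, cleared_per_day):
    """True if more problems were opened than cleared over the period."""
    return sum(opened_per_day) > sum(cleared_per_day)


# Made-up counts for one week of morning meetings.
opened = [3, 2, 4, 3, 2]
cleared = [2, 2, 3, 2, 2]

history = backlog_history(opened, cleared)
growing = is_accumulating(opened, cleared)
```

With these numbers the backlog grows from 1 to 3 over the week, which, per the reasoning above, would point to a methods or resource issue rather than to any individual.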

21 Oct 08 – There is more on the subject here and here

What Nukes – a little more clear.

I re-read my “What Nukes?” post and realized I was really rambling. I want to reiterate a key point more clearly because I think it is important.

In the “Bad Apple” theory there is an implied assumption that the cause of an accident or other problem was one person who, at that moment in time, was not following the documented rules or procedures.

Except in the most egregious cases, such as deliberate misconduct, that is likely not the case. Most organizations have a set of “norms” that operate at some level of violation of the written or established procedures. The reasons for this are many, but usually it is because good people are doing the best they can, in the conditions they are given, to get the job done.

Failure to follow the rules does not result in an accident or incident.

Have you ever run a red light or a stop sign? It happens thousands of times every day. It almost never results in an accident. Only when other contributing conditions are ripe will an accident result. Running a stop sign AND a car coming through the intersection.

The same goes for quality checks, and the more reliable an “almost 100%” process becomes, the more vulnerable you are. If a defect is only rarely produced, it is unlikely that any kind of human-based inspection will catch it. The faster the work cycle, the more this is true. The mind numbs, it is impossible to always pay attention to the detail, and the mind sees what it expects. “Failure to pay attention” is never an adequate root cause. It is blaming an unlucky Team Member for an omission that everyone makes every day just going through life. It is just, in this case, “there was a car coming through the intersection.” It is bad luck. It is being blamed for red beads in Deming’s paddle experiment.

So attaching the failure to an individual, while it is easy, avoids the core issue:

People’s failure in critical processes is a SYSTEM PROBLEM. You must investigate from the viewpoint of the person at the pointy end. What did he see? What did he perceive? What did he believe was happening, and why was that belief reasonable given his interpretation of the circumstances at the time?

The post about “sticky visual controls” got to this. Your mistake-alerts or problem signals must penetrate consciousness and demand attention if they do not actually shut down the process.

Standards Protect the Team Members

One of my kaizen-specialists-in-training just came to me asking for help. The Team Members he is working with are not seeing the need to understand sources of work variation.

I hear that a lot, both in companies I have worked in and in the online forums. Everyone seems to think it is a problem in their company, their culture – that they are unique with this problem.

The idea of a unique problem is a variation on the “our process / environment / product is different so ____ won’t work here.” Someday I will make a list of the standard management “reasons why not” but that isn’t the topic of this post.

I told him:

  1. This is not unique to China, or to this facility. The same resistance always comes up, and nearly always comes up the same way once the Team Members begin to realize we are serious.
  2. There is no way to just change people’s minds all at once.

Here is something to explain to the concerned Team Members: The standard process is there to protect the team member. If there is a problem, and the standard process was followed, then the only focus for investigation can be where the process itself broke down. Countermeasures are focused on improving the strength of the process.

If, on the other hand, the process was not followed (or if there is no process), then the team member is vulnerable. Instead of the “Five Why’s” the investigation usually starts with the “Five Who’s” – who did it? Countermeasures focus on the individual who happened to be doing the work when the process failure occurred.

As you introduce the concept of standard work into an area that is not used to it, it is probably futile to try to tighten down everything at once. The good news is that you really don’t have to.

Start with the key things that must be done a certain way to preserve safety and quality. If they are explained well and mistake-proofed well, there is usually little disagreement that these things are important.

The next step is to make it clear that the above are totally mandatory. If anything gets in the way of doing those operations exactly as specified, then STOP. Do not just work around the problem, because doing so makes you (the Team Member) vulnerable to the Five Who’s inquisition.

If you focus here for a while, you will start to get more consistent execution leading to more consistent output, which is what you want anyway.

Then start looking at consistent delivery and all of a sudden the concept of variation in time comes into play. Why was this late? The welder ran out of wire, I had to go get some more, I couldn’t find the guy with the key to the locker…… Go work on that. At each step you must establish that the point of all of this is to build a system that responds to the needs of the people doing the work.