There Are (almost) No Big Problems

In a “management by metrics” world, problems are detected when performance indicators are off track. Perhaps inventory is too high, or first pass quality is a problem. Maybe operational availability is tanking.

Once the problems are abstracted into numbers, the numbers become the problem. The solution, then, is usually a directive to reverse the trend, to improve this measurement or that measurement, to get things back on track.

Here is the rub: The numbers aren’t the problem. They don’t even tell you what the problem is without a whole lot of investigation, digging, and stratification. And why was this investigation and stratification necessary in the first place? Because now you are trying to sort out a batch of problems that weren’t dealt with, one-by-one, at the moment they occurred.

The further you go up in the organization, the more things are aggregated and summarized. And the more they are aggregated and summarized, the more things look like big problems, even though they are composed of many (sometimes hundreds of) little problems.

Let me cite an example. I was in a factory where the site manager was proudly showing off his real-time display of Overall Equipment Effectiveness (OEE). Each time the equipment slowed, for any reason, the display would immediately change. He could see, in an instant, how the day’s OEE compared with his target, and, he said, “take action.”

But exactly what action is he going to take?
Even the real-time display lags the actual problem.
Two weeks ago a lubrication point was missed. Today something is overheating, and we need to slow or stop the machine to deal with it.

A bracket securing a roller is loose. Today the machine is stopping all the time as they clear jams.

A hole in a screen let chips and debris clog a filter, and the coolant pumps are overheating.

This list can go on. The point is that the things that actually slow down or stop the machinery are second, third, fourth levels in chains of events. The actual problem had no immediate effect on the machinery.

In this system, the very best they will ever do is fix it fast and hold the number at some level. They will never be able to actually improve it, because they aren’t dealing with the underlying systemic issue: They aren’t finding and dealing with the actual root causes.

When you get to financial measures, things are even more abstract.

Inventory levels (or inversely, inventory turns) are a great example. “We have too much inventory!” “Our inventory levels are above the industry norm!”

In the boardroom, or in the Chief of Operations’ office, this is all they can see. Because most leaders in that position really like direct action, they.. um.. “direct” that some “action” be taken.

So the organization goes to war against inventory.

Two levels down, “We have too much inventory” is still characterized as the problem because that is what they have to report every month.

Two more levels down it is often still attacking the symptom – too much inventory.

But inventory isn’t the problem.

The problem is an engineering change that missed one part on the bill of material update, delaying production for a day while all of the other parts continued to roll in.

The problem is a paint system that is unreliable, so the factory obsesses on maintaining four hours of buffer on either side of it.

The problem is pressure to keep the line moving, so people work ahead, and push incomplete or problem units into the “hospital” for later repair.

The problem is that the upstream processes have to be ready to respond to massive fluctuations in assembly’s demand.

The problem is a broken jig, local initiative to make something else instead, resulting in too much of what nobody needs and none of what they do – because they don’t want to waste time doing nothing.

An additional problem is that the typical organizational response to these problems is to add more inventory, because each local unit needs to protect itself from the dysfunction of its suppliers and customers.

Worse, just “taking out inventory” is going to tank everything else because the underlying problems are still there.

Inventory isn’t the problem.

The problem is that each of these small problems is too small to bother addressing as a systematic breakdown that leads to “too much inventory.” While the high-level leaders are looking for a big problem, they are missing their real responsibility.

It isn’t to “do something about inventory.”
It is to ensure that the small problems that occur every day, the ones which cause all of that inventory to accumulate in the first place, actually get addressed. Actually no. Their responsibility is to ensure there are systems in place which catch those problems and address them. They do this by teaching, setting example, then checking.

Leaders – if you want to “do something about inventory” you have to do it by setting an expectation that:

  • Processes are well defined.
  • Those processes are followed.
  • When there is a problem following the process, it is raised immediately.
  • When a problem is raised, it is swarmed, fixed, understood and prevented.

You handle big problems by making sure the small ones are dealt with, every day, all day.

Then your OEE will go up quickly. OEE is an indicator of how well you respond to problems, not how well your machines run.

One more thing – for my Health Care readers.
You can substitute “medication errors” for any of this. But until there is a system in place that alerts anytime someone spots the possibility of confusion, medication errors will occur at pretty much the same rate they always have.

Attack on Ambiguity

When real effort is spent getting to the cause of problems (vs. a reflex to find someone to blame), ambiguity often enters into the picture.

Problem solving is a process of asking questions and clarification.

Is a “defect-free” outcome of the process specified? Does the Team Member know what “success” is?
Is there a way for the Team Member to actually verify the result?
Does that check give the Team Member a clear Yes/No; Met/Not Met; Pass/Fail response? Or is there interpretation and judgment involved?

If there is good specification for “defect-free” – is there a specification for how to achieve it? Do you know what must be done to assure the result that you want? Does the Team Member know? Is the Team Member guided through the process? Are there verifications (poka-yoke, etc) at critical points?

If all of the above is in place, do you know what conditions must be in place for success? What is the minimum pressure for your air tools? Is there a pressure gage? Does it cut off the tool if the pressure is too low? Is there some visual check that all of the required parts, pieces, tools are there before work starts? Think about the things you assume are there when the Team Member gets started.

For ALL OF THE ABOVE, if something isn’t right, is there a clear, unambiguous way to alert the Team Member immediately?
Is there a clear way for the Team Member to alert the chain-of-support that something isn’t right?

If there is a defined result, a defined process to achieve it, and all of the conditions required for success are present –
Is the Team Member alerted if he deviates from the specified process, or if a critical intermediate result is not right?
More poka-yoke.

Now you have the very basics for consistent execution.

Is the process carried out as you expect?
Is the result what you expected?

If not, then the process may be clear, but it clearly does not work. Stop, investigate. Fix it.

All of this is about getting more and more specific about what is SUPPOSED to happen and comparing it to what is REALLY happening… continuously. I say continuously because “continuous improvement” does not happen unless there is continuous checking and continuous correction and problem solving. If you want “continuous improvement” you cannot rely on special “events” to get it. It has to be embedded into the work that is done every day.

Ask “Why?” – but How?

Get to the root cause by “Asking Why?” five times.
We have all heard it, read it. Our senseis have pounded it into us. It is a cliché, obviously, since getting to the root cause of a problem is (most of the time) a touch more complicated than just repeatedly asking “Why?”

Isn’t it?

Maybe not. Maybe it is a matter of skill.

Some people are really good at it. They seem to instinctively get to the core issue, and they are usually right. Others take the “Problem Solving” class and still seem to struggle. So what is it that the “naturals” do unconsciously?

Let me introduce another piece of data here. “Problem Solving” is taught to be an application of the scientific method. The scientific method, in turn, is hypothesis testing. How does that relate to “ask why? five times”?

Each iteration of asking “why?” is an iteration of hypothesis testing.

How do you “ask why?”

Observe and gather information.
Formulate possible hypotheses.
For each reasonable possibility, determine what information would confirm or refute it. (Devise experiments, which really means “Decide what questions to ask next and figure out a way to answer them.”)
Observe, gather information, experiment. Get answers to those questions.
Confirm or refute possible causes.

At each level, a confirmed cause is the result of “observe and gather information” so the process iterates back to the top.

Eventually, though, a point is reached where going further either obviously makes no sense, or is of no additional help. If you are now looking at something you can fix, you are at the “root cause” for the purpose of the exercise. Yes, you can probably keep going, but part of this is knowing how far is far enough.

Is this something I can fix easily?
Does it make sense to go further?

Otherwise, iterate again.
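
The loop above can be sketched in a few lines of Python. Everything here is hypothetical and only illustrates the iteration: the `Hypothesis` class, the `ask_why` function, and the canned experiment results are my own stand-ins. In real life each "experiment" is a trip to the shop floor, not a lambda. The example borrows the clogged coolant filter from earlier.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Hypothesis:
    """One candidate cause plus the experiment that confirms or refutes it."""
    cause: str
    experiment: Callable[[], bool]  # True means "confirmed" (assumed stand-in)
    fixable: bool = False           # "Is this something I can fix easily?"
    deeper: list["Hypothesis"] = field(default_factory=list)


def ask_why(candidates: list[Hypothesis], max_depth: int = 5) -> list[str]:
    """Follow confirmed causes downward and return the confirmed chain."""
    chain: list[str] = []
    for _ in range(max_depth):
        # Test each reasonable possibility; take the first one confirmed.
        confirmed = next((h for h in candidates if h.experiment()), None)
        if confirmed is None:
            break  # nothing confirmed: go observe more, form new hypotheses
        chain.append(confirmed.cause)
        if confirmed.fixable or not confirmed.deeper:
            break  # far enough for the purpose of the exercise
        candidates = confirmed.deeper  # otherwise, iterate again
    return chain


# Canned results standing in for real observation on the floor.
hole = Hypothesis("hole in the screen", lambda: True, fixable=True)
clogged = Hypothesis("filter clogged with chips", lambda: True, deeper=[hole])
low_coolant = Hypothesis("coolant level low", lambda: False)

chain = ask_why([low_coolant, clogged])
print(chain)
```

Note that the refuted hypothesis ("coolant level low") never shows up in the final chain, which is why the finished chain looks so much tidier than the investigation that produced it.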

Now: Devise a countermeasure.

A countermeasure, itself, is a hypothesis. You are saying “If I take this action, I should get this result” i.e. the problem goes away.

Put the countermeasure into place. Does it work? That is yet another experiment, only now you are (hopefully) confirming or refuting your fix on every production cycle. The andon will tell you if you are right.

We tell people “Ask why five times” but we really don’t teach them how to “ask why.”

The book examples usually show this neat chain of cause/effect/cause/effect, but the real world isn’t that tidy. When the problem is first being investigated, each level often has many possibilities. Once the chain is built then the chain can be used as a check.

But that isn’t how you GET there.

Why don’t the books do a good job teaching this?

“They” say that critical thinking is difficult to teach. I disagree. If the people who do it unconsciously can step back and become consciously competent, and know how they do it, then it breaks down into a skill, and a skill can be taught.

A Real World Example
My computer works, but its network connection to the outside world doesn’t.
OK. What could be wrong?
It could be software in the computer.
It could be a problem with the hardware.

Look at where the cable plugs into the back of the computer. Are the little lights flashing? No? Then there is no data going through that connection.

How could that be?

Well.. it could be a problem in the computer or operating system.
It could be a problem with the hardware in the computer.
It could be a bad cable.
It could be a problem behind the network jack on the wall.

The QUICKEST thing to do is unplug the cable and plug my co-worker’s cable into the computer. (Please make sure he isn’t busy with email before you do this!). Do the lights come on? Yes? Does your network stuff work now? Yes? Then it isn’t anything in the computer. You have just done a hypothesis test – conducted an experiment.

Take his KNOWN GOOD cable out of the wall and plug it into your jack. Does your network work now? Yes? You have a bad cable. No? It is a problem behind the jack.. call I.T. and tell them. (Unless you are at home. Then head to the little blue box in the basement and start looking at flashing lights down there. But same process.. as you systematically eliminate internal causes, you are left with an external one.)
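
Written out as code, the two cable swaps are just two yes/no experiments that eliminate possibilities. This is a minimal sketch; the function name and return strings are mine, and the "no" branch of the first test (suspect the computer) is my assumption, since the walk-through above only spells out the "yes" case.

```python
def diagnose(works_on_coworkers_connection: bool,
             works_with_good_cable_in_my_jack: bool) -> str:
    """Narrow down a dead network connection by systematic elimination.

    The second answer only matters if the first experiment passed.
    """
    # Experiment 1: plug the co-worker's working cable into my computer.
    if not works_on_coworkers_connection:
        return "computer (software or hardware)"  # assumed "no" branch
    # Experiment 2: known good cable, my wall jack.
    if works_with_good_cable_in_my_jack:
        return "my cable"
    return "behind my wall jack"


print(diagnose(True, True))    # the computer and the jack are cleared
print(diagnose(True, False))   # the computer and the cable are cleared
```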

This is a natural flow, but most people wouldn’t describe it as “asking why?” or “hypothesis testing” – and the big words scare them off.

Still – when you (the lean guru) are teaching others, it is important for them to understand that HOW TO ASK WHY is just a process of learning by systematic elimination of the impossible. (Whatever remains, however unlikely, must be the truth – Arthur Conan Doyle through Sherlock Holmes)

Systematic Problem Solving

If I were to look at the experience of the organization profiled in the last three posts, “A Systematic Approach to Part Shortages,” I believe their biggest breakthrough was cultural. By applying the “morning market” as a process of managing problems, they began a shift from a reactive organization to a problem solving culture.

I can cite two other data points which suggest that when an organization starts managing problem solving in a systematic way, their performance begins to steadily improve. Even managing problem solving a little bit better results in much more consistent improvement and less backsliding. Of course my personal experience is only anecdotal. That is certainly true by the time you read it here as I try to filter things. But consider this: The key difference with Toyota’s approach that Steven Spear pointed out in “Decoding the DNA of the Toyota Production System” (as well as his PhD dissertation) was that the application of rules of flow triggered problem solving activity whenever there was a gap between expected and actual process or expected and actual outcome.

Does this mean you should go out and implement morning markets everywhere? Again, based on my single company data point, no. That doesn’t work any more than attaching kanban cards to all of your parts and calling it a pull system. It is not about the white boards, it is not even about reviewing the problems every day. Reactive organizations do that too. In most cases in the Big Company that implemented morning markets everywhere, that is what happened – they morphed into another format for the Same Old Stuff.

Here are some of the things my example organization did that I think contributed to success in their cultural shift.

  • They separated “containment” and “countermeasure” as two separate and distinct responses. This was to make sure that they all understood “containment” is what is done immediately to re-start production while preserving safety and quality. “Containment” was not a countermeasure as it rarely (if ever) actually addressed the root cause. It only isolated the effect of the problem as far upstream as possible. The rule of thumb was simple:
    • Containment nearly always adds time, cost, resources, etc.
    • A true countermeasure nearly always removes not only the containment, but reduces time, cost, resources.
  • They didn’t use the meetings to discuss solutions. They only addressed two things:
    • A quick update on the status of ongoing problem solving.
    • A quick overview of new problems from yesterday.

I think this is an important point because too many meetings get bogged down with people talking about problems, and speculating what the causes are. That is completely non-productive.

  • The actual people working on problems attended the meeting. I cannot over-emphasize how important this is. They did not send a single representative. Each person with expected activity reported his or her progress over the last 24 hours. It is difficult to stand in front of a group and say “I didn’t do anything.”
  • They blocked out time to work on problems. I probably should have put this one first. The manufacturing engineers and other professional problem solvers agreed not to schedule anything else for at least two hours every morning. This time was dedicated to working on the shop floor to understand the problem, and physically experiment with solutions. There was a lot of resistance to this. But over a couple of months it became close to the norm. It helped a lot because it started to drive the group to consider where they really spent their time vs. what they needed to get done. There was no doubt in the past that solving these problems was important, but it was never urgent. Nobody was ever asked why they weren’t working on a shop floor issue. That had been a “when I am done with everything else” activity. Gradually the group developed a stronger sense of the shop floor as their customer.
  • They didn’t assign problems until someone was available to work on them. This came a little later, when the problem-solvers were missing deadlines. The practice had been to assign a responsible person in the morning when the problem was first reviewed. Realistically a person can work on one problem at a time, and perhaps work on another when waiting for something. They established a priority list. The priority was set primarily by manufacturing. When a problem-solver became available, the next item on the list was assigned. Once a problem was assigned, nothing would over-ride that assignment except a safety issue or a defect that had actually escaped the plant and reached a customer.
  • They got everyone formal training on problem solving with heavy emphasis on true root cause. People were expected to follow the method.
  • A problem was not cleared from the board until a long term countermeasure had been implemented and verified as working.
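
The assignment rules in that list amount to a pull system for problems: nothing is assigned until a solver is free, and nothing leaves the board until the countermeasure is verified. Here is a minimal sketch of that idea. The `ProblemBoard` class, its method names, and the numeric priorities are all hypothetical, and the safety and customer-escape overrides are deliberately left out.

```python
import heapq


class ProblemBoard:
    """Backlog in priority order; one open problem per solver."""

    def __init__(self):
        self._backlog = []   # heap of (priority, arrival order, description)
        self._order = 0
        self.assigned = {}   # solver name -> problem description

    def raise_problem(self, description, priority):
        # Lower number = more urgent (an assumption of this sketch).
        heapq.heappush(self._backlog, (priority, self._order, description))
        self._order += 1

    def solver_available(self, solver):
        """Pull the next problem only if this solver has nothing open."""
        if solver in self.assigned or not self._backlog:
            return None
        _, _, description = heapq.heappop(self._backlog)
        self.assigned[solver] = description
        return description

    def clear(self, solver, countermeasure_verified):
        """A problem stays on the board until the fix is verified."""
        if not countermeasure_verified:
            return False
        self.assigned.pop(solver, None)
        return True


board = ProblemBoard()
board.raise_problem("kanban card lost", priority=2)
board.raise_problem("engineering change missed one part", priority=1)

first = board.solver_available("Ana")    # pulls the most urgent item
second = board.solver_available("Ana")   # None: she already has one open
```

The useful side effect is the one described below: the backlog length becomes a visible signal. If it keeps growing while every solver is busy, that points to a methods or resource issue rather than to individuals.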

By blocking out time, they were able to establish some kind of expectation for productivity. After that, if problems were accumulating faster than they were being cleared, they knew they had a methods or resource issue. The same was true for their other tasks which were worked on during the rest of the day.

This was the start of establishing a form of standard work for the problem solvers.

21 Oct 08 – There is more on the subject here and here

A Systematic Approach to Part Shortages – Part 3

The third element of this organization’s successful drive to eliminate part shortages was a systematic approach to problem solving. They made it a process, managed just like any other process, rather than something people did when they had time. Even though this is “Part 3” of this series, in reality they put this into place at the same time, and actually a little ahead, of kanban and leveling.

The Morning Market

The idea of the “morning market” came from a chapter in Imai’s book “Gemba Kaizen.” He describes a process where the previous day’s defects are physically set out on a table and reviewed first thing in the morning – “while they are fresh” hence the analogy to the morning markets.

This organization had been trying to practice the concept of a morning market for a few weeks, and was beginning to get it into an actual process. Because supplier problems constituted a major cause of disruption, they set up a separate morning market for defective purchased parts.

That process branched yet again into a morning market for part shortages. And this evolved into a bit of a mental breakthrough.

They started looking at process defects.

Every shortage, every day, was recorded on the board.

Each morning the previous day’s shortages were reviewed. They were grouped into three categories based on knowledge of the cause – just as outlined in the book.

  • “A” problems – they knew the cause, knew the countermeasure, but had some reason (or excuse) why it could not be implemented right away.
  • “B” problems – they knew the cause, but did not have a good countermeasure yet.
  • “C” problems – knew the symptom (parts weren’t there) but didn’t know why.
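
The three categories reduce to two yes/no questions, which is part of why the review can go quickly. A small sketch follows; the function name and the sample shortages are invented for illustration and are not data from this organization.

```python
from collections import Counter


def classify(cause_known: bool, countermeasure_known: bool) -> str:
    """Morning market category based on what is known about the problem."""
    if cause_known and countermeasure_known:
        return "A"   # known cause, known countermeasure, not yet implemented
    if cause_known:
        return "B"   # known cause, no good countermeasure yet
    return "C"       # only the symptom is known


# Hypothetical shortages: (description, cause known?, countermeasure known?)
yesterday = [
    ("bracket short at station 3", True, True),
    ("fasteners short at station 7", True, False),
    ("gasket short at station 2", False, False),
    ("label short at packing", False, False),
]

tally = Counter(classify(cause, fix) for _, cause, fix in yesterday)
print(tally)
```

A board dominated by "C" items says the investigation work has not been done yet, whatever the shortage count looks like.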

The mental breakthrough was systematically investigating the reason each and every shortage occurred. What they found was that in the vast majority of cases it was an internal process breakdown, rather than some problem at the supplier, that caused the shortage. This was a bit of a revelation.

They began systematically fixing their processes, one problem at a time.

Over time things got better. Simultaneously they were implementing the kanban system. Kanban comes with its own set of possible problems, like cards getting lost. Once again, when they found problems, those went into the morning market and were systematically addressed.

A few months into their kanban implementation, for example, they started turning in card audits with far less than 2% irregularities, and then it was not unusual for a card audit to find no problems at all. Why? They had addressed the reasons why cards end up somewhere other than where they should be. Instead of blaming people, they looked for why people acting in good faith would not follow the process.

This was also an attitude shift – assume a flaw in the process itself, or in communication, before looking for “who did it.”

Eventually the warehouse team had their own morning market. As did the receiving team. As did the parts picking team. As did assembly. Each looked at any case where they were not able to deliver exactly what their downstream customer needed.

About 8 months into this, another group in an adjacent building was trying to work through their own issues. They came over for a tour. One of the supervisors, visibly shaken, came to me and said:

“Now I get it. These people work together in a fundamentally different way.”

And they did. They worked as a team, focusing on the problems, not on each other.

And that, readers, is the goal of “lean manufacturing.” If you aren’t working toward that, then you aren’t really implementing anything.

The Seventh Flow

Those of you who are familiar with Shingijutsu’s materials and teaching (or at least familiar with Nakao-san’s version of things) have heard of “The Seven Flows.” As a brief overview for everyone else, the original version, and my interpretations are:

  1. The flow of people.
  2. The flow of information.
  3. The flow of raw materials (incoming materials).
  4. The flow of sub-assemblies (work-in-process).
  5. The flow of finished goods (outgoing materials).
  6. The flow of machines.
  7. The flow of engineering. (The subject of this post.)

A common explanation of “the flow of engineering” is “the footprints of the engineer on the shop floor.” I suppose that is nice-sounding at a philosophical level, but it doesn’t do anything for me because I still didn’t get what it looks like (unless we make engineers walk through wet paint before going to the work area).

Common interpretations are to point to all of the great gadgets, gizmos and devices that it does take an engineer (or at least someone with an engineer’s mindset, if not the formal training) to design and produce.

I think that misses the point.

All of those gizmos and gadgets should be there as countermeasures to real, actual problems that have either been encountered or were anticipated and prevented. But that is not a “flow.” It is a result.

My “put” here is that “The Flow Of Engineering” is better expressed as “The Flow of Problem Solving.”

When a problem is encountered in the work flow, what is the process to:

  • Detect that there even is a problem. (“A deviation from the standard”)
  • Stop trying to continue to blindly execute the same process as though there was no problem.
  • Fix or correct the problem to restore (at a minimum) safety and protect downstream from any quality issues.
  • Determine why it happened in the first place, and apply an effective countermeasure against the root cause.

If you do not see plain, clear, and convincing evidence that this is happening as you walk through or observe your work areas, then frankly, it probably isn’t happening.

Other evidence that it isn’t happening:

At the cultural and human-interaction level:

  • Leaders saying things like “Don’t just bring me the problem, bring a solution!” or belittling people for bringing up “small problems” instead of just handling them.
  • People who bring up problems being branded as “complainers.”
  • A system where any line stop results in overtime.
  • No simple, on/off signal to call for assistance. No immediate response.
    • If initially getting help requires knowing who to phone, and making a long explanation before anyone else shows up, that ain’t it.
  • “Escalation” as something the customer (or customer process) does when the supplying organization doesn’t respond. Escalation must be automatic and based on elapsed-time-without-resolution.
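
To show what "automatic and based on elapsed-time-without-resolution" can look like, here is a minimal sketch. The roles and the 5 and 15 minute thresholds are made-up numbers for illustration only; the point is that the clock, not the person who raised the problem, decides who gets pulled in.

```python
from datetime import datetime, timedelta

# Hypothetical ladder: who must be on the problem after how long unresolved.
ESCALATION_LADDER = [
    (timedelta(minutes=0), "team leader"),
    (timedelta(minutes=5), "supervisor"),
    (timedelta(minutes=15), "area manager"),
]


def who_should_be_there(raised_at, now, ladder=ESCALATION_LADDER):
    """Return every role whose time threshold has passed without resolution."""
    elapsed = now - raised_at
    return [role for threshold, role in ladder if elapsed >= threshold]


raised = datetime(2024, 1, 1, 8, 0)
print(who_should_be_there(raised, raised + timedelta(minutes=6)))
```

Nobody in this sketch has to decide to complain upward. The Team Member pulls the signal once, and the rest follows from the clock.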

Go look. How is your “Flow of Problem Solving”?

“Packaging” is spelled M-U-D-A

In Mike Wroblewski’s blog “Got Boondoggle?” he comments on just how much packaging and dunnage is not visible in Toyota’s Industrial Equipment plant. Of course that is remarkable because of just how common it is to find the opposite condition. Factories (and offices) have lots of packaging around, and spend lots of time dealing with it.

Toyota has been working on this a long time. They have the added advantage of being an 800 pound gorilla with most of their suppliers. They can specify packaging and insist on things being done a certain way. In a lot of cases it is the supplier that is the 800 pound gorilla, and a lot of small companies have a hard time being heard. But you can still make a lot of difference if you apply the principle that packaging is muda.

A frequently invoked (and overworked) analogy is the assembler or operator as a surgeon. Everything must be ready for the value-add operation to be performed waste free. If there is waste in the process, keep it away from this point. Surgical instruments come in packages. But I can assure you the surgeon is not unwrapping the scalpel. So who is?

The instruments are prepared and made ready for use by the staff.

Now take this to your factory floor. The first step is to keep dunnage and packaging out of the production area. There are actually a lot of advantages to doing this. The chief one is obviously optimizing your value-add time. But all of that packaging also takes up space. EVERYTHING that enters the production area MUST have a process (meaning a person!) to get it OUT of the production area. Since you have to unpackage that stuff anyway, do it before it gets to production. This means that someone else picks the part and prepares it for use just like the staff in the operating room.

Do this and you have applied one of the principles Liker and Meier point out in The Toyota Way Fieldbook – if you can’t eliminate sources of variation, then isolate them. In other words, set up a barrier that contains the waste so that your value-adding operation sees the result of a perfect supplier.

You can then take the next step: Do this at receiving. If the parts do not arrive from the supplier packed the way that your internal material conveyance system needs them, then put the resources into receiving to convert what you get from your suppliers into what you wish you got from them.

This is applying the principle of systematically pushing waste upstream, closer to the point where it originates. The other thing you have done is force yourself to dedicate resources to deal with this waste instead of spreading it so thin the cost is hidden. You will know what it costs you in terms of people, time, space, etc. to deal with the fact that your suppliers don’t ship what you need. By highlighting the problem instead of burying it, you have the opportunity to address it.

One more thing – it might seem easier to take these little wastes and spread them thin. After all, if everybody just does one or two trivial tasks, it doesn’t seem so bad, does it? It is those trivial wastes, those 5 second, 30 second, 1 minute little things that accumulate to half your productivity. It takes work to see them and eliminate them. Don’t add more to the process on purpose.

Invert the Problem

One very good idea-creation tool is “inverting the problem” – developing ideas on how to cause the effect you are trying to prevent. This is a common approach for developing mistake-proofing, but I just saw a great use of the idea for general teaching.

Ask “How could we make this operation take as long as possible?” Then collect ideas from the team. Everything on the flip chart will be some form of waste that you are trying to avoid. In many cases, I think, even the most resistant minds would concede that nothing on this list is something we would do on purpose.

It follows, then, that if we see we are doing it, we ought to try to stop doing it. And that is what kaizen is all about.

Do Your People Solve the Problem or Work The System?

This article by Anita Tucker and Amy Edmondson at Harvard highlights a problem that is as common on the manufacturing floor as it is in the hospitals they studied:

When people encounter a problem that stops their work, they work the system, get what they need, and continue their work.

A lot of people call this initiative, and most organizations reward this behavior. Many of those organizations have actual or implied negative consequences for bringing up an issue that “you could have solved yourself.” Unfortunately this behavior only accomplishes one thing: It guarantees that the problem will occur again.

What is the big deal? Simple. Small problems accumulate. They do not go away, and more come into play every day. Eventually the Team Members are overwhelmed by “too much to do.” Supervisors press for “more people,” the organization grows in size, and the cycle continues. In health care all you have to do is spend an hour talking to a harried nurse to know all of the things that keep them from providing patient care.

Go stand in the chalk circle on your own shop floor. What things keep your Team Members from doing their jobs?