Attack on Ambiguity

When real effort is spent getting to the cause of problems (vs. a reflex to find someone to blame), ambiguity often enters into the picture.

Problem solving is a process of asking questions and clarification.

Is a “defect-free” outcome of the process specified? Does the Team Member know what “success” is?
Is there a way for the Team Member to actually verify the result?
Does that check give the Team Member a clear Yes/No; Met/Not Met; Pass/Fail response? Or is there interpretation and judgment involved?

If there is good specification for “defect-free” – is there a specification for how to achieve it? Do you know what must be done to assure the result that you want? Does the Team Member know? Is the Team Member guided through the process? Are there verifications (poka-yoke, etc) at critical points?

If all of the above is in place, do you know what conditions must be in place for success? What is the minimum pressure for your air tools? Is there a pressure gauge? Does it cut off the tool if the pressure is too low? Is there some visual check that all of the required parts, pieces, tools are there before work starts? Think about the things you assume are there when the Team Member gets started.
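As a minimal sketch of what an unambiguous pre-start check might look like, the snippet below treats every condition as a strict pass/fail test and reports exactly what is wrong when the answer is "no." The 85 psi minimum, the part names, and the function name ready_to_start are all hypothetical, chosen only for illustration.

```python
MIN_AIR_PRESSURE_PSI = 85.0  # hypothetical minimum; use your tool's real specification


def ready_to_start(air_pressure_psi, parts_present, required_parts):
    """Return (ok, problems): a clear yes/no plus the reasons for any 'no'."""
    problems = []
    if air_pressure_psi < MIN_AIR_PRESSURE_PSI:
        problems.append(
            f"air pressure {air_pressure_psi} psi is below the minimum "
            f"of {MIN_AIR_PRESSURE_PSI} psi"
        )
    # Anything required but not present is reported by name.
    for part in sorted(set(required_parts) - set(parts_present)):
        problems.append(f"missing: {part}")
    return (not problems, problems)


# Example: low pressure and a missing gasket both block the start.
ok, problems = ready_to_start(80.0, {"bolt", "washer"}, {"bolt", "washer", "gasket"})
print("START" if ok else "STOP: " + "; ".join(problems))
```

The point of the design is that there is no judgment call left to the Team Member: the check either passes or it names the condition that is not met.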

For ALL OF THE ABOVE, if something isn’t right, is there a clear, unambiguous way to alert the Team Member immediately?
Is there a clear way for the Team Member to alert the chain-of-support that something isn’t right?

If there is a defined result, a defined process to achieve it, and all of the conditions required for success are present –
Is the Team Member alerted if he deviates from the specified process, or if a critical intermediate result is not right?
More poka-yoke.

Now you have the very basics for consistent execution.

Is the process carried out as you expect?
Is the result what you expected?

If not, then the process may be clear, but it clearly does not work. Stop, investigate. Fix it.

All of this is about getting more and more specific about what is SUPPOSED to happen and comparing it to what is REALLY happening… continuously. I say continuously because “continuous improvement” does not happen unless there is continuous checking and continuous correction and problem solving. If you want “continuous improvement” you cannot rely on special “events” to get it. It has to be embedded into the work that is done every day.

Ask “Why?” – but How?

Get to the root cause by “Asking Why?” five times.
We have all heard it, read it. Our senseis have pounded it into us. It is a cliché, obviously, since getting to the root cause of a problem is (most of the time) a touch more complicated than just repeatedly asking “Why?”

Isn’t it?

Maybe not. Maybe it is a matter of skill.

Some people are really good at it. They seem to instinctively get to the core issue, and they are usually right. Others take the “Problem Solving” class and still seem to struggle. So what is it that the “naturals” do unconsciously?

Let me introduce another piece of data here. “Problem Solving” is taught as an application of the scientific method. The scientific method, in turn, is hypothesis testing. How does that relate to “ask why? five times”?

Each iteration of asking “why?” is an iteration of hypothesis testing.

How do you “ask why?”

Observe and gather information.
Formulate possible hypotheses.
For each reasonable possibility, determine what information would confirm or refute it. (Devise experiments, which really means “Decide what questions to ask next and figure out a way to answer them.”)
Observe, gather information, experiment. Get answers to those questions.
Confirm or refute possible causes.

At each level, a confirmed cause is the result of “observe and gather information” so the process iterates back to the top.

Eventually, though, a point is reached where going further either obviously makes no sense, or is of no additional help. If you are now looking at something you can fix, you are at the “root cause” for the purpose of the exercise. Yes, you can probably keep going, but part of this is knowing how far is far enough.

Is this something I can fix easily?
Does it make sense to go further?

Otherwise, iterate again.
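The loop described above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the function ask_why, its callback parameters, and the machine-stoppage data are all made up for the example, and a real investigation tests hypotheses by going and looking, not by consulting a lookup table.

```python
def ask_why(problem, find_causes, is_confirmed, can_fix, max_depth=5):
    """Follow confirmed causes downward until one is something we can fix."""
    chain = [problem]
    current = problem
    for _ in range(max_depth):
        # Formulate hypotheses, then test each one; keep the first that is confirmed.
        confirmed = next((c for c in find_causes(current) if is_confirmed(c)), None)
        if confirmed is None:
            break  # nothing confirmed yet: more observation is needed
        chain.append(confirmed)
        current = confirmed
        if can_fix(current):
            break  # far enough: this is the "root cause" for our purposes
    return chain


# Hypothetical data standing in for real observation and experiment.
candidates = {
    "machine stopped": ["operator error", "fuse blew"],
    "fuse blew": ["power surge", "bearing overloaded"],
    "bearing overloaded": ["no lubrication"],
}
confirmed_facts = {"fuse blew", "bearing overloaded", "no lubrication"}
fixable = {"no lubrication"}

chain = ask_why(
    "machine stopped",
    lambda p: candidates.get(p, []),
    lambda c: c in confirmed_facts,
    lambda c: c in fixable,
)
print(" <- ".join(chain))
```

Notice that each level starts with several possibilities and only one survives the test. The tidy chain is what is left at the end, not how the investigation looked while it was under way.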

Now: Devise a countermeasure.

A countermeasure, itself, is a hypothesis. You are saying “If I take this action, I should get this result” i.e. the problem goes away.

Put the countermeasure into place. Does it work? That is yet another experiment only now you are (hopefully) confirming or refuting your fix on every production cycle. The andon will tell you if you are right.

We tell people “Ask why five times” but we really don’t teach them how to “ask why.”

The book examples usually show this neat chain of cause/effect/cause/effect, but the real world isn’t that tidy. When the problem is first being investigated, each level often has many possibilities. Once the chain is built then the chain can be used as a check.

But that isn’t how you GET there.

Why don’t the books do a good job teaching this?

“They” say that critical thinking is difficult to teach. I disagree. If the people who do it unconsciously can step back and become consciously competent, and know how they do it, then it breaks down into a skill, and a skill can be taught.

A Real World Example
My computer works, but its network connection to the outside world doesn’t.
OK. What could be wrong?
It could be software in the computer.
It could be a problem with the hardware.

Look at where the cable plugs into the back of the computer. Are the little lights flashing? No? Then there is no data going through that connection.

How could that be?

Well.. it could be a problem in the computer or operating system.
It could be a problem with the hardware in the computer.
It could be a bad cable.
It could be a problem behind the network jack on the wall.

The QUICKEST thing to do is unplug the cable and plug my co-worker’s cable into the computer. (Please make sure he isn’t busy with email before you do this!). Do the lights come on? Yes? Does your network stuff work now? Yes? Then it isn’t anything in the computer. You have just done a hypothesis test – conducted an experiment.

Take his KNOWN GOOD cable out of the wall and plug it into your jack. Does your network work now? Yes? You have a bad cable. No? It is a problem behind the jack.. call I.T. and tell them. (unless you are at home, then head to the little blue box in the basement and start looking at flashing lights down there. But same process.. as you systematically eliminate internal causes, you are left with an external one.)
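For readers who think in code, the two experiments above amount to a small decision procedure. The sketch below is just that reasoning written down; the function name diagnose and its parameter names are invented for illustration.

```python
def diagnose(works_on_coworker_connection, works_with_good_cable_in_my_jack=None):
    """Narrow down the fault from the results of the two cable-swap experiments."""
    # Experiment 1: plug the co-worker's live cable into my computer.
    if not works_on_coworker_connection:
        return "computer (software or hardware)"
    # The computer is ruled out. Experiment 2 may not have been run yet.
    if works_with_good_cable_in_my_jack is None:
        return "computer ruled out; try the known-good cable in my jack"
    # Experiment 2: the known-good cable plugged into my own wall jack.
    if works_with_good_cable_in_my_jack:
        return "my cable"
    return "behind my wall jack"


print(diagnose(True, False))
```

Each answer eliminates a group of hypotheses, which is why two quick swaps are enough to separate three possible locations for the fault.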

This is a natural flow, but most people wouldn’t describe it as “asking why?” or “hypothesis testing” – and the big words scare them off.

Still – when you (the lean guru) are teaching others, it is important for them to understand that HOW TO ASK WHY is just a process of learning by systematic elimination of the impossible. (Whatever remains, however unlikely, must be the truth – Arthur Conan Doyle through Sherlock Holmes)

The Value of People

How can some companies not only survive, but thrive when operating in “high cost labor” areas, while others are struggling even as they are busy chasing the lowest possible costs?

I would like to suggest that one key difference is the attitude toward people. On the one hand is the “people as cost” model. This model usually has a couple of built-in assumptions.

  • The number of people required to do a particular task is fixed, often against some kind of earned-hours standard.
  • The cost driver is wages, salaries and benefits.

On the other hand is the (seemingly) rare organization that truly believes that people are their strength, or the well worn out “our greatest asset.”

The assumptions which are required for this belief are:

  1. People’s net productivity can always be improved.
  2. The cost driver is the amount of time wasted coping with all of the small problems that keep things from going perfectly.

The above assumptions, of course, are anchored in a faith-based position that perfection is possible. (see “Chatter as Signal”)

So… what kinds of actions do each of these two models drive?

The first one – people are cost – says to find the cheapest possible labor and hire it. Since factory wages in China are (right now) running about 1/12 or less of those in the USA or Europe, that seems a logical choice. Here’s the rub.

You can outsource the entire job to another company – give the work to the lowest bidder. Now, if you truly believe that the amount of labor is fixed, and that only lower wages can change cost, then this is the obvious choice. You are relying on your superior supply chain management system to ensure you select a supplier that can maintain, and maybe even improve, the quality they deliver, plus hold the line against increases in materials, energy, and their own labor costs. In short, you are looking for a supplier who believes the opposite of what you do. Your ideal supplier knows they can bid aggressively, get your work, and then improve their profit position by applying continuous improvement.

Or you can export the production and set up your own operation in a “low wage area.” You are shifting your core beliefs about people to another culture, and another language. Communication (believe me!!) is a major issue, even if the managers work for you.

AND.. if “labor is cheap” then the solution to problems is to throw people at them. The cost differential you actually get almost NEVER reaches the advantage of a 1:1 substitution. Oh – and you just added 3-4 weeks to your lead / response times.

If, on the other hand, you take the attitude that the most precious resource in your operation is people’s time – no matter what you pay them, and take the attitude that to deliberately waste anyone’s time is to show great disrespect, then I would suggest that even in high-wage areas you can drive levels of improvement in productivity, quality and response to your customers that would be difficult to beat anywhere.

So – before you reflexively outsource or relocate to a “low wage area” please check your attitude about people. What are your expectations, and why is it that you don’t believe your own people are capable of delivering a 10x improvement?

Who are your competitors? What do they do?
What would you do if you had to compete with Toyota? or Komatsu? (to name two that come to mind) They are building product in your back yard, why can’t you?

Afterthought: Some companies end up outsourcing the skills they need to improve their own products and systems. They no longer understand the technology they sell, they no longer know how to make what they sell. I remember a time when I reminded an (arrogant) procurement executive that it was possible to outsource the entire procurement process just as easily. Another team had outsourced all of their direct labor management… they contracted the labor and the first level supervisors into their factory. Where did they really believe this was leading?

Chatter in an ISO Process

I have been in, or encountered, a number of organizations which had (or were working on) ISO-900x quality registrations. While I am fully aware of the intent of the ISO requirements, in the cases I have seen, the effect seems to fall well short of the goal.

On the surface, the types of processes mandated by ISO 9001 seem quite reasonable. They require knowing what your processes are, having documentation for them, and having systems that address problems to root cause.

The requirements are not a lot different from what I mentioned a couple of posts ago in Chatter as Signal. The example of the U.S. Navy’s nuclear propulsion operations would certainly meet the criteria (and then some).

So the question is:
What is fundamentally different about an organization with a “paper only” ISO certification that struggles with chaos every day vs. one which is truly process driven (whether they have an ISO certification or not)?

Chatter as noise.
Chatter as signal.

The Importance of Heijunka

My friend Tom poses an interesting question to production managers:

“If I ask you to produce different quantities and types of products every day, what quantity of people, materials, machines, and space do you need?”

Of course the answer is usually, at best, inarticulate and, at worst, a blank stare. There isn’t any way to know. Add to this the well-established research on the “bullwhip effect,” which amplifies the magnitude of these fluctuations as you move up the supply chain, and it is easy to see the suppliers are really set up to fail.

Then he asks another question:

“If I ask you to produce the same quantities and types of products every day, or every hour, could you then answer the question?”

And, of course, the answer is that this is a “no brainer.” It would be very easy.

So the rhetorical question to ask is: “Why does Toyota place such emphasis on heijunka?”
But my question is “Why don’t we all do it?”

Heijunka is a process of dampening variation from the production schedule. In English it is called “production leveling.” It comes in two steps:

  • Leveling the daily workload – smoothing out variations in the overall takt time.
  • Leveling the product mix within the daily work load – smoothing out variations in the demand from upstream processes.
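A minimal sketch of the two steps, under simplifying assumptions: weekly demand divides evenly across the working days, and the mix is leveled by always building the product that is furthest behind its even share. The product names, quantities, and function names are hypothetical, and this is one simple sequencing rule, not a description of how Toyota actually schedules.

```python
def level_daily(weekly_demand, days=5):
    """Step 1: spread each product's weekly demand evenly over the days."""
    # Assumes each quantity divides evenly by the number of days.
    return {product: qty // days for product, qty in weekly_demand.items()}


def level_mix(daily_qty):
    """Step 2: interleave products so each is built at a steady rate all day."""
    total = sum(daily_qty.values())
    produced = {product: 0 for product in daily_qty}
    sequence = []
    for slot in range(1, total + 1):
        # Pick the product furthest behind its ideal cumulative share
        # (integer arithmetic: ideal share and actual count both scaled by total).
        behind = max(
            daily_qty,
            key=lambda p: slot * daily_qty[p] - produced[p] * total,
        )
        produced[behind] += 1
        sequence.append(behind)
    return sequence


weekly = {"A": 50, "B": 25, "C": 25}  # hypothetical weekly demand
daily = level_daily(weekly)
sequence = level_mix(daily)
print(daily)
print("".join(sequence))
```

The contrast is with building all of A in the morning, then all of B, then all of C. The leveled sequence asks the upstream processes for a little of everything all day long instead of a surge of one thing at a time.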

Production leveling, however, is difficult, and the management has to have the fortitude to do it. Honestly, most don’t. They don’t like to deliberately set the necessary inventory and backlog buffers into place. So I’d like to explore some of the consequences of not doing it and then ask if these costs are worth it.

Consider this analogy.

Take a look at what modifications are necessary for a vehicle to traverse a rough, irregular road. The suspension must be beefed up, more power is required, the drive train is far more complex for 4×4. The road itself is unmarked, so the driver does not know where he is or where he is going without sophisticated navigation equipment.

Most of this additional hardware is actually unnecessary most of the time. But it might be needed, so it has to be there… just in case.

The driver of this vehicle is primarily concerned with what is right in front of the vehicle, and far less concerned about what is a mile (or even a few dozen meters) up the road. He will deal with that problem when he comes to it. Driving in this environment requires skill, training, experience, and continuous vigilance. People do this for recreation just for these reasons.

Now smooth out the road. Straighten out the hard curves. Give it some pavement. Put in signs so it is possible to navigate as you go. The same speed can be maintained by a vehicle which is lighter, has less power, a simpler drive system, a simpler suspension. Even though the engine is smaller, it is more efficient because it can run at a constant power output; the sudden accelerations are not necessary. Everything is easier.

The vehicle is much less expensive and much more efficient. The driver’s task is far simpler.

When you allow outside-induced variation to work its way through your system, you are putting potholes in the road. You are introducing sudden turns, sudden changes. Sometimes you are washing out entire bridges. People must be more and more vigilant and they simply miss more things. Their mental planning horizon shrinks to what they are working on right now, and maybe the next job. They certainly aren’t checking what they need tomorrow. They will worry about that in the morning.

The Effect on Materials

Even in the worst managed operations, people generally want to be able to provide what they are supposed to. They are motivated to be “good suppliers.” They also intrinsically understand that if they are idle (not producing) this is not good. Even if they are not provided with the tools and resources to do so, they will do the best they can to succeed – even if those things hurt the overall organization.

(I should note that most “management by measurement” systems actually encourage people to do things that hurt the overall organization, but that is another article.)

When these well meaning people encounter problems, they try to mitigate the effects of those problems with the resources they have available.

  • If their upstream suppliers do not deliver reliably they will add inventory so they have what they need.
  • Likewise, if their upstream suppliers do not deliver good quality, they will add some more inventory to make sure they have enough good material.
  • If there is quality fallout within their own process, they will add inventory and up production to cover that. By the way, that increase of production also increases the demand on the upstream suppliers, sometimes in unpredictable ways.
  • If their customers have irregular demand patterns, they will add inventory so the customers can have what they need, when they need it.
  • If there is batch transportation either upstream or downstream from them, they have to accumulate inventory for shipping.
  • If they are on a different shift schedule from either their customer or supplier, inventory accumulates to accommodate the mismatch.

Do you notice a theme here? The key point is: Without the system level view from their leadership, and without the problem solving support, all they can do is add inventory to cope.

Without leveling, any variation in demand will propagate upstream. At each step, two things happen:

  1. Processes that accumulate and batch orders progressively add to the amplitude of the variation.
  2. Irregularities within each process are added to the variation that comes from the customer.
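The first of these effects is easy to demonstrate with a toy model. In the sketch below, each tier waits until it has accumulated a full batch before ordering from the tier above it. The demand figures and batch sizes are invented for illustration, and the model deliberately ignores the second effect (each process's own irregularities), which would only make things worse.

```python
def batch_orders(demand, batch_size):
    """Accumulate demand and order upstream only in full batches."""
    orders, pending = [], 0
    for quantity in demand:
        pending += quantity
        batches = pending // batch_size
        orders.append(batches * batch_size)
        pending -= batches * batch_size
    return orders


def swing(series):
    """Size of the variation: the gap between the busiest and quietest day."""
    return max(series) - min(series)


customer = [10, 12, 9, 11, 10, 13, 8, 11, 10, 12]  # hypothetical daily demand
tier1 = batch_orders(customer, 25)  # what the factory orders from its supplier
tier2 = batch_orders(tier1, 50)     # what that supplier orders from its own supplier

for name, series in [("customer", customer), ("tier 1", tier1), ("tier 2", tier2)]:
    print(name, series, "swing:", swing(series))
```

The customer's demand barely moves from day to day, yet two tiers up the orders alternate between nothing at all and one large lump. Nobody in the chain did anything unreasonable; the batching alone produced the wave.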

By the time this hits your supply base, it is a tsunami. First the beach goes dry as it looks like the order base has dried up. (This is why you need to constantly reassure your suppliers with a forecast – because they can’t see regularity in your orders.) Then all of that water comes rushing in at once – and your suppliers can’t cope. Worse, they may have allocated the capacity elsewhere because they were tired of waiting on you. Lead times go up, things get ugly.

But even internally, all of this self-protection just adds more and more noise to the system.
So they add more and more inventory.

For a management team that is reluctant to deliberately add some inventory or backlog buffer to contain sources of variation and protect the rest of the system here is some news: Your people are already doing it, and in aggregate, they are adding FAR more inventory than would be needed with a systematic approach. They can only see the local problems, and each is just trying to be reliable – even if their efforts work in the opposite direction and actually introduce more variation into the system.

The Effect on People

I live in the Pacific Northwest of the USA. A fact of life living here is that, occasionally, the earth literally moves under our feet. I can tell you from experience that this is psychologically unsettling.

In our factories we do the same thing to people when the schedule changes every day. In the name of flexibility we shift requirements up and down. Add to that chasing shortages and hot list jobs around, and the daily work place is chaos. People are not sure if they are succeeding. Or, at best, they declare victory because they were not buried today.

Daily kaizen? That is just not going to happen in this environment. When you start talking about introducing flow, you threaten the self-protecting inventory buffers, and I can assure you that you will have a fight on your hands. Why? Because your people believe they need these buffers to get the job done – the job you want them to do. Now you are taking that away? Are you insane?

This is why it is critical to establish a basic takt as early as possible, then immediately start aligning the expectation to just meeting that takt.

Anything that keeps people from meeting takt becomes a problem and must be addressed. This is jidoka. Heijunka is a block in the foundation of the TPS “house” for a reason. Unless people are standing on solid ground, they can’t even consider anything like “just in time” or “stop and respond to problems” because they are spending all of their mental bandwidth just trying to figure out what is going on hour by hour.

Conclusion

When I was a military officer, we were trained in tactics designed to present our opponent with a constantly changing picture of what was happening. We wanted to inject as much confusion and uncertainty as possible. The mechanics of defeat on the battlefield are simple: The force subjected to this first shifts from action to reaction. They lose initiative, and therefore lose psychological control. Next the horizontal control linkages start breaking down. Each sub-unit starts to feel isolated from the others. They feel less a team and more on their own. Then, as more and more of their attention is shifted to self-preservation, the vertical chain of command breaks down. Each sub-unit is now mentally isolated and can be defeated in detail.

Ironically many factories are managed such that the workers on the shop floor are subjected to exactly these same conditions – and we wonder why they have a cynical view. We are defeating ourselves.

Chatter as Signal

As I promised, I am going to continue to over-play the afternoon my team spent with Steven Spear.

In his forthcoming book “Chasing the Rabbit” (to be published in the fall), he profiles what is different about those companies which seem to easily be increasing their lead against competitors when there is no apparent external advantage.

One of the core concepts he discussed was the nature of complexity in organizations, processes and products. It is the way this complexity is managed and handled that distinguishes the leaders from the pack of competitors that are fighting and jostling for second place.

In a complex system, there are invariably things people miss. Something is not defined, is ambiguous, or just plain wrong. These little things cause imperfection in the way people do things. They encounter these unexpected issues, and have to resolve them to get the job done.

This is “chatter” in Spear’s words. The sound made when imperfect parts try to mesh together.

Most organizations accept that they cannot possibly think of everything, that some degree of chatter is going to occur, and that people on the spot are paid to deal with it. That is, after all, their job. And the ones that are good at dealing with it are usually the ones who are spotlighted as the star performers.

The underlying assumptions here are:

  • Our processes and systems are complex.
  • We can’t possibly think of and plan for everything that might go wrong.
  • It is not realistic to expect perfection.
  • “Chatter is noise” and an inevitable part of the way things are in our business.

On the other hand, the organizations that are pulling further and further ahead take a different view.
Their underlying assumptions start out the same, then take a significant turn.

  • Our processes and systems are complex.
  • We can’t possibly think of and plan for everything that might go wrong.
  • But we believe perfection is possible.
  • “Chatter is signal” and it tells us where we need to address something we missed.

We have all heard about Toyota’s jidoka and andon processes, so let me bring out another example, again, that was used by Spear.

The U.S. Navy has been operating nuclear reactors with a 100% (reactor) safety record for nearly (over?) 50 years. And they operate a lot of nuclear reactors. When they started, they were in totally new and unfamiliar territory – they were doing things that had never been done before. In fact, no one was even sure if it was possible.

They asked the question: How should this nuclear reactor be operated? They answered it with a set of incredibly specific procedures which everyone was expected to follow – exactly, without deviation in any way. These procedures represent the body of experience and knowledge of the U.S. Navy for operating nuclear reactors.

Here is the key point: ANYONE who departs from the procedure, in any way, no matter how trivial or minor, must report “an incident” which rockets up the chain of command. The reasons for the departure are understood. If there was something outside the scope of the procedure, the new procedure covers it. If something was unclear, it is clarified.

This may not be the Toyota Production System at work, but it is a version of something that makes it work: Jidoka.

If the process is not working, can not work, or conditions are not exactly as specified for the process to succeed, then STOP the process, understand the condition, correct it, restore the system to safe, quality operation, and address the reason it was necessary to do this.

Chatter is signal.

So – at a Toyota assembly line in Japan some years ago, I observed a Team Member drop a bolt. He pulled the andon cord and signaled a problem.

More about Overburden (Muri) in Health Care

The last post got way too long, and I wanted to get it out there. But of course, there are afterthoughts.

At a level higher than simple process chaos, overburden hits the entire organization when perceived demand is significantly greater than perceived capacity.

As I noted in the earlier post, segregating what should be routine from the true exceptions goes a long way, especially when there is work to continuously improve execution of routine things. This results in less capacity being used to process routine, and therefore, more capacity available to handle the true emergent stuff.

The next phase is to repeat the process, step by step, on the exceptions. Identify what makes them exceptions. Is there another process that can be isolated and segregated? Can you move something from “exception” to “routine” in some way?

Then look at what is left.

About 20 years ago, Philip Agre wrote a seminal PhD Dissertation at M.I.T. called “The Dynamic Structures of Everyday Life.” If you can find it, read it. This work was a major contributor to turning the science of symbolic artificial intelligence on its head. One of his conclusions was that almost everything we do is routine, and we do non-routine things in routine ways.

This thinking applies to complex, one-of-a-kind process situations. What “experience” brings to the table is knowing which of the things we know how to do routinely must be done; in what order; to gain control of the uncontrolled; and get the desired outcome.

In our heads, this is much messier than we want to believe it is. Fundamentally what we do is to try something we believe will have a certain effect, then see what effect it actually has. If the effect is the one we predicted, then we are one step closer to control and the stage is set for the next action; if not then we learn what did not work, gain a bit more understanding and try something else.

This is also how we build that thing called “experience” step by step, stretching our understanding, moving what we do not know into what we do. We do this as individuals, but it is only a truly exceptional organization that can do it as an institution. Learning is a process of prediction, testing and comparison.

The objective in these situations is to move an unknown, uncontrolled situation gradually toward familiar ground and make it into something routine.

Steven Spear quoted a health care worker who summed it up pretty well: “Air goes in and out, blood goes round and round. If either of those is not happening, we have a problem.” And in the most extreme medical emergency, the first steps are always to stabilize vital signs so that the patient will live long enough for the caregivers to understand the problem and develop countermeasures.

This is still, however, a customized sequence of tasks that should, themselves, be routine. Only the macro level varies. The more that can be done to stabilize the delivery of treatment to the patient, the less harried people will feel. They should not worry about the small things so they can pay attention to the big things.

The weak points in a complex system are the interconnections. People are not sure who should do, or has done, what. There are repeated transfers from one caregiver to another, often with far less than complete information – leaving it to the next caregiver to assess the situation all over again. Every time this happens presents an opportunity to overlook or misinterpret something that is already known.

By working very hard on execution of the things that should be routine, that much more mental capacity is made available to care for the patients. This means attacking ambiguity wherever it is found.

Mura, Muri (and Muda) in Health Care

Corrie van den Hoek, a regular reader and correspondent from The Netherlands, is working on applying kaizen in the health care industry. She left a comment on ‘The White Board’ asking my thoughts on the concepts of mura and muri in the health care field.

I think it is first important to define the terms because (1) Not everyone has heard them and (2) The translations from Japanese can differ a bit.

Mura is usually translated as “inconsistency.”

Muri is usually translated as “overburden.”

Mura and Muri are the brothers of the better-known Muda, which, of course, translates as “waste” or “unnecessary work.”

I am aware that it is possible to split hairs on the translations, but I think these suffice for the sake of discussion.

Like any industry, Health Care has a product to deliver (treatment of patients) and the administrative processes that support the care givers, patients, and keep it running as a business. There is huge room for improvement in both of these areas, and of course problems in one have impact on the other.

I started to get to these issues in this post, but did not go into any depth. The cool thing is that the article I was writing about in a general sense is actually written from a health care context. So I highly recommend reading it as some additional background.

Muri – Overburden – “Asking the unreasonable or impossible.”

In the article, Tucker and Edmondson refer to an “error” as doing something inappropriate or unnecessary, and a “problem” as something which interferes with accomplishing a task in the specified way.

Problems as Overburden

They cite a typical example of a problem. A Team Member’s task is to change linens. This task is routine. She goes to the storage area for linens on her floor, and finds none. She goes to another floor, and perhaps another, and ultimately finds the linens she needs, then returns to the task she was trying to accomplish in the first place. (She at least did not have to hire a taxi to deliver fresh linens from the service, as other caregivers reported they had done.)

At the end of the shift, however, I would wager this Team Member wasn’t able to get everything done. Or she had to hurry to do things. Perhaps the work left undone is now passed to someone else and will disrupt their work. All of this is an example of overburden – asking (or implicitly expecting) Team Members to do more than they should, or more than they can. At the very least, the floor she took the linens from now has fewer than they probably need, and another safari will be launched from that floor tomorrow.

In this case, the Team Member is implicitly expected to “do what must be done” in order to deliver care. There are no avenues to address, or even call out, the existence of these problems. Calling them out carries at least an implied professional or psychological risk of being branded a complainer, or “not a team player.”

Indeed, working around these kinds of issues is a major source of satisfaction and pride in the work culture. I quote from a quote in the article:

Working around problems is just part of my job. By being able to get IV bags or whatever else I need, it enables me to do my job and to have a positive impact on a person’s life – like being able to get them clean linen. And I am the kind of person who does not just get one set of linen, I will bring back several for the other nurses.

For management, the question is a simple one: Is this task one which you would deliberately design into this person’s work process? If not, then question why it must be done at all. But you can’t just question it. That implies the person doing it is doing something wrong. She isn’t. She is doing exactly what must be done to do the job she was given. Question why it must be done so you can remove the necessity to do it.

The Muri of Unnecessary Life-and-Death Decisions

Overburden is also the case where a Team Member is asked to make multiple perfect decisions in high-stress situations. I am not talking about deliberate decisions about, for example, what type of care to deliver. Rather I am talking about the simple decisions that are repeatedly forced on Team Members during the routine delivery of care. Many of these seemingly simple decisions are overburden because the Team Member should not be asked to make them at all. Making them adds to the work stress because, in medical care delivery, the consequences of an error can be catastrophic in terms of “negative patient outcome.”

A case that comes up time and time again in examples I hear – both from literature and in my own conversations with people inside the system – is a classic one: A Team Member selects the wrong small vial of colorless liquid from a shelf or tray and injects it into a patient. Sometimes this is harmless. Other times it is fatal. These mistakes, however, only get the attention of the system when there is harm to the patient. And the attention of the system is nearly always focused on finding out who did it and assigning blame.

Steven Spear recounts a typical case in Fixing Health Care From The Inside.

He cites an investigation into a case where a woman recovering from routine surgery suddenly developed seizures. Her blood sugar level crashed, she lapsed into coma and died. Here is a key point from the investigation:

a nurse had responded to an alarm indicating that an arterial line had been blocked by a blood clot, and he had meant to flush the line with an anticoagulant, heparin. There was, however, no evidence that any heparin had been administered. What investigators did find was a used vial of insulin on the medication cart outside Mrs. Grant’s room, even though she had no condition for which insulin would be needed.

Instead of asking “Why did the care giver administer insulin instead of heparin?” how about asking “Why was insulin even in the room in the first place?” This is simple 5S – eliminating the things that are not needed. Actually no. This is somewhat advanced 5S, because it is eliminating the things that are not needed NOW. Perhaps it is appropriate to have insulin in the room for some patients. But it apparently was not appropriate for this patient. And even if there are non-routine conditions which could require insulin, then the insulin should be stored in a place that forces a conscious and deliberate decision to retrieve it.

Key Point: Separate the routine from the non-routine. Separate normal from abnormal.

Another example was cited directly to me by a friend who works in Health Care. In another big-name big-city hospital a woman was in routine surgery. A staffer in the operating room chose between two clear vials of clear liquid, picked up the wrong one, and administered a cleaning substance to the patient, killing her.

Of course this scenario raises exactly the same questions as the one above it. If it doesn’t go into the patient, why is it in the room at the same time the patient is? And if it must be in the room, why is it accessible in a routine way to a routine process?

Spear points out that for every death or serious injury there are many instances of these errors that do not result in serious problems, and many times that number of instances where the error is almost made, but is caught and corrected in time.

This is, in my opinion, a form of “overburden” because people are being asked to make decisions that have life-and-death consequences, and those decisions are entirely unnecessary if someone would only ask “Why did this person have to choose?” instead of “Who made the wrong choice?” or (a little bit better) “Why was the wrong choice made?”

Whenever we inject ambiguity into the situation (or even allow ambiguity to persist) we are expecting someone – who may not expect it – to see it and resolve it.

Countermeasures:

Most times the proposed solution is around better labeling and identification. But I would like to suggest that “mistake proofing” is actually a process of:

  1. Systemically eliminating sources of errors by eliminating choices;
  2. If that can’t be done then putting up barriers that stop the process if an error is about to occur;
  3. and if that can’t be done by doing something that breaks unconscious routine in a way that forces the person to notice the impending error.

Better labeling falls into the third category here. Ask tougher questions, and support your people better.

What about Mura, or inconsistency?

Traditionally this is about a widely varying workload. In industry, the countermeasures are to establish a takt time, apply production leveling, set cycle times to the takt, and in general, work hard to keep the workload as even as possible. There are a lot of good benefits to this and the performance of the companies that do it very well suggests that doing it is worth the perceived costs and trouble.

One of the things frequently cited by Health Care is how their workload is wildly variant and unpredictable. These perceptions are certainly not unique to Health Care, but it is probably worthwhile exploring the situation from their context. I certainly don’t expect the Health Care community to make the leap from consumer goods or dump trucks to patient workloads and processing insurance claims.

Based on my limited dealing with Health Care, I am going to do a little conjecture, then attempt to go from there. If I am totally off base with my assumptions, feel free to correct me in a comment, and I’ll re-think.

As I see it, two big drivers for high day-to-day variation of demand on the system are:

  • Patients can show up at any time. This is especially true in Emergency Services, where, by definition, demand is unprogrammed.
  • Each individual case is potentially unique, or at the least, any one of them could go from routine to non-routine at any time.

Does that about capture it?

Shifting The Thinking A Bit

Not everything I propose here will work every time – there are true exceptions out there. But, in general, at least one of these concepts has usually helped people find some foundation of stability they can leverage.

Rather than looking at a varying aggregate workload, start breaking things down into individual streams, and finding components of stability within the variation.

Workload Variation

This graph represents a wildly varying workload. Most reasonable people are going to look at this and conclude they pretty much have to be ready for anything, in any form, at any time.

But even in the face of wide demand swings, it is a rare operation that experiences -zero- or close to zero demand. There is some element which is reliable. Perhaps that element is small, but, at some level, it is usually there.

At first you probably won’t be able to control the wide swings, but what you can do is apply the principle of isolating instability.

Elements of Variation

This is exactly the same graph as the first one. The difference is the shading. The consistent part of the workload is shaded in green, the unstable or varying workload is shaded in red.
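As a rough sketch of that shading, here is one way to split a demand history into a consistent part and a varying part. The daily numbers are made up for illustration, and treating the lowest observed day as the "stable" level is my assumption, not a rule; a real operation might pick a different floor.

```python
# Hypothetical daily demand (cases per day); illustrative numbers only.
daily_demand = [12, 30, 9, 22, 15, 41, 10]


def split_demand(demand):
    """Split demand into a stable base and a per-day variable remainder.

    Assumption: the stable base is the minimum demand seen in the period,
    i.e. the work that showed up every single day.
    """
    base = min(demand)
    variable = [day - base for day in demand]
    return base, variable


base, variable = split_demand(daily_demand)
print(base)      # 9  (the "green" part, present every day)
print(variable)  # [3, 21, 0, 13, 6, 32, 1]  (the "red" part)
```

The point of the split is only that the base can be planned for as routine work, while the remainder is handled separately.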

If you look for sources of stability, vs. causes or sources of instability, most operations can usually find something to leverage. This works particularly well in administrative processes, but I’ll work on applying it to the care-delivery flow in a bit.

An Administrative Flow

(Thanks to the GHC team for making me think about this in their context)

Imagine, if you will, a routine administrative process that is carried out many times a day. Many, if not most, of these processes involve something along the lines of:

  • Getting some initial piece of information that triggers the process itself.
  • Confirming known information, frequently doing routine research to gather more information.
  • Summarizing that information in some formal manner – a report, a request, a transaction.

In my little example a process just like this one was experiencing wildly varying workloads from day to day. Some days they could process 15 or more, other days they would get bogged down with one. Some days they would receive a lot, other days they would receive a few. The arrivals followed all of the queuing models – work arrived in batches, in a distribution biased to the right, with a long left tail. The team was working Saturdays and long hours just to keep up, and was often getting further and further behind.

To level the workload we had to do two things. First, we needed to understand the actual demand over some reasonable period of time. We took a week since that time interval matched the kinds of deadlines they were usually under. Your mileage may vary. Based on that, we looked at how many per day they needed to get through, every day, to keep up with the demand they were experiencing. From that we established a nominal takt time of an hour.
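The arithmetic behind a takt time is simple enough to sketch. The figures below (a five-day week of eight-hour days, 40 cases arriving per week) are assumptions I picked so the result comes out to the one hour mentioned above; they are not the team's actual numbers.

```python
def takt_time_minutes(available_minutes, demand):
    """Takt time = available working time / units demanded in that time."""
    if demand <= 0:
        raise ValueError("demand must be positive")
    return available_minutes / demand


# Assumed figures for illustration only.
weekly_minutes = 5 * 8 * 60   # five 8-hour days
weekly_cases = 40             # cases received in a typical week

takt = takt_time_minutes(weekly_minutes, weekly_cases)
print(takt)  # 60.0 minutes per case
```

If demand in your own week is higher or the available time is shorter, the same division gives a shorter takt time, and the work has to be designed around that.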

For the cases that arrived reasonably complete, and were reasonably routine, one person could easily complete the work in an hour. The first countermeasure, therefore, was to put an upstream filter into place. The idea was that one person would be dedicated to routine transactions. The supervisor would do a quick review for completeness, and if the “routine” criteria were met, they would be placed in the appropriate work queue.

This process had a built-in check. The assumption being tested was that a complete case should take an hour to process, never longer. If a case took longer than an hour to process, it should not have been placed in the “routine work queue.” Thus, at the 60 minute mark, if the processor was not done, he kicked that one out of the work queue, back to the supervisor and started the next one.
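Here is a minimal sketch of that built-in check. The case IDs and processing times are hypothetical, and in real life nobody knows the processing time in advance; the `minutes_needed` value just stands in for "was it done when the 60 minute mark arrived?"

```python
from collections import deque

TIMEBOX_MINUTES = 60  # the one-hour limit described above


def work_routine_queue(cases):
    """Work (case_id, minutes_needed) pairs in arrival order.

    Returns (completed, kicked_back). A case that is not done at the
    timebox goes back to the supervisor instead of holding up the queue.
    """
    queue = deque(cases)
    completed, kicked_back = [], []
    while queue:
        case_id, minutes_needed = queue.popleft()
        if minutes_needed <= TIMEBOX_MINUTES:
            completed.append(case_id)
        else:
            kicked_back.append(case_id)  # not routine after all
    return completed, kicked_back


done, returned = work_routine_queue(
    [("A", 45), ("B", 90), ("C", 60), ("D", 130)]
)
print(done)      # ['A', 'C']
print(returned)  # ['B', 'D']
```

Each kicked-back case is a signal that either the screening criteria or the incoming information needs attention.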

This process immediately stabilized and accelerated the throughput on the vast majority of cases which were, in fact, routine. Everything went faster because they were no longer stopping the entire train to deal with an exception. The routine stuff went through routinely. They isolated variable processing from routine processing.

Of course they didn’t ignore the abnormal cases. There were two types of exceptions to handle.

  • The case that should have been routine, but was not because it was lacking something required to process it.
  • The case was truly an exception – something difficult or complicated, which even with complete information, requires more work than normal to be processed.

In the first few weeks, the team had a lot of cases get kicked out of the “routine” work queue. Then the numbers started to drop. This is because, each time, the team learned a little more about what causes line stops, and did a little better job:

  • Defining what they needed from their upstream processes, and making sure they got it.
  • Screening the incoming work to make sure it was set to process routinely and quickly.

What about the true exceptions? These, of course, remained. But they no longer clogged up the pipeline and stopped processing of the routine. The true exceptions were managed from a priority queue with a visual control. The other team members would pick the next one on the queue, and work it. The group’s supervisor could re-shuffle the work queue at any point, so the most important was always the next one to be picked. However, as a rule, he would not interrupt a Team Member from one case to work another.
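The team used a visual control, not software, but the logic of that priority queue can be sketched in a few lines. The class name, case IDs and priority numbers here are all hypothetical. Re-shuffling only affects cases still waiting; once a case is picked it is out of the queue, which matches the rule about not interrupting someone mid-case.

```python
import heapq
import itertools


class ExceptionQueue:
    """Priority queue for exception cases; lower number = more important."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps arrival order
        self._current = {}               # case_id -> latest priority

    def add(self, case_id, priority):
        """Add a case, or re-prioritize one that is still waiting."""
        self._current[case_id] = priority
        heapq.heappush(self._heap, (priority, next(self._order), case_id))

    def pick_next(self):
        """Return the most important waiting case, or None if empty."""
        while self._heap:
            priority, _, case_id = heapq.heappop(self._heap)
            # Skip stale entries left behind by a re-shuffle.
            if self._current.get(case_id) == priority:
                del self._current[case_id]
                return case_id
        return None


q = ExceptionQueue()
q.add("X", 3)
q.add("Y", 2)
q.add("Z", 5)
q.add("Z", 1)  # supervisor moves Z to the front
picked = [q.pick_next() for _ in range(3)]
print(picked)  # ['Z', 'Y', 'X']
```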

Over a fairly short period of time, the group’s throughput went up dramatically, they were no longer working weekends and overtime, and there was far less rework involved because they were catching the reasons for rework up front.

Now, apply this same thinking to any transaction that occurs in your Health Care arena. Processing insurance claims (or other financial transactions), for example, seems like something fairly similar to this.

But here is the point: Isolate the routine from the true exceptions. Establish a routine process to do routine things in routine ways. Process the exceptions separately.

What about delivery of care?

This gets a little trickier, but I think the same basic processes apply. If you think about it, most Emergency Rooms already do this with triage. But where they fall short is in establishing routines to do routine things, and having checks in place to make sure those things are happening as specified.

Thus, even with the best of intentions, the exceptions become the norm because they are allowed to become the norm.

Let’s look at routine, scheduled, surgery. There are fixed sequences of steps to prepare the patient, prepare the facility, and prepare the team. But I would contend that, even though “everybody knows what to do” there isn’t an expectation that everybody does it a particular way. The “Who does What, When” is not part of the expected routine. Thus, people don’t expect routine things to actually BE routine, so the non-routine things that mess up the process are taken as a matter of course.

Instead, assume that a routine, smooth, consistent process is possible. Then look for what keeps it from being ideal, and embrace those little things as kaizen opportunities… then address them!

This post is MUCH longer than I set out to make it. But I think the original question gets to the very core of the work most Health Care organizations need to tackle. I am going to stop writing, and throw it out there. I apologize if it is a little unpolished.

Hopefully it will generate a little discussion.

Jim Collins: “Good to Great” Website

Jim Collins’ book “Good to Great” has been a best-selling business book for several years. But I am not so sure everyone knows about Jim Collins’ web site. It has on-line mini-lectures, and much more material that reinforces the concepts outlined in the book.

As for how the concepts in the book relate to “lean thinking” – I believe they are 100% congruent. Examining Toyota in the context of the model outlined in the book shows everything Collins calls out as the crucial factors that separate sustainable improvement from the flash-in-the-pan unsustainable variety.

The only difference I can see between Toyota and the companies that were profiled is that Toyota has had these ingredients pretty much from the beginning, and Collins’ research was looking at companies that acquired them well into their existence.