The Importance of Heijunka

My friend Tom poses an interesting question to production managers:

“If I ask you to produce different quantities and types of products every day, what quantity of people, materials, machines, and space do you need?”

Of course the answer is usually, at best, inarticulate and, at worst, a blank stare. There isn’t any way to know. Add to this the well-established research on the “bullwhip effect,” which amplifies the magnitude of these fluctuations as you move up the supply chain, and it is easy to see that the suppliers are really set up to fail.

Then he asks another question:

“If I ask you to produce the same quantities and types of products every day, or every hour, could you then answer the question?”

And, of course, the answer is that this is a “no brainer.” It would be very easy.

So the rhetorical question to ask is: “Why does Toyota place such emphasis on heijunka?”
But my question is “Why don’t we all do it?”

Heijunka is a process of dampening variation from the production schedule. In English it is called “production leveling.” It comes in two steps:

  • Leveling the daily workload – smoothing out variations in the overall takt time.
  • Leveling the product mix within the daily work load – smoothing out variations in the demand on upstream processes.
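
Here is a minimal sketch of those two steps in Python. It is an illustration, not a description of how Toyota or anyone else actually schedules. The function names `level_volume` and `level_mix`, the order quantities, and the "pick whichever product is furthest behind" rule are all my assumptions, and the sketch takes for granted that an inventory or backlog buffer absorbs the gap between real orders and the leveled plan.

```python
from collections import Counter

def level_volume(orders_by_day):
    """Step 1: spread the period's total volume evenly across its days."""
    total, days = sum(orders_by_day), len(orders_by_day)
    base, extra = divmod(total, days)
    # Any remainder goes to the earliest days, one unit each.
    return [base + (1 if day < extra else 0) for day in range(days)]

def level_mix(daily_demand):
    """Step 2: sequence one day's mix so each product is spread through the day."""
    total = sum(daily_demand.values())
    built = Counter()
    sequence = []
    for slot in range(1, total + 1):
        # Pick the product furthest behind its ideal cumulative output.
        behind = lambda p: slot * daily_demand[p] / total - built[p]
        product = max(daily_demand, key=behind)
        built[product] += 1
        sequence.append(product)
    return sequence

# Hypothetical numbers: a lumpy week of orders, and one day's product mix.
week = level_volume([12, 3, 9, 0, 16])
day = level_mix({"A": 4, "B": 2, "C": 2})
print(week)
print("".join(day))
```

With these made-up numbers the week levels out to eight units a day, and the day's sequence comes out as ABCAABCA rather than AAAABBCC.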

Production leveling, however, is difficult, and the management has to have the fortitude to do it. Honestly, most don’t. They don’t like to deliberately set the necessary inventory and backlog buffers into place. So I’d like to explore some of the consequences of not doing it and then ask if these costs are worth it.

Consider this analogy.

Take a look at what modifications are necessary for a vehicle to traverse a rough, irregular road. The suspension must be beefed up, more power is required, the drive train is far more complex for 4×4. The road itself is unmarked, so the driver does not know where he is or where he is going without sophisticated navigation equipment.

Most of this additional hardware is actually unnecessary most of the time. But it might be needed, so it has to be there… just in case.

The driver of this vehicle is primarily concerned with what is right in front of the vehicle, and far less concerned about what is a mile (or even a few dozen meters) up the road. He will deal with that problem when he comes to it. Driving in this environment requires skill, training, experience, and continuous vigilance. People do this for recreation just for these reasons.

Now smooth out the road. Straighten out the hard curves. Give it some pavement. Put in signs so it is possible to navigate as you go. The same speed can be maintained by a vehicle which is lighter, has less power, a simpler drive system, a simpler suspension. Even though the engine is smaller, it is more efficient because it can run at a constant power output; the sudden accelerations are not necessary. Everything is easier.

The vehicle is much less expensive and much more efficient. The driver’s task is far simpler.

When you allow outside-induced variation to work its way through your system, you are putting potholes in the road. You are introducing sudden turns, sudden changes. Sometimes you are washing out entire bridges. People must be more and more vigilant and they simply miss more things. Their mental planning horizon shrinks to what they are working on right now, and maybe the next job. They certainly aren’t checking what they need tomorrow. They will worry about that in the morning.

The Effect on Materials

Even in the worst managed operations, people generally want to be able to provide what they are supposed to. They are motivated to be “good suppliers.” They also intrinsically understand that if they are idle (not producing) this is not good. Even if they are not provided with the tools and resources to do so, they will do the best they can to succeed – even if those things hurt the overall organization.

(I should note that most “management by measurement” systems actually encourage people to do things that hurt the overall organization, but that is another article.)

When these well meaning people encounter problems, they try to mitigate the effects of those problems with the resources they have available.

  • If their upstream suppliers do not deliver reliably they will add inventory so they have what they need.
  • Likewise, if their upstream suppliers do not deliver good quality, they will add some more inventory to make sure they have enough good material.
  • If there is quality fallout within their own process, they will add inventory and up production to cover that. By the way, that increase of production also increases the demand on the upstream suppliers, sometimes in unpredictable ways.
  • If their customers have irregular demand patterns, they will add inventory so the customers can have what they need, when they need it.
  • If there is batch transportation either upstream or downstream from them, they have to accumulate inventory for shipping.
  • If they are on a different shift schedule from either their customer or supplier, inventory accumulates to accommodate the mismatch.

Do you notice a theme here? The key point is: Without the system level view from their leadership, and without the problem solving support, all they can do is add inventory to cope.

Without leveling, any variation in demand will propagate upstream. At each step, two things happen:

  1. Processes that accumulate and batch orders progressively add to the amplitude of the variation.
  2. Irregularities within each process are added to the variation that comes from the customer.
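
A rough sketch of the first of those two effects, in Python. The demand figures, the batch sizes, and the `batch_orders` helper are invented for illustration. The second effect, irregularities inside each process, is not modeled here at all.

```python
from statistics import pstdev

def batch_orders(demand, batch_size):
    """Accumulate incoming demand; pass it upstream only in full batches."""
    pending, released = 0, []
    for qty in demand:
        pending += qty
        batches = pending // batch_size
        released.append(batches * batch_size)
        pending -= batches * batch_size
    return released

# Hypothetical end-customer demand: close to 10 a day, with mild variation.
customer = [10, 12, 9, 11, 10, 8, 12, 10, 9, 11, 10, 8]
stage1 = batch_orders(customer, 25)   # first process orders in lots of 25
stage2 = batch_orders(stage1, 100)    # its supplier orders in lots of 100

for name, orders in [("customer", customer), ("stage1", stage1), ("stage2", stage2)]:
    print(f"{name:9s} spread={pstdev(orders):6.2f} orders={orders}")
```

The total quantity ordered is the same at every level. Only the shape changes: the farthest-upstream supplier sees nine days of nothing and then a single order for 100.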

By the time this hits your supply base, it is a tsunami. First the beach goes dry as it looks like the order base has dried up. (This is why you need to constantly reassure your suppliers with a forecast – because they can’t see regularity in your orders.) Then all of that water comes rushing in at once – and your suppliers can’t cope. Worse, they may have allocated the capacity elsewhere because they were tired of waiting on you. Lead times go up, things get ugly.

But even internally, all of this self-protection just adds more and more noise to the system.
So they add more and more inventory.

For a management team that is reluctant to deliberately add some inventory or backlog buffer to contain sources of variation and protect the rest of the system, here is some news: Your people are already doing it, and in aggregate, they are adding FAR more inventory than would be needed with a systematic approach. They can only see the local problems, and each is just trying to be reliable – even if their efforts work in the opposite direction and actually introduce more variation into the system.

The Effect on People

I live in the Pacific Northwest of the USA. A fact of life living here is that, occasionally, the earth literally moves under our feet. I can tell you from experience that this is psychologically unsettling.

In our factories we do the same thing to people when the schedule changes every day. In the name of flexibility we shift requirements up and down. Add to that chasing shortages and hot list jobs around, and the daily work place is chaos. People are not sure if they are succeeding. Or, at best, they declare victory because they were not buried today.

Daily kaizen? That is just not going to happen in this environment. When you start talking about introducing flow, you threaten the self-protecting inventory buffers, and I can assure you that you will have a fight on your hands. Why? Because your people believe they need these buffers to get the job done – the job you want them to do. Now you are taking that away? Are you insane?

This is why it is critical to establish a basic takt as early as possible, then immediately start aligning the expectation to just meeting that takt.

Anything that keeps people from meeting takt becomes a problem and must be addressed. This is jidoka. Heijunka is a block in the foundation of the TPS “house” for a reason. Unless people are standing on solid ground, they can’t even consider anything like “just in time” or “stop and respond to problems” because they are spending all of their mental bandwidth just trying to figure out what is going on hour by hour.

Conclusion

When I was a military officer, we were trained in tactics designed to present our opponent with a constantly changing picture of what was happening. We wanted to inject as much confusion and uncertainty as possible. The mechanics of defeat on the battlefield are simple: The force subjected to this first shifts from action to reaction. They lose initiative, and therefore lose psychological control. Next the horizontal control linkages start breaking down. Each sub-unit starts to feel isolated from the others. They feel less a team and more on their own. Then, as more and more of their attention is shifted to self-preservation, the vertical chain of command breaks down. Each sub-unit is now mentally isolated and can be defeated in detail.

Ironically many factories are managed such that the workers on the shop floor are subjected to exactly these same conditions – and we wonder why they have a cynical view. We are defeating ourselves.

Chatter as Signal

As I promised, I am going to continue to over-play the afternoon my team spent with Steven Spear.

In his forthcoming book “Chasing the Rabbit” (to be published in the fall), he profiles what is different about those companies which seem to easily be increasing their lead against competitors when there is no apparent external advantage.

One of the core concepts he discussed was the nature of complexity in organizations, processes and products. It is the way this complexity is managed and handled that distinguishes the leaders from the pack of competitors that are fighting and jostling for second place.

In a complex system, there are invariably things people miss. Something is not defined, is ambiguous, or just plain wrong. These little things cause imperfection in the way people do things. They encounter these unexpected issues, and have to resolve them to get the job done.

This is “chatter” in Spear’s words. The sound made when imperfect parts try to mesh together.

Most organizations accept that they cannot possibly think of everything, that some degree of chatter is going to occur, and that people on the spot are paid to deal with it. That is, after all, their job. And the ones that are good at dealing with it are usually the ones who are spotlighted as the star performers.

The underlying assumptions here are:

  • Our processes and systems are complex.
  • We can’t possibly think of and plan for everything that might go wrong.
  • It is not realistic to expect perfection.
  • “Chatter is noise” and an inevitable part of the way things are in our business.

On the other hand, the organizations that are pulling further and further ahead take a different view.
Their underlying assumptions start out the same, then take a significant turn.

  • Our processes and systems are complex.
  • We can’t possibly think of and plan for everything that might go wrong.
  • But we believe perfection is possible.
  • “Chatter is signal” and it tells us where we need to address something we missed.

We have all heard about Toyota’s jidoka and andon processes, so let me bring out another example, again, that was used by Spear.

The U.S. Navy has been operating nuclear reactors with a 100% (reactor) safety record for nearly (over?) 50 years. And they operate a lot of nuclear reactors. When they started, they were in totally new and unfamiliar territory – they were doing things that had never been done before. In fact, no one was even sure if it was possible.

They asked the question: How should this nuclear reactor be operated? They answered it with a set of incredibly specific procedures which everyone was expected to follow – exactly, without deviation in any way. These procedures represent the body of experience and knowledge of the U.S. Navy for operating nuclear reactors.

Here is the key point: ANYONE who departs from the procedure, in any way, no matter how trivial or minor, must report “an incident” which rockets up the chain of command. The reasons for the departure are understood. If there was something outside the scope of the procedure, the new procedure covers it. If something was unclear, it is clarified.

This may not be the Toyota Production System at work, but it is a version of something that makes it work: Jidoka.

If the process is not working, cannot work, or conditions are not exactly as specified for the process to succeed, then STOP the process, understand the condition, correct it, restore the system to safe, quality operation, and address the reason it was necessary to do this.

Chatter is signal.

So – at a Toyota assembly line in Japan some years ago, I observed a Team Member drop a bolt. He pulled the andon cord and signaled a problem.

More about Overburden (Muri) in Health Care

The last post got way too long, and I wanted to get it out there. But of course, there are afterthoughts.

At a level higher than simple process chaos, overburden hits the entire organization when perceived demand is significantly greater than perceived capacity.

As I noted in the earlier post, segregating what should be routine from the true exceptions goes a long way, especially when there is work to continuously improve execution of routine things. This results in less capacity being used to process routine, and therefore, more capacity available to handle the true emergent stuff.

The next phase is to repeat the process, step by step, on the exceptions. Identify what makes them exceptions. Is there another process that can be isolated and segregated? Can you move something from “exception” to “routine” in some way?

Then look at what is left.

About 20 years ago, Philip Agre wrote a seminal PhD dissertation at M.I.T. called “The Dynamic Structure of Everyday Life.” If you can find it, read it. This work was a major contributor to turning the science of symbolic artificial intelligence on its head. One of his conclusions was that almost everything we do is routine, and we do non-routine things in routine ways.

This thinking applies to complex, one-of-a-kind process situations. What “experience” brings to the table is knowing which of the things we already know how to do routinely must be done, and in what order, to gain control of the uncontrolled and get the desired outcome.

In our heads, this is much messier than we want to believe it is. Fundamentally what we do is to try something we believe will have a certain effect, then see what effect it actually has. If the effect is the one we predicted, then we are one step closer to control and the stage is set for the next action; if not then we learn what did not work, gain a bit more understanding and try something else.

This is also how we build that thing called “experience” step by step, stretching our understanding, moving what we do not know into what we do. We do this as individuals, but it is only a truly exceptional organization that can do it as an institution. Learning is a process of prediction, testing and comparison.

The objective in these situations is to move an unknown, uncontrolled situation gradually toward familiar ground and make it into something routine.

Steven Spear quoted a health care worker that summed it up pretty well: “Air goes in and out, blood goes round and round. If either of those is not happening, we have a problem.” And in the most extreme medical emergency, the first steps are always to stabilize vital signs so that the patient will live long enough for the caregivers to understand the problem and develop countermeasures.

This is still, however, a customized sequence of tasks that should, themselves, be routine. Only the macro level varies. The more that can be done to stabilize the delivery of treatment to the patient, the less harried people will feel. They should not worry about the small things so they can pay attention to the big things.

The weak points in a complex system are the interconnections. People are not sure who should do, or has done, what. There are repeated transfers from one caregiver to another, often with far less than complete information – leaving it to the next caregiver to assess the situation all over again. Every time this happens presents an opportunity to overlook or misinterpret something that is already known.

By working very hard on execution of the things that should be routine, that much more mental capacity is made available to care for the patients. This means attacking ambiguity wherever it is found.

Jim Collins: “Good to Great” Website

Jim Collins’ book “Good to Great” has been a best-selling business book for several years. But I am not so sure everyone knows about Jim Collins’ web site. It has online mini-lectures, and much more material that reinforces the concepts outlined in the book.

As for how the concepts in the book relate to “lean thinking” – I believe they are 100% congruent. Examining Toyota in the context of the model outlined in the book shows everything Collins calls out as the crucial factors that separate sustainable improvement from the flash-in-the-pan unsustainable variety.

The only difference I can see between Toyota and the companies that were profiled is that Toyota has had these ingredients pretty much from the beginning, and Collins’ research was looking at companies that acquired them well into their existence.

Be A Perfect Supplier; Be A Perfect Customer

Operations that work to the “push” are well known for complex and interdependent problems. What looks like a problem in one area often has causes, or parts of causes, in other areas. Quality problems, delivery problems (late, too much, too little, wrong stuff), sub-optimizing attempts to reduce local cost… all of these things propagate unchecked through the plant. To fix one area means having to fix almost all of the others at once. This initial improvement gridlock is pretty common.

When you start talking about implementing JIT in an environment like that, the pushback is visceral and, to be honest, legitimate. The only reason they get anything done is because the system runs to sloppy tolerances and doesn’t expect much. JIT demands a degree of mutual vulnerability, at least it seems that way when it is first presented.

The other really big psychological issue is that lean is often presented as a solution to all of these problems. Quite correctly, the survivalist shop floor supervisors don’t see that. And they are right. The problems do not go away when you implement flow. I sometimes find it surprising how many people don’t get that. All they see is fewer problems in operations that have flow, and they mix up cause and effect. Good flow is the result of solving the problems. Not the other way round, but I digress.

If you are dealing with this problem gridlock, where do you start? The first step is to contain the problems as close as possible to their sources.

The objective is to apply whatever temporary countermeasures are necessary to appear as a perfect supplier to your downstream customers, and to appear as a perfect customer to your upstream suppliers.

So what is a “perfect supplier?” That is probably the easier of the two logical questions to answer. A perfect supplier is capable of supplying what you need; when you need it; with perfect quality; one-by-one; at takt.

What is a “perfect customer?” This one is a little harder, but it is good to look back at what makes a perfect supplier. Ask yourself – what things does the customer do that makes it difficult to be a good supplier? What does a bad customer look like?

  • They order or demand things in batches.
  • They give no advance notice about what they need.
  • Their demand is unpredictable and inconsistent.

A lot of this seeming unpredictability actually originates in the supplying process. I recall a case where the manager of a fabrication shop swore that his customer’s demands were totally random. At the assembly plants, though, they operated to takt with a steady mixed-model schedule. There was very little change from one day to the next. Why the big disconnect? The fabrication shop ran things in big batches, and set up big batch pull signals. Naturally those big batch pull signals would go a long time between trigger points, so they would seem to come back at arbitrary times, for huge amounts. Self-inflicted gunshot wound. Once they took the simple step of shipping things in smaller containers, a lot of that seeming instability went away. Smaller containers meant more frequent releases of pull orders, which gave them a cleaner picture of the demand. Think of it this way: The smaller the pixels on your screen, the more resolution you have in the image.
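
To make the container-size effect concrete, here is a small Python sketch. None of these numbers come from the shop in the story. The part names, hourly usage rates, container sizes, and the `pull_signals` function are hypothetical. The assembly side consumes at a perfectly steady rate, and the only thing that changes between the two runs is the size of the container that triggers a pull signal.

```python
def pull_signals(hourly_use, container_size, hours):
    """Quantity signaled to the supplier each hour, when a signal is
    released only after a full container has been emptied."""
    used = {part: 0 for part in hourly_use}
    released = []
    for _ in range(hours):
        qty = 0
        for part, rate in hourly_use.items():
            used[part] += rate
            containers = used[part] // container_size
            used[part] -= containers * container_size
            qty += containers * container_size
        released.append(qty)
    return released

# Hypothetical steady mixed-model consumption: 13 pieces every hour.
steady_use = {"bracket": 6, "housing": 4, "cover": 3}
big = pull_signals(steady_use, container_size=60, hours=60)
small = pull_signals(steady_use, container_size=6, hours=60)

print("big:   silent hours =", big.count(0), " largest signal =", max(big))
print("small: silent hours =", small.count(0), " largest signal =", max(small))
```

Both runs signal the same total quantity over the 60 hours. With the big containers the supplier hears nothing for most hours and then gets hit with 60, 120, or even 180 pieces at once. With the small containers a signal arrives every hour and never exceeds 18 pieces.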

So what does the perfect customer look like? Level, predictable demand at takt with no major fluctuations.

Think of the purpose of heijunka or leveling production. Because customer demand arrives in spikes, batches, lumps, the leveling process is necessary to make that demand appear to be arriving exactly at takt time.

Although books such as Learning To See say there is only one pacemaker process or scheduling point, non-trivial flows frequently require re-establishing the pulse.

This is especially true if orders are batched up either through the ordering process itself or the delivery process. An example of this is a manual kanban process between an assembler and the supplier. Even though there is a paced assembly line and good leveling, kanban cards are collected and delivered to the supplying process in transportation-interval “chunks.” The supplier needs to have their own heijunka board to re-level the demand and pick at takt from the supermarkets. The alternative is that the demand arrives at the production cells in those same batches, and the smooth takt image is lost.
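
As a sketch of what that supplier-side heijunka board does, here is a simple Python model. It assumes the board is nothing more than a first-in, first-out queue of kanban cards with a fixed number of cards picked per pitch. The `relevel` function, the card labels, and the arrival pattern are invented for the example.

```python
from collections import deque

def relevel(arrivals, cards_per_pitch):
    """Queue kanban cards as they arrive in chunks; release a fixed number
    each pitch, oldest first."""
    board = deque()
    released = []
    for chunk in arrivals:
        board.extend(chunk)
        take = min(cards_per_pitch, len(board))
        released.append([board.popleft() for _ in range(take)])
    return released

# Hypothetical: a truck drops off 8 cards every fourth pitch, nothing between.
arrivals = [
    ["A", "A", "B", "A", "C", "A", "B", "A"], [], [], [],
    ["A", "B", "A", "A", "C", "A", "B", "A"], [], [], [],
]
schedule = relevel(arrivals, cards_per_pitch=2)
for pitch, cards in enumerate(schedule, start=1):
    print(pitch, cards)
```

The cards arrive eight at a time, but the production cell sees two cards every pitch, in the order the assembly line consumed them.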

In a Previous Company we were working a project to establish pull on a trans-continental value stream that had five major operations, all in different geographic locations. To use the word “monument” does not even begin to describe the capital infrastructure involved, and there were a lot of these assets shared with other value streams, so relocating and directly connecting flows was out of the question. There were unreliable processes, big batches, transportation batches, end-using customers’ orders in huge, sudden surges based on their surge-based business cycle. Step by step we isolated inventory buffers and ended up putting in heijunka to re-level the demand at nearly every stage of the process. It was big, ugly, cumbersome, but it worked to isolate problems within a process vs. pass them up and down stream.

The objective was simple: Use inventory buffers and heijunka to make each process in the chain appear as a perfect customer to its suppliers – always pulling exactly at takt. The consuming process “owned” the inventory buffers necessary to do this. Reason: Simple. The problems that cause them to be a less-than-perfect customer are theirs, so they own the inventory that is necessary to protect their suppliers from those problems. Likewise, that process owned whatever inventory was necessary for them to appear to be a perfect supplier. They had to enable their customer to pull one-by-one, exactly at takt, from them, even if their problems kept them from producing that way.

Never mind that the downstream process didn’t actually consume at takt. THEIR inventory buffer translated their spiky signal into one which reflected the takt time.

All of this was very sophisticated and complicated, but in the long haul it worked. Megabucks of inventory came out of the system. Megabucks remained, but we knew exactly why it was there, and who had to solve what problems to reduce it.

If you can’t be a perfect customer, create the illusion that you are.

If you can’t be a perfect supplier, create the illusion that you are.

Then you own the problems yourself, you own the inventory-consequences of having those problems, and you control your own destiny.