Just a Few Seconds

What is a few seconds of delay? Why is it such a big deal?

Consider this example.

While touring the Pilsner Urquell brewery in (surprise!) Pilsen, Czech Republic, we saw a lot of really good information boards, general organization, and a clear management commitment to continuous improvement.

Their packaging plant produces 60,000 bottles of beer an hour. Even though they produce “bottles of beer,” this is more of a process industry than a discrete product industry. Most of the operations are highly automated with human supervision, rather than human operated. So do the principles of “lean” apply?

Partly, it depends on your definition of “lean.” If you subscribe to the most common partial definition, where “lean” is a set of tools, then you would probably struggle to find relevance in this context. On the other hand, if you look at it as a structure for the work, the workplace, and the organization to drive out and solve problems – then you are in comfortable territory here.

While those machines are largely automated, they do occasionally have problems.

Those problems can be seen as a disruption or slowing of production. But in a large complex operation, many times these things are subtle and hard to pick up.

If you are running a process industry, consider these questions.

What is your nominal, expected rate of throughput at each and every critical juncture in the plant?

How do you know you are achieving that rate?

What is the threshold of slowing or disruption that will get your attention?

Remember, we want to look at “chatter as signal” here. While failure is a common condition, it is not a condition we ignore.

[Photo: conveyer]

This conveyer is coming out of the carton machine. Bottles go in. Flat cartons go in. Glue pellets go in. Cartons of beer come out, at a pretty good clip.

While we were there, though, a carton was mangled. As it got just downstream of this spot, the line was stopped. I don’t know if it was an automatic or a manual stop – I would hope it was automatic.

A couple of team members pulled the carton out, and re-started the line.

The line was stopped for 32 seconds.

Count how many cartons of beer go by in 32 seconds. Subtract that from the day’s production. It isn’t one mangled carton, it is almost 60 cartons of beer that will not be made today.

Sitting next to our mangled carton was another one, presumably from earlier in the shift.

120 cartons.
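To put rough numbers on this, here is a minimal sketch of the arithmetic. It uses only the plant's stated rate of 60,000 bottles an hour and the 32-second stop. The function name is mine, and treating the earlier mangled carton as a second stop of the same length is an assumption (the same one behind the 120-carton figure). The number of bottles in a carton is not stated here, so the sketch counts bottles rather than cartons.

```python
def units_lost(rate_per_hour: float, stop_seconds: float) -> float:
    """Units that never get made while the line is stopped."""
    return rate_per_hour / 3600.0 * stop_seconds


BOTTLES_PER_HOUR = 60_000   # packaging plant rate quoted above
STOP_SECONDS = 32           # the stop we timed

one_stop = units_lost(BOTTLES_PER_HOUR, STOP_SECONDS)

# Assumption: the earlier mangled carton cost a stop of similar length.
two_stops = 2 * one_stop

print(f"One stop:  about {one_stop:.0f} bottles")
print(f"Two stops: about {two_stops:.0f} bottles")
```

Swap in your own rate and your own stop times. The point is not the precision of the number, it is that a stop too short to notice still has a size you can state.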

“Stop and respond to every deviation.”

Why did the line stop? Because the carton was mangled. Good call.

Why was the carton mangled?

Again – this is an operation bordering on world class, but I don’t know what goes on behind the scenes. They could be doing everything I mention here. So please look at this as an opportunity for a hypothetical example.

This is the kind of problem that executives often decide is not worth solving. It is, in the grand scheme of profit and loss, floor sweepings.

Hopefully that is not the case at Pilsner Urquell. I honestly don’t know. But what I do know is I can estimate that a couple of pallets of beer never get made each day as long as this problem persists. If a thief stole two pallets of beer from the warehouse every day, security would be all over it.

In a “lean” operation, though, we pay attention to these things. Just a few seconds matter. Why? Because those few seconds are what stand between the current condition and a target of more production with no more capital equipment.

In a process industry, that equipment costs big, big bucks. So now we are talking, not about a case of beer, not even about a little problem, but rather the fact that if there is a systematic approach to dealing with these little problems we might be saving many millions in future investments.

How reliable and consistent is the equipment and machinery in your operation?

Do you carry out regular maintenance checks? How do you know? Is there a way to verify that those checks were actually made – or at a minimum, verify that someone had to go to the place where they should be done? How do you know? It is one thing to have the pretty pictures in notebooks or on a board. It is another to have a physical check of some kind.

If you can verify those checks are being made, great. That is standard work. You have created a working hypothesis:

“If we carry out these checks, and this routine maintenance, at these times, in this way, then we will never be surprised by a stoppage.”

Of course you will be wrong. Unexpected slowdowns and stoppages are going to happen. But in our lean world, chatter is signal. Being wrong tells us something we didn’t know when we created those checks. That unexpected failure or breakdown had some kind of precursor that we might have prevented, and certainly should have seen. So we set up the process to immediately detect a slowdown or stoppage and let us know. We verify that the checks have been made, and then look at what we must change about them to cover the new insight we have.

Maybe this time we are only saving a few seconds. But it is really impossible to measure the effects of problems that do not occur.

If you are overrun by these problems, deal with the ones you can, maybe just a few every day, but deal with them in the right way, using thorough problem solving to the root cause. Be able to answer the question “What did we learn about our process here?” with something that you didn’t know previously.

Just a few seconds, but just as you save sixty seconds to save a minute, those seconds add up. At the very least, know what is happening out there. Go and see.

Overburdened with Andon Calls

Bryan Zeigler has a great post on his “Lean is Good” blog site. In “Andon Calls and Muri,” he describes Toyota’s phenomenal capacity for responding to problems, and then takes us back to where the rest of us are with some really great questions:

If it is physically impossible to answer every andon call in order to work on every problem, is it best to fix the first one that comes sequentially?  Then do work arounds and rework until we can respond to another one?

I have always used systems to prioritize what problems we work on whether it be pareto charts, value stream maps, or just plain standing in the circle. Once directions, or as Toyota Kata describes them, process target conditions, are established and the highest priority items are “fixed” and then we move on to the next most important challenge.

Working on all problems in the process would overburden the organization’s problem solvers.  This would be a form of dreaded muri right?  I’ve read and heard much about the Toyota staffing levels required to operate TPS effectively.  Most range from 5 to 7 employees under each level of leadership position.  Again, my experiences are more like 30 to 40 employees under a 1st line leader.

Two questions:

  1. What percentage of daily problems are organizations that you work in staffed to handle?
  2. What philosophies do you utilize to ensure you don’t introduce Muri to the problem solving teammates in the organization?

Great observation, and great questions. And Bryan is certainly not the only one who has had the insight to ask them.

At this point, I have to issue a bit of a disclaimer. I have spent a full day on the shop floor in Bryan’s workplace. I can visualize exactly what he is talking about. Unfortunately schedules didn’t allow me to meet him, but I have a pretty good sense of the situation he is dealing with.

Since pretty much everyone has these issues when they start to contemplate andon calls, let’s start by reviewing the theoretical base, then move into reality.

The core principle of jidoka is “Stop and respond to every abnormality.” That is what we are trying to do here. This means there must be a clear definition of what is “normal” and what is “abnormal.”

In the strictest sense, if it isn’t clearly defined as “normal” it is ambiguous.

In a system where the most basic fundamental is to define processes, ambiguity is a problem as well. Put another way, “ambiguity” is, by definition, “abnormal.”

So, in effect, we are asking the team member to let us know when anything is not clearly consistent with the defined norms.

The first response to an andon call is to clear the problem. If the “problem” is lack of clarity then deal with that. Replace uncertainty with clarity. Set some kind of hard criteria for what does, and does not, need your attention. This means taking management responsibility for the fact that the problem exists, and that you aren’t going to do anything about it right now.

The next step is critical: If you can’t solve it right now, contain it.

“Containing the problem” means establishing a temporary standard of some kind: some action that allows the team member to resume safe, defect-free work. You might have introduced some inefficiency into the process, but safety and quality take priority.

And here is the dangerous part. You have the problem more or less under control. You can easily walk away and move on to the next one.

But consider this: You have the tiger in a cage. You are in the cage with it. You have to keep feeding the tiger (with time, resources) so it doesn’t eat your process. The only way to make the tiger go away is to get to the root cause.

This temporary standard does, however, give you a measure of stability. You can organize your problem solving efforts and focus on the ones that are the most critical to you. Meanwhile, however, you are burning resources feeding all of those tigers.

Typically a temporary countermeasure (problem containment) is some adjustment to the process. You have set a new standard work sequence that includes the steps required to keep this problem from affecting customers or escaping downline.

Yes, it is a work-around. But it is one you developed deliberately for a specific reason, until you can get to clearing that issue for good.

As you continue to identify problems and at least get them contained in some way, continue to refine the things you want to call attention to.

First, be explicitly clear on what things must trigger an andon call. These are the things you really want to know about when they happen. For sure it should be any safety issue and any issue that threatens quality. It could also be an issue you are currently focused on resolving, such as late parts delivery, an upstream quality issue, or a piece of unreliable equipment.

Then establish the time trigger. To do this you need to have three things pretty clear in your mind.

  • A good idea of how long the process is supposed to take.
  • A method for the team member to know when he is behind, and how much.
  • A standard for how much delay you are willing to tolerate. Put another way, how long are you willing to let the team member get behind before he tells someone? My suggestion here is no more time than you can help him catch up. If he gets further behind than that, you are going to pass the problem to another part of the process downstream in the form of a late delivery.

Now you have some simple rules.

Please try to perform the standard work so we can see any problems with it.

  • If any unsafe condition exists, stop and pull the andon. Wait until we can clear the hazard.
  • Do not knowingly ship bad quality to the next process. Pull the andon so we can come, assess, and decide how we are going to deal with that.
  • If you have this problem, this problem, or this problem, stop and pull the andon so we can come and clear the problem as well as understand it as soon as it happens.
  • If you accumulate any delays longer than xx minutes, pull the andon.
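Rules this simple can be written down as a short decision function, which is a reasonable test of whether they are really unambiguous. The sketch below is only an illustration: the class, field, and function names are hypothetical, and the delay threshold is deliberately left as a parameter because it is yours to set.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StationStatus:
    """What the team member can observe at the station (hypothetical fields)."""
    unsafe_condition: bool = False
    quality_defect: bool = False
    watched_problem: bool = False   # one of the specific problems you asked to hear about
    delay_minutes: float = 0.0      # accumulated delay against standard work


def andon_reason(status: StationStatus, max_delay_minutes: float) -> Optional[str]:
    """Return why the andon should be pulled, or None if work continues."""
    if status.unsafe_condition:
        return "unsafe condition"
    if status.quality_defect:
        return "quality defect"
    if status.watched_problem:
        return "watched problem"
    if status.delay_minutes > max_delay_minutes:
        return "accumulated delay"
    return None


# The 5.0 here is a placeholder, not a recommendation.
print(andon_reason(StationStatus(delay_minutes=6.0), max_delay_minutes=5.0))
```

Notice that every branch is a yes or no answer. If you cannot fill in one of these fields without a judgment call on the floor, that rule is not yet clear enough.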

This puts you in control. You get to decide how much excess capacity (how many extra people) you carry to absorb delays. You get to decide what problems trigger a call. You get to decide what you can handle.

All I ask is:

  • Do not tolerate unsafe conditions. Always stop the process and always initiate a call.
  • Do not tolerate a process that routinely passes bad quality downstream. Always initiate a call. Don’t put the team member in a position where he has to judge what “good enough” is. Have a hard standard and stick to it.
  • Always thank the team member for bringing problems to your attention. Never discourage an andon call.
  • Never allow an andon call to go unanswered. Set a response time standard, measure it, and apply the same problem solving principles to that.

The other thing I would suggest is a system to manage problem solving. There are some suggestions in this post on morning markets.

The key point is that any problem you decide not to work on has to have some kind of temporary countermeasure incorporated into your expectations. If you do something that adds time, you must allow time for it to be done. Doing otherwise is introducing overburden – or to Bryan’s point, shifting the overburden from your problem solving back to the team member.

If you pay attention to what is really happening, and take management responsibility for all of the problems that the team members encounter, then (and only then) can you set rules about which ones you will, and will not, work on right now.

The hardest part of all of this? It is the “taking management responsibility” part. Getting an effective andon call process into place requires as much (actually more) process discipline in the leader’s ranks as it does on the shop floor. This is discipline not to panic, not to wish problems away, and to respond as though the team member is doing you a favor for calling out a problem vs. causing it.

An andon call process is a vital step toward truly engaging the team members. And it begins the shift from intermittent improvement to continuous improvement.

Simple and Easy Processes

In the last post I commented on Ron Popeil’s product development approach – to make the product easy to demonstrate drives making it easy to use, which creates more value for the customer.

Let’s take the same thinking back to your internal customers.

What if, rather than just writing a procedure, you had to go and demonstrate it to the people who had to follow it? What if you had to demonstrate it well enough that they saw the benefit of doing it that way, and could demonstrate it back to you to confirm that they understood it? If you broke down the work and organized it to be easy to demonstrate and teach, would it look any different? (Hmmm. TWI Job Instruction actually sounds a lot like this.) Would you still ask “Why didn’t they just follow the procedure?”

Look at the information displays and the controls on your equipment. Do they provide total transparency that things are working? Or do they abstract and obscure reality in some way? Can your internal customer be sure things are going as expected?

Do controls give clear feedback that they are being set correctly? Are sequences of operations readily apparent?

How many “blinking 12:00” situations do you have out there on your shop floor – things that have been put into place, but nobody uses because nobody can really figure it out?

Come back to the design of the product itself. Is the manufacturing and assembly process apparent, obvious, and as simple as you can make it? Would it be designed differently if you had to demonstrate how to fabricate and assemble it?

How about your administrative processes? I recall, many years ago, a “process documentation process” being taught. In the class they were using “baking cookies” as a demonstration example. Yet the instructors, who presumably were experts, actually struggled trying to show how this works. This “process” was far less clear than they had thought it was when they had simply thought through it. “It did not work on TV.”

Look at your computer programs and their user interfaces. What makes sense to a programmer rarely makes sense in actual use. Watch over someone’s shoulder for a while. Could you easily demonstrate this process to someone else?

Ron Popeil cooks real chickens and real ribs in the production of his infomercials. He does not use contrived or carefully limited demonstration examples. As you look at your examples and exercises, how well do they stand up to the real world application? Can you go out to the shop floor and demonstrate your “product” in actual use?

This post is full of questions, not answers. I don’t have the answers. Only you (can) know how well your processes are engineered.

Design your production system (for product or service) as carefully as you would design the product or service itself.

Problems Hidden In The Open

We were down on the shop floor watching an assembly operation. The takt time was on the order of three hours. The assembler was new to the task, and the team leader periodically came by and asked if he was “doing OK.” The reply was always in the affirmative.

As the takt time wound down to under five minutes to completion, this operation was the only one not reporting “Done.”

The count down hit zero, things went red, the main line stopped, and the line stop time started ticking up.

The team leader, other assemblers, and the supervisor began pitching in to assist. Between them, the job was completed in about 10 minutes, and the line restarted.

So, again, my favorite question:

What’s the problem?

Let’s try breaking it down to four key questions.

  1. “What should be happening?”
  2. “What is actually happening?”
     (The above two questions define the gap.)
  3. “Why does the gap exist?”
  4. “What are we doing about it?”

These questions simply re-frame PDCA, but without so much abstraction.

So, in this situation:

What should be happening?
Two things come to mind immediately.

  1. The work should be complete on time.
  2. As soon as you know it isn’t going to be complete on time, please tell someone so we can get you help.

For this to work, though, the team member needs a clear and unambiguous way to answer a key question of his own: Am I on track to finish on time? Ideally the answer to this question is a clear “Yes” or a clear “No,” with no ambiguity or judgment involved. (Like any “Check” it should produce a binary result.)

On an automobile line with a takt time on the order of 55 seconds, the assembler can get a good sense of this. If he loses more than three or four seconds, he isn’t going to make it. But “a good sense” isn’t good enough.

Even in this fast-moving situation, you will see visual indicators that help the team member answer this question. Take a look at this photo.

[Photo: Toyota assembly line]

See the white hash marks along the line at the bottom of the picture? Those mark off the moving line work zone into ten increments of about 5 ½ seconds. The assembler knows where he should be as he performs each task. If he is a hash mark behind, he isn’t going to finish on time. Pull the andon. We can safely say that, in this example, we have accomplished (1) and (2) above.
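The logic behind the hash marks is a binary check, and it can be sketched in a few lines. This assumes the 55 second takt time mentioned above split into ten equal zones. The function names, and the idea of counting progress as "marks' worth of work completed," are my own illustration and not a description of how any real line implements it.

```python
import math

TAKT_SECONDS = 55.0   # takt time from the automobile line example
ZONES = 10            # ten hash-mark increments in the work zone


def marks_behind(elapsed_seconds: float, marks_of_work_done: int) -> int:
    """How many hash marks the work lags behind where the line has moved."""
    seconds_per_mark = TAKT_SECONDS / ZONES          # 5.5 seconds
    expected = min(ZONES, math.floor(elapsed_seconds / seconds_per_mark))
    return max(0, expected - marks_of_work_done)


def should_pull_andon(elapsed_seconds: float, marks_of_work_done: int) -> bool:
    """One full hash mark behind means the job will not finish on time."""
    return marks_behind(elapsed_seconds, marks_of_work_done) >= 1


print(should_pull_andon(30.0, 5))   # on pace
print(should_pull_andon(30.0, 4))   # a mark behind
```

There is no "I think I'm OK" in this check. The answer is yes or no, which is exactly what the floor markings give the assembler without any computer at all.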

With longer takt times, it is much tougher for a human to have a good sense of how much time will be required to complete the remaining work. That makes it that much more critical that some kind of intermediate milestones are clearly established and linked to time.

What would be a reasonable increment for these checks? Put another way: how far behind are you willing to let your worker get before someone else finds out? I’d say a good starting point is this: at the point when he can’t recover the time himself, the problem is no longer his. Following the standard work is the responsibility of the team member. Recovering to takt time is the team leader’s domain. At the very least, he is the one who pitches in and helps, or gets someone else to do so. But he can’t do this if he doesn’t know there is a problem.

So – what should be happening?

The team member must have continuous positive confirmation that he is on track to complete the work on time. With the failure of that positive confirmation, he should pull the andon and get assistance.

The team member must call for assistance (“pull the andon”) if his work falls behind the expected progress for any reason whatsoever.

What is actually happening?

In our example, the team member didn’t get help until it was too late. In fact, he verbally assured the team leader he was “OK” on a couple of occasions. The line stop was irrefutable evidence of a problem. That was a good thing. This company has a takt time, and runs to it. Think of what would have happened if they didn’t. It might take hours, or days, before this problem surfaced. (We are nowhere near the root cause yet. The line stop is just evidence of a problem, not the problem itself.)

Why does the gap exist?

It is a hell of a lot harder to answer this question than the other two. In this case, you are going to have to peel back a lot of layers before you get to the actual, systemic, root cause. But in the immediate sense, with a takt time bordering on three hours, there is really no realistic way a worker can judge if he has fallen too far behind to catch up. The fact that, in this case, the assembler was still learning the job just compounds the situation.

From casual observation – when the team leader visited, he asked if things were OK and accepted the reply – I would start to investigate whether the team leader had a good sense himself of where the work should be at his regular check points… if he has regular check points at all.

But all of this is speculation, because after 10 minutes of watching the initial response to the line stop, our little group had moved on. I am mentioning these things as possibilities because you likely have the same issues in your shop. (And if you don’t have a rigorous sense of takt time, it is equally likely you don’t know about those issues even at the level we saw here. At least THIS company can see the evidence of the problem. That is a credit to their visual controls.)

What are we going to do about it?

Obviously there are a couple of immediate things that can be addressed to at least contain the problem. (That is, convert a hard line stop into multiple andon calls so the actual problems are seen earlier.)

I would want to establish a regular routine for the team leader’s checks. His leader standard work. At regular intervals, he should be checking progress of the work. How often? How far behind do you want the assembly to get before you are certain someone finds out about the problem? In this case, even every 20 minutes is less rigorous than the hash marks on the auto assembly line. But it would be a start.

So we have the team leader coming by every 20 minutes.

But he can’t just ask “How is it going?” We clearly saw that didn’t work. It isn’t that the assembler lied to him, it is that the assembler didn’t know because there was no standard.

What work should be complete 20 minutes into the work cycle? At 40 minutes? At 60? What verifiable facts can the team leader check by observation? There are a lot of ways to do this, most of them very simple and non-intrusive. Think it through.

But wait – now the team leader himself has standard work. What cues him to do it? Is he supposed to notice that 20 minutes has elapsed? In this case, the company already has a pretty sophisticated andon and sound system. It would be a pretty simple matter to put in an audible signal that told the team leader to make his checks. But, again, that is just one solution. I can think of a couple of others. Can you?

What is the team leader checking for? This is a critical question.

Think about it.

What was the original answer to “What should be happening?” (which is “the standard”)

We said:

  1. The work should be complete on time.
  2. As soon as you know it isn’t going to be complete on time, please tell someone so we can get you help.

We want the assembler himself to be checking #1.

So why do we have the team leader check?

So he can verify that the assembler is pulling the andon when he should. This is important because it is human nature not to ask for help until it is too late. This isn’t limited to factory floors. How many cardiac patients die because they ignored the warning symptoms for fear that it isn’t serious enough to get help?

It isn’t enough to ask the team member to call for help. You have to expect it, encourage it and require it.

Interestingly enough, as I was writing this post, John Shook posted his story about converting the culture at NUMMI.

A cornerstone of Respect for People is the conviction that all employees have the right to be successful every time they do their job. Part of doing their job is finding problems and making improvements. If we as management want people to be successful, to find problems, and make improvements, we have the obligation to provide the means to do so.

But, some of our GM colleagues questioned the wisdom of trying to install andon at NUMMI. “You intend to give these workers the right to stop the line?” they asked. Toyota’s answer: “No, we intend to give them the obligation to stop whenever they find a problem.” [emphasis added]

What was the problem in our example? We don’t know yet. We certainly can’t start looking for causes.

But the evidence of a problem was that the team member could not complete the work in the time expected. That is, he was not successful doing the job. And the line stopped because the support system failed to pick up the fact that he was falling behind until it was too late to recover.

It really does come down to respect for people.

If you want to go faster, stop.

Mark’s post on The Whiteboard tells a pretty common story. The good news is that this company has more business than they can handle. Pretty good results in these times. The bad news is that they are having problems ramping up production to meet the demand. In Mark’s words:

I’m working for a company that is very, very busy. They developed a new process that is the first of its kind and have taken the market share away from their competition. But they have not spent enough time making the process robust enough to handle the increased demand and the scrap costs are going out of the roof. Currently about 65K a day. Any suggestions? Our number 1 scrap producer is a machine that can not perform at the same capability as when Engineering did their run off…

At the risk of coming across as flip, the very first thing to do if a machine starts producing scrap material is to shut it down. It is better to make nothing, because that is cheaper than making stuff you can’t use.

However, it goes deeper than that.

Engineering had done a “run off” (which I presume was a test on theoretical speeds). Now actual performance isn’t meeting expectation. This is a problem.

But let’s rewind a bit and talk about how to manage a production ramp-up. Hopefully it is a problem more people will be having as the economy begins to recover.

Although this is in the context of a machine, exactly the same principles apply to any type of production. Only the context and the constraint change.

Presumably there was some speed for this machine where it didn’t produce scrap, or the scrap was minimal. Going back to that time, here is what should have happened.

Promise production at the rate the machine is known to support.

Now crank up the speed a bit and see what happens. In the best case, you are overproducing a bit, but you are learning what the machine is actually capable of doing.

Crank it up a little more. Oops, scrap.

STOP!

Because you have been running a little faster than required, you have bought a little time. Understand why that scrap happened. Dig into the problem solving. Try to replicate the problem under controlled conditions. LEARN.

Hopefully you can find the cause and fix it.

Try it. Run the machine again, at the faster speed. Scrap? Back around to the “problem solving” cycle. Repeat until you can reliably run at the faster speed without scrap.

Then, and only then, promise the higher rate, because now you can reliably deliver it.

Then notch it up a bit until you encounter the next problem.

This cycle of promising only what you can actually deliver protects the customer while you are pushing the envelope internally to discover the next problem.
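For those who think in loops, the ramp-up cycle described above can be sketched as follows. Everything here is hypothetical: the function and parameter names are mine, and the toy machine (where each round of problem solving raises the scrap-free limit by a fixed amount) exists only to make the example run. Real problem solving is not that predictable.

```python
from typing import Callable


def ramp_up(proven_rate: float, step: float, target_rate: float,
            runs_clean: Callable[[float], bool],
            solve_problem: Callable[[float], None],
            max_attempts: int = 5) -> float:
    """Return the highest rate that has been shown to run without scrap.

    Only this returned rate should ever be promised to the customer.
    """
    promised = proven_rate
    while promised < target_rate:
        trial = min(promised + step, target_rate)   # notch it up a bit
        attempts = 0
        while not runs_clean(trial):
            # STOP. Scrap at the trial rate: learn before running again.
            if attempts >= max_attempts:
                return promised                     # keep promising the proven rate
            solve_problem(trial)
            attempts += 1
        promised = trial                            # now it can be promised
    return promised


class ToyMachine:
    """Made-up machine: runs clean up to a limit; each fix raises the limit."""

    def __init__(self, clean_limit: float) -> None:
        self.clean_limit = clean_limit

    def runs_clean(self, rate: float) -> bool:
        return rate <= self.clean_limit

    def solve_problem(self, rate: float) -> None:
        self.clean_limit += 5.0


machine = ToyMachine(clean_limit=100.0)
promised = ramp_up(100.0, 10.0, 130.0, machine.runs_clean, machine.solve_problem)
print(promised)
```

The design choice worth noticing is that `promised` only ever moves after a clean run at the trial rate. If the problem solving stalls, the function hands back the last proven rate rather than the one you were hoping for.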

The alternative? Make a promise knowing you actually have no clue whether or not you can meet it.

But that’s what they did. So now they are burning a lot of money every day making scrap material.

The same principles apply, however. They are already not delivering what they promised. So throttle things back to the point where they can predict the results, and go from there. Pretending they can run faster than they can is not accomplishing anything other than burning money. Deal with facts, no matter how uncomfortable.

If you make a schedule based on what you wish you could do, you will have a schedule you wish you could meet.

No matter what, each time scrap is produced, the fact must be acknowledged. That allows the immediate response that is framed around a simple question:

“How the hell did this happen?”

Put another way, “What have we just learned about the limits of this process?”

It is only within that framework that you actually get any better. Anything else is relying on luck, and in this case at least, that didn’t work.

Looking at the wrong stuff: America’s Best Hospitals: The 2009-10 Honor Roll

This news piece, America’s Best Hospitals: The 2009-10 Honor Roll, originally got my attention because I hoped someone might actually be paying attention to the things that make a real difference in our national debate about health care.

Unfortunately, it looks like more of the same.

This survey looks at things like technical capability (what kinds of specialty procedures these hospitals can perform) and general reputation, and then ranks them accordingly.

But where are we asking about the basics?

Which hospitals kill or injure the fewest of their patients? What is the rate of post-operative or other opportunistic infection? How about medication errors? These are the things that all hospitals should be “getting right” and yet the evidence is overwhelming that most don’t. Further, nobody seems to be paying attention to it except tort lawyers.

Now take a look at this post on Steven Spear’s blog, and especially the Paul O’Neill commentary that he links to.

Tell me what makes a “good” hospital?

Is this a “problem?”

This morning I got an email from a friend that recounts a (still ongoing) story of a failed freezer.

We arrived home Tuesday from a week away to find the “extra” freezer in the garage totally kaput…..much of the stuff inside already ruined but some still partially frozen. It’s only 4 years old and within warranty, so [we] go on line and schedule an appointment with GE service for the next day, and spend hours sorting what [food] might be savable, getting bags of ice to try and bridge the time until (you would assume) they will exchange this unit with a new one. Tech comes out the next day, announces that the compressor is fried, and that he’ll order the part and see you in a week to install.

Needless to say, the customer is not exactly happy here. What could be saved now cannot. When they elevate the problem to “Customer Care” on the phone, the answer is basically holding the line to the warranty terms which give the company the option of replacing or repairing the unit.

Aside from speculation that the response would be different if this had been a commercial unit for a large corporate customer, this story brings up some interesting issues.

Clearly the company here is well within their agreement with the customer. That is (apparently) spelled out in black and white in the warranty, all approved by the legal department. And repair of the unit is the logical economic choice for the company.

But equally clearly, the customer here is not happy with the response.

All of my protestations about how an exchange unit shipped from their warehouse in Kent today would allow my wife to save her food falls on deaf ears. Not even a transfer to a “supervisor” for exception resolution could be arranged. If you don’t like it, tough luck..not buy another GE product? “hey, your choice” hard to believe!

And a customer with a technical problem has likely been turned into a customer for the competition.

So here is the question.

“Is this a problem?”

And when I say “problem” I mean, is this a “problem” from the standpoint of the company’s internal process?

I have my thoughts, and I’ll share them in a day or so. But I’d like to hear what you think.

How Many Production Decisions?

Whether in service delivery (including health care delivery), manufacturing, or any other production environment, your team members are likely having to make lots of decisions under perceived time pressure. Even with great visual aids, many of these processes are mistake-prone.

One of the reasons I like pre-kitting parts for a specific option configuration is that it separates the process of deciding which parts to pick from the process of installing them.

This might not seem that big a deal.

Fortunately, if you have a copy of Windows Vista, it comes with a great simulation that shows just how this can feel on the line.

Look under “Games” and start the “Purble Place” game. Select the building in the middle of the screen, and you will find yourself in a cake factory.

cakes

The idea is to look at the TV monitor on the left, and produce a cake that matches the picture. You can move the belt forward and back to position the cake under the various applicators. Then you simply select the correct choice. Seems pretty simple.

Go ahead, try it.

This screen shot is from the “Advanced” level, but even the “Beginner” is pretty easy to screw up unless you are focused and paying attention all of the time.

So – if you find yourself saying “All the employees need to do is look at the picture, and follow the directions – why is that so difficult?” then see how well you do on this game. Play it from the start of your regular work shift until the first break, say two hours, and see how many mistakes you make.

Now consider that your production environment is likely orders of magnitude more complex than this game for little kids. And you are expecting people to work all day, every day, without ever making a mistake.

wrongcake

If you are in health care delivery – think about the picture of the finished cake as the physician’s instructions, and the production line as the actual process of filling the prescriptions, administering medications, protocols for preventing infections, record keeping procedures, and ask yourself if there aren’t many more opportunities for error – that are far better concealed – than the ones in this little game.

Just a thought for the day. Meanwhile – enjoy finding a work-related reason to have “Games” loaded onto your Vista machine! 🙂

Cool Email Mistake Proofing

My main desktop computer runs Ubuntu Linux. The default email client is called Evolution. A recent upgrade introduced a very cool feature. When I hit “Send” it looks for language in the email that might indicate I meant to include an attachment. If there is no attachment, it pops up this handy reminder:

screenshot-attachment-reminder

Maybe Microsoft Outlook does this too. I haven’t used the latest version, so I don’t know. But in any case, this is a great example of catching a likely error before it escapes the current process. I can’t count the number of times I have hit “Send” only to get an email reply saying “You didn’t include the attachment.” Obviously I was about to do it again, or I wouldn’t be writing this. 😉 Since I am sending out things like resumes right now, that is something I would really like to avoid.
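For the curious, here is a minimal sketch of how a check like this might work. It is not Evolution's actual code; the trigger words, the function names, and the `confirm` callback are all assumptions made up for illustration.

```python
import re

# Hypothetical trigger words; a real mail client would have its own (probably configurable) list.
ATTACHMENT_HINTS = re.compile(r"\b(attach(ed|ment|ments|ing)?|enclosed)\b", re.IGNORECASE)


def needs_attachment_reminder(body: str, attachments: list[str]) -> bool:
    """Return True when the text mentions an attachment but none is present."""
    return not attachments and bool(ATTACHMENT_HINTS.search(body))


def send(body: str, attachments: list[str], confirm=lambda: False) -> bool:
    """Stop the send when an attachment looks forgotten, unless the sender confirms."""
    if needs_attachment_reminder(body, attachments) and not confirm():
        return False  # stopped before the email without its file leaves the process
    # ... hand the message to the mail transport here ...
    return True
```

The point of the design is that the check runs at the moment of “Send,” so the error is caught while it is being made rather than after the recipient writes back.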

When talking about mistake proofing, or poka-yoke, there are really three levels.

The first level prevents the error from happening in the first place. It forces correct execution of the correct steps in the correct order, the correct way. While ideal, it is sometimes easier said than done.

The next level detects an error as it is being made and immediately stops the process (and alerts the operator) before a defect is actually produced. That is the case here.

The third level detects a defect after it has occurred, and stops the process so that the situation can be corrected before any more can be made.

Each has its place, and in a thorough implementation, it is common to find all of them in combination.

Related to this are process controls.

Each process has conditions which must exist for it to succeed. Having some way to verify those conditions exist prior to starting is a form of mistake-proofing. Let’s say, for example, that your torque guns rely on having a minimum air pressure to work correctly. Putting a sensor on the air line that shuts off the gun if the pressure drops below the threshold would be a form of stopping the process before a defect is actually produced.

A less robust version would sound an alarm and leave it to the operator to correctly interpret the signal and stop the process. Your car does this if you start the engine without fastening the seatbelt. (Back around 1974-75 the engine would not start, as in the interlock example above, but too many people, i.e. members of Congress, found this annoying, so the regulation was repealed.)
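The difference between the two versions of the air pressure check can be sketched in a few lines. This is only an illustration: the `TorqueGun` class, its field names, and the 90 psi threshold are assumptions, not values from any real tool.

```python
from dataclasses import dataclass

MIN_PRESSURE_PSI = 90.0  # assumed threshold, for illustration only


@dataclass
class TorqueGun:
    interlock: bool = True  # True: gun shuts off on low pressure; False: alarm only
    enabled: bool = True
    alarm: bool = False

    def check_air_pressure(self, pressure_psi: float) -> None:
        """Verify the process condition before the gun is allowed to run."""
        low = pressure_psi < MIN_PRESSURE_PSI
        self.alarm = low
        if self.interlock:
            # Robust version: the process stops itself before a defect is made.
            self.enabled = not low
        # Alarm-only version: the gun stays enabled and the operator must react.
```

With `interlock=False` the gun keeps running on low pressure, and whether a bad joint gets made depends entirely on someone noticing the alarm.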

Consider the question “Do I have all of the parts and tools I need?” What is the commonly applied method to ensure, at a glance, that the answer to this question is “Yes?”

If you answered “5S” then Ding! You’re right. That is one purpose of 5S.

A common question is how mistake-proofing relates to jidoka.

My answer is that they are intertwined. Jidoka calls for stopping the process and responding to a problem. Inherent in this is a mechanism to detect the problem in the first place.

The “respond” part includes two discrete steps:

  • Fixing or correcting the immediate issue.
  • Investigating, finding the root cause, and preventing recurrence.

Thus, the line stop can be initiated by a mistake-proofing mechanism (or by a person who was alerted by one), and mistake-proofing can be part of the countermeasure.

But it is not necessary to have mistake proofing to apply jidoka. It is only necessary for people to understand that they must initiate the problem correction and solving process (escalate the problem) whenever something unprogrammed happens. But mistake-proofing makes this a lot easier. First, people don’t have to be vigilant and catch everything themselves. But perhaps more importantly, they don’t have to take the (perceived) psychological risk of calling out a “problem.” The mechanics do that for them. It is safer for them to say “the machine stopped” than to say “I stopped the machine.”

Back to my email…

A3 – A Process, Not A Form

Kris Hallan is a frequent contributor on the LEI forum at lean.org. In this post (Lean Forums – A3), he outlines some great experiences with trying to implement the “A3 process” in his organization.

What really drove home what usually goes wrong with the A3 process, and frankly with most well-intentioned efforts to bring good analysis into organizations, was his account of an early attempt to “require” it without having the behaviors to back it up:

One of the worst things you can do is require an A3 be written and then allow a poor A3 to get past you. This has a tendency to happen when you put an A3 mandate on something that you don’t necessarily have control over. For instance, we started by requiring all CAPEX [capital expenditure – ed] projects to be proposed using the A3 format (hoping that the A3 thought process would come with the format).

What we got was a lot of projects summarized on A3s and virtually no feedback to go back and improve anything. No one learned anything from the process, no hansei occurred, and nemawashi was non-existent. It became a box that everyone had to check. This can have a very detrimental effect on people’s attitude toward A3. Since they don’t take it seriously, they can’t really learn anything from it. I would say that this actually moved us backwards in our understanding of problem solving.

I could not agree more. I have seen this in other companies. This is PLAN-DO without the CHECK and ACTion. Set an expectation, go through the motions of compliance, but don’t ever bother to see if it is actually working the way that is expected.

The good news, further into his post, is that Kris’s organization figured it out and found that doing it thoroughly is more important (and quicker!) than doing it fast.