Safety and Lean Manufacturing

This is a (belated) response to a post from Patsi Sells on The Whiteboard. She asked about safety and kaizen.

When first implementing some of the tools and mechanics of the TPS (especially in a manufacturing environment), many of the initial efforts seem to run afoul of the industrial safety professionals. My experience suggests a few basic causes.

  • Safety countermeasures are not seen as necessarily directly contributing to production of the product. Thus, they are often lumped in with “not value added” or “waste” – at least by implication, if not directly, by over-enthusiastic kaizen event leaders.
  • Safety professionals can be stuck on specific countermeasures vs. looking at the actual risk and identifying other possible countermeasures.
  • Safety professionals are concerned (and rightly so) about repetitive motion injury. They perceive that takt driven standard work might increase the risk of this.
  • There is a general lack of understanding of the difference between regulatory compliance and keeping people safe. While these two things overlap, achieving one is neither necessary nor sufficient to assure the other.

Let me start with the last point since it probably raised the most eyebrows.

An industrial safety program has two distinct and discrete objectives:

  1. To keep people from getting hurt.
  2. To comply with all laws and regulations.

As I said above, while meeting both of these objectives is absolutely necessary, meeting the requirements of one does not guarantee meeting the requirements of the other. To put it more directly –

  • It is possible to be fully compliant with all health, safety and environmental regulations and still have a workplace which is extremely dangerous.
  • It is possible to have a very healthy, safe and environmentally friendly workplace and still run afoul of laws and regulations.

Thus, it is necessary to ensure that each of these things is discretely addressed in any successful program.

It is important for both kaizen and safety professionals to be aware of this. Some things are simply required by law, even if they are wasteful, and sometimes interpretation is up to the whim of a local inspector who might be having a bad day. Whatever the case, for each regulatory requirement there must be a specific, targeted countermeasure, just as there must be one for each physical risk. Sometimes these things overlap, but sometimes they do not, and that is the key point here.

I will digress for a paragraph to make this point, however: If you have the basics of workplace organization in place, you make a much better first impression on a health and safety inspector than if the place is a mess. If the inspector from your fire department sees that the fire extinguishers are all where they should be and up to date, that access aisles and evacuation doors are clear, and so on, he is more inclined to assume that you have your basic act together. Everyone has some violations, but when they are blatant, simple things, you start off with a bad impression, and inspectors usually start digging.

Moving beyond simple regulatory compliance, the objectives of “lean manufacturing” and safety are 100% congruent. Ethically it is never acceptable to knowingly put people in harm’s way for the sake of production. At a more pragmatic level, a hazardous workplace will adversely affect quality, production, cost, delivery and morale. I will not descend to the level of cost-justifying safety simply because it isn’t necessary. But it is equally true that an unsafe workplace has higher costs than a safe one.

The good news is that the kaizen process is incredibly effective at dealing with safety hazards.

Moving up on the above list, one of the biggest places where the safety professionals can help smooth out the work is with their expertise in ergonomics.

What the kaizen people should take a little time to understand is this: Motions with poor ergonomics nearly always take longer than motions with good ergonomics. To put it a little more clearly: Ergonomic improvements are kaizen. Once a good team understands the difference between good and bad ergonomics, they can quickly see many small improvements which all accumulate.

By standardizing the method, we standardize the good ergonomics. Where there are no standard methods, the Team Members will each develop their own which may, or may not, be the safest way to do the job.

Standardizing the right motions reduces the chance of repetitive motion injury. And paying attention to why unusual motions are necessary comes back to reducing variation and overload (mura, muri) in the workplace.

At the next level up, standard work is your best friend.

To develop any process you first take a good look and specify exactly what result you expect to achieve.
Define a defect-free outcome.
Part of that definition is “perfectly safe.” No one is going to argue with this. Of course you want a process that is perfectly safe, and produces a defect free outcome. Who wants the opposite? But until “defect free” and “perfectly safe” are explicitly defined or specified, there may be room for interpretation. Do this before working on the process steps.

Next define the work you believe will deliver a perfectly safe, defect-free outcome. Define the content, sequence and timing of the work.

Now you have a specification for content, sequence, timing and outcome – the four elements of activity that Steven Spear called out in “Decoding the DNA of the Toyota Production System.”

Next – try it. Run the process exactly as you specified it. Verify two things:

  • That the Team Member can perform the process as specified – there is nothing keeping him from doing it.
  • That the Team Member actually does it that way. Put in guides, controls and mistake-proofing where things go off track.

Check your results.
Was it perfectly safe? Did you see any risks? How are the ergonomics?
Adjust as necessary, then repeat.

Was the result defect free? How do you know? Is there a way for the Team Member (or an automatic step) to verify the result?

In practice, you are “trying it” and “checking the results” not just when you are developing the process, but every time it is done.

If a defect is produced at some point, then you know either:

  • The process didn’t work as you expected.
  • For some reason (which you don’t know yet), the Team Member did something different, or omitted a step.

Likewise, if the Team Member so much as skins a knuckle, you know the same things: The process is not perfectly safe or the process wasn’t (or couldn’t be) followed.

In most cases where the process was not followed you likely had a Team Member doing something in good faith to “get the job done.” This is the time to gently remind him that, when he can’t follow the process as designed, he should pull the andon and let someone know. Maybe you will have to make an exception to correct the current situation, but you HAVE to know if the process isn’t working or is unworkable.

Bottom line?
You get perfect safety exactly the same way you get perfect quality.
The methods and approach for getting it, and the methods and approach for correction and countermeasure are exactly the same.

Remember:
The right process produces the right result.

If you aren’t getting the result you want, then take a look at the process.

What Nukes – a little more clear.

I re-read my “What Nukes?” post and realized I was really rambling. I want to reiterate a key point more clearly because I think it is important.

In the “Bad Apple” theory there is an implied assumption that the cause of an accident or other problem was one person who, at that moment in time, was not following the documented rules or procedures.

Except in the most egregious cases, such as deliberate misconduct, that is likely not the case. Most organizations have a set of “norms” that operate at some level of violation of the written or established procedures. The reasons for this are many, but usually it is because good people are doing the best they can, in the conditions they are given, to get the job done.

Failure to follow the rules does not result in an accident or incident.

Have you ever run a red light or a stop sign? It happens thousands of times every day. It almost never results in an accident. Only when other contributing conditions are ripe will an accident result: running a stop sign AND a car coming through the intersection.

The same goes for quality checks, and the more reliable an “almost 100%” process becomes, the more vulnerable you are. If a defect is only rarely produced, it is unlikely that any kind of human-based inspection will catch it. The faster the work cycle, the more this is true. The mind numbs, it is impossible to always pay attention to the detail, and the mind sees what it expects. “Failure to pay attention” is never an adequate root cause. It is blaming an unlucky Team Member for an omission that everyone makes every day just going through life. It is just, in this case, “there was a car coming through the intersection.” It is bad luck. It is being blamed for red beads in Deming’s paddle experiment.

So attributing the failure to an individual, while it is easy, avoids the core issue:

People’s failure in critical processes is a SYSTEM PROBLEM. You must investigate from the viewpoint of the person at the pointy end. What did he see? What did he perceive? What did he believe was happening, and why was that belief reasonable given his interpretation of the circumstances at the time?

The post about “sticky visual controls” got to this. Your mistake-alerts or problem signals must penetrate consciousness and demand attention if they do not actually shut down the process.

What Nukes?

Cruise Missiles

Warning to Reader: This piece has a lot of free-association flow to it!

Oops. A few weeks ago a story emerged in the press that a B-52 had flown from North Dakota to Louisiana with half-a-dozen nuclear armed missiles under its wing. The aircrew thought they were transporting disarmed missiles. This is a rather major oh-oh for the USAF, as in general, they are supposed to keep track of nuclear warheads. (Yeah, I am understating this. I, by the way, can speak from a small amount of experience as I once held a certification to deal with these things, so I have some idea how rigorous the procedures are.)

Normally the military deals with nuclear weapons issues with a simple “We do not confirm or deny…” but in this case they have released an unprecedented amount of information, including a confirmation that nukes were on a particular plane in a particular location at a particular time.

The news story of the report summarized a culture of casual disregard for the procedures – the standard work – for handling nukes. I quote the gist of it here:

A main reason for the error was that crews had decided not to follow a complex schedule under which the status of the missiles is tracked while they are disarmed, loaded, moved and so on, one official said on condition of anonymity because he was not authorized to speak on the record.

The airmen replaced the schedule with their own “informal” system, he said, though he didn’t say why they did that nor how long they had been doing it their own way.

“This was an unacceptable mistake and a clear deviation from our exacting standards,” Air Force Secretary Michael W. Wynne said at a Pentagon press conference with Newton. “We hold ourselves accountable to the American people and want to ensure proper corrective action has been taken.”

So what’s the point, and what has this got to do with lean manufacturing?

The right process produces the right result.

As true as this is, it isn’t the point. The point is that the Airmen didn’t follow the procedures. And now the Air Force will apply the “Bad Apple” theory, weed out the people who are to blame, re-emphasize the correct procedures everywhere else, and call it good.

How often do you do this when there is a quality problem, an accident or a near miss? How often do you cite “Human Error” or “not following procedures” or “didn’t follow standard work” as a so-called root cause?

You need to keep asking “why” some more, probably three or four more times.


To this end, I believe Sidney Dekker’s book “The Field Guide to Understanding Human Error” should be mandatory reading for all safety and quality professionals.

Dekker has done most of his research in the aviation industry, and mostly around accidents and incidents, but his work applies anywhere that people’s mistakes can result in problems.

In the USAF case cited above, there was (according to the reports in the open press) a culture of casual disregard for the established procedures. This probably worked for months or years because there wasn’t a problem. The “norms” of the organization differed from “the rules” and I would speculate there was considerable peer pressure, and possibly even supervisory pressure, to stick with the “norms” as they seemed to be adequate.

Admittedly, in this case, things went further than they normally do, but let’s take it away from nuclear weapons and into an industrial work environment.

Look at your fork truck drivers. Assuming they got the same training I did, they were taught a set of “rules” regarding always fastening seat belts, managing the weight of the load, keeping speed down and under control, checking what is behind and to the sides before starting a turn (as the rear-end swings out.. the opposite of a car). All of these things are necessary to ensure safe operation.

Now go to the shop floor. Things are late. The place is crowded. The drivers are under time pressure, real or perceived. They have to continuously mount and dismount. The seatbelt is a pain. They get to work, have the meeting, then are expected to be driving, so there is no real time for the “required” mechanical checks. They start taking little shortcuts in order to get the job done the way they believe they are expected to do it. The “rules” become supplemented by “the norms.” This works because The Rules apply an extra margin of safety that is well above the other random things that just happen around us every day. The Norms – the way things are actually done – erode that safety margin a little bit, but normally nothing happens.

Murphy’s Law is wrong. Things that could go wrong usually don’t.

The “Bad Apple” theory suggests that accidents (and defects) are the fault of a few people who refuse to follow the correct procedures. “If only ‘they’ followed ‘the rules’ then this would not have happened.” But that does not ask why they didn’t do it that way.

Recall another couple of catastrophes: We have lost two Space Shuttle crews to the same kind of problem. In both the Challenger and Columbia accident reports, the investigators cite a culture where a problem which could have caused an airframe loss happened frequently. Eventually, concern about it became routine. Then, one time, other factors came into play, what usually happened didn’t happen, and we were left wringing our hands about what went wrong this time. Truth is, it nearly happened every time. But we don’t see that because we assume that every bad incident is an exception, the result of something different this time. In reality, it is usually just bad luck in a system which eroded to the point where luck was relied upon to ensure a safe, quality outcome. In this case they didn’t single out “bad apples” because the investigations were actually done pretty well. Unfortunately the culture at NASA didn’t adjust accordingly. (Plus, space flight involves the management of unimaginable amounts of energy, and sometimes that energy goes where we don’t want it to.)

So – those quality checks in your standard work. Do you have explicit time built in to the work cycle to do them? Are your team members under pressure, real or perceived, to go faster?

What happens if there is an accident or a defect? Does the single team member who, today, was doing the same thing that everyone does every day get called out and blamed? Just look at your accident reports to find out. If the countermeasure is “Team Member trained” or “Team Member told to pay more attention” or just about anything else that calls out action on a single Team Member then… guilty.

What about everybody else? Following an incident or accident, the organization emphasizes following The Rules. They put up banners, have all-hands meetings, maybe even tape signs up in the work place as reminders and call them “visual controls.” And everything goes great for a few weeks, but then the inevitable pressure returns and The Norms are re-asserted.

Another example: Steve and I were watching an inspection process. The product was small and composed of layers of material assembled by machine. Sometimes the machine screwed up and left one out. More rarely, it screwed up and doubled something up. As a countermeasure, the Team Member was to take each item and place it on a precise scale, note the weight, and compare the weight to a chart of the normal ranges for the various products.

There were a couple of problems with this. First, the human factors were terrible. The scale had a digital readout. The chart was printed and taped to the table. The Team Member had to know what product it was, reference the correct line on the chart, and compare a displayed number with a set of printed numbers which were expressed to two decimal places. So the scale might say “5.42” and she had to verify whether that was in or out of the range of “5.38 – 5.45.”

Human nature, when reading numbers, is that you will see what you expect to see. You might realize that one was different only after five or six more reads. So telling the Team Member to “pay more attention” if she made a mistake was unreasonable. Remember, she is doing this for a 12 hour shift. There is no way anyone could pay attention continuously in this kind of work. If a defective item got through, though, there would be a root cause of “Team Member didn’t pay attention.” She is set up to fail.
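To make the comparison task concrete, here is a minimal sketch of the same chart lookup done as a go/no-go check instead of a mental comparison of two-decimal numbers. It is only an illustration, not what the plant actually used. The product name and the `check_weight` function are made up; only the 5.38 – 5.45 range and the 5.42 reading come from the story.

```python
# Hypothetical chart: product name -> (low, high) acceptable weight.
# Only the 5.38 - 5.45 range is from the story; "product_a" is invented.
WEIGHT_LIMITS = {"product_a": (5.38, 5.45)}


def check_weight(product, reading):
    """Return "GO" if the scale reading is inside the product's range."""
    low, high = WEIGHT_LIMITS[product]
    return "GO" if low <= reading <= high else "NO-GO"


print(check_weight("product_a", 5.42))  # reading from the story
print(check_weight("product_a", 5.46))  # just above the range
```

The point is not the code itself. The comparison is trivial for a device and tiring for a person, so the signal the Team Member gets should be “go” or “no-go” rather than a number to interpret.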

But wait, there’s more!

She was weighing the items two at a time. Then she was mentally dividing the weight by two, and then looking it up. Even if she was very good at the mental math and had the acceptable range memorized, that isn’t going to work. Plus, and this is the key point, in the unlikely but possible scenario where the machine left out a layer in one item, then doubled up the next, the net weight of the two defective items together would be just fine.
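The masking effect is easy to show with numbers. The sketch below uses the 5.38 – 5.45 range and the 5.42 reading from the chart example. The weight of a single layer (0.50) is an assumption made purely for illustration, since the real figure was not recorded.

```python
LOW, HIGH = 5.38, 5.45  # acceptable range from the chart example
NOMINAL = 5.42          # weight of a good item (the example reading)
LAYER = 0.50            # ASSUMED weight of one layer, illustration only


def in_range(weight):
    """True if a single-item weight falls inside the acceptable range."""
    return LOW <= weight <= HIGH


missing_layer = NOMINAL - LAYER  # machine left a layer out
doubled_layer = NOMINAL + LAYER  # machine doubled a layer up

# Each item weighed on its own, as the process was designed.
singles = [in_range(missing_layer), in_range(doubled_layer)]

# Both items weighed together, then divided by two in her head.
pair_average = (missing_layer + doubled_layer) / 2
pair_ok = in_range(pair_average)

print(singles)
print(pair_ok)
```

Weighed one at a time, both defective items fail the check. Weighed together, the light one and the heavy one average out to a normal weight and the pair passes.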

“Why do you weigh two at a time?” Answer: “It’s faster.” This is true, but:

  • It doesn’t work.
  • She doesn’t need to go faster.

Her cycle time for weighing single items was well within the required work pace. But the supervisor was under pressure for more output because of problems elsewhere, and had translated that pressure to the Team Member in a vague “work faster if you can” way. It was the norm in that area, which was different from the rules.

Where is all of this going?

The Air Force has ruined 70 careers as a result of the cruise missile incident. They may have been right to do so. I wasn’t there, and this was a pretty serious case. But the fact that it got to this point is a process and system breakdown, and it goes way beyond the base involved.

Go to your own shop floor. Stand in the chalk circle. Watch, in detail, what is actually happening. Compare it with what you believe should be happening. Then start asking “Why?” and include:

“Why do people believe they have to take this shortcut?”

“Packaging” is spelled M-U-D-A

In Mike Wroblewski’s blog “Got Boondoggle?” he comments on just how much packaging and dunnage is not visible in Toyota’s Industrial Equipment plant. Of course that is remarkable because of just how common it is to find the opposite condition. Factories (and offices) have lots of packaging around, and spend lots of time dealing with it.

Toyota has been working on this a long time. They have the added advantage of being an 800 pound gorilla with most of their suppliers. They can specify packaging and insist on things being done a certain way. In a lot of cases it is the supplier that is the 800 pound gorilla, and a lot of small companies have a hard time being heard. But you can still make a lot of difference if you apply the principle that packaging is muda.

A frequently invoked (and overworked) analogy is the assembler or operator as a surgeon. Everything must be ready for the value-add operation to be performed waste free. If there is waste in the process, keep it away from this point. Surgical instruments come in packages. But I can assure you the surgeon is not unwrapping the scalpel. So who is?

The instruments are prepared and made ready for use by the staff.

Now take this to your factory floor. The first step is to keep dunnage and packaging out of the production area. There are actually a lot of advantages to doing this. The chief one is obviously optimizing your value-add time. But all of that packaging also takes up space. EVERYTHING that enters the production area MUST have a process (meaning a person!) to get it OUT of the production area. Since you have to unpackage that stuff anyway, do it before it gets to production. This means that someone else picks the part and prepares it for use just like the staff in the operating room.

Do this and you have applied one of the principles Liker and Meier point out in The Toyota Way Fieldbook – if you can’t eliminate sources of variation, then isolate them. In other words, set up a barrier that contains the waste so that your value-adding operation sees the result of a perfect supplier.

You can then take the next step: Do this at receiving. If the parts do not arrive from the supplier packed the way that your internal material conveyance system needs them, then put the resources into receiving to convert what you get from your suppliers into what you wish you got from them.

This is applying the principle of systematically pushing waste upstream, closer to the point where it originates. The other thing you have done is force yourself to dedicate resources to deal with this waste instead of spreading it so thin the cost is hidden. You will know what it costs you in terms of people, time, space, etc. to deal with the fact that your suppliers don’t ship what you need. By highlighting the problem instead of burying it, you have the opportunity to address it.

One more thing – it might seem easier to take these little wastes and spread them thin. After all, if everybody just does one or two trivial tasks, it doesn’t seem so bad, does it? It is those trivial wastes, those 5 second, 30 second, 1 minute little things that accumulate to half your productivity. It takes work to see them and eliminate them. Don’t add more to the process on purpose.

Standards Protect the Team Members

One of my kaizen-specialists-in-training just came to me asking for help. The Team Members he is working with are not seeing the need to understand sources of work variation.

I hear that a lot, both in companies I have worked in and in the online forums. Everyone seems to think it is a problem in their company, their culture – that they are unique with this problem.

The idea of a unique problem is a variation on the “our process / environment / product is different so ____ won’t work here.” Someday I will make a list of the standard management “reasons why not” but that isn’t the topic of this post.

I told him:

  1. This is not unique to China, or to this facility. The same resistance always comes up, and nearly always comes up the same way once the Team Members begin to realize we are serious.
  2. There is no way to just change people’s minds all at once.

Here is something to explain to the concerned Team Members: The standard process is there to protect the team member. If there is a problem, and the standard process was followed, then the only focus for investigation can be where the process itself broke down. Countermeasures are focused on improving the strength of the process.

If, on the other hand, the process was not followed (or if there is no process), then the team member is vulnerable. Instead of the “Five Why’s” the investigation usually starts with the “Five Who’s” – who did it? Countermeasures focus on the individual who happened to be doing the work when the process failure occurred.

As you introduce the concept of standard work into an area that is not used to it, it is probably futile to try to tighten down everything at once. The good news is that you really don’t have to.

Start with the key things that must be done a certain way to preserve safety and quality. If they are explained well and mistake-proofed well, there is usually little disagreement that these things are important.

The next step is to make it clear that the above are totally mandatory. If anything gets in the way of doing those operations exactly as specified, then STOP. Do not just work around the problem, because doing so makes you (the Team Member) vulnerable to the Five Who’s inquisition.

If you focus here for a while, you will start to get more consistent execution leading to more consistent output, which is what you want anyway.

Then start looking at consistent delivery and all of a sudden the concept of variation in time comes into play. Why was this late? The welder ran out of wire, I had to go get some more, I couldn’t find the guy with the key to the locker…… Go work on that. At each step you must establish that the point of all of this is to build a system that responds to the needs of the people doing the work.

A Quality Story

On a cloudy morning a few years ago I started my truck, turned on the headlights and noticed one of them was not working. While changing a headlight isn’t that big a deal, I needed a state safety inspection anyway, and I really didn’t want to mess around under the hood.

So I called the auto repair and tire place up the road, and arranged to come in.

When I got there, I explained that the right headlight was out. Since I knew from experience that the left one would probably fail in a few weeks, I said “Go ahead and change the left one too, and I need the state safety inspection.”

Unfortunately they only had one headlight in stock, so they couldn’t change both. I said OK and got a soda to wait.

A while later, everything was good-to-go, I drove back to work and went home that evening. The next day was sunny.

The following day, however, was cloudy again. I turned on the headlights in the garage, and the right headlight was still out. Hmmm. I took a quick look and noticed that the LEFT headlight was new. They had changed the wrong one.

When I got back to work, I called the shop, explained the issue, and they said come over and they would make it right. (Meaning they would replace the RIGHT headlight at no charge.) After handing the keys back in, the mechanic who did the work came out and tried to convince me that he had simply followed the instructions, and it wasn’t his fault.

OK — let’s break down this situation from the perspective of quality and delivery.

Starting from the end of the story, why did the mechanic feel obligated to try to convince me that I had responsibility for a mistake that the owner had already agreed to correct? What could possibly be gained by trying to make the customer feel wrong here?

Now let’s go back to the beginning.
Customer reports a headlight is out. What is the very first thing you would do? How would you confirm the problem?

Turn on the headlights and check. This takes about 10 seconds. If there was a confusing communication about the problem, this is the time to discover it.

Once the repair has been completed, how would you confirm that it worked?
Turn on the headlights and check. This would verify that the results were as intended. (What does the customer need here? Two headlights that work.)

The other item that was on the instruction was a state safety inspection. Now I am not an expert on the New York State safety inspection, but I am willing to bet at least a can of soda that it includes the headlights. So as part of that inspection, the mechanic should turn on the headlights and check.

Even if the problem had gone undetected up to this point, it should have been caught here. Obviously the “inspection” was just a paperwork drill in this case.

Of course, as the customer, I also failed to turn on the headlights and check when I picked up the vehicle. Silly me.

What is the point of all of this?

The first step of solving any problem is to understand or verify the current condition and compare it with your expectations or standard. (More about standards and expectations later.)

Once a countermeasure has been developed and put into place, the situation must be re-checked to ensure that the countermeasure worked as intended and the process or system is back to the standard condition.

Any process has some kind of intended result. In this case, the headlight repair process should result in two headlights that work. Yet the results were not verified, and the incorrect product was delivered to the customer.

So what did all of this cost?

It cost me my time.
It cost the repair place:

  • Double the mechanic time.
  • A headlight, plus the time to go get one during business hours (because they had not replaced the one they installed on the left two days earlier).
  • Good will. (Later, when they failed to tighten the filter after an oil change, I stopped doing business there altogether. One mistake I was willing to overlook, but I was starting to see a trend developing. I also later had some extensive maintenance and repairs done on the truck, but I didn’t have them done there because I could not trust them to change a headlight or change the oil.)

By the way, I still have the truck. It is coming up on 200,000 miles and runs great.