ASQ CQA – 5. Quality Tools and Techniques Part 15

  1. 5H Failure Mode and Effects Analysis

In addition to FMEA, we will talk about P FMEA as well, where P is for process, and D FMEA, where D is for design. So let’s understand what FMEA is. FMEA is a design tool which is used for systematically analyzing potential failures and identifying their effects. This helps us in identifying what all could go wrong and then prioritizing the actions which we need to take. FMEA stands for failure mode and effects analysis: failure modes are how things could go wrong, and effects are the consequences of that. FMEA can be done at various stages in product development.

So let’s understand that when you are developing a product, let’s say you are designing a new type of car, or maybe new type of laptop or mic or whatever new product you are designing, the first thing in that is the concept design, where you design the product at a very high level, at the conceptual level. And then you do the detailed design where you design all the parts which will fit together to make that particular product.

That is the design stage. And the next stage is the production stage or the process stage. So these are the stages in product development. FMEA can be used at all these stages. At the conceptual stage, we have concept FMEA. At the design stage, we have design FMEA or D FMEA. At the process stage or manufacturing stage we have process FMEA.

So if we look at the table on the left, it shows that design FMEA identifies failures associated with the design, which include product malfunctions, product life, or safety hazards, whereas process FMEA covers failures associated with product quality, process reliability, and customer dissatisfaction. So here we talked about the stages where we can do FMEA. Let’s look at these stages in a little more detail, where we will talk more about design FMEA and process FMEA.

So in design we do design FMEA. If your product is complex, you can divide your FMEA into a number of parts: systems, subsystems, or the component level. And I’m sure at this stage, if you don’t know about FMEA, you might be getting a little bit confused. So what I will do is, once I have talked about these basics, I will show you an example of FMEA that will make things clearer.

So let’s come back to the topic. At the design stage, we can do FMEA for systems. Let’s say you are designing a new car; then one system will be, let’s say, the braking system. In the braking system, what all could go wrong? What will be the effect of that? What do we need to do? Similarly, we can look at another system, let’s say the drive system, or the battery system if we are designing a battery-operated car. So we look at systems, we look at subsystems, or we could even go to the component level, depending upon the complexity of the product.

Coming to the process FMEA: process FMEA can be broadly classified into two areas, the production FMEA and the assembly FMEA.

So once you start making those cars, you will be making certain components, the individual pieces, and you will be using production FMEA for the production stages. Here also you can go to the system level, subsystem level, or component level. Similarly, when you assemble all these pieces, you will be doing the assembly FMEA, and there also you can go to the system, subsystem, and component level. So this basically tells us that FMEA can be done at various stages and at various levels.

And once again, we have still not seen what an FMEA looks like. Before we look at the example, let’s understand that FMEA is a proactive tool. This is not something to be done when the failure has already happened. What we do here is look at what all could go wrong. Let’s say we are talking about the car: what could go wrong if the braking system doesn’t work? That will be a fatal problem. If the brake doesn’t work, people could die. So this is a very critical problem. Another failure mode could be, let’s say, that we are making car body panels and a panel has a dent or a minor scratch. This is also a failure, a failure to meet the client requirement.

Okay, this is not as critical as the brake failure. So we look at a number of failures, at what all could fail in the system, subsystem, or component. And once again, this is a living document, because once you have identified all these failures, you take action on them, and once you take action, your failure modes could change.

So with this basic understanding, let’s look at an example. This is a simple example. I have a perfume-making company which buys a number of components, the ingredients for the perfume, mixes them in the right proportion, puts the mixture in a bottle, adds nice packaging, and sells it in the market. So this is what I’m doing, a very simple operation. In that, what all could go wrong? For this I have made an FMEA here, which is a failure mode and effects analysis. Let’s look at it column by column.

So what we have in the first column is the process or the requirement. Here you list all the processes. My first process is receiving all the ingredients for the perfume. This is the receiving part. Then I do the mixing part, then the next step will be packing, the next step will be dispatch, and so on. I’m taking only a few steps to demonstrate this FMEA. Let’s concentrate on the receiving part only. So I have received a number of ingredients. Now what could go wrong here? What could go wrong is that I could get wrong ingredients, which could lead to poor performance. My perfume might not smell as nice because I got wrong ingredients. So this is something which could happen. In the case of the car, we said the failure could be the brake failure, or it could be a scratch on the car body.

But in this particular case, wrong ingredients is the failure mode at the stage when we receive the material. So what is the effect of that? The effect will be inconsistent quality of the perfume. Now what we do is grade each of these failures on three scales: severity, occurrence, and detection. So we give three numbers to each of these failures, and we will talk about each of them as we go further.

Let’s talk about severity first. Severity here is eight: when I get wrong ingredients, the quality of my perfume is inconsistent. Severity is on a scale of one to ten. One is low severity, where there is not much impact. Ten is the most severe. In the case of brake failure, this will be ten. In the case of a scratch, this could be three or four, or whatever the team decides. This is done by team working, and you give a number to each of these failure effects. So this is eight here. The next thing I do is look at the causes of this failure. The cause could be unclear specification or substandard material supplied by the supplier. So either I have not specified properly what I need, or the supplier knows what I need but has provided me substandard material.

So now what is the chance of this? For that we give another number, which is occurrence: how frequently is this expected to happen? The occurrence number for unclear specification is three, and for substandard material supplied by the supplier it is six. There is less chance that my specifications are unclear, because I have made sure that they are really clear; I have gone through a review of the specification. But there is a good chance that the supplier could provide me substandard material. Now, for each of these causes, what is the current control? How do we make sure that our specifications are clear? For that we have review and approval of the specification, with multiple stages of review before we send it to suppliers. And what is the current control in place for substandard material supplied by the supplier? We have a third-party certification agency, and we have an in-house lab which checks the material. The next number here is detection: how easy is it to detect the problem?

So if something goes wrong, is it easy to detect or not? If it is not easy to detect, this could lead to much bigger problems, because you are not able to catch the problem. In this particular example, both causes are given a detection number of four. Detection is also on a scale of one to ten. One means very easy to detect, which could be the case with automation, where something goes wrong and it automatically gets detected. Ten is where things cannot be detected, and the incident happens without any detection. Now that we have given these three numbers, the severity, occurrence, and detection, what we do next is multiply them, and we come out with another number which is called the RPN, or risk priority number. This basically tells us how risky this particular case is.

So RPN here is 96 in the first case and 192 in the second case. Eight multiplied by three multiplied by four gives you 96, and eight multiplied by six multiplied by four gives you 192. So these are the two RPN numbers. Once we have the completed FMEA, we have multiple such numbers. What the number tells you is that the higher it is, the more priority that item has. So in this particular case, the second cause of the failure mode has more priority: you need to make sure that you get the right material from your suppliers. And as I was saying earlier, this is a live document. The current control in place was third-party certification. What you might want to recommend here is, let’s say, additional checks or certifications to be provided by the supplier.
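The RPN arithmetic for the receiving step can be sketched in a few lines of Python. This is only an illustration of the calculation described above; the class name `FmeaRow` and its field names are made up for this sketch and are not part of any standard FMEA template.

```python
from dataclasses import dataclass


@dataclass
class FmeaRow:
    """One line of the FMEA table (hypothetical structure for illustration)."""
    step: str
    failure_mode: str
    cause: str
    severity: int
    occurrence: int
    detection: int

    @property
    def rpn(self) -> int:
        # RPN = severity x occurrence x detection
        return self.severity * self.occurrence * self.detection


# Ratings taken from the perfume example in the lecture.
rows = [
    FmeaRow("Receiving", "Wrong ingredients", "Unclear specification", 8, 3, 4),
    FmeaRow("Receiving", "Wrong ingredients", "Substandard material from supplier", 8, 6, 4),
]

# Highest RPN first: that is the item to act on first.
ranked = sorted(rows, key=lambda r: r.rpn, reverse=True)
for row in ranked:
    print(f"{row.cause}: RPN = {row.rpn}")
```

Sorting by RPN in descending order puts the substandard material cause at the top of the list, which matches the prioritization described in the example.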

So once you have put in that additional step, this rating might change. If this rating is reduced because you have taken some action, then the next item will have the higher priority. So this was FMEA, or failure mode and effects analysis. Earlier I talked about three numbers: severity, occurrence, and detection. All three are on a scale of one to ten, and here is the summary. Severity of one means no effect; the client might not even notice. Severity of ten is a serious safety hazard. Occurrence is also on a scale of one to ten. Occurrence of one means this is a rare event, and occurrence of ten means the failure is almost inevitable; the failure will happen. Similarly, detection is on a scale of one to ten. Detection of one means the current system will definitely pick up the problem, because we have automation in place, and detection of ten means the current system cannot detect the problem. And at the top I have given the formula: RPN is equal to severity multiplied by occurrence multiplied by detection. So RPN will be anything between one and 1000.
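As a small sketch of these scales, the helper below rejects any rating outside one to ten before multiplying. The function name `rpn` and the choice to raise `ValueError` are assumptions made for this illustration.

```python
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Return severity x occurrence x detection, with each rating on a 1-10 scale."""
    for name, value in (("severity", severity),
                        ("occurrence", occurrence),
                        ("detection", detection)):
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {value}")
    return severity * occurrence * detection


lowest = rpn(1, 1, 1)      # every rating at its minimum
highest = rpn(10, 10, 10)  # every rating at its maximum
print(lowest, highest)
```

With every rating at its minimum the result is 1, and with every rating at its maximum it is 1000, which are the two ends of the RPN range.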

When everything is one, that is severity one, occurrence one, and detection one, RPN will be one. And when all three numbers are ten, RPN takes its maximum value of 1000.

So here is the summary of the steps which you take in FMEA. When you create an FMEA, the first thing you do is identify the key process steps. That’s what we did here: in the first column we identified the steps. Then you look at the failure mode for each of these steps. Then you identify the failure effect and severity, which is here. Then you identify the cause and occurrence, which is done here. Then you identify the control and detection, which is the next two columns. Based on that, you find the RPN value and prioritize actions based on RPN, so higher RPNs are attended to first and lower ones later. Then, based on these RPN values, you determine the action plan, and then you recalculate the RPN: if you have taken some action, the severity, occurrence, or detection might have reduced, so you change these numbers accordingly.
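The last step, recalculating RPN after an action, can be sketched as follows. The starting ratings are the ones from the perfume example. The reduced occurrence rating of 2 is a hypothetical value chosen only to show how the priority can shift; the lecture does not give a post-action number.

```python
# Ratings before any action, from the perfume example.
causes = {
    "Unclear specification": {"severity": 8, "occurrence": 3, "detection": 4},
    "Substandard material from supplier": {"severity": 8, "occurrence": 6, "detection": 4},
}


def rpn(ratings: dict) -> int:
    return ratings["severity"] * ratings["occurrence"] * ratings["detection"]


def top_priority(table: dict) -> str:
    # The cause with the highest RPN is attended to first.
    return max(table, key=lambda name: rpn(table[name]))


priority_before = top_priority(causes)

# Hypothetical outcome of the recommended action (extra supplier checks):
# occurrence for the supplier cause drops from 6 to 2.
causes["Substandard material from supplier"]["occurrence"] = 2

priority_after = top_priority(causes)
print(priority_before, "->", priority_after)
```

Under that assumed improvement the supplier cause falls to an RPN of 64, so the unclear specification cause, at 96, becomes the next item to work on.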

  1. 5H Hazard Analysis and Critical Control Points (HACCP)

So let’s understand what HACCP is, which is Hazard Analysis and Critical Control Points. This is an internationally recognized system to enhance food safety throughout the food chain. So the first thing here is that HACCP is related to food safety. What we do in HACCP is identify potential hazards, and these hazards are related to food. Earlier, when we talked about FMEA, we were also looking at hazards or risks, at what all could go wrong. Here we are doing a similar thing to what we did in FMEA, but our focus is on food safety. Then we implement control measures at specified points in the process: at this particular point, this is the control which we need to have; at that particular point, this is the additional control which we need to have in place for food safety. And the third thing is monitoring and verifying that the control measures are working as intended.

So these are three broad things which cover HACCP. So in HACCP there are seven core principles. Let’s broadly understand these. The first one is conduct a hazard analysis. And what you do in conducting a hazard analysis is you look at all the hazards which are related to the food safety and you evaluate that what is the likelihood and severity of each of these hazards. And then the next thing is determine the critical control point that at this particular stage, this is the control which we need to have. For example, let’s take a simple example of storing the food. So in your house also when you store the food, you put that in a refrigerator. So if you put in a refrigerator, that food can stay for longer. So here what you need to do is you need to identify critical control point that when we receive, let’s say, the meat in the factory or the food in the factory which is processing food, that what are the critical control points? First control point is the receiving of the food. The next will be the processing and so on. So you identify all those control points and then you establish critical limits there. Let’s say in case of storing food, you list down that your critical limit is that the temperature should be less than four degrees and the food should not be stored more than four days. So this is what you do in establishing critical limits.

And then you establish monitoring procedures. How would you monitor that? In monitoring, let’s say QA will be responsible for measuring the temperature of the storage area every day, every hour, or at whatever interval you decide. So this will be establishing the monitoring procedure. The next thing is to establish corrective actions. If things go wrong, if your critical limits are not met, what action needs to be taken? That is determined in establishing corrective actions. And then the sixth core principle is to establish verification procedures: how would you verify that all these actions and all these critical control points were adhered to?

This is done in step number six. And then step number seven is establish record keeping and documentation procedure. You need to keep record of that, that whatever action you have taken, whether those were complying to the critical limits which you establish. So this was a brief introduction to HACCP. And just to remind you, in addition to the CQA exam which you are planning to take now, after doing this course, there is a separate certification issued by ASQ, which is for certified HACCP auditor. So if you are doing audit of a food facility so you could even go for that particular certification where the focus will be on food safety. But let’s come back to CQA, in that we were just required to have a high level understanding of HACCP as a risk management tool.
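To make the critical limit and monitoring ideas concrete, here is a minimal Python sketch using the storage example from the lecture: temperature below four degrees and no more than four days in storage. The names `StorageReading` and `check_reading` are made up for this sketch, and the temperature unit is assumed to be Celsius since the lecture does not state it.

```python
from dataclasses import dataclass
from typing import List

# Critical limits from the storage example (unit assumed to be Celsius).
MAX_TEMPERATURE = 4.0   # temperature must be below this value
MAX_DAYS_STORED = 4     # food must not be stored longer than this


@dataclass
class StorageReading:
    """One monitoring record for the storage area (hypothetical structure)."""
    temperature: float
    days_stored: int


def check_reading(reading: StorageReading) -> List[str]:
    """Return the list of critical limits that this reading violates."""
    violations = []
    if reading.temperature >= MAX_TEMPERATURE:
        violations.append("temperature limit exceeded")
    if reading.days_stored > MAX_DAYS_STORED:
        violations.append("storage time limit exceeded")
    return violations


# Example monitoring records; the values are invented for illustration.
ok_reading = StorageReading(temperature=2.5, days_stored=3)
bad_reading = StorageReading(temperature=6.0, days_stored=5)

print(check_reading(ok_reading))   # no violations
print(check_reading(bad_reading))  # both limits violated
```

Any non-empty result would be the trigger for the corrective action, and keeping each reading together with its result would serve as the record called for by the seventh principle.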

  1. 5H Critical to Quality (CTQ) Analysis

Before we talk about CTQ, or critical to quality analysis, let’s understand another term, which is VOC or voice of customer. The term voice of customer is used to describe the in-depth process of capturing customers’ expectations, preferences, and aversions. And once again, we are learning this as a part of risk management tools, because when you are running a business, you need to meet the expectations or the needs of the client. If you are not able to meet them, that is a risk to your business. If you don’t meet the needs of the customer, the customer might be dissatisfied, and then they might leave you and go to another company. Or in the worst case, they might sue you in court for not meeting something which was required as a part of the contract or as a part of their expectations. The voice of the customer could be stated or unstated. Let’s take a simple example of me running a clinic. So I have a clinic where I have a number of doctors attending to the patients who come to that clinic.

So the customer in this particular case are the patients who are coming to the clinic. And what is the voice of customer? They want good service from this particular facility, this particular clinic. So this is voice of customer. They want good service from the clinic. But then what do I do with that? So for that we have another term which is called as CTQ. But before CTQ, let’s understand that the problem with the VOC or the voice of customer is that voice of customer is vague. Customer wants good service from the clinic. That’s very vague. I cannot do anything about that. It’s difficult to define. And this is where CTQ comes into picture. So what we do in CTQ is we break down all the customer needs or the voice of customer into specific actions, specific things. And that’s what CTQ analysis does.

CTQ analysis converts the voice of customer into specific actions, specific measurable actions. So coming back to the same example of clinic here, the voice of customer is that customer wants good service in the clinic. Now, what does this good service means? So for that there are a number of drivers. One driver is the timely service, another driver is the cleanliness of the clinic, the cost associated with this service and so on. So these are the key drivers for this particular service. And now once you break that down into smaller pieces, for example, the timeliness here, if I break this down into CTQ, that will convert into, let’s say these two cases here, where the time which is from the registration to calling by the doctor is less than five minutes.

So the patient comes to the clinic, gets registered, and then should not be waiting more than five minutes for the doctor to call them. This is one CTQ. The second CTQ could be that the doctor’s consulting time is greater than ten minutes. The patient doesn’t want to go to the doctor and have the doctor hurry, rush them, and push them out. They want the doctor to attend to them properly. So here the CTQ is that the doctor should give them more than ten minutes. This was for timeliness as the driver. Then we can have similar CTQs for cleanliness, for cost, and for a few other things as well.

So what we have done here is we have broken VOC, which was a quite broad term, into very specific things, specific, measurable things. So the difference between the voice of customer and CTQ is that voice of customer is less specific, but CTQ is more specific. Voice of customer is hard to measure. You really cannot measure that they want good service from the clinic. This is not measurable. But once it comes to CTQ, CTQ are easy to measure.
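Because CTQs are measurable, they can be written down as simple checks. The sketch below encodes the two timeliness CTQs from the clinic example. The dictionary keys and the visit values are hypothetical and only illustrate the idea.

```python
# The two timeliness CTQs from the clinic example, written as checks.
ctqs = {
    # Registration to being called by the doctor: less than five minutes.
    "wait_minutes": lambda minutes: minutes < 5,
    # Doctor's consulting time: greater than ten minutes.
    "consult_minutes": lambda minutes: minutes > 10,
}

# One patient visit with invented measurements.
visit = {"wait_minutes": 7, "consult_minutes": 12}

# True means the CTQ was met for this visit.
results = {name: check(visit[name]) for name, check in ctqs.items()}
print(results)
```

For this invented visit, the waiting time CTQ is missed and the consulting time CTQ is met, which is exactly the kind of answer a vague "good service" statement cannot give.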

So whatever product or service you are providing, you might want to do CTQ analysis to convert the voice of customer into specific measurable things, which could help you in achieving what client needs. This will help you in mitigating risks related to not meeting the client requirements. So with this, we complete this topic of risk management tools.
