RELIABILITY BASED METHODOLOGY FOR VENDOR SELECTION - A CASE STUDY
Damodar Garg, P S Sarma Budhavarapu, Sudhangshu C
Reliability Centre, Global R&D Centre, Crompton Greaves Ltd, Kanjur Marg (E), Mumbai.
e-mail: damodar.garg@cgglobal.com
ABSTRACT
The objective of every industry is to manufacture and supply products that perform their intended functions without failure in the field. It is a common misconception that a reliable design always yields a reliable product. Even when a product with a reliable design is manufactured and used in the field, its reliability may be unsatisfactory, for instance because it was manufactured using substandard processes. Therefore, producing a reliable product requires evaluation of the manufacturing processes or vendors. This work formulates a methodology that helps evaluate design reliability and supports the vendor selection process. The proposed methodology uses reliability prediction to estimate design reliability, and HALT (Highly Accelerated Life Testing) for vendor selection by qualitatively comparing prototypes of the same design manufactured by different vendors. A case study on a power electronic product illustrates the methodology.
1. INTRODUCTION
Product performance and reliability are essential to success in today's global market. For products where a long development cycle cannot be afforded because of market competition, evaluating design reliability and selecting vendors become very important for gaining sufficient confidence in the product before it reaches the end customer.
Design reliability, also called inherent reliability or built-in reliability, is a measure of the overall "robustness" of a system or piece of equipment [1]. It is probably the single most important characteristic of any system or piece of equipment in determining overall reliability performance. The design reliability of a system or device is determined by its configuration and component selection. It provides an upper limit to the reliability and availability that can be achieved. In other words, no matter how much inspection or maintenance is performed, the design reliability cannot be exceeded unless the design of the product is changed. If a device is produced, operated, maintained, and inspected as required, all of the design reliability can be realized. On the other hand, if there are gaps in the manufacturing methods or in the operating, maintenance, or inspection practices, only part of the design reliability will be realized.
The selection of manufacturing processes (or vendors) to produce a product (or a component of the product) plays a major role in achieving design reliability. Many designers hold the misconception that a product is reliable if it passes initial testing and inspection. However, a product made with non-standard or incorrect manufacturing processes may pass initial inspection and testing and still fail in the field. For example, dry solder joints could pass initial testing at the manufacturer's end but cause failures in the field as a result of thermal cycling or vibration. This type of failure is not due to improper design; it is the result of an inferior manufacturing process. Therefore, the manufacturing processes (or vendors) should be evaluated after the product design is completed, which helps the manufacturer achieve the design reliability. Sometimes this evaluation also reveals areas where design modifications can raise the reliability of the product.
It is often said that "reliability must be designed into a product". This emphasizes that nothing can make a poor design reliable. However, it is quite possible, and quite common, for a good design to be compromised by other factors. Therefore, the designer has to measure the design reliability of the product at the design stage so that the reliability goal can be met before the product design is completed. A reliability prediction made for a product is based on its design and is an estimate of design reliability, since it assumes that part failure rates, manufacturing quality, and handling factors are all as expected. These assumptions must hold in practice for the achieved reliability to be reasonably close to the predicted reliability. Reliability prediction can also be used by the designer to select a design from among alternatives.
A second way of assessing design reliability, qualitatively, is to conduct a highly accelerated life test (HALT). HALT is an advanced tool used to identify the weakest links in a product [2]. In HALT, the product is subjected to high levels of stress to identify its initial strength. It is a limit determination test that establishes the design margin for specific stresses. It surfaces design and potential process problems more quickly and effectively so that corrective action can be taken to improve reliability. HALT is not intended to determine the MTBF of a product or to demonstrate a stated reliability measure.
HALT begins with collecting the operational, functional and environmental specifications of the product and identifying the potential operational, functional and environmental stresses unique to the product. The vital few stresses are then selected from the trivial many. The selected stresses are applied to the product gradually to measure its strength, and the critical components that fail at high stress levels are identified.
Failure analysis is then conducted on the critical to reliability (CTR) components, and the root causes of the problems are identified. Design modifications that would further increase the strength of the product are suggested. The implementation of this methodology in different fields is described in refs. [3, 4, 5 and 6].
In this study, a circuit breaker control system is used for the reliability analysis. The purposes of the study are to find the design limits, to identify the critical to reliability (CTR) component (or weak link) of the product, and to evaluate vendors. Failure analysis is conducted on the samples that failed in HALT to identify the affected subsystems/components and their failure modes. Corrective actions are identified and implemented to improve the destructive limit of the product. HALT is then repeated to validate the corrective actions. The HALT done in this second phase is also used for vendor evaluation, which is based on qualitative analysis.
2. PROPOSED METHODOLOGY FOR DESIGN RELIABILITY EVALUATION AND VENDOR SELECTION:
The proposed method for selecting manufacturing processes/vendors is based on qualitative analysis of the product by HALT. A detailed flow chart is shown in Figure 1, followed by a description of each step.
Figure 1: Block diagram of proposed manufacturing processes or vendor selection methodology
Methodology Description
1. The first step of the methodology is to predict the reliability of the design. Reliability prediction can be done either by using MIL standards or by using failure rates obtained from past failure data of similar products. The required inputs are the product or sub-assembly BOM, the product specification, and the reliability prediction standards. Reliability prediction calculates the MTBF or failure rate by considering the components used in the design, their failure data from the standards, and the stress conditions. Prediction therefore gives an approximate design reliability and identifies the CTR component as the one with the highest failure rate in the product.
2. The second step of the methodology is specification testing, i.e. the product is tested within its specification range to find patent failures with respect to the intended functions (features) and environmental conditions.
3. The third step of the methodology is the identification of the CTR component by highly accelerated life testing (HALT). Here the product is tested above and below the specification range to find latent failures that could emerge over time once the product is fielded. The operating limits that the product can withstand are also found in this step.
4. The final step of the methodology is a qualitative comparison of the manufacturing processes/vendors by conducting HALT on their products. In this step, the HALT outcomes are compared for products of the same design manufactured by different vendors. The vendor is then selected on the basis of higher operating limits and of the failures critical to the performance of the product.
3. VENDOR EVALUATION FOR CIRCUIT BREAKER CONTROL SYSTEM - A CASE STUDY
3.1. Product Description
The circuit breaker control system controls the CLOSE and OPEN operations of circuit breakers by monitoring and analysing incoming signals and user inputs. It recognizes and reports potential circuit breaker operation or maintenance requirements before they become critical. It monitors and displays grid phase voltages, line currents and grid frequency. It measures the breaker's operation timings, coil currents, pole discrepancy and contact wear, and displays them on the HMI unit. The circuit breaker control system can drive 3 close coils and 6 trip coils, which makes it suitable for breakers with double trip circuits. It can be divided into six modules: HMI, SMPS board, DSP controller, analog input card, digital I/O card, and analog digital I/O card. Figure 2 shows the functional block diagram of these modules.
The DSP controller board is the main CPU of the system. It receives, preprocesses and analyses the analog and digital signals, performs the necessary diagnostic calculations and communicates the results to the user through the HMI unit. The analog input (AI) board receives inputs such as grid voltage and current, fault current and coil current. It filters, scales and level-shifts these inputs and sends them to the DSP controller. The DSP controller board collects and measures signals from the AI board through an internal A/D converter and displays them on the LCD of the HMI unit. The digital input/output (DIO) board receives 24 V DC command inputs such as Remote CLOSE, Remote TRIP, TEST/SERVICE position, and Breaker CLOSE and Breaker OPEN status from the panel. The DIO board generates the Breaker CLOSE and Breaker OPEN commands for the R pole of the circuit breaker. The analog digital input output (ADIO) board receives 24 V commands from the 'Remote' panel or the 'Local' HMI and generates the Breaker CLOSE, Breaker TRIP and Auxiliary TRIP commands for the Y and B poles of the circuit breaker. The SMPS board receives a 125 V DC supply input from the station batteries and generates regulated supplies for all the other units. The HMI unit enables the user to operate the circuit breaker through its front panel keyboard and displays the results.
Figure 2: Block diagram of circuit breaker control system
3.2. Test Set-Up
A test setup was prepared for online monitoring of the product during testing. The setup is designed to simulate various functions of the circuit breaker control system (CBCS). It includes six relays (to simulate the open and close conditions of the breaker), a reset relay, a fault signal relay, an air pressure relay and an SF6 gas relay. The functions of these relays are described below.
Reset relay : Once closed, it sends the ON signal to the CBCS. The CBCS is then supposed to close the three coils of the breaker.
Fault signal relay : Once open, it sends the fault signal to the CBCS. The CBCS is then supposed to trip all six coils of the breaker.
SF6 gas relay : Once open, it sends the SF6 fault signal to the CBCS. The CBCS is then supposed to trip all six coils of the breaker and not reset them until the SF6 gas relay is closed again.
Air pressure relay : Once open, it sends the air pressure fault signal to the CBCS. The CBCS is then supposed to trip all six coils of the breaker and not reset them until the air pressure relay is closed again.
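The expected responses listed above can be captured in a small reference model against which observed behaviour is compared during testing. The following Python sketch is illustrative only: the class names, the state labels, and the rule that an open fault relay takes priority over a simultaneous reset are assumptions, not details of the actual test setup.

```python
from dataclasses import dataclass


@dataclass
class Relays:
    """Simulated relay contacts; True means the contact is closed."""
    reset: bool = False
    fault_signal: bool = True
    sf6_gas: bool = True
    air_pressure: bool = True


class ExpectedCbcs:
    """Hypothetical reference model of the expected CBCS response."""

    def __init__(self) -> None:
        self.state = "IDLE"  # one of "IDLE", "CLOSED", "TRIPPED"

    def update(self, relays: Relays) -> str:
        # An open SF6 or air pressure relay blocks any reset (lockout).
        lockout = not (relays.sf6_gas and relays.air_pressure)
        if lockout or not relays.fault_signal:
            # Assumption: an open fault relay wins over a reset request.
            self.state = "TRIPPED"   # all six trip coils driven
        elif relays.reset:
            self.state = "CLOSED"    # three close coils driven
        return self.state


model = ExpectedCbcs()
print(model.update(Relays(reset=True)))                 # CLOSED
print(model.update(Relays(reset=True, sf6_gas=False)))  # TRIPPED
print(model.update(Relays()))                           # TRIPPED
print(model.update(Relays(reset=True)))                 # CLOSED
```

In the third call no relay is open but no reset is requested, so the model stays tripped until the reset relay closes again.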
3.3. Methodology
Step 1: Reliability Prediction:
Reliability prediction for the control system was done using the MIL-HDBK-217F and Bellcore prediction standards. The information required for reliability prediction, i.e. the bill of material (BOM), the product specification and the component details, was collected from the product designer. The product has 5 electronic boards. The failure rate of each board is calculated separately to arrive at the system failure rate. The failure rates of the subsystems are shown as a bar chart in Figure 3.
Figure 3: Predicted failure rate of controller components by Lambda Predict
The controller card has the highest failure rate and is therefore the CTR component. The total system failure rate is the sum of all component failure rates, because all the components are in series.
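The series calculation can be sketched in a few lines of Python. The failure rates below are placeholders chosen for illustration, not the values plotted in Figure 3, and the choice of which five boards to list is an assumption. The MTBF is taken as the reciprocal of the failure rate, which holds under the constant failure rate assumption used by handbook prediction methods.

```python
# Hypothetical board failure rates in failures per million hours (FPMH).
# Placeholder values only; they are NOT the predictions shown in Figure 3.
board_fpmh = {
    "DSP controller": 4.0,
    "SMPS board": 2.5,
    "Analog input card": 1.5,
    "Digital I/O card": 1.0,
    "Analog digital I/O card": 1.0,
}


def series_failure_rate(rates):
    """Failure rate of a series system: the sum of the element rates."""
    return sum(rates.values())


def mtbf_hours(fpmh):
    """MTBF in hours for a constant failure rate given in FPMH."""
    return 1e6 / fpmh


def ctr_component(rates):
    """Element with the highest predicted failure rate."""
    return max(rates, key=rates.get)


system_fpmh = series_failure_rate(board_fpmh)
print(system_fpmh)                # 10.0
print(mtbf_hours(system_fpmh))    # 100000.0
print(ctr_component(board_fpmh))  # DSP controller
```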
Step 2: Specification testing:
Specification testing is conducted to verify the performance of the system against the design specifications. Three prototypes were developed by Crompton Greaves Ltd. R&D and subjected to specification testing. The simulation setup was used to monitor the performance of the product online. In specification testing, dry heat and damp heat tests were conducted according to the product specification, i.e. 0-50°C with 95% RH. Vibration and shock testing was conducted according to the IEC standard applicable to the product. All three samples passed the specification testing.
Step 3: HALT testing:
HALT stresses on the product simulate failure conditions in a short time. The failure modes obtained may take several months to appear under normal conditions [6]. For example, a failure of the display card IC that occurs in 2 weeks at 80°C may take several weeks to appear at 50°C and several months to appear at 30°C. While it is not reasonable to expect that the product will ever encounter such intense conditions, the same failure modes will be encountered at lower stress levels over much longer periods of time.
Figure 4: Step stress testing
HALT was conducted on all three prototypes by the step stress method shown in Figure 4. In this method, stresses are increased in steps, and at each step the functional test parameters are checked after the stress has stabilized. First, low temperature step stress was conducted, which verified that the product has an operating limit of -40°C. Second, high temperature step stress was conducted to find the operating limit at high temperature, but the product started malfunctioning at 55°C with 95% RH and recovered when the temperature was brought back down to 50°C. The high temperature test therefore showed that the product experiences a soft failure at 55°C with 95% RH.
The soft failure was found to be related to pole discrepancy. To find the root cause, the designers first analyzed the DIO card, which was found to be working correctly. The controller card was analyzed next, and a problem was found in the controller code. The software was modified and HALT was repeated to validate the modification; the product then worked correctly up to 110°C. Vibration, voltage, and damp heat + voltage HALT were also conducted. The operating limits under all conditions are shown in Table 1.
Table 1: Operating limits of circuit breaker control system in HALT
HALT stress type            | Operating limits
Cold temperature            | -40°C
High temperature            | 110°C
Vibration                   | 14 g, 20-2000 Hz
Voltage                     | 75-270 V
Cold temperature + voltage  | -40°C with 70 to 270 V
High temperature + voltage  | 100°C with 70 to 270 V
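The step stress procedure used in this step can be sketched as a simple loop. In the Python sketch below, `find_operating_limit` and the `functional_ok` callback are hypothetical names, the 5°C and 10°C step sizes and the tested ranges are assumed for illustration, and the lambdas stand in for the functional check performed on the test setup after each stress level has stabilized.

```python
def find_operating_limit(levels, functional_ok):
    """Apply stress levels in order and stop at the first functional failure.

    Returns (last_passing_level, first_failing_level). Either value is None
    if no level passed or no level failed within the tested range.
    """
    last_pass = None
    for level in levels:
        if not functional_ok(level):
            return last_pass, level
        last_pass = level
    return last_pass, None


# Illustrative stand-ins for the functional test on the monitored product.
hot = find_operating_limit(range(50, 155, 5), lambda t: t <= 110)
cold = find_operating_limit(range(0, -50, -10), lambda t: True)

print(hot)   # (110, 115)
print(cold)  # (-40, None)
```

A result with `None` as the second value means that no failure was observed within the tested range, so the reported limit is bounded by the range of the test rather than by the product.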
Step 4: Qualitative analysis of Manufacturing Processes/Vendors by results obtained through HALT
According to the literature [5-7], a product with higher operating limits will have higher reliability than another product of similar design and features when used in the same environment. We used this approach to evaluate vendors/manufacturing processes. The limitation of this methodology is that the magnitude of the reliability cannot be calculated quantitatively. It is a qualitative approach for finding the best vendor.
Table 2: Operating limits of circuit breaker control system manufactured by different vendors in HALT
Samples A1, A2: product from Manufacturer A; B1, B2: product from Manufacturer B; C1, C2: product from Manufacturer C
Stress type | A1 | A2 | B1 | B2 | C1 | C2
Voltage step stress | 75-270 V | 75-270 V | 75-270 V | 75-270 V | 75-270 V | 75-270 V
Low temperature step stress | -40°C | -40°C | -40°C | -40°C | -40°C | -40°C
High temperature and humidity step stress* | 100°C (soft failure) | 110°C (soft failure) | 65°C (soft failure) | 55°C (hard failure) | 85°C (soft failure) | 90°C (soft failure)
Combined voltage and low temperature | -40°C with 75-270 V | -40°C with 75-270 V | -40°C with 75-270 V | -- | -40°C with 75-270 V | -40°C with 75-270 V
Combined voltage and high temperature | 100°C with 100-270 V (soft failure at 100°C at 95 V) | 100°C with 75-270 V (soft failure at 105°C) | 60°C with 75-270 V (soft failure at 65°C) | -- | 85°C with 75-270 V (hard failure at 90°C) | 85°C with 90-270 V (soft failure at 85°C with 90 V)
Vibration testing | 14 g, 20-2000 Hz | 14 g, 20-2000 Hz | 14 g, 20-2000 Hz | -- | -- | 14 g, 20-2000 Hz
* There is no humidity at 100°C and above.
In our analysis, we selected 3 vendors (Manufacturer A, B and C), and each vendor manufactured 2 units of the product. Unique identification numbers were assigned as follows: A1 and A2 were manufactured by vendor A, B1 and B2 by vendor B, and C1 and C2 by vendor C. The same design and BOM were provided to all the vendors. Once the products were received, the following activities were conducted to evaluate the vendors.
1. Visual check of the PCB - All the components matched the BOM provided.
2. Functional testing - All 6 products worked satisfactorily.
3. HALT - Conducted on all samples manufactured by vendors A, B and C. The operating limits found from HALT are shown in Table 2.
From Table 2 it is clear that the samples from Manufacturer A withstand more stress than the samples from the other manufacturers. Manufacturer A can therefore be recommended for mass production.
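The comparison can be made explicit with a short sketch. The Python example below uses only the high temperature and humidity step stress row of Table 2. The ordering rule (fewer hard failures first, then the higher worst-sample limit) is an assumption introduced here for illustration; the evaluation in this study remains qualitative and considers all the stress types.

```python
# High temperature and humidity step stress results from Table 2:
# sample -> (limit in deg C, failure type observed at that limit)
results = {
    "A1": (100, "soft"), "A2": (110, "soft"),
    "B1": (65, "soft"),  "B2": (55, "hard"),
    "C1": (85, "soft"),  "C2": (90, "soft"),
}


def summarize(results):
    """Group sample results by vendor (first letter of the sample ID)."""
    vendors = {}
    for sample, (limit, kind) in results.items():
        entry = vendors.setdefault(sample[0], {"limits": [], "hard_failures": 0})
        entry["limits"].append(limit)
        entry["hard_failures"] += int(kind == "hard")
    return vendors


def rank(vendors):
    """Assumed rule: fewer hard failures, then higher worst-sample limit."""
    return sorted(
        vendors,
        key=lambda v: (vendors[v]["hard_failures"], -min(vendors[v]["limits"])),
    )


print(rank(summarize(results)))  # ['A', 'C', 'B']
```

Using the worst sample of each vendor rather than the average is a conservative choice, since a single weak unit indicates variation in the manufacturing process.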
4. CONCLUSION
In this paper, a methodology has been proposed for assessing the inherent design reliability of an electronic system and for sustaining that design reliability by selecting an appropriate vendor and manufacturing processes. The proposed analysis helps the designer identify improvement areas which, once implemented, can extend the useful life of the product. It also helps the designer understand the importance of vendor/process selection for improving the field performance of present and next generation products. The proposed methodology has been successfully implemented for an electronic system and is presented in detail using a circuit breaker control system as a case study.
5. ACKNOWLEDGEMENT
The authors would like to express their gratitude to the management of Crompton Greaves Ltd., for providing the necessary authorization and opportunity to present this paper.
6. REFERENCES
[1]. Charles E. Ebeling (2000), "An Introduction to Reliability and Maintainability Engineering", Tata McGraw-Hill, New Delhi.
[2]. Dmitri Kececioglu, "Reliability and Life Testing Handbook", Volume 2.
[3]. R. Munikoti and P. Dhar (1988), "Highly Accelerated Life Testing (HALT) for Multilayer Ceramic Capacitor Qualification", IEEE Trans. Components, Hybrids, and Manufacturing Technology, 11:342-345.
[4]. R. Confer, J. Canner, T. Trostle and S. Kurtz, "Use of Highly Accelerated Life Test (HALT) to Determine Reliability of Multilayer Ceramic Capacitors", IEEE.
[5]. Mahesh K. Chengalva, Ron A. Webster and Derek G. Packard (2004), "Simplified Highly-Accelerated Life Testing on Components for Product-level Vibration Reliability Enhancement", Inter Society Conference on Thermal Phenomena, 231-237.
[6]. Alvin Hsu, Danny LS Huang, Gerald Chang and Jimmy Yang, "Understanding HALT Application in Desktop, NB and Server", IEEE.