Krishnan J., Vijayaraghavan R RT&A, No 1 (77) PROCESS CAPABILITY ANALYSIS FOR NON-NORMAL DATA_Volume 19, March 2024
PROCESS CAPABILITY ANALYSIS FOR NON NORMAL DATA BASED ON BOX-COX TRANSFORMATION THROUGH TESTS OF GOODNESS OF FIT
J. Krishnan
Department of Mathematics, Sri Krishna Adithya College of Arts and Science, Coimbatore - 641042, Tamil Nadu, INDIA
R. Vijayaraghavan
Department of Statistics, Bharathiar University, Coimbatore - 641 046, Tamil Nadu, INDIA
Abstract
Process capability analysis is an effective and efficient tool for quality assurance. When the distribution of the underlying quality characteristic is not normal, modifications of the basic process capability indices are required. The literature on process control provides avenues to resolve the issue of non-normality, and data transformation is one of the approaches frequently applied in practice. Primarily, the Box-Cox transformation (BCT) is employed to transform non-normal data into normal data; it originally utilizes the method of maximum likelihood estimation (MLE) to find the single transformation parameter λ. There are alternative methods that estimate the optimal value of λ using goodness of fit tests rather than the MLE method. In order to obtain improved estimates, this paper makes a fresh attempt to carry out process capability analysis (PCA) using data transformed through different goodness of fit tests. The simulation study considers a variety of asymmetric behaviors of the Weibull distribution, generating a random sample of 100 data points in each case, to find the goodness of fit test that gives the best process capability estimates, which are compared to the six sigma standard for non-normal data. The final results show that the Shapiro-Wilk (SW) and artificial covariate (AC) methods perform well when compared to the method of MLE. Minitab software and the R programming language were utilized for data simulation and analysis.
Keywords: Goodness of fit tests, Box-Cox Transformation, Asymmetric, MLE, Weibull distribution, Six sigma.
1. Introduction
Process capability indices (PCIs), the statistical tools in quality control, are widely used to meet the required targets set in most manufacturing industries. Process capability analysis (PCA) addresses how well a manufacturing process meets the required specification. PCIs defined under normality assumptions cannot be used to accurately measure the performance of non-normal processes. Data transformation to achieve an approximately normal distribution has been recommended in [5]. The empirical study in [4] has demonstrated that the findings from transformed data are much superior to the results from the original data. The literature surveys demonstrate that, for non-normal distributions such as the lognormal and Weibull, the transformation methods perform well when compared to non-transformation (NT) methods and are considered consistently superior to them. Further, NT methods are found to be inadequate in capturing the capability of the process unless the underlying distribution is close to normal; they become unsatisfactory when the distribution deviates significantly from normal. See [15].
In PCA, the process variation is defined based on the standard deviation. The short-term and long-term variability may be addressed by the standard deviation estimated from the random sample observations, and such an estimate is used while computing the process capability. The short-term variability is considered for computing the process capability indices, whereas the long-term variability is taken for calculating the process performance indices. Hence, capability indices are calculated from samples of data based on short-term or within-group variation, whereas performance indices are calculated from all the data points using long-term or overall variation. The process capability indices are denoted by Cp and Cpk, and the process performance indices are denoted by Pp and Ppk. A detailed review of various methods, compared on their ability to handle non-normality in the computation of process capability indices, is presented in [13]. The most common and traditional indices applied by the manufacturing industry are the process capability index Cp and the process capability ratio Cpk, which are given in Table 1 along with the respective performance indices, where $\bar{x}$ is the sample mean, USL is the upper specification limit and LSL is the lower specification limit.
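Table 1: Process Capability and Process Performance Indices
Process capability indices | Process performance indices
$C_p = \dfrac{USL - LSL}{6\sigma_{within}}$ | $P_p = \dfrac{USL - LSL}{6\sigma_{overall}}$
$C_{pk} = \min(CPU, CPL)$ | $P_{pk} = \min(PPU, PPL)$
$CPU = \dfrac{USL - \bar{x}}{3\sigma_{within}}$, $CPL = \dfrac{\bar{x} - LSL}{3\sigma_{within}}$ | $PPU = \dfrac{USL - \bar{x}}{3\sigma_{overall}}$, $PPL = \dfrac{\bar{x} - LSL}{3\sigma_{overall}}$
Here $\sigma_{within}$ and $\sigma_{overall}$ denote the within-group (short-term) and overall (long-term) standard deviations, respectively.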
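As a rough illustration of Table 1, the following Python sketch computes the four indices from individual observations. The paper does not state how the within-group standard deviation is estimated; the sketch assumes the common average-moving-range estimator for individual data (mean moving range divided by d2 = 1.128), and the function names and the small data set are illustrative only.

```python
import statistics

# d2 constant for moving ranges of two consecutive observations (assumed estimator)
D2_MOVING_RANGE = 1.128


def sigma_within(data):
    """Short-term sigma from the average moving range of consecutive points."""
    moving_ranges = [abs(b - a) for a, b in zip(data, data[1:])]
    return statistics.fmean(moving_ranges) / D2_MOVING_RANGE


def capability_indices(data, lsl, usl):
    """Return Cp, Cpk (within sigma) and Pp, Ppk (overall sigma) as in Table 1."""
    mean = statistics.fmean(data)

    def spread_and_min(sigma):
        spread = (usl - lsl) / (6.0 * sigma)      # Cp or Pp
        upper = (usl - mean) / (3.0 * sigma)      # CPU or PPU
        lower = (mean - lsl) / (3.0 * sigma)      # CPL or PPL
        return spread, min(upper, lower)

    cp, cpk = spread_and_min(sigma_within(data))
    pp, ppk = spread_and_min(statistics.stdev(data))
    return {"Cp": cp, "Cpk": cpk, "Pp": pp, "Ppk": ppk}


# Illustrative data only; not taken from the paper
observations = [4.0, 6.0, 5.0, 7.0, 3.0]
for name, value in capability_indices(observations, lsl=0.0, usl=10.0).items():
    print(f"{name} = {value:.3f}")
```

Because the only difference between the two columns of Table 1 is the sigma estimate, a single helper is reused for both pairs of indices.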
According to [15], a better understanding is required of the Box-Cox transformation (BCT) and of its parameter estimation approach, which utilizes a search method, in order to estimate the process capabilities. In [17], a method of converting non-normal data into normal data for analysis with the process capability indices was proposed, together with an improved Box-Cox transformation model to deal with non-normal data and to calculate its process capability indices. In [1], the method of maximum likelihood estimation (MLE) was utilized for finding the ideal parameter λ in the Box-Cox transformation. Alternative methods to the MLE approach utilizing goodness of fit tests (normality tests) were developed in [3], [10] and [11]. By examining the effect of converting non-normal data into normal data with the use of different goodness of fit tests, it is demonstrated in [3] that the method of MLE in estimating the BCT parameter λ could be biased and ineffective. The competence of the different goodness of fit tests was also determined in [3] by various measures of errors and by estimates of PCI, PPI and defective parts per million (PPM) products.
In order to obtain improved estimates of PCIs and results within the six sigma standard, a new attempt is made in this paper to carry out process capability analysis by implementing different goodness of fit tests in BCT. The results of the different goodness of fit tests are recorded and presented to help the practitioner choose the method which will produce improved results in various asymmetric situations, viz., low, moderate and high. Thus, the objectives of this paper are to examine the effectiveness of the different goodness of fit tests in transforming non-normal data into normal data using BCT and to recommend a superior test that will produce higher values of process capability with minimum error and PPM values. The paper also verifies whether the proposed method produces results within the six sigma standard.
2. Methodology
Transforming non-normal data into normal data is one of the frequently used approaches in practice when the observed data do not satisfy the normality assumption. The approaches applied in practice to transform non-normal data into normal data include Johnson's system of transformation (JST), the Box-Cox transformation (BCT) and the Rosenblatt transformation (RT). Though the JST and BCT approaches are equally efficient, BCT would be preferred over JST for handling non-normal data when computer-assisted analysis is available, and it also outperforms the other methods. See [12]. Further, when compared with the JST method, the BCT method is more accurate and precise. BCT provides a family of power transformations that will optimally normalize a particular variable. As stated in [2], the BCT method transforms non-normal data into normal data for a positive response variable x as follows:
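$$
x^{(\lambda)} =
\begin{cases}
\dfrac{x^{\lambda} - 1}{\lambda}, & \text{for } \lambda \neq 0 \\
\log x, & \text{for } \lambda = 0
\end{cases}
\qquad (1)
$$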
It may be noted that, since an analysis of variance is unchanged by a linear transformation, expression (1) is equivalent to
$$
x^{(\lambda)} =
\begin{cases}
x^{\lambda}, & \text{for } \lambda \neq 0 \\
\log x, & \text{for } \lambda = 0
\end{cases}
\qquad (2)
$$
The estimation of λ is done through various goodness of fit tests for normality available in the literature, namely the Shapiro-Wilk (SW), Anderson-Darling (AD), Cramér-von Mises (CVM), Pearson chi-square (PC), Shapiro-Francia (SF), Lilliefors (Kolmogorov-Smirnov) (LT / KS) and Jarque-Bera (JB) tests, and the artificial covariate (AC) method. The BCT approach given in [2] involves the method of maximum likelihood estimation (MLE). Two alternative approaches, proposed in [10] and [11], respectively, considered the Box-Cox power transformation using maximization of the Shapiro-Wilk W statistic, which forces the data to be as close to normal as possible, and the Anderson-Darling test. In these approaches, the Newton-Raphson algorithm has been used to obtain λ. A method is proposed in [3] to simulate a single artificial and non-informative covariate and to find the λ that minimizes the sum of squares of errors among several simple linear regression models.
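The idea of choosing λ by a normality criterion can be sketched in Python as follows. This is a simplified illustration, not the procedure of [3], [10] or [11]: it uses a plain grid search instead of the Newton-Raphson algorithm, and it uses the Jarque-Bera statistic because it needs only sample moments; a Shapiro-Wilk or Anderson-Darling statistic could be substituted as the criterion. The profile log-likelihood is the standard Box-Cox form under normality. The grid range, the seed and all function names are assumptions made for the example.

```python
import math
import random


def box_cox(values, lam):
    """Equation (1) applied to a list of strictly positive values."""
    if min(values) <= 0:
        raise ValueError("Box-Cox requires strictly positive data")
    if abs(lam) < 1e-12:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def central_moment(values, order):
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def jarque_bera(values):
    """JB statistic n/6 * (S^2 + (K - 3)^2 / 4); smaller means closer to normal."""
    m2 = central_moment(values, 2)
    skewness = central_moment(values, 3) / m2 ** 1.5
    kurtosis = central_moment(values, 4) / m2 ** 2
    return len(values) / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


def profile_log_likelihood(values, lam):
    """Box-Cox profile log-likelihood (up to an additive constant)."""
    n = len(values)
    variance = central_moment(box_cox(values, lam), 2)
    log_sum = sum(math.log(v) for v in values)
    return -0.5 * n * math.log(variance) + (lam - 1.0) * log_sum


# Candidate lambda values -2.00, -1.99, ..., 2.00 (the range is an assumption)
GRID = [round(-2.0 + 0.01 * i, 2) for i in range(401)]


def lambda_by_jb(values, grid=GRID):
    """Lambda whose transformed data minimise the Jarque-Bera statistic."""
    return min(grid, key=lambda lam: jarque_bera(box_cox(values, lam)))


def lambda_by_mle(values, grid=GRID):
    """Lambda that maximises the profile log-likelihood on the same grid."""
    return max(grid, key=lambda lam: profile_log_likelihood(values, lam))


rng = random.Random(2024)  # arbitrary seed, for reproducibility only
# random.weibullvariate takes (scale, shape): here scale 2.0 and shape 1.8
sample = [rng.weibullvariate(2.0, 1.8) for _ in range(100)]
print("lambda (JB grid search): ", lambda_by_jb(sample))
print("lambda (MLE grid search):", lambda_by_mle(sample))
```

Keeping the criterion as a separate function makes it straightforward to compare several normality statistics on the same grid of candidate values.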
The results of the earlier studies presented in the literature, particularly in [1], [7], [10], [14], [16] and [18], would be useful to understand the significance of tests of goodness of fit while transforming non-normal data into normal data. It is shown in [10] that the test based on the SW statistic is a powerful test of normality for a variety of non-normal distributions, that the SW statistic is reliable for small samples and that, in regression applications, the statistic would yield a higher $R^2$. It is asserted in [7] that the test based on the SW statistic is the most powerful test for non-normal distributions.
According to [14], the JB test is preferable to the Shapiro-Wilk test when the data exhibit a symmetric distribution with medium or long tails, or a slightly skewed distribution with long tails. It is ascertained in [18] that the test based on the SW statistic is the best one for asymmetric distributions, is powerful for symmetric short-tailed distributions and has good power throughout a wide variety of asymmetric distributions. Based on the results of a simulation study provided in [1], it is found that all of the transformation approaches performed similarly to one another. One may refer to [9] and [19] for details on the concepts of six sigma tools and on process capability analysis for non-normal data, respectively.
3. Weibull Distribution
The Weibull distribution is applicable to a wide range of non-normal processes because it is capable of generating a variety of distinct curves depending on its parameters. It can exhibit pronounced tail behavior, which has a significant effect on the capability of the process. It is frequently utilized in quality and reliability applications to analyze failure data and to understand how failures take place or how often products fail.
The probability density function of a Weibull random variable is given by the following form:
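$$
f(x) =
\begin{cases}
\dfrac{\alpha}{\beta}\left(\dfrac{x}{\beta}\right)^{\alpha - 1} \exp\left[-\left(\dfrac{x}{\beta}\right)^{\alpha}\right], & x > 0 \\
0, & x \le 0
\end{cases}
$$
where $\alpha > 0$ and $\beta > 0$ are the shape and scale parameters, respectively.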
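The mean, the variance and the measure of skewness of the Weibull distribution are, respectively, given as follows:
$$E(X) = \mu = \beta\,\Gamma(1 + 1/\alpha)$$
$$V(X) = \sigma^2 = \beta^2\left[\Gamma(1 + 2/\alpha) - \left(\Gamma(1 + 1/\alpha)\right)^2\right]$$
$$Sk = \frac{\beta^3\,\Gamma(1 + 3/\alpha) - 3\mu\sigma^2 - \mu^3}{\sigma^3}$$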
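The moment formulas above can be evaluated directly with the gamma function. The short Python sketch below does so for shape α and scale β; the function name is illustrative. Note that these are theoretical values, whereas the skewness figures quoted for the simulated data sets in Section 4 are presumably sample values and need not coincide with them.

```python
import math


def weibull_moments(shape, scale):
    """Theoretical mean, variance and skewness of a Weibull(shape, scale) variable."""
    g1 = math.gamma(1.0 + 1.0 / shape)
    g2 = math.gamma(1.0 + 2.0 / shape)
    g3 = math.gamma(1.0 + 3.0 / shape)
    mean = scale * g1
    variance = scale ** 2 * (g2 - g1 ** 2)
    skewness = (scale ** 3 * g3 - 3.0 * mean * variance - mean ** 3) / variance ** 1.5
    return mean, variance, skewness


# The three (shape, scale) pairs used in the paper
for shape, scale in [(2.8, 3.5), (1.8, 2.0), (1.0, 1.3)]:
    mean, variance, skewness = weibull_moments(shape, scale)
    print(f"W({shape}, {scale}): mean={mean:.3f}, variance={variance:.3f}, skewness={skewness:.3f}")
```

For shape 1.0 the Weibull distribution reduces to the exponential distribution, whose skewness is exactly 2, which gives a convenient check of the formula.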
The Weibull distribution with three sets of shape and scale parameters, namely (2.8, 3.5), (1.8, 2.0) and (1.0, 1.3), is considered in [6]. These sets of parameters are used here to represent low, moderate and high asymmetric behaviors, respectively, for assessing the effectiveness of the transformation of non-normal data into normal data and of the subsequent process capability analysis. The shapes of the density function of the Weibull distribution for these sets of parameters are shown in Figure 1.
Figure 1: Asymmetric Behavior of Weibull Distribution
4. Numerical Illustrations
For the simulation set-up, a data set of size 100 is generated at each asymmetric level of the Weibull distribution. Minitab and R were utilized for data simulation and analysis. As given in [6], the lower and upper specification limits are taken as 0.0 and 10, respectively. A combination of the box plot, descriptive statistics, measures of errors such as bias, percentage bias, median absolute error (MdAE) and root mean square error (RMSE), and the radar chart can be used to assess the effectiveness of a method.
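The simulation set-up can be mimicked with the Python standard library, as sketched below. The paper used Minitab and R and does not report seeds, so the seed, the function names and the resulting sample skewness values here are illustrative and will not reproduce the paper's data sets.

```python
import random
import statistics


def simulate_weibull(shape, scale, size, seed):
    """Draw `size` Weibull observations with the paper's (shape, scale) convention."""
    rng = random.Random(seed)
    # random.weibullvariate expects the SCALE first and the SHAPE second
    return [rng.weibullvariate(scale, shape) for _ in range(size)]


def sample_skewness(data):
    """Moment-based sample skewness m3 / m2^(3/2)."""
    mean = statistics.fmean(data)
    m2 = statistics.fmean((x - mean) ** 2 for x in data)
    m3 = statistics.fmean((x - mean) ** 3 for x in data)
    return m3 / m2 ** 1.5


SCENARIOS = {"low": (2.8, 3.5), "moderate": (1.8, 2.0), "high": (1.0, 1.3)}
LSL, USL = 0.0, 10.0

for label, (shape, scale) in SCENARIOS.items():
    data = simulate_weibull(shape, scale, size=100, seed=1)  # seed is arbitrary
    outside = sum(1 for x in data if not LSL <= x <= USL)
    print(f"{label}: n={len(data)}, skewness={sample_skewness(data):.2f}, "
          f"outside specification={outside}")
```

Since a sample of 100 points gives a noisy skewness estimate, different seeds under the same parameters can yield noticeably different values, which may be why two skewness figures are reported for each parameter set.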
This paper considers only the measures of errors and radar plots. In particular, bias, MdAE and RMSE are computed while transforming non-normal data into normal data using different goodness of fit tests in the Box-Cox transformation. Once the transformation has been completed, the transformed data are used to estimate the process capability and process performance indices and to choose the most effective approach among the different goodness of fit tests. According to [8], a process is categorized as inadequate if PCI < 1.00; capable if 1.00 ≤ PCI < 1.33; satisfactory if 1.33 ≤ PCI < 1.50; excellent if 1.50 ≤ PCI < 2.00; and super if PCI ≥ 2.00. The automotive industry uses Cpk = 1.33 as a benchmark in assessing the capability of a process. If Cp and Cpk are greater than or equal to 2 and 1.5, respectively, a process is said to be under six sigma control. Similarly, Pp and Ppk must be at least 2 and 1.5, respectively, for a process to generate six sigma results. See [8].
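The classification rule quoted from [8] can be written as a small helper, sketched below in Python. Assigning each boundary value (1.00, 1.33, 1.50, 2.00) to the higher category is an assumption of this sketch, and the function names are illustrative.

```python
def classify_process(pci):
    """Map a capability index to the quality categories quoted from [8]."""
    if pci < 1.00:
        return "inadequate"
    if pci < 1.33:
        return "capable"
    if pci < 1.50:
        return "satisfactory"
    if pci < 2.00:
        return "excellent"
    return "super"


def meets_six_sigma(cp, cpk):
    """Six sigma criterion stated in the text: Cp >= 2 and Cpk >= 1.5."""
    return cp >= 2.0 and cpk >= 1.5


print(classify_process(1.07))       # capable
print(meets_six_sigma(1.29, 1.07))  # False
```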
In order to guarantee the quality of the final product and reduce the number of faulty items, quality practitioners also focus on PPM values. Table 2 lists the process fallout in defective parts per million, together with the percentage of good items, for various sigma levels. The main goal of quality and industry practitioners is to reach the 6σ limits, with which a defect rate of 3.4 PPM is associated. In addition, the process performance indices, namely Pp and Ppk, are utilized in industry, particularly in the automobile sector, as a second kind of estimator.
Table 2: Process Fallout in Defective Parts per Million with Respect to Different Sigma Levels
Sigma Level Percentage PPM Values
6 99.9997% 3.4
5 99.98% 233
4 99.4% 6,210
3 93.3% 66,807
2 69.1% 308,537
1 30.9% 691,462
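The PPM column of Table 2 appears consistent with the conventional six sigma reading in which the process mean is allowed a 1.5σ long-term shift and only the nearer tail is counted. This convention is not stated in the text, so it is an assumption of the following Python sketch, as is the function name.

```python
from statistics import NormalDist


def ppm_for_sigma_level(level, shift=1.5):
    """Defective parts per million for a given sigma level (one-sided, shifted mean)."""
    tail_probability = NormalDist().cdf(shift - level)  # P(Z > level - shift)
    return tail_probability * 1e6


for level in (6, 5, 4, 3, 2, 1):
    ppm = ppm_for_sigma_level(level)
    print(f"{level} sigma: {100.0 - ppm / 1e4:.4f}% good, {ppm:,.1f} PPM")
```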
4.1 Low Asymmetric Distribution
In this sub-section, the low asymmetric Weibull distribution with skewness values of 0.13 and 0.31, for the combination of shape and scale parameters 2.8 and 3.5, respectively, is taken for the simulation study. From the error point of view, the Bias, MdAE and RMSE values are smallest for the AD, CVM, SF, LT and PC goodness of fit tests, which indicates that the values transformed by these tests are closest to normal data. Table 3 and Figure 2 may be referred to for more information. From the estimation point of view, the transformed data are further used for the estimation of process capability and process performance. The transformed data sets from the SW, LT, AC and MLE methods show closeness to normality and produce better results when compared to the other methods. The PPM values corresponding to these methods range from a minimum of 656 to a maximum of 1939; they are better than the results for the 3σ and 4σ limits and closer to the 5σ standard. Tables 4 and 5 may be referred to for more information.
4.2 Moderate Asymmetric Distribution
A Weibull distribution with the shape and scale parameters fixed as 1.8 and 2.0, respectively, represents moderately asymmetric non-normal data, with skewness values of 0.64 and 0.94. In the simulation study, the Minitab transformation (M_T) brings the non-normal data much closer to normal, with minimum Bias, MdAE and RMSE values compared to the other methods; however, the corresponding PCI estimates are smaller and the PPM values higher than the benchmark results. Thus, the method of transformation using Minitab cannot be taken as a competent method. One may refer to Table 6 and Figure 3.
Table 3: Various Measures of Error Values for Low Asymmetric Data After Data Transformation
Methods
Low Asymmetry (SK=0.13) Weibull distribution (α=2.8, β=3.5)
Low Asymmetry (SK=0.31) Weibull distribution (α=2.8, β=3.5)
Bias MdAE RMSE Bias MdAE RMSE
SW 1.300 1.245 1.322 1.391 1.320 1.428
AD 1.226 1.184 1.240 1.335 1.273 1.363
CVM 1.226 1.184 1.240 1.246 1.200 1.263
PC 0.527 0.646 0.663 1.391 1.320 1.428
SF 1.271 1.221 1.289 1.363 1.297 1.396
LT 0.571 0.665 0.677 1.391 1.320 1.428
JB 1.285 1.233 1.306 1.377 1.309 1.412
AC 1.343 1.281 1.371 1.392 1.321 1.429
MLE 1.342 1.280 1.370 1.391 1.320 1.428
M_T * * * 1.434 1.345 1.706
* Transformation not done
Figure 2: Radar Chart for Various Measures of Errors After Normalization of Low Asymmetric Distribution
Table 4: Estimates of Process Capability and Process Performance Indices for W(2.8, 3.5) Distribution Having Sk = 0.13 After Normalization via Goodness of Fit Tests
Method λ Value LSL USL PCI (Within Capability) PPI (Overall Capability)
Cp Cpk PPM Pp Ppk PPM
W(2.8, 3.5) - 0 10 1.30 0.82 6828 1.27 0.81 7667
SW 0.75 -1.33 6.16 1.29 1.07 656 1.25 1.04 904
AD 0.79 -1.27 6.54 1.29 1.02 1066 1.25 1.00 1402
CVM 0.85 -1.18 7.15 1.28 0.96 2051 1.25 0.93 2543
PC 0.75 -1.33 6.16 1.29 1.07 656 1.25 1.04 904
SF 0.77 -1.30 6.35 1.29 1.05 841 1.25 1.02 1130
LT 0.75 -1.33 6.16 1.29 1.07 656 1.25 1.04 904
JB 0.76 -1.32 6.26 1.29 1.06 731 1.25 1.03 995
AC 0.75 -1.33 6.16 1.29 1.07 656 1.25 1.04 904
MLE 0.75 -1.33 6.16 1.29 1.07 656 1.25 1.04 904
M_T 0.50 0.00 3.16 1.42 1.28 66 1.36 1.22 127
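The transformed specification limits in Table 4 appear to follow from applying the transformation to LSL = 0 and USL = 10 with each method's λ: the goodness of fit and MLE rows match form (1), while the M_T row matches the simple power form (2). This reading is inferred from the tabulated values rather than stated in the text, and the function names in the Python sketch below are illustrative.

```python
import math


def box_cox_eq1(x, lam):
    """Form (1): (x**lam - 1) / lam; x = 0 is allowed only when lam > 0."""
    if x < 0 or (x == 0 and lam <= 0):
        raise ValueError("x must be positive (or zero with lam > 0)")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def power_eq2(x, lam):
    """Form (2): x**lam; x = 0 is allowed only when lam > 0."""
    if x < 0 or (x == 0 and lam <= 0):
        raise ValueError("x must be positive (or zero with lam > 0)")
    if lam == 0:
        return math.log(x)
    return x ** lam


LSL, USL = 0.0, 10.0

# SW row of Table 4 (lambda = 0.75): limits reported as -1.33 and 6.16
print(round(box_cox_eq1(LSL, 0.75), 2), round(box_cox_eq1(USL, 0.75), 2))

# M_T row of Table 4 (lambda = 0.50): limits reported as 0.00 and 3.16
print(round(power_eq2(LSL, 0.50), 2), round(power_eq2(USL, 0.50), 2))
```

Both forms lead to the same capability indices, because the indices are unchanged when the data and the specification limits undergo the same linear rescaling.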
Table 5: Estimates of Process Capability and Process Performance Indices for W(2.8, 3.5) Distribution Having Sk = 0.31 After Normalization via Goodness of Fit Tests
Method λ Value LSL USL PCI (Within Capability) PPI (Overall Capability)
Cp Cpk PPM Pp Ppk PPM
W(2.8, 3.5) - 0 10 1.27 0.80 8026 1.32 0.83 6362
SW 0.81 -1.23 6.74 1.24 0.96 1939 1.29 1.00 1389
AD 0.86 -1.16 7.26 1.25 0.91 3051 1.29 0.95 2259
CVM 0.86 -1.16 7.26 1.25 0.91 3051 1.29 0.95 2259
PC 1.24 -0.81 13.21 1.36 0.67 22553 1.41 0.69 19197
SF 0.83 -1.20 6.94 1.24 0.94 2351 1.29 0.98 1708
LT 1.22 -0.82 12.78 1.35 0.86 21189 1.40 0.70 17959
JB 0.82 -1.22 6.84 1.24 0.95 2106 1.29 0.99 1518
AC 0.78 -1.28 6.44 1.24 1.00 1402 1.29 1.03 981
MLE 0.78 -1.28 6.44 1.24 1.00 1407 1.29 1.03 985
M_T - 0 10 1.27 0.80 8026 1.32 0.83 6362
Table 6: Various Measures of Error Values for Moderate Asymmetric Data After Data Transformation
Methods
Moderate Asymmetry (SK=0.64) Weibull distribution (α=1.8, β=2.0)
Moderate Asymmetry (SK=0.94) Weibull distribution (α=1.8, β=2.0)
Bias MdAE RMSE Bias MdAE RMSE
SW 1.204 1.108 1.231 1.271 1.137 1.321
AD 1.195 1.102 1.219 1.255 1.127 1.301
CVM 1.175 1.090 1.195 1.247 1.122 1.290
PC 1.282 1.156 1.326 1.192 1.091 1.221
SF 1.201 1.106 1.227 1.271 1.137 1.321
LT 1.223 1.118 1.253 1.271 1.137 1.321
JB 1.211 1.111 1.238 1.271 1.137 1.321
AC 1.207 1.110 1.234 1.282 1.143 1.335
MLE 1.207 1.110 1.234 1.283 1.143 1.336
M_T 0.420 0.304 0.703 0.524 0.383 0.863
[Radar charts for W(1.8, 2.0) with Sk = 0.64 and Sk = 0.94, showing Bias, MdAE and RMSE for each method]
Figure 3: Radar Chart for Various Measures of Errors After Normalization of Moderate Asymmetric Distribution
Apart from the M_T transformation, the CVM, AD, SF, AC and SW methods of transformation produce smaller errors, and the PC, LT, JB, AC, MLE and SW methods yield the target results in the estimation of process capability and process performance indices along with the minimum PPM values. For the moderate asymmetric situations, the minimum and maximum PPM values were recorded as 81 and 241, respectively. The goodness of fit tests in the estimation of process capability for the moderate asymmetric distribution show better results than the 3σ, 4σ and 5σ limits and approach the 6σ standard. One may also refer to Tables 7 and 8 for more information.
Table 7: Estimates of Process Capability and Process Performance Indices for W(1.8, 2.0) Distribution Having Sk = 0.64 After Normalization via Goodness of Fit Tests
Method λ Value LSL USL PCI (Within Capability) PPI (Overall Capability)
Cp Cpk PPM Pp Ppk PPM
W(1.8, 2.0) - 0 10 1.79 0.59 37568 1.80 0.60 36938
SW 0.45 -2.22 4.04 1.44 1.23 110 1.44 1.23 114
AD 0.48 -2.08 4.21 1.44 1.16 252 1.43 1.16 259
CVM 0.54 -1.85 4.57 1.43 1.04 900 1.43 1.04 914
PC 0.19 -5.26 2.89 2.02 1.25 92 2.01 1.24 99
SF 0.46 -2.17 4.10 1.44 1.21 149 1.44 1.20 154
LT 0.39 -2.56 3.73 1.48 1.41 14 1.48 1.40 15
JB 0.43 -2.33 3.93 1.45 1.29 56 1.45 1.28 59
AC 0.44 -2.27 3.99 1.45 1.26 81 1.45 1.25 84
MLE 0.44 -2.27 3.99 1.45 1.26 81 1.45 1.25 84
M_T 0.50 0 3.16 1.43 1.12 398 1.43 1.12 408
Table 8: Estimates of Process Capability and Process Performance Indices for W(1.8, 2.0) Distribution Having Sk = 0.94 After Normalization via Goodness of Fit Tests
Method λ Value LSL USL PCI (Within Capability) PPI (Overall Capability)
Cp Cpk PPM Pp Ppk PPM
W(1.8, 2.0) - 0 10 1.50 0.54 51629 1.54 0.56 47940
SW 0.43 -2.33 3.93 1.28 1.17 241 1.32 1.21 151
AD 0.47 -2.13 4.15 1.26 1.08 623 1.30 1.11 428
CVM 0.49 -2.04 4.27 1.26 1.04 949 1.30 1.07 674
PC 0.62 -1.61 5.11 1.26 0.84 6101 1.30 0.86 4922
SF 0.43 -2.33 3.93 1.28 1.17 241 1.32 1.21 154
LT 0.43 -2.33 3.93 1.28 1.17 241 1.32 1.21 154
JB 0.43 -2.33 3.93 1.28 1.17 241 1.32 1.21 154
AC 0.40 -2.50 3.78 1.30 1.25 118 1.34 1.29 70
MLE 0.40 -2.50 3.78 1.30 1.25 118 1.34 1.29 70
M_T 0.50 0 3.16 1.26 1.02 1143 1.30 1.05 822
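The PPM columns in the capability tables are close to the expected fallout of a normal distribution with the tabulated indices. The Python sketch below assumes that reading (it is not confirmed in the text) and uses the identity CPU + CPL = 2Cp, which follows from the definitions in Table 1, to recover the distance to the farther specification limit. The function name is illustrative.

```python
from statistics import NormalDist


def expected_ppm(cp, cpk):
    """Expected defective PPM under normality, given Cp and Cpk (or Pp and Ppk)."""
    if cpk > cp:
        raise ValueError("Cpk cannot exceed Cp")
    far_side_index = 2.0 * cp - cpk          # the larger of CPU and CPL
    std_normal = NormalDist()
    near_tail = std_normal.cdf(-3.0 * cpk)
    far_tail = std_normal.cdf(-3.0 * far_side_index)
    return (near_tail + far_tail) * 1e6


# SW row of Table 7: Cp = 1.44, Cpk = 1.23, reported PPM = 110
print(f"{expected_ppm(1.44, 1.23):.0f}")
```

Small differences from the tabulated PPM values are to be expected, since the indices in the tables are rounded to two decimals.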
4.3 High Asymmetric Distribution
A Weibull distribution with the shape and scale parameters fixed as 1.0 and 1.3, respectively, represents highly asymmetric non-normal data, with skewness values of 1.35 and 1.76. Among the different methods, the Minitab transformation (M_T) brings the non-normal data much closer to normal, with minimum Bias, MdAE and RMSE values when compared to the other methods, but the corresponding PCI estimates are smaller and the PPM values higher than the standard requirements. Therefore, the method of transformation using Minitab (M_T) cannot be taken as an effective method. One may refer to Table 9 and Figure 4 for more information. From the point of view of errors, after transforming non-normal data into normal data using different goodness of fit tests, the LT, SF, AC, SW, PC and AD methods produce fewer errors. Moreover, methods such as AC, JB, SW, AD and MLE yield better estimates of process capability and process performance along with smaller PPM values. In this case, the minimum and maximum PPM values are recorded as 740 and 3075, respectively. The goodness of fit tests in the estimation of process capability for the high asymmetric distribution show that the process is better than the 3σ and 4σ limits and approaches the 5σ standard. One may refer to Tables 10 and 11 for more information.
Table 9: Various Measures of Error Values for High Asymmetric Data After Data Transformation
Methods
High Asymmetry (SK=1.35) Weibull distribution (α=1.0, β=1.3)
High Asymmetry (SK=1.76) Weibull distribution (α=1.0, β=1.3)
Bias MdAE RMSE Bias MdAE RMSE
SW 1.473 1.261 1.584 1.382 1.165 1.474
AD 1.480 1.265 1.593 1.414 1.174 1.519
CVM 1.486 1.269 1.602 1.414 1.174 1.519
PC 1.363 1.196 1.442 1.490 1.198 1.641
SF 1.466 1.257 1.576 1.376 1.163 1.465
LT 1.440 1.241 1.542 1.364 1.159 1.448
JB 1.493 1.273 1.611 1.382 1.165 1.474
AC 1.479 1.265 1.593 1.382 1.164 1.472
MLE 1.480 1.265 1.593 1.382 1.165 1.474
M_T 0.536 0.466 1.308 0.237 0.369 0.966
Figure 4: Radar Chart for Various Measures of Errors after Normalization of High Asymmetric Distribution
5. Results and Discussion
Data transformation and estimation of process capability are the two aspects considered in this section. The effectiveness of the different goodness of fit tests is determined by various measures of errors such as Bias, MdAE and RMSE. Based on the numerical illustrations provided in the previous section, it is found that the AD and CVM tests produce smaller errors in low and moderate asymmetric situations, the SW and SF tests yield considerably smaller errors in the case of moderate and high asymmetric behaviors, and the LT and AC tests perform better only in high asymmetric situations. Similarly, the PC, LT, JB, direct Minitab estimation (DME) and M_T methods yield better estimates, but give greater PPM values while estimating process capability and process performance indices.
Table 10: Estimates of Process Capability and Process Performance Indices for W(1.0, 1.3) Distribution Having Sk = 1.35 After Normalization via Goodness of Fit Tests
Method λ Value LSL USL PCI (Within Capability) PPI (Overall Capability)
Cp Cpk PPM Pp Ppk PPM
W(1.0, 1.3) - 0 10 1.42 0.34 156902 1.39 0.33 160815
SW 0.26 -3.85 3.15 1.14 1.09 744 1.13 1.08 821
AD 0.21 -4.76 2.96 1.22 1.01 1248 1.21 1.00 1316
CVM 0.21 -4.76 2.96 1.22 1.01 1248 1.21 1.00 1316
PC 0.1 -10.0 2.59 1.82 0.84 6000 1.83 0.84 5871
SF 0.27 -3.70 3.19 1.12 1.10 770 1.11 1.09 856
LT 0.29 -3.45 3.28 1.11 1.07 956 1.09 1.06 1071
JB 0.26 -3.85 3.15 1.14 1.09 744 1.13 1.08 821
AC 0.26 -3.85 3.15 1.14 1.09 740 1.13 1.08 817
MLE 0.26 -3.85 3.15 1.14 1.09 744 1.13 1.08 821
M_T 0.28 0 1.90 1.11 1.11 834 1.10 1.10 932
Table 11: Estimates of Process Capability and Process Performance Indices for W(1.0, 1.3) Distribution Having Sk = 1.76 After Normalization via Goodness of Fit Tests
Method λ Value LSL USL PCI (Within Capability) PPI (Overall Capability)
Cp Cpk PPM Pp Ppk PPM
W(1.0, 1.3) - 0 10 1.12 0.35 148540 1.15 0.36 142686
SW 0.29 -3.45 3.28 1.00 0.95 3033 0.99 0.94 3397
AD 0.28 -3.57 3.23 1.01 0.94 3075 0.99 0.93 3459
CVM 0.27 -3.70 3.19 1.02 0.93 3136 1.01 0.91 3539
PC 0.46 -2.17 4.10 0.92 0.69 19756 0.92 0.69 19840
SF 0.30 -3.33 3.32 0.99 0.96 3173 0.98 0.95 3533
LT 0.34 -2.94 3.49 0.95 0.90 4652 0.95 0.90 5012
JB 0.26 -3.85 3.15 1.04 0.92 3265 1.02 0.90 3694
AC 0.28 -3.57 3.23 1.01 0.94 3075 0.99 0.93 3458
MLE 0.28 -3.57 3.23 1.01 0.94 3075 0.99 0.93 3458
M_T 0.24 0.00 1.74 1.06 0.90 3639 1.05 0.88 4132
Thus, the methods that give greater PPM values will not be regarded as a useful way to evaluate the capability or performance of the process, whereas the SW, AC, SF and MLE methods produce superior results, with better estimates and smaller PPM values, when compared to the other and traditional methods. A small PPM value generally assures that fewer items will be rejected, and it must be lower than the benchmark values to obtain six sigma results. On the basis of the numerical illustrations, it can be observed that, in low asymmetric behaviors, the different tests of goodness of fit give better performance (656 as the minimum and 1939 as the maximum PPM values) than the typical PPM values of the 3σ and 4σ limits and come very close to the outcome of the 5σ limits.
The PPM values for moderately asymmetric conditions are found to be 81 and 241 as minimum and maximum values, respectively. This outcome surpasses the 3σ, 4σ and 5σ limits and approaches the 6σ benchmark. The minimum and maximum PPM values of 740 and 3075 would ensure that the process is better than the 3σ and 4σ limits under high asymmetric circumstances. One may refer to Table 12 for a better understanding of the efficiency of the different normality tests under various asymmetric behaviors while dealing with non-normal quality characteristics, based on the numerical examples, results and discussion.
Table 12: Efficiency of Various Tests of Goodness of Fit in Data Transformation and Estimation of Process Capability and Process Performance Indices for Weibull Distribution
Different Asymmetric Levels Efficiency in data transformation Efficiency in estimation of PCI/PPI
Low Asymmetric Moderate Asymmetric High Asymmetric Low Asymmetric Moderate Asymmetric High Asymmetric
Skewness 0.13 0.31 0.64 0.94 1.35 1.76 0.13 0.31 0.64 0.94 1.35 1.76
SW ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
AD ✓ ✓ ✓ ✓ ✓
CVM ✓ ✓ ✓ ✓
PC ✓ ✓ ✓ ✓* ✓* ✓*
SF ✓ ✓ ✓ ✓ ✓ ✓ ✓
LT ✓ ✓ ✓ ✓ ✓ ✓* ✓ ✓
JB ✓ ✓ ✓ ✓ ✓*
AC ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
MLE ✓ ✓ ✓ ✓ ✓ ✓
M_T @ ✓ ✓ ✓ ✓* ✓*
DME ✓$ ✓$ ✓$ ✓$ ✓$ ✓$
DME - Direct Minitab Estimation | @ - No transformation done | ✓ - Less error and/or better estimates and smaller PPM values | ✓* - Produces less error but higher PPM values | ✓$ - Produces better estimates but higher PPM values.
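To place a reported PPM value on the sigma scale of Table 2, the mapping can be inverted, as in the Python sketch below. It assumes the same one-sided convention with a 1.5σ shift that reproduces Table 2; this convention and the function name are assumptions of the sketch, not statements from the paper.

```python
from statistics import NormalDist


def sigma_level(ppm, shift=1.5):
    """Approximate sigma level for a defect rate in parts per million."""
    if not 0.0 < ppm < 1e6:
        raise ValueError("ppm must lie strictly between 0 and 1,000,000")
    return NormalDist().inv_cdf(1.0 - ppm / 1e6) + shift


# Minimum and maximum PPM values reported for each asymmetry level
reported = {"low": (656, 1939), "moderate": (81, 241), "high": (740, 3075)}
for label, (best, worst) in reported.items():
    print(f"{label}: {sigma_level(worst):.2f} to {sigma_level(best):.2f} sigma")
```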
6. Conclusion
Process capability analysis is important for any production process and useful for its continuous improvement. This study compares the ability of various tests of goodness of fit with that of the method of maximum likelihood in estimating the parameter involved in the Box-Cox transformation. Primarily, the effectiveness of the tests of goodness of fit in transforming non-normal data into normal data is assessed through various measures of errors along with a radar chart. Based on the numerical examples, it is observed that, regardless of the formulas used, the estimates of process capability and process performance indices approximately match. It is to be noted that the performance of process capability analysis for non-normal data depends on the choice of variation taken into account. Further, the transformed data are used to estimate process capability and process performance in order to identify the effective methods for non-normal quality characteristics. As per the results and discussion, one may observe that the measures of errors and the estimates of PCI, PPI and PPM values from the SW, AC, SF and MLE methods show higher accuracy in data transformation, greater power in estimating process capability or process performance, and smaller PPM values in all asymmetric situations.
With regard to the research problem, the SW test outperforms the other tests in transforming non-normal data into normal data and in estimating process capability and performance with smaller PPM values in all the asymmetric situations. However, other methods such as the AC and MLE methods can also be considered for handling non-normal quality characteristics, as they produce considerably good results. Application of different goodness of fit tests to carry out PCA yields smaller PPM values and better results than the 3σ, 4σ and 5σ limits. Implementing goodness of fit tests further helps to obtain results that are closer to the six sigma standard than those of the traditional MLE method. Thus, the current MLE technique could be effectively substituted by goodness of fit tests in the Box-Cox transformation to achieve the desired results in estimating process capability.
References
[1] Asar, O., Ilk, O., and Dag, O. (2017). Estimating Box-Cox Power Transformation Parameter via Goodness-of-Fit Tests, Communications in Statistics - Simulation and Computation, 46, 91 - 105.
[2] Box, G. E. P., and Cox, D. R. (1964). An Analysis of Transformations, Journal of the Royal Statistical Society: Series B (Methodological), 26, 211 - 243.
[3] Dag O., Asar, O., and Ilk, O. (2014). A Methodology to Implement Box-Cox Transformation When No Covariate is Available, Communications in Statistics - Simulation and Computation, 43, 1740 - 1759.
[4] Gunter, B. H. (1989). The Use and Abuse of Cpk, Quality Progress, 22, 108 - 109.
[5] Kane, V. E. (1986). Process Capability Indices, Journal of Quality Technology, 18, 41 - 52.
[6] Kashif, M., Aslam, M., Al Marshadi, A. H., and Jun, C-H. (2017). Evaluation of Modified Non-normal Process Capability Index and Its Bootstrap Confidence Intervals, IEEE Access, 5, 12135 - 12142.
[7] Oztuna, D., Elhan, A., and Tuccar, E. (2006). Investigation of Four Different Normality Tests In Terms of Type 1 Error Rate and Power under Different Distributions, Turkish Journal of Medical Sciences, 36, 171 - 176.
[8] Pearn, W. L., and Chen, K. -S. (2002). One-sided Capability Indices CPU and CPL: Decision Making with Sample Information, International Journal of Quality & Reliability Management, 19, 221 - 245.
[9] Pyzdek, T. (2003). The Six Sigma Handbook: A Complete Guide for Green Belts, Black Belts, and Managers at All Levels, McGraw-Hill Inc., New York.
[10] Rahman, M. (1999). Estimating the Box-Cox Transformation via Shapiro-Wilk W Statistic, Communications in Statistics - Simulation and Computation, 28, 223 - 241.
[11] Rahman, M., and Pearson, L. M. (2008). Anderson-Darling statistic in Estimating the Box-Cox Transformation Parameter, Journal of Applied Probability & Statistics, 3, 23 - 35.
[12] Sennaroglu, B., and Senvar, O. (2015). Performance Comparison of Box-Cox Transformation and Weighted Variance Methods with Weibull Distribution, Journal of Aeronautics and Space Technologies, 8, 49 - 55.
[13] Tang, L. C., and Than, S. E. (1999). Computing Process Capability Indices for Non-normal Data: A Review and Comparative Study, Quality and Reliability Engineering International, 15, 339 - 353.
[14] Thadewald, T., and Buning, H. (2007). Jarque-Bera Test and its Competitors for Testing Normality - A Power Comparison, Journal of Applied Statistics, 34, 87 - 105.
[15] Swamy, D. R., Nagesh, P., and Wooluru, Y. (2016). Process Capability Indices for Non-normal Distribution - A Review, Proceedings of the International Conference on Operations Research and Management, January 21 - 22, Mysuru, India.
[16] Wooluru, Y., Swamy, D. R., and Nagesh, P. (2016). Process Capability Estimation for Non-normally Distributed Data using Robust Methods - A Comparative Study, International Journal of Quality Research, 10, 407 - 420.
[17] Yang, Y., and Zhu, H. (2018). A Study on Non-normal Process Capability Analysis based on Box-Cox Transformation, Proceedings of the 3rd International Conference on Computational Intelligence and Applications (ICCIA), Hong Kong, China, IEEE, 240 - 243.
[18] Yap, B. W., and Sim, C. H. (2011). Comparisons of Various Types of Normality Tests, Journal of Statistical Computation and Simulation, 81, 2141 - 2155.
[19] Yoap, T. (2006). Process Capability Analysis for Non-normal Data with Minitab, In: Six Sigma: Advanced Tools for Black Belts and Master Black Belts, Eds. Tang, L. C., Goh, T. N., Yam, H. S., and Yoap, T., 131 - 149, John Wiley & Sons Ltd., The Atrium, England.