
Dr Bachioua Lahcene

Department of Basic Sciences, Preparatory Year, University of Hail, Hail, Saudi Arabia, P.O. Box 2440. E-mail: drbachioua@gmail.com

Some Distribution Systems for Process Capability Analysis with Non-Normal Data used in Quality Control

Abstract. In this study, some important issues about process capability and performance have been highlighted, particularly when the distribution of a process characteristic is non-normal. Process capability and performance analysis has become an inevitable step in the quality management of modern industrial processes. Determination of the performance capability of a stable process using the standard process capability indices (Cp, Cpk) requires that the quality characteristics of the underlying process data follow a normal distribution. It is becoming more critical than ever to assess precisely the process losses due to non-compliance with customer specifications. To assess these losses, industry makes wide use of process capability indices.

Deviations from this normality assumption can lead to erroneous results when applying conventional statistical capability measures that are based on it. Many researchers have investigated solutions to the non-normality problem. When the data do not obey a normal distribution, the key issue in this analysis is to obtain a correct estimate of process performance.

When the distribution of a process characteristic is non-normal, process capability indices calculated using conventional methods can often lead to an erroneous and misleading interpretation of the process's capability. Typically, the process is assumed to follow a normal probability distribution, ensuring that a high percentage of the process measurements fall between ±3σ of the process mean, with the total spread amounting to about 6σ of variation. This article describes the estimation of Cp and Cpk, the commonly used process capability indices (PCI), in the case of non-normal data, using the characteristics of the Pearson, Burr, Johnson, and Tukey distribution systems.

Keywords: generator of random numbers, non-normal data distributions, Pearson distribution system, skewness and kurtosis, third- and fourth-order moments, process capability, quality control

1. Introduction

There are two types of non-normal data. The first type fits some known distribution such as the lognormal, Weibull, exponential, or gamma distribution. The second type fits a mixture of multiple distributions or processes, in which case it may be possible to transform the data so that it follows the normal distribution. Normally distributed data can become discrete and non-normal when errors in the data are rounded off or when a measuring device with poor resolution is employed. To overcome this problem with non-normal data, a more accurate measurement system should be employed.

Normally distributed data have a small percentage of extreme values. A large number of extreme values in a data set results in a skewed distribution; in such a case, normality may be achieved by determining the reason for the errors. These errors are typically due to measurement errors or data entry errors. Eliminating the extreme values may restore the normality of the data, but it is important to identify the causes of the errors before eliminating them. In some instances, data may not be normally distributed because they arise from more than one process, operator, or shift, or from a process that frequently shifts. When two or more normally distributed data sets overlap, the data may appear bimodal or multimodal (i.e., they will have two or more most frequent values).

There are many data types that follow a non-normal distribution. If a process has many values close to zero or to a natural limit, the data distribution will be skewed to the right or to the left. In such a case, a transformation such as the Box-Cox power transformation may help in obtaining normal data. There are six common reasons for non-normality of data: the data follow a different distribution, values close to zero or a natural limit, insufficient data discrimination, sorted data, extreme values, and overlap of two or more processes.

The standard process capability analysis is based on three important assumptions: (i) the process itself is in control, (ii) the target value and specifications of the quality characteristics are clearly specified, and (iii) the measured quality characteristics follow a normal distribution. When the data are not normally distributed, capability analysis can still produce useful results by using non-parametric indices, or by transforming the data so that they conform better to a normal distribution than in their original form. To find whether a process is in control, quality control charts such as R and X-bar charts can be used.

Earlier, Lovelace and Swain (2009) estimated quantiles by the probability plotting technique and then used control limits to determine whether the process is in statistical control [12]. When non-normal data are encountered, there are two approaches one may use to perform capability analysis. The first approach is to select a non-normal distribution model that fits the non-normal data and then analyze the data using a capability analysis for non-normal data. The other approach is to transform the data so that they fit the normal distribution and then analyze the data using a capability analysis for normal data.

Non-normally distributed data can be transformed to normality. However, the transformation used should preserve the relative ordering of the values, and should not alter the distances between successively ordered values in an uncontrolled way that distorts the overall shape of the distribution. The capability analysis of non-normal data is then carried out in the same manner as the analysis on data with a normal distribution: the capability is determined by comparing the width of the process variation to the width of the specification.
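The transform-then-analyze approach above can be sketched with the Box-Cox power transformation. The snippet below is a minimal illustration, not the article's own computation: the data set and the specification limit are hypothetical values chosen for the example, and SciPy's Box-Cox routines are assumed to be available.

```python
import numpy as np
from scipy import stats, special

rng = np.random.default_rng(1)

# Hypothetical right-skewed process data (e.g., a lognormal characteristic).
data = rng.lognormal(mean=0.0, sigma=0.5, size=500)

# Box-Cox requires strictly positive data; stats.boxcox also estimates the
# power lambda that brings the data closest to normality.
transformed, lam = stats.boxcox(data)

# The same power must be applied to the specification limits so that the
# capability study is carried out on a consistent scale.
usl = 4.0                                  # hypothetical upper specification limit
usl_transformed = special.boxcox(usl, lam)

print(f"estimated lambda: {lam:.3f}")
print(f"skewness before: {stats.skew(data):.3f}, after: {stats.skew(transformed):.3f}")
```

Because the power transformation is monotone, the ordering of the observations is preserved, so percentiles computed on the transformed scale map back to the original scale through the inverse transformation.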

Many non-normal distributions can be used to model a response. If an alternative to the normal distribution is viable, then the exponential, lognormal, and Weibull distributions usually work. The extreme value and gamma distributions also have their applications, but if none of the normal, exponential, lognormal, or Weibull distributions provides a good fit, the data may be a mixture of a number of different populations. In that case, the data collector needs to separate the data by population and analyze each individually [10].

2. Problem Questions

A few questions arise when analyzing non-normal data: Is the data slightly, moderately, or extremely non-normally distributed? Does the non-normal distribution appear bimodal, multimodal, or skewed? Do all data values fall within three standard deviations of the mean? Do special rules have to be adopted for non-normal distributions, and if so, how should such data be handled?

3. Problem Definition

Before the advent of computers, there was only one practical way to deal with data that were not distributed normally. Somerville and Montgomery studied the errors that can occur in calculating Cp or Cpk for non-normal distributions. They concluded that for the four non-normal distributions studied (the t, gamma, lognormal, and Weibull distributions), the magnitude of the error can vary substantially depending on the true (unknown) distribution parameters [15].

Hence, if capability indices based on the normality assumption are used to deal with non-normal observations, the values of the capability indices may be incorrect and quite likely misrepresent the actual product quality [14]. Note that analysis of the paint thickness data reveals that the process did not always follow a normal distribution pattern. For normally distributed data, the process capability indices Cpk or Ppk can be used without any problem. However, pooling normal and non-normal data together and treating them as the same is an inappropriate approach.

Although non-normal distribution approximations from the Pearson, Burr, Johnson, and Tukey systems may all conform to the paint thickness data, when the process capability indices are calculated for the same process characteristic (i.e., paint thickness), different Cpk values are obtained depending on which non-normal distribution is assumed. The main problem with the distribution of some types of measurement is that the variability is related to the mean value. Many measurements in agricultural research are of this kind [7].

4. Process Capability Indices

Even though process capability analysis can be performed without specification limits, process capability ratios incorporating specification limits are found to be useful. CPU is the 'upper process capability' (i.e., the capability of the process to meet the upper specification limit), and CPL is the 'lower process capability' (i.e., the capability of the process to meet the lower specification limit). The index Cpk, equal to the lower of CPU and CPL, is a better measure of process capability than Cp or CR, since it takes into account the actual process center compared to the target. The first type of process capability ratio is known as the Cp index and is given as

Cp = (USL − LSL) / (6σ)

where USL and LSL are the upper and lower specification limits respectively, and σ is the standard deviation of the process characteristic. The six-sigma spread of the process is the basic definition of process capability when the quality characteristic follows a normal distribution. Hence, Cp measures how well the process is capable of meeting the specifications. Since μ + 3σ is the upper natural tolerance limit (UNTL) and μ − 3σ is the lower natural tolerance limit (LNTL), Cp is equal to the ratio of the specification tolerance to the natural tolerance. It is obvious that if the specification tolerance is narrower than the natural tolerance, the process will produce defective items, as can be seen in figure (1).

Clearly, Cp reveals nothing about the centering of the process. One may split the Cp index into two, in order to find how the process meets the specification tolerances above and below the process mean μ separately:

CPU = (USL − μ) / (3σ),    CPL = (μ − LSL) / (3σ)

One should note that Cp is not equal to the sum of CPL and CPU. If the process is centered at the midpoint of the specification limits, then Cp = CPU = CPL.

Fig. (1): Process Capability Ratios

The index Cpk is then defined as Cpk = min{CPL, CPU}. If Cpk < Cp, it means that the process is not centered. If Cp is high but Cpk is not, it means that the process needs centering. Hence, Cp is said to measure the potential capability of the process, whereas Cpk measures the actual capability. For the case of one-sided specification limits, Cp is not defined and hence CPL or CPU is used. Recommended values for CPL, CPU, and Cpk are the same as for Cp. The relation between kurtosis and skewness can be seen in figure (2).

Fig. (2): The relation between kurtosis and skewness of continuous distributions
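As a concrete illustration of the Cp and Cpk indices discussed above, the following sketch computes them for a simulated in-control normal process; the data set and the specification limits are hypothetical values chosen only for the example.

```python
import numpy as np

def cp_cpk(x, lsl, usl):
    """Classical capability indices under the normality assumption.

    Cp  = (USL - LSL) / (6 sigma)                     -- potential capability
    CPU = (USL - mean) / (3 sigma)
    CPL = (mean - LSL) / (3 sigma)
    Cpk = min(CPU, CPL)                               -- actual capability
    """
    mu = np.mean(x)
    sigma = np.std(x, ddof=1)          # sample standard deviation
    cp = (usl - lsl) / (6.0 * sigma)
    cpu = (usl - mu) / (3.0 * sigma)
    cpl = (mu - lsl) / (3.0 * sigma)
    return cp, min(cpu, cpl)

rng = np.random.default_rng(7)
# Hypothetical in-control normal process: mean 10, sigma 0.1,
# with specification limits 9.7 and 10.3 (illustrative values only).
x = rng.normal(10.0, 0.1, size=1000)
cp, cpk = cp_cpk(x, lsl=9.7, usl=10.3)
print(f"Cp = {cp:.2f}, Cpk = {cpk:.2f}")  # both near 1.0 for this centered process
```

Note that Cpk can never exceed Cp; the two coincide only when the process is perfectly centered between the specification limits.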

4. Identifying the Non-normal Distribution

In addition to the most commonly used Clements' percentile method, there exist other methods for the estimation of process performance for non-normal distributions, such as the Box-Cox transformation method, the Burr percentile method, and the cumulative distribution function (CDF) method [6].

A method of simulating multivariate non-normal distributions by using the Pearson distribution system, which can represent a wide class of distributions with various skewness and kurtosis, is as follows. Firstly, a procedure for generating random numbers from the Pearson distribution system, including the type IV distribution, which was previously difficult to implement, is derived. Secondly, the Pearson distribution system is applied to generate random numbers from multivariate non-normal distributions with specified marginal skewness and kurtosis.

The non-normal data are assumed to follow a normal distribution after applying the Box-Cox transformation or the Johnson transformation. One can then use the transformed data with any tool that assumes normality. This question is rigorously answered through an identification analysis of skew-normal distributions; particular attention is focused on the statistical model which actually defines the sampling process generating the observations.

Johnson distributions have become popular due to their flexibility in fitting situations with various combinations of skewness and kurtosis. There is one Johnson family for unbounded data (SU), one for data bounded on both tails (SB), and one leading to the lognormal distributions (SL). Once the potential causes of non-normality are removed, one should consider whether the process itself may not be one that is expected to be normal. When items are added or subtracted, they tend to normalize, but when multiplied or divided, they tend to a lognormal distribution, as with the sizes of living tissues.

However, these methods can be widely applied to multivariate analysis and statistical modeling [16]. Figure (3) shows the division of the shape characterization plane into regions for mound-shaped, J-shaped, and U-shaped probability models.

Fig. (3): Probability models (SU, SB, SL) in the shape characterization plane

Fig. (4): Coverage of three-sigma limits

4.1. New Monte Carlo Simulation Method for Non-Normal Data Distribution

Figure (4) divides the same shape characterization plane into regions according to the coverage of three-sigma limits. Starting at the lower left, the bottom region is the region where the three-sigma interval gives 100% coverage. Slightly above this region, shown by the darker shading, is the region where the three-sigma interval gives better than 99.5% coverage. Continuing in a clockwise direction, successive slices define regions where the three-sigma intervals give better than 99% coverage, better than 98.5% coverage, and so on.

4.2. The Extended Generalized Lambda (Tukey) Family of Distributions (EGLD)

The first generalized lambda distribution was proposed by Tukey (1960); the Tukey family of distributions is defined by the quantile function F⁻¹(u), given as

F⁻¹(u) = [u^λ − (1 − u)^λ] / λ,   λ ≠ 0
F⁻¹(u) = log(u / (1 − u)),        λ = 0        (1)

where u is a uniformly distributed random variable on (0, 1). The rectangular and logistic distributions are also members of this family of curves, which has the ability to assume a wide variety of shapes, including the standard distribution types (exponential, normal, χ², uniform, lognormal, etc.). The extended generalized lambda distribution is defined in terms of the inverse of the cumulative distribution as

F⁻¹(u) = λ₁ + λ₂u^λ₃ − λ₅(1 − u)^λ₄ + λ₆u^λ₃(1 − u)^λ₄,   0 ≤ u ≤ 1        (2)

Fig. (5): Skewness-kurtosis plane of EGLD Distributions

This family nests a wide range of symmetric and asymmetric distributions [5]. The variable distribution to be generated can be non-normal, uniform, or lambda distributed, which covers most of the practical area in the skewness-kurtosis plane, as can be seen in Fig (5).
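Since the family is defined through its quantile function, random variates follow directly by inverse-transform sampling: feed uniform (0, 1) numbers through F⁻¹(u) of equation (1). A small sketch, in which the λ values are illustrative only:

```python
import numpy as np

def tukey_lambda_quantile(u, lam):
    """Quantile function (1) of the Tukey lambda family.

    F^-1(u) = (u**lam - (1 - u)**lam) / lam   for lam != 0
    F^-1(u) = log(u / (1 - u))                for lam == 0 (logistic case)
    """
    u = np.asarray(u, dtype=float)
    if lam == 0.0:
        return np.log(u / (1.0 - u))
    return (u**lam - (1.0 - u)**lam) / lam

rng = np.random.default_rng(3)
u = rng.uniform(size=100_000)

# lam = 1 gives a uniform distribution on (-1, 1); lam around 0.14 is a
# well-known approximation to the normal shape (up to scale).
samples_uniform = tukey_lambda_quantile(u, 1.0)
samples_normal_like = tukey_lambda_quantile(u, 0.14)
print(samples_uniform.min(), samples_uniform.max())
```

This inverse-transform construction is exactly what makes quantile-defined families like the GLD convenient as random number generators for simulation studies.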

4.3. The Extended Burr Family Distributions

A generalization of the Burr differential equation may be obtained, by separation of variables, in the following form:

dF(x)/dx = F(x)[1 − F(x)] g(x) K[F(x)]        (3)

In what follows, a new continuous probability density function is derived as a solution to the generalized Burr differential equation (3), and the expression for the cumulative distribution function of the new distribution is obtained. Consider the generalized Burr differential equation (3) in the separated form

(K[F(x)])⁻¹ dF(x) / (F(x)[1 − F(x)]) = g(x) dx        (4)

The space covered by this family of distributions can be observed in Figure (6); reflecting on the skewness-kurtosis coefficients, it becomes clear that the family generated by the differential equation (4) has a wide coverage of the shape characterization plane.
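A familiar member of the Burr family is the Burr XII distribution, whose CDF F(x) = 1 − (1 + x^c)^(−k) arises as a solution of the Burr differential equation for a particular choice of the driving functions. The sketch below, with arbitrary parameter values c and k, cross-checks the closed form against SciPy's implementation:

```python
import numpy as np
from scipy import stats

# Burr XII distribution: F(x) = 1 - (1 + x**c)**(-k); c and k are
# arbitrary illustrative shape parameters.
c, k = 2.0, 3.0
x = np.linspace(0.1, 5.0, 50)

closed_form = 1.0 - (1.0 + x**c) ** (-k)
library_cdf = stats.burr12.cdf(x, c, k)

print(np.allclose(closed_form, library_cdf))
```

The agreement confirms that the closed form above is the same parameterization used by `scipy.stats.burr12`.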

4.4. The Extended Johnson Family Distributions

This family of distributions is perhaps the most versatile choice. It is based on a transformation of the standard normal variable, and the choice of form and fitting parameters allows great flexibility in adjusting the curve to fit the data. The fact that the Johnson system involves a transformation of the raw variable to a normal variable allows estimates of the percentiles of the fitted distribution to be calculated from the normal distribution percentiles, for use in control limit calculations or for capability analysis. Although capability indices and control limits are generally defined for normal variables, this approach allows their calculation for all distribution types.

Fig. (6): The Burr family distribution on the shape characterization plane

Meanwhile, the use of the modified beta function is also discussed for the sake of comparison, in order to present the advantage of Johnson's SB function as a general distribution function. Therefore, the time evolution of particle size distributions using Johnson's SB function as the initial distribution can be obtained from several lower-order moment equations of the SB function, in conjunction with the general dynamic equation (GDE), during the Brownian coagulation process.

Simulation experiments indicate that fairly reasonable results for the time evolution of the particle size distribution can be obtained with this proposed method in the free molecule regime, the transition regime, and the continuum plus near-continuum regime, respectively, at the early stage of evolution. Johnson's SB function has the ability to describe the early time evolution of different initial particle size distributions.

The SB distributional model of Johnson's 1949 paper was introduced by a transformation to standard normality, that is, Z ~ N(0, 1), consisting of a linear scaling to the range (0, 1), a logit transformation, and an affine transformation, Z = γ + δU. The model, in its original parameterization, has often been used in forest diameter distribution modelling. This procedure provides a family of Johnson transformations consisting of three types of curves; several examples of the transformation to normality exist for density curves of the Johnson SU family.

For a continuous random variable X with an unknown target distribution which needs to be approximated, Johnson proposed a set of "normalizing" translations. These translations transform the continuous random variable X into a standard normal variable Z and have the general form

Z = γ + δ · g((X − ε)/λ)        (6)

where γ and δ are shape parameters, ε is a location parameter, λ is a scale parameter, and g(·) is a function defining the four families of distributions of the Johnson system:

g(x) = ln(x),                for the lognormal family,
g(x) = ln(x + √(x² + 1)),    for the unbounded family,
g(x) = ln(x / (1 − x)),      for the bounded family,
g(x) = x,                    for the normal family.

As discussed before, the above system has the flexibility to match any feasible set of values for the mean, variance, skewness, and kurtosis. With this system, the skewness and kurtosis also uniquely identify the appropriate form of the g(·) function. These extensions have been calculated for each distribution belonging to the Johnson family. An attempt is made to adopt the unit beta model for x-bar charts of quality control in manufacturing. In this direction, the upper and lower control limits (UCL and LCL) of the x-bar chart for dimensional measurements are estimated for the beta model, and the observed differences between the beta and normal model control limits are discussed for the measurement sets. Given a continuous random variable X whose distribution is unknown and is to be approximated, Johnson proposed three normalizing transformations having the general form

Z = γ + δ · f((X − ε)/λ)        (7)

where f(·) denotes the transformation function, Z is a standard normal random variable, γ and δ are shape parameters, λ is a scale parameter, and ε is a location parameter. Without loss of generality, it is assumed that δ > 0 and λ > 0. The first transformation proposed by Johnson defines the lognormal system of distributions, denoted by SL:

Z = γ + δ ln((X − ε)/λ) = γ* + δ ln(X − ε),   X > ε        (8)

where γ* = γ − δ ln(λ).

These SL curves cover the lognormal family. The bounded system of distributions SB is defined by:

Z = γ + δ ln((X − ε) / (ε + λ − X)),   ε < X < ε + λ        (9)

The SB curves cover the bounded distributions, which can be bounded at the lower end, the upper end, or both. This family covers the gamma distributions, beta distributions, and many others.

The unbounded system of distributions SU is defined by

Z = γ + δ ln[(X − ε)/λ + (((X − ε)/λ)² + 1)^(1/2)] = γ + δ sinh⁻¹((X − ε)/λ),   −∞ < X < +∞        (10)

The SU curves are unbounded and cover the t and normal distributions.
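The SU transformation can be checked numerically: applying equation (10) to variates drawn from a Johnson SU distribution should recover a standard normal variable. The sketch below uses SciPy's `johnsonsu`, whose shape parameters `a` and `b` correspond to γ and δ; all parameter values are illustrative only.

```python
import numpy as np
from scipy import stats

# Illustrative parameters: gamma (a) and delta (b) are shapes, eps is the
# location, lam the scale; none of these come from the article's data.
a, b, eps, lam = 1.2, 2.0, 5.0, 3.0

rng = np.random.default_rng(11)
x = stats.johnsonsu(a, b, loc=eps, scale=lam).rvs(size=50_000, random_state=rng)

# Equation (10): Z = gamma + delta * arcsinh((X - eps) / lam) should be N(0, 1).
z = a + b * np.arcsinh((x - eps) / lam)
print(f"mean = {z.mean():.3f}, std = {z.std():.3f}")
```

This round trip is exactly what makes the Johnson system useful in capability work: percentiles of the fitted SU curve can be read off from standard normal tables through the inverse of the transformation.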

The (β₁, β₂) points for lognormal distributions therefore lie on a curve described by equations (7)-(10). Figure (7) displays the combinations of skewness and kurtosis that can be observed with Johnson curves; there is a unique Johnson distribution corresponding to each feasible combination of β₁ and β₂. Both matching of moments and matching of percentiles may be used for each family. However, matching of moments for the Johnson SB family is very complex. Once the parameters of the particular Johnson family are found, standard normal tables may be used to calculate quantiles or percentiles [3].

Fig. (7): The Johnson family distribution on the shape characterization plane (regions for the exponential, negative exponential, Fréchet, largest and smallest extreme value, Weibull, gamma, loglogistic, logistic, lognormal, normal, and uniform distributions, and the impossible area)

Figure (7) shows the most important links and differences for some important distributions.

4.5 Extended Generalized Pearson Families of Distributions

Consider capability analysis for non-normal data: if you are measuring flatness, for example, the measurements can never be smaller than 0. In such cases, you will need to use Pearson curve fitting, a technique in which the distribution is compared to one of many theoretical distributions. If the data match closely enough, the fit will pass a chi-square test and the capability indices will be useful. As with normally distributed data, if the data do not match one of the theoretical distributions, then the capability indices may be misleading and should not be used. The classical differential equation introduced by Karl Pearson during the late 19th century is a special case (Mohammad et al., 2010), studied later by Elderton and Johnson (1969) and Johnson et al. (1994), among others. The new extended generalized Pearson family of distributions is characterized by a general differential equation and is defined in implicit form. A random variable X with probability density function f_X(x) is said to have an extended generalized Pearson distribution of the following form,

df_X(x)/dx = g(x) f_X(x)        (11)

where g(x) is an integrable real function in terms of which the statistical properties of the model are expressed. Equation (11) in general form is

g(x) f_X(x) dx − df_X(x) = 0        (12)

Equation (12) is a separable first-order ordinary differential equation that is algebraically reducible to a standard differential form in which each of the non-zero terms contains exactly one variable; the solution of this kind of equation is usually quite straightforward.

For example, the general solution of the above equation is obtained as follows:

∫ df_X(x) / f_X(x) = ∫ g(x) dx        (13)

One way of determining whether a given equation is separable is to collect coefficients on the two differentials and see whether the result can be put in the separated form of (13); integrating then gives

ln(f_X(x)) = ∫ g(x) dx + c        (13a)

Another way is to solve for the derivative and compare the result with df_X(x)/dx = g(x) f_X(x). A general solution can then be found by separating the variables and integrating:

f_X(x) = exp{∫ g(x) dx + c} = λ exp{∫ g(x) dx}        (14)

where λ = e^c plays the role of a normalizing constant.

The process of solving a separable equation will often involve division by one or more expressions. In such cases the results are valid where the divisors are not equal to zero but may or may not be meaningful for values of the variables for which the division is undefined. Such values require special consideration and may lead to singular solutions.
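As a sanity check of the general solution (14), take g(x) = −x. Then exp(∫ g(x) dx) = exp(−x²/2), and after normalization the solution should coincide with the standard normal density, since the Pearson system contains the normal distribution as a special case. A small numerical verification of this one case:

```python
import numpy as np
from scipy import stats
from scipy.integrate import trapezoid

# For g(x) = -x, equation (14) gives f(x) = lam * exp(-x**2 / 2).
x = np.linspace(-8.0, 8.0, 16001)
unnormalized = np.exp(-x**2 / 2.0)        # exp( integral of g(x) dx )
lam = 1.0 / trapezoid(unnormalized, x)    # normalizing constant, ~1/sqrt(2*pi)
f = lam * unnormalized

# The normalized solution matches the standard normal density.
print(np.allclose(f, stats.norm.pdf(x), atol=1e-8))
```

Other choices of g(x) generate the remaining members of the generalized Pearson system in the same way, with λ fixed by the unit-area condition.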

It follows from the definition of X that f_X(x) > 0 for −∞ < x < +∞ and ∫₋∞⁺∞ f_X(x) dx = 1. An extended generalization of the Pearson differential equation has appeared in the literature, known as the generalized Pearson system of continuous probability distributions. Equation (14) yields a new family of distributions based on the generalized Pearson differential equation.

It is observed that the new distribution is skewed to the right and bears most of the properties of skewed distributions. Some further new classes of continuous probability distributions can be developed from the generalized Pearson differential equations (7) to (14); one of these systems is the Pearson system [2]. Figure (8) shows the most important links and differences for some important distributions related to the Pearson families of distributions.

Equivalently, the solution (14) can be written in the implicit form

f_X(x) exp(−∫ g(x) dx) = λ

Fig. (8): The Pearson families of distributions on the shape characterization plane

4.6 Skewness and Kurtosis of Non-normal Distribution

One of the applications of skewness is testing normality. Many statistical inferences require that a distribution be normal or nearly normal. A normal distribution has skewness and excess kurtosis of 0, so if a distribution is close to those values then it is probably close to normal. The sample skewness measures the asymmetry of the empirical distribution; if it is far from 0, then the distribution is not symmetric, and since the normal distribution is symmetric, a sample from it should have skewness close to 0. The sample kurtosis measures the "peakedness" of the distribution. If it is much greater than 0, the distribution is more peaked than the normal distribution, which typically means that it has heavier tails. If it is less than 0, the distribution is less peaked, which often means that it is flatter or even bimodal. The sample excess kurtosis is bounded from below by −2, a value that is obtained for a two-point distribution, which is of course extremely bimodal.

Skewness describes the symmetry of the distribution: a normal level of skewness corresponds to a perfectly symmetric distribution. A positively skewed distribution has scores clustered to the left, with the tail extending to the right; a negatively skewed distribution has scores clustered to the right, with the tail extending to the left. Kurtosis describes the peakedness of the distribution: normal kurtosis corresponds to a distribution that is bell-shaped and neither too peaked nor too flat, positive kurtosis is indicated by a peak, and negative kurtosis by a flat distribution. Both skewness and kurtosis are 0 in a normal distribution, so the farther from 0, the more non-normal the distribution. The question of how much skew or kurtosis renders the data non-normal is an arbitrary determination, and is sometimes difficult to answer using the values of skewness and kurtosis given by the following formulas:

Sk = μ₃σ⁻³ = μ₃[μ₂]^(−3/2) = [(1/n) Σᵢ₌₁ⁿ (xᵢ − x̄)³] · [(1/n) Σᵢ₌₁ⁿ (xᵢ − x̄)²]^(−3/2)        (15)

Ku = μ₄σ⁻⁴ − 3 = μ₄[μ₂]^(−2) − 3 = [(1/n) Σᵢ₌₁ⁿ (xᵢ − x̄)⁴] · [(1/n) Σᵢ₌₁ⁿ (xᵢ − x̄)²]^(−2) − 3        (16)

where μ₂, μ₃, and μ₄ are the second, third, and fourth central moments; the subtraction of 3 in (16) makes the kurtosis of the normal distribution equal to 0.
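The skewness and kurtosis formulas above translate directly into code. The sketch below computes the moment-based sample skewness and excess kurtosis and checks them against two known cases; the sample sizes and seed are arbitrary.

```python
import numpy as np

def sample_skew_kurt(x):
    """Moment-based sample skewness and excess kurtosis."""
    x = np.asarray(x, dtype=float)
    m = x.mean()
    m2 = np.mean((x - m) ** 2)        # second central moment
    m3 = np.mean((x - m) ** 3)        # third central moment
    m4 = np.mean((x - m) ** 4)        # fourth central moment
    sk = m3 / m2 ** 1.5
    ku = m4 / m2 ** 2 - 3.0           # 0 for a normal distribution
    return sk, ku

rng = np.random.default_rng(5)
sk_n, ku_n = sample_skew_kurt(rng.normal(size=200_000))
sk_e, ku_e = sample_skew_kurt(rng.exponential(size=200_000))
print(f"normal:      Sk = {sk_n:.2f}, Ku = {ku_n:.2f}")   # both near 0
print(f"exponential: Sk = {sk_e:.2f}, Ku = {ku_e:.2f}")   # near 2 and 6
```

The theoretical values for the exponential distribution (skewness 2, excess kurtosis 6) illustrate how far a common non-normal process distribution sits from the normal point (0, 0) in the skewness-kurtosis plane.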

4.7 Non-Normal Distributions - Fitting Distributions by Moments

The shapes of most continuous distributions can be sufficiently summarized by the first four moments: the mean (first moment), variance (second moment), skewness (third moment), and kurtosis (fourth moment). Once a distribution has been fitted, the expected percentile values are calculated under the (standardized) fitted curve, and the proportion of items produced by the process that fall within the specification limits is estimated [11].

Johnson (1949) described a system of frequency curves that represents transformations of the standard normal curve (Hahn and Shapiro, 1967). By applying these transformations to a standard normal variable, a wide variety of non-normal distributions can be approximated, including distributions bounded on either one or both sides (e.g., U-shaped distributions). The advantage of this approach is that once a particular Johnson curve has been fitted, the normal integral can be used to compute the expected percentage points under the respective curve. Methods exist for fitting Johnson curves so as to approximate the first four moments of an empirical distribution [8].

5. Results and Discussions

Process capability is a measure of the ability of a process to meet specifications. It tells us how well the individual components behave. There are several methods to measure process capability, including an estimation of the ppm (defective parts per million). Capability indices such as Cp, Cpk, Pp, and Ppk are very popular; however, trying to summarize capability through a single index is often misleading, because key information about the process is lost.

Process stability refers to the consistency of the process with respect to important process characteristics, such as the average value of a key dimension or the variation in that dimension. If the process behaves consistently over time, it is said to be stable, or in control. Process capability analysis enables you to evaluate a non-normal process distribution using:

1. Box-Cox power Transformation

2. Johnson Transformation system

3. Clements Method using Pearson Curves

First, you need to know the underlying shape of the process distribution to calculate a meaningful process capability index. To properly calculate a capability index for non-normal data, you must either transform the data to normality or use special-case calculations for non-normal processes.
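The transform-to-normality route (option 1 above) can be sketched as follows. Assuming a known Box-Cox exponent λ (in practice λ is estimated by maximum likelihood), both the data and the specification limits must be transformed with the same λ before the conventional Cp and Cpk formulas are applied; the λ value and spec limits below are hypothetical.

```python
import math
import statistics

def boxcox(x, lam):
    """Box-Cox power transformation: (x**lam - 1)/lam, or log(x) when lam = 0."""
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam

def cp_cpk(data, lsl, usl):
    """Conventional capability indices on (transformed, near-normal) data."""
    mu = statistics.mean(data)
    s = statistics.stdev(data)
    cp = (usl - lsl) / (6 * s)
    cpk = min(usl - mu, mu - lsl) / (3 * s)
    return cp, cpk

# Hypothetical workflow: transform the data AND the spec limits with the same lambda
lam = 0.27
raw = [0.3, 0.8, 1.2, 2.5, 4.1, 0.5, 1.9, 3.3]
transformed = [boxcox(v, lam) for v in raw]
t_lsl, t_usl = boxcox(0.01, lam), boxcox(8.0, lam)
print(cp_cpk(transformed, t_lsl, t_usl))
```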

Second, you should never apply a transformation, or calculate process capability, until you have determined that the process is in a state of statistical control. If the process is not in control, it is not stable and cannot be predicted using capability indices.

Likewise, an out-of-control situation is evidence that multiple distributions are in play, so a single transformation of all the process data would be meaningless.

If a non-normal distribution provides the best fit, use one of the following non-normal capability models to evaluate your process:

- Capability Analysis (Non-normal)

- Capability Analysis Multiple Variables (Non-normal)

- Capability Six-pack (Non-normal).

Our data set consists of 100 random numbers generated from an exponential distribution with scale = 1.5; the scale parameter (the mean) determines the spread of the exponential distribution. Suppose these data describe how long it takes for a customer to be greeted by a salesperson in a store. Usually a customer is greeted very quickly; sometimes the store is crowded and it takes longer. Table 1 shows the time (in minutes) for 100 samples in which a salesman greeted his customer. Table 2 shows a comparison of normal and non-normal capability data.

3.67  1.37  5.12  0.21  0.03
2.26  2.68  4.17  0.03  2.02
0.91  0.65  2.24  2.67  0.75
0.31  1.46  2.82  3.54  0.19
7.77  0.13  4.05  0.04  5.28
0.06  1.35  2.09  0.54  2.22
0.24  0.82  2.02  0.16  2.41
0.11  0.13  0.53  0.60  0.43
0.36  0.14  0.29  2.95  1.53
3.21  1.98  0.50  1.70  0.30
1.01  0.18  1.03  2.65  2.99
0.91  0.01  1.24  3.43  0.75
2.28  2.49  0.51  4.06  1.31
1.75  0.53  2.17  2.04  1.00
1.45  0.40  0.11  3.56  2.15
3.84  1.77  0.86  0.16  2.07
2.22  0.15  0.13  4.74  0.76
0.30  1.02  3.63  0.77  5.25
1.81  1.67  0.80  6.10  1.30
0.63  0.81  0.60  0.87  2.44

Table 1: Exponential Data
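A data set of this kind can be generated with the standard library alone. The sketch below is a sample generator under stated assumptions (the seed is arbitrary, chosen only for reproducibility); note that `random.expovariate` takes the rate λ = 1/scale, not the scale itself.

```python
import random

random.seed(42)  # arbitrary seed, for reproducibility only
scale = 1.5      # mean of the exponential distribution (minutes)
# random.expovariate expects the RATE lambd = 1/scale
sample = [round(random.expovariate(1.0 / scale), 2) for _ in range(100)]
print(len(sample), min(sample))
```

Such a generator is useful for checking a capability-analysis procedure against data whose true distribution is known in advance.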

However, 3 points out of 100 fall above the USL: the observed 3% differs considerably from the expected 0.05%. These values depend on the type of distribution used; the formulas are given below.

Pp = (USL − LSL) / (X_99.865 − X_0.135)

where USL is the upper specification limit, LSL is the lower specification limit, X_99.865 is the 99.865th percentile of the exponential distribution, and X_0.135 is the 0.135th percentile of the exponential distribution.

Ppl = (X_50 − LSL) / (X_50 − X_0.135)

where X_50 is the 50th percentile (median) of the exponential distribution.

Ppu = (USL − X_50) / (X_99.865 − X_50)

Ppk is the minimum of Ppu and Ppl. From the percentiles of the data shown in Figure 1 for the normal distribution, it is evident that beyond 3σ the error is negligible on both sides.
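The percentile-based formulas above can be evaluated in closed form for the exponential distribution, whose quantile function is Q(p) = −scale·ln(1 − p). The sketch below assumes a hypothetical USL of 6 minutes and no meaningful LSL (taken as 0); with these assumptions it yields a Ppk close to the non-normal value of about 0.56 reported in Table 2.

```python
import math

def expon_ppf(p, scale):
    """Quantile function of the exponential distribution: -scale * ln(1 - p)."""
    return -scale * math.log1p(-p)

def percentile_capability(lsl, usl, scale):
    """Percentile-based (Clements-style) performance indices: the 6-sigma
    spread is replaced by X_99.865 - X_0.135 and the median X_50 plays
    the role of the process centre."""
    x_hi = expon_ppf(0.99865, scale)   # X_99.865
    x_lo = expon_ppf(0.00135, scale)   # X_0.135
    x_med = expon_ppf(0.5, scale)      # X_50 (median)
    pp = (usl - lsl) / (x_hi - x_lo)
    ppu = (usl - x_med) / (x_hi - x_med)
    ppl = (x_med - lsl) / (x_med - x_lo)
    return pp, min(ppu, ppl)           # Ppk = min(Ppu, Ppl)

pp, ppk = percentile_capability(0.0, 6.0, 1.5)
print(round(pp, 3), round(ppk, 3))  # 0.606 0.559
```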

6. Conclusions

The objective of the study was to estimate the current process performance, in order to provide accurate information for the customer, for performance benchmarking and for process improvement. In the case of data that do not obey a normal distribution, the key issue in this analysis was to obtain a correct estimate of process performance. This is successfully accomplished by calculating performance indices based on Clements' percentile method.

The process capability and performance analysis has become an inevitable step in the quality management of modern industrial processes. From the data in Table 1, it is very clear that the data follow an exponential distribution (a special case of the gamma distribution). The Ppk value (= 0.56) obtained from the non-normal analysis is considerably less than the Ppk (= 0.94) obtained under the normality assumption. In addition, it is important to note that the estimated percentage above the USL is 1.83 for the non-normal process capability. Thus, the non-normal process capability analysis is preferred for such data. From Table 2 it is evident that the Ppk value for non-normal data comes out to be less than 1, whereas the within-capability index Cpk is greater than 1. The formulas for Ppk also look different from those for the normal distribution.

Index        | Within Capabilities      | Overall (Normal)          | Overall (Non-normal)
-------------|--------------------------|---------------------------|----------------------
Cp / Pp      | N/A                      | N/A                       | N/A
Cpk / Ppk    | 1.109                    | 0.94                      | 0.559
Cpu / Ppu    | 1.1 (0.05%)              | 0.939 (0.24%)             | 0.561 (1.83%)
Cpl / Ppl    | N/A (0%)                 | N/A (0%)                  | N/A
Sigma        | 1.2916 (estimated)       | 1.545                     | —
PPM > USL    | 484.95                   | 2466.694                  | 18315.594
PPM < LSL    | 0                        | 0                         | N/A
Total PPM    | 485.405                  | 2466.69                   | 18315.5964
Average (X̄) = 1.5958

Table 2: Comparison of normal and non-normal capability data

References:

[1] Ahmad, S., Abdollahian, M., Zeephongsekul, P.,(2008)." Process capability estimation for non-normal quality characteristics: A comparison of Clements, Burr and Box-Cox Methods", ANZIAM J. 49, pp. C642-C665.

[2] Bachioua, Lahcene, (2013). "On Pearson Families of Distributions and its Applications", African Journal of Mathematics and Computer Science Research, May 2013, pp. 108-117.

[3] Bachioua, Lahcene, (2014). "On Johnson's System Families of Distributions with their Applications in Quality", to appear in British Journal of Mathematics & Computer Science, February-March 2014, www.sciencedomain.org.

[4] Bachioua, Lahcene, (2014). "On Extended Burr System Family Distribution with their Application in Quality", Journal of ISOSS, Vol. 1(1), pp. 31-46.

[5] Bachioua, Lahcene and Shaker, H. S., (2006). "On Extended Tukey Lambda Distribution Models", Jordan Journal of Applied Science, Natural Sciences, Vol. 8, No. 2, pp. 25-32.

[6] Chang, P. L., Lu, K. H., (1994). "PCI Calculations for Any Shape of Distribution with Percentile", Quality World, technical section, September, pp. 110-114.

[7] Dobson A., (1983). "An Introduction to Statistical Modeling", Chapman and Hall Ltd., London.

[8] Hahn and Shapiro, (1967). "Statistical Models in Engineering", John Wiley & Sons, pp. 199-220.

[9] Keats, J.B., Montgomery D.C., (1996)"Statistical Applications in Process Control", ISBN 0-8247-97116, Marcel Dekker, Inc., New York.

[10] Kundu, D. and Manglick, A., (2004)."Discriminating between the Weibull and Log-normal distributions", Naval Research Logistics, 51, 893 - 905.

[11] Gruska, G. F., Mirkhani, K. and Lamberson, L. R., (1989)."Non Normal Data Analysis" , Applied Computer Solutions, St Clair Shores.

[12] Lovelace, C. R. and Swain, J. J., (2009). "Process capability analysis methodology for zero-bound, non-normal process data", Quality Engineering, 21, pp. 190-202.

[13] Montgomery, D. C (2009)."Statistical Quality Control- A Modern Introduction", Wiley., ISBN: 978047233979, USA.

[14] Pearn, W. L., Kotz, S., (2006). "Encyclopedia and Handbook of Process Capability Indices", ISBN 981-256-759-3, World Scientific Publishing Co. Pte. Ltd., Singapore.

[15] Sibalija, Tatjana V., Majstorovic, Vidosav D., (2010). "Process Performance Analysis for Non-normal Data Distribution", International Journal "Total Quality Management & Excellence", Vol. 38, No. 3.

[16] Nagahara, Yuichi, (2004). "A method of simulating multivariate nonnormal distributions by the Pearson distribution system and estimation", Computational Statistics & Data Analysis, Elsevier, Vol. 47(1), pp. 1-29, August.

© Dr Bachioua Lahcene, 2020

Article citation: Dr Bachioua Lahcene - Some Distribution Systems for Process Capability Analysis with Non-Normal Data used in Quality Control // Vesti Nauchnykh Dostizheniy. Ekonomika i Pravo (Herald of Scientific Achievements: Economics and Law). - 2020. - No. 3. - pp. 135-147. DOI: 10.36616/2686-9837-2020-3-135-147 URL: https://www.vestind.ru/journals/economicsandlaw/releases/2020-3/articles?pdfView&page=77
