SOME COMMENTS ON STATISTICAL RISKS
Viorel Gh. Vodă
"Gheorghe Mihoc - Caius Iacob" Institute of Mathematical Statistics and Applied Mathematics of the Romanian Academy e-mail: von [email protected]
Abstract
In this work we make a detailed analysis of the concept of risk, with the stress then focused on various kinds of statistical risks: producer and consumer risks, technical risk, Taguchi's risk (with a connection to the Cpm capability index) and a risk arising in SPC practice.
Key words: statistical risk, error, hazard rate, Taguchi's loss function, Taguchi's risk, SPC - Statistical Process Control.
1. Preliminaries: a discussion on the concept of risk
The notion of risk covers a broad area of interpretations. As in many cases, there is a man-in-the-street approach and a scientific one, which tries to offer quantitative measures of the underlying term.
Let us visit first some usual dictionaries. For instance, in the BBC English - Romanian Dictionary (Editura CORESI, București, 1998, page 966), risk is assimilated to a danger: if there is a risk of something, it might have unpleasant or even dangerous consequences (results).
We notice here the potentiality of such results, which may or may not occur. This suggests that risk is associated with uncertainty: it might happen, but we do not know for sure whether it will indeed happen.
Merriam - Webster's Collegiate Dictionary (Tenth Edition M. W. Incorporated, Springfield, Mass, U.S.A., 1996, page 1011) is more generous and specific: possibility of loss or injury, a peril but also the degree of probability of such loss (this is a new element in the usual definitions).
The very recent „Illustrated Oxford Dictionary of English Language" (2008, Dorling/Oxford Univ. Press, Litera International, București - Chișinău, page 709) defines it as a chance or possibility of danger, loss, injury etc.
The term „chance" is straightforwardly linked with that of uncertainty. Some authors consider that risk is characterized by the possibility of being described with the aid of probability laws (see Bârsan-Pipu and Popescu, 2003 [2, page 2]). Uncertainty can also be described by quantitative measures - if we regard it from the metrological point of view (see Petrescu et al., 2006 [12]).
Webster's Unabridged Dictionary of the English Language (2002 edition) advances the concept of risk management (RM) and also that of risk manager. RM is viewed as a technique for the estimation, prevention and minimization of the accidental losses which could appear in a business, by taking some safety measures (insurance, for instance).
Risk appears therefore as an uncertain event which may take place if some risk factors actually act.
On the other hand, risk is always associated with the anthropic element - the human factor which will finally suffer the eventual losses of its „risky decisions".
2. Kinds of risks
Generally speaking, there are several types of risk - depending on the domain we consider to be of interest. Isaic-Maniu et al. (1999, page 492) [7] believe that the so-called economic risk is of great importance. This risk is defined as the incapacity (or simply the impossibility!) of a given organization to survive in a business environment: this means that its managers do not have the skills (and knowledge) to adapt the economic policy of the company to the variations (sometimes unexpected and unfriendly) of the social-economic reality at a specified moment.
This economic risk (quite general) has some components, such as the „bankruptcy subrisk", which seems to be essential: if an organization cannot pay its bills for current utilities, cannot reimburse its loans, cannot pay its subcontractors, suppliers etc. - all these are signs that the above risk has already produced its destructive effects.
Since risk is regarded as a probability, it is worth investigating the nature of what is called the statistical risk. It plays an important role in the framework of statistical inference. One problem which has not been very deeply investigated is the following: how to manipulate (or manage) this risk in order to minimize it, in the sense that the decision taken in an uncertain/risky situation is „the best" one?
3. Various types of statistical risks
Usually, in the theory of statistical hypotheses, founded mainly by the British School of Statistics (see Stoichițoiu and Vodă, 2002 [15]), we deal with the so-called errors we make regarding the decisions about the underlying hypotheses.
As is well known, a hypothesis (in general) is simply defined as a statement/assumption/supposition about a certain phenomenon, process, situation etc. This assumption may be true (that is, in accordance with the real status of the entity considered) or it may not be. For a scientific hypothesis it is sufficient to provide a counterexample in order to reject the proposed hypothesis as false.
Since in statistical analysis we work with samples (assumed to be obtained randomly), the conclusions will depend entirely on the sample (or samples) we have at hand. The sample could support the advanced hypothesis (called null-hypothesis, H0) or it could sustain the alternative one (H1). Therefore, we say that the couple (H0, H1) is accompanied by two kinds of errors, namely
α = Prob{reject H0 | H0 is true} (1)
and
β = Prob{accept H0 | H1 is true} (2)
They are called, respectively, error of the first type (α) and error of the second type (β) - see for details Blischke and Murthy, 2000 [3], pages 157 - 162.
These authors draw attention to the fact that the Type I and Type II error rates are the probabilities of making these kinds of „mistakes" - namely „do reject H0" (when H0 is true) and „do not reject H0" (when H1 - the alternative - is true).
In fact, α = α(n; θ1) and β = β(n; θ1) - that is, they depend on the size of the sample we employ and on the true value of the parameter (θ) on which the hypothesis is made.
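To illustrate this dependence, here is a minimal Python sketch (with assumed values for θ0, θ1, σ and n, not taken from the paper) computing α and β for a two-sided z-test on the mean of a normal population with known standard deviation:

from scipy.stats import norm

def alpha_beta(n, theta0, theta1, sigma, z_crit=1.96):
    # Type I and Type II error probabilities for a two-sided z-test of
    # H0: mean = theta0, when the true mean is theta1 (sigma known).
    se = sigma / n ** 0.5
    alpha = 2 * (1 - norm.cdf(z_crit))              # does not depend on n or theta1
    shift = (theta1 - theta0) / se
    beta = norm.cdf(z_crit - shift) - norm.cdf(-z_crit - shift)
    return alpha, beta

for n in (10, 30, 100):                             # beta shrinks as n grows, alpha stays fixed
    print(n, alpha_beta(n, theta0=0.0, theta1=0.5, sigma=1.0))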
In SQC - Statistical Quality Control - especially in the sampling inspection of batches, where practical procedures have been standardized (see the American standards MIL STD 105 D and MIL STD 414, or their ISO equivalents, ISO 2859 and ISO 3951), α and β are called the „producer risk" (α) and the „consumer risk" (β), respectively. In the above documents α and β are taken at fixed levels (α = 5% and β = 10%) and hence there is no possibility to modify these values if in practice we use these standards.
What we can do is diminish the risk of non-acceptance of a given lot/batch. This risk is expressed as 1 - Pa(p), where Pa(p) is the probability of acceptance of the lot, which depends on its defective (or nonconforming) fraction (p). If p is larger than the accepted value (AQL - Acceptable Quality Level), then the risk of non-acceptance is higher.
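As a concrete illustration, the sketch below (with assumed plan parameters n and c and assumed AQL/LQ values, not taken from the standards) computes the probability of acceptance Pa(p) of a single-sampling plan under the binomial model, together with the corresponding producer and consumer risks:

from scipy.stats import binom

def prob_acceptance(p, n=80, c=2):
    # Pa(p): accept the lot if at most c nonconforming items are found
    # in a random sample of n items (binomial model).
    return binom.cdf(c, n, p)

# Assumed illustrative values for AQL and LQ (not from the MIL STD / ISO tables)
aql, lq = 0.01, 0.08
producer_risk = 1 - prob_acceptance(aql)   # a lot at AQL quality is rejected
consumer_risk = prob_acceptance(lq)        # a lot at LQ quality is accepted
print(f"producer risk ~ {producer_risk:.3f}, consumer risk ~ {consumer_risk:.3f}")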
Another important element of the above-mentioned documents is the so-called LQ - Limiting Quality - that is, the value of p which we are ready to accept only with a small probability (in 10% of the cases at most).
If p = p0 > AQL, the risk of non-acceptance increases as p approaches the LQ value.
It follows that the management of this risk has to be directed to those measures which can lead to a decline of the fraction defective (see Isaic-Maniu and Voda, 1997, [6]).
3.1. Error of the Third Type?
In [14] an argument of Malița and Zidăroiu (1980, [10]) in favor of an idea of Raiffa (1970, [13]) regarding the existence of a Type III error was discussed. The latter author claims that if an experimenter (or an analyst) tries to solve a false problem, then he commits an error of the third kind! Raiffa did not establish clearly what he understands by a „false" problem: is it an ill-posed problem (improperly/wrongly formulated), or does the falsity refer to the goal/purpose stated by the responsible authority?
Malița and Zidăroiu tried to justify Raiffa's proposal by linking it to the Type I and Type II errors, claiming that this Type III error „is expected to weight, in a specific manner, the previous two classical type errors". They also say that the main source of the Type III error is the lack of communication between the analyst and the decision maker. This communication must act in both directions - from the decision unit to the experimenter/analyst and conversely - in order to check/verify that we have indeed detected the right problem!
Such an argumentation seems to stay at most at a metaphoric level: nobody will ask himself or someone else whether the problem he solves is a false one ...
We shall also mention the Cambridge Dictionary of Statistics (B. S. Everitt, Cambridge University Press, 1998), where the author draws attention that this risk should not be confounded with the Type III error - a term used for identifying the poorer of two treatments as the better one (pages 116 and 338).
3.2. Technical risk
Irina Isaic-Maniu (2003, [8, pages 51 - 65]) gives a „risk interpretation" of the main indicators used in reliability theory; in fact, the distribution function F(t) of a continuous and positive random variable (T) which describes the failure behavior of a given entity may be viewed as a „technical risk" - that is, the complement of the reliability function:
Prob{T < t0 } = F(t0) = 1 - R(t0) = 1 - Prob{T > t0} (3)
Here F(t0) is therefore the probability that the system operates for less than a desired time t0. If the reliability R(t0) is low, this technical risk is consequently high.
A more adequate way to define this technical risk seems to be via the hazard rate (or failure rate) function, which may also be called „the danger of failure":
h(t) = f(t) / (1 − F(t)) = f(t) / R(t) = − R′(t) / R(t) (4)
A high value of h(t) means a low level of reliability (h(t) is expressed usually in failures/hour).
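As a short numerical illustration, the following sketch (a Weibull failure model with assumed parameters, chosen only for the example) evaluates the hazard rate from the general definition h(t) = f(t)/R(t) and compares it with the closed-form Weibull hazard:

import numpy as np
from scipy.stats import weibull_min

shape, scale = 1.8, 1000.0                     # assumed Weibull parameters (time in hours)
t = np.array([100.0, 500.0, 2000.0])

# Hazard rate from the general definition h(t) = f(t) / R(t)
f = weibull_min.pdf(t, shape, scale=scale)
R = weibull_min.sf(t, shape, scale=scale)      # survival function = reliability
h_numeric = f / R

# Closed-form Weibull hazard: h(t) = (m/eta) * (t/eta)**(m-1)
h_closed = (shape / scale) * (t / scale) ** (shape - 1)

print(h_numeric)                               # failures per hour, increasing since shape > 1
print(h_closed)                                # matches the numerical values

Since the assumed shape parameter exceeds 1, the hazard increases with time, i.e. the „danger of failure" grows as the item ages.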
4. Taguchi's risk
Genichi Taguchi (see Alexis, 1999, [1]) revitalized Gauss' quadratic function f(x) = a(x − x0)², a > 0, x > x0 > 0, and associated it with the so-called quality loss
L(x0; T) = k(x0 − T)², k > 0, x0, T ∈ R (5)
where x0 is the measured value of the quality characteristic (X) and T is its target value (k is a constant depending on the specific case at hand).
If f(x; θ) is the density of X (x ∈ D, D being a subset of R, θ ∈ R), then the average value
E[L(X; T)] = ∫D L(x; T) f(x; θ) dx (6)
is called Taguchi type risk (see Kackar, 1986, [9]). Taking into account (5), we may write (6) as
E[L(X; T)] = k [Var(X) + (E(X) − T)²] (7)
and if X is normally distributed N(μ, σ²), we have
E[L(X; T)] = k [σ² + (μ − T)²] (8)
The empirical risk (denoted RT(x̄)) is therefore
RT(x̄) = k [s² + (x̄ − T)²] (9)
where x̄ and s are the well-known sample statistics (the sample mean and the sample standard deviation).
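A minimal Python sketch of the empirical Taguchi risk (9), with an assumed loss constant k and assumed sample data used only for illustration, is given below:

import numpy as np

def taguchi_risk(sample, target, k=1.0):
    # Empirical Taguchi risk R_T = k * (s**2 + (xbar - target)**2), formula (9)
    xbar = np.mean(sample)
    s = np.std(sample, ddof=1)          # sample standard deviation
    return k * (s ** 2 + (xbar - target) ** 2)

# Assumed measurements of a quality characteristic with target value 10.0
data = np.array([10.02, 9.97, 10.05, 9.99, 10.01, 10.04])
print(taguchi_risk(data, target=10.0, k=1.0))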
There is a straightforward link between Taguchi's risk and his own process capability index Cpm (see Chan et al., 1988 [4]):
Cpm = (USL − LSL) / (6 √(s² + (x̄ − T)²)) (10)
where USL = Upper Specification Limit and LSL = Lower Specification Limit of the given quality characteristic X ~ N(μ, σ²), with T as its target value. Hence we may immediately write
RT(x̄) = k [(USL − LSL) / (6 Cpm)]² (11)

If USL − LSL = 6s - that is, the minimal level for an admissible process capability - we get:

RT(x̄; T) = k (s / Cpm)² (12)
and we draw the conclusion that Taguchi's risk can be regarded as a function of the length of the specification interval USL − LSL measured in standard deviation units. The theoretical Taguchi risk corresponding to (12) is

RT(X) = k (σ / Cpm)² = kσ² / Cpm²
Denoting kσ² = M and Cpm = λ, we obtain a hyperbolic dependence of the type R = M/λ². If, in Cpm, the true mean value μ is exactly the target T, then Cpm becomes the classical potential index of the process, namely Cp = (USL − LSL)/(6σ) (see Figure 1).
Fig. 1. The relationship between Taguchi's risk and Cpm
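The link between the risk and the capability index can also be checked numerically; the sketch below (with assumed specification limits and sample statistics, for illustration only) computes Cpm from (10) and verifies that formula (11) reproduces the empirical risk (9):

import math

# Assumed illustrative values (not from the paper)
USL, LSL, T = 10.3, 9.7, 10.0
xbar, s, k = 10.05, 0.08, 1.0

tau = math.sqrt(s ** 2 + (xbar - T) ** 2)           # combined variation around the target
cpm = (USL - LSL) / (6 * tau)                       # formula (10)

risk_direct = k * (s ** 2 + (xbar - T) ** 2)        # formula (9)
risk_via_cpm = k * ((USL - LSL) / (6 * cpm)) ** 2   # formula (11)

print(cpm, risk_direct, risk_via_cpm)               # the two risk values coincide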
5. Risk in SPC practice
SPC - or Statistical Process Control - is mainly based on the theory and practice of Shewhart control charts (see the ISO document ISO 8258 „Shewhart control charts", 1991, or Petrescu and Vodă, 2002 [11]).
From a statistical point of view, Shewhart control charts can be viewed as a continuous testing of the hypothesis H0: Mean = μ versus the alternative H1: Mean ≠ μ, at the significance level α = Prob{|Z| > 3} = 0.0027 (see Derman and Ross, [5]).
From a practical perspective, this means that even when a certain process is in a state of statistical stability (remains in control) there is a chance - a risk (0.0027) - that a subgroup average will fall outside the control limits UCL = μ + 3σ/√n, LCL = μ − 3σ/√n, and the experimenter would incorrectly take the „risky decision" to correct the process, that is, to dig for an illusory cause of trouble.
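The size of this risk and the resulting false-alarm frequency can be illustrated in a few lines of Python (the process parameters below are assumed, purely for illustration):

from scipy.stats import norm

mu, sigma, n = 50.0, 2.0, 5            # assumed in-control process mean, sigma and subgroup size
se = sigma / n ** 0.5

UCL = mu + 3 * se                      # upper control limit
LCL = mu - 3 * se                      # lower control limit

# Probability that an in-control subgroup average falls outside the limits
alpha = 2 * (1 - norm.cdf(3))          # ~ 0.0027
arl0 = 1 / alpha                       # in-control average run length, ~ 370 subgroups

print(f"LCL = {LCL:.2f}, UCL = {UCL:.2f}, alpha = {alpha:.4f}, ARL0 = {arl0:.0f}")

In other words, for a stable process a false alarm is expected roughly once every 370 plotted subgroups.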
Numerical example: Consider a measurable characteristic for which two specification limits are fixed, namely LSL = 263.48 c.u. (c.u. = conventional units) and USL = 263.68 c.u. The target value is T = 263.58 c.u. If we ask for a performance level of Cp equal to 2 and if from the data we get the mean value x̄ = 263.58 c.u. and the standard deviation s = 0.011 c.u., we shall get Ĉp of approximately 0.40 - that is, a very weak potential index of the process. The estimated Taguchi risk is therefore R̂T(x̄) ≈ 0.007k, and this risk is expressed in monetary units. This value shows that if the defective unit is cheap, then the risk is small. For such low production cost items it is not necessary to impose a performance at the level of SIX SIGMA (see the excellent monograph of Praveen Gupta, "The Six Sigma Performance Handbook. A Statistical Guide to Optimizing Results", McGraw-Hill Book Co., New York, 2005).
References
1. Alexis, J. (1999): Metoda Taguchi în practica industrială. Planuri de experiențe. Editura TEHNICĂ, București, Colecția MQM (Taguchi Method in Industrial Practice. Experimental Designs - in Romanian; translated from the French original)
2. Bârsan-Pipu, N. și Popescu, I. (2003): Managementul riscului. Concepte. Metode. Aplicații. Editura Universității TRANSILVANIA, Brașov, Romania (Risk Management. Concepts. Methods. Applications - in Romanian)
3. Blischke, W. R. and Murthy, D. N. P. (2000): Reliability. Modeling, Prediction, Optimization. John Wiley and Sons Inc., New York
4. Chan, L. K., Cheng, S. W. and Spiring, F. A. (1988): A new measure of process capability: Cpm. Journal of Quality Technology, vol. 20, no. 3, pp. 162-175
5. Derman, C. and Ross, S. M. (1992): Statistical Aspects of Quality Control. Academic Press, New York
6. Isaic-Maniu, Al. și Vodă, V. Gh. (1997): Manualul Calității. Editura ECONOMICĂ, București (Quality Handbook - in Romanian)
7. Isaic-Maniu, Al., Mitruț, C. și Voineagu, V. (1999): Statistică pentru managementul afacerilor (ediția a II-a). Editura ECONOMICĂ, București (Statistics for Business Management, Second edition - in Romanian)
8. Isaic-Maniu, Irina (2003): Măsurarea și analiza statistică a riscului în economie. Editura ASE, București, Colecția Statistică - Facultatea de Cibernetică, Statistică și Informatică Economică (Measurement and Statistical Analysis of Economic Risk - in Romanian)
9. Kackar, R. N. (1986): Off-line quality control, parameter design and the Taguchi method. Journal of Quality Technology, vol. 17, no. 4, pp. 176-188
10. Malița, M. și Zidăroiu, C. (1980): Incertitudine și decizie. Editura Științifică și Enciclopedică, București (Uncertainty and Decision - in Romanian)
11. Petrescu, E. și Vodă, V. Gh. (2002): Fișe de control de proces. Teorie și studii de caz. Editura ECONOMICĂ, București (Process Control Charts. Theory and Case Studies - in Romanian)
12. Petrescu, E., Stoichițoiu, D. G. și Vodă, V. Gh. (2004): Incertitudinea de măsurare. Interpretări. Controverse. Proceduri. Editura MEDIAREX 21, București (Measurement Uncertainty. Interpretations. Controversies. Procedures - in Romanian)
13. Raiffa, H. (1970): Decision Analysis. Addison-Wesley, Reading, Mass., U.S.A.
14. Petrescu, E. și Vodă, V. Gh. (2007): Incertitudinea decizională și riscul statistic. CALITATEA (SRAC), Anul 8, nr. 1-2, pp. 81-88 (Decisional Uncertainty and Statistical Risk - in Romanian)
15. Stoichițoiu, D. G. și Vodă, V. Gh. (2002): Istoria Calității. Un eseu concentrat. Editura MEDIAREX 21, București (Quality History. A Concentrated Essay - in Romanian)