Interval Dependence Structures of Two Bivariate Distributions in Risk and Reliability
Boyan Dimitrov
Kettering University, Flint, Michigan, USA [email protected]
Sahib Esa
Institute of National Studies, Naz City, Block C, Erbil, Kurdistan IQ [email protected]
Abstract
We follow the ideas of measuring the strength of dependence between random events, presented at two previous MMR conferences, in South Africa and Tokyo. In the work here we apply them to analyzing the local dependence structure of some popular bivariate distributions. In the Grenoble conference presentation we focused on the bivariate normal distribution with various correlation coefficients, and on the Marshall-Olkin distribution with various parameter combinations. We draw the surfaces z = g_i(x,y), i = 1, 2, of dependence of the i-th component on the other component within the squares [x, x+1]x[y, y+1] and [x, x+.5]x[y, y+.5]. The points (x,y) run within the square [-3.5, 3.5]x[-3.5, 3.5] for the bivariate normal distribution, and within [0,10]x[0,10] for the Marshall-Olkin distribution.
Keywords: local dependence, local regression coefficients, strength of dependence, surface of dependence, bivariate normal distribution, Marshall-Olkin distribution
I. Introduction
In several previous publications [1-6] we developed an idea of how probability tools can be used to measure the strength of dependence between random events. More details are contained in articles [1] and [2]. In the present article we propose to use it for measuring the magnitude of local dependence between random variables. Such dependence is completely different from the global dependence, measured usually by the correlation coefficient. As illustration, we demonstrate how it works in measuring local dependence inside jointly distributed pairs of random variables, using the regression coefficients between random events. Short illustrations (graphics and tables) show the use of these measures in the well-known bivariate normal distribution with different correlation values, and in the Marshall-Olkin distribution, popular in reliability.
II. How people indicate dependence
Dependence under uncertainty is a complex concept. In the classical approach conditional probability is used to determine whether two events are dependent or not: A and B are independent when the probability of their joint occurrence equals the product of the probabilities of their individual appearance, i.e. when
P(A ∩ B) = P(A) · P(B).
Otherwise, the two events are dependent.
In courses on probability, independence of random events is always introduced together with conditional probability. Where independence does not hold, the events are dependent, but the dependence itself is rarely discussed further. There are ways to go deeper into the analysis of dependence, to see detailed pictures inside the global picture, and to use them in studies of uncertainty. This matter is discussed in our previous articles [1] and [2]. Some particular situations are analyzed in [3] to [6]. We refer to these articles for a quick passage to the essentials.
First we notice that the most informative measures of dependence between random events are the two regression coefficients. Their definition is given here:
Definition. The regression coefficient r_B(A) of the event A with respect to the event B is the difference between the conditional probability of the event A given the event B and the conditional probability of the event A given the complementary event B̄, namely

r_B(A) = P(A | B) − P(A | B̄). (1)
This measure of the dependence of the event A on the event B is a directed dependence.
The regression coefficient r_A(B) of the event B with respect to the event A is defined analogously,

r_A(B) = P(B | A) − P(B | Ā).
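To make the definition concrete, here is a minimal sketch in Python; the events and the probability values P(A), P(B), P(A∩B) are hypothetical choices of ours, not taken from the paper:

```python
# Regression coefficients of two events, computed directly from Definition (1).
# The probabilities below are hypothetical illustration values.

def regression_coefficient(p_joint: float, p_a: float, p_b: float) -> float:
    """r_B(A) = P(A | B) - P(A | complement of B)."""
    p_a_given_b = p_joint / p_b                      # P(A | B)
    p_a_given_not_b = (p_a - p_joint) / (1.0 - p_b)  # P(A | B-bar)
    return p_a_given_b - p_a_given_not_b

p_a, p_b, p_joint = 0.4, 0.5, 0.3   # P(A), P(B), P(A n B)

print(regression_coefficient(p_joint, p_a, p_b))  # r_B(A) = 0.6 - 0.2 = 0.4
print(regression_coefficient(p_joint, p_b, p_a))  # r_A(B) = 0.75 - 1/3 ~ 0.4167
```

The two values differ, which already illustrates the asymmetry discussed in property (r2) below.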
Of the many interesting properties of the regression coefficients we would like to point out here just a few:

(r1) The equality r_B(A) = r_A(B) = 0 takes place if and only if the two events are independent.

(r2) The regression coefficients r_B(A) and r_A(B) are numbers of equal sign, and this is the sign of their connection δ(A,B) = P(A∩B) − P(A)P(B). The relationships

$$
r_B(A)=\frac{P(A\cap B)-P(A)P(B)}{P(B)\,[1-P(B)]}, \qquad r_A(B)=\frac{P(A\cap B)-P(A)P(B)}{P(A)\,[1-P(A)]}
$$

hold.
The numerical values of r_B(A) and r_A(B) may not always be equal. There exists an asymmetry in the dependence between random events, and this reflects the nature of real life.

(r3) The regression coefficients r_B(A) and r_A(B) are numbers between −1 and 1, i.e. they satisfy the inequalities −1 ≤ r_B(A) ≤ 1; −1 ≤ r_A(B) ≤ 1.
(r4.1) The equality r_B(A) = 1 holds only when the random event A coincides with (or is equivalent to) the event B. Then the equality r_A(B) = 1 is also valid;

(r4.2) The equality r_B(A) = −1 holds only when the random event A coincides with (or is equivalent to) the event B̄, the complement of the event B. Then, respectively, r_A(B) = −1 is also valid.
(r5) The name regression coefficient of the random event A with respect to the event B comes from the following fact: if I_A(ω) and I_B(ω) are the indicator random variables of the two events A and B, then the best linear regression of I_A(ω) on I_B(ω) is expressed by the equation

I_A(ω) = P(A | B̄) + r_B(A) I_B(ω) + ε(ω),
where ε(ω) is a r.v. with zero expectation and minimum variance.
We interpret the properties (r4) of the regression coefficients in the following way: the closer the numerical value of r_B(A) is to 1, the more densely the events A and B, considered as sets of outcomes of the experiment, sit within each other. In a similar way we interpret the negative values of the regression coefficient.
The regression coefficient is always defined, for any pair of events A and B (zero, sure, arbitrary).
In our opinion, it is possible for one event to have a stronger dependence magnitude on the other than the reverse. This measure suits the measurement of the magnitude of dependence between events. The distance of the regression coefficient from zero (where independence holds) can be used to classify the strength of dependence, e.g. as in the following interpretation of the regression coefficient measuring the global dependence (a short code sketch after the list turns this scale into a function):
• almost independent (when |r_A(B)| < .05);
• weakly dependent (when .05 ≤ |r_A(B)| < .2);
• moderately dependent (when .2 ≤ |r_A(B)| < .45);
• in average dependent (when .45 ≤ |r_A(B)| < .8);
• strongly dependent (when |r_A(B)| ≥ .8).
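A minimal sketch of this scale as a Python function; the cut-offs are exactly the ones listed above:

```python
def dependence_strength(r: float) -> str:
    """Classify the magnitude of a regression coefficient r_A(B)
    using the cut-offs .05, .2, .45 and .8 listed above."""
    m = abs(r)
    if m < 0.05:
        return "almost independent"
    elif m < 0.2:
        return "weakly dependent"
    elif m < 0.45:
        return "moderately dependent"
    elif m < 0.8:
        return "in average dependent"
    else:
        return "strongly dependent"

print(dependence_strength(0.4167))  # moderately dependent
print(dependence_strength(-0.9))    # strongly dependent
```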
Predictions using Regression coefficients
One serious advantage of the regression coefficients is that their magnitude can be used to evaluate the posterior probability of one event when information that the other event occurred is available. The following relation holds:

P(A | B) = P(A) + r_B(A)[1 − P(B)]. (2)

This formula competes with the Bayes rule, which requires the joint probability P(A∩B). We propose to use the strength of dependence r_B(A) instead of the Bayes rule. It seems much more natural in applications, since it uses long-run experience.
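A quick numeric check of rule (2), again with the hypothetical probabilities used earlier: the posterior produced by the regression coefficient coincides with the conditional probability computed through the joint probability, but needs only r_B(A) and the marginals:

```python
# Rule (2): P(A | B) = P(A) + r_B(A) [1 - P(B)], checked against P(A n B)/P(B).
p_a, p_b, p_joint = 0.4, 0.5, 0.3   # hypothetical P(A), P(B), P(A n B)

r_b_of_a = (p_joint - p_a * p_b) / (p_b * (1.0 - p_b))

posterior_rule_2 = p_a + r_b_of_a * (1.0 - p_b)   # uses only r_B(A) and marginals
posterior_direct = p_joint / p_b                  # the classical computation

print(posterior_rule_2, posterior_direct)  # 0.6 0.6 -- they agree
```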
III. Transfer rules: From events to random variables and distributions
The above measures allow studying the behavior of the interaction between any pair of numeric r.v.'s (X,Y) throughout the sample space, and a better understanding and use of dependence.
Let the joint cumulative distribution function (c.d.f.) of the pair (X,Y) be F(x,y) = P(X ≤ x, Y ≤ y), with marginal c.d.f.'s F(x) = P(X ≤ x), G(y) = P(Y ≤ y). Let us introduce the events

A_x = {x < X ≤ x + Δ₁x};  B_y = {y < Y ≤ y + Δ₂y}, for any x, y ∈ (−∞, ∞).
Then the measures of dependence between the events A_x and B_y turn into measures of local dependence between the pair of r.v.'s X and Y on the rectangle D = [x, x+Δ₁x]×[y, y+Δ₂y]. Naturally, they can be named and calculated as follows:

The regression coefficients of X with respect to Y, and of Y with respect to X, on the rectangle [x, x+Δ₁x]×[y, y+Δ₂y] are introduced in analogy with the considerations of the previous section. By use of the Definition above we get
$$
R_Y\big((X,Y)\in D\big)=\frac{\Delta_D F(x,y)-\left[F(x+\Delta_1 x)-F(x)\right]\left[G(y+\Delta_2 y)-G(y)\right]}{\left[F(x+\Delta_1 x)-F(x)\right]\left\{1-\left[F(x+\Delta_1 x)-F(x)\right]\right\}}. \tag{3}
$$

Here Δ_D F(x,y) denotes the two-dimensional finite difference of the function F(x,y) on the rectangle D = [x, x+Δ₁x]×[y, y+Δ₂y], namely

Δ_D F(x,y) = F(x+Δ₁x, y+Δ₂y) − F(x+Δ₁x, y) − F(x, y+Δ₂y) + F(x, y). (4)
In an analogous way R_X((X,Y) ∈ D) is defined; only the denominator in the above expression changes (the symbol F is replaced by the symbol G).
Using these rules one can see and visualize the local dependence between any pair of r.v.'s X and Y with given joint distribution F(x,y) and marginals F(x) and G(y).
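Rules (3) and (4) translate directly into code. Below is a minimal sketch (function and argument names are our own choices; any joint c.d.f. F with marginals Fx and Gy can be plugged in):

```python
from typing import Callable

def interval_regression_Y(F: Callable[[float, float], float],
                          Fx: Callable[[float], float],
                          Gy: Callable[[float], float],
                          x: float, y: float,
                          d1: float, d2: float) -> float:
    """R_Y((X,Y) in D): local dependence over D = [x, x+d1] x [y, y+d2],
    following rules (3) and (4)."""
    # two-dimensional finite difference (4) = P(A_x and B_y)
    joint = F(x + d1, y + d2) - F(x + d1, y) - F(x, y + d2) + F(x, y)
    p_ax = Fx(x + d1) - Fx(x)   # P(A_x) = P(x < X <= x+d1)
    p_by = Gy(y + d2) - Gy(y)   # P(B_y) = P(y < Y <= y+d2)
    return (joint - p_ax * p_by) / (p_ax * (1.0 - p_ax))   # rule (3)

# Sanity check with independent uniform(0,1) components: the result must be 0.
U = lambda t: min(max(t, 0.0), 1.0)
F_indep = lambda u, v: U(u) * U(v)
print(interval_regression_Y(F_indep, U, U, 0.2, 0.3, 0.3, 0.3))  # 0.0
```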
The biggest advantage of the regression coefficients as measures of the magnitude of dependence is their easy interpretation, described above, and the fact that they become available from the knowledge of the probabilities of the respective events, or from the proportional numbers of individuals in the subpopulations of interest.
In probability modeling with multivariate distributions we see a great advantage: knowing that one component falls within an interval, we can predict everything that may happen with the other component. For instance, when we know that X ∈ [a,b], we can predict (by measuring the strength of dependence) how likely it is that Y ∈ [c,d], for any choice of c and d.

Next we illustrate the specific rules for calculating regression coefficients as measures of dependence by analyzing the local dependence structure of the bivariate normal distribution and of the Marshall-Olkin distribution. This ends the theoretical background of our general local dependence studies; we now turn to its application to the two selected probability models.
IV. Correlated Bivariate Normal distribution
I. Analytical expressions
The random vector (X,Y) has a bivariate normal probability distribution if its probability density function is given by the expression:
" r/x=a T-2/Y y-^ V y-n
J ( ct J( CT J ( ct,
,, N 1 2(1-P )
f ( x> y) = ~-n— e
1
Pi
where ρ is the correlation coefficient between X and Y; μ₁, μ₂ are the expected values, and σ₁, σ₂ are the standard deviations of the components X and Y correspondingly. We analyze how the magnitude of the correlation between the components influences the local dependence structure, assuming μ₁ = μ₂ = 0, σ₁ = 1, and σ₂ arbitrary. The functions
$$
g_1(x,y)=\frac{\displaystyle \frac{1}{2\pi\sigma_2\sqrt{1-\rho^2}}\int_x^{x+1}\!\!\int_y^{y+1} e^{-\frac{u^2-2\rho uv/\sigma_2+v^2/\sigma_2^2}{2(1-\rho^2)}}\,du\,dv\;-\;\frac{1}{\sqrt{2\pi}}\int_x^{x+1} e^{-u^2/2}\,du\cdot\frac{1}{\sigma_2\sqrt{2\pi}}\int_y^{y+1} e^{-v^2/(2\sigma_2^2)}\,dv}{\displaystyle \frac{1}{\sqrt{2\pi}}\int_x^{x+1} e^{-u^2/2}\,du\left(1-\frac{1}{\sqrt{2\pi}}\int_x^{x+1} e^{-u^2/2}\,du\right)} \tag{5}
$$
Here we consider the symmetric case σ₂ = 1, and then

g2(x,y) = g1(y,x) (6)

identify the magnitude of dependence between the two components X and Y on the square [x, x+1]×[y, y+1] with lower-left vertex (x,y) and side lengths equal to 1.
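A sketch of how (5)-(6) can be evaluated numerically, here with standard normal marginals via SciPy's bivariate normal c.d.f. (this is our own code, not the authors' Maple worksheets):

```python
from scipy.stats import multivariate_normal, norm

def g1(x: float, y: float, rho: float, a: float = 1.0) -> float:
    """Local dependence of Y on X over the square [x, x+a] x [y, y+a],
    standard normal marginals, correlation rho."""
    mvn = multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, rho], [rho, 1.0]])
    F = lambda u, v: mvn.cdf([u, v])
    joint = F(x + a, y + a) - F(x + a, y) - F(x, y + a) + F(x, y)
    p_x = norm.cdf(x + a) - norm.cdf(x)   # P(x < X <= x+a)
    p_y = norm.cdf(y + a) - norm.cdf(y)   # P(y < Y <= y+a)
    return (joint - p_x * p_y) / (p_x * (1.0 - p_x))

def g2(x: float, y: float, rho: float, a: float = 1.0) -> float:
    return g1(y, x, rho, a)               # the symmetric case (6)

print(g1(0.0, 0.0, 0.95))  # should be close to the .6396 entry of Table 1 below
```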
• The marginals F_X(x) and G_Y(y) are normal distributions with means μ_i and standard deviations σ_i, i = 1, 2.
• We use standard normal marginals (μ_i = 0, σ_i = 1) and correlated components with different numeric values of the correlation coefficient ρ_{X,Y} in our illustrations.
• Our goal is to study the local dependence between X and Y as a function of the value of the correlation coefficient ρ and of the width a of the square [x, x+a]×[y, y+a].
The predictions one can make by use of the global correlation coefficient ρ are through the regression equation

$$
Y=\mu_Y+\rho\,\frac{\sigma_Y}{\sigma_X}\,(X-\mu_X)+\varepsilon.
$$

Here ε is a normally distributed r.v. with zero mean, independent of X.
Our approach suggests using INTERVAL regression coefficients, based on the alternative to the Bayes formula for posterior probabilities cited above. For any pair of variables (X,Y) it looks like this:

P{Y ∈ [c,d] | X ∈ [a,b]} = P{Y ∈ [c,d]} + R_{X∈[a,b]}(Y ∈ [c,d]) · [1 − P{X ∈ [a,b]}].

It is easily evaluated in terms of any particular bivariate distribution, including the normal one.
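As a hedged sketch for the bivariate normal case (assuming SciPy; the interval endpoints are example choices of ours), the rule can be evaluated like this:

```python
from scipy.stats import multivariate_normal, norm

def predict_interval(a: float, b: float, c: float, d: float, rho: float) -> float:
    """P{Y in [c,d] | X in [a,b]} via the interval regression coefficient,
    for standard bivariate normal components with correlation rho."""
    mvn = multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, rho], [rho, 1.0]])
    F = lambda u, v: mvn.cdf([u, v])
    p_joint = F(b, d) - F(b, c) - F(a, d) + F(a, c)  # P{X in [a,b], Y in [c,d]}
    p_x = norm.cdf(b) - norm.cdf(a)
    p_y = norm.cdf(d) - norm.cdf(c)
    r = (p_joint - p_x * p_y) / (p_x * (1.0 - p_x))  # interval regression coeff.
    return p_y + r * (1.0 - p_x)                     # the prediction rule above

# How likely is Y in [0,1] once we learn that X in [0,1], for rho = .5?
print(predict_interval(0.0, 1.0, 0.0, 1.0, 0.5))
```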
A brief analytical and algorithmic discussion, together with more graphics and numeric illustrations, supports our concluding observations. Here we give just one of many examples, illustrating global and local dependence on squares of side lengths 1 and .5, for values of the correlation between X and Y equal to ±.95, ±.5 and ±.15:
II. Graphical illustrations
All graphing and numeric illustrations are made with the program system Maple. Due to the symmetry we show only one of the two options.
Figure 1. Correlated bivariate normal local dependence function g1(x,y) (local dependence of Y on X) on [-3.5, 3.5]×[-3.5, 3.5]: ρ = ±.95, squares of side 1 and .5, and the global distribution pdf.
There are some lines of discontinuity on both surfaces. We observe these in a table of selected values of the local dependence function given below. In our opinion, this defect is due to the computing/graphing program, which in this case is Maple.
Table 1. Numeric values of the function g1(x,y) at integer points in the square [-3,3]×[-3,3], ρ = .95 (— : value not available)

X\Y      -3       -2       -1        0        1        2        3
-3     .6531    .1818   -.3884   -.1389   -.0219   -.0013      —
-2     .0324    .6289   -.1354   -.3948   -.0248   -.0015      —
-1    -.0325   -.0707    .6396   -.2937   -.2062   -.0020      —
 0        —    -.2062   -.2937    .6396   -.0707   -.0325      —
 1        —    -.0248   -.3948   -.1354    .6829    .0324   -.0015
 2        —    -.0219   -.1389   -.3484    .1818    .6531    .0244
 3        —    -.0214   -.1361   -.3418   -.1353    .4884    .5773
In both cases ρ = ±.95 we observe some lines of discontinuity in the surface functions z = g1(x,y) and z = g2(x,y); the analytic reason is unclear to us for now. We think the deficiency is in the program used. However, the symmetry between the local dependence magnitudes is seen from the table.
Another interesting fact is that no matter whether the global correlation ρ between X and Y is negative or positive, the local regression coefficients are positive near the line y = x (for positive global correlation) or the line y = −x (for negative global correlation). The local regression dependence measure becomes negative (drops quickly) not far from these lines, and approaches an indication of independence as the distance from them grows.

One more interesting fact is that the local regression magnitudes do not exceed the global correlation magnitude, but vary with the location of the square within the considered range.

Speaking of predictions, if X ∈ [x, x+1] it is most likely that Y ∈ [x, x+1] when ρ is positive, or Y ∈ [−x−1, −x] when ρ is negative, based on rule (2).

The function g2(x,y) exhibits similar behavior and is symmetric to g1(x,y) with respect to the line y = x. We omit these details.

We also observe high positive local dependence close to the line y = x, and negative local dependence, also of relatively high magnitude, near the opposite line y = −x. This magnitude vanishes as the points move away from the origin (0,0). Notice the reduction of magnitude on the smaller square.
Figure 2. Bivariate normal density function f(x,y), ρ = ±.5: global and local interval dependence (local dependence of X on Y, g2(x,y), on [-3.5, 3.5]×[-3.5, 3.5]).
In both cases ρ = ±.5 we observe saddle points on the surface functions z = g1(x,y) and z = g2(x,y) in the region of the origin (0,0), showing slight positive local dependence.
An interesting fact is that the regression coefficients between X and Y still take negative or positive values, not exceeding in absolute value the global correlation ρ.
The local regression coefficients are positive near the line y = x (for positive global correlation) or the line y = −x (for negative global correlation). The local regression dependence measure becomes negative (drops) near these lines, and approaches an indication of independence as the distance from them grows. An interesting level curve is the one at level L = 0: there the two variables X and Y are independent on the square [x, x+1]×[y, y+1].
The local regression magnitudes do not exceed the global correlation magnitude, and vary with the location of the square.
Speaking of predictions, if X ∈ [x, x+1] it is most likely that Y ∈ [x, x+1] when ρ is positive, or Y ∈ [−x−1, −x] when ρ is negative, based on rule (2).

Here we observe how the graphs of local dependence are influenced by the sign of the correlation coefficient: obviously, one is symmetric to the other with respect to the ordinate axis. The level curves are shown on most graphs. They connect points of the same magnitude of local dependence.
Figure 3. Bivariate normal density function f(x,y), ρ = ±.1: global and local interval dependence (local dependence of Y on X, g1(x,y), on [-3,3]×[-3,3]).
The original distribution is almost symmetric, and there is no significant global correlation. However, we observe differences in the graphs of local dependence, and in how these are influenced by the sign of the correlation coefficient.
Obviously, the magnitudes do not exceed 60% of the correlation, but go up and down.
The level curves are shown on most graphs. They show points of the same magnitude of local dependence.
In general, we observe again high positive local dependence close to the line y = x, and negative local dependence, also of relatively high magnitude, near the opposite line y = −x. This magnitude vanishes as the points move away from the origin (0,0) and from the line y = x (or y = −x in the case of negative global correlation). Notice the reduction of magnitude by half on the smaller square.
We observe something similar in the cases of low correlations ρ = −.10 and ρ = −.15. As the ancient Greeks used to say, when you have a graph, "Just sit, watch, and make your own conclusions".
V. The Bivariate Marshall-Olkin Distribution
This distribution is well known in reliability from the "fatal shock models", where two exponentially distributed lifetimes X and Y interact so that their residual lifetimes have the joint distribution
P{X > x, Y > y} = e^{−λx − μy − ν max(x, y)}, x, y ≥ 0. (7)
The marginal residual life times are
P{X > x} = e^{−(λ+ν)x}, x ≥ 0, and P{Y > y} = e^{−(μ+ν)y}, y ≥ 0. (8)
Applying these functions in rules (3) and (4) with Δ₁x = Δ₂y = a, we get the expressions for the regression coefficients on the squares [x, x+a]×[y, y+a], namely
$$
R_1(x,y)=\frac{e^{-\lambda x-\mu y}\!\left[e^{-\nu\max(x,y)}-e^{-\lambda a-\nu\max(x+a,\,y)}-e^{-\mu a-\nu\max(x,\,y+a)}+e^{-(\lambda+\mu)a-\nu\max(x+a,\,y+a)}\right]-e^{-(\lambda+\nu)x}\left(1-e^{-(\lambda+\nu)a}\right)e^{-(\mu+\nu)y}\left(1-e^{-(\mu+\nu)a}\right)}{e^{-(\lambda+\nu)x}\left(1-e^{-(\lambda+\nu)a}\right)\left[1-e^{-(\lambda+\nu)x}\left(1-e^{-(\lambda+\nu)a}\right)\right]}.
$$
Respectively, R2(x,y) is given by a similar expression, with μ changed to λ and x changed to y.

The two functions representing the regression coefficient surfaces are symmetric with respect to the line y = x only when μ = λ.

Our graphs show the local dependence surfaces between the components X and Y on squares of size a = .5 and a = 1 within the square [0,3]×[0,3], for parameter values λ = 1, μ = 2, ν = 3. They are presented in Figures 4 and 5 below.

On both we observe high positive dependence in a neighborhood of the origin, negative dependence around the small values of the dependent variable, positive dependence along the line y = x, and vanishing dependence at the large values of the dependent variable.
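The surfaces can be reproduced numerically straight from (7) and (8), without the closed form above. A minimal sketch (function names are ours; parameter values as in the figures):

```python
import numpy as np

lam, mu, nu = 1.0, 2.0, 3.0   # the combination used in Figures 4 and 5

def S(x: float, y: float) -> float:
    """Joint survival function (7): P{X > x, Y > y}."""
    return np.exp(-lam * x - mu * y - nu * max(x, y))

def R1(x: float, y: float, a: float) -> float:
    """Local dependence on the square [x, x+a] x [y, y+a]."""
    # inclusion-exclusion on the survival function gives P(A_x and B_y)
    joint = S(x, y) - S(x + a, y) - S(x, y + a) + S(x + a, y + a)
    p_ax = np.exp(-(lam + nu) * x) - np.exp(-(lam + nu) * (x + a))  # from (8)
    p_by = np.exp(-(mu + nu) * y) - np.exp(-(mu + nu) * (y + a))
    return (joint - p_ax * p_by) / (p_ax * (1.0 - p_ax))

def R2(x: float, y: float, a: float) -> float:
    """Same numerator; the denominator uses the other marginal."""
    joint = S(x, y) - S(x + a, y) - S(x, y + a) + S(x + a, y + a)
    p_ax = np.exp(-(lam + nu) * x) - np.exp(-(lam + nu) * (x + a))
    p_by = np.exp(-(mu + nu) * y) - np.exp(-(mu + nu) * (y + a))
    return (joint - p_ax * p_by) / (p_by * (1.0 - p_by))

print(R1(0.0, 0.0, 0.5), R2(0.0, 0.0, 0.5))  # positive dependence near the origin
```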
Figure 4. Marshall-Olkin, λ=1, μ=2, ν=3: local dependence functions R1(x,y) and R2(x,y) on squares of side .5.

Figure 5. Marshall-Olkin, λ=1, μ=2, ν=3: local dependence functions R1(x,y) and R2(x,y) on squares of side 1.
These graphs, and the numeric values on the axes of the boxes, indicate that the magnitude of dependence is only slightly affected by the size of the squares.
There is negative dependence near the lines y = 0 and x = 0, depending on the variable with respect to which dependence is measured. This magnitude then quickly rises and keeps a positive value along lines parallel to the respective axis. The magnitude of mutual local dependence then drops for a while, and rises again near the line y = x.
In the opposite direction, dependence vanishes as the distance from the line y = x grows.
The local dependence functions R1(x,y) and R2(x,y) on squares of side a = 1 and a = .5, for the parameter combination λ = 3, μ = 2, ν = 1 of the Marshall-Olkin distribution, are shown in Fig. 6.
Figure 6. Marshall-Olkin local dependence functions R1(x,y) and R2(x,y) on squares of side 1 and .5, for parameters λ=3, μ=2, ν=1.
The graphs and the numeric values on the coordinate axes of the box indicate that the magnitude of dependence is now affected by the side of the square: one is of magnitude .6, the other of magnitude .15. However, the shape of the surface of dependence is similar to the others.

The negative dependence near the lines y = 0 and x = 0 (depending on the variable with respect to which dependence is measured) still shows the same behavior, and keeps a stable value along the variable with the smaller parameter value.
The magnitude of this dependence is now higher compared to the case where the interacting component had higher intensity (compare ν = 3 to ν = 1). The magnitude near the line y = x keeps its ridge-like shape.

In the opposite directions, dependence vanishes as the distance from the line y = x grows.
Figure 7. Marshall-Olkin local dependence functions R1(x,y) and R2(x,y) on squares of side 1 and .5, for parameters λ=1, μ=1, ν=3, and λ=3, μ=3, ν=1.
The last combinations of numeric values of the parameters of the Marshall-Olkin distribution show that the magnitude of dependence is not affected by the values of the parameters of the marginals (as long as the two marginal parameters are equal, λ = μ), no matter what the value ν of the interaction component is.
The overall shape of the graphs of local mutual dependence is similar to the others. The magnitude of the dependence rises near the origin (up to .6, compared to .5 in the other numeric combinations of parameters).

In the opposite direction of the pair (Y w.r.t. X, or X w.r.t. Y), dependence vanishes with the distance from the line y = x.

The ridge of local dependence along the line y = x stays steadily positive near the origin and slowly vanishes away from it.
VI. Conclusions
• We discussed Regression Coefficients as measures of dependence between two random events. These measures are asymmetric, and exhibit natural properties.
• Their numerical values serve as indication for the magnitude of dependence between random events.
• These measures provide simple ways to detect independence, coincidence, degree of dependence.
• If either measure of dependence is known, it allows better prediction of the chance for occurrence of one event, given that the other one occurs.
• We observe unexpected behavior of the regression coefficients between the two components of the symmetric bivariate normal distribution with different magnitudes of the correlation coefficient.
• These measures are examined via the 3-D surface of dependence on squares [x, x+a]×[y, y+a] with a = .5, a = 1.0 and (x,y) ∈ [−3.5, 3.5]×[−3.5, 3.5].
• There is high positive local dependence close to the lines y = x or y = −x; negative local dependence is also present. The magnitudes of dependence vanish as the points move far from the origin (0,0) or from the lines of correlation.
• Notice the reduction of magnitude on the smaller square.
• We observe unexpected behavior of the regression coefficients between the two components of the Marshall-Olkin distribution.
• These measures are examined via the 3-D surface of dependence on squares [x, x+a]×[y, y+a] with a = .5, a = 1.0 and (x,y) ∈ [0, 3]×[0, 3].
• There is high positive local dependence close to the origin and along the line y = x.
• Negative local dependence is present near the axis of the variable with respect to which the regression coefficient is considered.
• The magnitudes of dependence vanish on the opposite side, as the points move far from the origin (0,0).
• Notice the reduction of magnitude on the smaller square for the interacting component with lower parameter values.
• The magnitudes of dependence do not change with the value of the interaction component parameter, as long as the two components have the same distribution.
As a possible future investigation we challenge the readers of this article to compare the local dependence structures in the asymmetric bivariate normal distribution, and to compare our findings with the local dependence structures in the copulas associated with known bivariate distributions such as the normal and the Marshall-Olkin distributions. The local structure with correlated components remains open.
VII. Acknowledgements
This work was reported at the 10th International Conference on Mathematical Methods in Reliability (MMR2017) in Grenoble, France, July 2-6, 2017. Thanks to the support of Kettering University through the Provost Travel Funds and the Department of Mathematics, the presentation by the first author was made possible.
References
[1] Dimitrov, B. (2010). Some Obreshkov Measures of Dependence and Their Use, Compte Rendus de l'Academie Bulgare des Sciences, Vol. 63, No. 1, pp. 15-18.
[2] Dimitrov, B. (2014). Dependence structure of some bivariate distributions, Serdica Journal of Computing, Vol. 8, No. 3, pp. 101-122.
[3] Dimitrov, B. (2013). Measures of dependence in reliability, Proceedings of the 8th MMR'2013, Stellenbosch, South Africa, July 1-4, pp. 65-69.
[4] Dimitrov, B. and Esa, S. (2015). On the local dependence structure in politics and in reliability distributions, Proceedings of the Tokyo MMR'2015.
[5] Dimitrov, B., Esa, S., Kolev, N. and Pitselis, G. (2013). Transfer of Global Measures of Dependence into Cumulative Local, Applied Mathematics, doi:10.4236/am.2013, published online, pp. 2019-2028 (http://www.scirp.org/journal/am).
[6] Esa, S. and Dimitrov, B. (2016). Dependence Structures in Politics and in Reliability. In Proceedings of the Second International Symposium on Stochastic Models in Reliability Engineering, Life Science and Operations Management (SMRLO'16), I. Frenkel and A. Lisnianski (eds.), Beer Sheva, Israel, February 15-18, pp. 318-322, IEEE CPS, 978-1-4673-9941-8/16, 2016.
[7] Esa, S. and Dimitrov, B. (2013). Dependencies in the world of politics, Proceedings of the 8th MMR'2013, Stellenbosch, South Africa, July 1-4, pp. 70-73.