
"SIMPSON'S PARADOX" AS A MANIFESTATION OF THE PROPERTIES OF WEIGHTED AVERAGE (Part 1)

УДК 311.14

Encho Zhekov Zhekov

Place of work:
National Social Security Institute, Office Burgas - manager;
University "Prof. Assen Zlatarov", Burgas - lecturer.
Address: Bulgaria, Burgas, "Stefan Stambolov" str., 126
Phone: +359-56-80-37-10; +359-887-59-74-18
E-mail: encho.jekov@burgas.nssi.bg; encho_j@bginfo.net

The article proves that the so-called "Simpson's paradox" is a special case of the manifestation of the properties of the weighted average. In such cases it always comes down to comparing two weighted averages, where the average of the larger variables is less than that of the smaller ones. The article demonstrates a method for analysing the relative change of magnitudes of the type:

$$S = \sum_{i=1}^{k} x_i y_i,$$

which answers the question: what is the reason for the weighted average of a few variables with higher values to be smaller than the weighted average of a few variables with lower values. This method explains "Simpson's paradox".

Keywords: Simpson's paradox; weighted average; multidimensional space.


"To think logically and to think truthfully are two different things."

V.G. Belinsky

1. "Simpson's paradox"

E. Simpson [4] interprets a case from statistical practice, later called "Simpson's paradox" by C. R. Blyth [2]. Similar cases were described before that by K. Pearson [3], G. Yule [5] and others. The authors demonstrate cases in which a statistical relationship is observed for the parts of an aggregate, while for the whole aggregate it not only fails to occur but is reversed. One of the examples cited is that of P. J. Bickel, E. A. Hammel and J. W. O'Connell [1], based on data on the number and proportion of accepted applicants at the University of California, Berkeley:

Table 1
              Men              Women
History       20,0% (1/5)      25,0% (2/8)
Geography     75,0% (6/8)      80,0% (4/5)
University    53,8% (7/13)     46,2% (6/13)

In the history department five men applied, of whom one was accepted, and eight women, of whom two were accepted. In the geography department eight men applied, of whom six were accepted, and five women, of whom four were accepted. In both departments the share of accepted women is higher than that of accepted men (25,0% and 80,0% versus 20,0% and 75,0%, respectively). For the whole university, however, the share of accepted women is lower than that of accepted men (46,2% versus 53,8%). There is a contradiction between the presentation of men and women in the separate departments and in the university as a whole, and this contradiction leads some people to describe the case as paradoxical.

To understand whether there is anything unusual in such examples, let us imagine that the same group of candidates (or another twenty-six candidates allocated in the same way) were given a test, with their responses scored from 0 to 100 points, and that their average results broken down by department and sex are as follows:

Table 2
              Men                               Women
              Average results   Number of       Average results   Number of
              (points)          candidates      (points)          candidates
History       20                5               25                8
Geography     75                8               80                5
University                      13                                13


We see that in each department separately the average success of women is higher than that of men. For the whole university, however, the ratio is reversed: the average success of men is (20·5 + 75·8) : 13 = 53,8 points and that of women is (25·8 + 80·5) : 13 = 46,2 points. So we again have a contradiction between the presentation of the candidates in the departments separately and in the university as a whole, and considering that the numbers in both tables are exactly the same, we should conclude that "Simpson's paradox" appears here again.
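To make the arithmetic concrete, here is a minimal sketch in plain Python that reproduces the university-wide averages from Table 2 (the dictionary layout and the function name are mine, introduced only for illustration):

```python
# Average test scores and candidate counts from Table 2.
men   = {"History": (20, 5), "Geography": (75, 8)}   # (average points, number of candidates)
women = {"History": (25, 8), "Geography": (80, 5)}

def university_average(groups):
    """Weighted average of department averages, weighted by candidate counts."""
    total_points = sum(avg * n for avg, n in groups.values())
    total_candidates = sum(n for _, n in groups.values())
    return total_points / total_candidates

print(round(university_average(men), 1))    # 53.8 points
print(round(university_average(women), 1))  # 46.2 points
```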

On the other hand, in the second example the matter is nothing but a comparison of two weighted averages (the average results of men and of women for the university), in which the average calculated from the smaller variables (the results of the men) is higher than that calculated from the larger ones (the results of the women). We know that the difference between the values of two weighted averages is affected not only by the differences between the averaged variables, but also by the differences between the structures of the weights with which they are weighted. Sometimes these two influences act in opposite directions and the second is stronger than the first. Then the average of the smaller variables turns out to be greater than that of the larger ones. Obviously something like this has happened here, and there is nothing paradoxical about it. But this raises a question: is it logical that the same contradiction in the presentation of men and women should be natural when it is expressed in test results, and paradoxical when it is expressed in the relative shares of accepted applicants?

It is striking that when "Simpson's paradox" is discussed, the examples given are generally like that of the University of California, with sets divided into two parts. A logical question is: can the "paradox" manifest itself in sets divided into more than two parts, and what would it look like? Here is one such example:

Table 3
              Men                Women
History       82,1% (23/28)      88,9% (8/9)
Geography     77,8% (14/18)      83,3% (15/18)
Biology       55,6% (5/9)        60,7% (17/28)
University    76,4% (42/55)      72,7% (40/55)

As seen, in each department separately the proportion of accepted women is higher than that of men, but for the university as a whole the relationship is reversed. And if the data in Table 1 are seen as paradoxical, then the data in Table 3 must be seen the same way.

But when we consider aggregates consisting of more than two parts, the absence of any paradox becomes quite obvious. This is demonstrated particularly well by the next two examples.

Let us take a shop in which, in two consecutive months, three types of homogeneous goods are sold, for example three mobile phone models "A", "B" and "C". The table shows the prices of the phones and the quantities sold in the two months:

Table 4
          January 2011             February 2011
Model     Price (€)   Quantity     Price (€)   Quantity
"A"       95,00       100          90,00       300
"B"       75,00       200          70,00       200
"C"       55,00       300          50,00       100
Total                 600                      600

As seen, in February the prices of all three models fell by 5 € compared to January. However, in January 600 phones were sold for 41 000 €, while in February 600 phones were sold for 46 000 €. This means that while the prices of all models fell, the average price of a phone sold rose from 68,33 € to 76,67 €.

Is there anything disturbing in this example? Of course not. The most elementary explanation is that there was an appreciable increase in the proportion of phones sold of the most expensive model "A" at the expense of the least expensive model "C", so the average price of the phones sold increased even though the price of each individual model fell.
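A short sketch with the Table 4 figures shows the same effect numerically: every price drops, yet the average price of a phone sold rises because the sales mix shifts toward the expensive model (the variable and function names are illustrative, not taken from the article):

```python
# Prices (€) and quantities sold, from Table 4.
january  = {"A": (95.0, 100), "B": (75.0, 200), "C": (55.0, 300)}
february = {"A": (90.0, 300), "B": (70.0, 200), "C": (50.0, 100)}

def turnover_and_average(month):
    turnover = sum(price * qty for price, qty in month.values())
    quantity = sum(qty for _, qty in month.values())
    return turnover, turnover / quantity

print(turnover_and_average(january))   # (41000.0, 68.33...)
print(turnover_and_average(february))  # (46000.0, 76.66...)
```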

It is not difficult to realize, however, that the numbers in the table may represent not absolute values, such as the prices of the phones, but relative values in percent - for example, the relative shares of phones sold under contract with a mobile operator out of all phones sold of each model. Then the table will look like this:

Table 5
             January 2011          February 2011
Model "A"    95,0% (95/100)        90,0% (270/300)
Model "B"    75,0% (150/200)       70,0% (140/200)
Model "C"    55,0% (165/300)       50,0% (50/100)
Total        68,33% (410/600)      76,67% (460/600)

For all three models the proportion of phones sold under contract is higher in January than in February. However, 410 out of 600 phones were sold under contract in January, i.e. 68,33%, and in February 460 out of 600 phones, i.e. 76,67%. It turns out that although for each of the three models separately contract phones sold better in January than in February, for all phones together the opposite holds: in February the sales under contract are relatively higher than in January. Which, if we follow the logic of the example of the candidates at the University of California, should be defined as a "Simpson's paradox".
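The same computation applied to the shares of Table 5 is again just a weighted average whose weights are the denominators of the shares; a minimal sketch under that reading (per-model contract counts taken from Table 5, helper name mine):

```python
# (phones sold under contract, phones sold) per model, from Table 5.
january  = {"A": (95, 100),  "B": (150, 200), "C": (165, 300)}
february = {"A": (270, 300), "B": (140, 200), "C": (50, 100)}

def overall_share(month):
    # Weighted average of the per-model shares, weighted by phones sold per model:
    # sum(n_i) / sum(N_i) == sum((n_i / N_i) * N_i) / sum(N_i)
    return sum(n for n, _ in month.values()) / sum(N for _, N in month.values())

print(f"{overall_share(january):.2%}")   # 68.33%
print(f"{overall_share(february):.2%}")  # 76.67%
```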

It turns out that the same relationships between the same numbers may in some cases seem quite natural, and in others paradoxical. This is very strange. Is it not logical that if a relationship between a group of numbers is paradoxical, it should be paradoxical in all cases, regardless of what concepts stand behind these numbers? Conversely, if one of the manifestations of this relationship has an explanation, is that not true for all other manifestations, including those which seem paradoxical? It is abundantly clear that in the last two examples we are comparing two weighted averages, and once we have an explanation for the first example, we have the same explanation for the second one, too. What kind of a paradox are we talking about then?

The same is true for the example of the candidates at the University of California and for all cases in which "Simpson's paradox" occurs. They clearly show that whatever concepts stand behind the numbers, it always comes down to comparing two weighted averages, where the average calculated from the larger numbers is less than that calculated from the smaller ones. Of course, the special feature of "Simpson's paradox" is that we always compare weighted averages of relative numbers whose weights are the denominators of those numbers themselves. But this special feature does not change the nature of things.

Apparently the "paradox" is not in the relationships between the numbers themselves, but in how we perceive them.

What was said above can be formulated in the following way. Let us have four pairs of numbers: $n_{A1}$ and $N_{A1}$; $n_{A2}$ and $N_{A2}$; $n_{B1}$ and $N_{B1}$; $n_{B2}$ and $N_{B2}$, which satisfy the inequalities $n_{A1} < N_{A1}$, $n_{A2} < N_{A2}$, $n_{B1} < N_{B1}$ and $n_{B2} < N_{B2}$. I introduce this condition because it is present in all examples that demonstrate "Simpson's paradox"; as will become clear later, the paradox may also manifest itself when it does not hold. Besides, as we saw, it is entirely possible for the paradox to manifest itself in cases in which the aggregate is divided into more than two parts.

"Simpson's paradox" consists in the fact that although for the magnitudes $n_{A1}$, $n_{A2}$, $n_{B1}$, $n_{B2}$, $N_{A1}$, $N_{A2}$, $N_{B1}$ and $N_{B2}$ the inequalities

$$\frac{n_{A1}}{N_{A1}} > \frac{n_{B1}}{N_{B1}} \quad\text{and}\quad \frac{n_{A2}}{N_{A2}} > \frac{n_{B2}}{N_{B2}}$$

may be valid, it does not mean that in all cases

$$\frac{n_{A1}+n_{A2}}{N_{A1}+N_{A2}} > \frac{n_{B1}+n_{B2}}{N_{B1}+N_{B2}}$$

is valid. It is possible that

$$\frac{n_{A1}+n_{A2}}{N_{A1}+N_{A2}} < \frac{n_{B1}+n_{B2}}{N_{B1}+N_{B2}}.$$

To explain this, the magnitudes

$$\frac{n_{A1}+n_{A2}}{N_{A1}+N_{A2}} \quad\text{and}\quad \frac{n_{B1}+n_{B2}}{N_{B1}+N_{B2}}$$

have to be expressed through $\frac{n_{A1}}{N_{A1}}$, $\frac{n_{A2}}{N_{A2}}$, $\frac{n_{B1}}{N_{B1}}$ and $\frac{n_{B2}}{N_{B2}}$. It is known that the former are weighted averages of the latter, weighted by the weights $N_{A1}$, $N_{A2}$, $N_{B1}$ and $N_{B2}$:

$$\frac{n_{A1}+n_{A2}}{N_{A1}+N_{A2}} = \frac{\dfrac{n_{A1}}{N_{A1}}N_{A1}+\dfrac{n_{A2}}{N_{A2}}N_{A2}}{N_{A1}+N_{A2}} = \frac{\sum_{i=1}^{2}\dfrac{n_{Ai}}{N_{Ai}}N_{Ai}}{\sum_{i=1}^{2}N_{Ai}} = \overline{\left(\frac{n_A}{N_A}\right)}_{(N_A)},$$

$$\frac{n_{B1}+n_{B2}}{N_{B1}+N_{B2}} = \frac{\dfrac{n_{B1}}{N_{B1}}N_{B1}+\dfrac{n_{B2}}{N_{B2}}N_{B2}}{N_{B1}+N_{B2}} = \frac{\sum_{i=1}^{2}\dfrac{n_{Bi}}{N_{Bi}}N_{Bi}}{\sum_{i=1}^{2}N_{Bi}} = \overline{\left(\frac{n_B}{N_B}\right)}_{(N_B)}.$$

The symbols $\overline{\left(\frac{n_A}{N_A}\right)}_{(N_A)}$ and $\overline{\left(\frac{n_B}{N_B}\right)}_{(N_B)}$ will denote the weighted averages, to distinguish them from the unweighted ones:

$$\overline{\left(\frac{n_A}{N_A}\right)} = \left(\frac{n_{A1}}{N_{A1}}+\frac{n_{A2}}{N_{A2}}\right) : 2 \quad\text{and}\quad \overline{\left(\frac{n_B}{N_B}\right)} = \left(\frac{n_{B1}}{N_{B1}}+\frac{n_{B2}}{N_{B2}}\right) : 2.$$

As seen, in comparing $\frac{n_{A1}+n_{A2}}{N_{A1}+N_{A2}}$ to $\frac{n_{B1}+n_{B2}}{N_{B1}+N_{B2}}$ we in fact compare the average $\overline{\left(\frac{n_A}{N_A}\right)}_{(N_A)}$ on one side and the average $\overline{\left(\frac{n_B}{N_B}\right)}_{(N_B)}$ on the other.

This means that in order to explain "Simpson's paradox" we should simply answer the question: what is the reason for the weighted average of two variables with higher values to be smaller than the weighted average of the other two variables with lower values?
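As a numerical illustration of this identity, here is a small sketch with the Table 1 data: the university-wide acceptance rates are exactly the weighted averages of the department rates, weighted by the numbers of applicants $N_{A1}$, $N_{A2}$, $N_{B1}$, $N_{B2}$ (the dictionary layout and helper name are mine):

```python
# (accepted, applicants) per department, from Table 1.
men   = {"History": (1, 5), "Geography": (6, 8)}    # n_A1/N_A1, n_A2/N_A2
women = {"History": (2, 8), "Geography": (4, 5)}    # n_B1/N_B1, n_B2/N_B2

def weighted_average_rate(groups):
    # sum(n_i) / sum(N_i) == sum((n_i / N_i) * N_i) / sum(N_i)
    rates_times_weights = sum((n / N) * N for n, N in groups.values())
    total_weight = sum(N for _, N in groups.values())
    return rates_times_weights / total_weight

print(f"men:   {weighted_average_rate(men):.1%}")    # 53.8%
print(f"women: {weighted_average_rate(women):.1%}")  # 46.2%
```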

With the above in mind, I will demonstrate a method for analysing the causes (factors) due to which two weighted averages differ from one another. This method will help us understand "Simpson's paradox". Before that, however, I have to dwell on a topic that seemingly has nothing to do with the above but in fact, as the patient reader will see later, lies at its base.

2. Analysis of the relative change of variables of the type $S = \sum_{i=1}^{k} x_i y_i$

In this part I will focus on the analysis of the relative change of S over time, i.e. on the ratio between the values of this variable at two different moments. Everything I will say also applies to the relative difference between two values of this variable in space.

There are many indicators that describe characteristics representing a sum of products of two groups of variables. Such a characteristic is the total turnover of the phones sold of the three models in the example: 41 000 € in January and 46 000 € in February. This type of relationship between one variable and two groups of variables is called an additive-multiplicative form of relation, and its mathematical expression is:

$$S = \sum_{i=1}^{k} x_i y_i,$$

where $y_i$ are the values of one group of variables (the numbers of phones sold of the various models), $x_i$ are the values of the other group of variables (the average price of each phone model), and k is the number of groups into which all units are distributed (the number of phone models), with $i = 1, 2, 3, \dots, k$.

The values of the magnitude S and its change are determined by the values and the change of the groups of variables $x_i$ and $y_i$. Therefore x and y are called factors or factor variables; usually y is called the extensive and x the intensive factor. S is an additive-multiplicative variable called the result. When talking about the result and the factors that determine it I will use the symbols S, x and y, and when talking about the groups of variables that describe them, the symbols $x_i$ and $y_i$.

I will provide a method for analysing the impact of the change of the factors x and y on the change of the result S. This tool will help me explain "Simpson's paradox". Before that, however, I have to say something important that will be needed later.

First, when analysing the impact of the factors x and y, the two groups of variables $x_i$ and $y_i$ should be regarded as independent of each other. Indeed, it can hardly be denied that in the expression

$$S = \sum_{i=1}^{k} x_i y_i = x_1 y_1 + x_2 y_2 + \dots + x_k y_k,$$

the value of any of the variables $x_i$ does not depend in any way on any of the values of the variables $y_i$, neither on the one by which it is multiplied nor on the rest. For example, the values of $x_1$ do not depend on $y_1$, nor on $y_2$, $y_3$, etc. Sufficient evidence is that numerous examples can be given in which $y_1$, $y_2$, $y_3$, etc. change while $x_1$ remains constant, and vice versa. The same is true for the other variables in the group $x_i$. Of course, the same applies to the reverse dependence, of the group $y_i$ on the group $x_i$. Therefore the two groups of variables are independent of each other and must be perceived as such.


Second, that the groups of variables $x_i$ and $y_i$ are independent of each other means that in the expression $S = \sum_{i=1}^{k} x_i y_i$ they participate in the same way. Of course, in different practical cases different concepts stand behind these variables. These are concepts that are distinguished as "quantitative" and "qualitative" characteristics of the units in the aggregate, and hence the factors are seen as different in nature: extensive and intensive. But it is clear that from a mathematical point of view any distinction between the two groups of variables $x_i$ and $y_i$ is not only unfounded but completely wrong: ultimately, the expression $S = \sum_{i=1}^{k} x_i y_i$ involves just two sets of numbers that in no way differ from one another, and if we swap their positions on the right side of the equation, its left side will remain unchanged. Therefore the numbers behind the variables $x_i$ are involved in the same way as the numbers behind the variables $y_i$, and calling them an "intensive" and an "extensive" factor does not make them different from each other. Which means that whatever definitions we give these two factors, this does not change the mathematical fact that the variables $x_i$ affect the values of S in the same way as the variables $y_i$.

Third, from the above follows a very important conclusion about the nature of the factors x and y and their influence on the result S. It is as follows. With regard to the extensive factor y, the aggregation of the variables $y_i$ has a cognitive sense: it gives the total number of units in the aggregate, $\sum_{i=1}^{k} y_i$. Hence, dividing each of the variables $y_i$ by this total means finding the proportion of units in the relevant part, $v_{y_i} = y_i \big/ \sum_{i=1}^{k} y_i$, where $\sum_{i=1}^{k} v_{y_i} = 1$. For example, adding up the numbers of phones of the three models sold in January gives the total number of phones sold, 600. Dividing the number of phones of any model by 600 gives its share: model "A" 16,67%, model "B" 33,33%, etc. The same holds for February. Therefore, when considering the factor y, it is assumed that it affects the values of the magnitude S in two ways: first, through the values of the group of variables $y_i$, and second, through their relative shares, or relative frequencies, $v_{y_i}$, i.e. through the "structure" of this group.

As to the intensive factor x, the aggregation of the variables $x_i$, although entirely possible from a mathematical point of view, has no identifiable cognitive sense. For example, adding up the January prices of the three models gives a number that does not mean anything: 225 €. Referring each price to this number means nothing either (for model "A" 42,22%, for model "B" 33,33%, etc.): the result is a set of relative values $v_{x_i} = x_i \big/ \sum_{i=1}^{k} x_i$, where $\sum_{i=1}^{k} v_{x_i} = 1$, which have no meaning of their own. In other words, using terms like "structure" and "relative shares" for the intensive factor x is not customary.

It was said, however, that the two groups of variables $x_i$ and $y_i$ affect the variable S in the same way. It follows that if the factor y has a structure, and this structure and its changes influence the values of S and their changes, the same should be true of the factor x: it affects S both through the values of $x_i$ and through their relative frequencies $v_{x_i}$. That for the latter the term "relative shares" sounds unusual is hardly relevant to this impact.

Indeed, what is the logic in thinking that in the expression $S = x_1 y_1 + x_2 y_2 + \dots + x_k y_k$ the change of the group of variables $y_i$ affects S both through the change in their values and through the changed relationship between them, while for the group of variables $x_i$ that does not apply? To claim so only because for them the term "structure" is not used would be absurd. It is clear that changing the values of the variables $x_i$ changes the ratio between them in exactly the same way as it happens in the group of variables $y_i$. But if a change in the ratio within one of the groups of variables affects the values of S in some way, is it not completely logical that S should be influenced in the same way by the other group? And the fact that for the first ratio the term "structure" is customary while for the other it is not hardly matters.

In other words, when we analyse the influence of the factors x and y we should look for a structural impact not only in the extensive factor y but also in the intensive factor x. Moreover, since, as we said, the two groups of variables $x_i$ and $y_i$ influence the values of S in the same way, the structural influence should be the same. Whether this is so, we will see later. For now, with the above in mind, I will introduce a broader definition of "structure" that refers to both factors:

For each group or row of k variables $a_i$, where $i = 1, 2, 3, \dots, k$, there is a second group or row of the same k variables $v_{a_i}$, where $\sum_{i=1}^{k} v_{a_i} = 1$, which can be called the "structure" of the first, and the values $v_{a_i}$ the "relative frequencies" of the variables $a_i$. In most studies the variables $a_i$ are positive numbers, so the elements of the "structure" are $v_{a_i} = a_i \big/ \sum_{i=1}^{k} a_i$. I use, however, the form $v_{a_i} = a_i \big/ \sum_{i=1}^{k} \lvert a_i \rvert$, because the variables may have a negative sign.

With the above in mind, I will demonstrate a method of analysing the relative change of additive-multiplicative variables of the type $S = \sum_{i=1}^{k} x_i y_i$. It is based on representing the two groups of variables $x_i$ and $y_i$ as vectors in multidimensional space. Let us denote them as $W_x(x_1, x_2, \dots, x_k)$ and $W_y(y_1, y_2, \dots, y_k)$. The coordinates of these vectors are the variables $x_i$ and $y_i$, and their lengths are:

$$\lvert W_x \rvert = \sqrt{\sum_{i=1}^{k} x_i^2} \quad\text{and}\quad \lvert W_y \rvert = \sqrt{\sum_{i=1}^{k} y_i^2}.$$

$W_k$ denotes the vector of a structure all of whose parts are equal.

The lengths of the vectors can also be represented as:

$$\lvert W_x \rvert = \sqrt{\sum_{i=1}^{k} x_i^2} = \sum_{i=1}^{k} x_i \cdot \sqrt{\sum_{i=1}^{k} v_{x_i}^2} \quad\text{and}\quad \lvert W_y \rvert = \sqrt{\sum_{i=1}^{k} y_i^2} = \sum_{i=1}^{k} y_i \cdot \sqrt{\sum_{i=1}^{k} v_{y_i}^2}.$$

The last two equations are important because they express a relatively difficult-to-interpret mathematical characteristic, the length of the vector $W_y$, $\sqrt{\sum_{i=1}^{k} y_i^2}$, through two others that are popular enough and widely used in statistics: the sum of the values of the group of variables $y_i$, $\sum_{i=1}^{k} y_i$, and the Hirschman coefficient of structural unevenness, $\sqrt{\sum_{i=1}^{k} v_{y_i}^2}$. The same applies to the representation of the length of the vector $W_x$.

Figure 1: Representation of the two groups of variables $x_i$ and $y_i$ and of their structures as vectors in three-dimensional space

The values $\sqrt{\sum_{i=1}^{k} v_{x_i}^2}$ and $\sqrt{\sum_{i=1}^{k} v_{y_i}^2}$ are the lengths of vectors whose coordinates are not the values $x_i$ and $y_i$ but their relative frequencies $v_{x_i}$ and $v_{y_i}$. That is, these are vectors that represent not the groups of variables $x_i$ and $y_i$ but their structures. I will denote them by $W_{v_x}(v_{x_1}, v_{x_2}, \dots, v_{x_k})$ and $W_{v_y}(v_{y_1}, v_{y_2}, \dots, v_{y_k})$.

The angle between the vectors $W_x$ and $W_y$ is $\alpha_{xy}$ and its cosine is:

$$\cos\alpha_{xy} = \frac{\sum_{i=1}^{k} x_i y_i}{\sqrt{\sum_{i=1}^{k} x_i^2}\,\sqrt{\sum_{i=1}^{k} y_i^2}}.$$

It should be noted that the direction of the vectors $W_x$ and $W_y$ depends on the structure of the groups of variables $x_i$ and $y_i$. Therefore the directions of $W_x$ and $W_{v_x}$ are always the same, and so are the directions of $W_y$ and $W_{v_y}$. Consequently, the angle they form between themselves is the same, $\alpha_{xy} = \alpha_{v_x v_y}$, or:

$$\cos\alpha_{xy} = \frac{\sum_{i=1}^{k} x_i y_i}{\sqrt{\sum_{i=1}^{k} x_i^2}\,\sqrt{\sum_{i=1}^{k} y_i^2}} = \frac{\sum_{i=1}^{k} v_{x_i} v_{y_i}}{\sqrt{\sum_{i=1}^{k} v_{x_i}^2}\,\sqrt{\sum_{i=1}^{k} v_{y_i}^2}} = \cos\alpha_{v_x v_y}.$$


When the angle $\alpha_{v_x v_y}$ increases from 0 to $\pi/2$, its cosine decreases from 1 to 0. Therefore $\cos\alpha_{v_x v_y}$ measures the closeness of the vectors $W_{v_x}$ and $W_{v_y}$. Since $W_{v_x}$ and $W_{v_y}$ represent the structures of the groups of variables $x_i$ and $y_i$, the term "closeness of the vectors $W_{v_x}$ and $W_{v_y}$" must be interpreted as the degree of compliance between the parts of these structures (compliance between the structures of the factors x and y). In other words, the closer the pairs $v_{x_i}$ and $v_{y_i}$ are to each other, the higher the value of $\cos\alpha_{v_x v_y}$; the more they differ, the lower its value.

In terms of vector algebra and analytic geometry, the expression $S = \sum_{i=1}^{k} x_i y_i$ represents the scalar product of the vectors $W_x$ and $W_y$. As we know, the scalar product of two vectors p and q is the number $p \cdot q = \lvert p \rvert \, \lvert q \rvert \cos(p, q)$, where $\lvert p \rvert$ and $\lvert q \rvert$ are the lengths of the vectors and $\cos(p, q)$ is the cosine of the angle between them.

Expressed in the notation introduced, the expression $S = \sum_{i=1}^{k} x_i y_i$ takes the form:

$$S = \sum_{i=1}^{k} x_i y_i = \sqrt{\sum_{i=1}^{k} x_i^2} \cdot \sqrt{\sum_{i=1}^{k} y_i^2} \cdot \cos\alpha_{v_x v_y}.$$

Or,

$$S = \sum_{i=1}^{k} x_i y_i = \left(\sum_{i=1}^{k} x_i\right)\sqrt{\sum_{i=1}^{k} v_{x_i}^2} \cdot \left(\sum_{i=1}^{k} y_i\right)\sqrt{\sum_{i=1}^{k} v_{y_i}^2} \cdot \cos\alpha_{v_x v_y} = k\bar{x} \cdot k\bar{y} \cdot \sqrt{\sum_{i=1}^{k} v_{x_i}^2} \cdot \sqrt{\sum_{i=1}^{k} v_{y_i}^2} \cdot \cos\alpha_{v_x v_y},$$

where $\bar{x} = \sum_{i=1}^{k} x_i \big/ k$ is the unweighted average of the group of variables $x_i$ and $\bar{y} = \sum_{i=1}^{k} y_i \big/ k$ is the unweighted average of the group of variables $y_i$.
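This identity can be checked numerically; a minimal, self-contained sketch with the January data of Table 4 (variable names mine):

```python
from math import sqrt

x = [95.0, 75.0, 55.0]   # January prices
y = [100, 200, 300]      # January quantities
k = len(x)

S = sum(xi * yi for xi, yi in zip(x, y))                       # 41000.0

x_mean, y_mean = sum(x) / k, sum(y) / k                        # unweighted averages
vx = [xi / sum(x) for xi in x]                                 # structure of x
vy = [yi / sum(y) for yi in y]                                 # structure of y
hx = sqrt(sum(v**2 for v in vx))                               # Hirschman coefficient of x
hy = sqrt(sum(v**2 for v in vy))                               # Hirschman coefficient of y
cos_a = S / (sqrt(sum(xi**2 for xi in x)) * sqrt(sum(yi**2 for yi in y)))

print(S)                                                       # 41000.0
print(k * x_mean * k * y_mean * hx * hy * cos_a)               # ≈ 41000.0, the same value
```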

The relative change of the variable S from $S_0$ (at the moment $t_0$) to $S_1$ (at the moment $t_1$) is:

$$\frac{S_1}{S_0} = \frac{\bar{x}_1}{\bar{x}_0} \cdot \frac{\bar{y}_1}{\bar{y}_0} \cdot \frac{\sqrt{\sum_{i=1}^{k} v_{x_{1i}}^2}}{\sqrt{\sum_{i=1}^{k} v_{x_{0i}}^2}} \cdot \frac{\sqrt{\sum_{i=1}^{k} v_{y_{1i}}^2}}{\sqrt{\sum_{i=1}^{k} v_{y_{0i}}^2}} \cdot \frac{\cos\alpha_{v_{x_1} v_{y_1}}}{\cos\alpha_{v_{x_0} v_{y_0}}},$$

where the numerators contain the values of the characteristics at the moment $t_1$ and the denominators their values at the moment $t_0$. Or,

$$I_S = I_{\bar{x}} \cdot I_{\bar{y}} \cdot I_{v_x} \cdot I_{v_y} \cdot I_{\cos\alpha_{v_x v_y}},$$

where $I_{\bar{x}} = \bar{x}_1 / \bar{x}_0$, $I_{\bar{y}} = \bar{y}_1 / \bar{y}_0$, $I_{v_x} = \sqrt{\sum_{i=1}^{k} v_{x_{1i}}^2} \Big/ \sqrt{\sum_{i=1}^{k} v_{x_{0i}}^2}$, $I_{v_y} = \sqrt{\sum_{i=1}^{k} v_{y_{1i}}^2} \Big/ \sqrt{\sum_{i=1}^{k} v_{y_{0i}}^2}$ and $I_{\cos\alpha_{v_x v_y}} = \cos\alpha_{v_{x_1} v_{y_1}} \big/ \cos\alpha_{v_{x_0} v_{y_0}}$.

As shown, this change depends on five factor components:

- the first component ($I_{\bar{x}}$) shows the impact on the result S of the change in the average values of $x_i$, measured by their unweighted average $\bar{x}$. In the multidimensional space the unweighted averages $\bar{x}_1$ and $\bar{x}_0$ are the projections of the vectors $W_{x_1}$ and $W_{x_0}$ on the vector $W_k$. The same applies to the unweighted averages $\bar{y}_1$ and $\bar{y}_0$;

- the second component ($I_{\bar{y}}$) shows the impact on the result S of the change in the average values of $y_i$, measured by their unweighted average $\bar{y}$;

- the third component ($I_{v_x}$) shows the impact on the result S of the change in the unevenness of the structure of the factor x, measured by the Hirschman coefficient of unevenness $\sqrt{\sum_{i=1}^{k} v_{x_i}^2}$;

- the fourth component ($I_{v_y}$) shows the impact on the result S of the change in the unevenness of the structure of the factor y, measured by the Hirschman coefficient of unevenness $\sqrt{\sum_{i=1}^{k} v_{y_i}^2}$;

- the fifth component ($I_{\cos\alpha_{v_x v_y}}$) shows the impact on the result S of the change in the degree of compliance between the structures of the factors x and y, measured by the cosine of the angle between the vectors $W_{v_x}$ and $W_{v_y}$.

Each of the five components reflects the influence of a certain property of the groups of variables $x_i$ and $y_i$ (of the factors x and y) on the values of the result S. A value greater than one for any component means that the change of the corresponding property has acted in the direction of increasing S; a value less than one, in the direction of reducing S.
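As an illustration of the decomposition, here is a sketch that computes the five components for the phone data of Table 4 (January as $t_0$, February as $t_1$) and checks that their product reproduces the relative change $S_1/S_0$ = 46 000 / 41 000. The function and variable names are mine, introduced only for this check:

```python
from math import sqrt

def components(x0, y0, x1, y1):
    """Five-factor decomposition of S1/S0 for S = sum(x_i * y_i)."""
    def mean(a):      return sum(a) / len(a)
    def struct(a):    return [ai / sum(a) for ai in a]
    def hirsch(a):    return sqrt(sum(v**2 for v in struct(a)))
    def cos_a(a, b):  return (sum(ai * bi for ai, bi in zip(a, b))
                              / (sqrt(sum(ai**2 for ai in a)) * sqrt(sum(bi**2 for bi in b))))
    I_x   = mean(x1) / mean(x0)            # change of the unweighted average of x
    I_y   = mean(y1) / mean(y0)            # change of the unweighted average of y
    I_vx  = hirsch(x1) / hirsch(x0)        # change of the unevenness of the structure of x
    I_vy  = hirsch(y1) / hirsch(y0)        # change of the unevenness of the structure of y
    I_cos = cos_a(x1, y1) / cos_a(x0, y0)  # change of the compliance between the structures
    return I_x, I_y, I_vx, I_vy, I_cos

x0, y0 = [95.0, 75.0, 55.0], [100, 200, 300]   # January prices and quantities
x1, y1 = [90.0, 70.0, 50.0], [300, 200, 100]   # February prices and quantities

factors = components(x0, y0, x1, y1)
S0 = sum(a * b for a, b in zip(x0, y0))        # 41000
S1 = sum(a * b for a, b in zip(x1, y1))        # 46000

product = 1.0
for f in factors:
    product *= f
print(factors)
print(product, S1 / S0)                        # both ≈ 1.12195
```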

From the viewpoint of geometry, these components show five separate impacts of the change of the vectors $W_x$ and $W_y$ on the change of their scalar product: the first and the second components, the individual effects of the change in the lengths of these vectors; the third and the fourth components, the individual effects of the change in their unevenness; the fifth component, the individual effect of the change in the angle between them.

From the statistical point of view, the first and the second components ($I_{\bar{x}}$ and $I_{\bar{y}}$) show the impacts of the changes in the unweighted averages of the groups of variables $x_i$ and $y_i$.

The third and the fourth components ($I_{v_x}$ and $I_{v_y}$) show the impacts of the changes in the structures of the groups of variables $x_i$ and $y_i$. Obviously this property can be interpreted as the degree of differentiation of the values of $x_i$ and $y_i$: the larger the differentiation within a group, the greater the unevenness of the structure of this group; the smaller the differentiation, the more uniform the structure.

The fifth component ($I_{\cos\alpha_{v_x v_y}}$) shows the impact of the change in the degree of compliance between the structures of the two factors. This property is not a property of one factor or the other; it is their joint property. Of course, it can be caused by a change in the structure of both factors simultaneously or of each of them separately. However, this property should not be attributed to either the intensive or the extensive factor, still less should we look for ways of sharing it between them. Similarly, a change in the cosine of the angle between the two vectors $W_{v_x}$ and $W_{v_y}$ can be caused by a displacement of either or of both vectors. However, the value of the scalar product is affected not by the displacement of the vectors itself, but by their mutual disposition in multidimensional space, i.e. by the angle between them.

References


1. Bickel, P. J., Hammel, E. A. and O'Connell, J. W. (1975). "Sex Bias in Graduate Admissions: Data From Berkeley". Science 187 (4175), 398-404.

2. Blyth, Colin R. (1972). "On Simpson's Paradox and the Sure-Thing Principle". Journal of the American Statistical Association 67 (338), 364-366.

3. Pearson, K., Lee, A. and Bramley-Moore, L. (1899). "Genetic (reproductive) selection: Inheritance of fertility in man". Philosophical Transactions of the Royal Society of London, Ser. A 173, 534-539.

4. Simpson, Edward H. (1951). "The Interpretation of Interaction in Contingency Tables". Journal of the Royal Statistical Society, Ser. B 13, 238-241.

5. Yule, G. U. (1903). "Notes on the Theory of Association of Attributes in Statistics". Biometrika 2, 121-134.

The article is continued in No. 2 (2012) of the journal «Экономика, статистика и информатика. Вестник УМО».
