Mathematical Structures and Modeling 2017. N. 3(43). PP. 43-49
UDC 519.652 DOI: 10.25513/2222-8772.2017.3.43-49
WHY LINEAR INTERPOLATION?
Andrzej Pownuk
Ph.D. (Phys.-Math.), Instructor, e-mail: ampownuk@utep.edu
Vladik Kreinovich
Ph.D. (Phys.-Math.), Professor, e-mail: vladik@utep.edu
University of Texas at El Paso, El Paso, Texas 79968, USA
Abstract. Linear interpolation is the computationally simplest of all possible interpolation techniques. Interestingly, it works reasonably well in many practical situations, even in situations when the corresponding computational models are rather complex. In this paper, we explain this empirical fact by showing that linear interpolation is the only interpolation procedure that satisfies several reasonable properties such as consistency and scale-invariance.
Keywords: linear interpolation, scale-invariance.
1. Formulation of the Problem
Need for interpolation. In many practical situations, we know that the value of a quantity y is uniquely determined by the value of some other quantity x, but we do not know the exact form of the corresponding dependence y = f (x).
To find this dependence, we measure the values of x and y in different situations. As a result, we get the values yi = f(xi) of the unknown function f(x) for several values x1, ..., xn. Based on this information, we would like to predict the value f(x) for all other values x. When x is between the smallest and the largest of the values xi, this prediction is known as interpolation; for values x smaller than the smallest of the xi or larger than the largest of the xi, this prediction is known as extrapolation; see, e.g., [1].
Simplest possible case of interpolation. The simplest possible case of interpolation is when we only know the values y1 = f(x1) and y2 = f(x2) of the function f(x) at two points x1 < x2, and we would like to predict the value f(x) at points x ∈ (x1, x2).
In many cases, linear interpolation works well: why? One of the best-known interpolation techniques is based on the assumption that the function f(x) is linear on the interval [x1, x2]. Under this assumption, we get the following formula for f(x):
$$f(x) = \frac{x - x_1}{x_2 - x_1} \cdot y_2 + \frac{x_2 - x}{x_2 - x_1} \cdot y_1.$$
This formula is known as linear interpolation.
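As a concrete illustration, the two-point linear estimate can be written as a short Python function: a weighted average of y1 and y2 in which the weight of y2 is (x - x1)/(x2 - x1). This is only a minimal sketch; the function name `linear_interpolation`, the argument order, and the sample measurements are assumptions made for the example.

```python
def linear_interpolation(x1, y1, x2, y2, x):
    """Estimate f(x) for x1 <= x <= x2, assuming f is linear on [x1, x2]."""
    if not x1 < x2:
        raise ValueError("need x1 < x2")
    # The weight of y2 grows from 0 at x = x1 to 1 at x = x2.
    weight = (x - x1) / (x2 - x1)
    return weight * y2 + (1 - weight) * y1


# Hypothetical measurements: f(10) = 3 and f(20) = 7.
print(linear_interpolation(10.0, 3.0, 20.0, 7.0, 12.5))  # 4.0
```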
The usual motivation for linear interpolation is simplicity: linear functions are the easiest to compute, and this explains why we use linear interpolation.
An interesting empirical fact is that in many practical situations, linear interpolation works reasonably well. We know that in computational science, often very complex computations are needed, so we cannot claim that nature prefers simple functions. There should be another reason for the empirical fact that linear interpolation often works well.
What we do. In this paper, we show that linear interpolation can indeed be derived from fundamental principles.
2. Analysis of the Problem: What Are Reasonable Properties of an Interpolation
What is interpolation. We want to be able, given values y1 and y2 of the unknown function at points x1 and x2, and a point x ∈ (x1, x2), to provide an estimate for f(x). In other words, we need a function that, given the values x1, y1, x2, y2, and x, generates the estimate for f(x). We will denote this function by I(x1, y1, x2, y2, x). What are the reasonable properties of this function?
Conservativeness. If both observed values yi = f(xi) are smaller than or equal to some threshold value y, it is reasonable to expect that all intermediate values of f(x) should also be smaller than or equal to y. Thus, if y1 ≤ y and y2 ≤ y, then we should have I(x1, y1, x2, y2, x) ≤ y.
In particular, for y = max(y1, y2), we conclude that
$$I(x_1, y_1, x_2, y_2, x) \le \max(y_1, y_2).$$
Similarly, if both observed values yi = f(xi) are greater than or equal to some threshold value y, it is reasonable to expect that all intermediate values of f(x) should also be greater than or equal to y. Thus, if y ≤ y1 and y ≤ y2, then we should have y ≤ I(x1, y1, x2, y2, x).
In particular, for y = min(y1, y2), we conclude that
$$\min(y_1, y_2) \le I(x_1, y_1, x_2, y_2, x).$$
These two requirements can be combined into a single double inequality
$$\min(y_1, y_2) \le I(x_1, y_1, x_2, y_2, x) \le \max(y_1, y_2).$$
We will call this property conservativeness.
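To see what conservativeness rules out, one can compare linear interpolation with a procedure that overshoots the observed values. The sketch below is illustrative only: the procedure `bumpy`, the grid-based checker `is_conservative`, and the sample data are all hypothetical, and a grid check can only detect violations, not prove the property.

```python
def linear(x1, y1, x2, y2, x):
    r = (x - x1) / (x2 - x1)
    return r * y2 + (1 - r) * y1


def bumpy(x1, y1, x2, y2, x):
    # Hypothetical non-conservative procedure: the linear estimate plus
    # a bump that vanishes at both endpoints.
    r = (x - x1) / (x2 - x1)
    return linear(x1, y1, x2, y2, x) + 4 * r * (1 - r) * abs(y2 - y1)


def is_conservative(interp, x1, y1, x2, y2, steps=100):
    """Check min(y1, y2) <= I <= max(y1, y2) on a grid strictly inside (x1, x2)."""
    lo, hi = min(y1, y2), max(y1, y2)
    eps = 1e-12  # tolerance for floating-point rounding
    for i in range(1, steps):
        x = x1 + (x2 - x1) * i / steps
        value = interp(x1, y1, x2, y2, x)
        if not (lo - eps <= value <= hi + eps):
            return False
    return True


print(is_conservative(linear, 0.0, 2.0, 1.0, 5.0))  # True
print(is_conservative(bumpy, 0.0, 2.0, 1.0, 5.0))   # False
```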
x-scale-invariance. The numerical value of a physical quantity depends on the choice of the measuring unit and on the starting point. If we change the starting point to one which is b units smaller, then b is added to all the numerical values. Similarly, if we replace a measuring unit by one which is a > 0 times smaller, then all the numerical values are multiplied by a. If we perform both changes, then each original value x is replaced by the new value x' = a · x + b.
For example, if we know the temperature x in Celsius, then the temperature x' in Fahrenheit can be obtained as x' = 1.8 · x + 32.
It is reasonable to require that the interpolation procedure should not change if we simply change the measuring unit and the starting point, without changing the actual physical quantities. In other words, it is reasonable to require that
$$I(a \cdot x_1 + b, y_1, a \cdot x_2 + b, y_2, a \cdot x + b) = I(x_1, y_1, x_2, y_2, x).$$
y-scale-invariance. Similarly, we can consider different units for y. The interpolation result should not change if we simply change the starting point and the measuring unit. So, if we replace y1 with a · y1 + b and y2 with a · y2 + b, then the result of interpolation should be obtained by a similar transformation from the previous result: I → a · I + b. Thus, we require that
$$I(x_1, a \cdot y_1 + b, x_2, a \cdot y_2 + b, x) = a \cdot I(x_1, y_1, x_2, y_2, x) + b.$$
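Both invariance requirements are easy to check numerically for linear interpolation. The following sketch uses the Celsius-to-Fahrenheit conversion for x; the helper names, the sample data, and the rescaling constants for y are assumptions chosen for the illustration.

```python
import math


def linear(x1, y1, x2, y2, x):
    r = (x - x1) / (x2 - x1)
    return r * y2 + (1 - r) * y1


def to_fahrenheit(celsius):
    return 1.8 * celsius + 32


# Hypothetical data: x is a temperature in Celsius, y is some measured response.
x1, y1, x2, y2, x = 10.0, 3.0, 30.0, 8.0, 25.0
base = linear(x1, y1, x2, y2, x)

# x-scale-invariance: re-expressing x in Fahrenheit leaves the estimate unchanged.
in_fahrenheit = linear(to_fahrenheit(x1), y1, to_fahrenheit(x2), y2, to_fahrenheit(x))
print(math.isclose(in_fahrenheit, base))

# y-scale-invariance: re-expressing y as a * y + b transforms the estimate the same way.
a, b = 1000.0, -5.0
rescaled = linear(x1, a * y1 + b, x2, a * y2 + b, x)
print(math.isclose(rescaled, a * base + b))
```

The comparisons use `math.isclose` rather than `==` because the rescaled arguments are in general not exactly representable in floating point.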
Consistency. Let us assume that we have x1 ≤ x'1 ≤ x ≤ x'2 ≤ x2. Then, the value f(x) can be estimated in two different ways:
• we can interpolate directly from the values y1 = f(x1) and y2 = f(x2), getting I(x1, y1, x2, y2, x), or
• we can first use interpolation to estimate the values f(x'1) = I(x1, y1, x2, y2, x'1) and f(x'2) = I(x1, y1, x2, y2, x'2), and then use these two estimates to estimate f(x) as
$$I(x'_1, f(x'_1), x'_2, f(x'_2), x) = I(x'_1, I(x_1, y_1, x_2, y_2, x'_1), x'_2, I(x_1, y_1, x_2, y_2, x'_2), x).$$
It is reasonable to require that these two ways lead to the same estimate for f(x):
$$I(x_1, y_1, x_2, y_2, x) = I(x'_1, I(x_1, y_1, x_2, y_2, x'_1), x'_2, I(x_1, y_1, x_2, y_2, x'_2), x).$$
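The two estimation routes can be compared directly in code. In this minimal sketch, the helper `two_step` and the sample numbers are hypothetical; it only illustrates that, for linear interpolation, the direct and the two-step estimates agree.

```python
import math


def linear(x1, y1, x2, y2, x):
    r = (x - x1) / (x2 - x1)
    return r * y2 + (1 - r) * y1


def two_step(interp, x1, y1, x2, y2, x1p, x2p, x):
    """Interpolate to the intermediate points x1p <= x <= x2p first, then to x."""
    y1p = interp(x1, y1, x2, y2, x1p)
    y2p = interp(x1, y1, x2, y2, x2p)
    return interp(x1p, y1p, x2p, y2p, x)


# Hypothetical data: f(0) = 1, f(10) = 6; intermediate points 2 and 7; target x = 4.
direct = linear(0.0, 1.0, 10.0, 6.0, 4.0)
indirect = two_step(linear, 0.0, 1.0, 10.0, 6.0, 2.0, 7.0, 4.0)
print(math.isclose(direct, indirect))
```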
Continuity. Most physical dependencies are continuous. Thus, when two values x and x' are close, we expect the estimates for f(x) and f(x') to be close as well. It is therefore reasonable to require that the interpolation function I(x1, y1, x2, y2, x) is continuous in x, and that for both i = 1, 2 the value I(x1, y1, x2, y2, x) converges to f(xi) when x → xi.
Now, we are ready to formulate our main result.
3. Main Result
Definition 1. By an interpolation function, we mean a function I(x1, y1, x2, y2, x) which is defined for all x1 < x < x2 and which has the following properties:
• conservativeness: min(y1, y2) ≤ I(x1, y1, x2, y2, x) ≤ max(y1, y2) for all xi, yi, and x;
• x-scale-invariance: I(a · x1 + b, y1, a · x2 + b, y2, a · x + b) = I(x1, y1, x2, y2, x) for all xi, yi, x, a > 0, and b;
• y-scale-invariance: I(x1, a · y1 + b, x2, a · y2 + b, x) = a · I(x1, y1, x2, y2, x) + b for all xi, yi, x, a > 0, and b;
• consistency: I(x1, y1, x2, y2, x) = I(x'1, I(x1, y1, x2, y2, x'1), x'2, I(x1, y1, x2, y2, x'2), x) for all xi, x'i, yi, and x; and
• continuity: the expression I(x1, y1, x2, y2, x) is a continuous function of x, I(x1, y1, x2, y2, x) → y1 when x → x1, and I(x1, y1, x2, y2, x) → y2 when x → x2.
Proposition. The only interpolation function satisfying all the properties from Definition 1 is the linear interpolation
$$I(x_1, y_1, x_2, y_2, x) = \frac{x - x_1}{x_2 - x_1} \cdot y_2 + \frac{x_2 - x}{x_2 - x_1} \cdot y_1. \tag{1}$$
Discussion. Thus, we have shown that linear interpolation indeed follows from fundamental principles, which may explain its practical efficiency.
Proof.
1°. When y1 = y2, the conservativeness property implies that I(x1, y1, x2, y1, x) = y1. Thus, to complete the proof, it is sufficient to consider the two remaining cases: when y1 < y2 and when y2 < y1.
We will consider the case when y1 < y2. The case when y2 < y1 is considered similarly. So, in the following text, without losing generality, we assume that y1 < y2.
2°. When y1 < y2, then we can get these two values y1 and y2 as y1 = a · 0 + b and y2 = a · 1 + b for a = y2 - y1 and b = y1. Thus, the y-scale-invariance implies that
$$I(x_1, y_1, x_2, y_2, x) = (y_2 - y_1) \cdot I(x_1, 0, x_2, 1, x) + y_1. \tag{2}$$
If we denote J(x1, x2, x) = I(x1, 0, x2, 1, x), then we get
$$I(x_1, y_1, x_2, y_2, x) = (y_2 - y_1) \cdot J(x_1, x_2, x) + y_1 = J(x_1, x_2, x) \cdot y_2 + (1 - J(x_1, x_2, x)) \cdot y_1. \tag{3}$$
3°. Since x1 < x2, we can similarly get these two values x1 and x2 as x1 = a · 0 + b and x2 = a · 1 + b, for a = x2 - x1 and b = x1. Here, x = a · r + b, where
$$r = \frac{x - b}{a} = \frac{x - x_1}{x_2 - x_1}.$$
Thus, the x-scale-invariance implies that
$$J(x_1, x_2, x) = J\left(0, 1, \frac{x - x_1}{x_2 - x_1}\right).$$
So, if we denote w(r) = J(0, 1, r), we then conclude that
$$J(x_1, x_2, x) = w\left(\frac{x - x_1}{x_2 - x_1}\right),$$
and thus, the above expression (3) for I(x1, y1, x2, y2, x) in terms of J(x1, x2, x) takes the following simplified form:
$$I(x_1, y_1, x_2, y_2, x) = w\left(\frac{x - x_1}{x_2 - x_1}\right) \cdot y_2 + \left(1 - w\left(\frac{x - x_1}{x_2 - x_1}\right)\right) \cdot y_1. \tag{4}$$
To complete our proof, we need to show that w(r) = r for all r ∈ (0, 1).
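Formula (4) reduces the whole interpolation procedure to a single weight function w on [0, 1]. The sketch below builds I from a given w and measures how far it is from being consistent. The weight w(r) = r·r is a hypothetical alternative chosen only for illustration: it yields a conservative and scale-invariant procedure, yet its direct and two-step estimates differ, which shows why the remaining steps of the proof need consistency.

```python
def make_interpolation(w):
    """Build I(x1, y1, x2, y2, x) from a weight function w on [0, 1], as in (4)."""
    def interp(x1, y1, x2, y2, x):
        r = (x - x1) / (x2 - x1)
        return w(r) * y2 + (1 - w(r)) * y1
    return interp


def consistency_gap(interp, x1p, x2p, x):
    """Direct estimate on [0, 1] with y1 = 0, y2 = 1, minus the two-step estimate."""
    direct = interp(0.0, 0.0, 1.0, 1.0, x)
    y1p = interp(0.0, 0.0, 1.0, 1.0, x1p)
    y2p = interp(0.0, 0.0, 1.0, 1.0, x2p)
    return direct - interp(x1p, y1p, x2p, y2p, x)


linear = make_interpolation(lambda r: r)
squared = make_interpolation(lambda r: r * r)  # hypothetical alternative weight

print(consistency_gap(linear, 0.25, 0.75, 0.5))   # 0.0
print(consistency_gap(squared, 0.25, 0.75, 0.5))  # 0.0625
```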
4°. Let us now use consistency.
Let us take x1 = y1 = 0 and x2 = y2 = 1, then
$$I(0, 0, 1, 1, x) = w(x) \cdot 1 + (1 - w(x)) \cdot 0 = w(x).$$
Let us denote a = w(0.5).
By consistency, for x = 0.25 = (0 + 0.5)/2, the value w(0.25) can be obtained if we apply the same interpolation procedure to w(0) = 0 and to w(0.5) = a. Thus, we get
$$w(0.25) = a \cdot w(0.5) + (1 - a) \cdot w(0) = a^2.$$
Similarly, for x = 0.75 = (0.5 + 1)/2, the value w(0.75) can be obtained if we apply the same interpolation procedure to w(0.5) = a and to w(1) = 1. Thus, we get
$$w(0.75) = a \cdot w(1) + (1 - a) \cdot w(0.5) = a \cdot 1 + (1 - a) \cdot a = 2a - a^2.$$
Finally, for x = 0.5 = (0.25 + 0.75)/2, the value w(0.5) can be obtained if we apply the same interpolation procedure to w(0.25) = a² and to w(0.75) = 2a - a². Thus, we get
$$w(0.5) = a \cdot w(0.75) + (1 - a) \cdot w(0.25) = a \cdot (2a - a^2) + (1 - a) \cdot a^2 = 3a^2 - 2a^3.$$
By consistency, this estimate should be equal to our original estimate w(0.5) = a, i.e., we must have
$$3a^2 - 2a^3 = a. \tag{5}$$
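The three midpoint computations leading to equation (5) can be replayed in exact arithmetic. The sketch below is a minimal check under stated assumptions: the function name `reestimate_midpoint` and the grid of candidate values for a are hypothetical, and scanning a grid only confirms which grid points satisfy (5); it is not a substitute for solving the equation.

```python
from fractions import Fraction


def reestimate_midpoint(a):
    """Recompute w(0.5) from w(0.25) and w(0.75), given the assumed value a = w(0.5)."""
    w0, w_half, w1 = 0, a, 1
    w_quarter = a * w_half + (1 - a) * w0              # a**2
    w_three_quarters = a * w1 + (1 - a) * w_half       # 2*a - a**2
    return a * w_three_quarters + (1 - a) * w_quarter  # 3*a**2 - 2*a**3


# Candidate values 0, 0.1, ..., 1 for a, kept as exact fractions.
candidates = [Fraction(k, 10) for k in range(11)]
fixed = [a for a in candidates if reestimate_midpoint(a) == a]
print(fixed)  # [Fraction(0, 1), Fraction(1, 2), Fraction(1, 1)]
```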
5°. One possible solution is to have a = 0. In this case, we have w(0.5) = 0. Then, we have
$$w(0.75) = a \cdot w(1) + (1 - a) \cdot w(0.5) = 0,$$
and by induction, we can show that in this case, w(1 - 2⁻ⁿ) = 0 for each n. In this case, 1 - 2⁻ⁿ → 1, but w(1 - 2⁻ⁿ) → 0, which contradicts the continuity requirement, according to which w(1 - 2⁻ⁿ) → w(1) = 1.
Thus, the value a = 0 is impossible, so a ≠ 0, and we can divide both sides of the above equality (5) by a.
As a result, we get a quadratic equation
$$3a - 2a^2 = 1,$$
which has two solutions: a = 1 and a = 0.5.
6°. When a = 1, we have w(0.5) = 1. Then, we have
$$w(0.25) = a \cdot w(0.5) + (1 - a) \cdot w(0) = 1,$$
and by induction, we can show that in this case, w(2⁻ⁿ) = 1 for each n. In this case, 2⁻ⁿ → 0, but w(2⁻ⁿ) → 1, which contradicts the continuity requirement, according to which w(2⁻ⁿ) → w(0) = 0.
Thus, the value a = 1 is impossible, so a = 0.5.
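The two excluded cases can be illustrated by iterating the midpoint rule that consistency imposes: the value of w at the midpoint of two points u < v is a · w(v) + (1 - a) · w(u). The function names `near_one` and `near_zero` below are hypothetical, and the sketch only tabulates the forced values; the contradiction with continuity is the argument given in the text.

```python
def near_one(a, n):
    """Values w(1 - 2**-k), k = 1..n, forced by consistency when w(0.5) = a."""
    values = [a]
    for _ in range(n - 1):
        # 1 - 2**-(k+1) is the midpoint of 1 - 2**-k and 1, and w(1) = 1.
        values.append(a * 1 + (1 - a) * values[-1])
    return values


def near_zero(a, n):
    """Values w(2**-k), k = 1..n, forced by consistency when w(0.5) = a."""
    values = [a]
    for _ in range(n - 1):
        # 2**-(k+1) is the midpoint of 0 and 2**-k, and w(0) = 0.
        values.append(a * values[-1] + (1 - a) * 0)
    return values


print(near_one(0.0, 5))   # stays at 0 although the points approach 1, where w = 1
print(near_zero(1.0, 5))  # stays at 1 although the points approach 0, where w = 0
print(near_one(0.5, 5), near_zero(0.5, 5))  # a = 0.5 gives values approaching 1 and 0
```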
7°. For a = 0.5, we have w(0) = 0, w(0.5) = 0.5, and w(1) = 1. Let us prove, by induction over q, that for every binary-rational number r = p/2^q ∈ [0, 1], we have w(r) = r.
Indeed, the base case q = 1 is proven. Let us assume that we have proven it for q - 1, and let us prove it for q. If p is even, i.e., if p = 2k, then
$$\frac{p}{2^q} = \frac{2k}{2^q} = \frac{k}{2^{q-1}},$$
so the desired equality comes from the induction assumption. If p = 2k + 1, then
$$r = \frac{p}{2^q} = \frac{2k + 1}{2^q} = 0.5 \cdot \frac{2k}{2^q} + 0.5 \cdot \frac{2 \cdot (k + 1)}{2^q} = 0.5 \cdot \frac{k}{2^{q-1}} + 0.5 \cdot \frac{k + 1}{2^{q-1}}.$$
By consistency, we thus have
$$w(r) = 0.5 \cdot w\left(\frac{k}{2^{q-1}}\right) + 0.5 \cdot w\left(\frac{k + 1}{2^{q-1}}\right).$$
By induction assumption, we have
$$w\left(\frac{k}{2^{q-1}}\right) = \frac{k}{2^{q-1}} \quad \text{and} \quad w\left(\frac{k + 1}{2^{q-1}}\right) = \frac{k + 1}{2^{q-1}}.$$
So, the above formula takes the form
$$w(r) = 0.5 \cdot \frac{k}{2^{q-1}} + 0.5 \cdot \frac{k + 1}{2^{q-1}} = r.$$
The statement is proven.
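The induction can be mirrored by a short recursive computation in exact arithmetic. This is a sketch rather than a proof: the function name `w_dyadic` is hypothetical, and, unlike the text, the recursion bottoms out at q = 0 (the endpoints 0 and 1), so that the case q = 1 is itself produced by the midpoint rule with weight 0.5.

```python
from fractions import Fraction


def w_dyadic(p, q):
    """Value w(p / 2**q) forced by consistency with w(0) = 0, w(1) = 1, w(0.5) = 0.5."""
    if q == 0:
        return Fraction(p)  # p is 0 or 1: the endpoints w(0) = 0 and w(1) = 1
    if p % 2 == 0:
        return w_dyadic(p // 2, q - 1)  # p / 2**q = k / 2**(q - 1)
    k = p // 2  # p = 2k + 1: r is the midpoint of k / 2**(q-1) and (k+1) / 2**(q-1)
    half = Fraction(1, 2)
    return half * w_dyadic(k, q - 1) + half * w_dyadic(k + 1, q - 1)


q = 6
assert all(w_dyadic(p, q) == Fraction(p, 2**q) for p in range(2**q + 1))
print(w_dyadic(5, 3))  # 5/8
```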
8°. The equality w(r) = r is true for all binary-rational numbers. Any real number x from the interval [0, 1] is a limit of such numbers, namely, of the truncations of its infinite binary expansion. Thus, by continuity, we have w(x) = x for all x.
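The approximation by binary-rational numbers used in this step can be made explicit. In the following sketch, the helper name `binary_truncation` and the sample value x = 1/3 are assumptions for illustration; keeping the first q binary digits of x gives a binary-rational number that differs from x by less than 2⁻ᵠ.

```python
import math
from fractions import Fraction


def binary_truncation(x, q):
    """Binary-rational number obtained by keeping the first q binary digits of x in [0, 1]."""
    return Fraction(math.floor(x * 2**q), 2**q)


x = 1 / 3
for q in (1, 2, 4, 8, 16):
    r = binary_truncation(x, q)
    # The gap x - r is nonnegative and smaller than 2**-q.
    print(q, r, float(x - r))
```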
Substituting w(x) = x into the above formula (4) for I(x1,y1,x2,y2,x) leads exactly to linear interpolation. The proposition is proven.
Acknowledgments
This work was supported in part by the National Science Foundation grants HRD-0734825 and HRD-1242122 (Cyber-ShARE Center of Excellence) and DUE-0926721, and by an award "UTEP and Prudential Actuarial Science Academy and Pipeline Initiative" from Prudential Foundation.
References
1. Burden R.L., Faires J.D., Burden A.M. Numerical Analysis. Cengage Learning, Boston, Massachusetts, 2015.
Received by the editors: 09.04.2017