Keywords: bandwidth, density, estimation, eigenfunctions, eigenvalues, Karhunen–Loève expansion, kernel methods, spectral decomposition.


Principal components analysis plays a key role in functional data analysis, not least because it offers considerable potential for reducing an infinite-dimensional problem to one of finite-dimensional proportions. It also provides insight into the variety of distributional properties that are accessible from a sample of functional data. Some of those properties can be expressed in terms of principal component scores. To explore them it is helpful to know the shapes of the distributions of those scores, for example the shapes of their probability density functions. The authors in 2010 introduced methods for estimating those densities and stated a theoretical result, asserting that the density estimators are first-order asymptotically equivalent to their ideal counterparts constructed using the actual principal component scores, rather than estimators of those scores. In the present paper we give a proof of this equivalence.


THEORETICAL PROPERTIES OF PRINCIPAL COMPONENT SCORE DENSITY ESTIMATORS IN FUNCTIONAL DATA ANALYSIS

Aurore Delaigle1, Peter Hall2

1. Department of Mathematics and Statistics, The University of Melbourne, VIC 3010, Australia, Professor, [email protected]

2. Department of Mathematics and Statistics, The University of Melbourne, VIC 3010, Australia, Professor, [email protected]

1. Methodology and main result

1.1. Dedication. It is a great pleasure to contribute to this volume in honour of Professor V.V. Petrov. Thirty-five years have passed since his masterwork, Petrov (1975), first appeared, and 15 years since the updated version, Petrov (1995). Both monographs, and Professor Petrov’s work more generally, have had a substantial influence on our research. Indeed, Petrov (1975) has travelled the world so often with the second author that the book is now in a particularly tattered state! However, the most-used portions still travel today. We wish Professor Petrov a very enjoyable 80th birthday, and a long and healthy life.

1.2. Principal components and estimators. The covariance function K of a random function X defined on a compact interval I admits a spectral decomposition,

$$K(s,t) = \mathrm{cov}\{X(s), X(t)\} = \sum_{j=1}^{\infty}\theta_j\,\psi_j(s)\psi_j(t),$$

where the expansion converges in $L_2$ on $\mathcal{I}^2$, and $\theta_1 \ge \theta_2 \ge \cdots$ are the eigenvalues, with respective orthonormal eigenfunctions $\psi_j$, of the linear operator with kernel $K$. The functions $\psi_1, \psi_2, \ldots$ comprise a basis for square-integrable functions on $\mathcal{I}$, which can be represented in terms of their Karhunen–Loève expansions:

$$X - EX = \sum_{j=1}^{\infty}\theta_j^{1/2}(X_j - EX_j)\,\psi_j, \qquad x - EX = \sum_{j=1}^{\infty}\theta_j^{1/2}x_j\,\psi_j,$$

where $X_j = \theta_j^{-1/2}\int_{\mathcal{I}}X\psi_j$ and $x_j = \theta_j^{-1/2}\int_{\mathcal{I}}(x - EX)\psi_j$ are the principal component scores corresponding to the functions $X$ and $x - EX$. Given independent and identically distributed observations of $X$, we wish to estimate the respective probability densities of $X_j - EX_j$. (We use square-bracketed subscripts to avoid confusing the $i$th data value $X_{[i]}$ with the $i$th principal component score $X_i$ of $X$.)

First we compute

$$\hat K(s,t) = \frac{1}{n}\sum_{i=1}^{n}\{X_{[i]}(s) - \bar X(s)\}\{X_{[i]}(t) - \bar X(t)\} = \sum_{j=1}^{\infty}\hat\theta_j\,\hat\psi_j(s)\hat\psi_j(t),$$

where $\bar X = n^{-1}\sum_i X_{[i]}$ and the terms are ordered so that $\hat\theta_1 \ge \hat\theta_2 \ge \cdots$. We interpret $\hat\theta_j$ and $\hat\psi_j$ as estimators of the eigenvalues $\theta_j$ and eigenfunctions $\psi_j$, respectively. See, for example, Ramsay and Silverman (2005, Chapters 8–10). Then we calculate approximations $\hat X_{ij} = \hat\theta_j^{-1/2}\int_{\mathcal I}(X_{[i]} - \bar X)\hat\psi_j$ to the principal component scores $X_{ij} = \theta_j^{-1/2}\int_{\mathcal I}(X_{[i]} - EX)\psi_j$. We define too $\hat x_j = \hat\theta_j^{-1/2}\int_{\mathcal I}(x - \bar X)\hat\psi_j$, an estimator of $x_j$. Our estimator $\hat f_j$ of the probability density function $f_j$ of $X_j - EX_j$ can now be computed using standard kernel methods:

$$\hat f_j(\hat x_j) = \frac{1}{nh}\sum_{i=1}^{n}W\!\left(\frac{\hat x_j - \hat X_{ij}}{h}\right),$$

where $h$ denotes a bandwidth and $W$ is a kernel function.

© Aurore Delaigle, Peter Hall, 2011

Observe that if $u = \hat x_j$ then $\bar X$ cancels from the numerator inside the kernel in the definition of $\hat f_j(\hat x_j)$, giving:

$$\hat f_j(\hat x_j) = \frac{1}{nh}\sum_{i=1}^{n}W\!\left\{\frac{\int_{\mathcal I}(x - X_{[i]})\hat\psi_j}{h\hat\theta_j^{1/2}}\right\}.$$

The “ideal” estimator $\tilde f_j$ of $f_j$, which we would use if we knew $\theta_j$ and $\psi_j$, is

$$\tilde f_j(x_j) = \frac{1}{nh}\sum_{i=1}^{n}W\!\left\{\frac{\int_{\mathcal I}(x - X_{[i]})\psi_j}{h\theta_j^{1/2}}\right\}.$$

Elementary calculus can be used to prove that, as an estimator of $f_j(x_j)$, $\tilde f_j(x_j)$ has variance and bias asymptotic to $w f_j(x_j)/(nh)$ and $\frac12 W_2\,f_j''(x_j)h^2$, respectively, where $w = \int W^2$ and $W_2 = \int u^2W(u)\,du$. We shall show that $\hat f_j$ and $\tilde f_j$ are asymptotically equivalent, from which it follows that the formulae above for asymptotic bias and variance hold for $\hat f_j(\hat x_j)$ as well as $\tilde f_j(x_j)$.
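To make the construction above concrete, here is a small self-contained simulation sketch (our own illustration, not from the paper: the two-term model, the Gaussian scores, and the Gaussian kernel — used instead of the compactly supported kernels assumed in Section 1.3, purely to simplify the code — are all our choices). It discretises the empirical covariance to obtain $\hat\theta_1$, $\hat\psi_1$, forms the estimated scores, and evaluates both the feasible estimator $\hat f_1$ and its ideal counterpart $\tilde f_1$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 500, 201                        # curves, grid points on I = [0, 1]
t = np.linspace(0.0, 1.0, m)
dt = t[1] - t[0]

# Two-term Karhunen-Loeve model with standard normal scores X_j
psi1 = np.sqrt(2.0) * np.sin(np.pi * t)          # orthonormal in L2(I)
psi2 = np.sqrt(2.0) * np.sin(2.0 * np.pi * t)
theta1, theta2 = 1.0, 0.25
sc = rng.standard_normal((n, 2))                 # true scores X_j - E X_j
X = (np.sqrt(theta1) * np.outer(sc[:, 0], psi1)
     + np.sqrt(theta2) * np.outer(sc[:, 1], psi2))

# Empirical covariance K_hat and its eigendecomposition (grid quadrature)
Xc = X - X.mean(axis=0)
K_hat = Xc.T @ Xc / n
evals, evecs = np.linalg.eigh(K_hat * dt)        # eigenvalues ascending
theta1_hat = evals[-1]                           # largest eigenvalue
psi1_hat = evecs[:, -1] / np.sqrt(dt)            # L2(I)-normalised eigenfunction
if np.sum(psi1_hat * psi1) * dt < 0:             # fix the arbitrary sign
    psi1_hat = -psi1_hat

# Estimated scores versus the true (ideal) scores
scores_hat = (Xc @ psi1_hat) * dt / np.sqrt(theta1_hat)
scores_true = (X @ psi1) * dt / np.sqrt(theta1)

def kde(u, data, h):
    """Kernel density estimate at u with a Gaussian kernel W."""
    return np.mean(np.exp(-0.5 * ((u - data) / h) ** 2)) / (h * np.sqrt(2.0 * np.pi))

h = n ** (-0.2)                                  # h = n^(-1/5)
u = 0.5
f_hat = kde(u, scores_hat, h)                    # feasible estimator
f_tilde = kde(u, scores_true, h)                 # ideal estimator
print(theta1_hat, f_hat, f_tilde)
```

On runs of this kind the two estimates typically agree to within a few hundredths, in line with the first-order equivalence asserted by the theorem of Section 1.3.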

1.3. Asymptotic equivalence of $\hat f_j(\hat x_j)$ and $\tilde f_j(x_j)$. We assume that:

for each $C > 0$ and some $\delta > 0$, $\displaystyle\sup_{t\in\mathcal I}E\{|X(t)|^C\} < \infty$ and $\displaystyle\sup_{s,t\in\mathcal I:\,s\ne t}E\bigl[\{|s-t|^{-\delta}|X(s) - X(t)|\}^C\bigr] < \infty$; (1.1)

for each integer $r \ge 1$, $\theta_j^{-r}E\{\int_{\mathcal I}(X - EX)\psi_j\}^{2r}$ is bounded uniformly in $j$; (1.2)

there are no ties among the $j_0 + 1$ largest eigenvalues $\theta_j$; (1.3)

the density $f_j$ of the $j$th principal component score is bounded and has a bounded derivative; the kernel $W$ is a symmetric, compactly supported probability density with two bounded derivatives; and, for some $\delta > 0$, $h = h(n) = O(n^{-\delta})$ and $n^{1-\delta}h^3$ is bounded away from zero as $n \to \infty$. (1.4)

Recall that $\hat f_j(\hat x_j)$ and $\tilde f_j(x_j)$ can be interpreted as functionals of $x$. Let $S(c)$ denote the class of functions $x$ with $\|x\| \le c$, where $\|x\|^2 = \int_{\mathcal I}x(t)^2\,dt$.

Theorem. If (1.1)–(1.4) hold then, for all $c > 0$,

$$\sup_{x\in S(c)}\bigl|\hat f_j(\hat x_j) - \tilde f_j(x_j)\bigr| = o_p\{(nh)^{-1/2}\}. \qquad (1.5)$$
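For orientation, the familiar bandwidth choice $h \asymp n^{-1/5}$ is compatible with (1.4); the following elementary check (ours, not from the paper) confirms this:

```latex
% Check that h = n^{-1/5} satisfies (1.4): h = O(n^{-\delta}) holds for
% any \delta \le 1/5, and
\[
  h = n^{-1/5}
  \quad\Longrightarrow\quad
  n^{1-\delta}h^{3} = n^{1-\delta}\,n^{-3/5} = n^{\,2/5-\delta}
  \;\longrightarrow\; \infty
  \qquad (0 < \delta < 2/5),
\]
% so n^{1-\delta} h^3 is bounded away from zero; any \delta \in (0, 1/5]
% serves both requirements simultaneously.
```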

2. Proof of theorem

To help clarify epsilon–delta arguments below we shall (when ambiguity might arise) use $\varepsilon$ in the context of a property that holds for all $\varepsilon > 0$, and $\delta$ when a property holds for some positive $\delta$. Define $\|x\|^2 = \int_{\mathcal I}x(t)^2\,dt$ and assume that $E(X) = 0$.

Step 1. Reducing the class $S(c)$. Let $S_j(c)$ denote the set of functions $x \in S(c)$ for which $x$ is a constant multiple of $\psi_j$. Given $x \in S(c)$ we define the function $x^{(j)} = \psi_j\int_{\mathcal I}x\psi_j$; therefore, $x^{(j)} \in S_j(c)$. Define also

$$a = a(x) = \frac{\int_{\mathcal I}(x - x^{(j)})\hat\psi_j}{h\hat\theta_j^{1/2}}, \qquad \hat a_i = \hat a_i(x) = \frac{\int_{\mathcal I}(x^{(j)} - X_{[i]})\hat\psi_j}{h\hat\theta_j^{1/2}}.$$

In this notation we have, after Taylor expansion of $W$, using the fact that $W$ has two bounded derivatives (see (1.4)),

$$\hat f_j(\hat x_j) = \frac{1}{nh}\sum_{i=1}^{n}W(\hat a_i + a) = U_1 + aU_2 + \tfrac12 a^2U_3, \qquad (2.1)$$

where

$$U_1 = \frac{1}{nh}\sum_{i=1}^{n}W(\hat a_i), \qquad U_2 = \frac{1}{nh}\sum_{i=1}^{n}W'(\hat a_i), \qquad U_3 = \frac{1}{nh}\sum_{i=1}^{n}W''(\hat a_i + \omega_i a), \qquad (2.2)$$

and $0 \le \omega_i \le 1$. In Step 9 below we shall show that, for all $\varepsilon > 0$,

$$\sup_{x\in S(c)}|U_2| = O_p\{n^{\varepsilon}(nh^2)^{-1/2} + h\}, \qquad (2.3)$$

$$\sup_{x\in S(c)}|U_3| = O_p(n^{\varepsilon}). \qquad (2.4)$$

Define $Z = n^{1/2}(\hat K - K)$ and write $\int Z\psi_j\psi_k$ for $\iint_{\mathcal I^2}Z(s,t)\psi_j(s)\psi_k(t)\,ds\,dt$, put $|||\hat K - K|||^2 = \iint_{\mathcal I^2}(\hat K - K)^2$, and let $J = I(|||\hat K - K||| \le d)$, where $d > 0$ is a constant depending on the distribution of $X$. It can be proved, as in the derivation of Theorem 2.1 of Hall and Hosseini-Nasab (2009), that

$$\hat\theta_j - \theta_j = n^{-1/2}\int Z\psi_j\psi_j + n^{-1}R_{j1} = O_p(n^{-1/2}), \qquad (2.5)$$

$$\hat\psi_j(t) - \psi_j(t) = n^{-1/2}\sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\psi_k(t)\int Z\psi_j\psi_k + n^{-1}R_{j2}(t), \qquad (2.6)$$

where, under conditions (1.1)–(1.3), for each $1 \le j \le j_0$ and

$$\text{for all } C > 0, \quad \sup_{n\ge1}E(|R_{j1}|^C J) < \infty, \qquad \sup_{n\ge1}E\Bigl\{\sup_{t\in\mathcal I}|R_{j2}(t)|^C J\Bigr\} < \infty. \qquad (2.7)$$

It can be shown from (1.1), (1.2) and Markov’s inequality that

$$\text{for all } C, d > 0, \quad P(|||\hat K - K||| \le d) = 1 - O(n^{-C}). \qquad (2.8)$$

Results (2.5)–(2.8) entail:

$$|a| = \frac{1}{h\hat\theta_j^{1/2}}\left|\int_{\mathcal I}(x - x^{(j)})(\hat\psi_j - \psi_j)\right| \le h^{-1}\hat\theta_j^{-1/2}\,2\|x\|\,\|\hat\psi_j - \psi_j\| = O_p(h^{-1}n^{-1/2}), \qquad (2.9)$$

uniformly in $x \in S(c)$; here we used the fact that $\int_{\mathcal I}(x - x^{(j)})\psi_j = 0$. Together, (2.1), (2.3), (2.4) and (2.9) imply that, uniformly in $x \in S(c)$,

$$\hat f_j(\hat x_j) - U_1 = O_p\bigl[h^{-1}n^{-1/2}\{n^{\varepsilon}(nh^2)^{-1/2} + h\} + (h^{-1}n^{-1/2})^2 n^{\varepsilon}\bigr] = o_p\{(nh)^{-1/2}\}. \qquad (2.10)$$

To obtain the last identity in (2.10) we used the fact that, for some $\delta > 0$, $n^{1-\delta}h^3$ is bounded away from zero as $n \to \infty$ (see (1.4)), and we took $\varepsilon > 0$ sufficiently small.
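The order assertion in the last identity of (2.10) can be checked term by term against $(nh)^{-1/2}$; we record the routine arithmetic:

```latex
% Term-by-term check of the final identity in (2.10), dividing by
% (nh)^{-1/2} and using that n^{1-\delta} h^3 is bounded away from zero:
\begin{align*}
  \frac{h^{-1}n^{-1/2}\,n^{\varepsilon}(nh^{2})^{-1/2}}{(nh)^{-1/2}}
    &= n^{\varepsilon}(nh^{3})^{-1/2}
     \le \mathrm{const}\;n^{\varepsilon-\delta/2} \to 0, \\
  \frac{h^{-1}n^{-1/2}\cdot h}{(nh)^{-1/2}}
    &= h^{1/2} \to 0, \\
  \frac{(h^{-1}n^{-1/2})^{2}\,n^{\varepsilon}}{(nh)^{-1/2}}
    &= n^{\varepsilon}\,n^{-1/2}h^{-3/2}
     = n^{\varepsilon}(nh^{3})^{-1/2} \to 0,
\end{align*}
% provided 0 < \varepsilon < \delta/2.
```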

Define $\hat Y_{ij} = \hat\theta_j^{-1/2}\int_{\mathcal I}(X_{[i]} - x^{(j)})\hat\psi_j$ and

$$Y_{ij} = \theta_j^{-1/2}\int_{\mathcal I}(X_{[i]} - x)\psi_j = \theta_j^{-1/2}\int_{\mathcal I}(X_{[i]} - x^{(j)})\psi_j, \qquad (2.11)$$

and note that, since $W$ is symmetric,

$$U_1 - \tilde f_j(x_j) = \frac{1}{nh}\sum_{i=1}^{n}\{W(\hat Y_{ij}/h) - W(Y_{ij}/h)\}. \qquad (2.12)$$

The desired result (1.5) will follow from (2.10) and (2.12) if we prove that

$$\sup_{x^{(j)}\in S_j(c)}\left|\frac{1}{nh}\sum_{i=1}^{n}\{W(\hat Y_{ij}/h) - W(Y_{ij}/h)\}\right| = o_p\{(nh)^{-1/2}\}. \qquad (2.13)$$

Result (2.13) requires a supremum over only a scalar, $\int_{\mathcal I}x\psi_j$, rather than over the function $x$, and so can be established relatively simply. This will prove useful in Step 8, where we establish (2.13).

Step 2. Expansions of $\hat\theta_j$ and $\hat\psi_j$. Define

$$\xi_j = -\frac{1}{2\theta_j}\int Z\psi_j\psi_j, \qquad \eta_j(t) = \sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\psi_k(t)\int Z\psi_j\psi_k,$$

and $\zeta_{ij} = \theta_j^{-1/2}\int_{\mathcal I}(X_{[i]} - x^{(j)})\eta_j$. Results (2.5)–(2.7), and the fact that $\sup_j|\hat\theta_j - \theta_j| \le |||\hat K - K|||$ (this result is classical; see Hall and Hosseini-Nasab (2006) for recent references), imply that, using a potentially smaller value of $d$ in the definition $J = I(|||\hat K - K||| \le d)$,

$$\hat\theta_j^{-1/2} = \theta_j^{-1/2}(1 + n^{-1/2}\xi_j + n^{-1}R_{j3}), \qquad \hat\psi_j = \psi_j + n^{-1/2}\eta_j + n^{-1}R_{j4},$$

where the scalar $R_{j3}$ and function $R_{j4}$ satisfy the properties ascribed to $R_{j1}$ and $R_{j2}$, respectively, in (2.7). From this result and (2.11) we deduce that $\hat Y_{ij} = (1 + n^{-1/2}\xi_j)Y_{ij} + n^{-1/2}\zeta_{ij} + n^{-1}S_{ij1}$, where $S_{ij1}$ satisfies, in the case $r = 1$,

$$\text{for all } C > 0, \quad \sup_{x\in S(c)}\ \sup_{n,i\ge1}E(J|S_{ijr}|^C) < \infty. \qquad (2.14)$$

In particular,

$$\hat Y_{ij} = Y_{ij} + S_{ij}^{(1)} + S_{ij}^{(2)}, \qquad (2.15)$$

where

$$n^{1/2}S_{ij}^{(1)} = \xi_jY_{ij} + \zeta_{ij}, \qquad nS_{ij}^{(2)} = S_{ij1}. \qquad (2.16)$$

Step 3. Expansion of $W(\hat Y_{ij}/h)$. Using (2.15) and the fact that $W$ has two bounded derivatives, we obtain:

$$W(\hat Y_{ij}/h) = W(Y_{ij}/h) + h^{-1}(S_{ij}^{(1)} + S_{ij}^{(2)})W'(Y_{ij}/h) + h^{-2}\Delta_{ij}, \qquad (2.17)$$

where, for a random variable $\omega_{ij}$ satisfying $0 \le \omega_{ij} \le 1$, and with $B_k = \sup|W^{(k)}|$,

$$2|\Delta_{ij}| \le B_2(S_{ij}^{(1)} + S_{ij}^{(2)})^2\,I\{|Y_{ij} + \omega_{ij}(S_{ij}^{(1)} + S_{ij}^{(2)})| \le h\} \le B_2(S_{ij}^{(1)} + S_{ij}^{(2)})^2\{I(|Y_{ij}| \le 2h) + I(|S_{ij}^{(1)} + S_{ij}^{(2)}| > h)\}. \qquad (2.18)$$

Therefore,

$$\sum_{i=1}^{n}|\Delta_{ij}| \le B_2\sum_{i=1}^{n}\{(S_{ij}^{(1)})^2 + (S_{ij}^{(2)})^2\}\{I(|Y_{ij}| \le 2h) + I(|S_{ij}^{(1)} + S_{ij}^{(2)}| > h)\}. \qquad (2.19)$$

Defining $B_{j1} = \theta_j^{-1/2}\sup_{k:k\ne j}|\theta_j - \theta_k|^{-1}$ and $\chi_j(t) = \int_{\mathcal I}Z(s,t)\psi_j(s)\,ds$, and using the Cauchy–Schwarz inequality, it can be shown that

$$\zeta_{ij}^2 \le B_{j1}^2\left(\int_{\mathcal I}\chi_j^2\right)\int_{\mathcal I}(X_{[i]} - x^{(j)})^2 = S_{ij2},$$

say, where $S_{ij2}$ satisfies, for $r = 2$, the condition

$$\text{for all } C > 0, \quad \sup_{x\in S(c)}\ \sup_{n,i\ge1}E(|S_{ijr}|^C) < \infty, \qquad (2.20)$$

which implies (2.14). More simply, using the Cauchy–Schwarz inequality, $S_{ij3} = \xi_jY_{ij} + \zeta_{ij}$ can be shown to satisfy (2.20) when $r = 3$. From these properties and (2.16) we deduce that $S_{ij4} = n^{1/2}(|S_{ij}^{(1)}| + |S_{ij}^{(2)}|)$ and $S_{ij5} = n\{(S_{ij}^{(1)})^2 + (S_{ij}^{(2)})^2\}$ both satisfy (2.14). Combining these results with (2.19) we deduce that

$$\sum_{i=1}^{n}|\Delta_{ij}| \le \frac{B_2}{n}\sum_{i=1}^{n}S_{ij5}\{I(|Y_{ij}| \le 2h) + I(S_{ij4} \ge n^{1/2}h)\}. \qquad (2.21)$$

Treating a general version of the right-hand side of (2.21), we let:

$S_{ij6}$ and $S_{ij7}$ denote nonnegative random variables (both functionals of $x$) satisfying (2.14), with the property that for each $x \in S(c)$ the pairs $(S_{ij6}, S_{ij7})$, for $1 \le i \le n$, are independent and identically distributed. (2.22)

We shall show in the next step that if (2.22) and the conditions on $f_j$ and $h$ in (1.4) hold, then, for all $C, \varepsilon > 0$,

$$\sup_{x\in S(c)}E\left[J\left|\frac{1}{nh}\sum_{i=1}^{n}S_{ij6}\{I(|Y_{ij}| \le 2h) + I(S_{ij7} \ge n^{1/2}h)\}\right|^C\right] = O(n^{\varepsilon}). \qquad (2.23)$$

Step 4. Proof that (2.22), and the conditions on $f_j$ and $h$ in (1.4), imply (2.23). Let $\varepsilon_1$ be any positive number, and define $I_i = I(S_{ij6} \le n^{\varepsilon_1}\ \text{and}\ S_{ij7} \le n^{\varepsilon_1})$ and $\bar I_i = 1 - I_i$. Then (2.23) follows if we prove the two versions of that result in which $I_i$ and $\bar I_i$, respectively, are included as factors of the $i$th term in the series on the left-hand side of (2.23).

Without loss of generality, the exponent $C$ in (2.23) is a positive integer. Choose $C_1 > 1$ so large that $\varepsilon_1(1 - C_1) + 1 < 0$. If we include $\bar I_i$ as the factor then the expected value on the left-hand side of (2.23) is bounded above by $2^C$ multiplied by:

$$E\left(\frac{J}{nh}\sum_{i=1}^{n}\bar I_iS_{ij6}\right)^{\!C} \le E\left[\frac{J}{nh}\sum_{i=1}^{n}\left\{n^{\varepsilon_1}(S_{ij6}/n^{\varepsilon_1})^{C_1} + S_{ij6}(S_{ij7}/n^{\varepsilon_1})^{C_1}\right\}\right]^{C} \le$$
$$\le n^{C}\,n^{C\varepsilon_1(1-C_1)}\,E\left[\frac{J}{n}\sum_{i=1}^{n}\bigl(S_{ij6}^{C_1} + S_{ij6}S_{ij7}^{C_1}\bigr)\right]^{C} = O\bigl(n^{C\varepsilon_1(1-C_1)+C}\bigr) = o(1),$$

since $E(JS_{1j6}^{C_1C})$ and $E\{J(S_{1j6}S_{1j7}^{C_1})^C\}$ are bounded (see (2.22) and thence (2.14)) and $C\varepsilon_1(1 - C_1) + C < 0$. (All bounds here hold uniformly in $x \in S(c)$. The second inequality uses $h^{-1} \le n$, which holds for all sufficiently large $n$.) Therefore it suffices to prove that if $\varepsilon > 0$ is given, if $k \ge 1$ is an integer, and if $\varepsilon_1 = \varepsilon_1(k, \varepsilon) > 0$ is chosen sufficiently small, then

$$\sup_{x\in S(c)}E\left[\sum_{i=1}^{n}I_iS_{ij6}\{I(|Y_{ij}| \le 2h) + I(S_{ij7} \ge n^{1/2}h)\}\right]^{k} = O\{(nh)^kn^{\varepsilon}\}. \qquad (2.24)$$

Expanding $[\sum_i\cdots]^k$ as a $k$-fold series, and then taking expectations, we see that the expected value on the left-hand side of (2.24) can be written as a $k$-fold series over $i_1, \ldots, i_k$. That series can be broken up into $k$ different portions, corresponding to the respective cases where exactly $\ell$, for $1 \le \ell \le k$, of the integers $i_1, \ldots, i_k$ are distinct. The largest order of magnitude of upper bound is obtained when $\ell = k$. Indeed, using the argument that we shall give below for $\ell = k$, it can be shown that in the case of general $\ell$ an identical order-of-magnitude bound is obtained for $\ell \le k - 1$, except reduced by the factor $(nh^{1/b})^{-(k-\ell)}$, where $b > 1$, introduced below, is chosen arbitrarily close to 1.

To derive a bound when $\ell = k$, note that, if $a, b > 1$ satisfy $a^{-1} + b^{-1} = 1$, then for each integer $k \ge 1$ the portion of the series with all indices distinct equals

$$\sum_{i_1=1}^{n}\cdots\sum_{i_k=1}^{n}E\Bigl[I_{i_1}S_{i_1j6}\{I(|Y_{i_1j}| \le 2h) + I(S_{i_1j7} \ge n^{1/2}h)\}\cdots I_{i_k}S_{i_kj6}\{I(|Y_{i_kj}| \le 2h) + I(S_{i_kj7} \ge n^{1/2}h)\}\Bigr] \le$$
$$\le \left[\sum_{i=1}^{n}\{E(I_iS_{ij6})^a\}^{1/a}\bigl\{P(|Y_{ij}| \le 2h)^{1/b} + P(n^{\varepsilon_1} \ge S_{ij7} \ge n^{1/2}h)^{1/b}\bigr\}\right]^{k} = O\{(n^{1+\varepsilon_1}h^{1/b})^k\}, \qquad (2.25)$$

uniformly in $x \in S(c)$, for all $b > 1$. The far left-hand side of (2.25) was derived using the independence of the summands $I_iS_{ij6}\{I(|Y_{ij}| \le 2h) + I(S_{ij7} \ge n^{1/2}h)\}$, for $1 \le i \le n$. To obtain the last identity in (2.25) we used (i) the fact that $P(|Y_{ij}| \le 2h) = O(h)$ uniformly in $i$, for each $1 \le j \le j_0$ (on account of the fact that $f_j$ is bounded; see (1.4)), (ii) the property $I_iS_{ij6} \le n^{\varepsilon_1}$, and (iii) the fact that, for $0 < \varepsilon_1 < 1/6$, $P(n^{\varepsilon_1} \ge S_{ij7} \ge n^{1/2}h) = 0$ for all sufficiently large $n$, since $n^{1-\delta}h^3$ is bounded away from zero, implying that $n^{1/2}h \ge n^{1/6}$ for all sufficiently large $n$. Result (2.24), for the series corresponding to $\ell = k$, follows from (2.25) on taking $\varepsilon_1 > 0$ sufficiently small and $b > 1$ sufficiently close to 1; and the other $k - 1$ series, discussed in the previous paragraph, can be handled similarly. This completes the proof of (2.23).
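The passage from indicators to factors of order $h^{1/b}$ in (2.25) — used again at (2.36) and (2.41) below — is a standard application of Hölder's inequality, recorded here explicitly for convenience:

```latex
% Hölder's inequality for a nonnegative variable S and an event A,
% with a^{-1} + b^{-1} = 1 and a, b > 1:
\[
  E\{S\,I(A)\} \;\le\; \{E(S^{a})\}^{1/a}\,\{E\,I(A)\}^{1/b}
  \;=\; \{E(S^{a})\}^{1/a}\,P(A)^{1/b}.
\]
% Taking A = {|Y_ij| <= 2h}, whose probability is O(h) because f_j is
% bounded, converts the indicator into a factor O(h^{1/b}); choosing b
% close to 1 makes the exponent 1/b as close to 1 as desired, at the
% cost of requiring a moment of S of correspondingly high order a.
```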

Step 5. Simplification of the second term on the right-hand side of (2.17). By (2.21) and (2.23), $\sup_{x\in S(c)}\{E(J|h^{-1}\sum_i\Delta_{ij}|^k)\}^{1/k} = O(n^{\varepsilon})$ for all integers $k \ge 1$ and each $\varepsilon > 0$, and so by (2.17),

$$U_1 = \frac{1}{nh}\sum_{i=1}^{n}W(\hat Y_{ij}/h) = T_1 + T_2 + n^{\varepsilon-1}h^{-2}Q_{n1}(x, \varepsilon), \qquad (2.26)$$

where

$$T_1 = \frac{1}{nh}\sum_{i=1}^{n}W(Y_{ij}/h) = \tilde f_j(x_j), \qquad T_2 = \frac{1}{nh^2}\sum_{i=1}^{n}(S_{ij}^{(1)} + S_{ij}^{(2)})W'(Y_{ij}/h), \qquad (2.27)$$

and, for $r = 1$ and all $c, C, \varepsilon > 0$,

$$\sup_{x\in S(c)}E\{J|Q_{nr}(x, \varepsilon)|^C\} < \infty. \qquad (2.28)$$

(The second identity in the formula for $T_1$, in (2.27), follows from the second identity in (2.11) and the definition of $\tilde f_j(x_j)$.)

Using the second identity in (2.16) we obtain:

$$T_2 = T_3 + \frac{1}{n^2h^2}\sum_{i=1}^{n}S_{ij1}W'(Y_{ij}/h), \qquad (2.29)$$

where $S_{ij1}$ satisfies (2.14). Result (2.23) implies that

$$\sup_{x\in S(c)}E\left\{J\left|\frac{1}{nh}\sum_{i=1}^{n}S_{ij1}W'(Y_{ij}/h)\right|^C\right\} = O(n^{\varepsilon})$$

for all $C, \varepsilon > 0$, and so by (2.26) and (2.29),

$$U_1 = T_1 + T_3 + n^{\varepsilon-1}h^{-2}Q_{n2}(x, \varepsilon), \qquad (2.30)$$

where $Q_{n2}(x, \varepsilon)$ satisfies (2.28) and, using the first part of (2.16),

$$T_3 = \frac{1}{nh^2}\sum_{i=1}^{n}S_{ij}^{(1)}W'(Y_{ij}/h) = \frac{1}{n^{3/2}h^2}\sum_{i=1}^{n}(\xi_jY_{ij} + \zeta_{ij})W'(Y_{ij}/h) =$$
$$= \frac{1}{n^{3/2}h^2}\sum_{i=1}^{n}\left[\theta_j^{-1/2}\sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\left\{\int_{\mathcal I}(X_{[i]} - x^{(j)})\psi_k\right\}\int Z\psi_j\psi_k - \frac12\,\theta_j^{-1}Y_{ij}\int Z\psi_j\psi_j\right]W'(Y_{ij}/h). \qquad (2.31)$$

Step 6. Removing the sample mean, $\bar X$, from $Z$. Recall that $K(s,t) = \mathrm{cov}\{X(s), X(t)\}$, and note that we may (and will, below) assume, without loss of generality, that $E(X) = 0$. We write $Z = Z_1 - Z_2$, where

$$Z_1(s,t) = n^{-1/2}\sum_{i=1}^{n}\{X_{[i]}(s)X_{[i]}(t) - K(s,t)\}, \qquad Z_2(s,t) = n^{1/2}\bar X(s)\bar X(t).$$

For $\ell = 1$ or 2, define $T_{3+\ell}$ to be the version of $T_3$, at (2.31), which arises if we replace $Z$ in both places on the far right-hand side by $Z_\ell$. Then, by (2.30), $U_1 = T_1 + T_4 - T_5 + n^{\varepsilon-1}h^{-2}Q_{n2}(x, \varepsilon)$. We shall show in the remainder of this step that, for $r = 5$ and some $\delta > 0$,

$$T_r = n^{-\delta}(nh)^{-1/2}Q_{n3}(x, \delta), \qquad (2.32)$$

where $Q_{n3}(x, \delta)$ satisfies (2.28). It therefore follows from (2.30) that for some $\delta > 0$,

$$U_1 = T_1 + T_4 + n^{-\delta}(nh)^{-1/2}Q_{n4}(x, \delta), \qquad (2.33)$$

where $Q_{n4}(x, \delta)$ satisfies (2.28).

By (2.31), $T_5 = T_6 - T_7$, where

$$T_6 = \frac{\theta_j^{-1/2}}{n^2h^2}\left(\int_{\mathcal I}\bar X\psi_j\right)\sum_{i_1=1}^{n}\sum_{i_2=1}^{n}W'(Y_{i_1j}/h)\sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\left(\int_{\mathcal I}X_{[i_2]}\psi_k\right)\int_{\mathcal I}(X_{[i_1]} - x^{(j)})\psi_k,$$

$$T_7 = \frac{\theta_j^{-1}}{2nh^2}\left(\int_{\mathcal I}\bar X\psi_j\right)^{\!2}\sum_{i=1}^{n}Y_{ij}W'(Y_{ij}/h).$$

Since $\sum_iY_{ij}W'(Y_{ij}/h)$ and $\int_{\mathcal I}\bar X\psi_j$ are both expressible as sums of $n$ independent and identically distributed random variables, it is straightforward to show that $T_7 = (nh)^{-1}R_{j5}$, where $\sup_{x\in S(c)}E(|R_{j5}|^C) = O(n^{\varepsilon})$ for all $C, \varepsilon > 0$. Therefore it suffices to prove that $T_6$ satisfies (2.32), again for a random function $Q_{n3}(x, \delta)$ satisfying (2.28).

Note that

$$T_6 = \theta_j^{-1/2}(nh)^{-2}(T_{81} + T_{82} + nT_{83})\int_{\mathcal I}\bar X\psi_j, \qquad (2.34)$$

where, with $\mu_{jk} = E\{W'(Y_{ij}/h)\int_{\mathcal I}(X_{[i]} - x^{(j)})\psi_k\}$, we define

$$T_{81} = \sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\sum_{1\le i_1,i_2\le n:\,i_1\ne i_2}\left\{W'(Y_{i_1j}/h)\int_{\mathcal I}(X_{[i_1]} - x^{(j)})\psi_k - \mu_{jk}\right\}\left(\int_{\mathcal I}X_{[i_2]}\psi_k\right),$$

$$T_{82} = \sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\sum_{i=1}^{n}\left\{W'(Y_{ij}/h)\int_{\mathcal I}(X_{[i]} - x^{(j)})\psi_k - \mu_{jk}\right\}\left(\int_{\mathcal I}X_{[i]}\psi_k\right),$$

$$T_{83} = \sum_{i=1}^{n}\sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\mu_{jk}\int_{\mathcal I}X_{[i]}\psi_k.$$

By Rosenthal’s inequality, for each $b > 1$ and each integer $C \ge 1$ there exists a constant $C_1(C) > 0$, depending only on $C$, such that, uniformly in $x \in S(c)$,

$$E|T_{83}|^{2C} \le C_1(C)\left[\left\{nE\left(\sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\mu_{jk}\int_{\mathcal I}X_{[1]}\psi_k\right)^{\!2}\right\}^{C} + nE\left|\sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\mu_{jk}\int_{\mathcal I}X_{[1]}\psi_k\right|^{2C}\right] = O(n^Ch^{2C/b}). \qquad (2.35)$$

To derive the last identity in (2.35) we used the fact that, with $\|\mu\|^2$ denoting $\sum_{k\ge1}\mu_{jk}^2$, we have:

$$\left|\sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\mu_{jk}\int_{\mathcal I}X_{[1]}\psi_k\right| \le \Bigl(\max_{k:k\ne j}|\theta_j - \theta_k|^{-1}\Bigr)\left\{\sum_{k=1}^{\infty}\left(\int_{\mathcal I}X_{[1]}\psi_k\right)^{\!2}\right\}^{1/2}\|\mu\| = \Bigl(\max_{k:k\ne j}|\theta_j - \theta_k|^{-1}\Bigr)\|X_{[1]}\|\,\|\mu\|,$$

and, if $a, b > 1$ satisfy $a^{-1} + b^{-1} = 1$, then, since $f_j$ is bounded, and all moments of $\|X\|$ are finite,

$$\|\mu\|^2 = \sum_{k=1}^{\infty}\left[E\left\{W'(Y_{ij}/h)\int_{\mathcal I}(X_{[i]} - x^{(j)})\psi_k\right\}\right]^2 \le B_1^2\,P(|Y_{ij}| \le h)^{2/b}\{E(\|X_{[i]} - x^{(j)}\|^a)\}^{2/a} = O(h^{2/b}), \qquad (2.36)$$

uniformly in $x \in S(c)$. Result (2.35) implies that, for each $\varepsilon > 0$ and for a random function $Q_{n5}$ satisfying (2.28),

$$(nh)^{-2}T_{83} = n^{\varepsilon-(3/2)}h^{-1}Q_{n5}(x, \varepsilon). \qquad (2.37)$$
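Here and at (2.40) below, Rosenthal's inequality refers to the classical moment bound for sums of independent variables (moment inequalities of this type are treated in the monographs of Petrov cited in Section 1.1); we record the form used:

```latex
% Rosenthal's inequality: if \xi_1, ..., \xi_n are independent with
% E\xi_i = 0, then for each integer C >= 1 there is a constant C_1(C),
% depending only on C, with
\[
  E\Bigl|\sum_{i=1}^{n}\xi_i\Bigr|^{2C}
  \;\le\; C_1(C)\left[\Bigl\{\sum_{i=1}^{n}E(\xi_i^{2})\Bigr\}^{C}
        + \sum_{i=1}^{n}E|\xi_i|^{2C}\right].
\]
% In (2.35) the \xi_i are the n independent and identically distributed
% summands of T_{83}; each moment factor is O(h^{2C/b}) by (2.36), whence
% the bound O(n^C h^{2C/b}).
```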

Defining $B_{j2} = \sup_{k:k\ne j}|\theta_j - \theta_k|^{-1}$, we have $|T_{82}| \le B_{j2}(T_{821} + T_{822})$, where

$$T_{821} = \sum_{i=1}^{n}|W'(Y_{ij}/h)|\sum_{k=1}^{\infty}\left|\int_{\mathcal I}(X_{[i]} - x^{(j)})\psi_k\int_{\mathcal I}X_{[i]}\psi_k\right| \le \sum_{i=1}^{n}|W'(Y_{ij}/h)|\,\|X_{[i]} - x^{(j)}\|\,\|X_{[i]}\|,$$

$$T_{822} = \sum_{i=1}^{n}\sum_{k=1}^{\infty}\left|\mu_{jk}\int_{\mathcal I}X_{[i]}\psi_k\right| \le \|\mu\|\sum_{i=1}^{n}\|X_{[i]}\|,$$

the final bounds following from the Cauchy–Schwarz inequality. It follows from (2.23) that, for $\ell = 1, 2$, $E|(J/nh)T_{82\ell}|^C = O(n^{\varepsilon})$, uniformly in $x \in S(c)$ and for all $C, \varepsilon > 0$; in the case $\ell = 2$ we used (2.36). Therefore, $E|(J/nh)T_{82}|^C = O(n^{\varepsilon})$, uniformly in $x \in S(c)$ and for all $C, \varepsilon > 0$. That is, for all $\varepsilon > 0$,

$$(nh)^{-2}|T_{82}| = n^{\varepsilon-1}h^{-1}Q_{n6}(x, \varepsilon), \qquad (2.38)$$

where $Q_{n6}$ satisfies (2.28).

To bound the moments of $T_{81}$ we write

$$H_1(X_{[i_1]}, X_{[i_2]}) = \sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\left\{W'(Y_{i_1j}/h)\int_{\mathcal I}(X_{[i_1]} - x^{(j)})\psi_k - \mu_{jk}\right\}\int_{\mathcal I}X_{[i_2]}\psi_k$$

and $H(X_{[i_1]}, X_{[i_2]}) = H_1(X_{[i_1]}, X_{[i_2]}) + H_1(X_{[i_2]}, X_{[i_1]})$, and put

$$S(i_2) = \sum_{i_1=1}^{i_2-1}H(X_{[i_1]}, X_{[i_2]}).$$

The variables $S(i_2)$ are martingale differences, in that $E\{S(i_2) \mid \mathcal F_{i_2-1}\} = 0$, where $\mathcal F_{i_2-1}$ denotes the sigma-field generated by $X_{[1]}, \ldots, X_{[i_2-1]}$. Also,

$$T_{81} = \sum_{1\le i_1,i_2\le n:\,i_1\ne i_2}H_1(X_{[i_1]}, X_{[i_2]}) = \sum_{i=2}^{n}S(i)$$

is a degenerate U-statistic: $E\{H(X_{[i_1]}, X_{[i_2]}) \mid X_{[i_1]}\} = 0$ for $i_1 \ne i_2$, and $H(X_{[i_1]}, X_{[i_2]}) = H(X_{[i_2]}, X_{[i_1]})$. To simplify understanding, let $C \ge 1$ be an integer. By Burkholder’s inequality for martingales, there exists a constant $C_2(C) > 0$, depending only on $C$, such that

$$E|T_{81}|^{2C} \le C_2(C)\,E\left\{\sum_{i=2}^{n}S(i)^2\right\}^{C} \le C_2(C)\,n^{C-1}\sum_{i=2}^{n}E\{S(i)^{2C}\}. \qquad (2.39)$$
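The form of Burkholder's inequality used in the first bound of (2.39) is the following; the second bound in (2.39) is then Hölder's inequality:

```latex
% Burkholder's inequality for the martingale differences S(i): for each
% integer C >= 1 there is a constant C_2(C), depending only on C, with
\[
  E\Bigl|\sum_{i=2}^{n}S(i)\Bigr|^{2C}
  \;\le\; C_2(C)\,E\Bigl\{\sum_{i=2}^{n}S(i)^{2}\Bigr\}^{C};
\]
% the second bound in (2.39) then follows from Hölder's inequality,
% since {\sum_i S(i)^2}^C <= n^{C-1} \sum_i S(i)^{2C}.
```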

If we condition on $X_{[i_2]}$ then $S(i_2)$ equals a sum of independent random variables with zero means, and by Rosenthal’s inequality applied to such a sum, and taking the expectation of an expected value conditional on $X_{[i_2]}$,

$$E\{S(i_2)^{2C}\} = E[E\{S(i_2)^{2C} \mid X_{[i_2]}\}] \le C_1(C)\left(E\left[\sum_{i_1=1}^{i_2-1}E\{H(X_{[i_1]}, X_{[i_2]})^2 \mid X_{[i_2]}\}\right]^{C} + \sum_{i_1=1}^{i_2-1}E\{|H(X_{[i_1]}, X_{[i_2]})|^{2C}\}\right) =$$
$$= C_1(C)\Bigl((i_2 - 1)^C\,E\bigl[E\{H(X_{[1]}, X_{[i_2]})^2 \mid X_{[i_2]}\}\bigr]^{C} + (i_2 - 1)\,E\{|H(X_{[1]}, X_{[i_2]})|^{2C}\}\Bigr). \qquad (2.40)$$

Here, $C_1(C) > 0$ denotes a constant depending only on $C$. Now,

$$|H_1(X_{[i_1]}, X_{[i_2]})| \le B_{j2}\|X_{[i_2]}\|\bigl\{|W'(Y_{i_1j}/h)|\,\|X_{[i_1]} - x^{(j)}\| + \|\mu\|\bigr\},$$

from which result, (2.36), and arguments borrowed from Step 4, it can be proved that, for each $b > 1$,

$$E\{H(X_{[i_1]}, X_{[i_2]})^{2C} \mid X_{[i_2]}\} \le \text{const.}\bigl[\|X_{[i_2]}\|h^{1/b} + \{|W'(Y_{i_2j}/h)|\,\|X_{[i_2]} - x^{(j)}\| + h^{1/b}\}\bigr]^{2C}, \qquad (2.41)$$

where the constant depends on $b$ but not on $x \in S(c)$, $h$ or $n$. Taking $C = 2$, or a general $C$, in (2.41), substituting the resulting bound into (2.40), and taking the expected value on the right-hand side of the latter formula, we deduce that $E\{S(i)^{2C}\} \le \text{const.}\,i^C$, uniformly in $x \in S(c)$. Substituting the latter bound into (2.39) we deduce that $E|T_{81}|^{2C} \le \text{const.}\,n^{2C}$, uniformly in $x \in S(c)$, and so

$$(nh)^{-2}|T_{81}| = n^{-1}h^{-2}Q_{n7}(x, \varepsilon), \qquad (2.42)$$

where $\varepsilon$ now can be taken fixed and $Q_{n7}$ satisfies (2.28).

Combining (2.34), (2.37), (2.38) and (2.42), and noting that, by (1.4), $n^{1-\delta}h^3$ is bounded away from zero for some $\delta > 0$, we deduce that for some $\delta > 0$,

$$T_6 = n^{-\delta}(nh)^{-1/2}Q_{n8}(x, \delta), \qquad (2.43)$$

where $Q_{n8}$ satisfies (2.28). This shows that $T_6$ satisfies (2.32), which completes the proof of (2.33) (note the comments at the end of the paragraph above (2.34)). Observe too that $T_1 = \tilde f_j(x_j)$; see the second identity in the first part of (2.27). Together, this result, (2.10) and (2.33) imply that for some $\delta > 0$,

$$\hat f_j(\hat x_j) = \tilde f_j(x_j) + T_4 + n^{-\delta}(nh)^{-1/2}Q_{n9}(x, \delta) + o_p\{(nh)^{-1/2}\}, \qquad (2.44)$$

where $Q_{n9}$ satisfies (2.28) and the $o_p(\cdot)$ term is of the stated order uniformly in $x \in S(c)$.

Step 7. Bound for $T_4$. Recall, from the first paragraph of Step 6, that

$$T_4 = \frac{1}{n^{3/2}h^2}\sum_{i=1}^{n}\left[\theta_j^{-1/2}\sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\left\{\int_{\mathcal I}(X_{[i]} - x^{(j)})\psi_k\right\}\int Z_1\psi_j\psi_k - \frac12\,\theta_j^{-1}Y_{ij}\int Z_1\psi_j\psi_j\right]W'(Y_{ij}/h),$$

where

$$Z_1(s,t) = n^{-1/2}\sum_{i=1}^{n}\{X_{[i]}(s)X_{[i]}(t) - K(s,t)\}.$$

Now, $T_4 = \theta_j^{-1/2}(T_{41} + T_{42}) - \frac12\theta_j^{-1}T_{43}$, where

$$T_{41} = \frac{1}{(nh)^2}\sum_{1\le i_1,i_2\le n:\,i_1\ne i_2}W'(Y_{i_1j}/h)\sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\left\{\int_{\mathcal I}(X_{[i_1]} - x^{(j)})\psi_k\right\}\times$$
$$\times\iint_{\mathcal I^2}\{X_{[i_2]}(s)X_{[i_2]}(t) - K(s,t)\}\psi_j(s)\psi_k(t)\,ds\,dt, \qquad (2.45)$$

$$T_{42} = \frac{1}{(nh)^2}\sum_{i=1}^{n}W'(Y_{ij}/h)\sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\left\{\int_{\mathcal I}(X_{[i]} - x^{(j)})\psi_k\right\}\iint_{\mathcal I^2}\{X_{[i]}(s)X_{[i]}(t) - K(s,t)\}\psi_j(s)\psi_k(t)\,ds\,dt, \qquad (2.46)$$

$$T_{43} = \frac{1}{(nh)^2}\left[\sum_{i=1}^{n}\iint_{\mathcal I^2}\{X_{[i]}(s)X_{[i]}(t) - K(s,t)\}\psi_j(s)\psi_j(t)\,ds\,dt\right]\sum_{i=1}^{n}Y_{ij}W'(Y_{ij}/h). \qquad (2.47)$$

The $2C$th moment of the absolute value of the term within square brackets on the right-hand side of (2.47), which does not depend on $x$, equals $O(n^C)$. Since $\sum_iY_{ij}W'(Y_{ij}/h)$ equals a sum of independent and identically distributed random variables, it is straightforward to show that, for each $C \ge 2$,

$$E\left|(nh)^{-1}\sum_{i=1}^{n}Y_{ij}W'(Y_{ij}/h)\right|^{2C} = O\{(nh)^{-C} + h^{2C}\} \qquad (2.48)$$

uniformly in $x \in S(c)$. (The $h^{2C}$ term on the right-hand side of (2.48) is a contribution from the mean, and is obtained using the assumption, in (1.4), that $f_j$ has a bounded derivative.) Combining the results from (2.47) to this point we deduce that

$$T_{43} = n^{-1/2}h^{-1}\{(nh)^{-1/2} + h\}Q_{n10}(x, 1), \qquad (2.49)$$

where $Q_{n10}(x, 1)$ satisfies (2.28). Note that

$$|T_{42}| \le \frac{B_1B_{j2}}{(nh)^2}\sum_{i=1}^{n}\|X_{[i]} - x^{(j)}\|\left(\int_{\mathcal I}\left[\int_{\mathcal I}\{X_{[i]}(s)X_{[i]}(t) - K(s,t)\}\psi_j(s)\,ds\right]^2dt\right)^{1/2}. \qquad (2.50)$$

The series on the right-hand side here equals a sum of independent and identically distributed random variables, and using that property it is straightforward to show that the $2C$th moment of the series equals $O(n^{2C})$ uniformly in $x \in S(c)$. Therefore, (2.50) implies that

$$T_{42} = n^{-1}h^{-2}Q_{n11}(x, 1), \qquad (2.51)$$

where $Q_{n11}(x, 1)$ satisfies (2.28). The term $T_{41}$ has the same form as $T_{81}$ in Step 6, and is handled using the martingale argument given there, producing the bound: for some $\delta > 0$,

$$T_{41} = n^{-\delta}(nh)^{-1/2}Q_{n12}(x, \delta), \qquad (2.52)$$

where $Q_{n12}(x, \delta)$ satisfies (2.28). Here we have used the assumption, in (1.4), that for some $\delta > 0$, $n^{1-\delta}h^3$ is bounded away from zero. The latter property, together with (2.49), (2.51) and (2.52), implies that $T_4$ can be represented as was $T_{41}$ at (2.52):

$$T_4 = n^{-\delta}(nh)^{-1/2}Q_{n13}(x, \delta), \qquad (2.53)$$

where $Q_{n13}(x, \delta)$ satisfies (2.28).

Step 8. Lattice argument and completion of the proof of the Theorem. Recall from Step 1 that to establish the theorem it suffices to derive (2.13). Combining (2.44) and (2.53) we see that we have already shown that

$$\hat f_j(\hat x_j) - \tilde f_j(x_j) = \frac{1}{nh}\sum_{i=1}^{n}\{W(\hat Y_{ij}/h) - W(Y_{ij}/h)\} = \frac{1}{nh}\sum_{i=1}^{n}\left[W\!\left\{\frac{\int_{\mathcal I}(x - X_{[i]})\hat\psi_j}{h\hat\theta_j^{1/2}}\right\} - W\!\left\{\frac{\int_{\mathcal I}(x - X_{[i]})\psi_j}{h\theta_j^{1/2}}\right\}\right] =$$
$$= n^{-\delta}(nh)^{-1/2}Q_{n14}(x, \delta) + o_p\{(nh)^{-1/2}\}, \qquad (2.54)$$

where $Q_{n14}$ satisfies (2.28), the $o_p(\cdot)$ term in (2.54) is of the stated order uniformly in $x \in S(c)$, and (see the first paragraph of Step 1) $x^{(j)} = u\psi_j$, with $u = \int_{\mathcal I}x\psi_j$. Since $(x^{(j)})^{(j)} = x^{(j)}$ then:

(2.54) continues to hold if we replace $x$ by $x^{(j)}$ throughout, and in particular in $Q_{n14}(x, \delta)$. (2.55)

The set $S_j(c)$, appearing in (2.13), is the class of all functions $u\psi_j$ for which $|u| \le c$. Let $L$ denote a regular lattice of points within $[-c, c]$, with adjacent points spaced $n^{-3}$ apart, and given $u \in [-c, c]$, let $u'$ be the element of $L$ that is nearest to $u$, breaking any ties arbitrarily. Put $x^{(j)} = u\psi_j$ and $x^{(j)\prime} = u'\psi_j$. For all sufficiently large $n$, $h \ge n^{-1}$; we consider below only such values of $n$. Since $W$ has a bounded derivative, then, using the representation for $\hat f_j(\hat x_j) - \tilde f_j(x_j)$ given in the second of the three identities in (2.54), we deduce that the absolute value, $A(x^{(j)})$ say, of the difference between the two versions of $\hat f_j(\hat x_j) - \tilde f_j(x_j)$ computed on taking $x$ to equal $x^{(j)}$ or $x^{(j)\prime}$, respectively, is bounded above by

$$B_1h^{-2}|u - u'|\,(\hat\theta_j^{-1/2} + \theta_j^{-1/2}) \le B_1n^{-1}(\hat\theta_j^{-1/2} + \theta_j^{-1/2}),$$

where $B_1 = \sup|W'|$. Since $\hat\theta_j \to \theta_j$ in probability it follows that

$$\sup_{x^{(j)}\in S_j(c)}|A(x^{(j)})| = O_p(n^{-1}).$$

From this result, (2.54) and (2.55) we deduce that, in order to establish (2.13), it suffices to prove that

$$\sup_{x:\,x=u\psi_j,\,u\in L}|Q_{n14}(x, \delta)| = o_p(n^{\delta}). \qquad (2.56)$$

Recall that $J = I(|||\hat K - K||| \le d)$, where $d > 0$ denotes a fixed constant. For each $C, \eta > 0$ for which $C\delta > 3$,

$$P\left\{\sup_{x:\,x=u\psi_j,\,u\in L}|Q_{n14}(x, \delta)| > \eta n^{\delta}\right\} \le P(J = 0) + \sum_{x:\,x=u\psi_j,\,u\in L}P\{J = 1,\ |Q_{n14}(x, \delta)| > \eta n^{\delta}\} \le$$
$$\le P(J = 0) + (2cn^3 + 1)\sup_{x\in S(c)}P\{J = 1,\ |Q_{n14}(x, \delta)| > \eta n^{\delta}\} \le$$
$$\le P(J = 0) + (2cn^3 + 1)(\eta n^{\delta})^{-C}\sup_{x\in S(c)}E\{J|Q_{n14}(x, \delta)|^C\} = P(J = 0) + O(n^{3-C\delta}) = o(1), \qquad (2.57)$$

where the second-last identity follows from (2.28), and the last from the fact that $|||\hat K - K||| \to 0$ in probability, implying that $P(J = 0) \to 0$. Result (2.56) follows from (2.57), thus establishing (2.13) and completing the proof of (1.5).
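The heart of the lattice step is elementary: for a function with derivative bounded by $L$ on $[-c, c]$, the supremum over the interval exceeds the supremum over a grid of spacing $s$ by at most $Ls/2$. A toy numerical check (our own illustration; $g$ is an arbitrary smooth stand-in for the random functions of $u$ appearing above):

```python
import numpy as np

# g stands in for the random function of u whose supremum over [-c, c] is
# needed; any g with a derivative bounded by L works. Here c = 1 and
# |g'(u)| <= 3 + 2|u| e^{-u^2} <= 4 = L.
c, L = 1.0, 4.0

def g(u):
    return np.sin(3.0 * u) * np.exp(-u ** 2)

fine = np.linspace(-c, c, 200_001)   # spacing 1e-5: proxy for the continuum
lattice = fine[::100]                # spacing 1e-3: the coarse lattice
step = lattice[1] - lattice[0]

sup_lattice = np.max(np.abs(g(lattice)))
sup_interval = np.max(np.abs(g(fine)))

# Every u lies within step/2 of a lattice point, so the supremum over the
# interval can exceed the supremum over the lattice by at most L * step / 2.
assert sup_lattice <= sup_interval <= sup_lattice + L * step / 2
print(sup_lattice, sup_interval)
```

In the proof the spacing is $n^{-3}$, so the penalty term is $O(n^{-3})$ times a Lipschitz constant of order $h^{-2}$, which is negligible relative to $(nh)^{-1/2}$.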

Step 9. Deriving (2.3) and (2.4). First we derive (2.3), where the initial step is to develop an expansion of $W'(\hat Y_{ij}/h)$ analogous to (2.17):

$$W'(\hat Y_{ij}/h) = W'(Y_{ij}/h) + h^{-1}\Delta_{ij}', \qquad (2.58)$$

where, in place of (2.18),

$$|\Delta_{ij}'| \le B_2|S_{ij}^{(1)} + S_{ij}^{(2)}|\{I(|Y_{ij}| \le 2h) + I(|S_{ij}^{(1)} + S_{ij}^{(2)}| > h)\}.$$

In place of (2.21) we have, for the new definition of $\Delta_{ij}'$ at (2.58),

$$\sum_{i=1}^{n}|\Delta_{ij}'| \le \frac{B_2}{n^{1/2}}\sum_{i=1}^{n}S_{ij8}\{I(|Y_{ij}| \le 2h) + I(S_{ij9} \ge n^{1/2}h)\},$$

where $S_{ij8}$ and $S_{ij9}$ satisfy (2.14). It now follows from (2.23) that

$$\sup_{x\in S(c)}E\left\{\left(\frac{J}{nh}\sum_{i=1}^{n}n^{1/2}|\Delta_{ij}'|\right)^{\!C}\right\} = O(n^{\varepsilon})$$

for all $C, \varepsilon > 0$, and therefore

$$\left[\sup_{x\in S(c)}E\left\{\left(\frac{J}{nh^2}\sum_{i=1}^{n}|\Delta_{ij}'|\right)^{\!C}\right\}\right]^{1/C} = O\{n^{\varepsilon}(nh^2)^{-1/2}\}. \qquad (2.59)$$

From (2.59), using the lattice argument over the one-dimensional set $S_j(c)$ (see Step 8), we deduce that

$$\sup_{x\in S(c)}\left|\frac{1}{nh^2}\sum_{i=1}^{n}\Delta_{ij}'\right| = O_p\{n^{\varepsilon}(nh^2)^{-1/2}\}. \qquad (2.60)$$

The quantity $(nh)^{-1}\sum_iW'(Y_{ij}/h)$ equals a sum of $n$ independent and identically distributed random variables, and using that property it is straightforward to show (using the lattice argument) that, for all $\varepsilon > 0$,

$$\sup_{x\in S(c)}\left|\frac{1}{nh}\sum_{i=1}^{n}W'(Y_{ij}/h)\right| = O_p\{n^{\varepsilon}(nh)^{-1/2} + h\}. \qquad (2.61)$$

(The term $h$ on the right-hand side is a contribution from the mean, and its presence is derived using the assumption, in (1.4), that the first derivative of $f_j$ is bounded.) The desired result (2.3) follows from (2.58), (2.60) and (2.61), on noting the definition of $U_2$ at (2.2); here observe that $\hat a_i = -\hat Y_{ij}/h$ and $W'$ is antisymmetric, because $W$ is symmetric, so that $|U_2| = |(nh)^{-1}\sum_iW'(\hat Y_{ij}/h)|$.

Finally we derive (2.4). In place of (2.58) and the bound below it we have the following inequality:

$$|W''(\hat a_i + \omega_ia)| \le B_2\bigl[I(|Y_{ij}| \le 2h) + I(|S_{ij10}| > h)\bigr] \equiv V_{ij},$$

say, where $S_{ij10}$ can be chosen to not depend on $x \in S(c)$ and to satisfy (2.14). (Here we use the fact that $|Y_{ij}| \le \theta_j^{-1/2}(\|X_{[i]}\| + c)$ for all $x \in S(c)$.) Note that $V_{ij}$ depends only on $x^{(j)}$, not on other aspects of $x$, and so to prove (2.4) it suffices to show that for all $\varepsilon > 0$,

$$\sup_{x^{(j)}\in S_j(c)}\frac{1}{nh}\sum_{i=1}^{n}V_{ij} = O_p(n^{\varepsilon}). \qquad (2.62)$$

The arguments leading to (2.23) can be used to show that, for all $C, \varepsilon > 0$,

$$\sup_{x\in S(c)}E\left\{\left(\frac{J}{nh}\sum_{i=1}^{n}V_{ij}\right)^{\!C}\right\} = O(n^{\varepsilon}). \qquad (2.63)$$

The lattice argument applied to derive a supremum over the one-dimensional set $S_j(c)$, introduced in Step 8, leads directly from (2.63) to (2.62), and hence to (2.4).

References

1. Delaigle A., Hall P. Defining probability density for a distribution of random functions // Ann. Statist. 2010. Vol. 38. P. 1171-1193.

2. Hall P., Hosseini-Nasab M. On properties of functional principal components analysis // J. R. Stat. Soc. Ser. B. 2006. Vol. 68. P. 109-126.

3. Hall P., Hosseini-Nasab M. Theory for high-order bounds in functional principal components analysis // Math. Proc. Camb. Phil. Soc. 2009. Vol. 146. P. 225-256.

4. Petrov V. V. Sums of Independent Random Variables. Berlin: Springer, 1975.

5. Petrov V. V. Limit Theorems of Probability Theory. Sequences of Independent Random Variables. New York: Oxford University Press, 1995.

6. Ramsay J. O., Silverman B. W. Functional Data Analysis. New York: Springer, 2005.


The article was received by the editors on December 21, 2010.
