THEORETICAL PROPERTIES OF PRINCIPAL COMPONENT SCORE DENSITY ESTIMATORS IN FUNCTIONAL DATA ANALYSIS
Aurore Delaigle1, Peter Hall2
1. Department of Mathematics and Statistics, The University of Melbourne, VIC 3010, Australia,
Professor, [email protected]
2. Department of Mathematics and Statistics, The University of Melbourne, VIC 3010, Australia,
Professor, [email protected]
1. Methodology and main result
1.1. Dedication. It is a great pleasure to contribute to this volume in honour of Professor V.V. Petrov. Thirty-five years have passed since his masterwork, Petrov (1975), first appeared, and 15 years since the updated version, Petrov (1995). Both monographs, and Professor Petrov's work more generally, have had a substantial influence on our research. Indeed, Petrov (1975) has travelled the world so often with the second author that the book is now in a particularly tattered state! However, the most-used portions still travel today. We wish Professor Petrov a very enjoyable 80th birthday, and a long and healthy life.
1.2. Principal components and estimators. The covariance function $K$ of a random function $X$ defined on a compact interval $\mathcal{I}$ admits a spectral decomposition,
$$K(s,t) = \mathrm{cov}\{X(s), X(t)\} = \sum_{j=1}^\infty \theta_j\,\psi_j(s)\psi_j(t),$$
where the expansion converges in $L_2$ on $\mathcal{I}^2$, and $\theta_1 \ge \theta_2 \ge \cdots$ are the eigenvalues, with respective orthonormal eigenfunctions $\psi_j$, of the linear operator with kernel $K$. The functions $\psi_1, \psi_2, \ldots$ comprise a basis for square-integrable functions on $\mathcal{I}$, which can be represented in terms of their Karhunen–Loève expansions:
$$X - EX = \sum_{j=1}^\infty \theta_j^{1/2}(X_j - EX_j)\,\psi_j, \qquad x - EX = \sum_{j=1}^\infty \theta_j^{1/2}\,x_j\,\psi_j,$$
where $X_j = \theta_j^{-1/2}\int_{\mathcal{I}} X\psi_j$ and $x_j = \theta_j^{-1/2}\int_{\mathcal{I}}(x - EX)\psi_j$ are principal component scores corresponding to the functions $X$ and $x - EX$. Given independent and identically distributed observations of $X$, we wish to estimate the respective probability densities of $X_j - EX_j$. (We use square-bracketed subscripts to avoid confusing the $i$th data value $X_{[i]}$ with the $i$th principal component score $X_i$ of $X$.)
First we compute
$$\hat K(s,t) = \frac{1}{n}\sum_{i=1}^n\{X_{[i]}(s) - \bar X(s)\}\{X_{[i]}(t) - \bar X(t)\} = \sum_{j=1}^\infty \hat\theta_j\,\hat\psi_j(s)\hat\psi_j(t),$$
where $\bar X = n^{-1}\sum_i X_{[i]}$ and the terms are ordered so that $\hat\theta_1 \ge \hat\theta_2 \ge \cdots$. We interpret $\hat\theta_j$ and $\hat\psi_j$ as estimators of the eigenvalues $\theta_j$ and eigenfunctions $\psi_j$, respectively. See, for example, Ramsay and Silverman (2005, Chapters 8–10). Then we calculate approximations $\hat X_{ij} = \hat\theta_j^{-1/2}\int_{\mathcal{I}}(X_{[i]} - \bar X)\hat\psi_j$ to the principal component scores $X_{ij} = \theta_j^{-1/2}\int_{\mathcal{I}}(X_{[i]} - EX)\psi_j$. We define too $\hat x_j = \hat\theta_j^{-1/2}\int_{\mathcal{I}}(x - \bar X)\hat\psi_j$, an estimator of $x_j$. Our estimator $\hat f_j$ of the probability density function $f_j$ of $X_j - EX_j$ can now be computed using standard kernel methods:
$$\hat f_j(u) = \frac{1}{nh}\sum_{i=1}^n W\!\left(\frac{\hat X_{ij} - u}{h}\right),$$
where $h$ denotes a bandwidth and $W$ is a kernel function.
Observe that if $u = \hat x_j$ then $\bar X$ cancels from the numerator inside the kernel in the definition of $\hat f_j(\hat x_j)$, giving:
$$\hat f_j(\hat x_j) = \frac{1}{nh}\sum_{i=1}^n W\!\left\{\frac{\int_{\mathcal{I}}(X_{[i]} - x)\hat\psi_j}{h\,\hat\theta_j^{1/2}}\right\}.$$
The "ideal" estimator of $f_j$, which we would use if we knew $\theta_j$ and $\psi_j$, is
$$\tilde f_j(x_j) = \frac{1}{nh}\sum_{i=1}^n W\!\left\{\frac{\int_{\mathcal{I}}(X_{[i]} - x)\psi_j}{h\,\theta_j^{1/2}}\right\}.$$
Elementary calculus can be used to prove that, as an estimator of $f_j(x_j)$, $\tilde f_j(x_j)$ has variance and bias asymptotic to $w\,f_j(x_j)/(nh)$ and $\tfrac12\,W_2\,f_j''(x_j)\,h^2$, respectively, where $w = \int W^2$ and $W_2 = \int u^2W(u)\,du$. We shall show that $\hat f_j$ and $\tilde f_j$ are asymptotically equivalent, from which it follows that the formulae above for asymptotic bias and variance hold for $\hat f_j(\hat x_j)$ as well as for $\tilde f_j(x_j)$.
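The construction of $\hat\theta_j$, $\hat\psi_j$ and $\hat f_j$ is easy to reproduce on discretized curves. The sketch below is our own illustration, not the authors' code: the uniform grid, the simulated Gaussian model used in the accompanying check, the biweight kernel and all function names are arbitrary choices. (The biweight kernel only approximately satisfies the smoothness assumed of $W$ in (1.4); any smooth compactly supported density could be substituted.)

```python
import numpy as np

def fpca(X, t):
    """Empirical eigenvalues/eigenfunctions of the covariance operator.

    X is an n x m array of curves observed on the uniform grid t; the
    eigenfunctions psi[j] are normalised so that sum(psi[j]**2) * dt = 1,
    the discrete analogue of int psi_j^2 = 1.
    """
    n, m = X.shape
    dt = t[1] - t[0]
    Xc = X - X.mean(axis=0)                    # subtract the sample mean curve
    Khat = Xc.T @ Xc / n                       # \hat K(s, t) on the grid
    evals, evecs = np.linalg.eigh(Khat * dt)   # discretised covariance operator
    order = np.argsort(evals)[::-1]            # \hat\theta_1 >= \hat\theta_2 >= ...
    return evals[order], evecs[:, order].T / np.sqrt(dt)

def score_density(X, t, x, j, h):
    """Kernel estimator \hat f_j at the estimated score of the function x:
    (nh)^{-1} sum_i W{ int (X_i - x) \hat\psi_j / (h \hat\theta_j^{1/2}) }."""
    dt = t[1] - t[0]
    theta, psi = fpca(X, t)
    W = lambda u: np.where(np.abs(u) < 1, 15.0 / 16.0 * (1 - u ** 2) ** 2, 0.0)
    arg = (X - x) @ psi[j] * dt / (h * np.sqrt(theta[j]))
    return W(arg).mean() / h
```

With curves simulated from a two-component Karhunen–Loève model, `fpca` recovers the component variances approximately, and `score_density` returns a nonnegative density estimate of the standardized score distribution.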
1.3. Asymptotic equivalence of $\hat f_j(\hat x_j)$ and $\tilde f_j(x_j)$. We assume that:

for each $C > 0$ and some $\delta > 0$, $\sup_{t\in\mathcal{I}} E\{|X(t)|^C\} < \infty$ and $\sup_{s,t\in\mathcal{I}:\,s\ne t} E[\{|s - t|^{-\delta}|X(s) - X(t)|\}^C] < \infty$; (1.1)

for each integer $r \ge 1$, $\theta_j^{-r}\,E\{\int_{\mathcal{I}}(X - EX)\psi_j\}^{2r}$ is bounded uniformly in $j$; (1.2)

there are no ties among the $j_0 + 1$ largest eigenvalues $\theta_j$; (1.3)

the density $f_j$ of the $j$th principal component score is bounded and has a bounded derivative; the kernel $W$ is a symmetric, compactly supported probability density with two bounded derivatives; and, for some $\delta > 0$, $h = h(n) = O(n^{-\delta})$ and $n^{1-\delta}h^3$ is bounded away from zero as $n \to \infty$. (1.4)

Recall that $\hat f_j(\hat x_j)$ and $\tilde f_j(x_j)$ can be interpreted as functionals of $x$. Let $\mathcal{S}(c)$ denote the class of functions $x$ satisfying $\int_{\mathcal{I}} x^2 \le c^2$.

Theorem. If (1.1)–(1.4) hold then, for all $c > 0$,
$$\sup_{x\in\mathcal{S}(c)}|\hat f_j(\hat x_j) - \tilde f_j(x_j)| = o_p\{(nh)^{-1/2}\}. \eqno(1.5)$$
2. Proof of theorem
To help clarify epsilon–delta arguments below we shall (when ambiguity might arise) use $\varepsilon$ in the context of a property that holds for all $\varepsilon > 0$, and $\delta$ when a property holds for some positive $\delta$. Define $\|x\|^2 = \int_{\mathcal{I}} x(t)^2\,dt$, and assume that $E(X) = 0$.
Step 1. Reducing the class $\mathcal{S}(c)$. Let $\mathcal{S}_j(c)$ denote the set of functions $x \in \mathcal{S}(c)$ for which $x$ is a constant multiple of $\psi_j$. Given $x \in \mathcal{S}(c)$ we define the function $x^{(j)} = \psi_j\int_{\mathcal{I}} x\psi_j$; therefore, $x^{(j)} \in \mathcal{S}_j(c)$. Define also
$$a = a(x) = \frac{\int_{\mathcal{I}}(x^{(j)} - x)\hat\psi_j}{h\,\hat\theta_j^{1/2}}, \qquad a_i = a_i(x) = \frac{\int_{\mathcal{I}}(X_{[i]} - x^{(j)})\hat\psi_j}{h\,\hat\theta_j^{1/2}}.$$
In this notation we have, after Taylor expansion of $W$, using the fact that $W$ has two bounded derivatives (see (1.4)),
$$\hat f_j(\hat x_j) = \frac{1}{nh}\sum_{i=1}^n W(a_i + a) = U_1 + aU_2 + \tfrac12\,a^2U_3, \eqno(2.1)$$
where
$$U_1 = \frac{1}{nh}\sum_{i=1}^n W(a_i), \qquad U_2 = \frac{1}{nh}\sum_{i=1}^n W'(a_i), \qquad U_3 = \frac{1}{nh}\sum_{i=1}^n W''(a_i + O_ia), \eqno(2.2)$$
and $0 < O_i < 1$. In Step 9 below we shall show that, for all $\varepsilon > 0$,
$$\sup_{x\in\mathcal{S}(c)}|U_2| = O_p\{n^{\varepsilon}(nh^2)^{-1/2} + h\}, \eqno(2.3)$$
$$\sup_{x\in\mathcal{S}(c)}|U_3| = O_p(n^{\varepsilon}). \eqno(2.4)$$
Define $Z = n^{1/2}(\hat K - K)$ and write $\int Z\psi_j\psi_k$ for $\iint_{\mathcal{I}^2} Z(s,t)\psi_j(s)\psi_k(t)\,ds\,dt$, put $\|K - \hat K\|^2 = \iint_{\mathcal{I}^2}(K - \hat K)^2$, and let $J = I(\|K - \hat K\| \le d)$, where $d > 0$ is a constant depending on the distribution of $X$. It can be proved, as in the derivation of Theorem 2.1 of Hall and Hosseini-Nasab (2009), that
$$\hat\theta_j - \theta_j = n^{-1/2}\int Z\psi_j\psi_j + n^{-1}R_{j1} = O_p(n^{-1/2}), \eqno(2.5)$$
$$\hat\psi_j(t) - \psi_j(t) = n^{-1/2}\sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\psi_k(t)\int Z\psi_j\psi_k + n^{-1}R_{j2}(t), \eqno(2.6)$$
where, under conditions (1.1)–(1.3), for each $1 \le j \le j_0$ and for all $C > 0$,
$$\sup_{n\ge1} E(|R_{j1}|^CJ) < \infty, \qquad \sup_{n\ge1} E\{\sup_{t\in\mathcal{I}}|R_{j2}(t)|^CJ\} < \infty. \eqno(2.7)$$
It can be shown from (1.1), (1.2) and Markov's inequality that
$$\text{for all } C, d > 0, \qquad P(\|K - \hat K\| > d) = O(n^{-C}). \eqno(2.8)$$
Results (2.5)–(2.8) entail that, since $\int_{\mathcal{I}}(x^{(j)} - x)\psi_j = 0$,
$$|a| = \big(h\,\hat\theta_j^{1/2}\big)^{-1}\left|\int_{\mathcal{I}}(x^{(j)} - x)(\hat\psi_j - \psi_j)\right| \le h^{-1}\hat\theta_j^{-1/2}\,2\|x\|\,\|\hat\psi_j - \psi_j\| = O_p(h^{-1}n^{-1/2}) \eqno(2.9)$$
uniformly in $x \in \mathcal{S}(c)$. Together, (2.1), (2.3), (2.4) and (2.9) imply that, uniformly in $x \in \mathcal{S}(c)$,
$$\hat f_j(\hat x_j) - U_1 = O_p\big[h^{-1}n^{-1/2}\{n^{\varepsilon}(nh^2)^{-1/2} + h\} + (h^{-1}n^{-1/2})^2n^{\varepsilon}\big] = o_p\{(nh)^{-1/2}\}. \eqno(2.10)$$
To obtain the last identity in (2.10) we used the fact that, for some $\delta > 0$, $n^{1-\delta}h^3$ is bounded away from zero as $n \to \infty$ (see (1.4)), and we took $\varepsilon > 0$ sufficiently small.
Define $\hat Y_{ij} = \hat\theta_j^{-1/2}\int_{\mathcal{I}}(X_{[i]} - x^{(j)})\hat\psi_j$, so that $a_i = \hat Y_{ij}/h$, and
$$Y_{ij} = \theta_j^{-1/2}\int_{\mathcal{I}}(X_{[i]} - x)\psi_j = \theta_j^{-1/2}\int_{\mathcal{I}}(X_{[i]} - x^{(j)})\psi_j, \eqno(2.11)$$
and note that
$$U_1 - \tilde f_j(x_j) = \frac{1}{nh}\sum_{i=1}^n\{W(\hat Y_{ij}/h) - W(Y_{ij}/h)\}. \eqno(2.12)$$
The desired result (1.5) will follow from (2.10) and (2.12) if we prove that
$$\sup_{x^{(j)}\in\mathcal{S}_j(c)}\left|\frac{1}{nh}\sum_{i=1}^n\{W(\hat Y_{ij}/h) - W(Y_{ij}/h)\}\right| = o_p\{(nh)^{-1/2}\}. \eqno(2.13)$$
Result (2.13) requires a supremum over only a scalar, $\int_{\mathcal{I}} x\psi_j$, rather than over the function $x$, and so can be established relatively simply. This will prove useful in Step 8, where we establish (2.13).
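The reduction in Step 1 rests on the identity $\int_{\mathcal{I}}(x - x^{(j)})\psi_j = 0$ for the projection $x^{(j)} = \psi_j\int_{\mathcal{I}} x\psi_j$, which makes $Y_{ij}$ a function of $x$ only through the scalar $u = \int_{\mathcal{I}} x\psi_j$. A quick grid check of this identity (our own illustration; the particular $\psi_j$, $x$ and data curve are arbitrary):

```python
import numpy as np

# Grid check that int (x - x^(j)) psi_j = 0, so the ideal kernel argument
# Y_ij is unchanged when x is replaced by its projection x^(j).
t = np.linspace(0.0, 1.0, 401)
dt = t[1] - t[0]
psi_j = np.sqrt(2.0) * np.sin(np.pi * t)     # an orthonormal function on [0, 1]
x = np.exp(t) + np.cos(3.0 * t)              # an arbitrary function x
u = np.sum(x * psi_j) * dt                   # u = int x psi_j
x_j = u * psi_j                              # the projection x^(j)
resid = np.sum((x - x_j) * psi_j) * dt       # int (x - x^(j)) psi_j: zero up to rounding
X_i = np.sin(2.0 * np.pi * t) + t ** 2       # a "data" curve X_[i]
Y_with_x = np.sum((X_i - x) * psi_j) * dt    # unstandardised version of Y_ij
Y_with_xj = np.sum((X_i - x_j) * psi_j) * dt # agrees with the previous line
```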
Step 2. Expansions of $\hat\theta_j$ and $\hat\psi_j$. Define
$$\xi_j = -\frac{1}{2\theta_j}\int Z\psi_j\psi_j, \qquad \eta_j(t) = \sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\psi_k(t)\int Z\psi_j\psi_k,$$
and $\zeta_{ij} = \theta_j^{-1/2}\int_{\mathcal{I}}(X_{[i]} - x^{(j)})\eta_j$. Results (2.5)–(2.7), and the fact that $\sup_j|\hat\theta_j - \theta_j| \le \|K - \hat K\|$ (this result is classical; see Hall and Hosseini-Nasab (2006) for recent references), imply that, using a potentially smaller value of $d$ in the definition $J = I(\|K - \hat K\| \le d)$,
$$\hat\theta_j^{-1/2} = \theta_j^{-1/2}(1 + n^{-1/2}\xi_j + n^{-1}R_{j3}), \qquad \hat\psi_j = \psi_j + n^{-1/2}\eta_j + n^{-1}R_{j4},$$
where the scalar $R_{j3}$ and function $R_{j4}$ satisfy the properties ascribed to $R_{j1}$ and $R_{j2}$, respectively, in (2.7). From this result and (2.7) we deduce that $\hat Y_{ij} = (1 + n^{-1/2}\xi_j)Y_{ij} + n^{-1/2}\zeta_{ij} + n^{-1}S_{ij1}$, where $S_{ij1}$ satisfies, in the case $r = 1$,
$$\text{for all } C > 0, \qquad \sup_{x\in\mathcal{S}(c)}\,\sup_{n,i\ge1} E(J|S_{ijr}|^C) < \infty. \eqno(2.14)$$
In particular,
$$\hat Y_{ij} = Y_{ij} + S_{ij}^{(1)} + S_{ij}^{(2)}, \eqno(2.15)$$
where
$$n^{1/2}S_{ij}^{(1)} = \xi_jY_{ij} + \zeta_{ij}, \qquad nS_{ij}^{(2)} = S_{ij1}. \eqno(2.16)$$
Step 3. Expansion of $W(\hat Y_{ij}/h)$. Using (2.15) and the fact that $W$ has two bounded derivatives, we obtain:
$$W(\hat Y_{ij}/h) = W(Y_{ij}/h) + h^{-1}(S_{ij}^{(1)} + S_{ij}^{(2)})W'(Y_{ij}/h) + h^{-2}A_{ij}, \eqno(2.17)$$
where, for a random variable $O_{ij}$ satisfying $0 < O_{ij} < 1$, and with $B_k = \sup|W^{(k)}|$,
$$2|A_{ij}| \le B_2(S_{ij}^{(1)} + S_{ij}^{(2)})^2\,I\{|Y_{ij} + O_{ij}(S_{ij}^{(1)} + S_{ij}^{(2)})| \le h\} \le B_2(S_{ij}^{(1)} + S_{ij}^{(2)})^2\{I(|Y_{ij}| \le 2h) + I(|S_{ij}^{(1)} + S_{ij}^{(2)}| > h)\}. \eqno(2.18)$$
Therefore,
$$\sum_{i=1}^n|A_{ij}| \le B_2\sum_{i=1}^n\{(S_{ij}^{(1)})^2 + (S_{ij}^{(2)})^2\}\{I(|Y_{ij}| \le 2h) + I(|S_{ij}^{(1)} + S_{ij}^{(2)}| > h)\}. \eqno(2.19)$$
Defining $B_{j1} = \theta_j^{-1/2}\sup_{k:k\ne j}|\theta_j - \theta_k|^{-1}$ and $\chi_j(t) = \int_{\mathcal{I}} Z(s,t)\psi_j(s)\,ds$, and using the Cauchy–Schwarz inequality, it can be shown that
$$\zeta_{ij}^2 \le B_{j1}^2\left(\int_{\mathcal{I}}\chi_j^2\right)\int_{\mathcal{I}}(X_{[i]} - x^{(j)})^2 = S_{ij2},$$
say, where $S_{ij2}$ satisfies, for $r = 2$, the condition
$$\text{for all } C > 0, \qquad \sup_{x\in\mathcal{S}(c)}\,\sup_{n,i\ge1} E(|S_{ijr}|^C) < \infty, \eqno(2.20)$$
which implies (2.14). More simply, using the Cauchy–Schwarz inequality, $S_{ij3} = \xi_jY_{ij} + \zeta_{ij}$ can be shown to satisfy (2.20) when $r = 3$. From these properties and (2.16) we deduce that $S_{ij4} = n^{1/2}|S_{ij}^{(1)} + S_{ij}^{(2)}|$ and $S_{ij5} = n\{(S_{ij}^{(1)})^2 + (S_{ij}^{(2)})^2\}$ both satisfy (2.14). Combining these results with (2.19) we deduce that
$$\sum_{i=1}^n|A_{ij}| \le \frac{B_2}{n}\sum_{i=1}^n S_{ij5}\{I(|Y_{ij}| \le 2h) + I(S_{ij4} \ge n^{1/2}h)\}. \eqno(2.21)$$
Treating a general version of the right-hand side of (2.21), we let
$$\text{$S_{ij6}$ and $S_{ij7}$ denote nonnegative random variables (both functionals of $x$) satisfying (2.14), with the property that, for each $x \in \mathcal{S}(c)$, the pairs $(S_{ij6}, S_{ij7})$, for $1 \le i \le n$, are independent and identically distributed.} \eqno(2.22)$$
We shall show in Step 4 that if (2.22) and the conditions on $f_j$ and $h$ in (1.4) hold, then, for all $C, \varepsilon > 0$,
$$\sup_{x\in\mathcal{S}(c)} E\left[\left\{\frac{J}{nh}\sum_{i=1}^n S_{ij6}\{I(|Y_{ij}| \le 2h) + I(S_{ij7} \ge n^{1/2}h)\}\right\}^C\right] = O(n^{\varepsilon}). \eqno(2.23)$$
Step 4. Proof that (2.22), and the conditions on $f_j$ and $h$ in (1.4), imply (2.23). Let $\varepsilon_1$ be any positive number, and define $I_i = I(S_{ij6} \le n^{\varepsilon_1}$ and $S_{ij7} \le n^{\varepsilon_1})$ and $\bar I_i = 1 - I_i$. Then (2.23) follows if we prove the two versions of that result in which $I_i$ and $\bar I_i$, respectively, are included as factors of the $i$th term in the series on the left-hand side of (2.23).

Without loss of generality, the exponent $C$ in (2.23) is a positive integer. Choose $C_1 > 1$ so large that $\varepsilon_1(1 - C_1) + 1 < 0$. If we include $\bar I_i$ as the factor in (2.23) then the expected value on the left-hand side of (2.23) is bounded above by $2^C$ multiplied by
$$E\left(\frac{J}{nh}\sum_{i=1}^n \bar I_iS_{ij6}\right)^C \le E\left[\frac{J}{nh}\sum_{i=1}^n\left\{n^{\varepsilon_1}\left(\frac{S_{ij6}}{n^{\varepsilon_1}}\right)^{C_1} + S_{ij6}\left(\frac{S_{ij7}}{n^{\varepsilon_1}}\right)^{C_1-1}\right\}\right]^C \le n^{C}\,n^{C\varepsilon_1(1-C_1)}\,E\left[\frac{J}{n}\sum_{i=1}^n\{S_{ij6}^{C_1} + S_{ij6}S_{ij7}^{C_1-1}\}\right]^C = O\{n^{C\varepsilon_1(1-C_1)+C}\} = o(1),$$
since $E(JS_{ij6}^{CC_1})$ and $E(JS_{ij6}^CS_{ij7}^{C(C_1-1)})$ are bounded (see (2.22) and thence (2.14)) and $C\varepsilon_1(1 - C_1) + C < 0$. (All bounds here hold uniformly in $x \in \mathcal{S}(c)$. The second inequality, which uses $h^{-1} \le n$, applies for all sufficiently large $n$.) Therefore it suffices to prove that if $\varepsilon > 0$ is given, if $k \ge 1$ is an integer, and if $\varepsilon_1 = \varepsilon_1(k,\varepsilon) > 0$ is chosen sufficiently small, then
$$\sup_{x\in\mathcal{S}(c)} E\left[\sum_{i=1}^n I_iS_{ij6}\{I(|Y_{ij}| \le 2h) + I(S_{ij7} \ge n^{1/2}h)\}\right]^k = O\{(nh)^kn^{\varepsilon_1}\}. \eqno(2.24)$$
Expanding $[\sum_i]^k$ as a $k$-fold series, and then taking expectations, we see that the expected value on the left-hand side of (2.24) can be written as a $k$-fold series over $i_1, \ldots, i_k$. That series can be broken up into $k$ different portions, corresponding to the respective cases where exactly $\ell$, for $1 \le \ell \le k$, of the integers $i_1, \ldots, i_k$ are distinct. The largest order of magnitude of upper bound is obtained when $\ell = k$. Indeed, using the argument that we shall give below for $\ell = k$, it can be shown that in the case of general $\ell$ an identical order-of-magnitude bound is obtained for $\ell \le k - 1$, except reduced by the factor $(nh^{1/b})^{-(k-\ell)}$, where $b > 1$, introduced below, is chosen arbitrarily close to 1.

To derive a bound when $\ell = k$, note that, if $a, b > 1$ satisfy $a^{-1} + b^{-1} = 1$, then the series being bounded equals
$$\sum_{i_1=1}^n\cdots\sum_{i_k=1}^n E\big[I_{i_1}S_{i_1j6}\{I(|Y_{i_1j}| \le 2h) + I(S_{i_1j7} \ge n^{1/2}h)\}\cdots I_{i_k}S_{i_kj6}\{I(|Y_{i_kj}| \le 2h) + I(S_{i_kj7} \ge n^{1/2}h)\}\big]$$
$$\le \left[\sum_{i=1}^n\{E(I_iS_{ij6})^a\}^{1/a}\{P(|Y_{ij}| \le 2h)^{1/b} + P(n^{\varepsilon_1} \ge S_{ij7} \ge n^{1/2}h)^{1/b}\}\right]^k = O\{(n^{1+\varepsilon_1}h^{1/b})^k\}, \eqno(2.25)$$
uniformly in $x \in \mathcal{S}(c)$, for all $b > 1$. The first bound in (2.25) was derived using Hölder's inequality and the independence of the summands $I_iS_{ij6}\{I(|Y_{ij}| \le 2h) + I(S_{ij7} \ge n^{1/2}h)\}$, for $1 \le i \le n$. To obtain the last identity in (2.25) we used (i) the fact that $P(|Y_{ij}| \le 2h) = O(h)$ uniformly in $i$, for each $1 \le j \le j_0$ (on account of the fact that $f_j$ is bounded; see (1.4)), (ii) the property $I_iS_{ij6} \le n^{\varepsilon_1}$, and (iii) the fact that, for $0 < \varepsilon_1 < 1/6$, $P(n^{\varepsilon_1} \ge S_{ij7} \ge n^{1/2}h) = 0$ for all sufficiently large $n$, since $nh^3 \ge 1$ for all sufficiently large $n$, implying that $n^{1/2}h \ge n^{1/6}$. Result (2.24), for the series corresponding to $\ell = k$, follows from (2.25) on taking $\varepsilon_1 > 0$ sufficiently small and $b > 1$ sufficiently close to 1; and the other $k - 1$ series, discussed in the previous paragraph, can be handled similarly. This completes the proof of (2.23).
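The key probabilistic ingredient in (2.25) is that $P(|Y_{ij}| \le 2h) = O(h)$ whenever the score density is bounded: the probability that a continuous variable falls in an interval of length $4h$ is at most $4h$ times the supremum of its density. A Monte Carlo illustration (our own, with an assumed standard normal score distribution and an arbitrary bandwidth):

```python
import numpy as np

# P(|Y_ij| <= 2h) = O(h) when the score density is bounded: the probability
# of landing in an interval of length 4h is at most 4h * sup f_j.
rng = np.random.default_rng(4)
h = 0.05
Y = rng.normal(size=200_000)           # hypothetical scores with bounded density
frac = np.mean(np.abs(Y) <= 2.0 * h)   # estimates P(|Y| <= 2h)
ratio = frac / h                       # stays bounded as h shrinks; here ~ 4 * f(0)
```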
Step 5. Simplification of the second term on the right-hand side of (2.17). By (2.21) and (2.23), $\sup_{x\in\mathcal{S}(c)}\{E(J\,h^{-1}\sum_i|A_{ij}|)^k\}^{1/k} = O(n^{\varepsilon})$ for all integers $k \ge 1$ and each $\varepsilon > 0$, and so by (2.17),
$$U_1 = \frac{1}{nh}\sum_{i=1}^n W(\hat Y_{ij}/h) = T_1 + T_2 + n^{\varepsilon-1}h^{-2}Q_{n1}(x,\varepsilon), \eqno(2.26)$$
where
$$T_1 = \frac{1}{nh}\sum_{i=1}^n W(Y_{ij}/h) = \tilde f_j(x_j), \qquad T_2 = \frac{1}{nh^2}\sum_{i=1}^n(S_{ij}^{(1)} + S_{ij}^{(2)})W'(Y_{ij}/h), \eqno(2.27)$$
and, for $r = 1$ and all $c, C, \varepsilon > 0$,
$$\sup_{x\in\mathcal{S}(c)} E\{J\,|Q_{nr}(x,\varepsilon)|^C\} < \infty. \eqno(2.28)$$
(The second identity in the formula for $T_1$, in (2.27), follows from the second identity in (2.11) and the definition of $\tilde f_j(x_j)$.)
Using the second identity in (2.16) we obtain:
$$T_2 = \frac{1}{nh^2}\sum_{i=1}^n S_{ij}^{(1)}W'(Y_{ij}/h) + \frac{1}{n^2h^2}\sum_{i=1}^n S_{ij1}W'(Y_{ij}/h), \eqno(2.29)$$
where $S_{ij1}$ satisfies (2.14). Result (2.23) implies that
$$\sup_{x\in\mathcal{S}(c)} E\left\{\left|\frac{J}{nh}\sum_{i=1}^n S_{ij1}W'(Y_{ij}/h)\right|^C\right\} = O(n^{\varepsilon})$$
for all $C, \varepsilon > 0$, and so by (2.26) and (2.29),
$$U_1 = T_1 + T_3 + n^{\varepsilon-1}h^{-2}Q_{n2}(x,\varepsilon), \eqno(2.30)$$
where $Q_{n2}(x,\varepsilon)$ satisfies (2.28) and, using the first part of (2.16),
$$T_3 = \frac{1}{n^{3/2}h^2}\sum_{i=1}^n(\xi_jY_{ij} + \zeta_{ij})W'(Y_{ij}/h) = \frac{1}{n^{3/2}h^2}\sum_{i=1}^n\left[\theta_j^{-1/2}\sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\left\{\int_{\mathcal{I}}(X_{[i]} - x^{(j)})\psi_k\right\}\int Z\psi_j\psi_k - \frac{1}{2\theta_j}\,Y_{ij}\int Z\psi_j\psi_j\right]W'(Y_{ij}/h). \eqno(2.31)$$
Step 6. Removing the sample mean, $\bar X$, from $Z$. Recall that $K(s,t) = \mathrm{cov}\{X(s), X(t)\}$, and note that we may (and will, below) assume, without loss of generality, that $E(X) = 0$. We write $Z = Z_1 - Z_2$, where
$$Z_1(s,t) = n^{-1/2}\sum_{i=1}^n\{X_{[i]}(s)X_{[i]}(t) - K(s,t)\}, \qquad Z_2(s,t) = n^{1/2}\bar X(s)\bar X(t).$$
For $\ell = 1$ or $2$, define $T_{3+\ell}$ to be the version of $T_3$, at (2.31), which arises if we replace $Z$, in both places on the far right-hand side, by $Z_\ell$. Then, by (2.30), $U_1 = T_1 + T_4 - T_5 + n^{\varepsilon-1}h^{-2}Q_{n2}(x,\varepsilon)$. We shall show in the remainder of this section that, for some $\delta > 0$ and for $r = 5$,
$$T_r = n^{-\delta}(nh)^{-1/2}Q_{n3}(x,\delta), \eqno(2.32)$$
where $Q_{n3}(x,\delta)$ satisfies (2.28). It therefore follows from (2.30) that, for some $\delta > 0$,
$$U_1 = T_1 + T_4 + n^{-\delta}(nh)^{-1/2}Q_{n4}(x,\delta), \eqno(2.33)$$
where $Q_{n4}(x,\delta)$ satisfies (2.28).
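The decomposition $Z = Z_1 - Z_2$ is an exact algebraic identity: the cross terms in $(X_{[i]} - \bar X)(s)(X_{[i]} - \bar X)(t)$ collapse, after averaging, to $-\bar X(s)\bar X(t)$. A direct check on a grid (illustrative simulated data; any fixed $K$ works, since it cancels):

```python
import numpy as np

# Exact identity Z = Z_1 - Z_2 on a grid: n^{1/2}(\hat K - K) equals
# n^{-1/2} sum_i {X_i X_i - K} - n^{1/2} \bar X \bar X for any fixed K.
rng = np.random.default_rng(1)
n, m = 40, 25
X = rng.normal(size=(n, m))                  # discretised curves X_[i]
K = np.eye(m)                                # any fixed kernel; it cancels
Xbar = X.mean(axis=0)
Khat = (X - Xbar).T @ (X - Xbar) / n         # empirical covariance \hat K
Z = np.sqrt(n) * (Khat - K)
Z1 = np.einsum('is,it->st', X, X) / np.sqrt(n) - np.sqrt(n) * K
Z2 = np.sqrt(n) * np.outer(Xbar, Xbar)
gap = np.max(np.abs(Z - (Z1 - Z2)))          # zero up to rounding error
```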
By (2.31), $T_5 = T_6 - T_7$, where
$$T_6 = \frac{\theta_j^{-1/2}}{nh^2}\left(\int_{\mathcal{I}}\bar X\psi_j\right)\sum_{i=1}^n\sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\left(\int_{\mathcal{I}}\bar X\psi_k\right)\left\{\int_{\mathcal{I}}(X_{[i]} - x^{(j)})\psi_k\right\}W'(Y_{ij}/h),$$
$$T_7 = \frac{1}{2\theta_jnh^2}\left(\int_{\mathcal{I}}\bar X\psi_j\right)^2\sum_{i=1}^n Y_{ij}W'(Y_{ij}/h).$$
Since $\sum_i Y_{ij}W'(Y_{ij}/h)$ and $\int_{\mathcal{I}}\bar X\psi_j$ are both expressible in terms of sums of $n$ independent and identically distributed random variables, it is straightforward to show that $T_7 = (nh)^{-1}R_{j5}$, where $\sup_{x\in\mathcal{S}(c)} E(|R_{j5}|^C) = O(n^{\varepsilon})$ for all $C, \varepsilon > 0$. Therefore it suffices to prove that $T_6$ satisfies (2.32), again for a random function $Q_{n3}(x,\delta)$ satisfying (2.28).
Note that
$$\theta_j^{1/2}\,T_6 = (nh)^{-2}(T_{81} + T_{82} + T_{83})\int_{\mathcal{I}}\bar X\psi_j, \eqno(2.34)$$
where, with $\mu_{jk} = E\{W'(Y_{ij}/h)\int_{\mathcal{I}}(X_{[i]} - x^{(j)})\psi_k\}$, we define
$$T_{81} = \sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\sum_{1\le i_1,i_2\le n:\,i_1\ne i_2}\left\{W'(Y_{i_1j}/h)\int_{\mathcal{I}}(X_{[i_1]} - x^{(j)})\psi_k - \mu_{jk}\right\}\left(\int_{\mathcal{I}}X_{[i_2]}\psi_k\right),$$
$$T_{82} = \sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\sum_{i=1}^n\left\{W'(Y_{ij}/h)\int_{\mathcal{I}}(X_{[i]} - x^{(j)})\psi_k - \mu_{jk}\right\}\left(\int_{\mathcal{I}}X_{[i]}\psi_k\right),$$
$$T_{83} = n\sum_{i=1}^n\sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\mu_{jk}\int_{\mathcal{I}}X_{[i]}\psi_k.$$
By Rosenthal's inequality, for each $b > 1$ and each integer $C \ge 1$ there exists a constant $C_1(C) > 0$, depending only on $C$, such that, uniformly in $x \in \mathcal{S}(c)$,
$$E|n^{-1}T_{83}|^{2C} \le C_1(C)\left[\left\{nE\left(\sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\mu_{jk}\int_{\mathcal{I}}X_{[1]}\psi_k\right)^2\right\}^C + nE\left|\sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\mu_{jk}\int_{\mathcal{I}}X_{[1]}\psi_k\right|^{2C}\right] = O(n^Ch^{2C/b}). \eqno(2.35)$$
To derive the last identity in (2.35) we used the fact that, with $\|\mu\|^2$ denoting $\sum_{k\ge1}\mu_{jk}^2$, we have
$$\left|\sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\mu_{jk}\int_{\mathcal{I}}X_{[1]}\psi_k\right| \le \max_{k:k\ne j}|\theta_j - \theta_k|^{-1}\left\{\sum_{k\ge1}\left(\int_{\mathcal{I}}X_{[1]}\psi_k\right)^2\right\}^{1/2}\|\mu\| \le \max_{k:k\ne j}|\theta_j - \theta_k|^{-1}\,\|X_{[1]}\|\,\|\mu\|,$$
and, if $a, b > 1$ satisfy $a^{-1} + b^{-1} = 1$, then, since $f_j$ is bounded, and all moments of $\|X\|$ are finite,
$$\|\mu\|^2 = \sum_{k=1}^\infty\left[E\left\{W'(Y_{ij}/h)\int_{\mathcal{I}}(X_{[i]} - x^{(j)})\psi_k\right\}\right]^2 \le \left[E\left\{|W'(Y_{ij}/h)|\,\|X_{[i]} - x^{(j)}\|\right\}\right]^2 \le B_1^2\,P(|Y_{ij}| \le h)^{2/b}\left\{E(\|X_{[i]} - x^{(j)}\|^a)\right\}^{2/a} = O(h^{2/b}), \eqno(2.36)$$
uniformly in $x \in \mathcal{S}(c)$. Result (2.35) implies that, for each $\varepsilon > 0$ and for a random function $Q_{n5}$ satisfying (2.28),
$$(nh)^{-2}\left|T_{83}\int_{\mathcal{I}}\bar X\psi_j\right| = n^{\varepsilon-1}h^{-1}Q_{n5}(x,\varepsilon). \eqno(2.37)$$
Defining $B_{j2} = \sup_{k:k\ne j}|\theta_j - \theta_k|^{-1}$, we have $|T_{82}| \le B_{j2}(T_{821} + T_{822})$, where
$$T_{821} = \sum_{i=1}^n|W'(Y_{ij}/h)|\left|\sum_{k=1}^\infty\left\{\int_{\mathcal{I}}(X_{[i]} - x^{(j)})\psi_k\right\}\int_{\mathcal{I}}X_{[i]}\psi_k\right| \le \sum_{i=1}^n|W'(Y_{ij}/h)|\,\|X_{[i]} - x^{(j)}\|\,\|X_{[i]}\|,$$
$$T_{822} = \sum_{i=1}^n\left|\sum_{k=1}^\infty\mu_{jk}\int_{\mathcal{I}}X_{[i]}\psi_k\right| \le \|\mu\|\sum_{i=1}^n\|X_{[i]}\|,$$
the inequalities following from the Cauchy–Schwarz inequality. It follows from (2.23) that, for $\ell = 1, 2$, $E|(J/nh)T_{82\ell}|^C = O(n^{\varepsilon})$, uniformly in $x \in \mathcal{S}(c)$ and for all $C, \varepsilon > 0$; in the case $\ell = 2$ we used (2.36). Therefore, $E|(J/nh)T_{82}|^C = O(n^{\varepsilon})$, uniformly in $x \in \mathcal{S}(c)$ and for all $C, \varepsilon > 0$. That is, for all $\varepsilon > 0$,
$$(nh)^{-2}|T_{82}| = n^{\varepsilon-1}h^{-1}Q_{n6}(x,\varepsilon), \eqno(2.38)$$
where $Q_{n6}$ satisfies (2.28).
To bound the moments of $T_{81}$ we write
$$H_1(X_{[i_1]}, X_{[i_2]}) = \sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\left\{W'(Y_{i_1j}/h)\int_{\mathcal{I}}(X_{[i_1]} - x^{(j)})\psi_k - \mu_{jk}\right\}\int_{\mathcal{I}}X_{[i_2]}\psi_k$$
and $H(X_{[i_1]}, X_{[i_2]}) = H_1(X_{[i_1]}, X_{[i_2]}) + H_1(X_{[i_2]}, X_{[i_1]})$, and put
$$S(i_2) = \sum_{i_1=1}^{i_2-1} H(X_{[i_1]}, X_{[i_2]}).$$
The variables $S(i_2)$ are martingale differences, in that $E\{S(i_2)\,|\,\mathcal{F}_{i_2-1}\} = 0$, where $\mathcal{F}_{i_2-1}$ denotes the sigma-field generated by $X_{[1]}, \ldots, X_{[i_2-1]}$. Also,
$$T_{81} = \sum_{1\le i_1,i_2\le n:\,i_1\ne i_2} H_1(X_{[i_1]}, X_{[i_2]}) = \sum_{i=2}^n S(i)$$
is a degenerate U-statistic: $E\{H(X_{[i_1]}, X_{[i_2]})\,|\,X_{[i_1]}\} = 0$ for $i_1 \ne i_2$, and $H(X_{[i_1]}, X_{[i_2]}) = H(X_{[i_2]}, X_{[i_1]})$. To simplify understanding, let $C \ge 1$ be an integer. By Burkholder's inequality for martingales, there exists a constant $C_2(C) > 0$, depending only on $C$, such that
$$E|T_{81}|^{2C} \le C_2(C)\,E\left\{\sum_{i=2}^n S(i)^2\right\}^C \le C_2(C)\,n^{C-1}\sum_{i=2}^n E\{S(i)^{2C}\}. \eqno(2.39)$$
If we condition on $X_{[i_2]}$ then $S(i_2)$ equals a sum of independent random variables with zero means, and so, by Rosenthal's inequality applied to such a sum, and taking the expectation of an expected value conditional on $X_{[i_2]}$,
$$E\{S(i_2)^{2C}\} = E[E\{S(i_2)^{2C}\,|\,X_{[i_2]}\}] \le C_1(C)\left(E\left[\sum_{i_1=1}^{i_2-1}E\{H(X_{[i_1]}, X_{[i_2]})^2\,|\,X_{[i_2]}\}\right]^C + \sum_{i_1=1}^{i_2-1}E\{|H(X_{[i_1]}, X_{[i_2]})|^{2C}\}\right) = C_1(C)\left((i_2 - 1)^CE\big[E\{H(X_{[1]}, X_{[i_2]})^2\,|\,X_{[i_2]}\}\big]^C + (i_2 - 1)E\{|H(X_{[1]}, X_{[i_2]})|^{2C}\}\right). \eqno(2.40)$$
Here, $C_1(C) > 0$ denotes a constant depending only on $C$. Now,
$$|H_1(X_{[i_1]}, X_{[i_2]})| \le B_{j2}\,\|X_{[i_2]}\|\left\{|W'(Y_{i_1j}/h)|\,\|X_{[i_1]} - x^{(j)}\| + \|\mu\|\right\},$$
from which result, (2.36), and arguments borrowed from Step 4, it can be proved that, for each $b > 1$,
$$E\{H(X_{[i_1]}, X_{[i_2]})^{2C}\,|\,X_{[i_2]}\} \le \mathrm{const.}\left[\|X_{[i_2]}\|\,h^{1/b} + \left\{|W'(Y_{i_2j}/h)|\,\|X_{[i_2]} - x^{(j)}\| + h^{1/b}\right\}\right]^{2C}, \eqno(2.41)$$
where the constant depends on $b$ but not on $x \in \mathcal{S}(c)$, $h$ or $n$. Taking $C = 2$, or a general $C$, in (2.41), substituting the resulting bound at (2.41) into (2.40), and taking the expected value on the right-hand side of the latter formula, we deduce that $E\{S(i)^{2C}\} \le \mathrm{const.}\,i^C$, uniformly in $x \in \mathcal{S}(c)$. Substituting the latter bound into (2.39), we deduce that $E|T_{81}|^{2C} \le \mathrm{const.}\,n^{2C}$, uniformly in $x \in \mathcal{S}(c)$, and so
$$(nh)^{-2}|T_{81}| = n^{-1}h^{-2}Q_{n7}(x,\varepsilon), \eqno(2.42)$$
where $\varepsilon$ now can be taken fixed and $Q_{n7}$ satisfies (2.28).
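The device used to bound $T_{81}$ is the rewriting of a degenerate U-statistic as a sum of martingale differences $S(i_2)$, to which Burkholder's and Rosenthal's inequalities can then be applied. The rewriting itself is purely algebraic, as the following toy check illustrates (scalar data and the kernel $H_1(x,y) = xy/2$ are our arbitrary choices, not the kernel of the proof):

```python
import numpy as np

# Algebraic identity behind the bound for T_81: a degenerate U-statistic
# sum_{i1 != i2} H_1(X_{i1}, X_{i2}) equals the sum of the martingale
# differences S(i2) = sum_{i1 < i2} H(X_{i1}, X_{i2}), where
# H(x, y) = H_1(x, y) + H_1(y, x).
rng = np.random.default_rng(2)
n = 30
X = rng.normal(size=n)
H1 = lambda a, b: a * b / 2.0                  # toy degenerate, symmetric kernel
H = lambda a, b: H1(a, b) + H1(b, a)
U = sum(H1(X[i1], X[i2]) for i1 in range(n) for i2 in range(n) if i1 != i2)
S = [sum(H(X[i1], X[i2]) for i1 in range(i2)) for i2 in range(n)]
gap = abs(U - sum(S))                          # zero up to rounding error
```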
Combining (2.34), (2.37), (2.38) and (2.42), and noting that, by (1.4), $n^{1-\delta}h^3$ is bounded away from zero for some $\delta > 0$, we deduce that for some $\delta > 0$,
$$T_6 = n^{-\delta}(nh)^{-1/2}Q_{n8}(x,\delta), \eqno(2.43)$$
where $Q_{n8}$ satisfies (2.28). This shows that $T_6$ satisfies (2.32), which completes the proof of (2.33) (note the comments at the end of the paragraph above (2.34)). Observe too that $T_1 = \tilde f_j(x_j)$; see the second identity in the formula for $T_1$ at (2.27). Together, this result, (2.10) and (2.33) imply that for some $\delta > 0$,
$$\hat f_j(\hat x_j) = \tilde f_j(x_j) + T_4 + n^{-\delta}(nh)^{-1/2}Q_{n9}(x,\delta) + o_p\{(nh)^{-1/2}\}, \eqno(2.44)$$
where $Q_{n9}$ satisfies (2.28) and the $o_p(\cdot)$ term is of the stated order uniformly in $x \in \mathcal{S}(c)$.

Step 7. Bound for $T_4$. Recall, from the first paragraph of Step 6, that
$$T_4 = \frac{1}{n^{3/2}h^2}\sum_{i=1}^n\left[\theta_j^{-1/2}\sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\left\{\int_{\mathcal{I}}(X_{[i]} - x^{(j)})\psi_k\right\}\int Z_1\psi_j\psi_k - \frac{1}{2\theta_j}\,Y_{ij}\int Z_1\psi_j\psi_j\right]W'(Y_{ij}/h),$$
where
$$Z_1(s,t) = n^{-1/2}\sum_{i=1}^n\{X_{[i]}(s)X_{[i]}(t) - K(s,t)\}.$$
Now, $T_4 = \theta_j^{-1/2}(T_{41} + T_{42}) - \frac{1}{2}\theta_j^{-1}T_{43}$, where
$$T_{41} = \frac{1}{(nh)^2}\sum_{1\le i_1,i_2\le n:\,i_1\ne i_2}\sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\left\{\int_{\mathcal{I}}(X_{[i_1]} - x^{(j)})\psi_k\right\}W'(Y_{i_1j}/h)\iint_{\mathcal{I}^2}\{X_{[i_2]}(s)X_{[i_2]}(t) - K(s,t)\}\psi_j(s)\psi_k(t)\,ds\,dt, \eqno(2.45)$$
$$T_{42} = \frac{1}{(nh)^2}\sum_{i=1}^n\sum_{k:k\ne j}(\theta_j - \theta_k)^{-1}\left\{\int_{\mathcal{I}}(X_{[i]} - x^{(j)})\psi_k\right\}W'(Y_{ij}/h)\iint_{\mathcal{I}^2}\{X_{[i]}(s)X_{[i]}(t) - K(s,t)\}\psi_j(s)\psi_k(t)\,ds\,dt, \eqno(2.46)$$
$$T_{43} = \frac{1}{(nh)^2}\left[\sum_{i=1}^n\iint_{\mathcal{I}^2}\{X_{[i]}(s)X_{[i]}(t) - K(s,t)\}\psi_j(s)\psi_j(t)\,ds\,dt\right]\sum_{i=1}^n Y_{ij}W'(Y_{ij}/h). \eqno(2.47)$$
The $2C$th moment of the absolute value of the term within square brackets on the right-hand side of (2.47), which does not depend on $x$, equals $O(n^C)$. Since $\sum_i Y_{ij}W'(Y_{ij}/h)$ equals a sum of independent and identically distributed random variables, it is straightforward to show that, for each $C \ge 2$,
$$E\left|(nh)^{-1}\sum_{i=1}^n Y_{ij}W'(Y_{ij}/h)\right|^{2C} = O\{(nh)^{-C} + h^{2C}\} \eqno(2.48)$$
uniformly in $x \in \mathcal{S}(c)$. (The $h^{2C}$ term on the right-hand side of (2.48) is a contribution from the mean, and is obtained using the assumption, in (1.4), that $f_j$ has a bounded derivative.) Combining the results from (2.47) to this point we deduce that
$$T_{43} = n^{-1/2}h^{-1}\{(nh)^{-1/2} + h\}Q_{n10}(x,1), \eqno(2.49)$$
where $Q_{n10}(x,1)$ satisfies (2.28). Note that
$$|T_{42}| \le \frac{B_1B_{j2}}{(nh)^2}\sum_{i=1}^n\|X_{[i]} - x^{(j)}\|\left[\int_{\mathcal{I}}\left\{\int_{\mathcal{I}}\{X_{[i]}(s)X_{[i]}(t) - K(s,t)\}\psi_j(s)\,ds\right\}^2dt\right]^{1/2}. \eqno(2.50)$$
The series on the right-hand side here equals a sum of independent and identically distributed random variables, and using that property it is straightforward to show that the $2C$th moment of the series equals $O(n^{2C})$ uniformly in $x \in \mathcal{S}(c)$. Therefore, (2.50) implies that
$$T_{42} = n^{-1}h^{-2}Q_{n11}(x,1), \eqno(2.51)$$
where $Q_{n11}(x,1)$ satisfies (2.28). The term $T_{41}$ has the same form as $T_{81}$ in Step 6, and is handled using the martingale argument given there, producing the bound: for some $\delta > 0$,
$$T_{41} = n^{-\delta}(nh)^{-1/2}Q_{n12}(x,\delta), \eqno(2.52)$$
where $Q_{n12}(x,\delta)$ satisfies (2.28). Here we have used the assumption, in (1.4), that for some $\delta > 0$, $n^{1-\delta}h^3$ is bounded away from zero. The latter property, together with (2.49), (2.51) and (2.52), implies that $T_4$ can be represented as was $T_{41}$ at (2.52):
$$T_4 = n^{-\delta}(nh)^{-1/2}Q_{n13}(x,\delta), \eqno(2.53)$$
where $Q_{n13}(x,\delta)$ satisfies (2.28).
Step 8. Lattice argument and completion of the proof of the Theorem. Recall from Step 1 that to establish the theorem it suffices to derive (2.13). Combining (2.44) and (2.53) we see that we have already shown that
$$\hat f_j(\hat x_j) - \tilde f_j(x_j) = \frac{1}{nh}\sum_{i=1}^n\left[W\left\{\frac{\int_{\mathcal{I}}(X_{[i]} - x)\hat\psi_j}{h\,\hat\theta_j^{1/2}}\right\} - W\left\{\frac{\int_{\mathcal{I}}(X_{[i]} - x)\psi_j}{h\,\theta_j^{1/2}}\right\}\right] = n^{-\delta}(nh)^{-1/2}Q_{n14}(x,\delta) + o_p\{(nh)^{-1/2}\}, \eqno(2.54)$$
where $Q_{n14}$ satisfies (2.28), the $o_p(\cdot)$ term in (2.54) is of the stated order uniformly in $x \in \mathcal{S}(c)$, and (see the first paragraph of Step 1) $x^{(j)} = u\psi_j$, with $u = \int_{\mathcal{I}} x\psi_j$. Since $(x^{(j)})^{(j)} = x^{(j)}$, it follows that:
$$\text{(2.54) continues to hold if we replace $x$ by $x^{(j)}$ throughout, and in particular in $Q_{n14}(x,\delta)$.} \eqno(2.55)$$
The set $\mathcal{S}_j(c)$, appearing in (2.13), is the class of all functions $u\psi_j$ for which $|u| \le c$. Let $\mathcal{L}$ denote a regular lattice of points within $[-c,c]$, with adjacent points spaced $n^{-3}$ apart, and, given $u \in [-c,c]$, let $u'$ be the element of $\mathcal{L}$ that is nearest to $u$, breaking any ties arbitrarily. Put $x^{(j)} = u\psi_j$ and $x^{(j)\prime} = u'\psi_j$. For all sufficiently large $n$, $h > n^{-1}$; we consider below only such values of $n$. Since $W$ has a bounded derivative, then, using the representation for $\hat f_j(\hat x_j) - \tilde f_j(x_j)$ given by the sum in (2.54), we deduce that the absolute value, $\Delta(x^{(j)})$ say, of the difference between the two versions of $\hat f_j(\hat x_j) - \tilde f_j(x_j)$, computed on taking $x$ to equal $x^{(j)}$ or $x^{(j)\prime}$, respectively, is bounded above by $B_1h^{-2}|u - u'|\,(\hat\theta_j^{-1/2} + \theta_j^{-1/2}) \le B_1n^{-1}(\hat\theta_j^{-1/2} + \theta_j^{-1/2})$, where $B_1 = \sup|W'|$. Since $\hat\theta_j \to \theta_j$ in probability, it follows that
$$\sup_{x^{(j)}\in\mathcal{S}_j(c)}|\Delta(x^{(j)})| = O_p(n^{-1}).$$
From this result, (2.54) and (2.55) we deduce that, in order to establish (2.13), it suffices to prove that
$$\sup_{x:\,x=u\psi_j,\,u\in\mathcal{L}}|Q_{n14}(x,\delta)| = o_p(n^{\delta}). \eqno(2.56)$$
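The lattice argument works because the kernel sum is Lipschitz in the evaluation score $u$, with constant of order $B_1\theta_j^{-1/2}h^{-2}$, so snapping $u$ to a lattice of spacing $n^{-3}$ perturbs the sum by $O(n^{-1})$ when $h > n^{-1}$. The following numerical illustration uses the simplified "ideal" version with known $\theta_j$; all numerical choices (sample size, bandwidth, kernel) are ours:

```python
import numpy as np

# Lipschitz bound behind the lattice argument, for the ideal kernel sum
# g(u) = (nh)^{-1} sum_i W{(s_i - u) / (theta^{1/2} h)}:
# |g(u) - g(u')| <= B_1 theta^{-1/2} h^{-2} |u - u'|.
rng = np.random.default_rng(3)
n, theta, h = 200, 1.0, 0.2
scores = rng.normal(size=n)                      # ideal scores
W = lambda v: np.where(np.abs(v) < 1, 15.0 / 16.0 * (1 - v ** 2) ** 2, 0.0)
g = lambda u: W((scores - u) / (np.sqrt(theta) * h)).mean() / h
grid = np.linspace(-1.0, 1.0, 100001)
B1 = np.max(np.abs(np.gradient(W(grid), grid)))  # numerical sup |W'|
spacing = float(n) ** -3.0                       # lattice spacing n^{-3}
u = np.sqrt(2.0) / 4.0                           # an evaluation score u
u_prime = np.round(u / spacing) * spacing        # nearest lattice point u'
bound = B1 / np.sqrt(theta) * abs(u - u_prime) / h ** 2
```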
Recall that $J = I(\|K - \hat K\| \le d)$, where $d > 0$ denotes a fixed constant. For each $C, \eta > 0$ for which $C\delta > 3$,
$$P\left\{\sup_{x:\,x=u\psi_j,\,u\in\mathcal{L}}|Q_{n14}(x,\delta)| > \eta n^{\delta}\right\} \le P(J = 0) + \sum_{x:\,x=u\psi_j,\,u\in\mathcal{L}} P\{J = 1,\ |Q_{n14}(x,\delta)| > \eta n^{\delta}\}$$
$$\le P(J = 0) + (2cn^3 + 1)\sup_{x\in\mathcal{S}(c)} P\{J = 1,\ |Q_{n14}(x,\delta)| > \eta n^{\delta}\} \le P(J = 0) + (2cn^3 + 1)(\eta n^{\delta})^{-C}\sup_{x\in\mathcal{S}(c)} E\{J\,|Q_{n14}(x,\delta)|^C\} = P(J = 0) + O(n^{3-C\delta}) = o(1), \eqno(2.57)$$
where the second-last identity follows from (2.28) and the last from the fact that $\|K - \hat K\| \to 0$ in probability, implying that $P(J = 0) \to 0$. Result (2.56) follows from (2.57), thus establishing (2.13) and completing the proof of (1.5).
Step 9. Deriving (2.3) and (2.4). First we derive (2.3), where the initial step is to develop an expansion of $W'(\hat Y_{ij}/h)$ analogous to (2.17):
$$W'(\hat Y_{ij}/h) = W'(Y_{ij}/h) + h^{-1}A_{ij}, \eqno(2.58)$$
where, in place of (2.18),
$$|A_{ij}| \le B_2\,|S_{ij}^{(1)} + S_{ij}^{(2)}|\,\{I(|Y_{ij}| \le 2h) + I(|S_{ij}^{(1)} + S_{ij}^{(2)}| > h)\}.$$
In place of (2.21) we have, for the new definition of $A_{ij}$ at (2.58),
$$\sum_{i=1}^n|A_{ij}| \le \frac{B_2}{n^{1/2}}\sum_{i=1}^n S_{ij8}\{I(|Y_{ij}| \le 2h) + I(S_{ij9} \ge n^{1/2}h)\},$$
where $S_{ij8}$ and $S_{ij9}$ satisfy (2.14). It now follows from (2.23) that
$$\sup_{x\in\mathcal{S}(c)} E\left\{\left(\frac{J}{nh}\sum_{i=1}^n S_{ij8}\{I(|Y_{ij}| \le 2h) + I(S_{ij9} \ge n^{1/2}h)\}\right)^C\right\} = O(n^{\varepsilon})$$
for all $C, \varepsilon > 0$, and therefore
$$\left[E\left\{\left(\frac{J}{nh^2}\sum_{i=1}^n|A_{ij}|\right)^C\right\}\right]^{1/C} = O\{n^{\varepsilon}(nh^2)^{-1/2}\}. \eqno(2.59)$$
From (2.59), using the lattice argument over the one-dimensional set $\mathcal{S}_j(c)$ (see Step 8), we deduce that
$$\sup_{x\in\mathcal{S}(c)}\frac{1}{nh^2}\sum_{i=1}^n|A_{ij}| = O_p\{n^{\varepsilon}(nh^2)^{-1/2}\}. \eqno(2.60)$$
The quantity $(nh)^{-1}\sum_i W'(Y_{ij}/h)$ equals a sum of $n$ independent and identically distributed random variables, and using that property it is straightforward to show (using the lattice argument) that, for all $\varepsilon > 0$,
$$\sup_{x\in\mathcal{S}(c)}\left|\frac{1}{nh}\sum_{i=1}^n W'(Y_{ij}/h)\right| = O_p\{n^{\varepsilon}(nh)^{-1/2} + h\}. \eqno(2.61)$$
(The term $h$ on the right-hand side is a contribution from the mean, and its presence is derived using the assumption, in (1.4), that the first derivative of $f_j$ is bounded.) The desired result (2.3) follows from (2.58), (2.60) and (2.61), on noting the definition of $U_2$ at (2.2).
Finally we derive (2.4). In place of (2.58) and the bound on $|A_{ij}|$ that followed it, we have the inequality
$$|W''(a_i + O_ia)| \le B_2\{I(|Y_{ij}| \le 2h) + I(|S_{ij10}| > h)\} = V_{ij},$$
say, where $S_{ij10}$ can be chosen not to depend on $x \in \mathcal{S}(c)$ and to satisfy (2.14). (Here we use the fact that $|Y_{ij}| \le \theta_j^{-1/2}(\|X_{[i]}\| + c)$ for all $x \in \mathcal{S}(c)$.) Note that $V_{ij}$ depends only on $x^{(j)}$, not on other aspects of $x$, and so to prove (2.4) it suffices to show that, for all $\varepsilon > 0$,
$$\sup_{x\in\mathcal{S}_j(c)}\frac{1}{nh}\sum_{i=1}^n V_{ij} = O_p(n^{\varepsilon}). \eqno(2.62)$$
The arguments leading to (2.23) can be used to show that, for all $C, \varepsilon > 0$,
$$\sup_{x\in\mathcal{S}_j(c)} E\left\{\left(\frac{J}{nh}\sum_{i=1}^n V_{ij}\right)^C\right\} = O(n^{\varepsilon}). \eqno(2.63)$$
The lattice argument applied to derive a supremum over the one-dimensional set $\mathcal{S}_j(c)$, introduced in Step 8, leads directly from (2.63) to (2.62), and hence to (2.4).
References
1. Delaigle A., Hall P. Defining probability density for a distribution of random functions // Ann. Statist. 2010. Vol. 38. P. 1171-1193.
2. Hall P., Hosseini-Nasab M. On properties of functional principal components analysis // J. R. Stat. Soc. Ser. B. 2006. Vol. 68. P. 109-126.
3. Hall P., Hosseini-Nasab M. Theory for high-order bounds in functional principal components analysis // Math. Proc. Camb. Phil. Soc. 2009. Vol. 146. P. 225-256.
4. Petrov V. V. Sums of Independent Random Variables. Berlin: Springer, 1975.
5. Petrov V. V. Limit Theorems of Probability Theory. Sequences of Independent Random Variables. New York: Oxford University Press, 1995.
6. Ramsay J. O., Silverman B. W. Functional Data Analysis. New York: Springer, 2005.
The article was received by the editors on December 21, 2010.