
CC BY

TOMSK STATE UNIVERSITY JOURNAL

2009 Mathematics and Mechanics No. 4(8)

UDC 519.2

V. Konev, S. Pergamenshchikov

NONPARAMETRIC ESTIMATION IN A SEMIMARTINGALE REGRESSION MODEL. PART 2. ROBUST ASYMPTOTIC EFFICIENCY¹

In this paper we prove the asymptotic efficiency of the model selection procedure proposed by the authors in [1]. To this end we introduce the robust risk as the least upper bound of the quadratic risk over a broad class of observation distributions. Asymptotic upper and lower bounds for the robust risk are derived, the asymptotic efficiency of the procedure is proved, and the Pinsker constant is found.

Keywords: Non-parametric regression; Model selection; Sharp oracle inequality; Robust risk; Asymptotic efficiency; Pinsker constant; Semimartingale noise.

AMS 2000 Subject Classifications: Primary: 62G08; Secondary: 62G05

1. Introduction

In this paper we investigate the asymptotic efficiency of the model selection procedure proposed in [1] for estimating a 1-periodic function $S:\mathbb{R}\to\mathbb{R}$, $S\in L_2[0,1]$, in a continuous time regression model

$$dy_t = S(t)\,dt + d\xi_t,\quad 0\le t\le n, \tag{1}$$

with a semimartingale noise $\xi=(\xi_t)_{0\le t\le n}$. The quality of an estimate $\widetilde S$ (any real-valued function measurable with respect to $\sigma\{y_t,\ 0\le t\le n\}$) for $S$ is given by the mean integrated squared error, i.e.

$$R_Q(\widetilde S,S) = E_{Q,S}\,\|\widetilde S-S\|^2, \tag{2}$$

where $E_{Q,S}$ is the expectation with respect to the noise distribution $Q$ given a function $S$;

$$\|S\|^2=\int_0^1 S^2(x)\,dx.$$

The semimartingale noise $(\xi_t)_{0\le t\le n}$ is assumed to take values in the Skorohod space $D[0,n]$ and to have a distribution $Q$ on $D[0,n]$ such that for any function $f$ from $L_2[0,n]$ the stochastic integral

$$I_n(f)=\int_0^n f_s\,d\xi_s \tag{3}$$

is well defined with

$$E_Q I_n(f)=0 \quad\text{and}\quad E_Q I_n^2(f)\le\sigma^*\int_0^n f_s^2\,ds, \tag{4}$$

where $\sigma^*$ is some positive constant which may, in general, depend on $n$, i.e. $\sigma^*=\sigma^*_n$, such that

$$0<\liminf_{n\to\infty}\sigma^*_n\le\limsup_{n\to\infty}\sigma^*_n<\infty. \tag{5}$$

¹ The paper is supported by the RFFI, Grant 09-01-00172-a.

Now we define a robust risk function which is required to measure the quality of an estimate $\widetilde S_n$ provided that the true distribution of the noise $(\xi_t)_{0\le t\le n}$ is known to belong to some family of distributions $\mathcal{Q}^*_n$ which will be specified below. Just as in [2], we define the robust risk as

$$R^*_n(\widetilde S_n,S)=\sup_{Q\in\mathcal{Q}^*_n}R_Q(\widetilde S_n,S). \tag{6}$$

The goal of this paper is to prove that the model selection procedure for estimating $S$ in the model (1) constructed in [1] is asymptotically efficient with respect to this risk. When studying the asymptotic efficiency of this procedure, described in detail in Section 2, we suppose that the unknown function $S$ in the model (1) belongs to the Sobolev ball

$$W^k_r=\Big\{f\in C^k_{per}[0,1],\ \sum_{j=0}^{k}\|f^{(j)}\|^2\le r\Big\}, \tag{7}$$

where $r>0$, $k\ge1$ are some parameters, $C^k_{per}[0,1]$ is the set of $k$ times continuously differentiable functions $f:[0,1]\to\mathbb{R}$ such that $f^{(i)}(0)=f^{(i)}(1)$ for all $0\le i\le k$. The functional class can be written as an ellipsoid in $l_2$, i.e.

$$W^k_r=\Big\{f\in C^k_{per}[0,1]:\ \sum_{j=1}^{\infty}a_j\theta_j^2\le r\Big\}, \tag{8}$$

where $a_j=\sum_{i=0}^{k}(2\pi[j/2])^{2i}$.

In [1] we established a sharp non-asymptotic oracle inequality for the mean integrated squared error (2). The proof of the asymptotic efficiency of the model selection procedure below relies largely on the counterpart of this inequality for the robust risk (6) given in Theorem 2.1.

It will be observed that the notion "nonparametric robust risk" was initially introduced in [3] for estimating a regression curve at a fixed point. The greatest lower bound for such risks was derived there, and a point estimate was found for which this bound is attained. The latter means that the point estimate turns out to be robust efficient. In [4] this approach was applied to pointwise estimation in a heteroscedastic regression model.

The optimal convergence rate of the robust quadratic risks has been obtained in [5] for the non-parametric estimation problem in a continuous time regression model with a coloured noise having unknown correlation properties under full and partial observations. The asymptotic efficiency with respect to the robust quadratic risks has been studied in [2], [6] for the problem of non-parametric estimation in heteroscedastic regression models. In this paper we apply this approach to the model (1).

The rest of the paper is organized as follows. In Section 2 we construct the model selection procedure and formulate (Theorem 2.1) the oracle inequality for the robust risk. Section 3 gives the main results. In Section 4 we consider an example of the model (1) with the Levy type martingale noise. In Sections 5 and 6 we obtain the upper and lower bounds for the robust risk. In Section 7 some technical results are established.

2. Oracle inequality for the robust risk

The model selection procedure is constructed on the basis of a weighted least squares estimate having the form

$$\widehat S_\gamma=\sum_{j=1}^{\infty}\gamma(j)\,\widehat\theta_{j,n}\,\phi_j \quad\text{with}\quad \widehat\theta_{j,n}=\frac1n\int_0^n\phi_j(t)\,dy_t, \tag{9}$$

where $(\phi_j)_{j\ge1}$ is the standard trigonometric basis in $L_2[0,1]$ defined as

$$\phi_1=1,\quad \phi_j(x)=\sqrt2\,\mathrm{Tr}_j(2\pi[j/2]x),\quad j\ge2, \tag{10}$$

where the function $\mathrm{Tr}_j(x)=\cos(x)$ for even $j$ and $\mathrm{Tr}_j(x)=\sin(x)$ for odd $j$; $[x]$ denotes the integer part of $x$. The sample functionals $\widehat\theta_{j,n}$ are estimates of the corresponding Fourier coefficients

$$\theta_j=(S,\phi_j)=\int_0^1S(t)\phi_j(t)\,dt. \tag{11}$$
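The definitions (9)-(11) can be illustrated by a minimal sketch, assuming the observations are available as increments of $y_t$ on a uniform time grid and the stochastic integral in (9) is replaced by a Riemann sum. The function names, the Euler discretisation and the Gaussian noise are illustrative assumptions, not part of the procedure in [1].

```python
import math
import random

def phi(j, x):
    """Trigonometric basis (10): phi_1 = 1, phi_j(x) = sqrt(2) * Tr_j(2*pi*[j/2]*x)."""
    if j == 1:
        return 1.0
    arg = 2.0 * math.pi * (j // 2) * x
    # Tr_j is cos for even j and sin for odd j
    return math.sqrt(2.0) * (math.cos(arg) if j % 2 == 0 else math.sin(arg))

def simulate_increments(S, n, dt, sigma, rng):
    """Euler increments of dy_t = S(t) dt + sqrt(sigma) dw_t on [0, n] (assumed Gaussian noise)."""
    steps = int(round(n / dt))
    scale = math.sqrt(sigma * dt)
    return [S(i * dt) * dt + scale * rng.gauss(0.0, 1.0) for i in range(steps)]

def theta_hat(j, increments, n, dt):
    """Riemann-sum version of (9): (1/n) * sum_i phi_j(t_i) * (y_{t_{i+1}} - y_{t_i})."""
    return sum(phi(j, i * dt) * dy for i, dy in enumerate(increments)) / n
```

With `sigma = 0` the sketch recovers the Fourier coefficients (11) of a trigonometric polynomial up to rounding error; with `sigma > 0` the estimates fluctuate around them with variance of order `sigma / n`.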

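Further we introduce the cost function as

$$J_n(\gamma)=\sum_{j=1}^{\infty}\gamma^2(j)\,\widehat\theta^2_{j,n}-2\sum_{j=1}^{\infty}\gamma(j)\,\widetilde\theta_{j,n}+\rho\,\widehat P_n(\gamma).$$

Here

$$\widetilde\theta_{j,n}=\widehat\theta^2_{j,n}-\frac{\widehat\sigma_n}{n} \quad\text{with}\quad \widehat\sigma_n=\sum_{j=l}^{n}\widehat\theta^2_{j,n},\quad l=[\sqrt n]+1;$$

$\widehat P_n(\gamma)$ is the penalty term defined as

$$\widehat P_n(\gamma)=\frac{\widehat\sigma_n\,|\gamma|^2}{n}.$$

As to the parameter $\rho$, we assume that this parameter is a function of $n$, i.e. $\rho=\rho_n$ such that $0<\rho<1/3$ and

$$\lim_{n\to\infty}n^{\delta}\rho_n=0 \quad\text{for all } \delta>0.$$

We define the model selection procedure as

$$\widehat S_*=\widehat S_{\widehat\gamma}, \tag{12}$$

where $\widehat\gamma$ is the minimizer of the cost function $J_n(\gamma)$ in some given class $\Gamma$ of weight sequences $\gamma=(\gamma(j))_{j\ge1}\in[0,1]^{\infty}$, i.e.

$$\widehat\gamma=\operatorname{argmin}_{\gamma\in\Gamma}J_n(\gamma). \tag{13}$$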
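The selection rule (12), (13) can be sketched as follows. The sketch assumes that the weight sequences are truncated to finite lists, that $\widetilde\theta_{j,n}=\widehat\theta^2_{j,n}-\widehat\sigma_n/n$ with $\widehat\sigma_n=\sum_{j=l}^{n}\widehat\theta^2_{j,n}$, $l=[\sqrt n]+1$, and that the penalty is $\widehat\sigma_n|\gamma|^2/n$, as in [1]; the names `cost` and `select` are hypothetical.

```python
import math

def cost(gamma, theta_hat, n, rho):
    """Cost J_n(gamma); theta_hat[j-1] holds the estimate of the j-th Fourier coefficient."""
    l = math.isqrt(n) + 1
    # variance proxy: sum of squared estimates over the high indices j = l..n
    sigma_hat = sum(t * t for t in theta_hat[l - 1:n])
    penalty = sigma_hat * sum(g * g for g in gamma) / n
    total = rho * penalty
    for g, t in zip(gamma, theta_hat):
        theta_tilde = t * t - sigma_hat / n  # bias-corrected squared coefficient
        total += g * g * t * t - 2.0 * g * theta_tilde
    return total

def select(weights, theta_hat, n, rho):
    """Return the weight sequence from the finite family `weights` that minimises the cost."""
    return min(weights, key=lambda g: cost(g, theta_hat, n, rho))
```

For example, with `theta_hat = [1.0, 0.5, 0.0, 0.0]` and `n = 4` the variance proxy vanishes, and among the projection weights `[1, 0, 0, 0]`, `[1, 1, 0, 0]` and the zero sequence the rule picks `[1, 1, 0, 0]`, which keeps both non-zero coefficients.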

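Now we specify the family of distributions $\mathcal{Q}^*_n$ in the robust risk (6). Let $\mathcal{P}_n$ denote the class of all distributions $Q$ of the semimartingale $(\xi_t)$ satisfying the condition (4).

It is obvious that the distribution $Q_0$ of the process $\xi_t=\sqrt{\sigma^*}\,w_t$, where $(w_t)$ is a standard Brownian motion, enters the class $\mathcal{P}_n$, i.e. $Q_0\in\mathcal{P}_n$. In addition, we need to impose some technical conditions on the distribution $Q$ of the process $(\xi_t)_{0\le t\le n}$. Let us denote

$$\sigma(Q)=\overline{\lim_{n\to\infty}}\ \max_{1\le j\le n}E_Q\,\xi^2_{j,n},$$

where

$$\xi_{j,n}=\frac{1}{\sqrt n}\,I_n(\phi_j)$$

($I_n(\phi_j)$ is given in (3)), and introduce two $\mathcal{P}_n\to\mathbb{R}_+$ functionals

$$L_{1,n}(Q)=\sup_{x\in H,\ \#(x)\le n}\Big|\sum_{j=1}^{\infty}x_j\big(E_Q\,\xi^2_{j,n}-\sigma(Q)\big)\Big|$$

and

$$L_{2,n}(Q)=\sup_{|x|\le1,\ \#(x)\le n}E_Q\Big(\sum_{j=1}^{\infty}x_j\,\widetilde\xi_{j,n}\Big)^2, \tag{14}$$

where $H=[-1,1]^{\infty}$, $|x|^2=\sum_{j=1}^{\infty}x_j^2$, $\#(x)=\sum_{j=1}^{\infty}1_{\{|x_j|>0\}}$ and

$$\widetilde\xi_{j,n}=\xi^2_{j,n}-E_Q\,\xi^2_{j,n}.$$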

Now we consider the family of all distributions $Q$ from $\mathcal{P}_n$ with the growth restriction on $L_{1,n}(Q)+L_{2,n}(Q)$, i.e.

$$\mathcal{P}^*_n=\{Q\in\mathcal{P}_n:\ L_{1,n}(Q)+L_{2,n}(Q)\le l_n\},$$

where $l_n$ is a slowly increasing positive function, i.e. $l_n\to+\infty$ as $n\to+\infty$ and for any $\delta>0$

$$\lim_{n\to\infty}\frac{l_n}{n^{\delta}}=0.$$

It will be observed that any distribution $Q$ from $\mathcal{P}^*_n$ satisfies conditions C1) and C2) on the noise distribution from [1] with $c^*_{1,n}\le l_n$ and $c^*_{2,n}\le l_n$. We recall that these conditions are

C1) $c^*_{1,n}=L_{1,n}(Q)<\infty$;

C2) $c^*_{2,n}=L_{2,n}(Q)<\infty$.

In the sequel we assume that the distribution of the noise $(\xi_t)$ in (1) is known up to its belonging to some distribution family satisfying the following condition.

C*) Let $\mathcal{Q}^*_n$ be a family of distributions $Q$ from $\mathcal{P}^*_n$ such that $Q_0\in\mathcal{Q}^*_n$.

An important example of such a family is given in Section 4.

Now we specify the set $\Gamma$ in the model selection procedure (12) and state the oracle inequality for the robust risk (6) which is a counterpart of that obtained in [1] for the mean integrated squared error (2). Consider the numerical grid

$$\mathcal{A}_n=\{1,\dots,k^*\}\times\{t_1,\dots,t_m\}, \tag{15}$$

where $t_i=i\varepsilon$ and $m=[1/\varepsilon^2]$; the parameters $k^*\ge1$ and $0<\varepsilon\le1$ are functions of $n$, i.e. $k^*=k^*(n)$ and $\varepsilon=\varepsilon(n)$, such that for any $\delta>0$

$$\lim_{n\to\infty}k^*(n)=+\infty,\quad \lim_{n\to\infty}\frac{k^*(n)}{\ln n}=0,\quad \lim_{n\to\infty}\varepsilon(n)=0 \quad\text{and}\quad \lim_{n\to\infty}n^{\delta}\varepsilon(n)=+\infty. \tag{16}$$

For example, one can take

$$\varepsilon(n)=\frac{1}{\ln(n+1)} \quad\text{and}\quad k^*(n)=\sqrt{\ln(n+1)}$$

for $n\ge1$.

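Define the set $\Gamma$ as

$$\Gamma=\{\gamma_\alpha,\ \alpha\in\mathcal{A}_n\}, \tag{17}$$

where $\gamma_\alpha$ is the weight sequence corresponding to an element $\alpha=(\beta,t)\in\mathcal{A}_n$, given by the formula

$$\gamma_\alpha(j)=1_{\{1\le j\le j_0\}}+\big(1-(j/\omega_\alpha)^{\beta}\big)1_{\{j_0<j\le\omega_\alpha\}}, \tag{18}$$

where $j_0=j_0(\alpha)=[\omega_\alpha/(1+\ln n)]$, $\omega_\alpha=(\tau_\beta\,t\,n)^{1/(2\beta+1)}$ and

$$\tau_\beta=\frac{(\beta+1)(2\beta+1)}{\pi^{2\beta}\beta}.$$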
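The grid (15) and the weights (18) can be generated by a short sketch. It assumes the example choice $\varepsilon(n)=1/\ln(n+1)$, $k^*(n)=\sqrt{\ln(n+1)}$, takes the integer part of $k^*(n)$ as the largest smoothness index (an assumption, since the grid needs integers), and truncates each weight sequence at $j=n$. The function names are illustrative.

```python
import math

def tau(beta):
    """Coefficient tau_beta = (beta+1)(2*beta+1) / (pi^(2*beta) * beta) from (18)."""
    return (beta + 1) * (2 * beta + 1) / (math.pi ** (2 * beta) * beta)

def weight_sequence(beta, t, n):
    """Weights gamma_alpha(j), j = 1..n, for alpha = (beta, t) as in (18)."""
    omega = (tau(beta) * t * n) ** (1.0 / (2 * beta + 1))
    j0 = int(omega / (1.0 + math.log(n)))
    gamma = []
    for j in range(1, n + 1):
        if j <= j0:
            gamma.append(1.0)                         # low frequencies are kept as is
        elif j <= omega:
            gamma.append(1.0 - (j / omega) ** beta)   # Pinsker-type taper
        else:
            gamma.append(0.0)                         # high frequencies are dropped
    return gamma

def grid(n):
    """Grid A_n = {1..k*} x {t_1..t_m} with t_i = i*eps, m = [1/eps^2]."""
    eps = 1.0 / math.log(n + 1)
    k_star = max(1, int(math.sqrt(math.log(n + 1))))  # integer part: an assumption
    m = int(1.0 / eps ** 2)
    return [(beta, i * eps) for beta in range(1, k_star + 1) for i in range(1, m + 1)]
```

The family $\Gamma$ in (17) is then `[weight_sequence(b, t, n) for (b, t) in grid(n)]`; its size grows only polylogarithmically in `n`, which keeps the minimisation (13) cheap.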

Along the lines of the proof of Theorem 1 in [1] one can establish the following result.

Theorem 2.1. Assume that the unknown function $S$ is continuously differentiable and the distribution family $\mathcal{Q}^*_n$ in the robust risk (6) satisfies the condition C*). Then the estimator (12), for any $n\ge1$, satisfies the oracle inequality

$$R^*_n(\widehat S_*,S)\le\frac{1+3\rho-2\rho^2}{1-3\rho}\,\min_{\gamma\in\Gamma}R^*_n(\widehat S_\gamma,S)+\frac1n\,D_n(\rho), \tag{19}$$

where the term $D_n(\rho)$ is defined in [1] such that

$$\lim_{n\to\infty}\frac{D_n(\rho)}{n^{\delta}}=0 \tag{20}$$

for each $\delta>0$.

Remark 2.1. The inequality (19) will be used to derive the upper bound for the robust risk (6). It will be noted that the second summand in (19), when multiplied by the optimal rate $n^{2k/(2k+1)}$, tends to zero as $n\to\infty$ for each $k\ge1$. Therefore, taking into account that $\rho\to0$ as $n\to\infty$, the principal term in the upper bound is given by the minimal risk over the family of estimates $(\widehat S_\gamma)_{\gamma\in\Gamma}$. As is shown in [7], the efficient estimate enters this family. However, one cannot use this estimate because it depends on the unknown parameters $k\ge1$ and $r>0$ of the Sobolev ball. It is this fact that shows the adaptive role of the oracle inequality (19), which gives the asymptotic upper bound in the case when this information is not available.

3. Main results

In this Section we will show, proceeding from (19), that the Pinsker constant for the robust risk (6) is given by the equation

$$R^*_{k,n}=\big((2k+1)r\big)^{1/(2k+1)}\left(\frac{\sigma^*_n\,k}{(k+1)\pi}\right)^{2k/(2k+1)}. \tag{21}$$

It is well known that the optimal (minimax) rate for the Sobolev ball $W^k_r$ is $n^{2k/(2k+1)}$ (see, for example, [8, 9]). We will see that asymptotically the robust risk of the model selection (12) normalized by this rate is bounded from above by $R^*_{k,n}$. Moreover, this bound cannot be diminished if one considers the class of all admissible estimates for $S$.
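To give a feeling for the size of the constant (21), here is a small sketch that evaluates it and the resulting leading term $R^*_{k,n}\,n^{-2k/(2k+1)}$ of the robust risk. The argument `sigma` stands for $\sigma^*_n$ and is treated as a fixed number, which is an assumption made only for illustration.

```python
import math

def pinsker_constant(k, r, sigma):
    """R*_{k,n} from (21) for smoothness k, Sobolev radius r and noise bound sigma."""
    p = 2 * k + 1
    return (p * r) ** (1.0 / p) * (sigma * k / ((k + 1) * math.pi)) ** (2.0 * k / p)

def asymptotic_risk(n, k, r, sigma):
    """Leading term R*_{k,n} * n^(-2k/(2k+1)) of the minimax robust risk."""
    return pinsker_constant(k, r, sigma) * n ** (-2.0 * k / (2 * k + 1))

for n in (10**2, 10**4, 10**6):
    print(n, asymptotic_risk(n, k=2, r=1.0, sigma=1.0))
```

Larger `k` gives a faster rate, while `r` and `sigma` enter only through the constant.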

Theorem 3.1. Assume that, in model (1), the distribution of $(\xi_t)$ satisfies the condition C*). Then the robust risk (6) of the model selection estimator $\widehat S_*$ defined in (12), (17) has the following asymptotic upper bound

$$\limsup_{n\to\infty}\,n^{2k/(2k+1)}\,\frac{1}{R^*_{k,n}}\,\sup_{S\in W^k_r}R^*_n(\widehat S_*,S)\le1. \tag{22}$$

Now we obtain a lower bound for the robust risk (6). Let $\Pi_n$ be the set of all estimators $\widetilde S_n$ measurable with respect to the sigma-algebra $\sigma\{y_t,\ 0\le t\le n\}$ generated by the process (1).

Theorem 3.2. Under the conditions of Theorem 3.1

$$\liminf_{n\to\infty}\,n^{2k/(2k+1)}\,\frac{1}{R^*_{k,n}}\,\inf_{\widetilde S_n\in\Pi_n}\,\sup_{S\in W^k_r}R^*_n(\widetilde S_n,S)\ge1. \tag{23}$$

Theorem 3.1 and Theorem 3.2 imply the following result.

Corollary 3.3. Under the conditions of Theorem 3.1

$$\lim_{n\to\infty}\,n^{2k/(2k+1)}\,\frac{1}{R^*_{k,n}}\,\inf_{\widetilde S_n\in\Pi_n}\,\sup_{S\in W^k_r}R^*_n(\widetilde S_n,S)=1. \tag{24}$$

Remark 3.1. The equation (24) means that the sequence $R^*_{k,n}$ defined by (21) is the Pinsker constant (see, for example, [8, 9]) for the model (1).

4. Example

Let the process $(\xi_t)$ be defined as

$$\xi_t=\varrho_1w_t+\varrho_2z_t, \tag{25}$$

where $(w_t)_{t\ge0}$ is a standard Brownian motion, $(z_t)_{t\ge0}$ is a compound Poisson process defined as

$$z_t=\sum_{j=1}^{N_t}Y_j,$$

where $(N_t)_{t\ge0}$ is a standard homogeneous Poisson process with unknown intensity $\lambda>0$ and $(Y_j)_{j\ge1}$ is an i.i.d. sequence of random variables with

$$EY_j=0,\quad EY_j^2=1 \quad\text{and}\quad EY_j^4<\infty.$$

Substituting (25) in (3) yields

$$E\,I_n^2(f)=(\varrho_1^2+\varrho_2^2\lambda)\int_0^nf_s^2\,ds.$$

In order to meet the condition (4) the coefficients $\varrho_1$, $\varrho_2$ and the intensity $\lambda>0$ must satisfy the inequality

$$\varrho_1^2+\varrho_2^2\lambda\le\sigma^*. \tag{26}$$

Note that the coefficients $\varrho_1$, $\varrho_2$ and the intensity $\lambda$ in (26), as well as $\sigma^*$, may depend on $n$, i.e. $\varrho_i=\varrho_i(n)$ and $\lambda=\lambda(n)$.

As is stated in [1], Theorem 2, the conditions C1) and C2) hold for the process (25) with $\sigma=\sigma(Q)=\varrho_1^2+\varrho_2^2\lambda$ defined in (14), $c^*_{1,n}=0$ and

$$c^*_{2,n}\le4\sigma\big(\sigma+\varrho_2^2\,E\,Y_1^4\big).$$

Let now $\mathcal{Q}^*_n$ be the family of distributions of the processes (25) with the coefficients satisfying the conditions (26) and

where the sequence $l_n$ is taken from the definition of the set $\mathcal{P}^*_n$. Note that the distribution $Q_0$ belongs to $\mathcal{Q}^*_n$. One can obtain this distribution by putting $\varrho_1=\sqrt{\sigma^*}$ and $\varrho_2=0$ in (25). It will be noted that $\mathcal{Q}^*_n\subset\mathcal{P}^*_n$ if

4a*(ct* +^finE Y/) < ln .

5. Upper bound

5.1. Known smoothness

First we suppose that the parameters $k\ge1$, $r>0$ and $\sigma^*$ in (4) are known. Let the family of admissible weighted least squares estimates $(\widehat S_\gamma)_{\gamma\in\Gamma}$ for the unknown function $S\in W^k_r$ be given by (17), (18). Consider the pair

$$\alpha_0=(k,t_0),$$

where $t_0=[\bar r_n/\varepsilon]\varepsilon$, $\bar r_n=r/\sigma^*_n$ and $\varepsilon$ satisfies the conditions in (16). Denote the corresponding weight sequence in $\Gamma$ as

$$\gamma_0=\gamma_{\alpha_0}. \tag{28}$$

Note that for sufficiently large $n$ the parameter $\alpha_0$ belongs to the grid (15), so that $\gamma_0$ belongs to the set (17). In this section we obtain the upper bound for the robust risk (6) of the estimator $\widehat S_{\gamma_0}$.

Theorem 5.1. The estimator $\widehat S_{\gamma_0}$ satisfies the following asymptotic upper bound

$$\limsup_{n\to\infty}\,n^{2k/(2k+1)}\,\frac{1}{R^*_{k,n}}\,\sup_{S\in W^k_r}R^*_n(\widehat S_{\gamma_0},S)\le1. \tag{29}$$

Proof. First, by substituting the model (1) in the definition of $\widehat\theta_{j,n}$ in (9) we obtain

$$\widehat\theta_{j,n}=\theta_j+\frac{1}{\sqrt n}\,\xi_{j,n},$$

where the random variables $\xi_{j,n}$ are defined in (14). Therefore, by the definition of the estimators $\widehat S_\gamma$ in (9) we get

$$\|\widehat S_{\gamma_0}-S\|^2=\sum_{j=1}^{\infty}(1-\gamma_0(j))^2\theta_j^2-2M_n+\frac1n\sum_{j=1}^{\infty}\gamma_0^2(j)\,\xi^2_{j,n}$$

with

$$M_n=\frac{1}{\sqrt n}\sum_{j=1}^{\infty}(1-\gamma_0(j))\,\gamma_0(j)\,\theta_j\,\xi_{j,n}.$$

It should be observed that

$$E_{Q,S}\,M_n=0$$

for any $Q\in\mathcal{Q}^*_n$. Further, the condition (4) implies also the inequality $E_Q\,\xi^2_{j,n}\le\sigma^*_n$ for each distribution $Q\in\mathcal{Q}^*_n$. Thus,

$$R^*_n(\widehat S_{\gamma_0},S)\le\sum_{j=\iota_0}^{\infty}(1-\gamma_0(j))^2\theta_j^2+\frac{\sigma^*_n}{n}\sum_{j=1}^{\infty}\gamma_0^2(j), \tag{30}$$

where $\iota_0=j_0(\alpha_0)$. Denote

$$U_n=n^{2k/(2k+1)}\,\sup_{j\ge\iota_0}\,(1-\gamma_0(j))^2/a_j,$$

where $(a_j)$ is the sequence defined in (8). Using this sequence we estimate the first summand on the right-hand side of (30) as

$$n^{2k/(2k+1)}\sum_{j=\iota_0}^{\infty}(1-\gamma_0(j))^2\theta_j^2\le U_n\sum_{j\ge\iota_0}a_j\theta_j^2.$$

From here and (8) we obtain that for each $S\in W^k_r$

$$\Upsilon_{1,n}(S)=n^{2k/(2k+1)}\sum_{j=\iota_0}^{\infty}(1-\gamma_0(j))^2\theta_j^2\le U_n\,r.$$

Further we note that

$$\limsup_{n\to\infty}\,(\bar r_n)^{2k/(2k+1)}\,U_n\le\frac{1}{\pi^{2k}(\tau_k)^{2k/(2k+1)}},$$

where the coefficient $\tau_k$ is given in (18). Therefore, for any $\eta>0$ and sufficiently large $n\ge1$

$$\sup_{S\in W^k_r}\Upsilon_{1,n}(S)\le(1+\eta)\,(\sigma^*_n)^{2k/(2k+1)}\,\Upsilon^*_1, \tag{31}$$

where

$$\Upsilon^*_1=\frac{r^{1/(2k+1)}}{\pi^{2k}(\tau_k)^{2k/(2k+1)}}.$$

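To examine the second summand on the right-hand side of (30) we set

$$\Upsilon_{2,n}=\frac{1}{n^{1/(2k+1)}}\sum_{j=1}^{\infty}\gamma_0^2(j).$$

Since by the condition (5)

$$\lim_{n\to\infty}\frac{\Upsilon_{2,n}}{(\bar r_n)^{1/(2k+1)}}=\frac{2(\tau_k)^{1/(2k+1)}k^2}{(k+1)(2k+1)},$$

the second summand in (30), multiplied by $n^{2k/(2k+1)}$, equals $\sigma^*_n\Upsilon_{2,n}$ and is asymptotically equivalent to $(\sigma^*_n)^{2k/(2k+1)}\,\Upsilon^*_2$ with

$$\Upsilon^*_2=\frac{2(\tau_k\,r)^{1/(2k+1)}k^2}{(k+1)(2k+1)}.$$

Note that by the definition (21)

$$R^*_{k,n}=(\sigma^*_n)^{2k/(2k+1)}\big(\Upsilon^*_1+\Upsilon^*_2\big).$$

Therefore, for any $\eta>0$ and sufficiently large $n\ge1$

$$n^{2k/(2k+1)}\,\sup_{S\in W^k_r}R^*_n(\widehat S_{\gamma_0},S)\le(1+\eta)\,R^*_{k,n}.$$

Hence Theorem 5.1. □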
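The last step of the proof rests on the fact that the bias and variance constants add up to the Pinsker constant (21) divided by $(\sigma^*_n)^{2k/(2k+1)}$. The following sketch checks this identity numerically. The closed form used for the variance constant, $2(\tau_k r)^{1/(2k+1)}k^2/((k+1)(2k+1))$, is an assumption derived from $\sum_j\gamma_0^2(j)\approx\omega_{\alpha_0}\int_0^1(1-x^k)^2dx$; the function names are illustrative.

```python
import math

def tau(k):
    """tau_k = (k+1)(2k+1) / (pi^(2k) * k), as in (18)."""
    return (k + 1) * (2 * k + 1) / (math.pi ** (2 * k) * k)

def bias_constant(k, r):
    """r^(1/(2k+1)) / (pi^(2k) * tau_k^(2k/(2k+1))), the constant in (31)."""
    p = 2 * k + 1
    return r ** (1.0 / p) / (math.pi ** (2 * k) * tau(k) ** (2.0 * k / p))

def variance_constant(k, r):
    """Assumed closed form 2 * (tau_k * r)^(1/(2k+1)) * k^2 / ((k+1)(2k+1))."""
    p = 2 * k + 1
    return 2.0 * (tau(k) * r) ** (1.0 / p) * k ** 2 / ((k + 1) * p)

def pinsker_over_sigma(k, r):
    """R*_{k,n} / sigma^(2k/(2k+1)) computed directly from (21)."""
    p = 2 * k + 1
    return (p * r) ** (1.0 / p) * (k / ((k + 1) * math.pi)) ** (2.0 * k / p)

for k in (1, 2, 5):
    total = bias_constant(k, 3.0) + variance_constant(k, 3.0)
    print(k, total, pinsker_over_sigma(k, 3.0))
```

The two printed columns agree up to rounding, which is the algebraic content of the step "by the definition (21)".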

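5.2. Unknown smoothness

Combining Theorem 2.1 and Theorem 5.1 yields Theorem 3.1. Indeed, since $\gamma_0\in\Gamma$ for sufficiently large $n$, the oracle inequality (19) bounds the robust risk of $\widehat S_*$ by that of $\widehat S_{\gamma_0}$ up to a factor tending to one and a remainder that is negligible at the rate $n^{2k/(2k+1)}$. □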

6. Lower bound

First we obtain the lower bound for the risk (2) in the case of the "white noise" model (1), when $\xi_t=\sqrt{\sigma^*}\,w_t$. As before, let $Q_0$ denote the distribution of $(\xi_t)_{0\le t\le n}$ in $D[0,n]$.

Theorem 6.1. The risk (2) corresponding to the distribution $Q_0$ in the model (1) has the following lower bound

$$\liminf_{n\to\infty}\,n^{2k/(2k+1)}\,\inf_{\widetilde S_n\in\Pi_n}\,\frac{1}{R^*_{k,n}}\,\sup_{S\in W^k_r}R_0(\widetilde S_n,S)\ge1, \tag{32}$$

where $R_0(\cdot,\cdot)=R_{Q_0}(\cdot,\cdot)$.

Proof. The proof of this result proceeds along the lines of Theorem 4.2 from [2]. Let $V$ be a function from $C^{\infty}(\mathbb{R})$ such that $V(x)\ge0$, $\int_{-1}^{1}V(x)\,dx=1$ and $V(x)=0$ for $|x|\ge1$. For each $0<\eta<1$ we introduce a smoother indicator of the interval $[-1+\eta,\,1-\eta]$ by the formula

$$I_\eta(x)=\eta^{-1}\int_{\mathbb{R}}1_{(|u|\le1-\eta)}\,V\Big(\frac{u-x}{\eta}\Big)\,du.$$

It will be noted that $I_\eta\in C^{\infty}(\mathbb{R})$, $0\le I_\eta\le1$ and for any $m\ge1$ and positive constant $c>0$

$$\lim_{\eta\to0}\ \sup_{\{f:\,|f|_*\le c\}}\Big|\int_{\mathbb{R}}f(x)\,I^m_\eta(x)\,dx-\int_{-1}^{1}f(x)\,dx\Big|=0, \tag{33}$$

where $|f|_*=\sup_{-1\le x\le1}|f(x)|$. Further, we need the trigonometric basis in $L_2[-1,1]$, that is

$$e_1(x)=1/\sqrt2,\quad e_j(x)=\mathrm{Tr}_j(\pi[j/2]x),\quad j\ge2. \tag{34}$$

Now we will construct a family of approximation functions for a given regression function $S$ following [2]. For fixed $0<\varepsilon<1$ one chooses the bandwidth function as

$$h=h_n=(\upsilon^*_\varepsilon)^{\frac{1}{2k+1}}\,N_n\,n^{-\frac{1}{2k+1}} \tag{35}$$

with

$$\upsilon^*_\varepsilon=\frac{\sigma^*_n\,k\,\pi^{2k}}{(1-\varepsilon)\,r\,2^{2k+1}(k+1)(2k+1)} \quad\text{and}\quad N_n=\ln^4n,$$

and considers the partition of the interval $[0,1]$ with the points $x_m=2hm$, $1\le m\le M$, where

$$M=[1/(2h)]-1.$$

For each interval $[x_m-h,\,x_m+h]$ we specify the smoothed indicator as $I_\eta(v_m(x))$, where $v_m(x)=(x-x_m)/h$. The approximation function for $S(t)$ is given by

$$S_{z,n}(x)=\sum_{m=1}^{M}\sum_{j=1}^{N}z_{m,j}\,D_{m,j}(x), \tag{36}$$

where $z=(z_{m,j})_{1\le m\le M,\,1\le j\le N}$ is an array of real numbers;

$$D_{m,j}(x)=e_j(v_m(x))\,I_\eta(v_m(x))$$

are orthogonal functions on $[0,1]$.

Note that the set $W^k_r$ is a subset of the ball

$$B_r=\{f\in L_2[0,1]:\ \|f\|^2\le r\}.$$

Now for a given estimate $\widetilde S_n$ we construct its projection in $L_2[0,1]$ onto $B_r$

$$\widetilde F_n:=\mathrm{Pr}_{B_r}(\widetilde S_n).$$

In view of the convexity of the set $B_r$ one has

$$\|\widetilde S_n-S\|^2\ge\|\widetilde F_n-S\|^2$$

for each $S\in W^k_r\subset B_r$.

From here one gets the following inequalities for the risk (2)

$$\sup_{S\in W^k_r}R_0(\widetilde S_n,S)\ \ge\ \sup_{S\in W^k_r}R_0(\widetilde F_n,S)\ \ge\ \sup_{\{z\in\mathbb{R}^d:\ S_{z,n}\in W^k_r\}}R_0(\widetilde F_n,S_{z,n}),$$

where $d=MN$.

In order to continue this chain of estimates we need to introduce a special prior distribution on $\mathbb{R}^d$. Let $\kappa=(\kappa_{m,j})_{1\le m\le M,\,1\le j\le N}$ be a random array with the elements

$$\kappa_{m,j}=t_{m,j}\,\kappa^*_{m,j}, \tag{37}$$

where $\kappa^*_{m,j}$ are i.i.d. Gaussian $N(0,1)$ random variables and the coefficients

$$t_{m,j}=\frac{\sqrt{\sigma^*_n\,y^*_j}}{\sqrt{nh}}.$$

We choose the sequence $(y^*_j)_{1\le j\le N}$ in the same way as in [2] (see (8.11)), i.e.

$$y^*_j=N_n^k\,j^{-k}-1.$$

We denote the distribution of $\kappa$ by $\mu_\kappa$. We will consider it as a prior distribution of the random parametric regression $S_{\kappa,n}$ which is obtained from (36) by replacing $z$ with $\kappa$. Besides we introduce

$$\Xi_n=\Big\{z\in\mathbb{R}^d:\ \max_{1\le m\le M}\ \max_{1\le j\le N}\frac{|z_{m,j}|}{t_{m,j}}\le\ln n\Big\}. \tag{38}$$

By making use of the distribution $\mu_\kappa$, one obtains

$$\sup_{S\in W^k_r}R_0(\widetilde S_n,S)\ \ge\ \int_{\{z\in\mathbb{R}^d:\ S_{z,n}\in W^k_r\}\cap\Xi_n}E_{Q_0,S_{z,n}}\|\widetilde F_n-S_{z,n}\|^2\,\mu_\kappa(dz).$$

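Further we introduce the Bayes risk as

$$\widetilde R(\widetilde F_n)=\int_{\mathbb{R}^d}R_0(\widetilde F_n,S_{z,n})\,\mu_\kappa(dz)$$

and noting that $\|\widetilde F_n\|^2\le r$ we come to the inequality

$$\sup_{S\in W^k_r}R_0(\widetilde S_n,S)\ \ge\ \widetilde R(\widetilde F_n)-\varpi_n, \tag{39}$$

where $\varpi_n=E\big(1_{\{S_{\kappa,n}\notin W^k_r\}}+1_{\Xi^c_n}\big)\big(r+\|S_{\kappa,n}\|^2\big)$.

By Proposition A.1 from Appendix A.1 one has, for any $p>0$,

$$\lim_{n\to\infty}n^p\,\varpi_n=0.$$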

Now we consider the first term on the right-hand side of (39). To obtain a lower bound for this term we use the $L_2[0,1]$-orthonormal function family $(G_{m,j})_{1\le m\le M,\,1\le j\le N}$ which is defined as

$$G_{m,j}(x)=\frac{1}{\sqrt h}\,e_j(v_m(x))\,1_{(|v_m(x)|\le1)}.$$

We denote by $\widetilde g_{m,j}$ and $g_{m,j}(z)$ the Fourier coefficients for the functions $\widetilde F_n$ and $S_{z,n}$ respectively, i.e.

$$\widetilde g_{m,j}=\int_0^1\widetilde F_n(x)\,G_{m,j}(x)\,dx \quad\text{and}\quad g_{m,j}(z)=\int_0^1S_{z,n}(x)\,G_{m,j}(x)\,dx.$$

Now it is easy to see that

$$\|\widetilde F_n-S_{z,n}\|^2\ \ge\ \sum_{m=1}^{M}\sum_{j=1}^{N}\big(\widetilde g_{m,j}-g_{m,j}(z)\big)^2.$$

Let us introduce the functionals $K_j(\cdot):L_1[-1,1]\to\mathbb{R}$ as

$$K_j(f)=\int_{-1}^{1}e_j^2(v)\,f(v)\,dv.$$

In view of (36) we obtain that

$$\frac{\partial}{\partial z_{m,j}}\,g_{m,j}(z)=\int_0^1D_{m,j}(x)\,G_{m,j}(x)\,dx=\sqrt h\,K_j(I_\eta).$$

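Now Proposition A.2 implies

$$\widetilde R(\widetilde F_n)\ \ge\ \sum_{m=1}^{M}\sum_{j=1}^{N}\int_{\mathbb{R}^d}E_{S_{z,n}}\big(\widetilde g_{m,j}-g_{m,j}(z)\big)^2\,\mu_\kappa(dz)\ \ge\ h\sum_{m=1}^{M}\sum_{j=1}^{N}\frac{\sigma^*_n\,K_j^2(I_\eta)}{K_j(I^2_\eta)\,nh+t^{-2}_{m,j}\,\sigma^*_n}.$$

Therefore, taking into account the definition of the coefficients $(t_{m,j})$ in (37) we get

$$\widetilde R(\widetilde F_n)\ \ge\ \frac{\sigma^*_n}{n}\,M\sum_{j=1}^{N}\tau_j(\eta,y^*_j)$$

with

$$\tau_j(\eta,y)=\frac{K_j^2(I_\eta)\,y}{K_j(I^2_\eta)\,y+1}.$$

Moreover, the limit equality (33) implies directly

$$\limsup_{\eta\to0}\ \sup_{j\ge1}\ \sup_{y>0}\Big|\frac{(y+1)\,\tau_j(\eta,y)}{y}-1\Big|=0.$$

Therefore, since $M\ge1/(2h)-2$ and $h\to0$, we can write that for any $\nu>0$, sufficiently small $\eta$ and sufficiently large $n$

$$\widetilde R(\widetilde F_n)\ \ge\ \frac{\sigma^*_n}{2nh(1+\nu)}\sum_{j=1}^{N}\frac{y^*_j}{y^*_j+1}.$$

It is easy to check directly that

$$\lim_{n\to\infty}\frac{n^{2k/(2k+1)}\,\sigma^*_n}{2nh\,R^*_{k,n}}\sum_{j=1}^{N}\frac{y^*_j}{y^*_j+1}=(1-\varepsilon)^{\frac{1}{2k+1}},$$

where the coefficient $R^*_{k,n}$ is defined in (21). Therefore, (39) implies for any $0<\varepsilon<1$

$$\liminf_{n\to\infty}\ \inf_{\widetilde S_n}\ n^{\frac{2k}{2k+1}}\,\frac{1}{R^*_{k,n}}\,\sup_{S\in W^k_r}R_0(\widetilde S_n,S)\ \ge\ (1-\varepsilon)^{\frac{1}{2k+1}}.$$

Taking here the limit as $\varepsilon\to0$ implies Theorem 6.1. □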
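The value of the last limit can be checked numerically. The sketch below assumes $N=[N_n]$ basis functions per interval, $y^*_j=N_n^kj^{-k}-1$ (so that $y^*_j/(y^*_j+1)=1-(j/N_n)^k$), and a fixed value `sigma` in place of $\sigma^*_n$. The check is purely algebraic: it does not require the bandwidth `h` to be small for the chosen `n`.

```python
import math

def pinsker_constant(k, r, sigma):
    """R*_{k,n} from (21)."""
    p = 2 * k + 1
    return (p * r) ** (1.0 / p) * (sigma * k / ((k + 1) * math.pi)) ** (2.0 * k / p)

def lower_bound_ratio(n, k, r, sigma, eps):
    """n^(2k/(2k+1)) * sigma / (2 n h R*) * sum_j y*_j / (y*_j + 1) for the bandwidth (35)."""
    p = 2 * k + 1
    upsilon = sigma * k * math.pi ** (2 * k) / ((1.0 - eps) * r * 2 ** p * (k + 1) * p)
    big_n = math.log(n) ** 4                       # N_n = ln^4 n
    h = upsilon ** (1.0 / p) * big_n * n ** (-1.0 / p)
    # y*_j / (y*_j + 1) = 1 - (j / N_n)^k for y*_j = N_n^k j^(-k) - 1
    total = sum(1.0 - (j / big_n) ** k for j in range(1, int(big_n) + 1))
    rate = n ** (2.0 * k / p)
    return rate * sigma * total / (2.0 * n * h * pinsker_constant(k, r, sigma))

print(lower_bound_ratio(10**6, 2, 1.0, 1.0, 0.1), 0.9 ** (1.0 / 5.0))
```

The two printed numbers differ only by the discretisation error of the sum, which vanishes as $N_n\to\infty$.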

7. Appendix

A.1. Properties of the parametric family (36)

In this subsection we consider the sequence of the random functions $S_{\kappa,n}$ defined in (36) corresponding to the random array $\kappa=(\kappa_{m,j})_{1\le m\le M,\,1\le j\le N}$ given in (37).

Proposition A.1. For any $p>0$

$$\lim_{n\to\infty}n^p\,E\,\|S_{\kappa,n}\|^2\big(1_{\{S_{\kappa,n}\notin W^k_r\}}+1_{\Xi^c_n}\big)=0.$$

This proposition follows directly from Proposition 6.4 in [6].

A.2. Lower bound for parametric “white noise” models

In this subsection we prove a version of the van Trees inequality from [10] for the following model

$$dy_t=S(t,z)\,dt+\sqrt{\sigma^*}\,dw_t,\quad 0\le t\le n, \tag{A.1}$$

where $z=(z_1,\dots,z_d)'$ is a vector of unknown parameters and $w=(w_t)_{0\le t\le n}$ is a Wiener process. We assume that the function $S(t,z)$ is a linear function with respect to the parameter $z$, i.e.

$$S(t,z)=\sum_{j=1}^{d}z_j\,S_j(t). \tag{A.2}$$

Moreover, we assume that the functions $(S_j)_{1\le j\le d}$ are continuous.

Let $\Phi$ be a prior density in $\mathbb{R}^d$ having the following form:

$$\Phi(z)=\Phi(z_1,\dots,z_d)=\prod_{j=1}^{d}\varphi_j(z_j),$$

where $\varphi_j$ is some continuously differentiable density in $\mathbb{R}$. Moreover, let $g(z)$ be a continuously differentiable $\mathbb{R}^d\to\mathbb{R}$ function such that for each $1\le j\le d$

$$\lim_{|z_j|\to\infty}g(z)\,\varphi_j(z_j)=0 \quad\text{and}\quad \int_{\mathbb{R}^d}|g'_j(z)|\,\Phi(z)\,dz<\infty, \tag{A.3}$$

where

$$g'_j(z)=\frac{\partial g(z)}{\partial z_j}.$$

Let now $\mathcal{X}_n=C[0,n]$ and $\mathcal{B}(\mathcal{X}_n)$ be the $\sigma$-field generated by cylindric sets in $\mathcal{X}_n$. For any $\mathcal{B}(\mathcal{X}_n)\otimes\mathcal{B}(\mathbb{R}^d)$-measurable integrable function $\xi=\xi(x,z)$ we denote

$$\widetilde E\,\xi=\int_{\mathbb{R}^d}\int_{\mathcal{X}_n}\xi(x,z)\,\mu_z(dx)\,\Phi(z)\,dz,$$

where $\mu_z$ is the distribution of the process (A.1) in $\mathcal{X}_n$. Let now $\nu=\mu_0$ be the distribution of the process $(\sqrt{\sigma^*}\,w_t)_{0\le t\le n}$ in $\mathcal{X}_n$. It is clear (see, for example, [11]) that $\mu_z\ll\nu$ for any $z\in\mathbb{R}^d$. Therefore, we can use the measure $\nu$ as a dominating measure, i.e. for the observations (A.1) in $\mathcal{X}_n$ we use the following likelihood function

$$f(y,z)=\frac{d\mu_z}{d\nu}=\exp\Big\{\int_0^n\frac{S(t,z)}{\sigma^*}\,dy_t-\int_0^n\frac{S^2(t,z)}{2\sigma^*}\,dt\Big\}. \tag{A.4}$$

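Proposition A.2. For any square integrable function $\widehat g_n$ measurable with respect to $\sigma\{y_t,\ 0\le t\le n\}$ and for any $1\le j\le d$ the following inequality holds

$$\widetilde E\big(\widehat g_n-g(z)\big)^2\ \ge\ \frac{\sigma^*\,B_j^2}{\int_0^nS_j^2(t)\,dt+\sigma^*I_j}, \tag{A.5}$$

where

$$B_j=\int_{\mathbb{R}^d}g'_j(z)\,\Phi(z)\,dz \quad\text{and}\quad I_j=\int_{\mathbb{R}}\frac{(\varphi'_j(z))^2}{\varphi_j(z)}\,dz.$$

Proof. First of all note that the density (A.4) is bounded with respect to $z_j\in\mathbb{R}$ for any $1\le j\le d$, i.e. for any $y=(y_t)_{0\le t\le n}\in\mathcal{X}_n$

$$\limsup_{|z_j|\to\infty}f(y,z)<\infty.$$

Therefore, putting

$$\Psi_j=\Psi_j(y,z)=\frac{\partial}{\partial z_j}\ln\big(f(y,z)\,\Phi(z)\big)$$

and taking into account condition (A.3), by integration by parts one gets

$$\widetilde E\big((\widehat g_n-g(z))\,\Psi_j\big)=\int_{\mathcal{X}_n\times\mathbb{R}^d}\big(\widehat g_n(y)-g(z)\big)\,\frac{\partial}{\partial z_j}\big(f(y,z)\,\Phi(z)\big)\,dz\,d\nu(y)=\int_{\mathcal{X}_n\times\mathbb{R}^d}g'_j(z)\,f(y,z)\,\Phi(z)\,dz\,d\nu(y)=B_j.$$

Now by the Bunyakovskii-Cauchy-Schwarz inequality we obtain the following lower bound for the quadratic risk

$$\widetilde E\big(\widehat g_n-g(z)\big)^2\ \ge\ \frac{B_j^2}{\widetilde E\,\Psi_j^2}.$$

Note that from (A.4) it is easy to deduce that under the distribution $\mu_z$

$$\frac{\partial}{\partial z_j}\ln f(y,z)=\int_0^n\frac{S_j(t)}{\sigma^*}\,dy_t-\int_0^n\frac{S(t,z)\,S_j(t)}{\sigma^*}\,dt=\frac{1}{\sqrt{\sigma^*}}\int_0^nS_j(t)\,dw_t.$$

This implies directly

$$E_z\,\frac{\partial}{\partial z_j}\ln f(y,z)=0$$

and

$$E_z\Big(\frac{\partial}{\partial z_j}\ln f(y,z)\Big)^2=\frac{1}{\sigma^*}\int_0^nS_j^2(t)\,dt.$$

Therefore,

$$\widetilde E\,\Psi_j^2=\frac{1}{\sigma^*}\int_0^nS_j^2(t)\,dt+I_j.$$

Hence Proposition A.2. □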
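As a sanity check of (A.5), consider the scalar case $d=1$, $g(z)=z$ with a Gaussian prior $N(0,t^2)$, for which $B_1=1$ and $I_1=1/t^2$. In this conjugate setting the Bayes risk is the posterior variance, and the sketch below (with illustrative function names) confirms that it coincides with the right-hand side of (A.5), so the bound is attained.

```python
def van_trees_bound(sigma, b, energy, fisher_prior):
    """Right-hand side of (A.5): sigma * B_j^2 / (int_0^n S_j^2 dt + sigma * I_j)."""
    return sigma * b * b / (energy + sigma * fisher_prior)

def gaussian_posterior_variance(sigma, energy, t):
    """Posterior variance of z ~ N(0, t^2) observed through dy = z S dt + sqrt(sigma) dw.

    `energy` is int_0^n S^2(t) dt; the data contribute precision energy / sigma.
    """
    return 1.0 / (energy / sigma + 1.0 / (t * t))

sigma, energy, t = 2.0, 5.0, 0.7
print(van_trees_bound(sigma, 1.0, energy, 1.0 / t**2),
      gaussian_posterior_variance(sigma, energy, t))
```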

8. Acknowledgments

This research was carried out within the framework of the State Contract 02.740.11.5026.

REFERENCES

1. Konev, V.V. and Pergamenshchikov, S.M. Nonparametric estimation in a semimartingale regression model. Part 1. Oracle Inequalities, Vestnik TGU. Matematika i mehanika, No. 3(7), 23 - 41 (2009).

2. Galtchouk, L. and Pergamenshchikov, S. Adaptive asymptotically efficient estimation in heteroscedastic nonparametric regression, J. Korean Statist. Soc., http://ees.elsivier.com/jkss (2009)

3. Galtchouk, L. and Pergamenshchikov, S. Asymptotically efficient estimates for non parametric regression models, Statistics and Probability Letters, 76, No. 8, 852 - 860 (2006).

4. Brua, J. Asymptotically efficient estimators for nonparametric heteroscedastic regression models, Stat. Methodol., 6(1), 47 - 60 (2009).

5. Konev, V.V. and Pergamenshchikov, S.M. General model selection estimation of a periodic regression with a Gaussian noise, Annals of the Institute of Statistical Mathematics, http://dx.doi.org/10.1007/s10463-008-0193-1 (2008)

6. Galtchouk, L. and Pergamenshchikov, S. Adaptive asymptotically efficient estimation in heteroscedastic nonparametric regression via model selection, http://hal.archives-ouvertes.fr/ hal-00326910/fr/ (2009)

7. Galtchouk, L. and Pergamenshchikov, S. Sharp non-asymptotic oracle inequalities for non-parametric heteroscedastic regression models, J. Nonparametric Statist., 21, No. 1, 1 - 16 (2009).

8. Pinsker, M.S. Optimal filtration of square integrable signals in Gaussian white noise, Problems Transmis. Information, 17, 120 - 133 (1981).

9. Nussbaum, M. Spline smoothing in regression models and asymptotic efficiency in L2, Ann. Statist, 13, 984 - 997 (1985).

10. Gill, R.D. and Levit, B.Y. Application of the van Trees inequality: a Bayesian Cramer-Rao bound, Bernoulli, 1, 59 - 79 (1995)

11. Liptser, R. Sh. and Shiryaev, A.N. Statistics of Random Processes. I. General theory. NY: Springer (1977).

12. Fourdrinier, D. and Pergamenshchikov, S. Improved selection model method for the regression with dependent noise, Annals of the Institute of Statistical Mathematics, 59(3), 435 - 464 (2007).

13. Galtchouk, L. and Pergamenshchikov, S. Nonparametric sequential estimation of the drift in diffusion processes, Math. Meth. Statist., 13, No. 1, 25 - 49 (2004).

ABOUT THE AUTHORS:

Konev Victor, Department of Applied Mathematics and Cybernetics, Tomsk State University, Lenin str. 36, 634050 Tomsk, Russia, e-mail: [email protected]

Pergamenshchikov Serguei, Laboratoire de Mathématiques Raphael Salem, Avenue de l'Université, BP. 12, Université de Rouen, F76801, Saint Etienne du Rouvray, Cedex France and Department of Mathematics and Mechanics, Tomsk State University, Lenin str. 36, 634041 Tomsk, Russia, e-mail: [email protected]

Article accepted for publication 16.11.2009.
