Improved model selection method for an adaptive estimation in semimartingale regression models

Pchelintsev Evgeny A.; Pergamenshchikov Serguei M.

Tomsk State University Journal of Mathematics and Mechanics (Vestnik Tomskogo gosudarstvennogo universiteta. Matematika i mekhanika), 2019, No. 58

UDC 519.2 MSC 62G08; 62G05

DOI 10.17223/19988621/58/2

E.A. Pchelintsev, S.M. Pergamenshchikov

IMPROVED MODEL SELECTION METHOD FOR AN ADAPTIVE ESTIMATION IN SEMIMARTINGALE REGRESSION MODELS¹

This paper considers the problem of robust adaptive efficient estimation of a periodic function in a continuous-time regression model with dependent noise given by a general square integrable semimartingale with a conditionally Gaussian distribution. An example of such noise is the non-Gaussian Ornstein-Uhlenbeck-Lévy process. An adaptive model selection procedure based on improved weighted least squares estimates is proposed. Under some conditions on the noise distribution, a sharp oracle inequality for the robust risk is proved and the robust efficiency of the model selection procedure is established. Results of numerical simulations are given.

Key words: improved non-asymptotic estimation, least squares estimates, robust quadratic risk, non-parametric regression, semimartingale noise, Ornstein-Uhlenbeck-Lévy process, model selection, sharp oracle inequality, asymptotic efficiency.

1. Introduction

Consider a regression model in continuous time

$$ dy_t = S(t)\,dt + d\xi_t, \quad 0 \le t \le n, \tag{1.1} $$

where $S$ is an unknown 1-periodic $\mathbb{R} \to \mathbb{R}$ function, $S \in L_2[0,1]$, and $(\xi_t)_{0 \le t \le n}$ is an unobservable noise which is a square integrable semimartingale with values in the Skorokhod space $D[0,n]$ such that, for any function $f$ from $L_2[0,1]$, the stochastic integral

$$ I_n(f) = \int_0^n f(s)\,d\xi_s \tag{1.2} $$

has the following properties:

$$ E_Q I_n(f) = 0 \quad\text{and}\quad E_Q I_n^2(f) \le \kappa_Q \int_0^n f^2(s)\,ds. \tag{1.3} $$

Here $E_Q$ denotes the expectation with respect to the distribution $Q$ of the noise process $(\xi_t)_{0 \le t \le n}$ on the space $D[0,n]$, and $\kappa_Q > 0$ is some positive constant depending on the distribution $Q$. The noise distribution $Q$ is unknown and assumed to belong to some probability family $\mathcal{Q}_n$ specified below. Note that semimartingale regression models in continuous time were introduced by Konev and Pergamenshchikov in [8, 9] for signal estimation problems. It should also be noted that the class of noise processes $(\xi_t)_{t \ge 0}$ satisfying conditions (1.3) is rather wide and comprises, in particular, the Lévy processes which are used in different applied problems (see [2] for details). Moreover, as is shown in Section 2, non-Gaussian Ornstein-Uhlenbeck-based models enter this class.

¹ This work is supported by RSF, Grant No. 17-11-01049.

The problem is to estimate the unknown function $S$ in the model (1.1) on the basis of observations $(y_t)_{0 \le t \le n}$. In this paper we use the quadratic risk, i.e. for any estimate $\hat S_n$ we set

$$ \mathcal{R}_Q(\hat S_n, S) := E_{Q,S}\,\|\hat S_n - S\|^2 \quad\text{and}\quad \|S\|^2 = \int_0^1 S^2(t)\,dt, \tag{1.4} $$

where $E_{Q,S}$ stands for the expectation with respect to the distribution $P_{Q,S}$ of the process in (1.1) with a fixed distribution $Q$ of the noise $(\xi_t)_{0 \le t \le n}$ and a given function $S$. Moreover, in the case when the distribution $Q$ is unknown we also use the robust risk

$$ \mathcal{R}^*(\hat S_n, S) = \sup_{Q \in \mathcal{Q}_n} \mathcal{R}_Q(\hat S_n, S). \tag{1.5} $$

The goal of this paper is to develop an adaptive robust efficient model selection method for the regression (1.1) with dependent noise having a conditionally Gaussian distribution, using the improved estimation approach. This paper proposes shrinkage least squares estimates which enable us to improve the non-asymptotic estimation accuracy. This idea was first proposed by Fourdrinier and Pergamenshchikov in [4] for regression models in discrete time and by Konev and Pergamenshchikov in [10] for Gaussian regression models in continuous time. We develop these methods for general semimartingale regression models in continuous time. It should be noted that for conditionally Gaussian regression models we cannot use the well-known improved estimators proposed in [7] for Gaussian or spherically symmetric observations. To apply the improved estimation methods to non-Gaussian regression models in continuous time one needs to use the modifications of the well-known James-Stein estimators proposed in [13, 14] for parametric estimation problems and developed in [16, 18]. We develop new analytical tools which allow one to obtain sharp non-asymptotic oracle inequalities for robust risks under general conditions on the distribution of the noise in the model (1.1). This method enables us to treat the cases of dependent and independent observations from the same standpoint; it does not assume knowledge of the noise distribution and leads to an efficient estimation procedure with respect to the risk (1.5). The validity of the conditions imposed on the noise in equation (1.1) is verified for a non-Gaussian Ornstein-Uhlenbeck process.

The rest of the paper is organized as follows. In the next Section 2, we describe the Ornstein-Uhlenbeck process as the example of a semimartingale noise in the model (1.1). In Section 3 we construct the shrinkage weighted least squares estimates and study the improvement effect. In Section 4 we construct the model selection procedure on the basis of improved weighted least squares estimates and state the main results in the form of oracle inequalities for the quadratic risk (1.4) and the robust risk (1.5). In Section 5 it is shown that the proposed model selection procedure for estimating S in (1.1) is asymptotically efficient with respect to the robust risk (1.5). In Section 6 we illustrate the performance of the proposed model selection procedure through numerical simulations. Section 7 gives the proofs of the main results.

2. Ornstein-Uhlenbeck-Lévy process

Now we consider the noise process $(\xi_t)_{t \ge 0}$ in (1.1) defined by a non-Gaussian Ornstein-Uhlenbeck process with the Lévy subordinator. Such processes are used in financial Black-Scholes type markets with jumps (see, for example, [1] and the references therein). Let the noise process in (1.1) obey the equations

$$ d\xi_t = a\,\xi_t\,dt + du_t, \quad \xi_0 = 0, \tag{2.1} $$

$$ u_t = \varrho_1 w_t + \varrho_2 z_t \quad\text{and}\quad z_t = x * (\mu - \tilde\mu)_t, \tag{2.2} $$

where $(w_t)_{t \ge 0}$ is a standard Brownian motion, $\mu(ds\,dx)$ is a jump measure with deterministic compensator $\tilde\mu(ds\,dx) = ds\,\Pi(dx)$, and $\Pi(\cdot)$ is a Lévy measure, i.e. some positive measure on $\mathbb{R}_* = \mathbb{R} \setminus \{0\}$ (see, for example, [3]), such that

$$ \Pi(x^2) = 1 \quad\text{and}\quad \Pi(x^8) < \infty. \tag{2.3} $$

We use the notation $\Pi(|x|^m) = \int_{\mathbb{R}_*} |y|^m\,\Pi(dy)$. Note that the Lévy measure $\Pi(\mathbb{R}_*)$ could be equal to $+\infty$. We use $*$ for the stochastic integrals with respect to random measures, i.e.

$$ x * (\mu - \tilde\mu)_t = \int_0^t \int_{\mathbb{R}_*} y\,(\mu - \tilde\mu)(ds, dy). $$

Moreover, we assume that the nuisance parameters $a \le 0$, $\varrho_1$ and $\varrho_2$ satisfy the conditions

$$ -a_{\max} \le a \le 0, \quad 0 < \underline{\varrho} \le \varrho_1^2 \quad\text{and}\quad \sigma_Q = \varrho_1^2 + \varrho_2^2 \le \varsigma^*, \tag{2.4} $$

where the bounds $a_{\max}$, $\underline{\varrho}$ and $\varsigma^*$ are functions of $n$, i.e. $a_{\max} = a_{\max}(n)$, $\underline{\varrho} = \underline{\varrho}_n$ and $\varsigma^* = \varsigma^*_n$, such that for any $\delta > 0$

$$ \lim_{n \to \infty} n^{-\delta} a_{\max}(n) = 0, \quad \liminf_{n \to \infty} n^{\delta} \underline{\varrho}_n > 0 \quad\text{and}\quad \lim_{n \to \infty} n^{-\delta} \varsigma^*_n = 0. \tag{2.5} $$

We denote by $\mathcal{Q}_n$ the family of all distributions of the process (1.1) - (2.1) on the Skorokhod space $D[0,n]$ satisfying the conditions (2.4) and (2.5). It should be noted that, in view of Corollary 7.2 in [17], the condition (1.3) for the process (2.1) holds with $\kappa_Q = 2\sigma_Q$.

Note also that the process (2.1) is a conditionally Gaussian square integrable semimartingale with respect to the $\sigma$-algebra $\mathcal{G} = \sigma\{z_t,\ t \ge 0\}$ which is generated by the jump process $(z_t)_{t \ge 0}$ defined in (2.2).
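To make the noise model (2.1) - (2.2) concrete, the following is a minimal simulation sketch, not the authors' code. It assumes an Euler discretization, a compound Poisson jump part with standard Gaussian jump sizes (the choice used later in Section 6, for which $\Pi(x^2) = 1$ when the intensity is 1), and at most one jump per time step. The function name `simulate_ou_levy` and all of its parameters are hypothetical.

```python
import math
import random


def simulate_ou_levy(n, steps_per_unit=100, a=-1.0, rho1=0.5, rho2=0.5,
                     jump_rate=1.0, seed=0):
    """Euler sketch of d xi = a*xi dt + rho1 dw + rho2 dz on [0, n], xi_0 = 0.

    Assumption: z is a compound Poisson process with N(0, 1) jumps, and the
    time step is small enough that at most one jump occurs per step.
    """
    rng = random.Random(seed)
    dt = 1.0 / steps_per_unit
    path = [0.0]
    for _ in range(n * steps_per_unit):
        dw = rng.gauss(0.0, math.sqrt(dt))          # Brownian increment
        jump = rng.random() < jump_rate * dt        # Bernoulli approximation of a Poisson jump
        dz = rng.gauss(0.0, 1.0) if jump else 0.0   # centered jump, so no compensator term
        xi = path[-1]
        path.append(xi + a * xi * dt + rho1 * dw + rho2 * dz)
    return path
```

Because the jump sizes are centered, the compensated integral in (2.2) coincides with the raw compound Poisson sum in this special case; for a general Lévy measure the compensator would have to be handled explicitly.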

3. Improved estimation

For estimating the unknown function $S$ in (1.1) we will consider its Fourier expansion. Let $(\phi_j)_{j \ge 1}$ be an orthonormal basis in $L_2[0,1]$. We extend these functions periodically on $\mathbb{R}$, i.e. $\phi_j(t) = \phi_j(t+1)$ for any $t \in \mathbb{R}$.

B1) Assume that the basis functions are uniformly bounded, i.e. for some constant $\phi^* \ge 1$, which may depend on $n$,

$$ \sup_{1 \le j \le n}\ \sup_{0 \le t \le 1} |\phi_j(t)| \le \phi^* < \infty. \tag{3.1} $$

B2) Assume that there exist some $d_0 \ge 7$ and $\check{a} \ge 1$ such that

$$ \sup_{d \ge d_0} \frac{1}{d} \int_0^1 \Phi^*_d(v)\,dv \le \check{a}, \qquad \Phi^*_d(v) = \max_{t \ge v} \Big| \sum_{j=1}^{d} \phi_j(t)\,\phi_j(t - v) \Big|. \tag{3.2} $$

For example, we can take the trigonometric basis defined as $\mathrm{Tr}_1 \equiv 1$, $\mathrm{Tr}_j(t) = \sqrt{2}\cos(\varpi_j t)$ for even $j$ and $\mathrm{Tr}_j(t) = \sqrt{2}\sin(\varpi_j t)$ for odd $j \ge 3$, where the frequency $\varpi_j = 2\pi[j/2]$ and $[x]$ denotes the integer part of $x$. As is shown in Lemma A1 in [17], these functions satisfy the condition B2) with $d_0 = \inf\{d \ge 7 : 5 + \ln d \le d\}$ and

$$ \check{a} = (1 - e^{-a_{\max}})/(4 a_{\max}). $$
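The trigonometric basis described above can be written down and checked numerically. The sketch below is illustrative only: the function names `trig_basis` and `inner_product` are hypothetical, and the orthonormality check uses a midpoint rule on $[0,1]$ rather than exact integration.

```python
import math


def trig_basis(j, t):
    """Trigonometric basis Tr_j(t): Tr_1 = 1, sqrt(2)*cos for even j, sqrt(2)*sin for odd j >= 3."""
    if j == 1:
        return 1.0
    freq = 2.0 * math.pi * (j // 2)   # frequency 2*pi*[j/2]
    if j % 2 == 0:
        return math.sqrt(2.0) * math.cos(freq * t)
    return math.sqrt(2.0) * math.sin(freq * t)


def inner_product(i, j, grid=2000):
    """Midpoint-rule approximation of the L2[0,1] inner product (Tr_i, Tr_j)."""
    h = 1.0 / grid
    return h * sum(trig_basis(i, (k + 0.5) * h) * trig_basis(j, (k + 0.5) * h)
                   for k in range(grid))
```

For low indices the Gram matrix computed this way is the identity up to rounding, which is the orthonormality assumed throughout Section 3.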

We write the Fourier expansion of the unknown function $S$ in the form

$$ S(t) = \sum_{j=1}^{\infty} \theta_j\,\phi_j(t), $$

where the corresponding Fourier coefficients

$$ \theta_j = (S, \phi_j) = \int_0^1 S(t)\,\phi_j(t)\,dt \tag{3.3} $$

can be estimated as

$$ \hat\theta_{j,n} = \frac{1}{n} \int_0^n \phi_j(t)\,dy_t. \tag{3.4} $$
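As an illustration of the estimate (3.4), the sketch below replaces the stochastic integral by a left-point Riemann sum over discretized increments of $y$. This discretization, the trigonometric basis, and the names `estimate_coefficient` and `noise_free_increments` are assumptions made for the example; the paper itself works with continuous observations.

```python
import math


def trig_basis(j, t):
    """Trigonometric basis on [0, 1], extended periodically."""
    if j == 1:
        return 1.0
    freq = 2.0 * math.pi * (j // 2)
    return math.sqrt(2.0) * (math.cos(freq * t) if j % 2 == 0 else math.sin(freq * t))


def estimate_coefficient(j, dy, dt):
    """Riemann-sum version of (1/n) * int_0^n phi_j(t) dy_t, where n = len(dy) * dt."""
    n = len(dy) * dt
    return sum(trig_basis(j, k * dt) * dy[k] for k in range(len(dy))) / n


def noise_free_increments(S, n, steps_per_unit):
    """Increments dy = S(t) dt of the model without noise (for sanity checks only)."""
    dt = 1.0 / steps_per_unit
    return [S(k * dt) * dt for k in range(n * steps_per_unit)], dt
```

In the noise-free case the estimate recovers the true coefficient: for $S = 1 + 0.5\,\mathrm{Tr}_2$ the first three estimates are close to 1, 0.5 and 0.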

Here we replace the differential $S(t)\,dt$ by the observed stochastic differential $dy_t$. In view of (1.1), one obtains

$$ \hat\theta_{j,n} = \theta_j + \frac{1}{\sqrt{n}}\,\xi_{j,n}, \qquad \xi_{j,n} = \frac{1}{\sqrt{n}}\,I_n(\phi_j), \tag{3.5} $$

where $I_n(\phi_j)$ is given in (1.2). As in [11], we define a class of weighted least squares estimates for $S(t)$ as

$$ \hat S_\gamma = \sum_{j=1}^{n} \gamma(j)\,\hat\theta_{j,n}\,\phi_j \tag{3.6} $$

with the weights $\gamma = (\gamma(j))_{1 \le j \le n} \in \mathbb{R}^n$ which belong to some finite set $\Gamma$ from $[0,1]^n$. We put

$$ \nu = \mathrm{card}(\Gamma) \quad\text{and}\quad |\Gamma|_* = \max_{\gamma \in \Gamma} \sum_{j=1}^{n} \gamma(j), \tag{3.7} $$

where $\mathrm{card}(\Gamma)$ is the number of vectors $\gamma$ in $\Gamma$. In the sequel we assume that all vectors from $\Gamma$ satisfy the following condition.

D1) Assume that for any vector $\gamma \in \Gamma$ there exists some fixed integer $d = d(\gamma)$ such that the first $d$ components of the vector are equal to one, i.e. $\gamma(j) = 1$ for $1 \le j \le d$ for any $\gamma \in \Gamma$.

D2) There exists $n_0 \ge 1$ such that for any $n \ge n_0$ there exists a $\sigma$-field $\mathcal{G}_n$ for which the random vector $\tilde\xi_{d,n} = (\xi_{j,n})_{1 \le j \le d}$ is $\mathcal{G}_n$-conditionally Gaussian in $\mathbb{R}^d$ with the covariance matrix

$$ G_n = \big( E(\xi_{i,n}\,\xi_{j,n} \mid \mathcal{G}_n) \big)_{1 \le i,j \le d} \tag{3.8} $$

and for some nonrandom constant $l^*_n > 0$

$$ \inf_{Q \in \mathcal{Q}_n} \big( \mathrm{tr}\,G_n - \lambda_{\max}(G_n) \big) \ge l^*_n \quad \text{a.s.}, \tag{3.9} $$

where $\lambda_{\max}(A)$ is the maximal eigenvalue of the matrix $A$.

As is shown in Proposition 7.11 in [17], the condition D2) holds for the non-Gaussian Ornstein-Uhlenbeck-based model (1.1) - (2.1) with $l^*_n = \underline{\varrho}_n (d - 6)/2$ and $d \ge d_0$.

Further we will use the improved estimation method proposed for parametric models in [14] for the first $d$ Fourier coefficients in (3.5). To this end we set $\tilde\theta_n = (\hat\theta_{j,n})_{1 \le j \le d}$. In the sequel we will use the norm $|x|_d^2 = \sum_{j=1}^{d} x_j^2$ for any vector $x = (x_j)_{1 \le j \le d}$ from $\mathbb{R}^d$.

Now we define the shrinkage estimators as

$$ \theta^*_{j,n} = (1 - g(j))\,\hat\theta_{j,n}, \tag{3.10} $$

with $g(j) = (c_n / |\tilde\theta_n|_d)\,\mathbf{1}_{\{1 \le j \le d\}}$; here $\mathbf{1}_A$ is the indicator of the set $A$,

$$ c_n = \frac{l^*_n}{\big( r^*_n + \sqrt{d\,\kappa^*/n} \big)\,n}, \qquad \kappa^* = \sup_{Q \in \mathcal{Q}_n} \kappa_Q. $$

The positive parameter $r^*_n$ is such that

$$ \lim_{n \to \infty} r^*_n = \infty \quad\text{and}\quad \lim_{n \to \infty} n^{-\delta} r^*_n = 0 \tag{3.11} $$

for any $\delta > 0$.

Now we introduce a new class of shrinkage weighted least squares estimates for $S$ as

$$ S^*_\gamma = \sum_{j=1}^{n} \gamma(j)\,\theta^*_{j,n}\,\phi_j. \tag{3.12} $$

Let $\Delta_Q(S) := \mathcal{R}_Q(S^*_\gamma, S) - \mathcal{R}_Q(\hat S_\gamma, S)$ denote the difference of quadratic risks of the estimates (3.12) and (3.6).

Theorem 3.1. Assume that the conditions D1) - D2) hold. Then for any $n \ge n_0$

$$ \sup_{Q \in \mathcal{Q}_n}\ \sup_{\|S\| \le r^*_n} \Delta_Q(S) \le -c_n^2. \tag{3.13} $$

Remark 3.1. The inequality (3.13) shows that for any $n \ge n_0$ the estimate (3.12) non-asymptotically outperforms the estimate (3.6) in mean square accuracy.
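The shrinkage step (3.10) acts only on the first $d$ estimated coefficients and reduces their Euclidean norm by $c_n$. A minimal sketch follows; the function name `shrink` is hypothetical, the constant `c_n` is passed in rather than computed from $l^*_n$, $r^*_n$ and $\kappa^*$, and the guard for a zero norm is an added assumption, not part of the paper.

```python
import math


def shrink(theta_hat, d, c_n):
    """Apply theta*_j = (1 - c_n / |theta_tilde|_d) * theta_hat_j for j <= d; leave j > d unchanged."""
    norm_d = math.sqrt(sum(x * x for x in theta_hat[:d]))
    if norm_d == 0.0:
        # Assumption for this sketch: nothing to shrink when the first d estimates vanish.
        return list(theta_hat)
    factor = 1.0 - c_n / norm_d
    return [factor * x if j < d else x for j, x in enumerate(theta_hat)]
```

For example, with `theta_hat = [3.0, 4.0, 1.0]`, `d = 2` and `c_n = 1.0` the norm of the first two components drops from 5 to 4, while the third component is untouched.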

4. Model selection

This Section gives the construction of a model selection procedure for estimating a function S in (1.1) on the basis of improved weighted least square estimates and states the sharp oracle inequality for the robust risk of proposed procedure.

The model selection procedure for the unknown function $S$ in (1.1) will be constructed on the basis of a family of estimates $(S^*_\gamma)_{\gamma \in \Gamma}$.

The performance of any estimate $S^*_\gamma$ will be measured by the empirical squared error

$$ \mathrm{Err}_n(\gamma) = \|S^*_\gamma - S\|^2. $$

In order to obtain a good estimate, we have to formulate a rule for choosing a weight vector $\gamma \in \Gamma$ in (3.12). Obviously, the best way is to minimize the empirical squared error with respect to $\gamma$. Using the definition (3.12) of the estimate and the Fourier expansion of $S$, we obtain

$$ \mathrm{Err}_n(\gamma) = \sum_{j=1}^{n} \gamma^2(j)\,(\theta^*_{j,n})^2 - 2 \sum_{j=1}^{n} \gamma(j)\,\theta^*_{j,n}\,\theta_j + \sum_{j=1}^{\infty} \theta_j^2. \tag{4.1} $$

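Since the Fourier coefficients $(\theta_j)_{j \ge 1}$ are unknown, the weight coefficients $(\gamma_j)_{j \ge 1}$ cannot be found by minimizing this quantity. To circumvent this difficulty one needs to replace the terms $\theta^*_{j,n}\theta_j$ by their estimators $\bar\theta_{j,n}$. We set

$$ \bar\theta_{j,n} = \theta^*_{j,n}\,\hat\theta_{j,n} - \frac{\hat\sigma_n}{n}, \tag{4.2} $$

where $\hat\sigma_n$ is the estimate for the noise variance $\sigma_Q = E_Q\,\xi^2_{j,n}$, which we choose in the following form:

$$ \hat\sigma_n = \sum_{j=[\sqrt{n}]+1}^{n} \hat t^{\,2}_{j,n} \quad\text{and}\quad \hat t_{j,n} = \frac{1}{n} \int_0^n \phi_j(t)\,dy_t. \tag{4.3} $$

For this change in the empirical squared error, one has to pay some penalty. Thus, one comes to the cost function of the form

$$ J_n(\gamma) = \sum_{j=1}^{n} \gamma^2(j)\,(\theta^*_{j,n})^2 - 2 \sum_{j=1}^{n} \gamma(j)\,\bar\theta_{j,n} + \rho\,\hat P_n(\gamma), \tag{4.4} $$

where $\rho$ is some positive constant and $\hat P_n(\gamma)$ is the penalty term defined as

$$ \hat P_n(\gamma) = \frac{\hat\sigma_n\,|\gamma|_n^2}{n}, \qquad |\gamma|_n^2 = \sum_{j=1}^{n} \gamma^2(j). \tag{4.5} $$

Substituting the weight coefficients minimizing the cost function,

$$ \gamma^* = \mathrm{argmin}_{\gamma \in \Gamma}\,J_n(\gamma), \tag{4.6} $$

in (3.12) leads to the improved model selection procedure

$$ S^* = S^*_{\gamma^*}. \tag{4.7} $$

Note that $\gamma^*$ exists because $\Gamma$ is a finite set. If the minimizer $\gamma^*$ in (4.6) is not unique, one can take any minimizer.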
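The selection rule (4.4) - (4.6) is a finite minimization, which the following sketch spells out. It is an illustration under stated assumptions: the function names `cost` and `select_weights` are hypothetical, the penalty is taken as $\hat\sigma_n \sum_j \gamma^2(j)/n$, and the estimates `theta_star`, `theta_hat` and `sigma_hat` are supplied by the caller.

```python
def cost(gamma, theta_star, theta_hat, sigma_hat, n, rho):
    """Cost J_n(gamma): quadratic term - 2 * estimated cross term + rho * penalty."""
    quad = sum(g * g * ts * ts for g, ts in zip(gamma, theta_star))
    # Estimator of theta*_j * theta_j: theta*_j * theta_hat_j - sigma_hat / n
    cross = sum(g * (ts * th - sigma_hat / n)
                for g, ts, th in zip(gamma, theta_star, theta_hat))
    penalty = sigma_hat * sum(g * g for g in gamma) / n
    return quad - 2.0 * cross + rho * penalty


def select_weights(weight_set, theta_star, theta_hat, sigma_hat, n, rho):
    """Return a minimizer of the cost over a finite set of weight vectors (first one on ties)."""
    return min(weight_set,
               key=lambda g: cost(g, theta_star, theta_hat, sigma_hat, n, rho))
```

In a toy case with `theta_star = theta_hat = [1.0, 0.0]`, `sigma_hat = 1.0`, `n = 100` and `rho = 0.25`, the vector `(1.0, 0.0)` has cost -0.9775 and is preferred over `(1.0, 1.0)` with cost -0.955, since keeping a pure-noise coefficient only adds variance and penalty.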

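To prove the sharp oracle inequality, the following conditions will be needed for the family $\mathcal{Q}_n$ of distributions of the noise $(\xi_t)_{t \ge 0}$ in (1.1). Namely, we need to impose some stability conditions for the noise Fourier transform sequence $(\xi_{j,n})_{1 \le j \le n}$ introduced in [15].

C1) There exists a proxy variance $\sigma_Q > 0$ such that for any $\varepsilon > 0$

$$ \lim_{n \to \infty} \frac{L_{1,n}(Q)}{n^{\varepsilon}} = 0, \qquad L_{1,n}(Q) = \sum_{j=1}^{n} \big| E_Q\,\xi^2_{j,n} - \sigma_Q \big|. \tag{4.8} $$

C2) Assume that for any $\varepsilon > 0$

$$ \lim_{n \to \infty} \frac{L_{2,n}(Q)}{n^{\varepsilon}} = 0, \qquad L_{2,n}(Q) = \sup_{|x| \le 1} E_Q \Big( \sum_{j=1}^{n} x_j \big( \xi^2_{j,n} - E_Q\,\xi^2_{j,n} \big) \Big)^2. \tag{4.9} $$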

V j=1

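Theorem 4.1. If the conditions C1) and C2) hold for the distribution $Q$ of the process $(\xi_t)_{t \ge 0}$ in (1.1), then, for any $n \ge 1$ and $0 < \rho < 1/2$, the risk (1.4) of the estimate (4.7) for $S$ satisfies the oracle inequality

$$ \mathcal{R}_Q(S^*, S) \le \frac{1 + 5\rho}{1 - \rho}\,\min_{\gamma \in \Gamma} \mathcal{R}_Q(S^*_\gamma, S) + \frac{B_n(Q)}{\rho\,n}, \tag{4.10} $$

where $B_n(Q) = U_n(Q)\,\big( 1 + |\Gamma|_*\,E_Q |\hat\sigma_n - \sigma_Q| \big)$ and the coefficient $U_n(Q)$ is such that for any $\varepsilon > 0$

$$ \lim_{n \to \infty} \frac{U_n(Q)}{n^{\varepsilon}} = 0. \tag{4.11} $$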

In the case when the value of $\sigma_Q$ in C1) is known, one can take $\hat\sigma_n = \sigma_Q$. Then

$$ P_n(\gamma) = \frac{\sigma_Q\,|\gamma|_n^2}{n}, \tag{4.12} $$

and we can rewrite the oracle inequality (4.10) with $B_n(Q) = U_n(Q)$. Now we study the estimate (4.3). To obtain the oracle inequality for the robust risk (1.5) we need some additional condition on the distribution family $\mathcal{Q}_n$. We set

$$ \varsigma^* = \varsigma^*_n = \sup_{Q \in \mathcal{Q}_n} \sigma_Q. \tag{4.13} $$

C1*) Assume that the limit equations (4.8) - (4.9) hold uniformly in $Q \in \mathcal{Q}_n$ and $\varsigma^*_n / n^{\varepsilon} \to 0$ as $n \to \infty$ for any $\varepsilon > 0$.

Now we impose some conditions on the set of the weight coefficients $\Gamma$.

C2*) Assume that the set $\Gamma$ is such that $\nu / n^{\varepsilon} \to 0$ and $|\Gamma|_* / n^{1/2+\varepsilon} \to 0$ as $n \to \infty$ for any $\varepsilon > 0$.

As is shown in [17], both conditions C1*) and C2*) hold for the model (1.1) with the Ornstein-Uhlenbeck noise process (2.1). Using Proposition 4.2 from [17] we can obtain the following result.

Theorem 4.2. Assume that the conditions C1*) and C2*) hold and the function $S(t)$ is continuously differentiable. Then the robust risk (1.5) of the estimate (4.7) satisfies, for any $n \ge 2$ and $0 < \rho < 1/2$, the oracle inequality

$$ \mathcal{R}^*(S^*, S) \le \frac{1 + 5\rho}{1 - \rho}\,\min_{\gamma \in \Gamma} \mathcal{R}^*(S^*_\gamma, S) + \frac{1}{\rho\,n}\,B^*_n\,\big( 1 + \|\dot S\|^2 \big), $$

where the term $B^*_n$ has the property (4.11).

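Now we specify the weight coefficients $(\gamma(j))_{j \ge 1}$ as proposed in [5, 6] for a heteroscedastic regression model in discrete time. Firstly, we define the normalizing coefficient $v_n = n / \varsigma^*$. Consider a numerical grid of the form

$$ \mathcal{A}_n = \{1, \dots, k^*\} \times \{r_1, \dots, r_m\}, $$

where $r_i = i\varepsilon$ and $m = [1/\varepsilon^2]$. We assume that the parameters $k^* \ge 1$ and $0 < \varepsilon \le 1$ are functions of $n$, i.e. $k^* = k^*(n)$ and $\varepsilon = \varepsilon(n)$, such that $k^*(n)\,\varepsilon(n) \to 0$, $k^*(n)/\ln n \to 0$ and $n^{\delta} \varepsilon(n) \to \infty$ as $n \to \infty$ for any $\delta > 0$. One can take, for example, $\varepsilon(n) = 1/\ln(n+1)$ and $k^*(n) = \sqrt{\ln(n+1)}$. For each $\alpha = (\beta, r) \in \mathcal{A}_n$, we introduce the weight sequence $\gamma_\alpha = (\gamma_\alpha(j))_{j \ge 1}$ as

$$ \gamma_\alpha(j) = \mathbf{1}_{\{1 \le j \le d(\alpha)\}} + \big( 1 - (j/\omega_\alpha)^{\beta} \big)\,\mathbf{1}_{\{d(\alpha) < j \le \omega_\alpha\}}, $$

where $\omega_\alpha = (\tau_\beta\,r\,v_n)^{1/(2\beta+1)}$, $\tau_\beta = (\beta+1)(2\beta+1)/(\pi^{2\beta}\beta)$ and $d(\alpha) = [\omega_\alpha / \ln(n+1)]$. We set

$$ \Gamma = \{ \gamma_\alpha,\ \alpha \in \mathcal{A}_n \}. \tag{4.14} $$

Note that such weight coefficients satisfy the condition D1).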
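The weight family (4.14) can be generated directly from its definition. The sketch below is illustrative only: `weight_vector` and `weight_family` are hypothetical names, `k_star` is passed as an integer (the integer part of $k^*$), and `varsigma_star` stands for the bound $\varsigma^*$ that enters $v_n$.

```python
import math


def weight_vector(beta, r, n, varsigma_star):
    """Weights gamma_alpha(j), j = 1..n, for alpha = (beta, r); also returns d(alpha)."""
    v_n = n / varsigma_star
    tau = (beta + 1) * (2 * beta + 1) / (math.pi ** (2 * beta) * beta)
    omega = (tau * r * v_n) ** (1.0 / (2 * beta + 1))
    d = int(omega / math.log(n + 1))
    gamma = []
    for j in range(1, n + 1):
        if j <= d:
            gamma.append(1.0)                          # first d(alpha) weights equal one
        elif j <= omega:
            gamma.append(1.0 - (j / omega) ** beta)    # polynomial decay up to omega_alpha
        else:
            gamma.append(0.0)
    return gamma, d


def weight_family(n, k_star, eps, varsigma_star):
    """All weight vectors over the grid {1..k_star} x {eps, 2*eps, ..., m*eps}, m = [1/eps^2]."""
    m = int(1.0 / eps ** 2)
    return [weight_vector(beta, i * eps, n, varsigma_star)[0]
            for beta in range(1, k_star + 1) for i in range(1, m + 1)]
```

Every vector produced this way lies in $[0,1]^n$ and starts with $d(\alpha)$ ones, which is how the condition D1) is met; the family has $k^* m$ elements.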

5. Asymptotic efficiency

In order to study the asymptotic efficiency we define the following functional Sobolev ball

$$ W_{k,r} = \Big\{ f \in C^k_p[0,1] : \sum_{j=0}^{k} \|f^{(j)}\|^2 \le r \Big\}, \tag{5.1} $$

where $r > 0$ and $k \ge 1$ are some unknown parameters, and $C^k_p[0,1]$ is the space of $k$ times differentiable 1-periodic functions such that $f^{(i)}(0) = f^{(i)}(1)$ for any $0 \le i \le k-1$. Let $\Sigma_n$ denote all estimators $\hat S_n$, i.e. measurable functions with respect to $\sigma\{y_t,\ 0 \le t \le n\}$. In the sequel, we denote by $Q^*$ the distribution of the process $(y_t)_{0 \le t \le n}$ with $\xi_t = \sqrt{\varsigma^*}\,w_t$, i.e. the white noise model with the intensity $\varsigma^*$.

Theorem 5.1. Assume that $Q^* \in \mathcal{Q}_n$. The robust risk (1.5) has the following lower bound

$$ \liminf_{n \to \infty}\ \inf_{\hat S_n \in \Sigma_n} v_n^{2k/(2k+1)} \sup_{S \in W_{k,r}} \mathcal{R}^*(\hat S_n, S) \ge l_k(r) $$

with $l_k(r) = ((2k+1)r)^{1/(2k+1)}\,(k/(\pi(k+1)))^{2k/(2k+1)}$.
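For orientation, the constant $l_k(r)$ in the lower bound is easy to evaluate. The helper below is a small illustrative sketch; the name `lower_bound_constant` is hypothetical.

```python
import math


def lower_bound_constant(k, r):
    """l_k(r) = ((2k+1) r)^(1/(2k+1)) * (k / (pi (k+1)))^(2k/(2k+1))."""
    return (((2 * k + 1) * r) ** (1.0 / (2 * k + 1))
            * (k / (math.pi * (k + 1))) ** (2.0 * k / (2 * k + 1)))
```

The constant grows with the radius $r$ of the Sobolev ball, in line with the fact that a larger class of functions is harder to estimate uniformly.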

We show that this lower bound is sharp in the following sense.

Theorem 5.2. The model selection procedure (4.7), with the weight coefficients (4.14), satisfies the following upper bound

$$ \limsup_{n \to \infty} v_n^{2k/(2k+1)} \sup_{S \in W_{k,r}} \mathcal{R}^*(S^*, S) \le l_k(r). $$

It is clear that these theorems imply the following efficiency property.

Corollary 5.3. Assume that $Q^* \in \mathcal{Q}_n$. Then the model selection procedure (4.7), with the weight coefficients (4.14), is asymptotically efficient, i.e.

$$ \lim_{n \to \infty} v_n^{2k/(2k+1)} \sup_{S \in W_{k,r}} \mathcal{R}^*(S^*, S) = l_k(r). $$

Theorem 5.1 and Theorem 5.2 are proved in the same way as Theorems 1 and 2 in [9].

6. Monte Carlo simulations

In this section we give the results of numerical simulations to assess the performance and improvement of the proposed model selection procedure (4.7). We simulate the model (1.1) with the 1-periodic function $S$ of the form

$$ S(t) = t\sin(2\pi t) + t^2(1 - t)\cos(2\pi t), \quad 0 \le t \le 1, \tag{6.1} $$

and the Ornstein-Uhlenbeck noise process $(\xi_t)_{t \ge 0}$ defined by the equation

$$ d\xi_t = -\xi_t\,dt + 0.5\,dw_t + 0.5\,dz_t, \qquad z_t = \sum_{j=1}^{N_t} Y_j; $$

here $N_t$ is a homogeneous Poisson process with the intensity $\lambda = 1$ and $(Y_j)_{j \ge 1}$ is an i.i.d. Gaussian (0,1) sequence (see, for example, [12]).

We use the model selection procedure (4.7) with the weights (4.14) in which $k^* = 100 + \sqrt{\ln(n+1)}$, $r_i = i/\ln(n+1)$, $m = [\ln^2(n+1)]$, $\varsigma^* = 0.5$ and $\rho = (3 + \ln n)^{-2}$. We define the empirical risk as

$$ \bar{\mathcal{R}}(\tilde S, S) = \frac{1}{p} \sum_{j=1}^{p} \hat E\,\tilde\Delta_n^2(t_j) \quad\text{and}\quad \hat E\,\tilde\Delta_n^2(t) = \frac{1}{N} \sum_{l=1}^{N} \tilde\Delta_{n,l}^2(t), $$

where $\tilde\Delta_n(t) = \tilde S_n(t) - S(t)$ and $\tilde\Delta_{n,l}(t) = \tilde S^l_n(t) - S(t)$ is the deviation for the $l$-th replication. In this example we take the frequency of observations $p = 100001$ and the number of replications $N = 1000$.
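The empirical risk defined above is an average of squared deviations over replications and then over grid points. The following sketch shows the computation for the test function (6.1); the names `regression_function` and `empirical_risk` are hypothetical, and the estimates are assumed to be already evaluated on the grid.

```python
import math


def regression_function(t):
    """Test function S(t) = t sin(2 pi t) + t^2 (1 - t) cos(2 pi t) on [0, 1]."""
    return t * math.sin(2.0 * math.pi * t) + t * t * (1.0 - t) * math.cos(2.0 * math.pi * t)


def empirical_risk(estimates, true_values):
    """Average over grid points of the mean squared deviation over replications.

    estimates[l][j] is the estimate from replication l at grid point t_j,
    true_values[j] is S(t_j).
    """
    n_rep = len(estimates)
    n_pts = len(true_values)
    total = 0.0
    for j in range(n_pts):
        total += sum((est[j] - true_values[j]) ** 2 for est in estimates) / n_rep
    return total / n_pts
```

With two replications `[[1, 2], [3, 2]]` and true values `[2, 2]`, the mean squared deviation is 1 at the first point and 0 at the second, so the empirical risk is 0.5.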

Table 1 gives the values for the sample risks of the improved estimate (4.7) and the model selection procedure based on the weighted LSE (3.15) from [11] for different observation periods n. Table 2 gives the values for the sample risks of the model selection procedure based on the weighted LSE (3.15) from [11] and its improved version for different observation periods n.

Table 1

The sample quadratic risks for different optimal $\gamma$

n                                                      100      200      500      1000
$\bar R(S^*_{\gamma^*}, S)$                            0.0289   0.0089   0.0021   0.0011
$\bar R(\hat S_{\hat\gamma}, S)$                       0.0457   0.0216   0.0133   0.0098
$\bar R(\hat S_{\hat\gamma}, S) / \bar R(S^*_{\gamma^*}, S)$   1.6      2.4      6.3      8.9

Table 2

The sample quadratic risks for the same optimal $\gamma$

n                                                      100      200      500      1000
$\bar R(S^*_{\hat\gamma}, S)$                          0.0391   0.0159   0.0098   0.0066
$\bar R(\hat S_{\hat\gamma}, S)$                       0.0457   0.0216   0.0133   0.0098
$\bar R(\hat S_{\hat\gamma}, S) / \bar R(S^*_{\hat\gamma}, S)$   1.2      1.4      1.3      1.5

Remark 6.1. Figures 1 and 2 show the behaviour of the procedures (3.6) and (4.7) depending on the values of the observation period n. The bold line is the function (6.1), the continuous line is the model selection procedure based on the least squares estimators $\hat S$ and the dashed line is the improved model selection procedure $S^*$. From Table 2, for the same $\gamma$ and various observation periods n, one can conclude that the theoretical result on the improvement effect (3.13) is confirmed by the numerical simulations. Moreover, for the proposed shrinkage procedure, from Table 1 and Figures 1 and 2 we can conclude that the gain is considerable for n that are not large.

Fig. 1. Behavior of the regression function and its estimates for n = 500

Fig. 2. Behavior of the regression function and its estimates for n = 1000

7. Proofs

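7.1. Proof of Theorem 3.1. Consider the quadratic error of the estimate (3.12). Since $\gamma(j) = 1$ for $1 \le j \le d$ and $\theta^*_{j,n} = \hat\theta_{j,n}$ for $j > d$, we have (with the convention $\gamma(j) = 0$ for $j > n$)

$$ \|S^*_\gamma - S\|^2 = \sum_{j=1}^{\infty} (\gamma(j)\theta^*_{j,n} - \theta_j)^2 = \sum_{j=1}^{d} (\theta^*_{j,n} - \theta_j)^2 + \sum_{j=d+1}^{\infty} (\gamma(j)\hat\theta_{j,n} - \theta_j)^2 $$

$$ = \sum_{j=1}^{\infty} (\gamma(j)\hat\theta_{j,n} - \theta_j)^2 + c_n^2 - 2c_n \sum_{j=1}^{d} (\hat\theta_{j,n} - \theta_j)\,\iota_j(\tilde\theta_n) $$

$$ = \|\hat S_\gamma - S\|^2 + c_n^2 - 2c_n \sum_{j=1}^{d} (\hat\theta_{j,n} - \theta_j)\,\iota_j(\tilde\theta_n), $$

where $\iota_j(x) = x_j / |x|_d$ for $x = (x_j)_{1 \le j \le d}$. Therefore the risk for the improved estimator $S^*_\gamma$ can be represented as

$$ \mathcal{R}_Q(S^*_\gamma, S) = \mathcal{R}_Q(\hat S_\gamma, S) + c_n^2 - 2c_n\,E_{Q,S} \sum_{j=1}^{d} I_{j,n}, $$

where $I_{j,n} = E\big( \iota_j(\tilde\theta_n)(\hat\theta_{j,n} - \theta_j) \mid \mathcal{G}_n \big)$. Now, taking into account that the vector $\tilde\theta_n = (\hat\theta_{j,n})_{1 \le j \le d}$ is $\mathcal{G}_n$-conditionally Gaussian in $\mathbb{R}^d$ with mean $\theta = (\theta_j)_{1 \le j \le d}$ and covariance matrix $n^{-1}G_n$, we obtain

$$ I_{j,n} = \int_{\mathbb{R}^d} \iota_j(x)\,(x_j - \theta_j)\,p(x \mid \mathcal{G}_n)\,dx. $$

Here $p(x \mid \mathcal{G}_n)$ is the conditional distribution density of the vector $\tilde\theta_n$, i.e.

$$ p(x \mid \mathcal{G}_n) = \frac{1}{(2\pi)^{d/2} \sqrt{\det(n^{-1}G_n)}} \exp\Big( -\frac{(x - \theta)'\,(n^{-1}G_n)^{-1}\,(x - \theta)}{2} \Big). $$

Changing the variables by $u = (n^{-1}G_n)^{-1/2}(x - \theta)$ yields

$$ I_{j,n} = \frac{1}{(2\pi)^{d/2}} \sum_{l=1}^{d} g_{jl} \int_{\mathbb{R}^d} \tilde\iota_j(u)\,u_l \exp\Big( -\frac{|u|_d^2}{2} \Big)\,du, \tag{7.1} $$

where $\tilde\iota_j(u) = \iota_j\big( (n^{-1}G_n)^{1/2} u + \theta \big)$ and $g_{ij}$ denotes the $(i,j)$-th element of $(n^{-1}G_n)^{1/2}$. Furthermore, integrating by parts, the integral $I_{j,n}$ can be rewritten as

$$ I_{j,n} = \sum_{l=1}^{d} \sum_{k=1}^{d} E\Big( g_{jl}\,g_{kl}\,\frac{\partial \iota_j}{\partial u_k}(u)\Big|_{u = \tilde\theta_n} \Bigm| \mathcal{G}_n \Big). $$

Now, taking into account that $z'Az \le \lambda_{\max}(A)\|z\|^2$ and the condition D2), we obtain that

$$ \Delta_Q(S) = c_n^2 - 2c_n\,n^{-1}\,E_{Q,S} \Big( \frac{\mathrm{tr}\,G_n}{|\tilde\theta_n|_d} - \frac{\tilde\theta_n'\,G_n\,\tilde\theta_n}{|\tilde\theta_n|_d^3} \Big) \le c_n^2 - 2c_n\,l^*_n\,n^{-1}\,E_{Q,S}\,\frac{1}{|\tilde\theta_n|_d}. $$

Recall that the prime denotes the transposition. Moreover, in view of the Jensen inequality, we can estimate the last expectation from below as

$$ E_{Q,S}\,(|\tilde\theta_n|_d)^{-1} = E_{Q,S}\,\big( |\theta + n^{-1/2}\tilde\xi_{d,n}|_d \big)^{-1} \ge \big( |\theta|_d + n^{-1/2}\,E_{Q,S} |\tilde\xi_{d,n}|_d \big)^{-1}. $$

Now we note that the condition (1.3) implies that

$$ E_{Q,S} |\tilde\xi_{d,n}|_d \le \big( E_{Q,S} |\tilde\xi_{d,n}|_d^2 \big)^{1/2} \le \sqrt{\kappa^* d}. $$

So, for $\|S\| \le r^*_n$,

$$ E_{Q,S}\,(|\tilde\theta_n|_d)^{-1} \ge \big( r^*_n + \sqrt{d\,\kappa^*/n} \big)^{-1} $$

and therefore

$$ \Delta_Q(S) \le c_n^2 - 2c_n\,\frac{l^*_n}{\big( r^*_n + \sqrt{d\,\kappa^*/n} \big)\,n} = -c_n^2. $$

Hence Theorem 3.1.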
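The first step of this proof rests on a purely algebraic identity for the first $d$ coordinates: the squared error of the shrunk vector equals the squared error of the original vector plus $c_n^2$ minus a cross term. The sketch below checks that identity numerically for arbitrary inputs; the function name `quadratic_error_identity` is hypothetical, and only the $d$ shrunk coordinates are considered.

```python
import math


def quadratic_error_identity(theta_hat, theta, c_n):
    """Return both sides of |theta* - theta|^2 = |theta_hat - theta|^2 + c^2 - 2c * sum((theta_hat_j - theta_j) * iota_j)."""
    norm = math.sqrt(sum(x * x for x in theta_hat))
    theta_star = [(1.0 - c_n / norm) * x for x in theta_hat]   # shrinkage (3.10) on all d coordinates
    lhs = sum((s - t) ** 2 for s, t in zip(theta_star, theta))
    cross = sum((h - t) * h / norm for h, t in zip(theta_hat, theta))  # iota_j = theta_hat_j / norm
    rhs = sum((h - t) ** 2 for h, t in zip(theta_hat, theta)) + c_n ** 2 - 2.0 * c_n * cross
    return lhs, rhs
```

The two returned values agree up to rounding because $\sum_j \iota_j^2 = 1$; the probabilistic part of the proof then only has to bound the expectation of the cross term from below.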

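7.2. Proof of Theorem 4.1. Substituting (4.4) in (4.1) yields for any $\gamma \in \Gamma$

$$ \mathrm{Err}_n(\gamma) = J_n(\gamma) + 2 \sum_{j=1}^{n} \gamma(j) \big( \bar\theta_{j,n} - \theta^*_{j,n}\theta_j \big) + \|S\|^2 - \rho\,\hat P_n(\gamma). \tag{7.2} $$

Now we set $L(\gamma) = \sum_{j=1}^{n} \gamma(j)$,

$$ B_{1,n}(\gamma) = \sum_{j=1}^{n} \gamma(j) \big( E_Q\,\xi^2_{j,n} - \sigma_Q \big), \qquad B_{2,n}(\gamma) = \sum_{j=1}^{n} \gamma(j)\,\tilde\xi_{j,n}, $$

$$ M(\gamma) = \frac{1}{\sqrt{n}} \sum_{j=1}^{n} \gamma(j)\,\theta_j\,\xi_{j,n} \quad\text{and}\quad B_{3,n}(\gamma) = \frac{1}{\sqrt{n}} \sum_{j=1}^{n} \gamma(j)\,g(j)\,\hat\theta_{j,n}\,\xi_{j,n}, $$

where $\tilde\xi_{j,n} = \xi^2_{j,n} - E_Q\,\xi^2_{j,n}$. Taking into account the definition (4.12) of $P_n(\gamma)$, we can rewrite (7.2) as

$$ \mathrm{Err}_n(\gamma) = J_n(\gamma) + 2\,\frac{\sigma_Q - \hat\sigma_n}{n}\,L(\gamma) + 2M(\gamma) + \frac{2}{n}\,B_{1,n}(\gamma) + 2\sqrt{P_n(\gamma)}\,\frac{B_{2,n}(\bar\gamma)}{\sqrt{\sigma_Q n}} - 2B_{3,n}(\gamma) + \|S\|^2 - \rho\,\hat P_n(\gamma) \tag{7.3} $$

with $\bar\gamma = \gamma / |\gamma|_n$. Let $\gamma_0 = (\gamma_0(j))_{1 \le j \le n}$ be a fixed sequence in $\Gamma$ and $\gamma^*$ be as in (4.6). Substituting $\gamma_0$ and $\gamma^*$ in (7.3), we consider the difference

$$ \mathrm{Err}_n(\gamma^*) - \mathrm{Err}_n(\gamma_0) \le 2\,\frac{\sigma_Q - \hat\sigma_n}{n}\,L(x) + 2M(x) + \frac{2}{n}\,B_{1,n}(x) + 2\sqrt{P_n(\gamma^*)}\,\frac{B_{2,n}(\bar\gamma^*)}{\sqrt{\sigma_Q n}} $$

$$ - 2\sqrt{P_n(\gamma_0)}\,\frac{B_{2,n}(\bar\gamma_0)}{\sqrt{\sigma_Q n}} - 2B_{3,n}(\gamma^*) + 2B_{3,n}(\gamma_0) - \rho\,\hat P_n(\gamma^*) + \rho\,\hat P_n(\gamma_0), $$

where $x = \gamma^* - \gamma_0$. Note that $|L(x)| \le 2|\Gamma|_*$ and $|B_{1,n}(x)| \le L_{1,n}(Q)$. Applying the elementary inequality

$$ 2|ab| \le \varepsilon a^2 + \varepsilon^{-1} b^2 \tag{7.4} $$

with any $\varepsilon > 0$, we get

$$ 2\sqrt{P_n(\gamma)}\,\frac{|B_{2,n}(\bar\gamma)|}{\sqrt{\sigma_Q n}} \le \varepsilon P_n(\gamma) + \frac{B^2_{2,n}(\bar\gamma)}{\varepsilon\,\sigma_Q n} \le \varepsilon P_n(\gamma) + \frac{B^*_2}{\varepsilon\,\sigma_Q n}, $$

where

$$ B^*_2 = \max_{\gamma \in \Gamma} \big( B^2_{2,n}(\bar\gamma) + B^2_{2,n}(\bar\gamma^2) \big) $$

with $\bar\gamma^2 = (\bar\gamma^2(j))_{1 \le j \le n}$. Note that from the definition of the function $L_{2,n}(Q)$ in the condition C2) it follows that

$$ E_Q B^*_2 \le \sum_{\gamma \in \Gamma} \big( E_Q B^2_{2,n}(\bar\gamma) + E_Q B^2_{2,n}(\bar\gamma^2) \big) \le 2\nu\,L_{2,n}(Q). \tag{7.5} $$

Moreover, by the same argument one can estimate the term $B_{3,n}$. Note that

$$ \sum_{j=1}^{n} g^2_\gamma(j)\,\hat\theta^2_{j,n} = c_n^2 \le \frac{c^*_n}{n}, \tag{7.6} $$

where $c^*_n = n \max_{\gamma \in \Gamma} c_n^2$. Therefore by the Cauchy-Schwarz inequality, we can estimate the term $B_{3,n}(\gamma)$ as

$$ |B_{3,n}(\gamma)| \le \frac{|\gamma|_n}{\sqrt{n}}\,c_n \Big( \sum_{j=1}^{n} \bar\gamma^2(j)\,\xi^2_{j,n} \Big)^{1/2} = \frac{|\gamma|_n}{\sqrt{n}}\,c_n \big( \sigma_Q + B_{2,n}(\bar\gamma^2) \big)^{1/2}. $$

So, applying the elementary inequality (7.4) with some arbitrary $\varepsilon > 0$, we have

$$ 2|B_{3,n}(\gamma)| \le \varepsilon P_n(\gamma) + \frac{c^*_n}{\varepsilon\,\sigma_Q n}\,(\sigma_Q + B^*_2). $$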

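Using the bounds above, one has

$$ \mathrm{Err}_n(\gamma^*) \le \mathrm{Err}_n(\gamma_0) + \frac{4|\Gamma|_*\,|\hat\sigma_n - \sigma_Q|}{n} + 2M(x) + \frac{2}{n}\,L_{1,n}(Q) + \frac{2c^*_n}{\varepsilon\,\sigma_Q n}\,(\sigma_Q + B^*_2) $$

$$ + \frac{2}{\varepsilon\,\sigma_Q n}\,B^*_2 + 2\varepsilon P_n(\gamma^*) + 2\varepsilon P_n(\gamma_0) - \rho\,\hat P_n(\gamma^*) + \rho\,\hat P_n(\gamma_0). $$

Setting $\varepsilon = \rho/4$ and estimating $\rho$ by 1 where it is possible, we have

$$ \mathrm{Err}_n(\gamma^*) \le \mathrm{Err}_n(\gamma_0) + \frac{5|\Gamma|_*\,|\hat\sigma_n - \sigma_Q|}{n} + 2M(x) + \frac{2}{n}\,L_{1,n}(Q) $$

$$ + \frac{16(c^*_n + 1)(\sigma_Q + B^*_2)}{\rho\,\sigma_Q n} - \frac{\rho}{2}\,P_n(\gamma^*) + \frac{\rho}{2}\,P_n(\gamma_0) + \rho\,\hat P_n(\gamma_0). $$

Moreover, taking into account here that

$$ |\hat P_n(\gamma_0) - P_n(\gamma_0)| \le \frac{|\Gamma|_*\,|\hat\sigma_n - \sigma_Q|}{n} $$

and that $\rho < 1/2$, we obtain that

$$ \mathrm{Err}_n(\gamma^*) \le \mathrm{Err}_n(\gamma_0) + \frac{6|\Gamma|_*\,|\hat\sigma_n - \sigma_Q|}{n} + 2M(x) + \frac{2}{n}\,L_{1,n}(Q) $$

$$ + \frac{16(c^*_n + 1)(\sigma_Q + B^*_2)}{\rho\,\sigma_Q n} - \frac{\rho}{2}\,P_n(\gamma^*) + \frac{3\rho}{2}\,P_n(\gamma_0). \tag{7.7} $$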

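Now we estimate the third term on the right-hand side of this inequality. Firstly, we note that

$$ 2|M(x)| \le \varepsilon \|S_x\|^2 + \frac{Z^*}{n\varepsilon}, \tag{7.8} $$

where $S_x = \sum_{j=1}^{n} x_j\,\theta_j\,\phi_j$ and

$$ Z^* = \sup_{x \in \Gamma_1} \frac{n M^2(x)}{\|S_x\|^2} $$

with the set $\Gamma_1 = \Gamma - \gamma_0$. Using Proposition 7.1 from [17], we can obtain that for any fixed $x = (x_j)_{1 \le j \le n} \in \mathbb{R}^n$

$$ E\,M^2(x) = \frac{E\,I_n^2(S_x)}{n^2} \le \frac{\sigma_Q \|S_x\|^2}{n} = \frac{\sigma_Q}{n} \sum_{j=1}^{n} x_j^2\,\theta_j^2 \tag{7.9} $$

and therefore

$$ E_Q Z^* \le \sum_{x \in \Gamma_1} \frac{n\,E\,M^2(x)}{\|S_x\|^2} \le \sigma_Q\,\nu. \tag{7.10} $$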

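The norm $\|S^*_{\gamma^*} - S^*_{\gamma_0}\|$ can be estimated from below as

$$ \|S^*_{\gamma^*} - S^*_{\gamma_0}\|^2 = \sum_{j=1}^{n} (x(j) + \beta(j))^2\,\hat\theta^2_{j,n} \ge \|\hat S_x\|^2 + 2 \sum_{j=1}^{n} x(j)\,\beta(j)\,\hat\theta^2_{j,n}, $$

where $\hat S_x = \sum_{j=1}^{n} x(j)\,\hat\theta_{j,n}\,\phi_j$ and $\beta(j) = \gamma_0(j)\,g_j(\gamma_0) - \gamma^*(j)\,g_j(\gamma^*)$. Therefore, in view of (3.5),

$$ \|S_x\|^2 - \|S^*_{\gamma^*} - S^*_{\gamma_0}\|^2 \le \|S_x\|^2 - \|\hat S_x\|^2 - 2 \sum_{j=1}^{n} x(j)\,\beta(j)\,\hat\theta^2_{j,n} $$

$$ \le -2M(x^2) - 2 \sum_{j=1}^{n} x(j)\,\beta(j)\,\hat\theta_{j,n}\,\theta_j - \frac{2}{\sqrt{n}}\,\Upsilon(x), $$

where $\Upsilon(\gamma) = \sum_{j=1}^{n} \gamma(j)\,\beta(j)\,\hat\theta_{j,n}\,\xi_{j,n}$. Note that the first term in this inequality can be estimated as

$$ 2M(x^2) \le \varepsilon \|S_x\|^2 + \frac{Z^*_1}{n\varepsilon} \quad\text{and}\quad Z^*_1 = \sup_{x \in \Gamma_1} \frac{n M^2(x^2)}{\|S_x\|^2}. $$

Note that, similarly to (7.10), one can estimate the last term as

$$ E_Q Z^*_1 \le \sigma_Q\,\nu. $$

From here it follows that, for any $0 < \varepsilon < 1$,

$$ \|S_x\|^2 \le \frac{1}{1 - \varepsilon} \Big( \|S^*_{\gamma^*} - S^*_{\gamma_0}\|^2 + \frac{Z^*_1}{n\varepsilon} - 2 \sum_{j=1}^{n} x(j)\,\beta(j)\,\hat\theta_{j,n}\,\theta_j - \frac{2}{\sqrt{n}}\,\Upsilon(x) \Big). \tag{7.11} $$

Further we note that the property (7.6) yields

$$ \sum_{j=1}^{n} \beta^2(j)\,\hat\theta^2_{j,n} \le 2 \sum_{j=1}^{n} g^2_{\gamma^*}(j)\,\hat\theta^2_{j,n} + 2 \sum_{j=1}^{n} g^2_{\gamma_0}(j)\,\hat\theta^2_{j,n} \le \frac{4c^*_n}{n}. \tag{7.12} $$

Given $|x(j)| \le 1$ and using the inequality (7.4), we get that for any $\varepsilon > 0$

$$ 2 \Big| \sum_{j=1}^{n} x(j)\,\beta(j)\,\hat\theta_{j,n}\,\theta_j \Big| \le \varepsilon \|S_x\|^2 + \frac{4c^*_n}{n\varepsilon}. $$

To estimate the last term on the right-hand side of (7.11) we use first the Cauchy-Schwarz inequality and then the bound (7.12), i.e.

$$ \frac{2}{\sqrt{n}}\,|\Upsilon(\gamma)| \le \frac{2}{\sqrt{n}} \Big( \sum_{j=1}^{n} \beta^2(j)\,\hat\theta^2_{j,n} \Big)^{1/2} \Big( \sum_{j=1}^{n} \gamma^2(j)\,\xi^2_{j,n} \Big)^{1/2} $$

$$ \le \varepsilon P_n(\gamma) + \frac{c^*_n}{\varepsilon\,\sigma_Q n} \sum_{j=1}^{n} \bar\gamma^2(j)\,\xi^2_{j,n} \le \varepsilon P_n(\gamma) + \frac{c^*_n (\sigma_Q + B^*_2)}{\varepsilon\,\sigma_Q n}. $$

Therefore

$$ \frac{2}{\sqrt{n}}\,|\Upsilon(x)| \le \frac{2}{\sqrt{n}}\,|\Upsilon(\gamma^*)| + \frac{2}{\sqrt{n}}\,|\Upsilon(\gamma_0)| \le \varepsilon P_n(\gamma^*) + \varepsilon P_n(\gamma_0) + \frac{2c^*_n (\sigma_Q + B^*_2)}{\varepsilon\,\sigma_Q n}. $$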

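Combining all these bounds in (7.11), we obtain that

$$ \|S_x\|^2 \le \frac{1}{1 - \varepsilon} \Big( \frac{Z^*_1}{n\varepsilon} + \|S^*_{\gamma^*} - S^*_{\gamma_0}\|^2 + \frac{6c^*_n (\sigma_Q + B^*_2)}{n\,\sigma_Q\,\varepsilon} + \varepsilon P_n(\gamma^*) + \varepsilon P_n(\gamma_0) \Big). $$

Using (7.8) and this bound together with $\|S^*_{\gamma^*} - S^*_{\gamma_0}\|^2 \le 2\big( \mathrm{Err}_n(\gamma^*) + \mathrm{Err}_n(\gamma_0) \big)$, we have

$$ 2M(x) \le \frac{Z^* + Z^*_1}{n(1 - \varepsilon)\varepsilon} + \frac{2\varepsilon \big( \mathrm{Err}_n(\gamma^*) + \mathrm{Err}_n(\gamma_0) \big)}{1 - \varepsilon} + \frac{6c^*_n (\sigma_Q + B^*_2)}{n\,\sigma_Q (1 - \varepsilon)} + \frac{\varepsilon^2}{1 - \varepsilon} \big( P_n(\gamma^*) + P_n(\gamma_0) \big). $$

Choosing here $\varepsilon \le \rho/2 < 1/2$ we have that

$$ 2M(x) \le \frac{2(Z^* + Z^*_1)}{n\varepsilon} + \frac{2\varepsilon \big( \mathrm{Err}_n(\gamma^*) + \mathrm{Err}_n(\gamma_0) \big)}{1 - \varepsilon} + \frac{12c^*_n (\sigma_Q + B^*_2)}{n\,\sigma_Q} + \varepsilon \big( P_n(\gamma^*) + P_n(\gamma_0) \big). $$

From here and (7.7), it follows that

$$ \mathrm{Err}_n(\gamma^*) \le \frac{1 + \varepsilon}{1 - 3\varepsilon}\,\mathrm{Err}_n(\gamma_0) + \frac{6|\Gamma|_*\,|\hat\sigma_n - \sigma_Q| + 2L_{1,n}(Q)}{n(1 - 3\varepsilon)} $$

$$ + \frac{28(1 + c^*_n)(\sigma_Q + B^*_2)}{\rho\,(1 - 3\varepsilon)\,n\,\sigma_Q} + \frac{2(Z^* + Z^*_1)}{n(1 - 3\varepsilon)} + \frac{2\rho\,P_n(\gamma_0)}{1 - 3\varepsilon}. $$

Choosing here $\varepsilon = \rho/3$ and estimating $(1 - \rho)^{-1}$ by 2 where this is possible, we get

$$ \mathrm{Err}_n(\gamma^*) \le \frac{1 + \rho/3}{1 - \rho}\,\mathrm{Err}_n(\gamma_0) + \frac{12|\Gamma|_*\,|\hat\sigma_n - \sigma_Q|}{n} + \frac{4}{n}\,L_{1,n}(Q) $$

$$ + \frac{56(1 + c^*_n)(\sigma_Q + B^*_2)}{\rho\,n\,\sigma_Q} + \frac{4(Z^* + Z^*_1)}{n} + \frac{2\rho\,P_n(\gamma_0)}{1 - \rho}. $$

Taking the expectation and using the upper bound for $P_n(\gamma_0)$ in Lemma 7.1 with $\varepsilon = \rho$ yields

$$ \mathcal{R}_Q(S^*, S) \le \frac{1 + 5\rho}{1 - \rho}\,\mathcal{R}_Q(S^*_{\gamma_0}, S) + \frac{U_{Q,n}}{n\rho} + \frac{12|\Gamma|_*\,E_Q |\hat\sigma_n - \sigma_Q|}{n}, $$

where $U_{Q,n} = 4L_{1,n}(Q) + 56(1 + c^*_n)(2L_{2,n}(Q)\nu + 1) + 2c^*_n$. Since this inequality holds for each $\gamma_0 \in \Gamma$, this implies Theorem 4.1.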

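7.3. Property of the penalty term

Lemma 7.1. For any $n \ge 1$, $\gamma \in \Gamma$ and $0 < \varepsilon < 1$

$$ P_n(\gamma) \le \frac{E_Q\,\mathrm{Err}_n(\gamma)}{1 - \varepsilon} + \frac{c^*_n}{n\varepsilon(1 - \varepsilon)}. \tag{7.13} $$

Proof. By the definition of $\mathrm{Err}_n(\gamma)$ one has

$$ \mathrm{Err}_n(\gamma) = \sum_{j=1}^{n} \big( \gamma(j)\theta^*_{j,n} - \theta_j \big)^2 = \sum_{j=1}^{n} \big( \gamma(j)(\theta^*_{j,n} - \theta_j) + (\gamma(j) - 1)\theta_j \big)^2 $$

$$ \ge \sum_{j=1}^{n} \gamma(j)^2 (\theta^*_{j,n} - \theta_j)^2 + 2 \sum_{j=1}^{n} \gamma(j)(\gamma(j) - 1)\,\theta_j\,(\theta^*_{j,n} - \theta_j). $$

By the condition D1) and the definition (3.10) we obtain that the last term in the sum can be replaced as

$$ \sum_{j=1}^{n} \gamma(j)(\gamma(j) - 1)\,\theta_j\,(\theta^*_{j,n} - \theta_j) = \sum_{j=1}^{n} \gamma(j)(\gamma(j) - 1)\,\theta_j\,(\hat\theta_{j,n} - \theta_j), $$

i.e. $E_Q \sum_{j=1}^{n} \gamma(j)(\gamma(j) - 1)\,\theta_j\,(\hat\theta_{j,n} - \theta_j) = 0$ and, therefore, taking into account the definition (4.12), we obtain that

$$ E_Q\,\mathrm{Err}_n(\gamma) \ge \sum_{j=1}^{n} \gamma(j)^2\,E_Q (\theta^*_{j,n} - \theta_j)^2 = \sum_{j=1}^{n} \gamma(j)^2\,E_Q \Big( \frac{\xi_{j,n}}{\sqrt{n}} - g_\gamma(j)\,\hat\theta_{j,n} \Big)^2 $$

$$ \ge P_n(\gamma) - \frac{2}{\sqrt{n}}\,E_Q \sum_{j=1}^{n} \gamma(j)^2\,g_\gamma(j)\,\hat\theta_{j,n}\,\xi_{j,n} \ge (1 - \varepsilon)\,P_n(\gamma) - \frac{1}{\varepsilon}\,E_Q \sum_{j=1}^{n} g^2_\gamma(j)\,\hat\theta^2_{j,n}. $$

The inequality (7.6) implies the bound (7.13). Hence Lemma 7.1.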

REFERENCES

1. Barndorff-Nielsen O.E., Shephard N. Non-Gaussian Ornstein-Uhlenbeck-based models and some of their uses in financial mathematics // J. Royal Stat. Soc. 2001. B 63. P. 167-241.

2. Bertoin J. Lévy Processes. Cambridge: Cambridge University Press, 1996.

3. Cont R., Tankov P. Financial Modelling with Jump Processes. London: Chapman & Hall, 2004.

4. Fourdrinier D., Pergamenshchikov S. Improved selection model method for the regression with dependent noise // Annals of the Institute of Statistical Mathematics. 2007. V. 59(3). P. 435-464.

5. Galtchouk L., Pergamenshchikov S. Sharp non-asymptotic oracle inequalities for non-parametric heteroscedastic regression models // Journal of Nonparametric Statistics. 2009. V. 21(1). P. 1-16.

6. Galtchouk L., Pergamenshchikov S. Adaptive asymptotically efficient estimation in heteroscedastic nonparametric regression // Journal of Korean Statistical Society. 2009. V. 38(4). P. 305-322.

7. James W., Stein C. Estimation with quadratic loss // Proceedings of the Fourth Berkeley Symposium Mathematics, Statistics and Probability. Berkeley: University of California Press, 1961. V. 1. P. 361-380.

8. Konev V.V., Pergamenshchikov S.M. Nonparametric estimation in a semimartingale regression model. Part 1. Oracle Inequalities // Tomsk State University Journal of Mathematics and Mechanics. 2009. No. 3(7). P. 23-41.

9. Konev V.V., Pergamenshchikov S.M. Nonparametric estimation in a semimartingale regression model. Part 2. Robust asymptotic efficiency // Tomsk State University Journal of Mathematics and Mechanics. 2009. No. 4(8). P. 31-45.

10. Konev V.V., Pergamenshchikov S.M. General model selection estimation of a periodic regression with a Gaussian noise // Annals of the Institute of Statistical Mathematics. 2010. V. 62. P. 1083-1111.

11. Konev V.V., Pergamenshchikov S.M. Efficient robust nonparametric estimation in a semimartingale regression model // Annales de l'Institut Henri Poincaré. Probab. and Stat. 2012. V. 48(4). P. 1217-1244.

12. Konev V.V., Pergamenshchikov S.M. Robust model selection for a semimartingale continuous time regression from discrete data // Stochastic processes and their applications. 2015. V. 125. P. 294-326.

13. Konev V., Pergamenshchikov S. and Pchelintsev E. Estimation of a regression with the pulse type noise from discrete data // Theory Probab. Appl. 2014. V. 58(3). P. 442-457.

14. Pchelintsev E. Improved estimation in a non-Gaussian parametric regression // Stat. Inference Stoch. Process. 2013. V. 16(1). P. 15-28.

15. Pchelintsev E., Pergamenshchikov S. Oracle inequalities for the stochastic differential equations // Stat. Inference Stoch. Process. 2018. V. 21 (2). P. 469-483.

16. Pchelintsev E.A., Pchelintsev V.A., Pergamenshchikov S.M. Non asymptotic sharp oracle inequality for the improved model selection procedures for the adaptive nonparametric signal estimation problem // Communications - Scientific Letters of the University of Zilina. 2018. V. 20 (1). P. 72-76.

17. Pchelintsev E., Pergamenshchikov S. Adaptive model selection method for a conditionally Gaussian semimartingale regression in continuous time. 2018. P. 1-50. Preprint: http://arxiv.org/abs/1811.05319.

18. Povzun M.A., Pchelintsev E.A. Estimating parameters in a regression model with dependent noises // Tomsk State University Journal of Mathematics and Mechanics. 2017. No. 49. P. 43-51.

Received: November 13, 2018

Pchelintsev E. A., Pergamenshchikov S. M. (2019) IMPROVED MODEL SELECTION METHOD FOR AN ADAPTIVE ESTIMATION IN SEMIMARTINGALE REGRESSION MODELS. Vestnik Tomskogo gosudarstvennogo universiteta. Matematika i mekhanika [Tomsk State University Journal of Mathematics and Mechanics]. 58. pp. 14-31

AMS Mathematical Subject Classification: 62G08; 62G05


Funding: This work was supported by RSF, Grant No. 17-11-01049.

Acknowledgments. The work was partially carried out within the framework of the state assignment of the Ministry of Education and Science No. 2.3208.2017/4.6. The work of the second author was partially carried out within the framework of the state assignment of the Ministry of Education and Science No. 1.472.2016/1.4.

PCHELINTSEV Evgeny Anatolievich, Candidate of Physics and Mathematics, Department of Mathematics and Mechanics, National Research Tomsk State University, Tomsk, Russia. E-mail: evgen-pch@yandex.ru

PERGAMENSHCHIKOV Serguei Markovich, Doctor of Physics and Mathematics, Laboratory of Mathematics Raphael Salem, University of Rouen Normandy, France. E-mail: serge.pergamenchtchikov@univ-rouen.fr
