URAL MATHEMATICAL JOURNAL, Vol. 2, No. 2, 2016

PARALLEL ALGORITHM FOR CALCULATING GENERAL EQUILIBRIUM IN MULTIREGION ECONOMIC GROWTH MODELS

Nikolai B. Melnikov

Lomonosov Moscow State University;

Central Economics and Mathematics Institute, RAS, Moscow, Russia

melnikov@cs.msu.ru

Arseniy P. Gruzdev

Lomonosov Moscow State University, Moscow, Russia

gruzdev@cs.msu.ru

Michael G. Dalton

National Oceanic and Atmospheric Administration, Seattle WA, USA

michael.dalton@noaa.gov

Brian C. O'Neill

National Center for Atmospheric Research, Boulder CO, USA

boneill@ucar.edu

Abstract: We develop and analyze a parallel algorithm for computing a solution in a multiregion dynamic general equilibrium model. The algorithm is based on an iterative method of the Gauss-Seidel type and exploits a special block structure of the model. Calculation of prices and input-output ratios in production for different time steps is carried out in parallel. We implement the parallel algorithm using the OpenMP interface for systems with shared memory. The efficiency of the algorithm is studied with the number of cores varying over the full range from one to the number of time steps of the model.

Key words: Computable general equilibrium, Economic growth, Iterative methods, High-performance computing, OpenMP.

AMS Classification: 91B50, 91B62, 91B66, 91B74, 68W10

1. Introduction

Dynamic computable general equilibrium (CGE) models are widely used for estimating the effects of demographic and technological changes on energy use and carbon dioxide (CO2) emissions. The equilibrium is described in the framework of the Arrow-Debreu theory, which leads to a system of nonlinear equations. Usually, large-scale nonlinear systems are solved by one of the "general-purpose" Krylov subspace solvers, which can deal effectively with sparse matrices (see, e.g., [1]).

In our paper [2], we presented a parallel algorithm based on an iterative method of the Gauss-Seidel type [3]. We exploited the special block structure of the nonlinear system of equations in dynamic CGE models. We implemented the algorithm using parallel programming environments for the one-region version of the Population-Environmental-Technology (PET) model [4,5]. The numerical results showed that the speed of our algorithm is comparable to that of Krylov subspace solvers such as NITSOL [6].

In this paper we extend the algorithm to models with international trade and apply it to the multiregion PET model [7]. We implement the parallel algorithm using the OpenMP interface for systems with shared memory. To demonstrate the effectiveness of the parallel algorithm we use the PET model calibrated to reproduce major outcomes for the socioeconomic scenarios from the Shared Socioeconomic Pathways (SSP) database (see, e.g. [8]). The calibration of the PET model to the SSPs is described in the supplementary material to [8].

The paper is organized as follows. In Sect. 2 we describe the multiregion PET model. In particular, we explain in detail how the intermediate goods demand is calculated in the presence of international trade. In Sect. 3 we present the numerical method for calculating the equilibrium and explain the parallel algorithm. In Sect. 4 we discuss the calculation results.

2. Structure of the CGE model

In this section we describe the multiregion PET model (for a description of the one-region PET model, see, e.g., [2,4,5]).

The PET model is a forward-looking CGE model with three types of agents: consumers, producers, and government. Consumers maximize their lifetime utility function taking prices as given (Subsec. 2.1). Producers maximize profits at these prices, as described in Subsec. 2.2. Government redistributes capital through taxes and transfers (for details see, e.g., [5]). International trade is described by the Armington model (Subsec. 2.3). Prices are determined by the market clearing conditions for production factors, intermediate and final goods (Subsec. 2.4). The first-order optimality conditions for the agents and the supply-equals-demand conditions for the markets form a system of nonlinear equations. A solution to this system of equations is called the general equilibrium.

2.1. Consumers side

In each of the $N_R$ regions the utility function of the representative household is given by the discounted lifetime consumption

$$U(c) = \sum_{t=0}^{\infty} \beta^t n_t \, \frac{1}{\psi} \Bigl( \sum_{j=1}^{N_C} (\mu_{jt} c_{jt})^{\rho} \Bigr)^{\psi/\rho},$$

where $t = 0, 1, 2, \ldots$ is time, the index $j = 1, \ldots, N_C$ labels consumer goods, $c_{jt}$ is consumption, $n_t$ is the size of population, $\psi \in (-\infty, 1) \setminus \{0\}$ is the intertemporal substitution parameter, $\beta \in (0, 1)$ is the discount factor, $\sigma = 1/(1 - \rho)$ is the elasticity of substitution ($\rho \in (-\infty, 1) \setminus \{0\}$ is the substitution parameter) and $\mu_{jt}$ is the preference coefficient (for details of calculating $\mu_{jt}$ see, e.g., [5]).

The capital dynamics is

$$(1 + \nu_t)\, k_{t+1} = (1 - \delta)\, k_t + x_t, \tag{2.1}$$

where $k_t$ is capital ($k_0 > 0$), $x_t$ is investment, $\delta \in (0, 1)$ is the capital depreciation coefficient, and $1 + \nu_t = n_{t+1}/n_t$ is the population growth coefficient ($\nu_t$ is the growth rate). The budget constraint is

$$\sum_{j=1}^{N_C} p_{jt} c_{jt} + q_t x_t = (1 - \phi_t)\, w_t l_t + (1 - \theta_t)\, r_t k_t + g_t, \tag{2.2}$$

where $p_{jt}$ is the price of the $j$th consumer good, $q_t$ is the price of investment, $w_t$ is the wage rate, $r_t$ is the rental rate of capital, $g_t$ is the government transfers, $l_t$ is the labor supply, and $\theta_t$ and $\phi_t$ are the tax rates on capital and labor incomes, respectively. Here the quantities $c_{jt}$, $k_t$, $x_t$, $l_t$ and $g_t$ are given in per capita terms.
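The capital dynamics (2.1) and the budget constraint (2.2) can be illustrated with a few lines of Python. This is only a sketch: the function names and all numerical values are hypothetical placeholders chosen for the example, not PET calibration data.

```python
def investment_from_budget(spend, q, w, l, r, k, g, phi, theta):
    """Solve the budget constraint (2.2) for per capita investment x_t.

    spend is the consumption expenditure sum_j p_jt * c_jt,
    phi is the labor income tax rate, theta the capital income tax rate.
    """
    income = (1.0 - phi) * w * l + (1.0 - theta) * r * k + g
    return (income - spend) / q


def next_capital(k, x, delta, nu):
    """Capital dynamics (2.1): (1 + nu_t) k_{t+1} = (1 - delta) k_t + x_t."""
    return ((1.0 - delta) * k + x) / (1.0 + nu)


# Hypothetical illustrative values (not taken from the paper).
k0 = 10.0
x0 = investment_from_budget(spend=2.0, q=1.0, w=1.5, l=1.0, r=0.1,
                            k=k0, g=0.2, phi=0.2, theta=0.3)
k1 = next_capital(k0, x0, delta=0.05, nu=0.01)
```

Because all quantities are per capita, the factor 1 + nu_t on the left-hand side of (2.1) dilutes next-period capital when the population grows.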

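Taking prices as given, the representative household maximizes the utility,

$$U(c) \to \max, \tag{2.3}$$

subject to constraints (2.1) and (2.2). The first-order optimality condition for problem (2.1), (2.2) and (2.3) gives the Euler equation

$$\frac{c_t^{\psi - 1}}{p_t} = \beta \, \frac{q_{t+1}(1 - \delta) + (1 - \theta_{t+1})\, r_{t+1}}{q_t} \, \frac{c_{t+1}^{\psi - 1}}{p_{t+1}},$$

where the consumption composite and price index are

$$c_t = \Bigl( \sum_{j=1}^{N_C} (\mu_{jt} c_{jt})^{\rho} \Bigr)^{\frac{1}{\rho}}, \qquad p_t = \Bigl( \sum_{j=1}^{N_C} \Bigl( \frac{p_{jt}}{\mu_{jt}} \Bigr)^{\frac{\rho}{\rho - 1}} \Bigr)^{\frac{\rho - 1}{\rho}},$$

such that

$$\sum_{j=1}^{N_C} p_{jt} c_{jt} = p_t c_t.$$

The transversality condition

$$\lim_{t \to \infty} \lambda_t k_t = 0,$$

where $\lambda_t$ is the Lagrange multiplier, guarantees that the optimal trajectory $(c_t, k_t, x_t)$ exists and is unique (see, e.g., [9]).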
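The identity between total expenditure and the product of the price index and the consumption composite can be checked numerically. The sketch below assumes the standard CES cost-minimizing demands (derived here for the example, not quoted from the paper); the function names and the numbers are hypothetical.

```python
def price_index(p, mu, rho):
    """p_t = (sum_j (p_j / mu_j)^(rho/(rho-1)))^((rho-1)/rho)."""
    e = rho / (rho - 1.0)
    return sum((pj / mj) ** e for pj, mj in zip(p, mu)) ** (1.0 / e)


def composite(c, mu, rho):
    """c_t = (sum_j (mu_j c_j)^rho)^(1/rho)."""
    return sum((mj * cj) ** rho for cj, mj in zip(c, mu)) ** (1.0 / rho)


def demands(c_total, p, mu, rho):
    """Cost-minimizing c_j that deliver the composite level c_total."""
    pt = price_index(p, mu, rho)
    return [c_total * (pj / (mj * pt)) ** (1.0 / (rho - 1.0)) / mj
            for pj, mj in zip(p, mu)]


# Hypothetical prices and preference coefficients for three goods.
p = [1.0, 2.0, 0.5]
mu = [0.5, 0.3, 0.2]
rho = 0.5
c = demands(3.0, p, mu, rho)
expenditure = sum(pj * cj for pj, cj in zip(p, c))
```

With these demands the composite equals the requested level and the expenditure equals the price index times the composite, which is why the household problem can be stated in terms of the pair (c_t, p_t) alone.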

2.2. Producers side

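Firms are aggregated into sectors that produce final goods ($N_C$ consumer goods and one "investment good") and intermediate goods ($N_E$ energy goods and the rest, which we call materials). The total number of production sectors is $N_X = N_C + 1 + N_E + 1$.

The production level of the good $X$ is defined by the constant elasticity of substitution (CES) function

$$X = \gamma_X \Bigl( \alpha_K (G_K K)^{\rho_X} + \alpha_L (G_L L)^{\rho_X} + \alpha_E (G_E E)^{\rho_X} + \alpha_M (G_M M)^{\rho_X} \Bigr)^{\frac{1}{\rho_X}}, \tag{2.4}$$

where $K$ is capital, $L$ is labor, $E$ is the energy composite and $M$ is materials (unlike small letters, which indicate per capita values, capital letters denote totals). Here $G_I$, $I = K, L, E, M$, are the productivity factors and the coefficient $\gamma_X$ normalizes the production shares $\alpha_I$ to unity. Both productivity factors and production shares can be sector- and time-dependent. (The current version of the PET model [8] also has land as a production factor but, for simplicity, we do not consider it here.)

At each time moment, the producer of the good $X$ maximizes profit or, equivalently, minimizes costs

$$P_K K + P_L L + P_E E + (1 + \tau_M) P_M M \to \min_{K, L, E, M}, \tag{2.5}$$

given the level of production (2.4). Here $P_I$ is the corresponding price and $\tau_M$ is the tax on the use of materials (for brevity, we omit the time index).

The minimal cost for problem (2.4) and (2.5) is given by $P_X X$, where

$$P_X = \frac{1}{\gamma_X} \Bigl( \alpha_K^{\frac{1}{1 - \rho_X}} \Bigl( \frac{P_K}{G_K} \Bigr)^{\frac{\rho_X}{\rho_X - 1}} + \alpha_L^{\frac{1}{1 - \rho_X}} \Bigl( \frac{P_L}{G_L} \Bigr)^{\frac{\rho_X}{\rho_X - 1}} + \alpha_E^{\frac{1}{1 - \rho_X}} \Bigl( \frac{P_E}{G_E} \Bigr)^{\frac{\rho_X}{\rho_X - 1}} + \alpha_M^{\frac{1}{1 - \rho_X}} \Bigl( \frac{(1 + \tau_M) P_M}{G_M} \Bigr)^{\frac{\rho_X}{\rho_X - 1}} \Bigr)^{\frac{\rho_X - 1}{\rho_X}}.$$

The cost-minimizing input-output ratios $A_I^X = I/X$ for $I = K, L, E$ are given by

$$A_I^X = \Bigl( \frac{1}{\alpha_I (\gamma_X G_I)^{\rho_X}} \, \frac{P_I}{P_X} \Bigr)^{\frac{1}{\rho_X - 1}},$$

and for $I = M$ the ratio is given by

$$A_M^X = \Bigl( \frac{1}{\alpha_M (\gamma_X G_M)^{\rho_X}} \, \frac{(1 + \tau_M) P_M}{P_X} \Bigr)^{\frac{1}{\rho_X - 1}}.$$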
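A quick numerical check of the unit cost and the input-output ratios is sketched below. It assumes the CES form (2.4) and tax-inclusive input prices; the function names and all parameter values are hypothetical and serve only to show that the ratios reproduce one unit of output at cost $P_X$.

```python
def unit_cost(prices, alpha, G, gamma, rho):
    """Unit cost P_X of the CES technology (2.4) at tax-inclusive input prices."""
    e = rho / (rho - 1.0)
    s = sum(a ** (1.0 / (1.0 - rho)) * (p / g) ** e
            for p, a, g in zip(prices, alpha, G))
    return s ** (1.0 / e) / gamma


def io_ratios(prices, alpha, G, gamma, rho):
    """Cost-minimizing ratios A_I = I / X for each input I."""
    px = unit_cost(prices, alpha, G, gamma, rho)
    return [((p / px) / (a * (gamma * g) ** rho)) ** (1.0 / (rho - 1.0))
            for p, a, g in zip(prices, alpha, G)]


def ces_output(inputs, alpha, G, gamma, rho):
    """Production function (2.4) evaluated at the given inputs."""
    return gamma * sum(a * (g * i) ** rho
                       for i, a, g in zip(inputs, alpha, G)) ** (1.0 / rho)


# Hypothetical data for the inputs K, L, E, M (illustrative only).
tau_M = 0.1
prices = [0.1, 1.0, 1.2, (1.0 + tau_M) * 0.9]
alpha = [0.3, 0.4, 0.1, 0.2]
G = [1.0, 1.2, 1.0, 1.0]
gamma, rho = 1.5, -1.0

P_X = unit_cost(prices, alpha, G, gamma, rho)
A = io_ratios(prices, alpha, G, gamma, rho)
cost_per_unit = sum(p * a for p, a in zip(prices, A))
```

The two properties verified here (unit output and zero profit at price $P_X$) are what allow the model to work with prices and ratios instead of input quantities.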

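Since the PET model is primarily intended for energy economics analysis, it is detailed in the energy sector,

$$E = \gamma_E \Bigl( \sum_{i=1}^{N_E} \alpha_{E_i} (G_{E_i} E_i)^{\rho_E} \Bigr)^{\frac{1}{\rho_E}}, \tag{2.6}$$

where $E_i$, $i = 1, \ldots, N_E$, are different energy types. Solving the cost-minimization problem

$$\sum_{i=1}^{N_E} (1 + \tau_{E_i}) P_{E_i} E_i \to \min_{E_i},$$

given the level of production (2.6), we derive the price of the energy composite,

$$P_E = \frac{1}{\gamma_E} \Bigl( \sum_{i=1}^{N_E} \alpha_{E_i}^{\frac{1}{1 - \rho_E}} \Bigl( \frac{(1 + \tau_{E_i}) P_{E_i}}{G_{E_i}} \Bigr)^{\frac{\rho_E}{\rho_E - 1}} \Bigr)^{\frac{\rho_E - 1}{\rho_E}},$$

and the input-output ratios $A_{E_i}^E = E_i/E$,

$$A_{E_i}^E = \Bigl( \frac{1}{\alpha_{E_i} (\gamma_E G_{E_i})^{\rho_E}} \, \frac{(1 + \tau_{E_i}) P_{E_i}}{P_E} \Bigr)^{\frac{1}{\rho_E - 1}},$$

where $\tau_{E_i}$, $i = 1, \ldots, N_E$, are the taxes on the use of energy.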

2.3. Intermediate goods demand

Production has a nested structure. Therefore, calculation of the intermediate goods demand requires a recursive procedure. We derive the necessary formulae first for the one-region model and then for the multiregion case.


Figure 1. Nested production structure with one intermediate good.

2.3.1. One-region case

To explain the main idea, we first consider the market in which intermediate goods are aggregated into one good, which we call materials $M$. In this case, according to the nested production structure shown in Fig. 1, demand for materials is given by

$$M = A_M^X X + A_M^M (A_M^X X) + (A_M^M)^2 (A_M^X X) + \ldots,$$

where the first term corresponds to the portion of materials used in production of the final good $X(K, L, M)$, the second term corresponds to the portion of materials used in production of materials $M(K, L, M)$ one level down, etc. Calculating the sum of the geometric series, we obtain

$$M = (1 - A_M^M)^{-1} A_M^X X \tag{2.7}$$

or, equivalently,

$$M = A_M^X X + A_M^M M. \tag{2.8}$$

The latter means that demand for materials is equal to the amount of materials needed to produce the final good plus the amount needed to produce the materials themselves. Denoting $Z = M$, $A = A_M^M$ and $Y = A_M^X X$, we write (2.7) as

$$Z = (1 - A)^{-1} Y. \tag{2.9}$$
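The equivalence between the layer-by-layer sum and the closed form (2.7) is easy to verify numerically. The sketch below uses hypothetical ratios and assumes 0 <= A_M^M < 1, which is what makes the geometric series converge.

```python
def materials_demand_series(a_mx, a_mm, x, layers):
    """Sum the first `layers` terms of A_M^X X + A_M^M (A_M^X X) + ..."""
    total, term = 0.0, a_mx * x
    for _ in range(layers):
        total += term
        term *= a_mm  # one level further down the nested structure
    return total


def materials_demand_closed(a_mx, a_mm, x):
    """Closed form (2.7): M = (1 - A_M^M)^{-1} A_M^X X."""
    if not 0.0 <= a_mm < 1.0:
        raise ValueError("the series converges only for 0 <= A_M^M < 1")
    return a_mx * x / (1.0 - a_mm)


# Hypothetical ratios: 0.3 units of materials per unit of X,
# 0.4 units of materials per unit of materials.
M_series = materials_demand_series(0.3, 0.4, 100.0, layers=60)
M_closed = materials_demand_closed(0.3, 0.4, 100.0)
```

The closed-form value also satisfies the balance relation (2.8), since M = 0.3 * 100 + 0.4 * M.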

Next, we consider the production (2.4) with two intermediate goods, energy and materials. In this case, the aggregate demand for materials is given by

$$M = A_M^X X + \bigl[ A_M^M A_M^X X + A_M^E A_E^X X \bigr] + \bigl[ A_M^M \bigl( A_M^M A_M^X X + A_M^E A_E^X X \bigr) + A_M^E \bigl( A_E^M A_M^X X + A_E^E A_E^X X \bigr) \bigr] + \ldots$$

This formula describes the sum over the layers of the nested production structure (Fig. 2). Each expression in square brackets corresponds to a particular layer.

Figure 2. Nested production structure with energy and materials.

Rearrangement of the terms in square brackets gives

$$M = A_M^X X + \bigl[ A_M^M A_M^X X + A_M^E A_E^X X \bigr] + \bigl[ \bigl( A_M^M A_M^M + A_M^E A_E^M \bigr) A_M^X X + \bigl( A_M^M A_M^E + A_M^E A_E^E \bigr) A_E^X X \bigr] + \ldots \tag{2.10}$$

Similarly, for energy we obtain

$$E = A_E^X X + \bigl[ A_E^M A_M^X X + A_E^E A_E^X X \bigr] + \bigl[ \bigl( A_E^M A_M^M + A_E^E A_E^M \bigr) A_M^X X + \bigl( A_E^M A_M^E + A_E^E A_E^E \bigr) A_E^X X \bigr] + \ldots \tag{2.11}$$

Defining $Y = (A_M^X X, A_E^X X)^T$ and

$$A = \begin{pmatrix} A_M^M & A_M^E \\ A_E^M & A_E^E \end{pmatrix},$$

we write expressions (2.10) and (2.11) as a matrix series:

$$\begin{pmatrix} M \\ E \end{pmatrix} = (I + A + A^2 + A^3 + \ldots) \begin{pmatrix} A_M^X X \\ A_E^X X \end{pmatrix},$$

where $I$ is the $2 \times 2$ identity matrix. Summing the series, we have

$$\begin{pmatrix} M \\ E \end{pmatrix} = (I - A)^{-1} \begin{pmatrix} A_M^X X \\ A_E^X X \end{pmatrix}. \tag{2.12}$$

Equation (2.12) can be written in the form

$$Z = (I - A)^{-1} Y, \tag{2.13}$$

where

$$Z = \begin{pmatrix} M \\ E \end{pmatrix}, \qquad Y = \begin{pmatrix} A_M^X X \\ A_E^X X \end{pmatrix}.$$

Note that equation (2.13) has the same form as equation (2.9), which we obtained with one intermediate good. It is the dimensionality of this equation and the form of the vectors $Z$ and $Y$ and the matrix $A$ that change when we change the number of intermediate goods.
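For two intermediate goods the same check can be done with a 2 x 2 matrix. The sketch below compares a truncated matrix series with a direct solve of (I - A) Z = Y by Cramer's rule; the entries of A and the ratios in Y are hypothetical, and the series is assumed to converge (spectral radius of A below one).

```python
def solve_2x2(a, y):
    """Solve (I - a) z = y for a 2x2 matrix a by Cramer's rule."""
    m00, m01 = 1.0 - a[0][0], -a[0][1]
    m10, m11 = -a[1][0], 1.0 - a[1][1]
    det = m00 * m11 - m01 * m10
    return [(y[0] * m11 - m01 * y[1]) / det,
            (m00 * y[1] - m10 * y[0]) / det]


def neumann_sum(a, y, layers):
    """Partial sum (I + a + a^2 + ...) y over the given number of layers."""
    z = [0.0, 0.0]
    term = list(y)
    for _ in range(layers):
        z = [z[0] + term[0], z[1] + term[1]]
        term = [a[0][0] * term[0] + a[0][1] * term[1],
                a[1][0] * term[0] + a[1][1] * term[1]]
    return z


# Hypothetical input-output ratios; rows correspond to M and E.
A = [[0.20, 0.10],
     [0.05, 0.30]]
X = 100.0
Y = [0.3 * X, 0.1 * X]  # (A_M^X X, A_E^X X)

Z_direct = solve_2x2(A, Y)
Z_series = neumann_sum(A, Y, layers=200)
```

In a larger model the direct solve would normally be delegated to a linear algebra library; the hand-written Cramer's rule is used here only to keep the example self-contained.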

2.3.2. Multiregion case

In this subsection we obtain the intermediate goods demand in the multiregion economy with trade.

International trade is described by the Armington model (see, e.g., [10]). It is based on the assumption that the same goods produced in different regions are not perfect substitutes but can be aggregated according to a certain rule (usually a CES function). The Armington model enables the representation of markets in which domestically produced goods keep a share of domestic markets even though their price is higher than the price in other regions, and in which different exporters coexist even if they have different prices.

Figure 3. Nested production structure with one intermediate good for the multiregion case.

As in the previous subsection, we first consider the market with only one intermediate good (Fig. 3). Then $M(M_1, \ldots, M_{N_R})$ aggregates materials $M_1, \ldots, M_{N_R}$ from $N_R$ regions (Fig. 4).

Figure 4. Armington trade structure for materials.

Similarly to problem (2.4) and (2.5), we consider

$$P_1 M_1 + \ldots + P_{N_R} M_{N_R} \to \min,$$

subject to

$$M = \gamma_M \Bigl( \sum_{i=1}^{N_R} \alpha_i M_i^{\rho_M} \Bigr)^{\frac{1}{\rho_M}},$$

where $P_1, \ldots, P_{N_R}$ are the export prices. Then the minimum of the cost function is equal to $P_M M$, where

$$P_M = \frac{1}{\gamma_M} \Bigl( \sum_{i=1}^{N_R} \alpha_i^{\frac{1}{1 - \rho_M}} P_i^{\frac{\rho_M}{\rho_M - 1}} \Bigr)^{\frac{\rho_M - 1}{\rho_M}}.$$

The cost-minimizing input-output ratios $b_i = M_i/M$ are given by

$$b_i = \Bigl( \frac{1}{\alpha_i \gamma_M^{\rho_M}} \, \frac{P_i}{P_M} \Bigr)^{\frac{1}{\rho_M - 1}}.$$

Similarly to relation (2.8), we have

$$M_i = \sum_{j=1}^{N_R} b_{ij} \bigl( A_M^{X_j} X_j + A_M^{M_j} M_j \bigr),$$

where $b_{ij}$ denotes the ratio $b_i$ in the aggregate of region $j$. Denoting $B = (b_{ij})$, $A^X = \mathrm{diag}\,(A_M^{X_j})$ and $A = \mathrm{diag}\,(A_M^{M_j})$, we write

$$Z = (I - BA)^{-1} B Y, \tag{2.14}$$

where we set $Z = (M_1, \ldots, M_{N_R})^T$, $X = (X_1, \ldots, X_{N_R})^T$, $Y = A^X X$ and $I$ is the $N_R \times N_R$ identity matrix.
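Equation (2.14) is equivalent to the balance Z = B (Y + A Z), which suggests a simple fixed-point iteration. The sketch below is a minimal illustration for three regions; the matrix B, the diagonal ratios and the outputs X are hypothetical, and convergence is assumed (it holds here because the spectral radius of BA is below one).

```python
def mat_vec(m, v):
    """Product of a dense matrix (list of rows) and a vector."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def regional_demand(b, a_diag, y, tol=1e-10, max_iter=1000):
    """Iterate Z <- B (Y + A Z); the limit satisfies Z = (I - B A)^{-1} B Y."""
    z = [0.0] * len(y)
    for _ in range(max_iter):
        inner = [yi + ai * zi for yi, ai, zi in zip(y, a_diag, z)]
        z_new = mat_vec(b, inner)
        if max(abs(u - v) for u, v in zip(z_new, z)) < tol:
            return z_new
        z = z_new
    raise RuntimeError("fixed-point iteration did not converge")


# Hypothetical three-region data (illustrative only).
B = [[0.8, 0.1, 0.1],
     [0.1, 0.7, 0.2],
     [0.1, 0.2, 0.7]]
A_diag = [0.3, 0.4, 0.2]     # ratios A_M^{M_j}
AX_diag = [0.2, 0.25, 0.3]   # ratios A_M^{X_j}
X = [100.0, 80.0, 120.0]
Y = [a * x for a, x in zip(AX_diag, X)]

Z = regional_demand(B, A_diag, Y)
residual = [zi - ri for zi, ri in zip(
    Z, mat_vec(B, [yi + ai * zi for yi, ai, zi in zip(Y, A_diag, Z)]))]
```

Each sweep adds one more layer of the nested trade structure, so the iteration is the multiregion analogue of summing the geometric series in the one-region case.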

Figure 5. Nested production structure with energy and materials in the multiregion case.

In the case of production (2.4) with two intermediate goods, energy and materials (Fig. 5), the vector $Z$ has the form

$$Z = (M_1, E_1, \ldots, M_{N_R}, E_{N_R})^T$$

and the components of equation (2.14) have the block structure

$$B = (B_{ij}), \qquad B_{ij} = \begin{pmatrix} b_{ij}^M & 0 \\ 0 & b_{ij}^E \end{pmatrix},$$

where $b_{ij}^M$ and $b_{ij}^E$ are the Armington ratios for materials and energy, respectively. The matrices $A^X = \mathrm{diag}\,(A_j^X)$ and $A = \mathrm{diag}\,(A_i)$ consist of the input-output ratios for materials and energy: the block $A_j^X$ collects the ratios $A_M^{X_j}$ and $A_E^{X_j}$, and the block $A_i$ collects the ratios for the production of materials and energy in region $i$.

In the PET model the energy composite is the aggregate (2.6) of $N_E$ energy types. In this case, $Z$ is a vector of dimension $(N_E + 1) N_R$ ($N_E$ energy types plus materials per region). The matrix elements $b_{ij}^E$ and $A_E^{E_i}$ become diagonal matrices and $A_j^X$ has $N_E + 1$ elements.

2.4. Market equilibrium

Aggregate supply of capital $K^{AS}$ and labor $L^{AS}$ are determined by the sums over all regions of $n_t k_t$ and $n_t l_t$, respectively. Aggregate demand for capital and labor are

$$K^{AD} = \sum_{j=1}^{N_X} A_K^{X_j} X_j + A_K^{GP}\, GP, \qquad L^{AD} = \sum_{j=1}^{N_X} A_L^{X_j} X_j + A_L^{GP}\, GP,$$

where $GP$ is government purchases and $A_K^{GP}$ and $A_L^{GP}$ are the government sector input-output ratios of capital and labor, respectively.

An equilibrium is defined by the market clearing conditions. That means aggregate demand is equal to aggregate supply (in each region and at each time $t$) for the factors of production and final goods,

$$K^{AD} = K^{AS}, \qquad L^{AD} = L^{AS}, \qquad X^{AD} = X^{AS}.$$

Here $X^{AD}$ is equal to the sums of $n_t c_t$ (or $n_t x_t$) over all regions, and $X^{AS}$ is the production output. For the government sector, we require that revenues are equal to expenditures,

$$G^{REV} = G^{EXP}.$$

The optimality conditions for consumers and producers together with the market clearing conditions form a system of nonlinear equations that needs to be solved. This system of equations depends on consumer quantities, i.e. capital, investment, consumption and government transfers, on the one hand and on production costs (prices) and input-output ratios on the other.

3. Parallel algorithm

Since all other quantities can be obtained explicitly if we know capital K and prices P, the system of equations describing the general equilibrium can be written as

f (K,P) =0.

The block structure of the system and parallel algorithm for solving such systems were described in detail in our paper [2]. Here we briefly recall the main ideas before describing the implementation of the parallel algorithm.

input : K^0, P^0
output: K, P

 1  marker: if diff > tol and it < numlt then
 2      omp parallel default(private)
 3      omp shared(dyn arrays, stor arrays)
 4      omp copyin(parameters)
 5      omp for
 6      for t ← 0 to T do
 7          Calculate prices P for time moment t (inner loop);
 8          Update dyn arrays;
 9      end
10      omp end parallel
11      Update stor arrays;
12      it ← (it + 1);
13      diff ← update(K, P);
14  end
15  goto marker

Figure 6. The OpenMP implementation.

The Fair-Taylor method [3] works as follows. Let $K^s$ be the $s$th iterate of capital. To obtain the next iterate of prices $P^{s+1}$ it is necessary to solve the system

$$f(K^s, P) = 0 \tag{3.1}$$

with respect to $P$. To obtain the next iterate of capital $K^{s+2}$ it is necessary to solve the system

$$f(K, P^{s+1}) = 0 \tag{3.2}$$

with respect to $K$, and so on.
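The alternation between the price solve (3.1) and the capital solve (3.2) can be sketched on a toy problem. The two scalar equations below are hypothetical stand-ins chosen so that each sub-problem has an explicit solution and the alternation is a contraction; they are not the PET equations.

```python
def solve_prices(k):
    """Toy stand-in for (3.1): solve P * (1 + K) - 1 = 0 for P with K fixed."""
    return 1.0 / (1.0 + k)


def solve_capital(p):
    """Toy stand-in for (3.2): solve K - 1 - 0.5 * P = 0 for K with P fixed."""
    return 1.0 + 0.5 * p


def fair_taylor(k0, tol=1e-12, num_it=200):
    """Alternate the two solves until capital stops changing."""
    k = k0
    for it in range(1, num_it + 1):
        p = solve_prices(k)        # next iterate of prices
        k_new = solve_capital(p)   # next iterate of capital
        diff = abs(k_new - k)
        k = k_new
        if diff <= tol:
            return k, p, it
    raise RuntimeError("no convergence within num_it iterations")


K_eq, P_eq, iterations = fair_taylor(k0=1.0)
```

For this toy system the fixed point satisfies K^2 = 1.5, so the result can be compared against the exact value. In the model itself each solve is a nonlinear system, which is why Newton-type inner solvers are needed.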

The part of the algorithm that calculates the next iterate of capital (3.2) is implemented as the outer loop. The part that calculates the next iterate of prices (3.1) is implemented as the inner loop. Blocks of the system (3.1) that correspond to different time periods can be calculated in parallel. To improve convergence, the solution of each block is broken down into two nested loops: the NewtonA-loop for factor prices ($P_K$ and $P_L$ in $N_R$ regions) and the NewtonB-loop for all other prices (goods prices in each region and export prices). The NewtonA-loop has a smaller dimension, therefore we can use the classical Newton method with backtracking as a solver. For the NewtonB-loop we use a more advanced Krylov subspace method, NITSOL (see, e.g., [6,11]), because it has a much larger dimension and is called more often, namely to calculate the Jacobian for the NewtonA-loop.

The algorithm is described in Fig. 6. The inputs of the algorithm are the initial approximations of capital $K^0$ and prices $P^0$, and the outputs are the equilibrium capital $K$ and prices $P$. The general parameters are the tolerance tol and the number of iterations numlt. Parameter $T$ is the time horizon of the model.

There are two types of arrays for storing and processing the economic data: dyn and stor. The first group of arrays corresponds to data at the current time and is used by the inner loop (Fig. 6, lines 6-9); the second is used for storing data over the iterations of the algorithm (outer loop). The variable it is the iteration index and diff is the target error for the outer loop. Lines 8 and 11 in Fig. 6 correspond to the implementation of the economic equations, and line 13 computes the error using the current iterates of capital and prices.

In the OpenMP version, the time steps of the inner loop are performed in the parallel region. All dyn and stor arrays are shared. The arrays with parameters are distributed using the copyin clause (Fig. 6, line 4).
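The control flow of Fig. 6 can be mimicked in Python with a thread pool, as sketched below. This is an illustration of the structure only, not the authors' Fortran/OpenMP code: the per-period price equation and the capital update are hypothetical toy stand-ins, and Python threads do not give the speedup of OpenMP threads for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def solve_year_block(k_t):
    """Inner loop for one time step: Newton's method for p**3 + p = 1/(1 + k_t)."""
    rhs = 1.0 / (1.0 + k_t)
    p = 1.0
    for _ in range(100):
        step = (p ** 3 + p - rhs) / (3.0 * p ** 2 + 1.0)
        p -= step
        if abs(step) < 1e-14:
            break
    return p


def update_capital(prices, k0):
    """Toy capital path, computed sequentially in time from the new prices."""
    k = [k0]
    for p_t in prices[:-1]:
        k.append(0.9 * k[-1] + 0.1 * (1.0 + p_t))
    return k


def solve(T, k0, workers, tol=1e-10, num_it=100):
    """Outer loop over capital with year-blocks of prices solved in parallel."""
    capital = [k0] * (T + 1)
    diff, it = float("inf"), 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while diff > tol and it < num_it:
            # Year-blocks are independent given capital, so they can run in parallel.
            prices = list(pool.map(solve_year_block, capital))
            new_capital = update_capital(prices, k0)
            diff = max(abs(a - b) for a, b in zip(new_capital, capital))
            capital = new_capital
            it += 1
    return capital, prices, it


capital_serial, prices_serial, _ = solve(T=20, k0=1.0, workers=1)
capital_par, prices_par, n_it = solve(T=20, k0=1.0, workers=4)
```

Because each year-block only reads its own capital value and returns its own price, no locking is needed; this mirrors the role of the shared arrays in the OpenMP version, where each thread writes to the slice for its own time step.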

4. Results and discussion

For calculations we use the PET model with $N_R = 9$ regions and time horizon $T = 105$ years. The total number of production sectors is $N_X = 10$ in each region. As inputs the PET model uses national production and household survey data for the base year and long-term population and technical change projections over the whole time period. We use three sets of input data that correspond to socioeconomic scenarios from the Shared Socioeconomic Pathways (SSP) database (for the implementation of SSPs in the PET model, see [8]).

Figure 7. Speedup of the model runs for different SSPs (SSP2, SSP3 and SSP5) as a function of the number of cores (threads): (a) Lomonosov (Intel Xeon X5670 2.93 GHz, 12 Gb); (b) Yellowstone (Intel Xeon E5-2670 2.6 GHz, 64 Gb).

Figure 8. Timing of the outer loop iterations for the SSP3 obtained at the Lomonosov supercomputer: (a) regular node (1, 2, 4, 8 and 12 threads); (b) SMP node (1, 2, 4, 8, 16, 32, 64 and 128 threads).

We use two supercomputer systems for the model runs. The first one is the Lomonosov supercomputer [12]. We use two types of nodes at the Lomonosov: a regular node with 12 cores (Intel Xeon X5670 2.93 GHz, 1 Gb/core) and a node with 128 cores (16 Gb/core) with shared memory, the Symmetric Multiprocessing (SMP) node. The second system is the Yellowstone supercomputer [13]. At the Yellowstone we use a regular node with 16 cores (Intel Xeon E5-2670 2.6 GHz, 4 Gb/core), up to 32 cores with hyperthreading. The model is implemented in Fortran. For the algorithm implementation we use BLAS [14], LAPACK [15] and the Fortran implementation of NITSOL [6]. For compiling the libraries and our code we use the Intel Fortran Compiler 15 with the optimization flag -O3 and standard makefile techniques for building the project.

Figure 9. Timing of year-blocks in the inner loop for the SSP3 obtained at the Lomonosov supercomputer: (a) regular node (4 threads); (b) SMP node (16 threads).

To study strong scalability of the parallel algorithm we need to increase the computing power while keeping the total problem size constant. This is achieved by running the model with the same initial approximations $K^0$ and $P^0$ and the same set of numerical parameters (for each SSP) with an increasing number of threads. The results show that the speedup of the parallel algorithm grows almost linearly on both supercomputers as the number of threads grows from 1 to about 12-16 (Fig. 7). Overall, we obtain a speedup of about 10 times for a regular node. With a further increase in the number of threads the speedup slows down and saturates (Table 1). Once the number of threads becomes greater than the time horizon of the model, each thread solves one year-block of the inner loop and no more speedup is possible with this algorithm. From Table 1 we see that the maximum speedup is about 22 times, but with 64 threads we already get very close to it.

Fig. 8 shows that, for the number of threads from about one to ten, there is a visible monotone decrease in the timing of the outer loop as the algorithm converges (especially after the 50th iteration). This effect can be explained if we look at the timings of different year-blocks of the inner loop (Fig. 9). As the number of iterations increases, the algorithm stops recomputing the Jacobian in the NewtonA-loop and reuses the one from the previous iteration. The number of these "fast" year-blocks of the inner loop increases from iteration to iteration. For the 100th iteration of the outer loop the calculation times of more than the first 60 year-blocks are close to zero. For the 200th iteration the "fast" year-blocks span almost the whole time horizon $T = 105$.

Table 1. Speedup of the model runs for the SSP3 at the SMP node of the Lomonosov supercomputer.

Threads   1     2     4     8     16    32    64    128
Speedup   1     1.8   3.3   7.3   11    17    21.5  22
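The saturation in Table 1 is easier to see in terms of parallel efficiency, i.e. speedup divided by the number of threads. The short sketch below computes it from the values of Table 1; the efficiency column is derived here for illustration and is not reported in the paper.

```python
# Values copied from Table 1 (SSP3, SMP node of the Lomonosov supercomputer).
threads = [1, 2, 4, 8, 16, 32, 64, 128]
speedup = [1.0, 1.8, 3.3, 7.3, 11.0, 17.0, 21.5, 22.0]

# Parallel efficiency: fraction of the ideal linear speedup actually achieved.
efficiency = [s / p for s, p in zip(speedup, threads)]

# Gain in speedup obtained from each doubling of the thread count.
doubling_gain = [b / a for a, b in zip(speedup, speedup[1:])]

for p, s, e in zip(threads, speedup, efficiency):
    print(f"{p:4d} threads: speedup {s:5.1f}, efficiency {e:5.2f}")
```

The last doubling, from 64 to 128 threads, changes the speedup only from 21.5 to 22, which matches the observation that 64 threads already come very close to the maximum.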

From Fig. 8 we also see that the timings of the outer loops decrease uniformly as the number of threads increases. But as the number of threads increases above ten, the timings of the outer loops level out. The reason is that the timing of the inner loop in the parallel algorithm cannot get smaller than the timing of the slowest year-block. From Fig. 9 we see that the number of "fast" year-blocks increases as the algorithm converges, but there are always some "slow" year-blocks close to the end period $T$.
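The lower bound set by the slowest year-block can be illustrated with a simple timing model. The sketch below assumes a static split of the year-blocks into contiguous chunks of equal size (one possible OpenMP schedule, assumed here, not stated in the paper) and uses hypothetical block timings with the slow blocks placed near the end period.

```python
import math


def inner_loop_time(block_times, threads):
    """Wall time of one inner loop under a static split into contiguous chunks."""
    n = len(block_times)
    chunk = math.ceil(n / threads)
    return max(sum(block_times[i:i + chunk]) for i in range(0, n, chunk))


# Hypothetical timings: 90 "fast" year-blocks followed by 16 "slow" ones
# near the end period, T + 1 = 106 blocks in total.
block_times = [0.01] * 90 + [1.0] * 16

serial_time = inner_loop_time(block_times, 1)
model_speedup = {p: serial_time / inner_loop_time(block_times, p)
                 for p in (1, 2, 4, 8, 16, 32, 64, 128)}
```

In this model the wall time never drops below the time of the slowest block, and contiguous chunks that contain only slow blocks dominate the loop. A dynamic schedule would balance such a workload better, at the cost of scheduling overhead.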

Acknowledgments

We are grateful to R. Loft and M. Weitzel for useful discussions of the results.

REFERENCES

1. Kelley C. Iterative Methods for Linear and Nonlinear Equations. SIAM: Philadelphia, 1995.

2. Melnikov N., Gruzdev A., Dalton M. and O'Neill B. Parallel algorithm for solving large-scale dynamic general equilibrium models // Russian Supercomputing Days, Moscow, 2015. P. 84-95.

3. Fair R., Taylor J. Solution and maximum likelihood estimation of dynamic nonlinear rational expectations models // Econometrica, 1983. Vol. 51. P. 1169-1185.

4. Dalton M., O'Neill B., Prskawetz A., Jiang L. and Pitkin J. Population aging and future carbon emissions in the United States // Energy Economics, 2008. Vol. 30. P. 642-675.

5. Melnikov N., O'Neill B. and Dalton M. Accounting for the household heterogeneity in dynamic general equilibrium models // Energy Economics, 2012. Vol. 34. P. 1475-1483.

6. Pernice M., Walker H. NITSOL: a Newton iterative solver for nonlinear systems // SIAM J. Sci. Comput., 1998. Vol. 19. P. 302-318.

7. O'Neill B., Dalton M., Fuchs R., Jiang L., Pachauri S. and Zigova K. Global demographic trends and future carbon emissions // Proc. Natl. Acad. Sci. U.S.A., 2010. Vol. 107. P. 17521-17526.

8. Ren X., Weitzel M., O'Neill B.C., Lawrence P., Meiyappan P., Levis S., Balistreri E.J. and Dalton M. Avoided economic impacts of climate change on agriculture: integrating a land surface model (CLM) with a global economic model (iPETS) // Climatic Change, 2016. P. 1-15. DOI: 10.1007/s10584-016-1791-1

9. Stokey N., Lucas R. and Prescott E. Recursive Methods in Economic Dynamics. Harvard University Press: Cambridge MA, 1989. 608 p.

10. Armington P. A theory of demand for products distinguished by place of production // IMF Staff Papers, 1969. Vol. 16. P. 170-201.

11. Eisenstat S. and Walker H. Globally convergent inexact Newton methods // SIAM J. Optimization, 1994. Vol. 4. P. 393-422.

12. Sadovnichy V., Tikhonravov A., Voevodin Vl. and Opanasenko V. "Lomonosov": Supercomputing at Moscow State University. In Contemporary High Performance Computing: From Petascale toward Exascale. Chapman & Hall/CRC Computational Science, 2013. P. 283-307.

13. Computational and Information Systems Laboratory, 2012. Yellowstone: IBM iDataPlex System (Climate Simulation Laboratory). Boulder, CO: National Center for Atmospheric Research. http://n2t.net/ark:/85065/d7wd3xhc.

14. Basic Linear Algebra Subprograms. Available from: http://www.netlib.org/blas/ Accessed 10 October 2016.

15. Linear Algebra Package. Available from: http://www.netlib.org/lapack/ Accessed 10 October 2016.
