Научная статья на тему 'Быстрое матричное разложение в параллельной компьютерной алгебре'

Быстрое матричное разложение в параллельной компьютерной алгебре Текст научной статьи по специальности «Физика»

CC BY
189
43
i Надоели баннеры? Вы всегда можете отключить рекламу.
Ключевые слова
БЫСТРЫЕ АЛГОРИТМЫ / МАТРИЧНОЕ РАЗЛОЖЕНИЕ / ПАРАЛЛЕЛЬНЫЕ АЛГОРИТМЫ / КОМПЬЮТЕРНАЯ АЛГЕБРА / FAST ALGORITHMS / MATRIX DECOMPOSITION / PARALLEL ALGORITHMS / COMPUTER ALGEBRA

Аннотация научной статьи по физике, автор научной работы — Малашонок Геннадий Иванович

Предложены новые алгоритмы для вычисления матричного разложения и для вычисления обратной матрицы в случае матриц над произвольными полями. Для коммутативных областей предложен алгоритм вычисления присоединенной матрицы. Эти алгоритмы имеют сложность матричного умножения и не требуют поиска ведущего элемента и выполнения перестановок элементов матриц. Для вырожденных матриц они позволяют находить невырожденный блок наибольшего размера. Предлагаемые алгоритмы не требуют пилотирования и не меняют матричную блочную структуру. Эти алгоритмы позволяют разрабатывать соответствующие параллельные программы.

i Надоели баннеры? Вы всегда можете отключить рекламу.
iНе можете найти то, что вам нужно? Попробуйте сервис подбора литературы.
i Надоели баннеры? Вы всегда можете отключить рекламу.

FAST MATRIX DECOMPOSITION IN PARALLEL COMPUTER ALGEBRA

The new algorithms for finding matrix decomposition and matrix inversion in arbitrary fields are described. For the commutative domains the algorithm for finding adjoint matrices is proposed. These algorithms have the same complexity as matrix multiplication and do not require pivoting. For singular matrices they allow to obtain a nonsingular block of the biggest size. The proposed algorithms are pivot-free, and do not change the matrix block structure. They are suitable for parallel hardware implementation. GRATITUDES: Supported by the Sci. Program Devel. Sci. Potent. High. School, RNP 2.1.1.1853.

Текст научной работы на тему «Быстрое матричное разложение в параллельной компьютерной алгебре»

UDC 519.688

FAST MATRIX DECOMPOSITION IN PARALLEL COMPUTER ALGEBRA

© Gennadi Ivanovich Malaschonok

Tambov State University named after G.E. Derzhavin, Internatsionalnava, 33, Tambov, 392000, Russia, Doctor of Physics and Mathematics, Professor of Mathematical Analysis

Department, e-mail: [email protected]

Key words: fast algorithms; matrix decomposition; parallel algorithms; computer algebra.

The new algorithms for finding matrix decomposition and matrix inversion in arbitrary fields are described. For the commutative domains the algorithm for finding adjoint matrices is proposed. These algorithms have the same complexity as matrix multiplication and do not require pivoting. For singular matrices they allow to obtain a nonsingular block of the biggest size. The proposed algorithms are pivot-free, and do not change the matrix block structure. They are suitable for parallel hardware implementation.

1 Introduction

One of the popular linear algebra method is LU matrix decomposition. A lot of different implementations are well known for this decomposition. But the LU decomposition requires pivoting. With partial pivoting it has the form PA = LU and with full pivoting (Trefethen and Bau) it has the form PAQ = LU, where L and U are the lower- and the upper- triangular matrices, P and Q are the permutation matrices.

Another well known decomposition is the decomposition of inverse matrix, which is based

AC

on the Schur complement trick. Let A = ^ g ^ J be an invertible matrix with invertible

block A, then the inverse matrix A-1 can be written in the form

I -A-1C ph I 0 ph I 0 ph A-1 0

0 I ) V 0 (D - BA-1C)-1 y \-B I ) \ 0 I

In these algorithms it is assumed that principal minors are invertible and the leading elements

are nonzero as in the most of the direct algorithms for matrix inversion. Fast matrix multiplication and fast block matrix inversion were discovered by Strassen [1]. If we used fast matrix multiplication then we get the recursive algorithm for matrix inversion which has the same complexity as the algorithm of matrix multiplication.

In a general case it is necessary to find suitable nonzero elements and to perform permutations of matrix columns or rows. Bunch and Hopkroft suggested such algorithm with full pivoting for matrix inversion [2].

There are known other recursive methods for adjoint and inverse matrix computation, which have the complexity of matrix multiplications ([3]-[5]).

The permutation operation is not a very difficult operation in the case of sequential computations by one processor, but it is a difficult operation in the case of parallel computations, when different blocks of a matrix are disposed in different processors, A matrix decomposition without permutations is needed for parallel computation for construction of efficient and fast computational schemes.

The problem of obtaining pivot-free algorithm was studied in [6], [7] by S.Watt. He presented the algorithm that is based on the following identity for a nonsingular matrix: A-1 = (ATA)-1AT, Here AT is the transposed matrix to A and all principal minors of the matrix ATA are nonzero. This method is useful for making an efficient parallel program with the help of Strassen’s fast decomposition of inverse matrix for dense nonsingular matrix over the field of zero characteristic when field elements are represented by the float numbers. Other parallel matrix algorithms are developed in [8] - [11].

Another form of matrix A decomposition is Bruhat decomposition A = VwU, where V and U are nonsingular upper triangular matrices and w is a matrix of permutation, French mathematician Francois Georges Rene Bruhat was the first who worked with matrix decomposition in this form, Bruhat decomposition plays an important role in algebraic group. The generalized Bruhat decomposition was introduced and developed by D,Grigoriev [1], [12]. He uses the Bruhat decomposition in the form A = VwU, where V and U are upper triangular

A

factors of the Bruhat decomposition of a nonsingular matrix over a field was analyzed in [13] and [14],

This paper is devoted to the construction of matrix decomposition methods in a common case of singular matrices in a field of arbitrary characteristic and in the domain.

For the matrix over the field two decompositions will be constructed which have the forms LAU = E and FA = H, For the matrix over the domain one decompositions will be constructed of the form: GA = dH. Where L and U are lower and upper nonsingular triangular matrices, F and G is a nonsingular matrix, d is a determinant of some nonsingular block of matrix A which size is equal rank (A) and equalit ies rank(E) = rank(H) = rank(A) hold.

A E H UETL

and HTF are inverse matrices for matrix A, and HTG is the adjoint matrix for the matrix A

These algorithms have the same complexity as matrix multiplication and do not require pivoting. For singular matrices they allow to obtain a nonsingular block of the biggest size and the echelon form and the kernel of matrix.

The preliminary variants of these algorithms were developed in [15], [16] and [17].

The rest of the paper is organized as follows. Section 2 provides some necessary background and notations. Section 3 presents the algorithm of LEU decomposition. Section 4 presents the fast matrix decomposition in the field and computation of the inverse matrix. Section 5 presents the fast matrix decomposition in the commutative domain and computation of ajoit matrix. In the Appendix 6 we dispose the proof of the theorem 1.

2 Preliminaries

We introduce some notations that will be used in the following sections.

Let F be a field, Fnx n be an n x n matrix ring over F, Sn be a permutation group of

n elements. Let Pn be a multiplicative semigroup in Fn x n consisting of mat rices A having

exactly rank(A) nonzero entries, all of them equal to 1, We call Pn the permutation semigroup

Sn

The semigroup Dn C Pn is formed by the diagonal matrices. So |Dn| = 2n and the identity matrix I is the identity element in D,n, Sn and Pn.

Let Wij E Pn be a matrix, which has only one nonzero element in the position (i,j).

For an arbitrary matrix E of Pn, which has the rank n — s (s = 0, ..n) we shall denote bv i~E = {i1, ..,is} the ordered set of zero row numbers and je = {j1, ..,js} the ordered set of zero column numbers.

Definition 1 Let E E Pn be the matrix of the rank n — s, let iE = {h,..,is} and jE =

{ j 1 , .. , j s } E

Let us denote by E the matrix

E = £ Wi,,,,,,

k=1,..s

and call it the complimentary matrix for E . For the case s = 0 we put E = 0 .

It is easy to see that VE E Pn : E + E E Sn, and VI E Dn : I + I = I, Therefore the map I 1 = I — ^ and we have II = 0 , We can define the a partial order on Dn :

I < J ^ J — I E Dn. For each matrix E E Pn we shall denote by

IE = EET and JE = ETE

IE, JE E Dn IE

E JE E

Therefore we have several zero identities:

Et1e = I e E = EJe = Je Et = 0. (1)

I, J E Dn Fn n

FJn = {B : B E Fnxn, IBJ = B}.

We call them (I, J) -zero matrix. It is evident that Fnxn = F£Xn, 0 E UI}jFJn and if I2 < I1 and J2 < J^en Fn J C Fn xn. ' ' '

Definition 2 We shall call the factorization of the matrix A E F^^n

A = L-1 EU-1, (2)

LEU E E Pn L U

unitriangular matrices and

L — 1e E F^n, U — Je E FJfx,nj. (3)

LEU

(L,E,U) = LU (A),

Sentence 1 Let (L, E, U) = LU(A) be the LEU -decomposition of matrix A E FJn then

L = 1 e + ILIe , U = Je + JeUJ, E e FJn, (4)

L-1 = 1 e + L-1Ie , U-1 = Je + JeU-1.

E

Dn

E = LAU = (1e + ILIe )IAJ (Je + Je UJ) = I (1e + LIe I )A(Je + JJe U )J.

To prove the property of 'matrix L-1 let us consider the identity

I = L-1L = L-1(1e + LIe ) = L-11e + IIe .

Therefore L-1IE = IE and L-1 = L-1(IE + IE) = IE + L-1IE . The proof of the matrix U-1 property may be obtained similarly.

E IE < I JE < J

We shall call it the property of immersion.

Examples,

For any matrix I E Dn, E E Pnj 0 = a E F the product (al + I)I I is a LEU decompositions of matrix al and the product (aIE + IE)E I is a LEU decompositions of aE

3 Algorithm of LEU decomposition

Theorem 1 For any matrix A E Fnxn of size n = 2k, k ^ 0 a LEU -decomposition exists.

LEU

multiplications for the matrices of size n = 2k-1.

The proof of this theorem is in the Appendix,

Theorem 2 For any matrix A of size s, (s ^ 1), an algorithm of LEU -decomposition which

has the same complexity as matrix 'multiplication exists.

Proof 2 We have proved an existence of LEU -decomposition for matrices of size 2k ,k > 0. Let A E FSxjs be a matrix of size 2k-1 < s < 2k, A' be a matrix of size 2k, which has in the

A

LEU-decomposition of 'matrix A: (L',E',U') = LU(A). According to the Sentence 1 the product L'A'U' = E' has the form

h L 0 ph A 0 ph U 0 p = h E 0 p

0 I 0 0 0 I = 0 0 .

LAU = E LEU A

The total amount of 'matrix multiplications in (7)-(15) is equal to 17 and total amount of recursive calls is equal to 4- We do not consider multiplications of the permutation matrices.

We can compute the decomposition of the second order matrix by means of 5 multiplicative operations. Therefore we get the following recurrent equality for complexity

t(n) = 4t(n/2) + 17M (n/2),t(2) = 5.

Let jand ft be constants, 3 ^ ft > 2, and let M(n) = jn? + o(n?) be the number of

n x n

After summation from n = 2k to 21 we obtain

n? — 2?-2n2 5

17j (402?(k-1) + ... + 4k-22?1) + 4k-25 = 17j-f--------------- + — n2.

} ’ 2? — 4 16

Therefore the complexity of the decomposition is

17 jn?

rs-z -----

iНе можете найти то, что вам нужно? Попробуйте сервис подбора литературы.

2? — 4.

A A-1 = UETL

inversion is written in the expressions (7)-(15). This algorithm has the complexity of matrix multiplications.

4 Algorithm with one-sided decomposition

Let us introduce the set of h-matrices in the ring Fnxn . We say that a matrix H is of h-tvpe if for some matrix E E Pn the two equations H = IEH and E = HJE take place. We call

E the main part of the h-type matrix H. In other words, the sets of zero rows of matrices H

EE

H. In particular, if H is nonsingular, then H E Sn. The nonzero columns of matrix E are

H

Let A E Fnxn be a matrix of rank (A) = r ^ n. We wish to obtain a nonsingular matrix R with the following properties:

(1) RA = H and ^ ^ matrix of h-type with main part E E Pn .

(2) If A = I A, I E Dn, rank I = r, then R = I + IRI.

If A is invertible, then HTH = I and A-1 = HTR. As this algorithm is a generalization

H

RA = H

R1A = A1, R2A1 = A2, R3A2 = A3, R = R3R2R1, A3 = H.

RA A 2n A Ai

A=i 11) ■Ai=( Ai: At )■(i='■23).

BecTHiiK Try, t. 15, Bbin. 4, 2010

4.1 The first substep

Let R11A11 = H11 and H11 be the h-tvpe matrix with the main part En . We denote J11 E'T1E11, I11 = E11E'T1 and put

R1 ( —a21eT1 i ) ( 0 I ^A and R1A a1.

A1

An = H11, A12 = R11A12, A21 = A21(I — E11H11), A22 = A22 — A21El1A-12.

Let us note that A^ = A^1J 11, because in the place of each nonzero column of matrix E11 in

the matrix I — E^Hn the zero column stands,

4.2 The second substep

R12 R21

R12 A 112 = H12 , R21 A 211 = H21 ,

H12 H21 h E12 E21

us denote Bl2 = R21A22 . Consider the diagonal matrices J12 = E't'2E12, J21 = E't'1E21, I12 =

E12E'T2, 121 = E21ET1 and put

R = h I —AhE.\ W I 0 W I — I11A12ET2 0 \h R12 0

R2 V 0 I ) v —B12ET2 I )\ 0 I ) \ 0 R21

(I — I11A12E12 + A11E21B22E12) R12 —A11E21R'

11a12e12 + a11e21b22e12)r12 —a11e21r21 22E12R12 R21

Using the identities H12 = 111H12l I11E12 = 0, E't'2I11 = 0, and the fact, that R12 = I11 +

111R12111 and Hn = I11H11 R12H11 = H11, E:T2R12H11 = ^ E^T2R12A^2 = ElT2H12,

we get the block

A22 = R12A112 + ( — I11A12 + Al11E21B22)E,T2H12 — AUE21B22 = H12 + I11Al12 — I11A12 Ei2H12 + A11E21B22(ElT2H12 — I) = H12 + (I11A12 — AUE21B22)(I — E12H12) , A2

A2n = Hn(I — ET1H21),

A12 = H12 + (I11A12 — A11E21B22)(I — E12H12) , A221 = H21,

A22 = B\2(I — ET2R12A12) = Bl22(I — ET2H12) ■ _ _

Let us note that these blocks have the properties A^1 = A\1J21, A^1 = A^1J 1, A^2 — E12 = (Aj2 — E12)J 12, A22 = A^2J 12. We use them in the following section.

4.3 The third substep

R22

R22A^2 = H22,

where H22 is the h -type matrix with the main part E22 .

Let us consider the diagonal matrices J22 = E't'2E22, I22 = E22E't'2 and put

R = h i ~a2beT2 )h i 0 )h i 0 ) = h i —A12ET2R22

R3 v 0 1 / V 0 I — ET2 ) ^ 0 R22 ) V 0 (I — hA^R-a

Then we obtain the matrix A3 = R3A2 with the blocks:

A11 = A11, A%1 = A32 = A12(I — E22H22) , A32 = H22 + I21A22(I — E22H22) that have the properties A^ = A^J22 1 A^^2 — E22 = (A22 — E22~)J22 •

4.4 The result

We obtain the matrix

R = R R R = I L + FG —FR21

R r3r2 R1 ( —MG MR21

Here we use the notation

L = (I — l11A1l2ErT2)R12R11 M = (I — l2lA22ET22)R22 ,

F =^{^1 + AI2ET2R22)

11

G = R21(A22Ef2R12 + A21E'T1) R11 ■

A112 = R11A12,

A21 = A21(I — E1T1H11),

A22 = A22 — A21E\1 Al2,

A22 = R21A12(I — ET2H12),

A12 = H12 + I11A12(I — E12H12) — H11E21A22 ■

AM we obtain the equation RA = H with h-type matrix H which has the main part E = E11 E12 E21 E22

4.5 The important particular cases

We can outline two important particular cases.

In the first case the matrix A has an invertible block A11. In this case we obtain H11 =

H22 = R12 = R21 = I, H12 = H21 = E12 = E21 = 0 , A22 = A22 — A21A1_2, A^2 = R21A22 7

A22 = A12 = R11A12; L = R11 M = R22 F = A12R22 i G = A21R11

r = { R11 + R11A12R22A21R11 —R11A12R22 \

V —R22A21R11 R22 )

More over, if matrix A is invertible, then R = A-1.

In the second case the matrix A has zero block An = 0 and two invertible blocks A21 and Ai2 , In this case we obtain Hn = H22 = 0,Rn = R22 = H12 = H21 = I, Aj2 = A12,

A21 = A21, A22 = A22, A22 = 0 , Ai2 = H12 , L = R12R11 , M = 1 > F = 0 ; G = R21A22R12 ,

r = i R12 0

- R21A22R12 R21

5 Matrix decomposition in the commutative domain R

The mapping Aext : Rnxn x (R\0) ^ (Rnxn)3 x (R\0)

(A,S,E,d) = Aext(M,do),

with n = 2k we call the extended adjoint mapping of the couple ( M, d0) if it recursive defined as follows.

For M = 0: Aext(M, do) = (doI, 0, 0,do).

For k = 0 and M = a = 0 : Aext(a, d0) = (d0, a, a, a).

For k > 0 and M = 0 we have to divide matrix M into four equal blocks M = (Mj),

i, j E {1, 2}. Let

^1^ S11, E11, d11) = d0),

we denote

Let

M^ — AnM12/d0, M^ — M21Yn/d0, M^ — (M22 — M21E21M112)/d0.

(A12, S12, E12, d12) = Aeict(111 ^^12, d11^

(A21, S21, E21, d21) = Aext(M211,d11).

Denote M22 = A21M22Y12/(d11 )2,ds = d21d12/d11

Let (A22, S22, E22, d22) = Aext(121M22, ds).

Denote

M121

S11Y21/d11, M22 = S12d21 + (111M112 d11d21 — S11E21A21M22)Y12/(d11)

M32 = ^^i2Y12/(dsd11) , M22 = S22 + 121M22Y12/ds, L = (Id11 — 111Ml2El2)A12A11d22/(d11)2,

iНе можете найти то, что вам нужно? Попробуйте сервис подбора литературы.

Q = (Id12d21 — 121M22E22d11)A22/ds,

F = S11E2T1 d22 + M122E22A22/ds,

G = A21(M212E22A12d0 + -M21El1d12d11) A11/(d11d0) ,

(L + FG/(d11 d21)/d12 — FA21/(d11d21)

— QG/(d11d21d12 ) QA21/ (d11d21)

2

S = f Mnd22/d21 Ml2 A E = /" E11 E12 p

y S21d22/d21 M22 / , V E21 E22 y

Then

(A, ^ E d22) = d0).

We use here the notations

A? = EijE2j , Jij = Eij , Yj = dijI — E2jSij , ^ j E {1, 2}.

Theorem 3 For arbitrary matrix M E Rnxn the extended adjoint mapping (A, S, E, d) = Aext(M, 1) definds the extended adjoint nonsingular matrix A, the echelon matrix S and the matrix E, which have the property: AM = S and dE = SJE .

The proof is based on the algorithm with one-sided decomposition of the previous section. All division operations are based on the determinant identities [4] and give as a result the quotients which are the elements of the domain R or matrices over the domain R,

S M S1 = El S El AM = El S

and S1JE = dJE , Let us write the matrix S1 in the form S1 = S0 + dJ, It is easy to see that

S02 = 0 , Therefore (S0 + dJ)(S0 — dJ) = 0 , rank(S0 — d,7) = rank(,7), rank(S1) = rank(J),

so rank(S1) + rank(S0 — dJ) = n, is why the columns of the matrix S0 + dJ generate S1 M

M

kern(M) = span(ET AM — dI).

6 Conclusion

The algorithms for finding matrix decomposition and matrix inversion are described. These algorithms have the same complexity as matrix multiplication and do not require pivoting. For singular matrices they allow to obtain a nonsingular block of the biggest size. These algorithms may be used in any field, including real and complex numbers, finite fields and their extensions. The proposed algorithms are pivot-free, and do not change the matrix block structure. They are suitable for parallel hardware implementation,

7 Appendix

In this appendix we put the proof of the theorem 1,

Proof of Theorem 1

For the matrix of size 1 x 1, when k = 0 , we can write the following LEU decompositions

LU(0) = (1,0,1) and LU(a) = (a-1,1,1), if a = 0.

Let us assume that for any matrix of size n we can write a LEU decomposition and let us given matrix A E F72,}x2n has the size 2n , We shall construct a LEU decomposition of matrix A ’

First of all we shall divide the matrices A, / > J and a desired matrix E into four equal blocks:

A

A11 A12 A21 A22

,/ = diag(/1, /2), J = diag( J1, J2),E

E11 E12 E21 E22

and denote

Let

denote the matrices

/ij = EjEij, Jj = EijEij Vi,j G {l, 2}.

(L11, E11, U11) = LU(An), Q = L11A12, B = A21U11,

A1

A21

BJ 11, A12 = 111Q, A22 = A22 — BE11Q

Let

(L12, E12, U12) = LU(Ak) and (L21, E21, U21) = LU(A11)

denote the matrices Let us put and denote

G — L21A22U12, A22 — /21GJ12.

(L22, E22, U22) = LU(A22),

?T\ \r—(TT T?T n~f 1 pT,

W — (GEj^L^ + L21BET1), V — (U21E21GJ12 + Ej^QU^),

U11U21 -U11VU22

0 U12U22

We have to prove that

(L, E, U) = LU(A).

(5)

(6)

(7)

(8) (9)

l0)

ll)

l2)

13)

14)

15)

As far as L11,L12, L21, L22 are low triangular nonsingular matrices and Un,U12, U21, U22 are upper unitriangular matrices we can see in (10) that the matrix L is a low triangular

U

Let us show that E G f*2ra ■ As far as E11, E12, E21, E22 G and A11 — /1A11J1, A1

21

BJn , A12 = /nQ, A22 = /21GJ12 and due to the Sentence 1 we obtain E11 = /11E11J11,

E21 = E21J 1h E12 = 111E12 ; E22 = 121E22J12-

E

different rows and columns of the matrix E .So E G P2n, and next identities hold

1 21 1 E12 E11 J21 E11 1 21 1 E12 J11 = J11J21 = 0, (16)

E11 1 12 1 E1 1E 12 1 1 11 2 11E = /12/11 = 0, (17)

E12E22 : 2 J22 2 E1 = J12E22 2 J22 2 J1 = 0, (18)

21 E2 1 22 1 E2 = E22/21 21 E2 2 122 = /22/21 : = 0. (19)

We have to prove, that E = LAU, This equation in block form consists of four block equalities:

The Sentence 1 together with equations A12 = /11 LnA12, A21 = A21U11Jn , A22 = /21L21(A22 — A21U11Ejr1L11A12)U12J 12 give the next properties of L- and U- blocks:

which follows from the definition of the block V in (13), (24), (16) and (6), the equality

which follows from the definition of the block W in (13), (23), (17) and (6),

1, The first equality of (20) follows from (27), (23) and (24),

2,The right part of the second equality of (20) takes the form L12(I — /11)QU12U22 due to (8), (27) and (28), To prove the second equality we use the definition of the blocks B and A12 in (8) and (9), then the second equality in (27) and identity (25): L12(I — /11)QU12U22 = L12A12U12U22 = E12 U22 = E12.

3, The right part of the third equality of (20) takes the form L22L21B(I — J11)U21 due to

B

E11 = L12L11A11U11U21;

E12 = L12L11 (A12U12 — A11U11V )U22i E21 = L22(L21 A21 — WL11A11)U11U211

E21

E22

(20)

E22 — L22 ((L21A22 — WL11A12) U12 — (L21A21 — WL11A11 )UnV) U22.

Therefore we have to prove these block equalities.

Let us note, that from the identity A11 = /1A11J1 and Sentence 1 we get

L11 = 111 + /1L11/11, U11 = J11 + J11U12 J1.

(21)

(22)

The following identities can be easy checked now

L12 E11 = E11, L12/11 = /11,

(23)

E12 U22 = E12, J12 U22 = J12,

E11U21 = E11, J11U21 = J1b

L22 E21 = E21, L22/21 = /21.

(24)

iНе можете найти то, что вам нужно? Попробуйте сервис подбора литературы.

(25)

(26)

We shall use the following equalities,

L11A11U11 = E11, L12A12U12 = E12, L21A21U21 = E21 ,L22A22U22 = E22, (27)

which follows from (7),(10) and (12), the equality

E11V = /11QU12,

(28)

WE11 — L21BJ11

(29)

A121

L22L21BJ11U21 = L22L21A21U21 = L22E21 = E21 ,

4, The identity

E12L12 = E12 L12(/11 + 111) = E12L12/11 (30)

follows from (23) and (17).

We have to cheek that (L21A22 — WL11A12)U12 = (L21A22 — (GEj^L^ + L21BEjr1)Q)U12 =

L21(A22 — BE11Q)U12 — GE12L12QU12 = L21A22U12 — GE12 L12111QU12 = G — GEi2 L12A12U12 = G — GEji2E12 = GJ12 , using the definitions of the blocks W in (13), ^1^d A12 in (9), the identity (28), the second equality in (27) and the definition (6),

We have to check that — (L21A21 — WL11A11)U11V = — (L21A21U11 — WE11 )V = (—L21B + L21BJ11) V = —L21BJ11V = —L21BJ 11(U21E21GJ 12 + EjrLQU12) = — L21A21U21E21GJ12 = —121GJ12 , using the first equality in (27), the identity (29), the definitions of the blocks V in (13), (1), then the third equality in (27) and definition (6),

To prove the forth equality we have to substitute obtained expressions to the right part of the fourth equality:

L22 (GJ 12 — 121GJ12) U22 = L22121GJ12U22 = L22A22U22 = E22.

For the completion of the proving of this theorem we have to demonstrate the special form

of the matrices U and L : L — 1E e F/,/E and U — JE e ,j ,

The matrix L is invertible and < 1 therefore we have to prove that L = 1E + 1L1E, where = diag(111 + 112,121 + ^2), 1 e = diag(111112,121122), 1 = diag(1l,12).

L

L12L11 = 11L12L11 (111 + 112) + 1ll112, 0 = A0^! + 122

— L22WL11 = —12 L22WLll(/ll + 112), L22L21 = 12L22L21 (121 + 122) + 121122.

To prove the first block equalities we have to multiply its left part by the unit matrix in the form I = (11 +11) from the left side and by the unit matrix in the form I = (111 +112) +111112 from the left side. Then we use the following identities to obtain in the left part the same expression as in the right part: L11111 = 1 l1 , L12112 = 112 , 11L12L11 = 11, 11(111 + 112) = 0, The same idea may be used for proving the last block equality, but we must use other forms of unit matrix: I = (12 + 12), I = (121 + 122) + 121122 ■

The second block equality is evident.

Let us prove the third block equality. We have to multiply the left part of the third block equality by the unit matrix in the form I = (12 +12) from the left side and by the unit matrix in the form I = (111 + 112) + 111112 from the right side,

W

W = (L21 (A22 — A21U11E11 Q)U12El2L12 + L21A21 U11En ).

We have to use in the left part the equations 12L22 = 12 , 12L21 = 12 , 12A22 = 0 , 12A21 = 0 ,

and L11111 = 111, L12112 = 112 j E,l'2/12 = ^ Ell/11 = 0 ■

The property of the matrix U: U — JE e ,j may be proved in the same wav as the L

References

1. Grigoriev D. Analogy of Bruhat decomposition for the closure of a cone of Chevalley group of a classical serie // Soviet Math, Dokl.1981. V. 23, N, 2, P, 393-397,

2. Bunch J., Hopkroft J. Triangular factorization and inversion by fast matrix multiplication

11 Mat. Comp. 1974. V. 28. P. 231-236.

3. Malaschonok G.I. Effective Matrix Methods in Commutative Domains j j Formal Power Series and Algebraic Combinatorics. Berlin: Springer, 2000. P. 506-517.

4. Malaschonok G.I. Matrix computational methods in commutative rings. Tambov: Tambov State University, 2002.

5. Akritas A., Malaschonok G. Computation of Adjoint Matrix j j Fourth International Workshop on Computer Algebra Systems and Applications (CASA 2006), LNCS 3992. Berlin: Springer, 2006. P. 486-489.

6. Watt S.M. Pivot-Free Block Matrix Inversion. Maple Conference 2006, July 23-26, Waterloo, Canada. 2006. URL: http://www.csd.uwo.ca/ watt/pub/reprints/2006-me-bminv-poster.pdf.

7. Watt S.M. Pivot-Free Block Matrix Inversion j j Proc 8th International Symposium on Symbolic and Numeric Algorithms in Symbolic Computation (SYNASC), IEEE Computer Society. 2006. P. 151-155.

8. Eberly W. Efficient parallel independent subsets and matrix factorization j j 3rd IEEE Symposium on Parallel and Distributed Processing. Dallas, USA, 1991. P. 204-211.

9. Kaltofen E., Pan V. Processor-efficient parallel solution of linear systems over an abstract field // 3rd Annual ACM Symposium on Parallel Algorithms and Architectures. ACM Press, 1991. P. 180-191.

10. Kaltofen E., Pan V. Processor-efficient parallel solution of linear systems II: The general case // 33rd IEEE Symposium on Foundations of Computer Science. Pittsburgh, USA, 1992. P. 714-723.

11. Kaltofen E., Pan V. Parallel solution of Toeplitz and Toeplitz-like linear systems over fields of small positive characteristic j j PASCO 94: First International Symposium on Parallel Symbolic Computation, World Scientific Publishing, 1994. P. 225-233.

12. Grigoriev D. Additive complexity in directed computations j j Theoretical Computer Science. 1982. V. 19. P. 39-67.

13. Kolotilina L. Yu. Sparsity of Bruhat decomposition factors of nonsingular matrices j j Notes of Scientific Seminars of LOMI. 1992. V. 202. P. 5-17.

14. Kolotilina L.Yiu, Yeremin A. Yu. Bruhat decomposition and solution of linear algebraic systems with sparse matrices j j Sov. J. Numer.Anal. and Math. Model. 1987. V. 2. P. 421-436.

15. Malaschonok G.I. Parallel Algorithms of Computer Algebra j j Materials of the conference dedicated for the 75 years of the Mathematical and Physical Dep. of Tambov State University. (November 22-24, 2005). Tambov: TSU, 2005. P. 44-56.

16. Malaschonok G.I, Zuyev M.S. Generalized algorithm for computing of inverse matrix j j 11-th conference "Derzhavinskie Chtenia", February 2-6, 2006. Tambov: TSU, 2006. P. 58-62.

17. Malaschonok G.I. On computation of kernel of operator acting in a module j j Tambov University Reports. Natural and Technical Sciences. 2008. V. 13. Issue 1. P. 129-131.

18. Strassen V. Gaussian Elimination is not optimal j j Numerische Mathematik. 1969. V. 13. P. 354-356.

GRATITUDES: Supported by the Sci. Program Devel, Sci. Potent. High. School, RNP 2.1.1.1853.

Accepted for edition 7.06.2010.

БЫСТРОЕ МАТРИЧНОЕ РАЗЛОЖЕНИЕ В ПАРАЛЛЕЛЬНОЙ КОМПЬЮТЕРНОЙ АЛГЕБРЕ

© Геннадий Иванович Малашонок

Тамбовский государственный университет им. Г.Р. Державина, Интернациональная, 33, Тамбов, 392000, Россия, доктор физико-математических наук, профессор кафедры математического анализа, e-mail: [email protected]

Ключевые слова: быстрые алгоритмы; матричное разложение; параллельные алгоритмы; компьютерная алгебра.

Предложены новые алгоритмы для вычисления матричного разложения и для вычисления обратной матрицы в случае матриц над произвольными полями. Для коммутативных областей предложен алгоритм вычисления присоединенной матрицы. Эти алгоритмы имеют сложность матричного умножения и не требуют поиска ведущего элемента и выполнения перестановок элементов матриц. Для вырожденных матриц они позволяют находить невырожденный блок наибольшего размера. Предлагаемые алгоритмы не требуют пилотирования и не меняют матричную блочную структуру. Эти алгоритмы позволяют разрабатывать соответствующие параллельные программы.

i Надоели баннеры? Вы всегда можете отключить рекламу.