
Appendix A
Summary of Matrix Algebra

A.1  Vector and Matrix Multiplication

We consider a vector $v$ of length $J$ (in an abstract vector space of dimension $J$) to be an ordered sequence of $J$ numbers. The vector can be displayed either as a column
$$ v = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_J \end{pmatrix} \tag{A.1} $$
or as a row, which we regard as the transpose, denoted $^T$, of the column vector:
$$ v^T = (v_1, v_2, \ldots, v_J). \tag{A.2} $$
Vectors of the same dimension can be added together, so that the $j$th entry of $u+v$ is $u_j + v_j$.
The scalar product of two vectors $u$, $v$ is indicated in vector notation by a dot, but in matrix notation the dot is usually omitted. Instead we write it
$$ u^T v = \sum_{j=1}^{J} u_j v_j. \tag{A.3} $$
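As a concrete illustration of eq. (A.3), the following sketch (assuming Python with NumPy is available; the vector values are arbitrary) forms the scalar product both by the explicit sum and with the built-in dot product:

```python
import numpy as np

# Two arbitrary illustrative vectors of the same length J = 3.
u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, -1.0, 0.5])

# Explicit sum of eq. (A.3): u^T v = sum_j u_j v_j.
explicit = sum(u[j] * v[j] for j in range(len(u)))

# NumPy's built-in scalar (dot) product.
builtin = u @ v

print(explicit, builtin)  # both give 3.5
```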
If we have a set of $K$ column vectors $v_k$, for $k = 1, \ldots, K$, the $j$th element of the $k$th vector can be written $V_{jk}$, and the vectors can be arrayed compactly one after the other as
$$ V = \begin{pmatrix} V_{11} & V_{12} & \cdots & V_{1K} \\ V_{21} & V_{22} & \cdots & V_{2K} \\ \vdots & \vdots & & \vdots \\ V_{J1} & V_{J2} & \cdots & V_{JK} \end{pmatrix}. \tag{A.4} $$
This is a matrix. We can consider matrix multiplication to be a generalization of the scalar product. Premultiplying a $J \times K$ matrix $V$ by a row vector $u^T$ of length $J$ gives a new row vector of length $K$:
$$ u^T V = \left( \sum_{j=1}^{J} u_j V_{j1}, \; \sum_{j=1}^{J} u_j V_{j2}, \; \ldots, \; \sum_{j=1}^{J} u_j V_{jK} \right). \tag{A.5} $$
If we further have a set of $M$ row vectors, we can display them as a matrix
$$ U = \begin{pmatrix} U_{11} & U_{12} & \cdots & U_{1J} \\ U_{21} & U_{22} & \cdots & U_{2J} \\ \vdots & \vdots & & \vdots \\ U_{M1} & U_{M2} & \cdots & U_{MJ} \end{pmatrix} \tag{A.6} $$
(dispensing with the transpose notation for brevity and consistency). Multiplication of the matrices $U$ ($M \times J$) and $V$ ($J \times K$) can then be considered to give an $M \times K$ matrix:
$$ UV = \begin{pmatrix} \sum_{j=1}^{J} U_{1j} V_{j1} & \sum_{j=1}^{J} U_{1j} V_{j2} & \cdots & \sum_{j=1}^{J} U_{1j} V_{jK} \\ \sum_{j=1}^{J} U_{2j} V_{j1} & \sum_{j=1}^{J} U_{2j} V_{j2} & \cdots & \sum_{j=1}^{J} U_{2j} V_{jK} \\ \vdots & \vdots & & \vdots \\ \sum_{j=1}^{J} U_{Mj} V_{j1} & \sum_{j=1}^{J} U_{Mj} V_{j2} & \cdots & \sum_{j=1}^{J} U_{Mj} V_{jK} \end{pmatrix}. \tag{A.7} $$
This is the definition of matrix multiplication. A matrix (or vector) can also be multiplied by a single number: a scalar, $\lambda$ (say). The $(jk)$th element of $\lambda V$ is $\lambda V_{jk}$.
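A minimal sketch of the element-by-element definition (A.7), written as explicit loops over the indices and compared against NumPy's built-in matrix product (the matrices here are arbitrary illustrations):

```python
import numpy as np

M, J, K = 2, 3, 4
rng = np.random.default_rng(0)
U = rng.standard_normal((M, J))
V = rng.standard_normal((J, K))

# Element-by-element definition of eq. (A.7):
# (UV)_{mk} = sum_j U_{mj} V_{jk}.
UV = np.zeros((M, K))
for m in range(M):
    for k in range(K):
        for j in range(J):
            UV[m, k] += U[m, j] * V[j, k]

# Should agree with NumPy's built-in matrix multiplication.
print(np.allclose(UV, U @ V))  # True
```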
The transpose of a matrix $A = (A_{ij})$ is simply the matrix formed by reversing the order of suffixes: $A^T = (A^T_{ij}) = (A_{ji})$. The transpose of a product of two matrices is therefore the reversed product of the transposes:
$$ (AB)^T = B^T A^T. \tag{A.8} $$
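A quick numerical check of eq. (A.8), again with arbitrary illustrative matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))

# (AB)^T should equal B^T A^T, eq. (A.8).
print(np.allclose((A @ B).T, B.T @ A.T))  # True
```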

A.2  Determinants

The determinant of a square matrix is a single scalar that is an important measure of its character. Determinants may be defined inductively. Suppose we know the definition of determinants of matrices of size $(M-1)\times(M-1)$. Define the determinant of an $M\times M$ matrix $A$, whose $ij$th entry is $A_{ij}$, as the expression
$$ \det(A) = |A| = \sum_{j=1}^{M} A_{1j}\, \mathrm{Co}_{1j}(A) \tag{A.9} $$
where $\mathrm{Co}_{ij}(A)$ is the $ij$th cofactor of the matrix $A$. The $ij$th cofactor of an $M\times M$ matrix is $(-1)^{i+j}$ times the determinant of the $(M-1)\times(M-1)$ matrix obtained by removing the $i$th row and the $j$th column of the original matrix:
$$ \mathrm{Co}_{ij}(A) = (-1)^{i+j} \left| \begin{matrix} A_{11} & \cdots & A_{1,j-1} & A_{1,j+1} & \cdots & A_{1M} \\ \vdots & & \vdots & \vdots & & \vdots \\ A_{i-1,1} & \cdots & A_{i-1,j-1} & A_{i-1,j+1} & \cdots & A_{i-1,M} \\ A_{i+1,1} & \cdots & A_{i+1,j-1} & A_{i+1,j+1} & \cdots & A_{i+1,M} \\ \vdots & & \vdots & \vdots & & \vdots \\ A_{M1} & \cdots & A_{M,j-1} & A_{M,j+1} & \cdots & A_{MM} \end{matrix} \right|. \tag{A.10} $$
The inductive definition is completed by defining the determinant of a $1\times 1$ matrix to be equal to its single element. The determinant of a $2\times 2$ matrix is then $A_{11}A_{22} - A_{12}A_{21}$, and of a $3\times 3$ matrix is $A_{11}(A_{22}A_{33} - A_{23}A_{32}) + A_{12}(A_{23}A_{31} - A_{21}A_{33}) + A_{13}(A_{21}A_{32} - A_{22}A_{31})$.
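The inductive definition of eqs. (A.9) and (A.10) translates directly into a recursive routine. The sketch below (function name illustrative; this recursion costs $O(M!)$ operations and is for exposition, not efficiency) expands along the first row and checks the result against numpy.linalg.det:

```python
import numpy as np

def det_cofactor(A):
    """Determinant by cofactor expansion along the first row, eq. (A.9)."""
    M = A.shape[0]
    if M == 1:                       # 1x1 matrix: determinant is its single element
        return A[0, 0]
    total = 0.0
    for j in range(M):
        # Minor: remove row 0 and column j, as in eq. (A.10).
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        total += (-1) ** j * A[0, j] * det_cofactor(minor)
    return total

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 2.0],
              [0.0, 1.0, 1.0]])
print(det_cofactor(A), np.linalg.det(A))  # both give 1.0 (to rounding)
```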
The determinant of an $M\times M$ matrix may equivalently be defined as the sum, over all the $M!$ possible permutations $P$ of the integers $1, \ldots, M$, of the product of the entries $\prod_i A_{i,P(i)}$ times the signum of $P$ (plus or minus 1 according to whether $P$ is even or odd):
$$ |A| = \sum_{P} \mathrm{sgn}(P)\, A_{1,P(1)} A_{2,P(2)} \cdots A_{M,P(M)}. \tag{A.11} $$
This expression shows that there is nothing special about the first row in eq. (A.9). One could equally well have used any row, $i$, giving $|A| = \sum_{j=1}^{M} A_{ij}\,\mathrm{Co}_{ij}(A)$; or one could have used any column, $j$, giving $|A| = \sum_{i=1}^{M} A_{ij}\,\mathrm{Co}_{ij}(A)$. All the results are the same.
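The permutation form (A.11) can likewise be evaluated directly, for example with itertools.permutations; in this sketch the sign of each permutation is obtained by counting inversions (again exponentially expensive, for illustration only):

```python
import itertools
import numpy as np

def det_permutation(A):
    """Determinant as a signed sum over permutations, eq. (A.11)."""
    M = A.shape[0]
    total = 0.0
    for perm in itertools.permutations(range(M)):
        # Sign of the permutation: (-1) raised to the number of inversions.
        inversions = sum(1 for a in range(M) for b in range(a + 1, M)
                         if perm[a] > perm[b])
        sign = (-1) ** inversions
        prod = 1.0
        for i in range(M):
            prod *= A[i, perm[i]]
        total += sign * prod
    return total

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 2.0],
              [0.0, 1.0, 1.0]])
print(det_permutation(A), np.linalg.det(A))  # both give 1.0 (to rounding)
```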
The determinant of the transpose of a matrix $A$ is equal to its determinant: $|A^T| = |A|$. The determinant of a product of two matrices is the product of the determinants: $|AB| = |A||B|$. A matrix is said to be singular if its determinant is zero; otherwise it is nonsingular. If a matrix has two identical (or proportional, i.e. linearly dependent) rows or two identical columns, then its determinant is zero and it is singular.

A.3  Inverses

The unit matrix is square,
$$ I = (\delta_{ij}) = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix} \tag{A.12} $$
with ones on the diagonal and zeroes elsewhere. It may be of any size, $N$, and if need be is then denoted $I_N$. For any $M\times N$ matrix $A$,
$$ I_M A = A \qquad \text{and} \qquad A I_N = A. \tag{A.13} $$
The inverse of a square matrix $A$, if it exists, is another matrix, written $A^{-1}$, such that
$$ A^{-1} A = A A^{-1} = I. \tag{A.14} $$
A nonsingular square matrix possesses an inverse. A singular matrix does not.
The inverse of a matrix may be identified by considering the identity
$$ \sum_{j=1}^{M} A_{ij}\, \mathrm{Co}_{kj}(A) = \delta_{ik}\, |A|. \tag{A.15} $$
For $i = k$, this equality arises as the expansion of the determinant by row $i$. For $i \neq k$, the sum represents the determinant, expanded by row $k$, of a matrix in which row $k$ has been replaced by a copy of row $i$. The modified matrix has two identical rows, so its determinant is zero, as is $\delta_{ik}$ for $i \neq k$. Now regard $\mathrm{Co}(A)$ as a matrix consisting of all the cofactors. Then we can consider $\sum_{j=1}^{M} A_{ij}\, \mathrm{Co}_{kj}(A)$ to be the $(ik)$th element of the matrix product of $A$ with the transpose of the cofactor matrix, $A\,\mathrm{Co}(A)^T$. So if $|A|$ is nonzero we may divide (A.15) through by it and find
$$ A \left[ \mathrm{Co}(A)^T / |A| \right] = I. \tag{A.16} $$
This equality shows that
$$ A^{-1} = \mathrm{Co}(A)^T / |A|. \tag{A.17} $$
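Eq. (A.17) can be implemented directly by assembling the cofactor matrix, as in the sketch below (illustrative names; practical codes use factorization methods rather than cofactors):

```python
import numpy as np

def cofactor_matrix(A):
    """Matrix of cofactors Co_{ij}(A), eq. (A.10)."""
    M = A.shape[0]
    Co = np.empty((M, M))
    for i in range(M):
        for j in range(M):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            Co[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return Co

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 2.0],
              [0.0, 1.0, 1.0]])

# Inverse from eq. (A.17): A^{-1} = Co(A)^T / |A|.
A_inv = cofactor_matrix(A).T / np.linalg.det(A)
print(np.allclose(A_inv @ A, np.eye(3)))  # True, as in eq. (A.14)
```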
Consequently the solution of the nonsingular matrix equation $Ax = b$ is
$$ x = \frac{\mathrm{Co}(A)^T b}{|A|}, \tag{A.18} $$
which for column vectors $x$ and $b$ is Cramer's rule.
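In its more familiar component form, Cramer's rule gives each unknown as a ratio of determinants, $x_i = |A^{(i)}|/|A|$, where $A^{(i)}$ is $A$ with its $i$th column replaced by $b$; this is equivalent to eq. (A.18). A short sketch with illustrative values:

```python
import numpy as np

def cramer_solve(A, b):
    """Solve A x = b by Cramer's rule: x_i = det(A with column i -> b) / det(A)."""
    detA = np.linalg.det(A)
    x = np.empty(len(b))
    for i in range(len(b)):
        Ai = A.copy()
        Ai[:, i] = b            # replace column i by the right-hand side
        x[i] = np.linalg.det(Ai) / detA
    return x

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 2.0],
              [0.0, 1.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])
print(np.allclose(cramer_solve(A, b), np.linalg.solve(A, b)))  # True
```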
The inverse of the product of two nonsingular matrices is the reversed product of their inverses:
$$ (AB)^{-1} = B^{-1} A^{-1}. \tag{A.19} $$

A.4  Eigenanalysis

A square matrix $A$ maps the linear space of column vectors into itself via $Ax = y$, with $y$ the vector onto which $x$ is mapped. An eigenvector is a vector that is mapped onto a multiple of itself. That is,
$$ Ax = \lambda x, \tag{A.20} $$
where $\lambda$ is a scalar called the eigenvalue. In general a square matrix of dimension $N$ has $N$ different eigenvectors. Obviously an eigenvector times any nonzero scalar is still an eigenvector, and is not considered to be different.
Since eq. (A.20), which is $(A - \lambda I)x = 0$, is a homogeneous equation for the elements of $x$, in order for there to be a non-zero solution $x$, the determinant of the coefficients must be zero:
$$ |A - \lambda I| = 0. \tag{A.21} $$
For an $N\times N$ matrix, this determinant gives a polynomial of order $N$ in $\lambda$, whose $N$ roots are the $N$ eigenvalues.
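For example, the eigenvalues of a small matrix can be found by forming the coefficients of the characteristic polynomial $|A - \lambda I|$ and taking its roots. A sketch (illustrative matrix), compared against a library eigenvalue routine:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# Coefficients of the characteristic polynomial |A - lambda*I|, highest power first:
# here lambda^2 - 5*lambda + 5.
coeffs = np.poly(A)
roots = np.roots(coeffs)       # its roots are the eigenvalues, eq. (A.21)

print(np.sort(roots))                  # approx [1.382, 3.618]
print(np.sort(np.linalg.eigvals(A)))   # same values from a library routine
```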
If $A$ is symmetric, that is, if $A^T = A$, then the eigenvectors corresponding to different eigenvalues are orthogonal; that is, their scalar product is zero. To see this, consider two eigenvectors $e_1$ and $e_2$, corresponding to different eigenvalues $\lambda_1$, $\lambda_2$, and use the respective versions of eq. (A.20) and the properties of the transpose:
$$ e_2^T A e_1 = e_2^T \lambda_1 e_1, \qquad e_2^T A^T e_1 = (e_1^T A e_2)^T = (e_1^T \lambda_2 e_2)^T = e_2^T \lambda_2 e_1. \tag{A.22} $$
So by subtraction
$$ 0 = e_2^T (A - A^T) e_1 = (\lambda_1 - \lambda_2)\, e_2^T e_1. \tag{A.23} $$
If there are multiple independent eigenvectors with identical eigenvalues, they can be chosen to be orthogonal. With that standard choice, the eigenvectors are all orthogonal: $e_i^T e_j = 0$ for $i \neq j$.
If we then take the eigenvectors also to be normalized, such that $e_j^T e_j = 1$, we can construct a square matrix $U$ whose columns are equal to these eigenvectors (as in eq. (A.4)). A matrix $U$ whose columns are orthonormal is said to be an orthonormal matrix (sometimes just called orthogonal). The inverse of $U$ is its transpose: $U^{-1} = U^T$. This $U$ is a unitary basis transformation which diagonalizes $A$. This fact follows from the observation that $AU = UD$, where $D$ is the diagonal matrix constructed from the eigenvalues:
$$ D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_N \end{pmatrix}. \tag{A.24} $$
Therefore
$$ U^T A U = U^T U D = D. \tag{A.25} $$
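A numerical illustration of eq. (A.25), assuming NumPy and an arbitrary symmetric example matrix; numpy.linalg.eigh returns the eigenvalues and an orthonormal matrix of eigenvectors of a symmetric matrix:

```python
import numpy as np

# An arbitrary symmetric matrix.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

lam, U = np.linalg.eigh(A)      # eigenvalues and orthonormal eigenvector columns

# U is orthonormal: its inverse is its transpose.
print(np.allclose(U.T @ U, np.eye(3)))          # True
# U^T A U = D, the diagonal matrix of eigenvalues, eq. (A.25).
print(np.allclose(U.T @ A @ U, np.diag(lam)))   # True
```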
