Notebook 12 – Math 2121, Fall 2020
This notebook is about the algorithmic problem of computing eigenvalues and eigenvectors for (large) square matrices. First: why is this a problem?
If we know the eigenvalues of a matrix A, then finding the corresponding eigenvectors is routine: for each eigenvalue λ, we just solve the linear system (A - λI)x = 0, for example by row reduction.
A bigger problem is figuring out the eigenvalues. The eigenvalues of an n×n matrix A are the roots of its characteristic polynomial det(A - xI), which has degree n.
When n ≤ 4, these roots can be computed exactly using explicit formulas like the quadratic formula.
However, there is no analogue of the quadratic formula for polynomials of degree 5 or higher. So when n ≥ 5, we cannot in general write down the eigenvalues of A exactly.
Even approximating these solutions is not easy if n is large.
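For a 2×2 matrix, the characteristic polynomial is the quadratic x² - (a + d)x + (ad - bc), so the quadratic formula gives the eigenvalues directly. A minimal sketch (the matrix here is illustrative, and we assume real roots):

```julia
using LinearAlgebra

A = [2.0 1.0; 1.0 3.0]
a, b, c, d = A[1,1], A[1,2], A[2,1], A[2,2]
# characteristic polynomial: x^2 - (a + d) x + (a d - b c)
t, dt = a + d, a * d - b * c
disc = t^2 - 4 * dt                         # discriminant; >= 0 means real eigenvalues
roots = [(t - sqrt(disc)) / 2, (t + sqrt(disc)) / 2]
```

For n ≥ 5 no such closed-form root formula exists, which is why we turn to iterative methods.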
using LinearAlgebra
using Plots
There is a simple iterative algorithm, called the power method, to compute one eigenvalue of an n×n matrix A.
Here is the idea. Assume A is diagonalizable with real eigenvalues λ₁, λ₂, ..., λₙ.
Assume these eigenvalues satisfy |λ₁| > |λ₂| ≥ ... ≥ |λₙ|.
Let v₁, v₂, ..., vₙ be eigenvectors for λ₁, λ₂, ..., λₙ.
These are linearly independent, so form a basis for ℝⁿ.
Now choose any vector w ∈ ℝⁿ.
We can write this vector as a linear combination w = c₁v₁ + c₂v₂ + ... + cₙvₙ
for some scalars c₁, c₂, ..., cₙ.
When we multiply by A repeatedly, we get Aᵏw = c₁λ₁ᵏv₁ + c₂λ₂ᵏv₂ + ... + cₙλₙᵏvₙ.
Thus for large values of k, we have Aᵏw ≈ c₁λ₁ᵏv₁, since |λ₁| is strictly larger than the other |λᵢ|.
Therefore if c₁ ≠ 0, the unit vectors Aᵏw / ‖Aᵏw‖ converge, up to sign, to a unit eigenvector u for λ₁.
If we compute this vector, then λ₁ = u·(Au), since Au = λ₁u and u·u = 1.
In this way we can compute the eigenvalue of A with the largest absolute value, along with a corresponding eigenvector.
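A bare-bones sketch of this iteration on a diagonal 2×2 example (the matrix and helper name here are illustrative, not from the notebook), where the normalized iterates visibly converge to the eigenvector for the larger eigenvalue:

```julia
using LinearAlgebra

# repeatedly multiply by A and normalize, as described above
function power_iteration(A, k)
    w = ones(size(A, 1))     # starting vector with c1 != 0
    for _ = 1:k
        w = A * w            # multiply by A ...
        w = w / norm(w)      # ... and normalize so the iterates stay bounded
    end
    return w
end

A = [2.0 0.0; 0.0 1.0]       # eigenvalues 2 and 1, eigenvectors [1, 0] and [0, 1]
u = power_iteration(A, 50)   # converges to the unit eigenvector [1, 0]
lambda = dot(A * u, u)       # recovers the dominant eigenvalue 2
```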
discover_eigenvalue_iterative_method (generic function with 2 methods)
function discover_eigenvalue_iterative_method(A, tol=1e-12; limit=10000)
    # implements iterative algorithm (the power method) to find largest eigenvalue
    # random starting vector
    w = rand(size(A)[1], 1)
    for i = 1:limit
        x = A * w
        lambda = dot(x, w)  # computes w^T A w, which approximates the eigenvalue once w is a unit vector
        # stop iterating if we have already found an eigenvalue within tolerance
        if norm(x - lambda * w) < tol
            break
        end
        w = x / norm(x)
    end
    lambda = dot(A * w, w)
    return lambda, w
end
random_matrix (generic function with 1 method)
function random_matrix(n)
    # create a random diagonalizable matrix with real (integer) eigenvalues
    eigs = rand(-10:1:10, 1, n)
    A = zeros(n, n)
    for i = 1:n
        A[i, i] = eigs[i]
    end
    while true
        B = rand(-1:0.01:1, n, n)
        if det(B) != 0
            return B * A * B^-1
        end
    end
end
4×4 Array{Float64,2}:
-4.45103 -0.70114 0.741133 -4.21311
-1.18223 -4.02395 0.989379 -3.81804
4.9414 -5.28547 2.7975 3.19111
-1.26461 -2.53311 1.51567 -6.32252
Q = random_matrix(4)
-10.0
4×1 Array{Float64,2}:
 -0.557762
 -0.531202
  0.150507
 -0.619736
lambda, u = discover_eigenvalue_iterative_method(Q)
4×1 Array{Float64,2}:
  5.57762
  5.31202
 -1.50507
  6.19736
4×1 Array{Float64,2}:
  5.57762
  5.31202
 -1.50507
  6.19736
Q * u, lambda * u
This algorithm works well most of the time, as we see in these example computations.
One situation where the iterative algorithm fails, even when A is diagonalizable with real eigenvalues, is when two eigenvalues of largest absolute value are negatives of each other, so that neither one dominates. The matrix below has eigenvalues 100 and -100.
2×2 Array{Int64,2}:
100 1
0 -100
B = [100 1; 0 -100]
-45.4841
2×1 Array{Float64,2}:
 0.51996
 0.85419
lambdaB, uB = discover_eigenvalue_iterative_method(B)
2×1 Array{Float64,2}:
  52.8502
 -85.419
2×1 Array{Float64,2}:
 -23.6499
 -38.8521
B * uB, lambdaB * uB
We can get around this issue with the following idea.
The eigenvalues of A² are the squares λ² of the eigenvalues λ of A.
So if λ and -λ are both eigenvalues of A, they contribute the single eigenvalue λ² to A², and the cancellation problem disappears.
To avoid the issue of eigenvalues adding to zero, we first apply the iterative algorithm to A².
If the positive square root of the resulting eigenvalue is μ, then either μ or -μ is an eigenvalue of A; we can tell which by checking whether A - μI is invertible, and then run the iterative algorithm on A + μI or A - μI, whose eigenvalues are shifted so that no cancellation occurs.
discover_eigenvalue (generic function with 2 methods)
function discover_eigenvalue(A, tol=1e-12; limit=10000)
    mu_squared, _ = discover_eigenvalue_iterative_method(A^2, tol; limit=limit)
    mu = abs(mu_squared)^0.5
    # decide which square root to return
    if rank(A - mu * I) < size(A)[1]
        # mu is an eigenvalue: shift by +mu so it becomes the dominant eigenvalue 2mu
        nu, u = discover_eigenvalue_iterative_method(A + mu * I, tol; limit=limit)
        return nu - mu, u
    else
        # -mu is an eigenvalue: shift by -mu so it becomes the dominant eigenvalue -2mu
        nu, u = discover_eigenvalue_iterative_method(A - mu * I, tol; limit=limit)
        return nu + mu, u
    end
end
100.0
2×1 Array{Float64,2}:
 1.0
 0.0
lambda_B, u_B = discover_eigenvalue(B)
2×1 Array{Float64,2}:
 100.0
   0.0
2×1 Array{Float64,2}:
 100.0
   0.0
B * u_B, lambda_B * u_B
The following code generates some pictures of the convergence process of the iterative eigenvalue algorithm applied to a 2×2 matrix.
plot_vector (generic function with 1 method)
function plot_vector(v, scaling, color; opacity=0.25)
    v = v / norm(v) * (0.5 + 0.5 * (1 - (1 - scaling)^2))
    quiver!(quiver=([v[1]], [v[2]]), [0], [0], color=color, opacity=opacity)
end
animate_eigenvector_search (generic function with 2 methods)
function animate_eigenvector_search(A, steps=25)
    v = rand(2, 1)
    w = rand(2, 1)
    p = scatter([0], [0], xlim=(-1, 1), ylim=(-1, 1), legend=false, aspect_ratio=:equal,
        title="Iterative method of finding eigenvectors and eigenvalues")
    eig = eigen(A)
    if (A[1,1] - A[2,2])^2 + 4 * A[1,2] * A[2,1] >= 0
        # real eigenvalues: draw the true eigenvector directions in grey
        plot_vector(eig.vectors[:,1], 1.0, :grey; opacity=1.0)
        plot_vector(-eig.vectors[:,1], 1.0, :grey; opacity=1.0)
        plot_vector(eig.vectors[:,2], 1.0, :grey; opacity=1.0)
        plot_vector(-eig.vectors[:,2], 1.0, :grey; opacity=1.0)
    end
    anim = @animate for i = 0:2 * steps + 1
        # find first eigenvector v
        if i <= steps
            plot_vector(v, i / steps, :blue)
            v = A * v
            v = v / norm(v)
        # replace A by A - lambda I
        elseif i == steps + 1
            eigenvalue_v = (A * v)[1] / v[1]
            A = A - I * eigenvalue_v
        # find second eigenvector w
        else
            plot_vector(w, (i - steps - 1) / steps, :orange)
            w = A * w
            w = w / norm(w)
        end
    end every 1
    gif(anim, fps=10)
end
2×2 Array{Float64,2}:
-10.6148 8.80783
-3.04604 7.61477
A = random_matrix(2)
animate_eigenvector_search(A, 50)
The blue vectors are the iterations of our eigenvalue algorithm applied to A; they converge to one of the grey eigenvector directions.
Once the algorithm converges to an eigenvector v with eigenvalue λ, we replace A by A - λI and run the algorithm again, shown in orange.
If the second run converges to an eigenvector w of A - λI with eigenvalue μ, then w is also an eigenvector of A, with eigenvalue μ + λ.
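A minimal numerical check of this shifting idea, using a hand-picked 2×2 example (the matrix and vector here are illustrative, not from the notebook):

```julia
using LinearAlgebra

A = [3.0 1.0; 0.0 1.0]      # triangular, so its eigenvalues are 3 and 1
lambda = 3.0                # dominant eigenvalue of A
B = A - lambda * I          # eigenvalues of B are 3 - 3 = 0 and 1 - 3 = -2
w = [1.0, -2.0]             # eigenvector of B for its dominant eigenvalue mu = -2
mu = -2.0
# w is also an eigenvector of A, with eigenvalue mu + lambda = 1
```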
bad_matrix (generic function with 1 method)
function bad_matrix()
    while true
        A = 4 * (rand(2, 2) .- 0.5)
        a, b, c, d = A[1,1], A[1,2], A[2,1], A[2,2]
        # (a - d)^2 + 4 b c is the discriminant of the characteristic polynomial
        if (a - d)^2 + 4 * b * c < 0
            return A
        end
    end
end
For some matrices A, the iterative algorithm fails to converge at all, as in the animation below.
The failure here is related to the matrix C having no real eigenvalues: bad_matrix() chooses C so that the discriminant (a - d)² + 4bc of its characteristic polynomial is negative, which means both eigenvalues are complex.
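The simplest instance of this phenomenon (not from the notebook) is a rotation matrix: it has no real eigenvector directions for the iterates to converge to, and it satisfies the same negative-discriminant condition that bad_matrix checks for:

```julia
using LinearAlgebra

R = [0.0 -1.0; 1.0 0.0]    # rotation by 90 degrees
# negative discriminant of the characteristic polynomial, as in bad_matrix
disc = (R[1,1] - R[2,2])^2 + 4 * R[1,2] * R[2,1]
vals = eigvals(R)          # complex eigenvalues i and -i
```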
2×2 Array{Float64,2}:
1.94497 1.4051
-1.87377 -0.0389516
C = bad_matrix()
animate_eigenvector_search(C, 50)
Let's ignore this situation.
We can use the iterative algorithm as a building block in a larger algorithm to find all the eigenvalues along with bases for all the nonzero eigenspaces.
RREF (generic function with 2 methods)
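The helper functions below call pivot_positions (and RREF), which were defined in an earlier notebook but whose code is not shown here. A minimal sketch of what pivot_positions is assumed to do: row reduce A to reduced row echelon form and return the list of pivot (row, column) positions together with the RREF:

```julia
# assumed behavior of pivot_positions: Gaussian elimination with partial
# pivoting, returning (pivots, R) where R is the RREF of A
function pivot_positions(A, tol=1e-6)
    R = Float64.(A)
    m, n = size(R)
    pivots = Tuple{Int,Int}[]
    r = 1
    for j = 1:n
        r > m && break
        # choose the largest entry in column j at or below row r as the pivot
        i = r - 1 + argmax(abs.(R[r:end, j]))
        if abs(R[i, j]) > tol
            R[[r, i], :] = R[[i, r], :]   # swap the pivot row into position
            R[r, :] /= R[r, j]            # scale so the pivot entry is 1
            for k = 1:m                   # clear the rest of column j
                k != r && (R[k, :] -= R[k, j] * R[r, :])
            end
            push!(pivots, (r, j))
            r += 1
        end
    end
    return pivots, R
end
```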
column_space_basis (generic function with 2 methods)
function column_space_basis(A, tol=1e-6)
    # function we'll need to compute a basis of the column space of a matrix
    pivots, _ = pivot_positions(A, tol)
    ans = zeros(size(A)[1], 0)
    for (i, j) = pivots
        ans = hcat(ans, A[1:end, j])
    end
    ans
end
null_space_basis (generic function with 2 methods)
function null_space_basis(A, tol=1e-6)
    # function we'll need to compute a basis of the null space of a matrix
    n = size(A)[2]
    pivots, rref = pivot_positions(A, tol)
    free = [j for j = 1:n if !any(j == pivot[2] for pivot = pivots)]
    ans = zeros(n, length(free))
    for k = 1:length(free)
        ans[free[k], k] = 1
        for (i, j) = pivots
            ans[j, k] = -rref[i, free[k]]
        end
    end
    ans
end
3×4 Array{Int64,2}:
1 1 2 2
2 2 4 4
1 2 1 3
V = [ 1 1 2 2; 2 2 4 4; 1 2 1 3 ]
3×2 Array{Float64,2}:
1.0 1.0
2.0 2.0
1.0 2.0
column_space_basis(V)
4×2 Array{Float64,2}:
-3.0 -1.0
1.0 -1.0
1.0 0.0
0.0 1.0
null_space_basis(V)
Here is the algorithm to find all eigenvalues of an n×n matrix A:
First find the eigenvalue λ of A with the largest absolute value.
Compute a basis u_1, u_2, ..., u_m of the eigenspace Nul(A - λI).
Find vectors v_1, v_2, ..., v_{n-m} such that u_1, ..., u_m, v_1, ..., v_{n-m} is a basis for ℝⁿ.
Let V = [v_1 v_2 ... v_{n-m}].
Let P = [u_1 ... u_m v_1 ... v_{n-m}].
Compute the (n-m)×(n-m)
matrix B given by the last n-m rows of P⁻¹AV; this is the standard matrix of the linear map that A induces on the remaining coordinates. Now, recursively, compute a basis of ℝ^{n-m}
consisting of eigenvectors for B.
Can show that if x
is an eigenvector for B with eigenvalue μ ≠ λ, then (A - λI)Vx is an eigenvector for A with eigenvalue μ.
discover_all_eigenvalues (generic function with 2 methods)
function discover_all_eigenvalues(A, tol=1e-12)
    # returns list `eigenvalues` and matrix `eigenvectors` if possible to compute
    # the list is the sequence of eigenvalues of A with multiplicity
    # the columns of the matrix are the corresponding eigenvectors
    n = size(A)[1]
    # if dim Nul A == n then A is zero matrix
    if size(null_space_basis(A))[2] == n
        eigenvalues = [0 for i = 1:n]
        eigenvectors = rand(n, n)
        return eigenvalues, eigenvectors
    else
        # find largest eigenvalue, and basis for corresponding eigenspace
        lambda, v = discover_eigenvalue(A, tol)
        eigenvectors = null_space_basis(A - lambda * I, sqrt(tol))
        m = size(eigenvectors)[2]
        eigenvalues = [lambda for i = 1:m]
        # compute basis u_1, u_2, ..., u_m, v_1, v_2, ..., v_{n-m}
        basis = column_space_basis(hcat(eigenvectors, I))
        # compute matrix B
        B = (basis^-1)[m + 1:end, 1:end] * A * basis[1:end, m + 1:end]
        # recursively construct eigenvalues and eigenvectors for B
        values, vectors = discover_all_eigenvalues(B, tol)
        for i = 1:n - m
            mu = values[i]
            w = basis[1:end, m + 1:end] * vectors[1:end, i]
            next_eigenvector = A * w - lambda * w
            # optional: rescale eigenvector. may give more accurate result
            next_eigenvector = next_eigenvector / norm(next_eigenvector)
            # add next eigenvalue and eigenvector to return values
            append!(eigenvalues, values[i])
            eigenvectors = hcat(eigenvectors, next_eigenvector)
        end
        return eigenvalues, eigenvectors
    end
end
This relatively simple algorithm works fairly well for random matrices with all real eigenvalues. Such a matrix will be diagonalizable with high probability.
20×20 Array{Float64,2}:
-4.68913 -0.617843 -1.6796 -2.91886 … -1.94144 11.0339 3.43901
0.195941 8.7241 -0.150681 -1.77377 -4.13233 3.3259 7.81364
-4.68126 9.92404 -7.27107 -4.07498 -2.46368 -3.58807 3.66445
-1.30953 9.94846 -4.01916 -3.29598 -6.00274 10.4173 9.55129
-3.53049 7.13987 -5.55512 -0.119382 -2.77807 0.736316 4.71966
-5.21105 4.60439 -8.11439 0.858668 … -3.46147 -2.70693 2.82454
5.46664 5.04265 4.37109 1.33311 -0.90611 0.441502 0.178801
⋮ ⋱
-2.72044 -4.67965 0.120696 -1.92521 -0.977146 7.9267 3.97427
3.69971 -6.24075 1.11717 7.25487 … 3.35706 -10.8081 -8.07071
-5.2448 -4.44301 -3.69576 -0.811984 -2.02894 2.92825 3.09466
2.90048 -10.205 5.90078 3.02679 0.799246 1.4141 -5.80947
-3.60132 5.28736 -2.90162 0.424574 -2.93991 5.91369 10.4931
-5.3127 9.44401 -8.29506 -0.502237 -3.83783 -4.90386 4.78342
M = random_matrix(20)
-10.0
-9.0
-8.0
-8.0
7.0
-7.0
-6.0
-6.0
-5.0
5.0
4.0
-2.0
-1.0
1.0
1.0
1.0
2.0
3.0
3.0
3.0
20×20 Array{Float64,2}:
 -0.010101   0.228228   -0.150134   …   0.0674063   -0.142466      0.131463
 -0.79798    0.202461   -0.0336989     -0.38511      0.252647      0.147469
  0.585859   0.272401   -0.126279       0.0628273    0.148879      0.0948665
 -0.717172  -0.147244    0.137405       0.201773    -0.45317       0.123764
  0.40404    0.117795   -0.0128432      0.0722329   -0.206481      0.369894
  0.40404    0.327618   -0.149195   …  -0.537339     0.364683      0.209404
  0.414141  -0.294488   -0.120648       0.156754    -0.24368      -0.16277
  ⋮                                 ⋱
 -0.656566   0.283445   -0.010508       0.14895      0.113807     -0.229795
  0.838384   0.103071   -0.315836   …  -0.0744556    0.0133835     0.022895
 -0.888889   0.0809842   0.381928       0.237066    -0.000742201  -0.238057
 -0.767677  -0.209823    0.0150178     -0.158967     0.0414805    -0.22779
 -0.717172   0.298169    0.327534       0.229747    -0.337764     -0.0178747
  1.0        0.283445   -0.0131408      0.0300725    0.262729     -0.0655744
D, P = discover_all_eigenvalues(M)
# P * Diagonal(D) * P^-1, M
6.296838914532222e-10
norm(P * Diagonal(D) - M * P) # norm computes sqrt(sum of squares of all entries)
Of course, more advanced algorithms for the same computations are already implemented in the Julia package LinearAlgebra, and these are more robust, faster, and more accurate.
Eigen{Float64,Float64,Array{Float64,2},Array{Float64,1}}
values:
20-element Array{Float64,1}:
-9.999999999999991
-8.99999999999999
-8.0
-7.999999999999989
-6.999999999999989
-5.999999999999993
-5.9999999999999805
⋮
2.9999999999999827
2.9999999999999987
2.9999999999999987
4.000000000000007
5.000000000000008
7.000000000000016
vectors:
20×20 Array{Float64,2}:
0.00350381 0.228228 0.268391 … -0.374486 -0.0599401 -0.0495405
0.276801 0.202461 -0.0678501 0.30427 -0.167832 -0.324265
-0.203221 0.272401 0.187501 0.234054 0.355645 -0.099081
0.248771 -0.147244 -0.155182 -0.339378 -0.339661 -0.261214
-0.140153 0.117795 -0.0659771 -0.273063 0.023976 -0.292739
-0.140153 0.327618 -0.0538223 … -0.296468 0.195804 -0.211673
-0.143656 -0.294488 -0.086948 0.191144 -0.0679321 -0.117096
⋮ ⋱
0.227748 0.283445 0.0063399 -0.13263 -0.0679321 0.157629
-0.290817 0.103071 0.0546691 … 0.206748 -0.395605 0.193658
0.308336 0.0809842 -0.369211 -0.269162 0.00399601 0.17114
0.26629 -0.209823 -0.17892 0.039009 0.0639361 0.25671
0.248771 0.298169 -0.453486 -0.374486 -0.235764 -0.274725
-0.346878 0.283445 -0.133741 -0.156036 0.023976 -0.364798
a = eigen(M)
2.0779501385380722e-13
norm(a.vectors * Diagonal(a.values) * a.vectors^-1 - M)