Notebook 12 – Math 2121, Fall 2020

This notebook is about the algorithmic problem of computing eigenvalues and eigenvectors for (large) square matrices. First: why is this a problem?

If we know the eigenvalues $\lambda$ of a square matrix $A$, then we can construct a basis for every nonzero eigenspace $\operatorname{Nul}(A - \lambda I)$ using our algorithm based on row reduction. However, this is potentially time consuming if we have to row reduce many large matrices $A - \lambda I$.

A bigger problem is figuring out the eigenvalues. The eigenvalues of an $n \times n$ matrix $A$ are the solutions to the polynomial equation $\det(A - xI) = 0$.

When n=2, we can solve this equation using the quadratic formula.

However, there is no analogue of the quadratic formula for polynomials of degree 5 or higher. So when $n \geq 5$ there is no formula to describe the exact solutions to $\det(A - xI) = 0$.

Even approximating these solutions is not easy if n is large.

24.3 μs
1.1 ms
15.8 s

There is a simple iterative algorithm to compute one eigenvalue of an $n \times n$ matrix $A$, along with a corresponding eigenvector. This algorithm usually works if $A$ has $n$ distinct eigenvalues.

Here is the idea. Assume $A$ has real eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$.

Assume these eigenvalues satisfy $|\lambda_1| > |\lambda_2| \geq \dots \geq |\lambda_n| \geq 0$.

Let $v_1, v_2, \dots, v_n \in \mathbb{R}^n$ be the associated eigenvectors.

These are linearly independent, so they form a basis for $\mathbb{R}^n$.

Now choose any vector $w = \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{bmatrix} \in \mathbb{R}^n$. The length of $w$ is $\|w\| = \sqrt{w_1^2 + w_2^2 + \dots + w_n^2}$.

We can write this vector as a linear combination

$$w = c_1 v_1 + c_2 v_2 + \dots + c_n v_n$$

for $c_1, c_2, \dots, c_n \in \mathbb{R}$. Consider

$$Aw = c_1 \lambda_1 v_1 + c_2 \lambda_2 v_2 + \dots + c_n \lambda_n v_n$$

$$A^2 w = c_1 \lambda_1^2 v_1 + c_2 \lambda_2^2 v_2 + \dots + c_n \lambda_n^2 v_n$$

$$A^3 w = c_1 \lambda_1^3 v_1 + c_2 \lambda_2^3 v_2 + \dots + c_n \lambda_n^3 v_n$$

$$\vdots$$

$$A^N w = c_1 \lambda_1^N v_1 + c_2 \lambda_2^N v_2 + \dots + c_n \lambda_n^N v_n$$

When $N$ is very large, we have $(\lambda_i/\lambda_1)^N \approx 0$ for $i = 2, 3, \dots, n$, so

$$\frac{1}{\lambda_1^N} A^N w = c_1 v_1 + c_2 (\lambda_2/\lambda_1)^N v_2 + \dots + c_n (\lambda_n/\lambda_1)^N v_n \approx c_1 v_1.$$

Thus for $N \gg 0$ we have $A^N w \approx c_1 \lambda_1^N v_1$, which means $\|A^N w\| \approx |c_1| \cdot |\lambda_1|^N \cdot \|v_1\|$.

Therefore if N is large enough then

$$u := \frac{1}{\|A^N w\|} A^N w \approx \pm \frac{1}{\|v_1\|} v_1.$$

If we compute this vector, then $Au \approx \lambda_1 u$ and $u^T A u \approx \lambda_1$ since $\|u\| = 1$.

In this way we can compute the eigenvalue $\lambda_1$ of $A$ with largest absolute value, along with one of its eigenvectors $u$.
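Here is a minimal sketch of this power iteration in Julia. The function name and keyword argument are illustrative assumptions; the notebook's own discover_eigenvalue_iterative_method (below) may differ in its details.

using LinearAlgebra

# Power iteration sketch: repeatedly apply A to a random vector and
# renormalize. Assumes |λ₁| is strictly larger than every other |λᵢ|.
function power_iteration(A; iterations=1000)
    u = normalize(randn(size(A, 1)))   # random unit starting vector w/‖w‖
    for _ in 1:iterations
        u = normalize(A * u)           # u approaches ±v₁/‖v₁‖
    end
    λ = u' * A * u                     # Rayleigh quotient ≈ λ₁ since ‖u‖ = 1
    return λ, u
end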

3.4 ms
discover_eigenvalue_iterative_method (generic function with 2 methods)
106 μs
random_matrix (generic function with 1 method)
57 μs
Q
4×4 Array{Float64,2}:
 -4.45103  -0.70114  0.741133  -4.21311
 -1.18223  -4.02395  0.989379  -3.81804
  4.9414   -5.28547  2.7975     3.19111
 -1.26461  -2.53311  1.51567   -6.32252
29.2 μs
31.9 μs
18.1 μs

This algorithm works well most of the time, as we see in these example computations.

One situation where the iterative algorithm fails, even when $A$ has all real eigenvalues, is when the two largest eigenvalues of $A$ in absolute value satisfy $|\lambda_1| = |\lambda_2|$ and $\lambda_1 + \lambda_2 = 0$. In that case $A^N w \approx \lambda_1^N(c_1 v_1 + (-1)^N c_2 v_2)$ changes direction with the parity of $N$, so the normalized iterates never settle down.

13.9 μs
B
2×2 Array{Int64,2}:
 100     1
   0  -100
5.6 μs
12.2 ms
15 μs
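To see the failure concretely, here is a small demo (a hypothetical snippet reusing the power_iteration sketch from above) on this matrix B. Because the eigenvalues 100 and −100 are tied in absolute value, the normalized iterates flip between two directions on alternate steps instead of converging:

using LinearAlgebra

B = [100 1; 0 -100]                 # eigenvalues 100 and -100
let u = normalize(randn(2))
    for k in 1:6
        u = normalize(B * u)        # alternates between two directions
        println("iteration $k: u = ", round.(u, digits=4))
    end
end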

We can get around this issue with the following idea.

The eigenvalues of $A^2$ are the squares $\lambda^2$ of the eigenvalues $\lambda$ of $A$.

So if $A$ has two eigenvalues $\lambda$ and $-\lambda$, then $A^2$ will have only the eigenvalue $\lambda^2$.

To avoid the issue of eigenvalues adding to zero, we first apply the iterative algorithm to $A^2$.

If the positive square root of the resulting eigenvalue is $\lambda$, then we apply the iterative algorithm again to $A + \lambda I$ or $A - \lambda I$. These matrices will have eigenvalues $0$ and $\pm 2\lambda$ rather than $-\lambda$ and $\lambda$.
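A sketch of this fix, reusing the power_iteration function from the earlier sketch (the name power_iteration_shifted and the choice of $A + \lambda I$ over $A - \lambda I$ are illustrative assumptions):

using LinearAlgebra

# Run power iteration on A², whose dominant eigenvalue is λ² even when
# A has the tied eigenvalue pair λ and -λ.
function power_iteration_shifted(A; iterations=1000)
    λ_sq, _ = power_iteration(A^2; iterations=iterations)
    λ = sqrt(abs(λ_sq))                # positive square root of λ²
    # A + λI has eigenvalues 0 and 2λ in place of -λ and λ, so its
    # dominant eigenvalue is no longer tied in absolute value.
    μ, u = power_iteration(A + λ * I; iterations=iterations)
    return μ - λ, u                    # undo the shift to get an eigenvalue of A
end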

29.5 μs
discover_eigenvalue (generic function with 2 methods)
119 μs
35.8 μs
15.6 μs

The following code generates some pictures of the convergence process of the iterative eigenvalue algorithm applied to 2×2 matrices.

9.7 μs
plot_vector (generic function with 1 method)
86.7 μs
animate_eigenvector_search (generic function with 2 methods)
147 μs
A
2×2 Array{Float64,2}:
 -10.6148   8.80783
  -3.04604  7.61477
35.4 μs
5.9 s

The blue vectors are the iterates of our eigenvalue algorithm applied to $A$. The picture draws earlier iterates as shorter vectors than later ones.

Once the algorithm converges to an eigenvector $u$ with eigenvalue $\lambda$, we can apply the iterative algorithm a second time to $A - \lambda I$. The iterates of this second algorithm are the orange vectors.

If the second algorithm converges to an eigenvector $v$, whose eigenvalue for $A$ we call $\mu$, then we have diagonalized $A = PDP^{-1}$ where $P = \begin{bmatrix} u & v \end{bmatrix}$ and $D = \begin{bmatrix} \lambda & 0 \\ 0 & \mu \end{bmatrix}$.
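Putting the two runs together, the whole 2×2 diagonalization might look like the following sketch (a hypothetical helper reusing power_iteration from above). Note that the second run finds the eigenvalue $\mu - \lambda$ of $A - \lambda I$, so we shift back to get $\mu$:

using LinearAlgebra

# Diagonalize a 2×2 matrix via two rounds of power iteration.
function diagonalize_2x2(A)
    λ, u = power_iteration(A)          # first eigenpair: Au ≈ λu
    ν, v = power_iteration(A - λ * I)  # A - λI has eigenvalues 0 and μ - λ
    μ = ν + λ                          # second eigenvalue of A: Av ≈ μv
    P = [u v]                          # columns u and v
    D = Diagonal([λ, μ])
    return P, D                        # A ≈ P * D * inv(P)
end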

20.4 μs
bad_matrix (generic function with 1 method)
64.3 μs

For some matrices C, the iterative algorithm will fail to converge to any eigenvectors and we won't be able to discover an eigenvalue.

The failure here is related to the matrix C having complex eigenvalues, which we'll discuss next time. Here is what the iterative algorithm looks like applied to such matrices:

14.5 μs
C
2×2 Array{Float64,2}:
  1.94497   1.4051
 -1.87377  -0.0389516
8.5 μs
6.1 s

Let's ignore this situation.

We can use the iterative algorithm as a building block in a larger algorithm to find all the eigenvalues along with bases for all the nonzero eigenspaces.

14.7 μs
RREF (generic function with 2 methods)
494 μs
column_space_basis (generic function with 2 methods)
68.7 μs
null_space_basis (generic function with 2 methods)
125 μs
V
3×4 Array{Int64,2}:
 1  1  2  2
 2  2  4  4
 1  2  1  3
51.1 ms
3×2 Array{Float64,2}:
 1.0  1.0
 2.0  2.0
 1.0  2.0
453 ms
4×2 Array{Float64,2}:
 -3.0  -1.0
  1.0  -1.0
  1.0   0.0
  0.0   1.0
154 ms

Here is the algorithm to find all eigenvalues of an n×n matrix A:

  • First find the eigenvalue $\lambda$ of $A$ with largest absolute value.

  • Compute a basis $u_1, u_2, \dots, u_m$ of the eigenspace $\operatorname{Nul}(A - \lambda I)$.

  • Find $v_1, v_2, \dots, v_{n-m}$ such that $u_1, u_2, \dots, u_m, v_1, v_2, \dots, v_{n-m}$ is a basis for $\mathbb{R}^n$.

Let $K : \mathbb{R}^n \to \mathbb{R}^{n-m}$ be the linear transformation with

$$K(u_i) = 0 \quad\text{and}\quad K(v_i) = e_i.$$

Let $L : \mathbb{R}^{n-m} \to \mathbb{R}^n$ be the linear transformation with $L(e_i) = v_i$.

  • Compute the $(n-m) \times (n-m)$ matrix $B$ that is the standard matrix of $K \circ A \circ L$.

  • Now, recursively, compute a basis of $\mathbb{R}^{n-m}$ consisting of eigenvectors $w$ of $B$.

One can show that if $w \in \mathbb{R}^{n-m}$ has $Bw = \mu w$ then $\mu \neq \lambda$ and

$$(A - \lambda I)\begin{bmatrix} v_1 & v_2 & \cdots & v_{n-m} \end{bmatrix} w$$

is an eigenvector for $A$ with eigenvalue $\mu$.
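Here is a hedged sketch of this recursion, reusing power_iteration from the first sketch. The name all_eigenpairs, the use of LinearAlgebra's nullspace in place of the notebook's null_space_basis, and the tolerance handling are all assumptions:

using LinearAlgebra

# Recursive deflation sketch following the outline above. Assumes every
# power iteration converges (all eigenvalues real, with a strictly
# dominant one at each stage).
function all_eigenpairs(A::AbstractMatrix{<:Real}; tol=1e-8)
    n = size(A, 1)
    λ, _ = power_iteration(A)                 # eigenvalue of largest |λ|
    U = nullspace(A - λ * I, atol=tol)        # basis u₁,…,u_m of Nul(A - λI)
    m = size(U, 2)
    pairs = [(λ, U[:, j]) for j in 1:m]
    m == n && return pairs                    # eigenspace already spans ℝⁿ
    V = nullspace(U')                         # v₁,…,v_{n-m} completing a basis
    P = [U V]
    K = inv(P)[m+1:end, :]                    # K(uᵢ) = 0 and K(vᵢ) = eᵢ
    B = K * A * V                             # standard matrix of K∘A∘L
    for (μ, w) in all_eigenpairs(B; tol=tol)  # recurse on the smaller matrix
        x = (A - λ * I) * (V * w)             # eigenvector of A with eigenvalue μ
        push!(pairs, (μ, normalize(x)))
    end
    return pairs
end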

9.5 μs
discover_all_eigenvalues (generic function with 2 methods)
103 μs

This relatively simple algorithm works fairly well for random matrices with all real eigenvalues. Such a matrix will be diagonalizable with high probability.

14.2 μs
M
20×20 Array{Float64,2}:
 -4.68913    -0.617843  -1.6796    -2.91886   …  -1.94144    11.0339     3.43901
  0.195941    8.7241    -0.150681  -1.77377      -4.13233     3.3259     7.81364
 -4.68126     9.92404   -7.27107   -4.07498      -2.46368    -3.58807    3.66445
 -1.30953     9.94846   -4.01916   -3.29598      -6.00274    10.4173     9.55129
 -3.53049     7.13987   -5.55512   -0.119382     -2.77807     0.736316   4.71966
 -5.21105     4.60439   -8.11439    0.858668  …  -3.46147    -2.70693    2.82454
  5.46664     5.04265    4.37109    1.33311      -0.90611     0.441502   0.178801
  ⋮                                           ⋱                         
 -2.72044    -4.67965    0.120696  -1.92521      -0.977146    7.9267     3.97427
  3.69971    -6.24075    1.11717    7.25487   …   3.35706   -10.8081    -8.07071
 -5.2448     -4.44301   -3.69576   -0.811984     -2.02894     2.92825    3.09466
  2.90048   -10.205      5.90078    3.02679       0.799246    1.4141    -5.80947
 -3.60132     5.28736   -2.90162    0.424574     -2.93991     5.91369   10.4931
 -5.3127      9.44401   -8.29506   -0.502237     -3.83783    -4.90386    4.78342
57.9 μs
62.9 ms
1.4 μs
6.296838914532222e-10
34.3 μs

Of course, more advanced algorithms to do the same computations are already implemented in the Julia standard library LinearAlgebra, and these are more robust, faster, and more accurate.
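The output labeled a below presumably comes from a call along these lines (a minimal usage sketch; M is the 20×20 random matrix from above):

using LinearAlgebra

a = eigen(M)    # LAPACK-backed solver for all eigenvalues and eigenvectors
a.values        # the 20 eigenvalues, sorted
a.vectors       # the corresponding eigenvectors as columns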

9.2 μs
a
Eigen{Float64,Float64,Array{Float64,2},Array{Float64,1}}
values:
20-element Array{Float64,1}:
 -9.999999999999991
 -8.99999999999999
 -8.0
 -7.999999999999989
 -6.999999999999989
 -5.999999999999993
 -5.9999999999999805
  ⋮
  2.9999999999999827
  2.9999999999999987
  2.9999999999999987
  4.000000000000007
  5.000000000000008
  7.000000000000016
vectors:
20×20 Array{Float64,2}:
  0.00350381   0.228228    0.268391   …  -0.374486  -0.0599401   -0.0495405
  0.276801     0.202461   -0.0678501      0.30427   -0.167832    -0.324265
 -0.203221     0.272401    0.187501       0.234054   0.355645    -0.099081
  0.248771    -0.147244   -0.155182      -0.339378  -0.339661    -0.261214
 -0.140153     0.117795   -0.0659771     -0.273063   0.023976    -0.292739
 -0.140153     0.327618   -0.0538223  …  -0.296468   0.195804    -0.211673
 -0.143656    -0.294488   -0.086948       0.191144  -0.0679321   -0.117096
  ⋮                                   ⋱                          
  0.227748     0.283445    0.0063399     -0.13263   -0.0679321    0.157629
 -0.290817     0.103071    0.0546691  …   0.206748  -0.395605     0.193658
  0.308336     0.0809842  -0.369211      -0.269162   0.00399601   0.17114
  0.26629     -0.209823   -0.17892        0.039009   0.0639361    0.25671
  0.248771     0.298169   -0.453486      -0.374486  -0.235764    -0.274725
 -0.346878     0.283445   -0.133741      -0.156036   0.023976    -0.364798
145 μs
2.0779501385380722e-13
121 μs