Notebook 15 – Math 2121, Fall 2020

In this notebook we will explore the eigenvalues of a random orthogonal matrix.

An orthogonal matrix is a square matrix $Q$ with the following equivalent properties:

The columns of $Q$ are orthonormal.
It holds that $Q Q^{T} = Q^{T} Q = I_{n}$ when $Q$ is $n \times n$ .
The matrix $Q$ is invertible with $Q^{- 1} = Q^{T}$ .

Suppose $Q$ is an $n \times n$ orthogonal matrix and $Q_{i j}$ is the entry in position $(i, j)$ of $Q$ .

Then the property $Q^{T} Q = I_{n}$ implies that $\sum_{i = 1}^{n} (Q_{i j})^{2} = 1$ for each $j = 1, 2, \dots, n$ .

Therefore all entries of $Q$ must be real numbers in the interval $[- 1, 1]$ .

In other words, the set of all $n \times n$ orthogonal matrices, within the set of $n \times n$ matrices, is a region of finite volume. In fact, this volume is at most $2^{n^{2}}$ .

If you have a region of space of finite volume, you can try to sample points from that region uniformly at random. That is, you can randomly select points from the region in such a way that no point is more likely to be selected than any other.

In particular, it makes sense to talk about a uniformly random element of the set of $n \times n$ orthogonal matrices. But how, practically, can we generate this sort of random matrix?

42.1 μs

xxxxxxxxxx
 
using LinearAlgebra

497 μs

xxxxxxxxxx
 
using Plots

11.6 s

xxxxxxxxxx
 
using Distributions

2.5 s

Julia makes it easy to generate matrices whose individual entries are chosen uniformly at random from some interval. The following generates a 4-by-4 matrix each of whose entries is drawn uniformly at random from the interval $[0, 1]$ :

7.2 μs

4×4 Array{Float64,2}:
 0.67146   0.328488    0.212337  0.86587
 0.373607  0.410998    0.515854  0.246521
 0.609279  0.285134    0.256232  0.246153
 0.756213  0.00691312  0.198154  0.289334

xxxxxxxxxx
 
M = rand(4, 4)

52.1 ms

With high probability this random matrix is invertible. However, it is not typically orthogonal.

4.2 μs

0.03614020519777915

xxxxxxxxxx
 
det(M) # will be nonzero if M is invertible

218 ms

2.3857618450813916

xxxxxxxxxx
 
norm(M * transpose(M) - I) # will be zero if M is orthogonal

906 ms

We introduced the Gram Schmidt process in class.

This algorithm gives us a way to generate random orthogonal matrices:

Generate a random $n \times n$ matrix $A$ , with entries sampled independently from what's called the standard normal distribution.
Perform the Gram Schmidt process on the columns of $A$ .
Normalize the resulting orthogonal vectors to have unit length
Use these orthonormal vectors as the columns of a matrix $Q$ .

The matrix $Q$ generated in this way will be orthogonal, and uniformly distributed over all $n \times n$ orthogonal matrices.

7.7 μs

gram_schmidt_process (generic function with 1 method)

xxxxxxxxxx
 
function gram_schmidt_process(A)
    # performs orthonormal version of Gram Schmidt process on columns of matrix A
    # resulting orthogonal matrix is returned as Q
    # the second matrix R returned is upper triangular such that A = QR
    
    n = size(A)[1]
    Q, R = zeros(n, n), zeros(n, n)
    
    for j=1:n
        v = A[1:n, j]
        x = A[1:n, j]
        for i = 1:j - 1
            u = Q[1:n, i]
            x -= dot(u, v) * u
        end
        Q[1:n, j] = x / norm(x)
        R[1:n, j] = transpose(Q) * v
    end
​
    return Q, R
end

46.2 μs

The definition of the standard normal distribution is something you'll see in a class on probability.

Very briefly, this is a probability distribution that governs many natural processes (like scores on exams). If you sample many random numbers from the standard normal distribution, the histogram you'll get looks like the following plot:

6.1 μs

xxxxxxxxxx
 
begin
    pdf = x -> exp(-x^2/2) / sqrt(2 * pi)
    q1 = plot(pdf, legend=false, title="Density function")
    
    samples = rand(Normal(0, 1), 100000, 1)
    q2 = histogram(samples, normalize = :pdf, legend=false, title="Histogram")
    
    plot(q1, q2)
end

7.4 s

plot_complex_numbers (generic function with 1 method)

xxxxxxxxxx
 
function plot_complex_numbers(numbers, title)
    # creates scatter plot of some complex numbers, also drawing unit circle 
    scatter(numbers, aspect_ratio=:equal, legend=false, title=title)
    plot!([exp(2 * pi * x * im) for x = 0:0.01:1], aspect_ratio=:equal)
end

50.3 μs

xxxxxxxxxx
 
n = 300

1.5 μs

xxxxxxxxxx
 
A = rand(Normal(0, 1), n, n);

1.2 ms

xxxxxxxxxx
 
Q, R = gram_schmidt_process(A);

490 ms

1.4312821076116828e-12

xxxxxxxxxx
 
norm(A - Q * R)

110 ms

Here is our random 300 by 300 orthogonal matrix:

12.5 μs

300×300 Array{Float64,2}:
  0.0448874    0.0491666   -0.0739788   …  -0.0317672    0.0736054    -0.124933
 -0.056724     0.0556908    0.0604959      -0.0555389    0.015055      0.0244777
  0.0920992    0.00327534   0.00674899      0.0757148    0.0509849    -0.000920985
  0.0528592    0.03524      0.0471785      -0.0568397   -0.0440849    -0.0474272
  0.00119506  -0.0354313   -0.0123086       0.0178764    0.0179261     0.0520182
 -0.0203609    0.100654     0.0199392   …  -0.05953      0.038262      0.0528598
  0.0400311   -0.0235413    0.0126642      -0.0156188    0.0786087    -0.0436996
  ⋮                                     ⋱                             
 -0.0475376    0.0326474    0.00481165      0.00945692   0.0121606     0.0357641
 -0.0816301    0.0506513   -0.153399    …   0.0776899   -0.000731917  -0.113912
 -0.0700405    0.0554915    0.0238207      -0.0776223   -0.179625      0.0360585
 -0.0575912    0.0625821   -0.0142933      -0.0100343    0.00717623   -0.0340247
 -0.0545604    0.0385057   -0.0205914       0.0849721    0.0526457     0.0195796
  0.00672487   0.0163181   -0.0407016       0.0407104   -0.0492313     0.0823654

xxxxxxxxxx
 
Q

822 ns

1.2246005129615576e-12

xxxxxxxxxx
 
norm(transpose(Q) * Q - I) # will be small if Q is orthogonal as desired

10.2 ms

Recall that if $z = a + b i \in C$ then $| z | = \sqrt{a^{2} + b^{2}}$ .

All eigenvalues $λ$ for $Q$ are complex numbers with $| λ | = 1$ .

This is because if $Q v = λ v$ for $0 \neq v \in C^{n}$ then $Q \bar{v} = \bar{λ} \bar{v}$ and $Q^{- 1} \bar{v} = (1 / \bar{λ}) \bar{v}$ .

Thus, in this case we have

${\bar{v}}^{T} Q v = {\bar{v}}^{T} (Q v) = {\bar{v}}^{T} (λ v) = λ {\bar{v}}^{T} v = λ ∥ v ∥^{2}$

while at the same time

${\bar{v}}^{T} Q v = (Q^{T} \bar{v})^{T} v = (Q^{- 1} \bar{v})^{T} v = ((1 / \bar{λ}) \bar{v})^{T} v = (1 / \bar{λ}) ∥ v ∥^{2} .$

Since $∥ v ∥^{2}$ is a nonzero real number, it follows that $λ = 1 / \bar{λ}$ so $| λ |^{2} = λ \bar{λ} = 1$ and $| λ | = 1$ .

Below is a plot of the eigenvalues of our random orthogonal matrix:

16.9 μs

xxxxxxxxxx
 
begin
    eig = eigen(Q).values
    if n <= 1000
        plot_complex_numbers(eig, "Eigenvalues of random $(n)-by-$(n) orthogonal matrix")
    end
end

1.1 s

The eigenvalues are somehow randomly distributed on the unit circle

${z \in C : | z | = 1} .$

What random process generates these values?

Another way to generate random points on the unit circle is to independently pick the points uniformly at random each time. If we do this, the resulting set of points appears like this:

4.9 μs

xxxxxxxxxx
 
begin
    random = [exp(2 * pi * rand() * im) for i=1:n]
    if n <= 1000
        plot_complex_numbers(random, "$(n) random complex numbers of unit length")
    end
end

37.2 ms

We can examine several versions of these plots.

There seems to be something qualitatively different between the two different distributions of points of the unit circle.

If we use the eigenvalues of a random orthogonal matrix, the points seem to be more evenly spaced.

If we use 300 points chosen independently at random, gaps and clusters tend to occur.

Can make this distinction more quantitative?

11.9 μs

angle (generic function with 1 method)

xxxxxxxxxx
 
function angle(z)
    # given a complex number z = a + bi, returns the angle between 0.0 and 2 * pi
    # that the vector [a; b] makes with the positive x-axis
    
    u1, u2 = real(z), imag(z)
    
    if u1 == 0
        return u2 > 0 ? pi/2 : 3 * pi/2
    end
    
    if u2 == 0
        return u1 > 0 ? 0 : pi
    end
​
    a = atan(abs(u2) / abs(u1))
    
    if u1 > 0 && u2 > 0
        return a
    elseif u1 > 0 && u1 > 0
        return 2 * pi - a
    elseif u1 < 0 && u2 > 0
        return pi - a
    else
        return pi + a
    end
end

51.2 μs

sorted_angles (generic function with 1 method)

xxxxxxxxxx
 
function sorted_angles(zlist)
    # given a list of complex numbers, computes their angles and sorts result
    ans = sort([angle(v) for v=zlist])
    # returned list is shifted to start with 0.0
    return ans .- ans[1]
end

26.2 μs

The following plots show that if we arrange the eigenvalues of a random orthogonal matrix in order going counterclockwise, we obtain 300 almost perfectly evenly spaced points on the unit circle.

We do not see such even spacing for 300 independently chosen random points, arranged in order.

16.5 μs

xxxxxxxxxx
 
begin
    p1 = plot(sorted_angles(eig), legend=false, title="Sorted angles of eigenvalues")
    p2 = plot(sorted_angles(random), legend=false, title="Sorted angles of random")
    plot(p1, p2, layout=(1, 2))
end

2.5 ms

sorted_fluctuations (generic function with 1 method)

xxxxxxxxxx
 
function sorted_fluctuations(z)
    n = length(z)
    sorted_angles(z) - [2 * pi * v / n for v=0:n - 1]
end

45.5 μs

xxxxxxxxxx
 
begin
    plot(title="Sorted angles of complex numbers versus y = 2πx/$n")
    m = maximum(abs.(sorted_fluctuations(random))) + 0.1
    plot!(
        sorted_fluctuations(eig), 
        label="Using eigenvalues of random orthogonal matrix", ylim=(-m, m)
    )
    plot!(
        sorted_fluctuations(random), 
        label="Using random complex numbers of unit length", legend=:bottomleft
    )
end

3.5 ms

What this means practically is that if $n$ is large enough and you select an $n \times n$ orthogonal matrix uniformly at random, and then you choose one of its $n$ eigenvalues $λ$ at random, the resulting complex number $λ$ will be uniformly distributed in the unit circle ${z \in C : | z | = 1}$ .

9.3 μs