Notebook 2 – Math 2121, Fall 2020

Today we will see some plotting and random functions in Julia, as we explore the meaning of (reduced) echelon form and the row reduction algorithm.

x
 
md"# Notebook 2 -- Math 2121, Fall 2020
Today we will see some plotting and random functions in Julia, as we explore the meaning of (reduced) echelon form and the row reduction algorithm."

13.9 μs

Running this notebook (optional)

If you have Pluto up and running, you can access the notebook we are currently viewing by entering this link (right click -> Copy Link) in the Open from file menu in Pluto.

x
 
md"## Running *this* notebook (optional)
If you have Pluto up and running, you can access the notebook we are currently viewing by entering [this link](http://www.math.ust.hk/~emarberg/teaching/2020/Math2121/julia/02_Math2121_Fall2020.jl) (right click -> Copy Link)  in the *Open from file* menu in Pluto.
"

17.8 μs

Plotting in Pluto

There are some good packages for making graphics in Julia. Today we'll use Plots.

xxxxxxxxxx
 
md"## Plotting in Pluto
​
There are some good packages for making graphics in Julia. Today we'll use _Plots_.
"

11.2 μs

xxxxxxxxxx
 
begin
    # This sets up a clean package environment
    import Pkg
    Pkg.activate(mktempdir())
end

2.4 ms

Plots.PlotlyBackend

x
 
begin
    # This add s the Plots package to our environment, and imports it
    using Plots
    plotly()
end

455 μs

Echelon form and random matrices

Let's write some functions to determine whether a matrix is in echelon form.

x
 
md"## Echelon form and random matrices
​
Let's write some functions to determine whether a matrix is in echelon form.
"

9.9 μs

find_leading (generic function with 3 methods)

xxxxxxxxxx
 
begin
    # Returns true if row has all zeros.
    function is_zero_row(A, row, tolerance=10e-16)
        (m, n) = size(A)
        return all([abs(A[row,col]) < tolerance for col=1:n])
    end
    
    # Returns first index of nonzero entry in row, or 0 if none exists.
    function find_leading(A, row=1, tolerance=10e-16)
        (m, n) = size(A)
        for col=1:n
            if abs(A[row, col]) > tolerance
                return col
            end
        end
        return 0
    end
end

77.6 μs

xxxxxxxxxx
 
find_leading([0 0 7 0 5])

10.9 ms

in_echelon_form (generic function with 2 methods)

x
 
function in_echelon_form(A, tolerance=10e-12)
    (m, n) = size(A)
​
    # check locations of zero rows
    seen_zero_row = false
    for row=1:m
        if is_zero_row(A, row, tolerance)
            seen_zero_row = true
        elseif seen_zero_row
            return false
        end
    end
​
    # check alignment of leading entries in nonzero rows
    for row=2:m
        if is_zero_row(A, row, tolerance)
            continue
        end
        if find_leading(A, row, tolerance) <= find_leading(A, row - 1, tolerance)
            return false
        end
    end
​
    return true
end

85.2 μs

in_echelon_form_verbose (generic function with 2 methods)

xxxxxxxxxx
 
function in_echelon_form_verbose(A, tolerance=10e-12)
        (m, n) = size(A)
        
        seen_zero_row = false
        for row=1:m
            if is_zero_row(A, row, tolerance)
                seen_zero_row = true
            elseif seen_zero_row
                return Text(string("false: row ", row, " is a nonzero row below a zero row"))
            end
        end
        
        for row=2:m
            if is_zero_row(A, row, tolerance)
                continue
            end
            if find_leading(A, row, tolerance) <= find_leading(A, row - 1, tolerance)
                return Text(string("false: leading entry in row ", row, " is not to the right of the leading entry in row ", row - 1))
            end
        end
​
        return Text("true: matrix is in echelon form")
    end

52.8 μs

in_reduced_echelon_form (generic function with 2 methods)

xxxxxxxxxx
 
function in_reduced_echelon_form(A, tolerance=10e-12)
    (m, n) = size(A)
​
    # check if in echelon form
    if ! in_echelon_form(A, tolerance)
        return false
    end
    
    for row=1:m
        if ! is_zero_row(A, row, tolerance)
            col = find_leading(A, row, tolerance)
            
            # check that leading entry is 1
            if abs(A[row,col] - 1) > tolerance
                return false
            end
            
            # check that leading entry is only nonzero entry in column
            if any([i != row && abs(A[i,col]) > tolerance for i=1:m])
                return false
            end
        end
    end
​
    return true
end

71.6 μs

in_reduced_echelon_form_verbose (generic function with 2 methods)

xxxxxxxxxx
 
function in_reduced_echelon_form_verbose(A, tolerance=10e-12)
    (m, n) = size(A)
​
    if ! in_echelon_form(A, tolerance)
        return Text("false: not in echelon form")
    end
    
    for row=1:m
        if ! is_zero_row(A, row, tolerance)
            col = find_leading(A, row, tolerance)
            if abs(A[row,col] - 1) > tolerance
                return Text(string("false: leading entry in row ", row, " is not 1"))
            end
            if any([i != row && abs(A[i,col]) > tolerance for i=1:m])
                return Text(string("false: leading entry in row ", row, " is not the only nonzero entry in its column"))
            end
        end
    end
​
    return Text("true: matrix is in reduced echelon form")
end

109 μs

4×9 Array{Int64,2}:
 0  0  5  1  2  3  4  5  6
 0  0  0  6  8  9  0  2  0
 0  0  0  0  0  0  7  8  9
 0  0  0  0  0  0  0  0  0

xxxxxxxxxx
 
A = [ 0 0 5 1 2 3 4 5 6;
      0 0 0 6 8 9 0 2 0;
      0 0 0 0 0 0 7 8 9;
      0 0 0 0 0 0 0 0 0]

9.7 μs

true

x
 
in_echelon_form(A)

80.2 ms

true: matrix is in echelon form

x
 
in_echelon_form_verbose(A)

27.5 ms

false

x
 
in_reduced_echelon_form(A)

9.7 μs

false: leading entry in row 1 is not 1

x
 
in_reduced_echelon_form_verbose(A)

12.5 μs

It's easy to make random matrices using the Random package.

x
 
md"It's easy to make random matrices using the _Random_ package."

10.2 μs

xxxxxxxxxx
 
using Random

557 μs

3×5 Array{Int64,2}:
 3   2  3   2   2
 3   3  2   2  -3
 3  -1  0  -3   1

x
 
B = rand(-3:3, 3, 5)

13.1 μs

false: leading entry in row 2 is not to the right of the leading entry in row 1

xxxxxxxxxx
 
in_echelon_form_verbose(B)

49.1 μs

false: not in echelon form

x
 
in_reduced_echelon_form_verbose(B)

5.4 μs

If we generate a random matrix in this way, where each entry is chosen at random from a specified list or range, then what we get is almost never in echelon form. This is because too many entries are nonzero.

For a random matrix to have good chances of being in echelon form, we must make it more likely that a given entry is zero.

Whether a matrix is in echelon form does not depend on the values of its nonzero entries, only their positions (note that this is not true for reduced echelon form). So let's write a function that generates a random matrix with all entries either 0 (zero) or 1 (nonzero).

This function takes a parameter that controls how likely a given entry is to be zero.

xxxxxxxxxx
 
md"If we generate a random matrix in this way, where each entry is chosen at random from a specified list or range, then what we get is almost never in echelon form. This is because too many entries are nonzero.
​
For a random matrix to have good chances of being in echelon form, we must make it more likely that a given entry is zero.
​
Whether a matrix is in echelon form does not depend on the values of its nonzero entries, only their positions (note that this is not true for *reduced* echelon form).
So let's write a function that generates a random matrix with all entries either 0 (zero) or 1 (nonzero). 
​
This function takes a parameter that controls how likely a given entry is to be zero."

12.8 μs

random_boolean_matrix (generic function with 1 method)

x
 
function random_boolean_matrix(m, n, zero_probability)
    A = rand(m, n)
    for i=1:m
        for j=1:n
            A[i, j] = Int(A[i, j] > zero_probability) 
        end
    end
    return A
end

48.9 μs

3×5 Array{Float64,2}:
 0.0  1.0  0.0  0.0  1.0
 0.0  0.0  1.0  1.0  1.0
 0.0  1.0  0.0  1.0  1.0

xxxxxxxxxx
 
C = random_boolean_matrix(3, 5, 0.5)

11 μs

false: leading entry in row 3 is not to the right of the leading entry in row 2

xxxxxxxxxx
 
in_echelon_form_verbose(C)

13.3 μs

Now let's try the following experiment.

For different values of p between 0.0 and 1.0, we'll generate a bunch of random 01-matrices of a given size, where the chance that a given entry is 0 is p.

We'll graph what proportion of the time this matrix is in echelon form, as a function of p.

xxxxxxxxxx
 
md"""
Now let's try the following experiment.
​
For different values of `p` between 0.0 and 1.0, we'll generate a bunch of random 01-matrices of a given size, where the chance that a given entry is 0 is `p`.
​
We'll graph what proportion of the time this matrix is in echelon form, as a function of `p`.
"""

11.1 μs

Choose dimensions

nrows =2

ncols =3

xxxxxxxxxx
 
begin
    using PlutoUI
    
    rowslider = @bind nrows Slider(1:10, default=2, show_value=true)
    colslider = @bind ncols Slider(1:10, default=3, show_value=true)
    
    md"""**Choose dimensions**
    
    `nrows = `$(rowslider)
    
    `ncols = `$(colslider)
    """
    
end

391 μs

ntrials

 
ntrials = 1000 # size of bunch to generate for each value of p

1.3 μs

2×2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0

 
R = random_boolean_matrix(nrows, ncols, 0.5) # typical random matrix of this size

10.2 μs

true: matrix is in echelon form

 
in_echelon_form_verbose(R)

7.7 μs

Plots.jl

x
 
begin
    # returns average number of times that random 01-matrix is in echelon form
    function accumulate_echelon(m, n, zero_prob, trials)
        s = 0
        for i=1:trials
            s += Int(in_echelon_form(random_boolean_matrix(m, n, zero_prob)))
        end
        return s / trials
    end
​
    zero_probs = 0:0.001:1.0
    likelihoods = [accumulate_echelon(nrows, ncols, p, ntrials) for p in zero_probs]
    plot(
        zero_probs,
        likelihoods, 
        xlabel="Probablity p that individual matrix entry is zero",
        ylabel="Fraction of $(nrows)-by-$(ncols) matrices",
        label=["Echelon form" "Reduced echelon form"],
        title="How many random 01-matrices are in echelon form?",
        legend=false
    )
    # plot!(zero_probs, [x^(2) for x in zero_probs])
end

364 ms

If p == 1.0 then our random matrix is always the zero matrix which is in echelon form. This is why when p is 1.0 the value of the graph is 1.0.

x
 
md"If `p == 1.0` then our random matrix is always the *zero matrix* which is in echelon form. This is why when `p` is 1.0 the value of the graph is 1.0."

16.9 μs

2×2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0

x
 
random_boolean_matrix(nrows, ncols, 1.0)

8.1 μs

true: matrix is in echelon form

x
 
in_echelon_form_verbose(random_boolean_matrix(nrows, ncols, 1.0))

11.3 μs

If p == 0.0 then our random matrix is always the matrix of all ones which is never in echelon form if nrows > 1. This is why when p is 0.0 the value of the graph is 0.0.

x
 
md"If `p == 0.0` then our random matrix is always the *matrix of all ones* which is never in echelon form if `nrows > 1`. This is why when `p` is 0.0 the value of the graph is 0.0."

15.5 μs

2×2 Array{Float64,2}:
 1.0  1.0
 1.0  1.0

xxxxxxxxxx
 
random_boolean_matrix(nrows, ncols, 0.0)

8.9 μs

false: leading entry in row 2 is not to the right of the leading entry in row 1

xxxxxxxxxx
 
in_echelon_form_verbose(random_boolean_matrix(nrows, ncols, 0.0))

16.1 μs

Reduction to reduced echelon form

Below is a function RREF that converts a matrix to its reduced echelon form. You can unhide the code by clicking the icon on the left, if you want to take a look.

x
 
md"## Reduction to reduced echelon form
​
Below is a function `RREF` that converts a matrix to its reduced echelon form. You can unhide the code by clicking the icon on the left, if you want to take a look.
"

8.7 μs

RREF (generic function with 2 methods)

xxxxxxxxxx
 
begin
    function find_nonzeros_in_column(A, row_start, col, tol=10e-12)
        (m, n) = size(A)
        return [i for i=row_start:m if abs(A[i,col]) > tol]
    end
    
    function columns(A)
        (m, n) = size(A)
        return n
    end
​
    function rowop_replace(A, source_row, target_row, scalar_factor)
        @assert source_row != target_row
        for j=1:columns(A)
            A[target_row,j] += A[source_row,j] * scalar_factor
        end
        return A
    end
​
    function rowop_scale(A, target_row, scalar_factor)
        @assert scalar_factor != 0
        for j=1:columns(A)
            A[target_row,j] *= scalar_factor
        end
        A
    end
    
    function rowop_swap(A, s, t)
        for j=1:columns(A)
            A[t,j], A[s,j] = A[s,j], A[t,j]
        end
        A
    end
​
    function RREF_with_rowop_count(A, tol=10e-12)
        rowop_count = 0
        A = float(copy(A))
        (m, n) = size(A)
        
        row = 1
        for col=1:n
            nonzeros = find_nonzeros_in_column(A, row, col, tol)
            if length(nonzeros) == 0
                continue
            end
            
            i = nonzeros[1]
            
            if abs(A[i, col] - 1) < tol
                A[i, col] = 1
            else
                A = rowop_scale(A, i, 1 / A[i, col])
                rowop_count += 1
            end
            
            for j=nonzeros[2:end]
                A = rowop_replace(A, i, j, -A[j, col])
                rowop_count += 1
            end
​
            if i != row
                A = rowop_swap(A, row, i)
                rowop_count += 1
            end
            
            row += 1
        end
        
        for row=m:-1:1
            l = find_leading(A, row, tol)
            if l > 0
                for k=1:row - 1
                    if abs(A[k,l]) > tol
                        rowop_replace(A, row, k, -A[k,l])
                        rowop_count += 1
                    end
                end
            end
        end
        
        # round entries to give nicer output; could omit this step
        for row=1:m
            for col=1:n
                if abs(A[row,col] - round(A[row,col])) < tol
                    A[row,col] = round(A[row,col])
                end
            end
        end
        
        return A, rowop_count
    end
    
    function RREF_RowOpCount(A, tol=10e-12)
        A, ops = RREF_with_rowop_count(A, tol)
        return ops
    end
​
    function RREF(A, tol=10e-12)
        A, ops = RREF_with_rowop_count(A, tol)
        return A
    end
end

276 μs

Here is some more code to compute and count the pivot positions in a matrix.

xxxxxxxxxx
 
md"Here is some more code to compute and count the pivot positions in a matrix."

5.6 μs

pivot_positions (generic function with 2 methods)

x
 
function pivot_positions(A, tol=10e-12)
    rref = RREF(A, tol)
    return [
        (i, find_leading(rref, i)) 
        for i=1:size(A)[1] if find_leading(rref, i) > 0
    ]
end

58.9 μs

npivots (generic function with 2 methods)

x
 
function npivots(A, tol=10e-12)
    return length(pivot_positions(A, tol))
end

27.1 μs

3×4 Array{Int64,2}:
 0   3  -6   6
 3  -7   8  -5
 3  -9  12  -9

x
 
M = [0 3 -6 6; 3 -7 8 -5; 3 -9 12 -9]

9.5 μs

3×4 Array{Float64,2}:
 1.0  0.0  -2.0  3.0
 0.0  1.0  -2.0  2.0
 0.0  0.0  -0.0  0.0

xxxxxxxxxx
 
RREF(M)

9 μs

true

xxxxxxxxxx
 
in_reduced_echelon_form(RREF(M))

15.1 μs

Tuple11

x
 
pivot_positions(M)

9.7 μs

xxxxxxxxxx
 
npivots(M)

9.8 μs

How many pivot positions does a random N-by-N matrix have?

The possible values range from zero to N.

xxxxxxxxxx
 
md"How many pivot positions does a random N-by-N matrix have?
​
The possible values range from zero to N.
"

11.3 μs

M =5

N =6

x
 
begin
    rslider = @bind n1 Slider(1:20, default=2, show_value=true)
    sslider = @bind n2 Slider(1:20, default=2, show_value=true)
    md"""
    `M = `$(rslider)
    
    `N = `$(sslider)
    """
end

105 μs

value_range

Int641

xxxxxxxxxx
 
value_range = 0:1

9.1 μs

Int641

xxxxxxxxxx
 
begin
    pivot_counts = [0 for i=0:n1]
    for i=1:1000
        pivot_counts[1 + npivots(rand(value_range, n1, n2))] += 1
    end
    pivot_counts    
end

28.6 ms

Plots.jl

xxxxxxxxxx
 
pie(pivot_counts, legend = false, title="Counting $(n1)-by-$(n2) matrices by number of pivot positions")
​

4.1 ms

We see from these examples that almost every random matrix, of sufficiently large size with sufficiently unconstrained entries, has the largest possible number of pivot positions (which is the whichever of M or N is larger).

xxxxxxxxxx
 
md"We see from these examples that almost every random matrix, of sufficiently large size with sufficiently unconstrained entries, has the largest possible number of pivot positions (which is the whichever of M or N is larger)."

9.2 μs

Effort to compute `RREF(A)`

One final experiment for today.

Given a typical matrix A, how many row operations should we have to perform to compute RREF(A)? Could we get unlucky and have to do many more operations in one case compared to another?

To investigate, let's do the following. To simplify things, we'll just consider square matrices, and we'll just generate random 01-matrices as above.

For each probabilty p controlling how likely a given matrix entry is zero, we will generate a large number of random 01-matrices.

For these matrices A, we'll calculate the average number of row operations needed to compute RREF(A) and the standard deviation. Then we can plot both as a function of p.

xxxxxxxxxx
 
md"""## Effort to compute `RREF(A)`
​
One final experiment for today.
​
Given a typical matrix A, how many row operations should we have to perform to compute RREF(A)? Could we get unlucky and have to do many more operations in one case compared to another?
​
To investigate, let's do the following. To simplify things, we'll just consider square matrices, and we'll just generate random 01-matrices as above. 
​
For each probabilty p controlling how likely a given matrix entry is zero, we will generate a large number of random 01-matrices. 
​
For these matrices A, we'll calculate the average number of row operations needed to compute `RREF(A)` and the standard deviation. Then we can plot both as a function of p.
​
"""

15.6 μs

rtrials

xxxxxxxxxx
 
rtrials = 1000

1.8 μs

N =2

xxxxxxxxxx
 
begin
    zslider = @bind z Slider(2:200, default=2, show_value=true)
    md"""
    `N = `$(zslider)
    """
end

73.3 μs

Plots.jl

x
 
begin
    function accumulate_rowops(m, n, zero_prob, trials)
        s = 0
        for i=1:trials
            # s += RREF_RowOpCount(rand(-1:1,m, n))
            s += RREF_RowOpCount(random_boolean_matrix(m, n, zero_prob))
        end
        mean = s / trials
        return mean
    end
​
    function accumulate_rowops_std(m, n, zero_prob, trials, mean)
        s = 0
        for i=1:trials
            # s += (RREF_RowOpCount(rand(-1:1, m, n)) - mean)^2
            s += (RREF_RowOpCount(random_boolean_matrix(m, n, zero_prob)) - mean)^2
        end
        std = sqrt(s / (trials - 1))
        return std
    end
    
    x = 0:0.01:1
    k = z
    averages = [accumulate_rowops(k, k, p, rtrials) for p in x]
    deviations = [
        accumulate_rowops_std(k, k, x[i], rtrials, averages[i])
        for i=1:length(x)
    ]
    plot(
        x, 
        hcat(averages, deviations), 
        xlabel="Probablity p that individual matrix entry is zero",
        ylabel="Number of row operations",
        title="RowOps to compute RREF for $(k)-by-$(k) 01-matrices",
        label=["mean" "std"]
    )
end

209 ms

Moral of the story: if the size N of our matrix is large, then at almost all values of p the average number of row operations needed to compute RREF(A) is nearly the same, and the standard deviation is also flat.

This means that if N is large then in some sense we can't get too unlucky: all computations of RREF(A) will take close to the same number of operations.

But this is only in the limit. When N is small, there is some interesting variation in the number of row operations needed.

x
 
md"Moral of the story: if the size N of our matrix is large, then at almost all values of p the average number of row operations needed to compute `RREF(A)` is nearly the same, and the standard deviation is also flat.
​
This means that if N is large then in some sense we can't get too unlucky: all computations of `RREF(A)` will take close to the same number of operations.
​
But this is only in the limit. When N is small, there is some interesting variation in the number of row operations needed."

13.8 μs

Enter cell code...xxxxxxxxxx

351 ns

Notebook 2 – Math 2121, Fall 2020

Running this notebook (optional)

Plotting in Pluto

Echelon form and random matrices

Reduction to reduced echelon form

Effort to compute RREF(A)

Effort to compute `RREF(A)`