Notebook 2 – Math 2121, Fall 2020
Today we will see some plotting and random functions in Julia, as we explore the meaning of (reduced) echelon form and the row reduction algorithm.
x
md"# Notebook 2 -- Math 2121, Fall 2020Today we will see some plotting and random functions in Julia, as we explore the meaning of (reduced) echelon form and the row reduction algorithm."Running this notebook (optional)
If you have Pluto up and running, you can access the notebook we are currently viewing by entering this link (right click -> Copy Link) in the Open from file menu in Pluto.
x
md"## Running *this* notebook (optional)If you have Pluto up and running, you can access the notebook we are currently viewing by entering [this link](http://www.math.ust.hk/~emarberg/teaching/2020/Math2121/julia/02_Math2121_Fall2020.jl) (right click -> Copy Link) in the *Open from file* menu in Pluto."Plotting in Pluto
There are some good packages for making graphics in Julia. Today we'll use Plots.
xxxxxxxxxxmd"## Plotting in PlutoThere are some good packages for making graphics in Julia. Today we'll use _Plots_."xxxxxxxxxxbegin # This sets up a clean package environment import Pkg Pkg.activate(mktempdir())endx
begin # This add s the Plots package to our environment, and imports it using Plots plotly()endEchelon form and random matrices
Let's write some functions to determine whether a matrix is in echelon form.
x
md"## Echelon form and random matricesLet's write some functions to determine whether a matrix is in echelon form."find_leading (generic function with 3 methods)xxxxxxxxxxbegin # Returns true if row has all zeros. function is_zero_row(A, row, tolerance=10e-16) (m, n) = size(A) return all([abs(A[row,col]) < tolerance for col=1:n]) end # Returns first index of nonzero entry in row, or 0 if none exists. function find_leading(A, row=1, tolerance=10e-16) (m, n) = size(A) for col=1:n if abs(A[row, col]) > tolerance return col end end return 0 endend3xxxxxxxxxxfind_leading([0 0 7 0 5])in_echelon_form (generic function with 2 methods)x
function in_echelon_form(A, tolerance=10e-12) (m, n) = size(A) # check locations of zero rows seen_zero_row = false for row=1:m if is_zero_row(A, row, tolerance) seen_zero_row = true elseif seen_zero_row return false end end # check alignment of leading entries in nonzero rows for row=2:m if is_zero_row(A, row, tolerance) continue end if find_leading(A, row, tolerance) <= find_leading(A, row - 1, tolerance) return false end end return trueendin_echelon_form_verbose (generic function with 2 methods)xxxxxxxxxxfunction in_echelon_form_verbose(A, tolerance=10e-12) (m, n) = size(A) seen_zero_row = false for row=1:m if is_zero_row(A, row, tolerance) seen_zero_row = true elseif seen_zero_row return Text(string("false: row ", row, " is a nonzero row below a zero row")) end end for row=2:m if is_zero_row(A, row, tolerance) continue end if find_leading(A, row, tolerance) <= find_leading(A, row - 1, tolerance) return Text(string("false: leading entry in row ", row, " is not to the right of the leading entry in row ", row - 1)) end end return Text("true: matrix is in echelon form") endin_reduced_echelon_form (generic function with 2 methods)xxxxxxxxxxfunction in_reduced_echelon_form(A, tolerance=10e-12) (m, n) = size(A) # check if in echelon form if ! in_echelon_form(A, tolerance) return false end for row=1:m if ! is_zero_row(A, row, tolerance) col = find_leading(A, row, tolerance) # check that leading entry is 1 if abs(A[row,col] - 1) > tolerance return false end # check that leading entry is only nonzero entry in column if any([i != row && abs(A[i,col]) > tolerance for i=1:m]) return false end end end return trueendin_reduced_echelon_form_verbose (generic function with 2 methods)xxxxxxxxxxfunction in_reduced_echelon_form_verbose(A, tolerance=10e-12) (m, n) = size(A) if ! in_echelon_form(A, tolerance) return Text("false: not in echelon form") end for row=1:m if ! is_zero_row(A, row, tolerance) col = find_leading(A, row, tolerance) if abs(A[row,col] - 1) > tolerance return Text(string("false: leading entry in row ", row, " is not 1")) end if any([i != row && abs(A[i,col]) > tolerance for i=1:m]) return Text(string("false: leading entry in row ", row, " is not the only nonzero entry in its column")) end end end return Text("true: matrix is in reduced echelon form")end4×9 Array{Int64,2}:
0 0 5 1 2 3 4 5 6
0 0 0 6 8 9 0 2 0
0 0 0 0 0 0 7 8 9
0 0 0 0 0 0 0 0 0xxxxxxxxxxA = [ 0 0 5 1 2 3 4 5 6; 0 0 0 6 8 9 0 2 0; 0 0 0 0 0 0 7 8 9; 0 0 0 0 0 0 0 0 0]truex
in_echelon_form(A)true: matrix is in echelon formx
in_echelon_form_verbose(A)falsex
in_reduced_echelon_form(A)false: leading entry in row 1 is not 1x
in_reduced_echelon_form_verbose(A)It's easy to make random matrices using the Random package.
x
md"It's easy to make random matrices using the _Random_ package."xxxxxxxxxxusing Random3×5 Array{Int64,2}:
3 2 3 2 2
3 3 2 2 -3
3 -1 0 -3 1x
B = rand(-3:3, 3, 5)false: leading entry in row 2 is not to the right of the leading entry in row 1xxxxxxxxxxin_echelon_form_verbose(B)false: not in echelon formx
in_reduced_echelon_form_verbose(B)If we generate a random matrix in this way, where each entry is chosen at random from a specified list or range, then what we get is almost never in echelon form. This is because too many entries are nonzero.
For a random matrix to have good chances of being in echelon form, we must make it more likely that a given entry is zero.
Whether a matrix is in echelon form does not depend on the values of its nonzero entries, only their positions (note that this is not true for reduced echelon form). So let's write a function that generates a random matrix with all entries either 0 (zero) or 1 (nonzero).
This function takes a parameter that controls how likely a given entry is to be zero.
xxxxxxxxxxmd"If we generate a random matrix in this way, where each entry is chosen at random from a specified list or range, then what we get is almost never in echelon form. This is because too many entries are nonzero.For a random matrix to have good chances of being in echelon form, we must make it more likely that a given entry is zero.Whether a matrix is in echelon form does not depend on the values of its nonzero entries, only their positions (note that this is not true for *reduced* echelon form).So let's write a function that generates a random matrix with all entries either 0 (zero) or 1 (nonzero). This function takes a parameter that controls how likely a given entry is to be zero."random_boolean_matrix (generic function with 1 method)x
function random_boolean_matrix(m, n, zero_probability) A = rand(m, n) for i=1:m for j=1:n A[i, j] = Int(A[i, j] > zero_probability) end end return Aend3×5 Array{Float64,2}:
0.0 1.0 0.0 0.0 1.0
0.0 0.0 1.0 1.0 1.0
0.0 1.0 0.0 1.0 1.0xxxxxxxxxxC = random_boolean_matrix(3, 5, 0.5)false: leading entry in row 3 is not to the right of the leading entry in row 2xxxxxxxxxxin_echelon_form_verbose(C)Now let's try the following experiment.
For different values of p between 0.0 and 1.0, we'll generate a bunch of random 01-matrices of a given size, where the chance that a given entry is 0 is p.
We'll graph what proportion of the time this matrix is in echelon form, as a function of p.
xxxxxxxxxxmd"""Now let's try the following experiment.For different values of `p` between 0.0 and 1.0, we'll generate a bunch of random 01-matrices of a given size, where the chance that a given entry is 0 is `p`.We'll graph what proportion of the time this matrix is in echelon form, as a function of `p`."""Choose dimensions
nrows =
ncols =
xxxxxxxxxxbegin using PlutoUI rowslider = nrows Slider(1:10, default=2, show_value=true) colslider = ncols Slider(1:10, default=3, show_value=true) md"""**Choose dimensions** `nrows = `$(rowslider) `ncols = `$(colslider) """ end1000ntrials = 1000 # size of bunch to generate for each value of p2×2 Array{Float64,2}:
0.0 0.0
0.0 0.0R = random_boolean_matrix(nrows, ncols, 0.5) # typical random matrix of this sizetrue: matrix is in echelon formin_echelon_form_verbose(R)x
begin # returns average number of times that random 01-matrix is in echelon form function accumulate_echelon(m, n, zero_prob, trials) s = 0 for i=1:trials s += Int(in_echelon_form(random_boolean_matrix(m, n, zero_prob))) end return s / trials end zero_probs = 0:0.001:1.0 likelihoods = [accumulate_echelon(nrows, ncols, p, ntrials) for p in zero_probs] plot( zero_probs, likelihoods, xlabel="Probablity p that individual matrix entry is zero", ylabel="Fraction of $(nrows)-by-$(ncols) matrices", label=["Echelon form" "Reduced echelon form"], title="How many random 01-matrices are in echelon form?", legend=false ) # plot!(zero_probs, [x^(2) for x in zero_probs])endIf p == 1.0 then our random matrix is always the zero matrix which is in echelon form. This is why when p is 1.0 the value of the graph is 1.0.
x
md"If `p == 1.0` then our random matrix is always the *zero matrix* which is in echelon form. This is why when `p` is 1.0 the value of the graph is 1.0."2×2 Array{Float64,2}:
0.0 0.0
0.0 0.0x
random_boolean_matrix(nrows, ncols, 1.0)true: matrix is in echelon formx
in_echelon_form_verbose(random_boolean_matrix(nrows, ncols, 1.0))If p == 0.0 then our random matrix is always the matrix of all ones which is never in echelon form if nrows > 1. This is why when p is 0.0 the value of the graph is 0.0.
x
md"If `p == 0.0` then our random matrix is always the *matrix of all ones* which is never in echelon form if `nrows > 1`. This is why when `p` is 0.0 the value of the graph is 0.0."2×2 Array{Float64,2}:
1.0 1.0
1.0 1.0xxxxxxxxxxrandom_boolean_matrix(nrows, ncols, 0.0)false: leading entry in row 2 is not to the right of the leading entry in row 1xxxxxxxxxxin_echelon_form_verbose(random_boolean_matrix(nrows, ncols, 0.0))Reduction to reduced echelon form
Below is a function RREF that converts a matrix to its reduced echelon form. You can unhide the code by clicking the icon on the left, if you want to take a look.
x
md"## Reduction to reduced echelon formBelow is a function `RREF` that converts a matrix to its reduced echelon form. You can unhide the code by clicking the icon on the left, if you want to take a look."RREF (generic function with 2 methods)xxxxxxxxxxbegin function find_nonzeros_in_column(A, row_start, col, tol=10e-12) (m, n) = size(A) return [i for i=row_start:m if abs(A[i,col]) > tol] end function columns(A) (m, n) = size(A) return n end function rowop_replace(A, source_row, target_row, scalar_factor) source_row != target_row for j=1:columns(A) A[target_row,j] += A[source_row,j] * scalar_factor end return A end function rowop_scale(A, target_row, scalar_factor) scalar_factor != 0 for j=1:columns(A) A[target_row,j] *= scalar_factor end A end function rowop_swap(A, s, t) for j=1:columns(A) A[t,j], A[s,j] = A[s,j], A[t,j] end A end function RREF_with_rowop_count(A, tol=10e-12) rowop_count = 0 A = float(copy(A)) (m, n) = size(A) row = 1 for col=1:n nonzeros = find_nonzeros_in_column(A, row, col, tol) if length(nonzeros) == 0 continue end i = nonzeros[1] if abs(A[i, col] - 1) < tol A[i, col] = 1 else A = rowop_scale(A, i, 1 / A[i, col]) rowop_count += 1 end for j=nonzeros[2:end] A = rowop_replace(A, i, j, -A[j, col]) rowop_count += 1 end if i != row A = rowop_swap(A, row, i) rowop_count += 1 end row += 1 end for row=m:-1:1 l = find_leading(A, row, tol) if l > 0 for k=1:row - 1 if abs(A[k,l]) > tol rowop_replace(A, row, k, -A[k,l]) rowop_count += 1 end end end end # round entries to give nicer output; could omit this step for row=1:m for col=1:n if abs(A[row,col] - round(A[row,col])) < tol A[row,col] = round(A[row,col]) end end end return A, rowop_count end function RREF_RowOpCount(A, tol=10e-12) A, ops = RREF_with_rowop_count(A, tol) return ops end function RREF(A, tol=10e-12) A, ops = RREF_with_rowop_count(A, tol) return A endendHere is some more code to compute and count the pivot positions in a matrix.
xxxxxxxxxxmd"Here is some more code to compute and count the pivot positions in a matrix."pivot_positions (generic function with 2 methods)x
function pivot_positions(A, tol=10e-12) rref = RREF(A, tol) return [ (i, find_leading(rref, i)) for i=1:size(A)[1] if find_leading(rref, i) > 0 ]endnpivots (generic function with 2 methods)x
function npivots(A, tol=10e-12) return length(pivot_positions(A, tol))end3×4 Array{Int64,2}:
0 3 -6 6
3 -7 8 -5
3 -9 12 -9x
M = [0 3 -6 6; 3 -7 8 -5; 3 -9 12 -9]3×4 Array{Float64,2}:
1.0 0.0 -2.0 3.0
0.0 1.0 -2.0 2.0
0.0 0.0 -0.0 0.0xxxxxxxxxxRREF(M)truexxxxxxxxxxin_reduced_echelon_form(RREF(M))1
1
2
2
x
pivot_positions(M)2xxxxxxxxxxnpivots(M)How many pivot positions does a random N-by-N matrix have?
The possible values range from zero to N.
xxxxxxxxxxmd"How many pivot positions does a random N-by-N matrix have?The possible values range from zero to N."M =
N =
x
begin rslider = n1 Slider(1:20, default=2, show_value=true) sslider = n2 Slider(1:20, default=2, show_value=true) md""" `M = `$(rslider) `N = `$(sslider) """end0
1
xxxxxxxxxxvalue_range = 0:10
0
1
23
301
675
xxxxxxxxxxbegin pivot_counts = [0 for i=0:n1] for i=1:1000 pivot_counts[1 + npivots(rand(value_range, n1, n2))] += 1 end pivot_counts endxxxxxxxxxxpie(pivot_counts, legend = false, title="Counting $(n1)-by-$(n2) matrices by number of pivot positions")We see from these examples that almost every random matrix, of sufficiently large size with sufficiently unconstrained entries, has the largest possible number of pivot positions (which is the whichever of M or N is larger).
xxxxxxxxxxmd"We see from these examples that almost every random matrix, of sufficiently large size with sufficiently unconstrained entries, has the largest possible number of pivot positions (which is the whichever of M or N is larger)."Effort to compute RREF(A)
One final experiment for today.
Given a typical matrix A, how many row operations should we have to perform to compute RREF(A)? Could we get unlucky and have to do many more operations in one case compared to another?
To investigate, let's do the following. To simplify things, we'll just consider square matrices, and we'll just generate random 01-matrices as above.
For each probabilty p controlling how likely a given matrix entry is zero, we will generate a large number of random 01-matrices.
For these matrices A, we'll calculate the average number of row operations needed to compute RREF(A) and the standard deviation. Then we can plot both as a function of p.
xxxxxxxxxxmd"""## Effort to compute `RREF(A)`One final experiment for today.Given a typical matrix A, how many row operations should we have to perform to compute RREF(A)? Could we get unlucky and have to do many more operations in one case compared to another?To investigate, let's do the following. To simplify things, we'll just consider square matrices, and we'll just generate random 01-matrices as above. For each probabilty p controlling how likely a given matrix entry is zero, we will generate a large number of random 01-matrices. For these matrices A, we'll calculate the average number of row operations needed to compute `RREF(A)` and the standard deviation. Then we can plot both as a function of p."""1000xxxxxxxxxxrtrials = 1000N =
xxxxxxxxxxbegin zslider = z Slider(2:200, default=2, show_value=true) md""" `N = `$(zslider) """endx
begin function accumulate_rowops(m, n, zero_prob, trials) s = 0 for i=1:trials # s += RREF_RowOpCount(rand(-1:1,m, n)) s += RREF_RowOpCount(random_boolean_matrix(m, n, zero_prob)) end mean = s / trials return mean end function accumulate_rowops_std(m, n, zero_prob, trials, mean) s = 0 for i=1:trials # s += (RREF_RowOpCount(rand(-1:1, m, n)) - mean)^2 s += (RREF_RowOpCount(random_boolean_matrix(m, n, zero_prob)) - mean)^2 end std = sqrt(s / (trials - 1)) return std end x = 0:0.01:1 k = z averages = [accumulate_rowops(k, k, p, rtrials) for p in x] deviations = [ accumulate_rowops_std(k, k, x[i], rtrials, averages[i]) for i=1:length(x) ] plot( x, hcat(averages, deviations), xlabel="Probablity p that individual matrix entry is zero", ylabel="Number of row operations", title="RowOps to compute RREF for $(k)-by-$(k) 01-matrices", label=["mean" "std"] )endMoral of the story: if the size N of our matrix is large, then at almost all values of p the average number of row operations needed to compute RREF(A) is nearly the same, and the standard deviation is also flat.
This means that if N is large then in some sense we can't get too unlucky: all computations of RREF(A) will take close to the same number of operations.
But this is only in the limit. When N is small, there is some interesting variation in the number of row operations needed.
x
md"Moral of the story: if the size N of our matrix is large, then at almost all values of p the average number of row operations needed to compute `RREF(A)` is nearly the same, and the standard deviation is also flat.This means that if N is large then in some sense we can't get too unlucky: all computations of `RREF(A)` will take close to the same number of operations.But this is only in the limit. When N is small, there is some interesting variation in the number of row operations needed."Enter cell code...
xxxxxxxxxx