Notebook 4 – Math 2121, Fall 2020
Running this notebook (optional)
You can run this notebook on your own by entering this link in the Open from file menu in Pluto.
Multiplying matrices and vectors
Suppose $A = \left[\begin{array}{cccc} a_1 & a_2 & \cdots & a_n \end{array}\right]$ is an $m\times n$ matrix with columns $a_1, a_2, \dots, a_n \in \mathbb{R}^m$, and $v \in \mathbb{R}^n$ is a vector with entries $v_1, v_2, \dots, v_n$.

In today's lecture we introduced a way of multiplying $A$ and $v$: the product is the vector $Av = v_1 a_1 + v_2 a_2 + \dots + v_n a_n \in \mathbb{R}^m$.

If $m = n = 2$ then $\left[\begin{array}{cc} a & b \\ c & d \end{array}\right]\left[\begin{array}{c} x \\ y \end{array}\right] = x\left[\begin{array}{c} a \\ c \end{array}\right] + y\left[\begin{array}{c} b \\ d \end{array}\right] = \left[\begin{array}{c} ax + by \\ cx + dy \end{array}\right]$.

For example:
2×2 Array{Int64,2}:
1 10
100 1000
matrix = [1 10; 100 1000]
2-element Array{Int64,1}:
 -1
  1
vector = [-1; 1]
2-element Array{Int64,1}:
   9
 900
matrix * vector
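The result is the linear combination of the matrix's columns with the vector's entries as weights. We can verify this directly (a quick check, not part of the original notebook):

# matrix * vector is the linear combination (-1) * (column 1) + 1 * (column 2)
-1 * [1; 100] + 1 * [10; 1000]   # == [9; 900] == matrix * vector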
Multiplying matrices
There is a simple way to extend this method of multiplying a matrix and a vector to a method of multiplying two matrices together. We will discuss this matrix multiplication operation in more depth in a few lectures, but here is a first look.
Assume $A$ and $B$ are both $n\times n$ matrices, that is, square matrices of the same size.

Then every column of $B$ is a vector in $\mathbb{R}^n$, so we can multiply it by $A$.

Write $B = \left[\begin{array}{cccc} b_1 & b_2 & \cdots & b_n \end{array}\right]$ where $b_1, b_2, \dots, b_n$ are the columns of $B$.

Then the matrix product of $A$ and $B$ is the $n\times n$ matrix $AB = \left[\begin{array}{cccc} Ab_1 & Ab_2 & \cdots & Ab_n \end{array}\right]$.
If $A = \left[\begin{array}{cc} a & b \\ c & d \end{array}\right]$ and $B = \left[\begin{array}{cc} w & x \\ y & z \end{array}\right]$ then $AB = \left[\begin{array}{cc} aw + by & ax + bz \\ cw + dy & cx + dz \end{array}\right]$.

The first column is $A\left[\begin{array}{c} w \\ y \end{array}\right]$ and the second column is $A\left[\begin{array}{c} x \\ z \end{array}\right]$.
A concrete example is $\left[\begin{array}{cc} 10 & 1 \\ 1000 & 100 \end{array}\right]\left[\begin{array}{cc} 2 & 4 \\ 6 & 8 \end{array}\right] = \left[\begin{array}{cc} 26 & 48 \\ 2600 & 4800 \end{array}\right]$.

We can double-check this computation in Julia.
2×2 Array{Int64,2}:
10 1
1000 100
A = [10 1; 1000 100]
2×2 Array{Int64,2}:
2 4
6 8
B = [2 4; 6 8]
2×2 Array{Int64,2}:
26 48
2600 4800
A * B
2×2 Array{Int64,2}:
26 48
2600 4800
begin
    # equivalent to computing A * B column by column:
    v1 = [2; 6]
    v2 = [4; 8]
    hcat(A * v1, A * v2)
end
This definition of $AB$ makes sense whenever $A$ is an $m\times p$ matrix and $B$ is a $p\times n$ matrix.

In that case, the result of multiplying $A$ and $B$ is an $m\times n$ matrix $AB$.

For simplicity, we focus today on the case when $m=p=n$.
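For instance, here is a quick dimension check in Julia (hypothetical matrices, just to illustrate the sizes):

let
    P = rand(-9:9, 2, 3)   # a 2×3 matrix, so m = 2 and p = 3
    Q = rand(-9:9, 3, 4)   # a 3×4 matrix, so p = 3 and n = 4
    size(P * Q)            # (2, 4): the product is an m×n = 2×4 matrix
end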
Efficiency
The question we want to explore: how fast can we multiply two $n\times n$ matrices?

In other words, what is the most efficient algorithm for matrix multiplication?

If you don't want to think about the new operation of multiplying two matrices, then it's equivalent to think about how fast we can multiply an $n\times n$ matrix by each of $n$ different vectors, since $AB = \left[\begin{array}{cccc} Ab_1 & Ab_2 & \cdots & Ab_n \end{array}\right]$.
To multiply matrices, we just need to be able to perform addition and multiplication of individual scalars (i.e., numbers). Multiplication operations are much more time consuming than additions.

So our efficiency question can be reduced to: what is the smallest number of scalar multiplications needed to multiply two $n\times n$ matrices?
The formula above for the $n=2$ case shows 8 multiplications.

In general, entry $(i,j)$ of $AB$ is $A_{i1}B_{1j} + A_{i2}B_{2j} + \dots + A_{in}B_{nj}$.

This formula involves $n$ scalar multiplications, and $AB$ has $n^2$ entries.

So you might expect that we need $n \cdot n^2 = n^3$ scalar multiplications to compute $AB$.

Let's write some code to verify this.
scalar_multiply (generic function with 1 method)
function scalar_multiply(a, b, count)
    # custom function to multiply scalars and return incremented multiplication count.
    # if a == 0 or b == 0 then this returns 0 with the same count;
    # otherwise returns a * b with count incremented by one.
    return (a == 0 || b == 0) ? (0, count) : (a * b, count + 1)
end
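For example, here is how the counter behaves (a quick check with made-up inputs):

# a zero factor returns 0 without counting; otherwise the count goes up by one
(scalar_multiply(3, 4, 0), scalar_multiply(0, 5, 7))   # gives ((12, 1), (0, 7))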
scalar_type (generic function with 1 method)
function scalar_type(A)
    # helper function that returns type (e.g., Int, Float64, ...) of matrix entries
    return minimum(size(A)) > 0 ? typeof(A[1,1]) : Int
end
matrix_multiply (generic function with 2 methods)
function matrix_multiply(A, B, count=0)
    # multiplies matrices A and B using the naive algorithm.
    # returns both the product AB and number of scalar multiplications used.
    (rows, p), (q, cols) = size(A), size(B)
    # check that matrix sizes match, so that AB is defined
    @assert p == q
    C = zeros(scalar_type(A), rows, cols)
    for i = 1:rows
        for j = 1:cols
            for t = 1:p
                value, count = scalar_multiply(A[i, t], B[t, j], count)
                C[i, j] += value
            end
        end
    end
    return C, count
end
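As a quick sanity check (with made-up matrices that have no zero entries), a $3\times 3$ product should use exactly $3^3 = 27$ multiplications:

let
    P = [1 2 3; 4 5 6; 7 8 9]
    Q = [9 8 7; 6 5 4; 3 2 1]
    matrix_multiply(P, Q)   # returns (P * Q, 27)
end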
Now let's actually check how many scalar multiplications are used in the naive algorithm when computing the product of two $n\times n$ matrices, say for $n=1,2,\dots,200$.
begin
    sizes = 1:200
    total_multiplications = []
    for n = sizes
        A = rand(Int, n, n)
        B = rand(Int, n, n)
        C, multiplication_count = matrix_multiply(A, B)
        append!(total_multiplications, multiplication_count)
        # check that our multiplication algorithm gives the correct answer!
        @assert C == A * B
    end
end
200-element Array{Any,1}:
 1
 8
 27
 64
 125
 216
 343
 512
 ⋮
 7880599
 8000000
total_multiplications
true
# the numbers look like they're exactly n^3, and they are:
total_multiplications == [n^3 for n in 1:length(total_multiplications)]
Faster algorithms
Now let's discuss a different matrix multiplication algorithm.
The following is called the Strassen algorithm, discovered by Volker Strassen in the 1960s.
The significance of this algorithm is just to demonstrate that there is a faster way to multiply matrices than the naive method.
The idea of the algorithm is, to compute $AB$:

First, we split both $A$ and $B$ into four submatrices. Then we express $AB$ as a sum of terms derived from multiplying these submatrices together. The clever part is to somehow write this using only 7 multiplications instead of the usual 8.

If $n = 2$ then the multiplications involve $1\times 1$ submatrices, that is, scalars. If $n > 2$ then we recursively apply the algorithm to multiply the submatrices together.
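Concretely, for the $2\times 2$ case the seven products and the reassembly look like this (a scalar version of the identities used by the recursive code below, using the entries from our earlier concrete example):

let
    a, b, c, d = 10, 1, 1000, 100    # entries of A = [a b; c d]
    w, x, y, z = 2, 4, 6, 8          # entries of B = [w x; y z]
    # the seven Strassen products
    m1 = (a + d) * (w + z)
    m2 = (c + d) * w
    m3 = a * (x - z)
    m4 = d * (y - w)
    m5 = (a + b) * z
    m6 = (c - a) * (w + x)
    m7 = (b - d) * (y + z)
    # reassembling AB uses only additions and subtractions
    [(m1 + m4 - m5 + m7)  (m3 + m5);
     (m2 + m4)            (m1 - m2 + m3 + m6)]   # == [a b; c d] * [w x; y z]
end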
We can improve this algorithm by modifying the final step of the recursion.

When $n$ is small (below some threshold value), it's faster to use the naive multiplication algorithm. We can choose this threshold below.

Setting this threshold to 1 means we would keep subdividing the matrices until they are $1\times 1$. This isn't the most efficient choice.
Fast multiply threshold:
begin
    using PlutoUI
    tslider = @bind THRESHHOLD Slider(1:40, default=6, show_value=true)
    md"""*Fast multiply threshold:* $(tslider)"""
end
fast_multiply (generic function with 2 methods)
begin
    function pad(A, n)
        # extends matrix A to an n-by-n matrix by adding zeros in vacant positions.
        #
        # returns the n-by-n matrix [A 0]
        #                           [0 0]
        (p, q), stype = size(A), scalar_type(A)
        return vcat(hcat(A, zeros(stype, p, n - q)), zeros(stype, n - p, n))
    end

    function fast_multiply(A, B, count=0; thrshld=THRESHHOLD)
        # multiplies matrices A and B using the Strassen algorithm (1969).
        # returns both the product AB and the number of scalar multiplications used.
        (rows, p), (q, cols) = size(A), size(B)
        # check that matrix sizes match, so that AB is defined
        @assert p == q
        # pads A and B with zeros if not the same size.
        # can ignore this code for square matrices.
        if size(A) != size(B)
            n = maximum([rows, p, q, cols])
            A, B = pad(A, n), pad(B, n)
        else
            n = rows
        end
        # if the size of the input matrices is at most the threshold value, then we
        # switch to the naive multiplication algorithm. this can gain some efficiency.
        if n <= thrshld
            return matrix_multiply(A, B, count)
        else
            # pads inputs with one extra row/column to have even-by-even size
            if n % 2 != 0
                n += 1
                A, B = pad(A, n), pad(B, n)
            end
            p = n ÷ 2   # integer division: equal to n/2 since n is even
            q = p + 1   # shorthand for n/2 + 1
            # divide A into 4 submatrices of size n/2-by-n/2
            A11 = A[1:p, 1:p]
            A12 = A[1:p, q:n]
            A21 = A[q:n, 1:p]
            A22 = A[q:n, q:n]
            # divide B into 4 submatrices of size n/2-by-n/2
            B11 = B[1:p, 1:p]
            B12 = B[1:p, q:n]
            B21 = B[q:n, 1:p]
            B22 = B[q:n, q:n]
            # uses only 7 multiplications.
            # we ignore +/- operations as these are faster than multiplications.
            M1, count = fast_multiply(A11 + A22, B11 + B22, count, thrshld=thrshld)
            M2, count = fast_multiply(A21 + A22, B11, count, thrshld=thrshld)
            M3, count = fast_multiply(A11, B12 - B22, count, thrshld=thrshld)
            M4, count = fast_multiply(A22, B21 - B11, count, thrshld=thrshld)
            M5, count = fast_multiply(A11 + A12, B22, count, thrshld=thrshld)
            M6, count = fast_multiply(A21 - A11, B11 + B12, count, thrshld=thrshld)
            M7, count = fast_multiply(A12 - A22, B21 + B22, count, thrshld=thrshld)
            C = zeros(scalar_type(A), n, n)
            # only +/- operations below, no multiplications!
            C[1:p, 1:p] = M1 + M4 - M5 + M7
            C[1:p, q:n] = M3 + M5
            C[q:n, 1:p] = M2 + M4
            C[q:n, q:n] = M1 - M2 + M3 + M6
            return C[1:rows, 1:cols], count
        end
    end
end
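As a quick sanity check (with made-up matrices), an odd-sized input exercises the padding step:

let
    P = [1 2 3; 4 5 6; 7 8 9]
    Q = [1 0 1; 0 1 0; 1 1 1]
    C, cnt = fast_multiply(P, Q, thrshld=1)
    C == P * Q   # true: padding with zeros does not change the product
end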
Let's compare both algorithms for $2\times 2$ matrices.
2×2 Array{Int64,2}:
73 52
37 -44
X = rand(-100:100, 2, 2)
2×2 Array{Int64,2}:
52 -9
-23 -73
Y = rand(-100:100, 2, 2)
2×2 Array{Int64,2}:
 2600  -4453
 2936   2879
8
# note: 8 multiplications
matrix_multiply(X, Y)
2×2 Array{Int64,2}:
 2600  -4453
 2936   2879
7
# note: only 7 multiplications!
fast_multiply(X, Y, thrshld=1)
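By contrast, with the default slider value a $2\times 2$ input is at or below the threshold, so fast_multiply just falls through to the naive algorithm:

# since 2 <= THRESHHOLD (default 6), this calls matrix_multiply: 8 multiplications
fast_multiply(X, Y)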
Now let's actually check how many scalar multiplications are used in the Strassen algorithm when computing the product of two $n\times n$ matrices, for $n=1,2,\dots$ up to $\min(40 \cdot \mathrm{THRESHHOLD},\ 200)$.
begin
    nvals = 1:minimum([40 * THRESHHOLD, 200])
    naive_multiplications = total_multiplications[1:length(nvals)]
    strassen_multiplications = []
    for n = nvals
        A = rand(Int, n, n)
        B = rand(Int, n, n)
        C, multiplication_count = fast_multiply(A, B)
        append!(strassen_multiplications, multiplication_count)
        # check that our multiplication algorithm gives the correct answer!
        @assert A * B == C
    end
end
Let's compare the multiplication counts.
200-element Array{Any,1}:
 1
 8
 27
 64
 125
 216
 344
 448
 ⋮
 5163776
 5166952
strassen_multiplications
200-element Array{Any,1}:
 1
 8
 27
 64
 125
 216
 343
 512
 ⋮
 7880599
 8000000
naive_multiplications
Consider arbitrary $n\times n$ matrices $A$ and $B$.

Let $N(n)$ be the number of scalar multiplications the naive algorithm uses to compute $AB$.

Let $S(n)$ be the number of scalar multiplications the Strassen algorithm uses to compute $AB$.

The plot below graphs the quotient $N(n)/S(n)$ as a function of $n$.

If the Strassen algorithm is faster, then this quotient should be greater than 1.
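A minimal sketch of the plotting cell, assuming the Plots package (the exact plotting code may have differed):

begin
    using Plots
    # ratio of naive to Strassen multiplication counts for each matrix size n
    quotients = naive_multiplications ./ strassen_multiplications
    plot(nvals, quotients, xlabel="n", ylabel="naive count / Strassen count", legend=false)
end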
When $n$ is very large, the number of multiplications needed by the Strassen algorithm will approach a constant times $n^{\log_2(7)} = n^{2.807\dots}$. The naive algorithm uses $n^{\log_2(8)} = n^3$ multiplications.

When properly implemented, the Strassen algorithm is practical, and computes $AB$ faster than the naive algorithm. This is not the best possible algorithm!

(Caution: the code in this notebook is for demo purposes only and is not efficiently implemented for practical use. There is no need for us to write our own matrix multiplication functions, as we can and should just use the built-in methods.)
Remarks
As of now, the fastest known matrix multiplication algorithm uses a constant times $n^{2.3728639}$ multiplications. This is not a practical algorithm: you only get a speedup for very large $n$.

When multiplying two matrices, you have to process all $2n^2$ entries, so if a matrix multiplication algorithm uses a constant times $n^\omega$ multiplications then we must have $\omega \geq 2$.

The best possible value of the exponent $\omega$ is still unknown.

Currently, we just know that the best value is somewhere in the range $2 \leq \omega \leq 2.3728639$.

See [wikipedia](https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm#Sub-cubic_algorithms) for more info.