Notebook 4 – Math 2121, Fall 2020

Running this notebook (optional)

You can run this notebook on your own by entering this link in the Open from file menu in Pluto.

Multiplying matrices and vectors

Suppose $A$ is an $m \times n$ matrix and $v \in \mathbb{R}^n$ is a vector.

In today's lecture we introduced a way of multiplying $A$ and $v$ to get a vector $Av \in \mathbb{R}^m$.

If $m = n = 2$ then this operation is
$$\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} A_{11}v_1 + A_{12}v_2 \\ A_{21}v_1 + A_{22}v_2 \end{bmatrix}.$$

For example:

matrix
2×2 Array{Int64,2}:
   1    10
 100  1000
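Here is a sketch of this computation in Julia; the vector's entries below are an illustrative choice:

```julia
matrix = [1 10; 100 1000]
vector = [1, 2]          # illustrative entries; any v in R^2 works

# Entry i of Av is row i of A dotted with v:
# [1*1 + 10*2, 100*1 + 1000*2] == [21, 2100]
matrix * vector
```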
Multiplying matrices

There is a simple way to extend this method of multiplying a matrix and a vector to a method of multiplying two matrices together. We will discuss this matrix multiplication operation in more depth in a few lectures, but here is a first look.


Assume A and B are both n×n matrices, that is, square matrices of the same size.

Then every column of $B$ is a vector $v \in \mathbb{R}^n$ that we can multiply with $A$.

Write $v^{(i)}$ for column $i$ of $B$, so that $B = \begin{bmatrix} v^{(1)} & v^{(2)} & \cdots & v^{(n)} \end{bmatrix}$.

Then the matrix product of $A$ and $B$ is the $n \times n$ matrix $AB = \begin{bmatrix} Av^{(1)} & Av^{(2)} & \cdots & Av^{(n)} \end{bmatrix}$.
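This column-by-column definition translates directly into Julia. A sketch, using an illustrative helper name and the built-in matrix-vector product:

```julia
# Form AB by applying A to each column of B and concatenating the results.
columnwise_product(A, B) = hcat((A * B[:, i] for i in 1:size(B, 2))...)

A = [1 2; 3 4]
B = [5 6; 7 8]
columnwise_product(A, B) == A * B   # true
```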

---

If $n = 2$ then
$$\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix} = \begin{bmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{bmatrix}.$$

The first column is $Av^{(1)}$ for $v^{(1)} = \begin{bmatrix} B_{11} \\ B_{21} \end{bmatrix}$ and the second column is $Av^{(2)}$ for $v^{(2)} = \begin{bmatrix} B_{12} \\ B_{22} \end{bmatrix}$.


A concrete example is $\begin{bmatrix} 10 & 1 \\ 1000 & 100 \end{bmatrix} \begin{bmatrix} 2 & 4 \\ 6 & 8 \end{bmatrix} = \begin{bmatrix} 26 & 48 \\ 2600 & 4800 \end{bmatrix}$.

We can double check this computation in Julia.

A
2×2 Array{Int64,2}:
   10    1
 1000  100

B
2×2 Array{Int64,2}:
 2  4
 6  8

2×2 Array{Int64,2}:
   26    48
 2600  4800

This definition of AB makes sense whenever A is an m×p matrix and B is a p×n matrix.

In that case, the result of multiplying A and B is an m×n matrix AB.

For simplicity, we focus today on the case when $m = p = n$.

Efficiency

The question we want to explore: how fast can we multiply two n×n matrices A and B?

In other words, what is the most efficient algorithm for matrix multiplication?

If you don't want to think about the new operation of multiplying two matrices, then it's equivalent to think about how fast we can multiply an n×n matrix A with a sequence of n vectors.

---

To multiply matrices, we just need to be able to perform addition and multiplication of individual scalars (i.e., numbers). Multiplication operations are much more time-consuming than additions.

So our efficiency question can be reduced to: what is the smallest number of scalar multiplications needed to multiply two n×n matrices?

---

The formula above for the n=2 case shows 8 multiplications:

$$A_{11}B_{11},\quad A_{11}B_{12},\quad A_{21}B_{11},\quad A_{21}B_{12},\quad A_{12}B_{21},\quad A_{12}B_{22},\quad A_{22}B_{21},\quad A_{22}B_{22}.$$

In general, entry $(i,j)$ in $AB$ is $A_{i1}B_{1j} + A_{i2}B_{2j} + A_{i3}B_{3j} + \cdots + A_{in}B_{nj}$.

This formula involves $n$ multiplications, and there are $n^2$ entries $(i,j)$.

So you might expect that we need $n^3$ multiplications (giving $8 = 2^3$ for $n = 2$) to compute $AB$.

Let's write some code to verify this.

scalar_multiply (generic function with 1 method)
scalar_type (generic function with 1 method)
matrix_multiply (generic function with 2 methods)
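A minimal sketch of what counting helpers like these could look like (the global counter here is an assumed bookkeeping device, not necessarily the notebook's approach):

```julia
# Global counter for scalar multiplications (one simple bookkeeping choice).
const MULTIPLICATION_COUNT = Ref(0)

# Multiply two scalars and record that one multiplication happened.
function scalar_multiply(x, y)
    MULTIPLICATION_COUNT[] += 1
    return x * y
end

# Naive matrix multiplication: entry (i,j) of AB is the dot product of
# row i of A with column j of B, costing n scalar multiplications each.
function matrix_multiply(A, B)
    m, p = size(A)
    p2, n = size(B)
    @assert p == p2 "inner dimensions must match"
    C = zeros(promote_type(eltype(A), eltype(B)), m, n)
    for i in 1:m, j in 1:n, k in 1:p
        C[i, j] += scalar_multiply(A[i, k], B[k, j])
    end
    return C
end
```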

Now let's actually check how many scalar multiplications are used in the naive algorithm when computing the product of two n×n matrices, say for n = 1, 2, …, 200.

[Plot: multiplications needed to multiply n-by-n matrices with the naive algorithm, for n up to 200; the count reaches 8×10⁶ = 200³ at n = 200.]

true
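With the sketched helpers above, this check could run along the following lines (the notebook's exact loop may differ):

```julia
# Count the multiplications used for each n and compare against n^3.
naive_counts = map(1:200) do n
    MULTIPLICATION_COUNT[] = 0
    matrix_multiply(rand(-9:9, n, n), rand(-9:9, n, n))
    MULTIPLICATION_COUNT[]
end

all(naive_counts[n] == n^3 for n in 1:200)   # true
```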
Faster algorithms

Now let's discuss a different matrix multiplication algorithm.

The following is called the Strassen algorithm, discovered by Volker Strassen in the 1960s.

For us, the significance of this algorithm is that it demonstrates there is a faster way to multiply matrices than the naive method.

The idea of the algorithm is, to compute AB where A and B are n×n matrices, proceed as follows:

  • First, we split both A and B into four n/2×n/2 submatrices.

  • Then we express AB as a sum of terms derived from multiplying these submatrices together.

  • The clever part is to somehow write this using only 7 multiplications instead of the usual 8 (see the sketch after this list).

  • If n=2 then the multiplications involve 1×1 submatrices, that is, scalars.

  • If n>2 then we recursively apply the algorithm to multiply the submatrices together.
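Here is the sketch promised above: Strassen's seven products in the 2×2 scalar case, following the standard Strassen identities (the function name is illustrative):

```julia
# Strassen's seven products for a 2x2 multiply: 7 scalar multiplications
# instead of the naive 8.
function strassen_2x2(A, B)
    m1 = (A[1,1] + A[2,2]) * (B[1,1] + B[2,2])
    m2 = (A[2,1] + A[2,2]) * B[1,1]
    m3 = A[1,1] * (B[1,2] - B[2,2])
    m4 = A[2,2] * (B[2,1] - B[1,1])
    m5 = (A[1,1] + A[1,2]) * B[2,2]
    m6 = (A[2,1] - A[1,1]) * (B[1,1] + B[1,2])
    m7 = (A[1,2] - A[2,2]) * (B[2,1] + B[2,2])
    # Reassemble the four entries of AB from the seven products.
    [m1 + m4 - m5 + m7   m3 + m5;
     m2 + m4             m1 - m2 + m3 + m6]
end

strassen_2x2([10 1; 1000 100], [2 4; 6 8]) == [26 48; 2600 4800]   # true
```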

---

We can improve this algorithm by modifying the final step of the recursion.

When n is small (below some threshold value), it is faster to use the naive multiplication algorithm. We can choose this threshold below.

Setting this threshold to 1 means we would keep subdividing the matrices until they are 1×1. This isn't the most efficient choice.
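Extending the 2×2 sketch to the recursive case with a cutoff, a fast_multiply along these lines is plausible (a sketch that assumes n is a power of 2; the notebook's actual implementation may differ):

```julia
# Recursive Strassen multiply with a cutoff: below the threshold we
# fall back to the naive algorithm. Assumes n is a power of 2.
function fast_multiply(A, B; threshold = 6)
    n = size(A, 1)
    n <= threshold && return matrix_multiply(A, B)   # naive base case
    h = n ÷ 2
    A11, A12, A21, A22 = A[1:h, 1:h], A[1:h, h+1:n], A[h+1:n, 1:h], A[h+1:n, h+1:n]
    B11, B12, B21, B22 = B[1:h, 1:h], B[1:h, h+1:n], B[h+1:n, 1:h], B[h+1:n, h+1:n]
    M(X, Y) = fast_multiply(X, Y; threshold = threshold)
    m1 = M(A11 + A22, B11 + B22)
    m2 = M(A21 + A22, B11)
    m3 = M(A11, B12 - B22)
    m4 = M(A22, B21 - B11)
    m5 = M(A11 + A12, B22)
    m6 = M(A21 - A11, B11 + B12)
    m7 = M(A12 - A22, B21 + B22)
    # Assemble the four blocks of AB from the seven block products.
    [m1 + m4 - m5 + m7   m3 + m5;
     m2 + m4             m1 - m2 + m3 + m6]
end
```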


Fast multiply threshold: 6

fast_multiply (generic function with 2 methods)

Let's compare both algorithms for 2×2 matrices.

X
2×2 Array{Int64,2}:
 73   52
 37  -44

Y
2×2 Array{Int64,2}:
  52   -9
 -23  -73
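With the sketched implementations, the comparison amounts to checking that both algorithms return the same product; setting the threshold to 1 forces the Strassen recursion even on a 2×2 input:

```julia
matrix_multiply(X, Y) == fast_multiply(X, Y; threshold = 1)   # true
```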

Now let's check how many scalar multiplications are used by the Strassen algorithm when computing the product of two n×n matrices for n = 1, 2, …, 200.


Let's compare the multiplication counts.

[Plot: multiplications needed to multiply n-by-n matrices, Strassen algorithm vs. naive algorithm, for n up to 200 (threshold = 6).]

Consider arbitrary n×n matrices A and B.

Let $M_{\text{naive}}(n)$ be the number of multiplications used by the naive algorithm to compute $AB$.

Let $M_{\text{strassen}}(n)$ be the number of multiplications used by the Strassen algorithm.

The plot below graphs the quotient $M_{\text{naive}}(n) / M_{\text{strassen}}(n)$.

If the Strassen algorithm is faster, then this quotient should be greater than 1.

[Plot: quotient of multiplication counts, (naive algorithm count) ÷ (fast algorithm count), threshold = 6, for n up to about 200.]

When n is very large, the number of multiplications needed by the Strassen algorithm will approach a constant times $n^{\log_2(7)} = n^{2.807\ldots}$. The naive algorithm uses $n^{\log_2(8)} = n^3$ multiplications.
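To see where this exponent comes from: running the recursion all the way down (threshold 1) replaces one size-$n$ multiply with 7 size-$n/2$ multiplies, so

$$M_{\text{strassen}}(n) = 7\,M_{\text{strassen}}(n/2), \qquad M_{\text{strassen}}(1) = 1 \quad\Longrightarrow\quad M_{\text{strassen}}(2^k) = 7^k = (2^k)^{\log_2 7} = n^{\log_2 7}.$$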

When properly implemented, the Strassen algorithm is practical and computes AB faster than the naive algorithm. Even so, it is not the best possible algorithm!

(Caution: the code in this notebook is for demo purposes only and is not efficiently implemented for practical use. There is no need for us to write our own matrix multiplication functions, as we can and should just use the built-in methods.)

Remarks

As of now, the fastest known matrix multiplication algorithm uses a constant times $n^{2.3728639}$ multiplications. This is not a practical algorithm: you only get a speedup for very large n.

When multiplying two matrices, you have to process all $2n^2$ entries, so if a matrix multiplication algorithm uses a constant times $n^{\omega}$ multiplications then we must have $\omega \geq 2$.

The best possible value of the exponent $\omega$ is still unknown.

Currently, we just know that the best value is somewhere in the range $2 \leq \omega \leq 2.3728639$.

See Wikipedia for more info.
