Notebook 18 – Math 2121, Fall 2020
In today's notebook we discuss some applications of singular value decompositions.
xxxxxxxxxx
using LinearAlgebra
xxxxxxxxxx
using Plots
xxxxxxxxxx
plotly()
Every square matrix maps the unit disc to an ellipse
We proved this fact in class for
Let's confirm this by drawing some pictures.
plot_path (generic function with 2 methods)
xxxxxxxxxx
function plot_path(path, title="")
plot(path[1,:], path[2,:], aspect_ratio=:equal, title=title, legend=false)
scatter!([0], [0])
end
Construct the unit disc
2×2601 Transpose{Float64,Array{Float64,2}}:
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 … 0.968583 0.992115 1.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 -0.24869 -0.125333 -2.44929e-16
xxxxxxxxxx
begin
dr = 0.02
dt = 0.02
Disc = transpose(hcat(
[r * cos(2 * pi * t) for r=0:dr:1 for t=0.0:dt:1],
[r * sin(2 * pi * t) for r=0:dr:1 for t=0.0:dt:1 ]))
end
Make a
2×2 Array{Float64,2}:
-1.674 7.531
5.59 6.774
xxxxxxxxxx
M = rand(-10:0.001:10, 2, 2)
xxxxxxxxxx
begin
# compute SVD of matrix
U, S, V = svd(M)
p1 = plot_path(Disc, "Unit disc D")
# columns of V
quiver!(quiver = ([V[1, 1]],[V[2, 1]]), [0], [0], color=:yellow)
quiver!(quiver = ([V[1, 2]],[V[2, 2]]), [0], [0], color=:violet)
p2 = plot_path(M * Disc, "Ellipse {Mv : v in D}")
# columns of U * S
quiver!(quiver = ([U[1, 1] * S[1]],[U[2, 1] * S[1]]), [0], [0], color=:yellow)
quiver!(quiver = ([U[1, 2] * S[2]],[U[2, 2] * S[2]]), [0], [0], color=:violet)
plot(p1, p2)
end
Application: subspaces of best fit
Suppose we have a list of vectors
will have rank (at most)
But what if the vectors
Here is one way to do this:
Compute an SVD
.Set all entries in
outside the upper left block to zero.Call the resulting matrix
.Compute a basis for the column space of
.
The matrix
We will illustrate this idea below when
In this case, what we have just described is another way to compute a line of best fit.
noise (generic function with 2 methods)
xxxxxxxxxx
function noise(m, n, magnitude=0.001)
# returns an m-by-n matrix of noise
return magnitude * rand(m, n) .- magnitude/2
end
50
xxxxxxxxxx
m = 50
Create a random
2×50 Array{Float64,2}:
327.446 60.319 396.382 77.553 336.063 … 379.148 267.127 -258.51 422.233
37.278 6.867 45.126 8.829 38.259 43.164 30.411 -29.43 48.069
xxxxxxxxxx
data = rand(-10:0.001:10, 2, 1) * rand(-50:50, 1, m)
xxxxxxxxxx
begin
x1, y1 = minimum(data, dims=2);
x2, y2 = maximum(data, dims=2);
xlim = (x1 - 10, x2 + 10);
ylim = (y1 - 10, y2 + 10);
scatter(data[1,:], data[2,:], legend=false, title="Datapoints without noise")
plot!([data[1, 1]*x for x=-5:5], [data[2, 1]*x for x=-5:5], xlim=xlim, ylim=ylim)
end
xxxxxxxxxx
begin
ndata = data + noise(2, m, 20)
scatter(ndata[1,:], ndata[2,:], legend=false, title="Datapoints with noise")
_U, _S, _V = svd(ndata)
B = _U * Diagonal([_S[1], 0]) * transpose(_V)
plot!([B[1, 1]*x for x=-50:50], [B[2, 1]*x for x=-50:50], xlim=xlim, ylim=ylim)
end
Application: compressed sensing
Suppose you're running a service like Netflix where there are a large number of users who can watch some large number of movies. Each user rates each movie with a score between 1.0 and 5.0.
Consider the matrix whose rows are indexed by movies and whose columns are indexed by users, in which each entry is one user's score of one movie.
If every user had watched every movie, then we could fill out every entry in this matrix and we would know every user's preferences. We make the following hopeful assumption about this ideal matrix: it probably has low rank compared to the number of users and movies.
In other words, we assume there are some small number of abstract user 'types', maybe people who only like horror movies or only like movies in certain languages or only likes comedies, and the preferences of a typical user is (approximately) a linear combination of these basic types.
In practice, we only have access to a noisy version of this matrix, because not every user has watched every movie. For movies a user hasn't watched, we assign the default rating of 2.5.
Is it possible to recover (approximately) the ideal matrix from the noisy version? That is, can we predict what a user's rating on an unseen movie will be? This is a compressed sensing problem.
A version of this problem was the basis for the USD 1 million Netflix Prize about ten years ago.
4000
xxxxxxxxxx
users = 4000
400
xxxxxxxxxx
movies = 400
0.75
xxxxxxxxxx
watch_frequency = 0.75
Here is one way we might try to predict user ratings.
If the ideal ratings matrix
It follows that all but the largest
Then we try to recover
Then we set all but the first
If the noisy ratings matrix is not too noisy, then this very simple method of recovering the ideal ratings matrix can work fairly well.
In practice, perfect ratings predictions are difficult, but every small improvements to your prediction algorithm can be extremely valuable, if there are a very large number of users.
generate_noiseless_ratings_matrix (generic function with 1 method)
xxxxxxxxxx
function generate_noiseless_ratings_matrix(users, movies)
rank = rand(3:12)
basis = rand(1:5, rank, movies)
combinations = rand(users, rank)
combinations = Diagonal(sum(combinations, dims=2)[:])^-1 * combinations
noiseless_matrix = transpose(combinations * basis)
end
generate_noisy_ratings_matrix (generic function with 1 method)
xxxxxxxxxx
function generate_noisy_ratings_matrix(users, movies, watch_frequency)
# returns noisy rating matrix plus a function to compute error of
# recovered matrix compared to ideal rating matrix (which is not returned)
noiseless_matrix = generate_noiseless_ratings_matrix(users, movies)
noisy_matrix = noiseless_matrix[:,:]
m, n = size(noiseless_matrix)
noise_mask = zeros(m, n)
for i=1:m
for j=1:n
if rand() > watch_frequency
noisy_matrix[i, j] = 2.5
noise_mask[i, j] = 1
end
end
end
measure_error = x -> norm((noiseless_matrix - x) .* noise_mask)
return noisy_matrix, measure_error
end
400×4000 Array{Float64,2}: 3.72258 3.27579 2.5 2.5 3.63034 … 3.72601 3.8469 3.32659 3.91386 3.89421 4.05451 4.1789 4.28448 4.58943 4.1644 3.81497 3.24141 2.5 3.5913 2.5 2.5 2.24011 2.5 3.56145 3.64509 4.05129 2.6286 2.9963 3.18952 2.5 2.36413 2.91828 2.96441 2.96438 2.5 2.68022 2.17202 2.5 1.49161 3.23483 2.5 2.35855 2.20062 2.5 3.03937 2.84238 3.11622 2.5 2.63011 3.08712 … 2.86364 2.70931 2.94004 3.16261 3.28851 3.15564 2.5 4.17942 3.61483 3.38076 3.25995 2.82698 2.5 ⋮ ⋱ 2.77538 2.70192 2.28838 2.5 2.5 3.09185 2.8094 1.86173 3.68248 3.04346 2.5 2.20903 2.5 3.07184 … 3.16449 2.5 2.5 3.01874 3.24505 2.5 2.5 3.88973 2.5 2.5 3.08237 3.19508 4.02675 3.60239 3.26188 2.51181 4.14773 3.53322 3.66822 3.75195 2.5 2.5 4.43874 4.41847 4.30483 2.5 4.36425 4.44593 4.47941 2.5 4.24331 3.56029 3.0937 2.5 4.62951 2.5 2.5 3.66375 2.5 3.70153
#1
xxxxxxxxxx
noisy, measure_error = generate_noisy_ratings_matrix(users, movies, watch_frequency)
3749.64
233.945
212.981
211.019
64.1959
63.2061
62.4739
61.6208
57.1018
56.888
56.4822
56.1183
55.3432
54.643
53.7979
51.8557
51.0547
50.7022
50.2693
50.108
7.41396
7.32439
7.26869
7.20832
7.13759
7.05278
6.9317
6.80644
4.24778
4.19142
xxxxxxxxxx
svals = svd(noisy).S
xxxxxxxxxx
plot(log.(svals), legend=false, title="Logarithms of singular values of noisy matrix")
xxxxxxxxxx
plot(-diff(log.(svals))[1:25], legend=false,
title="Differences between logarithms of consecutive singular values")
The location of the last spike in this graph is a good guess for the rank of our ideal ratings matrix.
recover (generic function with 1 method)
xxxxxxxxxx
function recover(A, r)
# given SVD for A = U * Sigma * V^T, this method replaces
# all but first r singular values of A by zero
# and then returns U * (modified Sigma) * V^T
U, S, V = svd(A)
S = Diagonal([i > r ? 0.0 : S[i] for i=1:size(S)[1]])
return U * S * transpose(V)
end
Error without any attempt at recovery:
620.3058865020329
xxxxxxxxxx
measure_error(noisy)
Now let's attempt to recover the ideal ratings matrix use the SVD algorithm from above:
xxxxxxxxxx
error = [measure_error(recover(noisy, x)) for x = 1:25];
The minimum error should occur when our recovered matrix has the rank we predicted for the ideal ratings matrix.
x
plot(
error,
ylim = (0, maximum(error) * 1.1),
legend=false,
title="Error of recovered matrix compared to true matrix",
xlabel="x: rank of recovered matrix",)