Notebook 17 – Math 2121, Fall 2020

In today's class we discussed how to compute a line of best fit by finding a least-squares solution to a certain linear system.

Here we explore the problem of finding a polynomial curve of best fit that passes through some given datapoints, which we imagine as having the form $(x, f (x))$ for some unknown function $f$ .

using Plots

Suppose xvalues = [x1, x2, ..., xn]. The following method constructs the $n \times m$ matrix (with $m = n$ by default)

$A = \left[\begin{array}{ccccc} 1 & x_{1} & x_{1}^{2} & \dots & x_{1}^{m - 1} \\ 1 & x_{2} & x_{2}^{2} & \dots & x_{2}^{m - 1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n} & x_{n}^{2} & \dots & x_{n}^{m - 1} \end{array}\right] .$

This is sometimes called a Vandermonde matrix.

If $f : \mathbb{R} \to \mathbb{R}$ is a function and we have $A \left[\begin{matrix} c_{0} \\ c_{1} \\ \vdots \\ c_{m - 1} \end{matrix}\right] = \left[\begin{matrix} f (x_{1}) \\ f (x_{2}) \\ \vdots \\ f (x_{n}) \end{matrix}\right] \in \mathbb{R}^{n}$ then

$p (x) = c_{0} + c_{1} x + c_{2} x^{2} + \dots + c_{m - 1} x^{m - 1}$

is a polynomial satisfying $p (x_{i}) = f (x_{i})$ for each $i = 1, 2, \dots, n$ .
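
As a small worked instance of the system above, the following sketch builds a Vandermonde matrix by broadcasting and solves for the coefficients. The sample points and the names `xs`, `ys`, `V`, `c` and `p` are illustrative assumptions, not part of the notebook.

```julia
# Three datapoints sampled from x^2 + 1 (an illustrative choice).
xs = [0.0, 1.0, 2.0]
ys = [1.0, 2.0, 5.0]

# Row i of V is [1, x_i, x_i^2]: a column of x-values broadcast against a row of exponents.
V = xs .^ (0:2)'

# Solve V * c = ys for the coefficients c_0, c_1, c_2.
c = V \ ys

# Evaluate p(x) = c_0 + c_1 x + c_2 x^2.
p(x) = sum(c[k] * x^(k - 1) for k in 1:length(c))
```

Since the data come from $x^{2} + 1$, the solve should recover coefficients close to $(1, 0, 1)$, and `p` should reproduce each `ys[i]` at `xs[i]`.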

function vandermonde(xvalues, m=0)
    n = length(xvalues)
    m = m == 0 ? n : m
    A = zeros(n, m)
    for i=1:n
        for j=1:m
            A[i, j] = xvalues[i]^(j - 1)
        end
    end
    return A
end

vandermonde([1, 3, 4, 5], 4)

4×4 Array{Float64,2}:
 1.0  1.0   1.0    1.0
 1.0  3.0   9.0   27.0
 1.0  4.0  16.0   64.0
 1.0  5.0  25.0  125.0


Any square Vandermonde matrix whose $x$ -values are distinct is invertible. Hence, there is a unique polynomial of degree at most $d - 1$ passing through any given $d$ datapoints $(x_{i}, y_{i})$ with $x_{1} < x_{2} < \dots < x_{d}$ .
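
One way to see why distinct $x$ -values matter is the classical determinant formula: the determinant of a square Vandermonde matrix equals the product of the differences $x_{j} - x_{i}$ over all $i < j$ . The sketch below checks this numerically; the variable names are illustrative assumptions.

```julia
using LinearAlgebra

xs = [1.0, 3.0, 4.0, 5.0]
V = xs .^ (0:length(xs)-1)'

# Product of all differences x_j - x_i with i < j.
product_of_differences = prod(xs[j] - xs[i] for i in 1:length(xs) for j in i+1:length(xs))

# Repeating an x-value makes two rows equal, so the determinant vanishes.
xs_repeated = [1.0, 3.0, 3.0, 5.0]
V_repeated = xs_repeated .^ (0:3)'
```

Here `det(V)` should agree with `product_of_differences`, while `det(V_repeated)` should be zero up to rounding.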

function equally_spaced_xvalues(a, b, n)
    @assert n >= 2
    @assert b > a
    # a range with an explicit length always has exactly n points, including b
    return collect(range(a, b, length=n))
end

equally_spaced_xvalues(-1, 1, 9)

9-element Array{Float64,1}:
 -1.0
 -0.75
 -0.5
 -0.25
  0.0
  0.25
  0.5
  0.75
  1.0


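The following method returns the polynomial of best fit (in the sense of least squares) to the input function fn at the given xvalues.

The argument degree is the number of coefficients of the returned polynomial, so its degree is at most degree - 1. If not given, the number of coefficients is the length of xvalues, and the polynomial then agrees with fn at each of the xvalues.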

function fit_polynomial(fn, xvalues, degree=0)
    b = [fn(x) for x=xvalues]
    A = vandermonde(xvalues, degree)
    coefficients = A \ b
    return x -> sum([x^(i - 1) * coefficients[i] for i=1:length(coefficients)])
end
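
The backslash solve above behaves differently depending on the shape of the matrix. The following sketch, with an illustrative sample function and hypothetical variable names, contrasts the square case (exact interpolation) with the tall case (least squares).

```julia
using LinearAlgebra

# Sample exp(x) at five points (an illustrative choice).
xs = collect(range(-1, 1, length=5))
b = exp.(xs)

# Square case: five coefficients for five points, so the fit interpolates.
A_square = xs .^ (0:4)'
residual_square = norm(A_square * (A_square \ b) - b)

# Tall case: three coefficients for five points, so `\` returns a least-squares solution.
A_tall = xs .^ (0:2)'
residual_tall = norm(A_tall * (A_tall \ b) - b)
```

In the square case the residual should vanish up to rounding, whereas in the tall case a quadratic cannot match all five samples and a nonzero residual remains.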

Below are some methods to plot our interpolations.

function compare_plots(fun, pol, xvalues, title="")
    d = 0.001
    rng = xvalues[1]:d:xvalues[end]
    p = plot(fun, rng, label="function", legend = :outertopright, title=title)
    plot!(pol, rng, label="polynomial")
    # mark the sample points on the x-axis (a vector of zeros gives a single series)
    scatter!(xvalues, zeros(length(xvalues)), label=false, color=:grey)
    return p
end

function polynomial_iterpolation_anim(maxpoints, f, a, b, xvalues, degree=0)
    # hold the final frame for 10 extra steps once npoints reaches maxpoints
    anim = @animate for npoints=[min(i, maxpoints) for i=2:maxpoints + 10]
        xvals = xvalues(a, b, npoints)
        # fit_polynomial treats `degree` as a coefficient count, so the degree is one less
        compare_plots(f, fit_polynomial(f, xvals, degree), xvals,
            "Degree $(degree == 0 ? npoints - 1 : degree - 1) interpolation with $(npoints) datapoints")
    end every 1
    gif(anim, "anim_fps15.gif", fps = 2)
end

Here are methods to measure the error of our interpolations.

function error(f, g, a, b)
    delta = (b - a) / 10000
    sum([abs(f(i) - g(i))^2 * delta for i=a:delta:b])
end
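
The method above is a Riemann-sum approximation of the integral of $|f (x) - g (x)|^{2}$ over $[a, b]$ . As a sanity check under that reading, the sketch below applies the same sum to a case where the integral is known exactly. The name `squared_error_sketch` and the keyword `steps` are hypothetical, chosen so the example does not clash with Julia's built-in `error`.

```julia
# Riemann-sum approximation of the integral of |f(x) - g(x)|^2 over [a, b].
function squared_error_sketch(f, g, a, b; steps=10000)
    delta = (b - a) / steps
    return sum(abs(f(x) - g(x))^2 * delta for x in a:delta:b)
end

# For f(x) = x and g(x) = 0 on [0, 1], the exact integral of x^2 is 1/3.
approx = squared_error_sketch(x -> x, x -> 0.0, 0, 1)
```

The value of `approx` should be close to $1/3$ , with a small discrepancy coming from the step size.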

function plot_log_error(maxpoints, f, a, b, xvalues, degree=0)
    # entry k of the plotted vector is the log error of the fit through k + 1 datapoints
    plot([
        log(error(f, fit_polynomial(f, xvalues(a, b, npoints), degree), a, b))
        for npoints = 2:maxpoints
    ], legend=false, title="Log error in successive polynomial interpolations",
    xlabel="degree of interpolation", ylabel="log error")
end

Consider the function $f (x) = \frac{1}{1 + 25 x^{2}}$ .

f = x -> 1 / (1 + 25 * x^2)

begin
    npoints = 9
    a, b = -1, 1
    equally_spaced = equally_spaced_xvalues(a, b, npoints)
    interpolation = fit_polynomial(f, equally_spaced)
    compare_plots(
        f, interpolation, equally_spaced, 
        "Polynomial interpolation with $(npoints) datapoints")
end

A major problem involved in polynomial interpolation is overfitting.

We see this for the preceding function when we sample evenly spaced points in the interval $[- 1, 1]$ .

As we add more datapoints and compute higher-degree polynomials that fit the data, the fit of these polynomials actually becomes worse, especially near the endpoints of the interval. This effect is known as Runge's phenomenon.

On other intervals, say $[- 1, 0]$ , this problem is less dramatic.
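
The growth of the error on $[- 1, 1]$ can also be checked numerically without plotting. The sketch below is self-contained and does not use the notebook's helpers; the names `runge` and `max_interpolation_error` are hypothetical.

```julia
runge(x) = 1 / (1 + 25 * x^2)

# Largest gap on [-1, 1] between the function and its interpolant
# through n equally spaced points.
function max_interpolation_error(n)
    xs = collect(range(-1, 1, length=n))
    coefficients = (xs .^ (0:n-1)') \ runge.(xs)
    p(x) = sum(coefficients[k] * x^(k - 1) for k in 1:n)
    grid = range(-1, 1, length=2001)
    return maximum(abs(runge(x) - p(x)) for x in grid)
end

# Maximum errors for 6, 11 and 21 equally spaced datapoints.
errors = [max_interpolation_error(n) for n in (6, 11, 21)]
```

The entries of `errors` should increase with the number of datapoints, which is the behaviour described above.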

polynomial_iterpolation_anim(20, f, a, b, equally_spaced_xvalues)

plot_log_error(40, f, a, b, equally_spaced_xvalues)

polynomial_iterpolation_anim(12, f, -1, 0, equally_spaced_xvalues)

plot_log_error(20, f, -1, 0, equally_spaced_xvalues)

One solution to the problem of overfitting is to fix the degree of our polynomial interpolation. This keeps the error from exploding as we add more datapoints.

On the other hand, the error may have a nontrivial lower bound and may not tend to zero as we add more data.

If the number of datapoints $n$ is greater than the number of coefficients $m$ (so that our polynomial has fixed degree at most $m - 1$ ), then it may be impossible to construct a polynomial that passes through every datapoint. In this case the polynomial of best fit corresponds to a least-squares (approximate) solution to the linear system

$A \left[\begin{matrix} c_{0} \\ c_{1} \\ \vdots \\ c_{m - 1} \end{matrix}\right] = \left[\begin{matrix} f (x_{1}) \\ f (x_{2}) \\ \vdots \\ f (x_{n}) \end{matrix}\right]$

where $A$ is the $n \times m$ Vandermonde matrix defined above.
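
A least-squares solution can be characterized by the normal equations $A^{T} A c = A^{T} b$ , or equivalently by the residual $b - A c$ being orthogonal to every column of $A$ . The sketch below checks both properties for an overdetermined Vandermonde system; the sizes and the names `f_sample`, `c_normal` and `residual` are illustrative assumptions.

```julia
using LinearAlgebra

f_sample(x) = 1 / (1 + 25 * x^2)

# n = 30 datapoints but only m = 6 coefficients, so the system is overdetermined.
xs = collect(range(-1, 1, length=30))
A = xs .^ (0:5)'
b = f_sample.(xs)

# Least-squares solution via backslash.
c = A \ b

# The same coefficients solve the normal equations A'A c = A'b.
c_normal = (A' * A) \ (A' * b)

# The residual is orthogonal to every column of A.
residual = b - A * c
```

Solving the normal equations directly is shown only for comparison; for larger $m$ the matrix $A^{T} A$ tends to be much worse conditioned than $A$ itself.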

fixed_degree = 20

polynomial_iterpolation_anim(45, f, a, b, equally_spaced_xvalues, fixed_degree)

plot_log_error(60, f, a, b, equally_spaced_xvalues, fixed_degree)

Another solution to the overfitting problem is to sample our input function at different $x$ -values.

function chebyshev_spaced_xvalues(a, b, n)
    @assert n >= 2
    @assert b > a
    cheb = [cos((2 * k - 1) / (2 * n) * pi) for k=n:-1:1]
    return [a + (i + 1) * (b - a) / 2 for i=cheb]
end

The $x$ -values returned by this method are not equally spaced between $a$ and $b$ .

However, if we sample our function at these points, the resulting polynomial interpolation is much more accurate, and we avoid overfitting as the degree of our interpolation increases:

polynomial_iterpolation_anim(30, f, a, b, chebyshev_spaced_xvalues)

plot_log_error(30, f, a, b, chebyshev_spaced_xvalues)

The error here is smaller than what we get by fixing the degree of our interpolation and using least squares.
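
For a direct numerical comparison of the two sampling schemes, the sketch below interpolates $f (x) = \frac{1}{1 + 25 x^{2}}$ through 21 equally spaced points and through 21 Chebyshev-spaced points on $[- 1, 1]$ , and measures the maximum error on a fine grid. It is self-contained, and the names `runge` and `max_error` are hypothetical.

```julia
runge(x) = 1 / (1 + 25 * x^2)

# Maximum error on [-1, 1] of the interpolant through the given nodes.
function max_error(nodes)
    n = length(nodes)
    coefficients = (nodes .^ (0:n-1)') \ runge.(nodes)
    p(x) = sum(coefficients[k] * x^(k - 1) for k in 1:n)
    return maximum(abs(runge(x) - p(x)) for x in range(-1, 1, length=2001))
end

n = 21
equal_nodes = collect(range(-1, 1, length=n))
cheb_nodes = [cos((2k - 1) / (2n) * pi) for k in n:-1:1]

equal_error = max_error(equal_nodes)
cheb_error = max_error(cheb_nodes)
```

With these choices `cheb_error` should be far smaller than `equal_error`.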

What are the numbers chebyshev_spaced_xvalues(a, b, n)?

The Chebyshev polynomials $T_{n} (x)$ of the first kind are defined by the recurrence

$T_{0} (x) = 1,$

$T_{1} (x) = x,$

$T_{n + 1} (x) = 2 x T_{n} (x) - T_{n - 1} (x) .$

These are the unique polynomials that satisfy $T_{n} (\cos (\theta)) = \cos (n \theta)$ .
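
The identity $T_{n} (\cos (\theta)) = \cos (n \theta)$ can be checked numerically by evaluating the recurrence and comparing with $\cos (n \arccos (x))$ on $[- 1, 1]$ . The sketch below evaluates the recurrence iteratively; `chebyshev_value` is a hypothetical helper, not one of the notebook's methods.

```julia
# Evaluate T_n(x) iteratively with the three-term recurrence.
function chebyshev_value(n, x)
    previous, current = one(x), x
    n == 0 && return previous
    for _ in 2:n
        previous, current = current, 2 * x * current - previous
    end
    return current
end

# Compare with the closed form cos(n * acos(x)) at a few points of [-1, 1].
points = range(-1, 1, length=11)
max_gap = maximum(abs(chebyshev_value(n, x) - cos(n * acos(x))) for n in 0:10 for x in points)
```

The value of `max_gap` should be at the level of rounding error. Evaluating iteratively also takes only $n$ steps per point, rather than recomputing lower-order terms repeatedly.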

function chebyshev(n)
    if n == 0
        return x -> 1
    elseif n == 1
        return x -> x
    else
        return x -> 2 * x * chebyshev(n - 1)(x) - chebyshev(n - 2)(x)
    end
end
        

begin
    p = plot(title="Chebyshev polynomials of the first kind")
    for i=1:5
        plot!(chebyshev(i), -1, 1, label="T_$(i)(x)", legend=:outerright)
    end
    plot(p)
end

The values returned by chebyshev_spaced_xvalues(-1, 1, n) are the roots of $T_{n} (x)$ .
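
This follows from $T_{n} (\cos (\theta)) = \cos (n \theta)$ : the points $\cos (\frac{2 k - 1}{2 n} \pi)$ correspond to angles at which $\cos (n \theta) = 0$ . A quick numerical check, written as a self-contained sketch with illustrative names:

```julia
n = 8
nodes = [cos((2k - 1) / (2n) * pi) for k in n:-1:1]

# T_n(x) = cos(n * acos(x)) on [-1, 1], so T_n should vanish at every node.
values_at_nodes = [cos(n * acos(x)) for x in nodes]
```

Every entry of `values_at_nodes` should be zero up to rounding.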

function plot_chebyshev(n)
    q = plot(chebyshev(n), -1, 1,
        title="Plot of T_$(n)(x) and Chebyshev-spaced x-values", legend=:outerright)
    xvalues = chebyshev_spaced_xvalues(-1, 1, n)
    # mark the Chebyshev-spaced x-values on the x-axis as a single series
    scatter!(xvalues, zeros(n), legend=false, color=:grey)
    plot(q)
end

plot_chebyshev(20)

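It can be shown that, among all choices of $n$ sample points in $[a, b]$ , the values chebyshev_spaced_xvalues(a, b, n) minimize the maximum of $|(x - x_{1}) (x - x_{2}) \cdots (x - x_{n})|$ over the interval, which is the only factor in the standard interpolation error bound that depends on the sample points. In this sense they are the best possible points for exact polynomial interpolation of degree at most $n - 1$ .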

See https://en.wikipedia.org/wiki/Chebyshev_nodes for more information.
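
The quantity $\max |(x - x_{1}) (x - x_{2}) \cdots (x - x_{n})|$ on $[- 1, 1]$ can be estimated on a fine grid. For the roots of $T_{n}$ it equals $2^{1 - n}$ , since the product is $T_{n} (x) / 2^{n - 1}$ . The sketch below compares this with equally spaced points; `max_node_polynomial` is a hypothetical helper.

```julia
# Maximum over [-1, 1] of |(x - x_1)(x - x_2)...(x - x_n)|, estimated on a fine grid.
function max_node_polynomial(nodes)
    grid = range(-1, 1, length=4001)
    return maximum(abs(prod(x - node for node in nodes)) for x in grid)
end

n = 9
cheb_nodes = [cos((2k - 1) / (2n) * pi) for k in 1:n]
equal_nodes = collect(range(-1, 1, length=n))

cheb_max = max_node_polynomial(cheb_nodes)    # theory: 2.0^(1 - n)
equal_max = max_node_polynomial(equal_nodes)
```

For $n = 9$ the equally spaced points should give a noticeably larger maximum than the Chebyshev-spaced ones.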
