Dynamics of Linear Extrapolation in Two Dimensions With Applications
Abstract
In this dissertation we study the dynamics of linear extrapolation at an attractive fixed
point in two dimensions. We show that the linear extrapolation map is continuous in the
initial condition x0 and that it commutes with any unitary transformation. We then discuss
and classify linear extrapolation for various 2 × 2 matrices. We begin by studying the
stability of linear extrapolation when it follows the linear maps defined by 2 × 2 Jordan
forms. We find that linear extrapolation sometimes accelerates convergence and sometimes
does not, and we characterize, by Jordan form, the cases in which it helps. For each type
of Jordan form, we determine when linear extrapolation yields a greater improvement than
simple iteration. We then study linear extrapolation with a general
2 × 2 matrix with real or complex eigenvalues and derive formulas for when the norm of the
linear extrapolation result is less than one. Unfortunately, these formulas are very
complicated and do not produce useful conditions for using linear extrapolation.
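To fix ideas, the following is a minimal sketch, assuming the iteration x_{k+1} = A x_k with an attractive fixed point at the origin. The componentwise Aitken delta-squared step is one standard form of linear extrapolation and is used here for illustration only; the dissertation's exact map may be defined differently.

```python
import numpy as np

def iterate(A, x0, k):
    """Apply the linear map x -> A x a total of k times; return all iterates."""
    xs = [np.asarray(x0, dtype=float)]
    for _ in range(k):
        xs.append(A @ xs[-1])
    return xs

def extrapolate(x0, x1, x2):
    """Componentwise Aitken delta-squared step: one plausible linear
    extrapolation (an assumption; the dissertation's map may differ)."""
    d1, d2 = x1 - x0, x2 - x1
    denom = d2 - d1
    safe = np.where(np.abs(denom) > 1e-15, denom, 1.0)
    # Fall back to the plain iterate where the denominator vanishes.
    return np.where(np.abs(denom) > 1e-15, x0 - d1**2 / safe, x2)

# An attractive but non-diagonalizable 2 x 2 Jordan block.
A = np.array([[0.9, 1.0],
              [0.0, 0.9]])
x0, x1, x2 = iterate(A, [1.0, 1.0], 2)
xe = extrapolate(x0, x1, x2)
print("plain iterate:", x2, "norm:", np.linalg.norm(x2))
print("extrapolated :", xe, "norm:", np.linalg.norm(xe))
```

For a diagonal A with eigenvalues inside the unit disk this step lands on the fixed point exactly; for the Jordan block above it overshoots and the extrapolated norm exceeds the plain iterate's, matching the observation that extrapolation does not always help.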
Next, we study linear extrapolation following Block Coordinate Descent in two
dimensions. We show that the norm of the linear extrapolation map after two iterations can
be greater than one for some initial conditions; indeed, for some problems and initial
conditions the extrapolation can become unbounded. In contrast, we show that if three or
more iterations are taken before extrapolating, the linear extrapolation works perfectly.
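As a hedged illustration under our own assumptions (the quadratic test function, the Aitken-style extrapolation step, and all names below are ours, not the dissertation's), the sketch contrasts extrapolating after two Block Coordinate Descent sweeps with extrapolating after three:

```python
import numpy as np

def bcd_sweep(Q, x):
    """One exact block-coordinate-descent sweep on f(x) = 0.5 x^T Q x in 2D:
    minimize over x[0] with x[1] fixed, then over x[1] with x[0] fixed."""
    x = x.copy()
    x[0] = -Q[0, 1] * x[1] / Q[0, 0]
    x[1] = -Q[0, 1] * x[0] / Q[1, 1]
    return x

def extrapolate(x0, x1, x2):
    """Componentwise Aitken step (one plausible linear-extrapolation form;
    the dissertation's exact map may differ)."""
    d1, d2 = x1 - x0, x2 - x1
    denom = d2 - d1
    safe = np.where(np.abs(denom) > 1e-15, denom, 1.0)
    return np.where(np.abs(denom) > 1e-15, x0 - d1**2 / safe, x2)

Q = np.array([[1.0, 0.9],
              [0.9, 1.0]])           # positive definite, strongly coupled
iterates = [np.array([1.0, -0.5])]   # an arbitrary starting point
for _ in range(3):
    iterates.append(bcd_sweep(Q, iterates[-1]))

# After only two sweeps the extrapolation still uses the raw start and can
# miss the minimizer; after three sweeps the last three iterates decay
# geometrically and the extrapolation lands on the origin exactly.
print("after 2 sweeps:", extrapolate(*iterates[0:3]))
print("after 3 sweeps:", extrapolate(*iterates[1:4]))
```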
Finally, we present an application of linear extrapolation to machine learning. We
show how linear extrapolation can be implemented effectively. We apply linear
extrapolation to a standard neural network program that uses Back Propagation and a
Stochastic Gradient approach to optimization, and we show that linear extrapolation greatly
accelerates the minimization process compared with the original program.
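The implementation details live in the body of the dissertation; the sketch below is only a rough rendering of the idea, with a linear least-squares model standing in for the neural network and a componentwise extrapolation applied to parameter snapshots every few epochs. The safeguard that accepts the extrapolated point only when it lowers the loss is our addition, not necessarily the dissertation's scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: least-squares fit, standing in for a network's training loss.
X = rng.normal(size=(200, 2))
w_true = np.array([2.0, -3.0])
y = X @ w_true + 0.01 * rng.normal(size=200)

def loss(w):
    return 0.5 * np.mean((X @ w - y) ** 2)

def sgd_epoch(w, lr=0.05, batch=20):
    """One epoch of minibatch stochastic gradient descent."""
    idx = rng.permutation(len(y))
    for s in range(0, len(y), batch):
        b = idx[s:s + batch]
        w = w - lr * X[b].T @ (X[b] @ w - y[b]) / len(b)
    return w

def extrapolate(w0, w1, w2):
    """Componentwise linear extrapolation from three parameter snapshots
    (our assumed variant; the dissertation's scheme may differ)."""
    d1, d2 = w1 - w0, w2 - w1
    denom = d2 - d1
    safe = np.where(np.abs(denom) > 1e-12, denom, 1.0)
    return np.where(np.abs(denom) > 1e-12, w0 - d1**2 / safe, w2)

w = np.zeros(2)
snaps = []
for epoch in range(30):
    w = sgd_epoch(w)
    snaps.append(w)
    if len(snaps) == 3:              # extrapolate every three epochs
        w_ex = extrapolate(*snaps)
        if loss(w_ex) < loss(w):     # safeguard: accept only if it helps
            w = w_ex
        snaps = []
print("final weights:", w, "loss:", loss(w))
```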
Keywords
linear extrapolation, Back Propagation, Stochastic Gradient