Introduction to Proximal Gradient Methods

Michael Winkler

May 10, 2022, 1:30 p.m. S2 054

We consider an optimization problem in additively composite form, i.e., $\min_{x \in E}\, f(x) + g(x)$, where $E$ is a Euclidean space, $f$ is smooth and $g$ is convex. A common example of such problems is the so-called LASSO problem stemming from linear regression in statistics and machine learning. Since $g$ is in general not smooth, classical first-order methods are not directly applicable to these problems. An alternative approach is the Proximal Gradient Method, which instead exploits the convexity of $g$: each iteration consists of a gradient step with respect to $f$ followed by a so-called proximal mapping with respect to $g$ applied to the result of that gradient step. For $g \equiv \lambda \|\cdot\|_1$, this method yields the well-known ISTA (Iterative Soft Thresholding Algorithm). We present the theoretical foundation of the Proximal Gradient Method, involving subdifferentiability and the proximal mapping, as well as some first convergence results which ensure convergence to a stationary point of the considered problem under suitable assumptions.
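For illustration, the following is a minimal sketch of the Proximal Gradient Method in the ISTA case, applied to the LASSO problem $\min_x \tfrac{1}{2}\|Ax - b\|_2^2 + \lambda \|x\|_1$. The function names, the fixed step size, and the iteration count below are chosen for this example only and are not part of the talk.

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal mapping of tau * ||.||_1 (componentwise soft thresholding).
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(A, b, lam, step=None, max_iter=500):
    # Proximal gradient method for min_x 0.5*||Ax - b||^2 + lam*||x||_1.
    # Illustrative sketch: fixed step size, fixed number of iterations.
    if step is None:
        # Step size 1/L, where L = ||A||_2^2 is a Lipschitz constant of the gradient of f.
        step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(max_iter):
        grad = A.T @ (A @ x - b)                          # gradient step w.r.t. the smooth part f
        x = soft_threshold(x - step * grad, step * lam)   # proximal step w.r.t. g = lam*||.||_1
    return x
```

Here the proximal mapping of $\lambda \|\cdot\|_1$ has the closed-form soft-thresholding expression used above; for other choices of $g$, the proximal step would be replaced accordingly.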