2015-03-29

Abstract

Convex duality is an important tool in machine learning. Here I present an insightful arrangement of the graphs of a convex function, its dual and their derivatives.

Convex optimisation, and in particular convex duality, is an important technique that is very useful in machine learning. We investigate one way to visualise what is going on.

Let \(f : \mathbb R \to \mathbb R\)
be a convex function. The *convex conjugate* or *dual*
\(f^*\) of \(f\) is defined by \begin{equation*}
f^*(y) ~:=~ \sup_x x y - f(x)
\end{equation*} Under regularity conditions (Legendre), the supremum
above is attained where the derivative vanishes, and we find that the
maxiser \(x\) must satisfy \(y = \nabla f(x)\). We also have \begin{equation*}
\nabla f^* ~=~ (\nabla f)^{-1}
\end{equation*}

This identity suggests an arrangements of graphs relating a function to its dual via its derivative that I have not encountered before.

The function plotted above is the following: \begin{align*} f(x) &~=~ -\sin(x) & f'(x) &~=~ -\cos(x) \\ f^*(z) &~=~ \sqrt{1-z^2} + z \mathop{\mathrm{acos}}(-z) & {f^*}'(z) &~=~ \mathop{\mathrm{acos}}(-z) \end{align*}

Convex duality gets a little hairier when the convex function \(f\) is not differentiable, but with appropriate open-mindedness (set-valuedness) the above idea carries over. Here is the correspondence plot for the prototypical non-differentiable function: the absolute value.

And here are the corresponding formulas. \begin{align*} f(x) &~=~ |x| & f'(x) &~=~ \begin{cases} 1 & x>0 \\ [-1,1] & x=0 \\ -1 & x < 0 \end{cases} \\ f^*(z) &~=~ \begin{cases} 0 & -1 \le z \le 1 \\ \infty & \text{o.w.} \end{cases} & {f^*}'(z) &~=~ \begin{cases} - \infty & z < -1 \\ [-\infty,0] & z=-1 \\ 0 & -1 < z < 1 \\ [0,\infty] & z = 1 \\ \infty & z > 1 \end{cases} \end{align*}