Convex duality is an important tool in machine learning. Here I present an insightful arrangement of the graphs of a convex function, its dual and their derivatives.

# Introduction

Convex optimisation, and in particular convex duality, is an important technique that is very useful in machine learning. We investigate one way to visualise what is going on.

# Convex duality

Let $$f : \mathbb R \to \mathbb R$$ be a convex function. The convex conjugate or dual $$f^*$$ of $$f$$ is defined by \begin{equation*} f^*(y) ~:=~ \sup_x x y - f(x) \end{equation*} Under regularity conditions (Legendre), the supremum above is attained where the derivative vanishes, and we find that the maxiser $$x$$ must satisfy $$y = \nabla f(x)$$. We also have \begin{equation*} \nabla f^* ~=~ (\nabla f)^{-1} \end{equation*}

This identity suggests an arrangements of graphs relating a function to its dual via its derivative that I have not encountered before.

The function plotted above is the following: \begin{align*} f(x) &~=~ -\sin(x) & f'(x) &~=~ -\cos(x) \\ f^*(z) &~=~ \sqrt{1-z^2} + z \operatorname{acos}(-z) & {f^*}'(z) &~=~ \operatorname{acos}(-z) \end{align*}

# Bonus material: non-differentiable $$f$$

Convex duality gets a little hairier when the convex function $$f$$ is not differentiable, but with appropriate open-mindedness (set-valuedness) the above idea carries over. Here is the correspondence plot for the prototypical non-differentiable function: the absolute value.

And here are the corresponding formulas. \begin{align*} f(x) &~=~ |x| & f'(x) &~=~ \begin{cases} 1 & x>0 \\ [-1,1] & x=0 \\ -1 & x < 0 \end{cases} \\ f^*(z) &~=~ \begin{cases} 0 & -1 \le z \le 1 \\ \infty & \text{o.w.} \end{cases} & {f^*}'(z) &~=~ \begin{cases} - \infty & z < -1 \\ [-\infty,0] & z=-1 \\ 0 & -1 < z < 1 \\ [0,\infty] & z = 1 \\ \infty & z > 1 \end{cases} \end{align*}