# Concave-convex vs Convex-convex

2015-09-08

Abstract

We look at a simple quadratic minimax problem over the Euclidean unit ball, and consider what happens when it flips from convex-concave to convex-convex.

# Introduction

This is a sneak preview of a nice result with Malek, Bartlett and Abbasi-Yadkori, to be presented in (Koolen et al. 2015). Fix $$\alpha \in \mathbb R$$ and $$\boldsymbol{b}\in \mathbb R^d$$ such that $$\|\boldsymbol{b}\| \le 1$$. Let us consider the following min-max problem $V_\alpha ~:=~ \min_{\boldsymbol{a}\in \mathbb R^d} \max_{\substack{\boldsymbol{x}\in \mathbb R^d \\ \|\boldsymbol{x}\| \le 1}} ~ \|\boldsymbol{a}-\boldsymbol{x}\|^2 + (\alpha-1) \|\boldsymbol{x}\|^2 + 2 \boldsymbol{b}^\intercal\boldsymbol{x}$ as a function of $$\alpha$$. (Here $$\|\boldsymbol{x}\|^2 = \boldsymbol{x}^\intercal\boldsymbol{x}$$ denotes the Euclidean norm squared.) In this post we establish that the value and minimiser are $V_\alpha ~=~ \begin{cases} \frac{\|\boldsymbol{b}\|^2}{1-\alpha} & \text{if } \alpha \le 0, \\[.2em] \|\boldsymbol{b}\|^2 + \alpha & \text{if } \alpha \ge 0, \end{cases} \qquad \text{and} \qquad \boldsymbol{a} ~=~ \begin{cases} \frac{\boldsymbol{b}}{1-\alpha} & \text{if } \alpha \le 0, \\[.2em] \boldsymbol{b} & \text{if } \alpha \ge 0. \end{cases}$ The two pieces of $$V_\alpha$$ meet at $$\alpha = 0$$, where the value is $$\|\boldsymbol{b}\|^2$$. The left and right derivatives at $$\alpha = 0$$ are $\|\boldsymbol{b}\|^2 \qquad \text{and} \qquad 1,$ and so the function $$V_\alpha$$ is differentiable iff $$\|\boldsymbol{b}\|=1$$.
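The closed form above can be sanity-checked numerically. Below is a small NumPy sketch (the vector $$\boldsymbol{b}$$ is an arbitrary illustrative choice with $$\|\boldsymbol{b}\| \le 1$$): it approximates the inner maximum by sampling the unit sphere and adding the interior stationary point when feasible, then checks that the claimed minimiser attains the claimed value and that perturbing it can only increase the worst case.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3
b = np.array([0.3, -0.4, 0.2])   # arbitrary choice with ||b|| <= 1

def inner_max(a, alpha, n=100_000):
    # max_{||x|| <= 1} ||a - x||^2 + (alpha - 1)||x||^2 + 2 b.x,
    # approximated by sampling the unit sphere, plus the interior
    # stationary point x = (a - b)/alpha when alpha < 0 and it is feasible
    xs = rng.standard_normal((n, d))
    xs /= np.linalg.norm(xs, axis=1, keepdims=True)
    vals = np.sum((a - xs) ** 2, axis=1) + (alpha - 1) * np.sum(xs ** 2, axis=1) + 2 * xs @ b
    best = vals.max()
    if alpha < 0:
        x = (a - b) / alpha
        if np.linalg.norm(x) <= 1:
            best = max(best, np.sum((a - x) ** 2) + (alpha - 1) * np.sum(x ** 2) + 2 * b @ x)
    return best

def V_closed(alpha):
    return b @ b / (1 - alpha) if alpha <= 0 else b @ b + alpha

def a_star(alpha):
    return b / (1 - alpha) if alpha <= 0 else b

# the claimed minimiser attains the claimed value ...
gap = max(abs(inner_max(al_a := a_star(al), al) - V_closed(al))
          for al in (-2.0, -0.5, 0.0, 0.5, 2.0))
# ... and perturbing it can only increase the worst case
slack = min(inner_max(a_star(al) + 0.5 * rng.standard_normal(d), al) - V_closed(al)
            for al in (-2.0, -0.5, 0.0, 0.5, 2.0))
```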

Why is this min-max problem cool? Our problem is an easy example of a min-max function that flips (at $$\alpha=0$$) from convex-concave to convex-convex. The former type admits a nice minimax analysis. The latter is generally much more complicated to deal with. Yet in our example it is still analytically tractable.

[Figure: graph of $$V_\alpha$$]

# Analysis

Let us rewrite \begin{equation}\label{eq:value} V_\alpha ~=~ \min_{\boldsymbol{a}\in \mathbb R^d} \max_{\substack{\boldsymbol{x}\in \mathbb R^d \\ \|\boldsymbol{x}\| \le 1}} ~ \|\boldsymbol{a}\|^2 + \alpha \|\boldsymbol{x}\|^2 + 2 (\boldsymbol{b}- \boldsymbol{a})^\intercal\boldsymbol{x} . \end{equation}

## The convex-concave case $$\alpha \le 0$$

The objective is concave in $$\boldsymbol{x}$$. The maximum is hence found at zero derivative, i.e. $\boldsymbol{x} ~=~ \frac{\boldsymbol{a}- \boldsymbol{b}}{\alpha}$ (it remains to check that this is feasible), where the value equals $V_\alpha ~=~ \min_{\boldsymbol{a}\in \mathbb R^d} ~ \|\boldsymbol{a}\|^2 - \frac{\|\boldsymbol{a}- \boldsymbol{b}\|^2}{\alpha} .$ As $$\alpha \le 0$$ this is convex in $$\boldsymbol{a}$$, and hence minimised at vanishing derivative, that is, at $\boldsymbol{a} ~=~ \frac{\boldsymbol{b}}{1-\alpha}$ (back substitution reveals $$\boldsymbol{x}= \frac{\boldsymbol{b}}{1-\alpha}$$, which is indeed feasible as $$\|\boldsymbol{b}\| \le 1$$ by assumption and we are considering the case $$\alpha \le 0$$), where the value equals $V_\alpha ~=~ \frac{\|\boldsymbol{b}\|^2}{1-\alpha} .$ Note that $$\max_{\boldsymbol{x}} \min_{\boldsymbol{a}}$$ results in inner minimiser $$\boldsymbol{a}=\boldsymbol{x}$$ and then in outer maximiser $$\boldsymbol{x}= \frac{\boldsymbol{b}}{1-\alpha}$$. Therefore the pair $$\boldsymbol{a}=\boldsymbol{x}=\frac{\boldsymbol{b}}{1-\alpha}$$ forms a saddle point or Nash equilibrium (the two strategies are mutual best responses).
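The saddle-point property is easy to verify numerically. A small NumPy sketch (with illustrative $$\alpha$$ and $$\boldsymbol{b}$$): the maximiser's best response to $$\boldsymbol{a}= \frac{\boldsymbol{b}}{1-\alpha}$$ is that same point, and the objective there equals $$\frac{\|\boldsymbol{b}\|^2}{1-\alpha}$$.

```python
import numpy as np

alpha = -0.5
b = np.array([0.3, -0.4, 0.2])           # any b with ||b|| <= 1
s = b / (1 - alpha)                      # claimed saddle point a = x = b/(1-alpha)

x_br = (s - b) / alpha                   # best response of the max player to a = s
# best response of the min player to x = s is a = s itself (it minimises ||a - x||^2)
value = (alpha - 1) * s @ s + 2 * b @ s  # objective at a = x = s (first term vanishes)
```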

## The convex-convex case $$\alpha \ge 0$$

The objective is convex in $$\boldsymbol{x}$$, and hence maximised on the unit ball at a point of the unit sphere. On the sphere $$\|\boldsymbol{x}\|^2 = 1$$ is constant, so the maximiser is the point that maximises the linear term, i.e. $\boldsymbol{x}~=~ \frac{\boldsymbol{b}-\boldsymbol{a}}{\|\boldsymbol{b}-\boldsymbol{a}\|}$ (assuming $$\boldsymbol{a}\ne \boldsymbol{b}$$). Plugging this in, we find $V_\alpha ~=~ \min_{\boldsymbol{a}\in \mathbb R^d} ~ \|\boldsymbol{a}\|^2 + \alpha + 2 \|\boldsymbol{b}- \boldsymbol{a}\| .$ This is a convex function of $$\boldsymbol{a}$$, although it is not differentiable due to the norm. Yet it is minimised where zero is a sub-gradient. In our case this happens at $\boldsymbol{a} ~=~ \boldsymbol{b},$ where we used the assumption that $$\|\boldsymbol{b}\| \le 1$$. We conclude $V_\alpha ~=~ \|\boldsymbol{b}\|^2 + \alpha .$ Notice that the choice $$\boldsymbol{a}=\boldsymbol{b}$$ kills the linear term in $$\boldsymbol{x}$$ of the objective $$\eqref{eq:value}$$, and hence equalises the objective over the unit sphere.
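That the choice $$\boldsymbol{a}=\boldsymbol{b}$$ equalises the objective over the unit sphere can be checked directly (a NumPy sketch; $$\alpha$$ and $$\boldsymbol{b}$$ are arbitrary illustrative values):

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, b = 0.7, np.array([0.3, -0.4, 0.2])

xs = rng.standard_normal((1000, 3))
xs /= np.linalg.norm(xs, axis=1, keepdims=True)       # points on the unit sphere
# objective with a = b; on the sphere ||x||^2 = 1
vals = np.sum((b - xs) ** 2, axis=1) + (alpha - 1) + 2 * xs @ b
spread = vals.max() - vals.min()                      # should be ~0
```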

# Using a single algorithm throughout

We found two algorithms, one for $$\alpha \le 0$$ and one for $$\alpha \ge 0$$. Now suppose we can only play one of them in all cases. How bad would that be?

First suppose we always play $$\boldsymbol{a}=\boldsymbol{b}$$. This kills the linear term in $$\boldsymbol{x}$$. So the objective is maximised in $$\boldsymbol{x}$$ either anywhere on the unit sphere if $$\alpha \ge 0$$ or at $$\boldsymbol{x}=0$$ if $$\alpha \le 0$$. The worst-case value is hence $\begin{cases} \|\boldsymbol{b}\|^2 & \text{if } \alpha \le 0, \\ \|\boldsymbol{b}\|^2 + \alpha & \text{if } \alpha \ge 0. \end{cases}$ For $$\alpha \le 0$$ this exceeds the minimax value $$\frac{\|\boldsymbol{b}\|^2}{1-\alpha}$$ by a factor $$1-\alpha$$, so this choice is only optimal in the convex-convex regime.
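This worst case can be confirmed by brute force (NumPy sketch; including $$\boldsymbol{x}=0$$ and a unit vector among the samples covers both candidate maximisers exactly):

```python
import numpy as np

rng = np.random.default_rng(2)
b = np.array([0.3, -0.4, 0.2])

# points in the unit ball, plus the two candidate maximisers x = 0 and a unit vector
xs = rng.standard_normal((50_000, 3))
xs *= rng.uniform(size=(50_000, 1)) ** (1 / 3) / np.linalg.norm(xs, axis=1, keepdims=True)
xs = np.vstack([xs, np.zeros(3), np.array([1.0, 0.0, 0.0])])

def worst_case(alpha):
    # playing a = b leaves the objective ||b||^2 + alpha ||x||^2
    return (b @ b + alpha * np.sum(xs ** 2, axis=1)).max()

gap = max(abs(worst_case(al) - (b @ b + max(al, 0.0)))
          for al in (-1.0, -0.1, 0.0, 0.3, 2.0))
```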

Now suppose we always play $$\boldsymbol{a}= \frac{\boldsymbol{b}}{1-\alpha}$$ (which requires $$\alpha \ne 1$$). The inner problem becomes $\max_{\substack{\boldsymbol{x}\in \mathbb R^d \\ \|\boldsymbol{x}\| \le 1}} ~ \frac{\|\boldsymbol{b}\|^2}{(1-\alpha)^2} + \alpha \|\boldsymbol{x}\|^2 - \frac{2 \alpha}{1-\alpha} \boldsymbol{b}^\intercal\boldsymbol{x} ,$ which is maximised at $$\boldsymbol{x}= \frac{ - \frac{\alpha}{1-\alpha} \boldsymbol{b}}{\| - \frac{\alpha}{1-\alpha} \boldsymbol{b}\|}$$ if $$\alpha \ge 0$$, where it takes value $\frac{\|\boldsymbol{b}\|^2}{(1-\alpha)^2} + \alpha + \frac{2 \alpha}{|1-\alpha|} \|\boldsymbol{b}\| ,$ or at $$\boldsymbol{x}=\frac{\boldsymbol{b}}{1-\alpha}$$ if $$\alpha \le 0$$, where it takes value $\frac{\|\boldsymbol{b}\|^2}{1-\alpha}$ as before.
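For $$\alpha \ge 0$$ (and $$\alpha \ne 1$$) this value can be cross-checked by sampling the sphere and including the claimed maximiser among the candidates (NumPy sketch, illustrative values of $$\alpha$$ and $$\boldsymbol{b}$$):

```python
import numpy as np

rng = np.random.default_rng(3)
b = np.array([0.3, -0.4, 0.2])
xs = rng.standard_normal((100_000, 3))
xs /= np.linalg.norm(xs, axis=1, keepdims=True)        # samples on the unit sphere

def gap_at(alpha):
    a = b / (1 - alpha)                                # the fixed play
    x_star = -(alpha / (1 - alpha)) * b
    x_star /= np.linalg.norm(x_star)                   # claimed maximiser
    X = np.vstack([xs, x_star])
    # objective on the sphere, where ||x||^2 = 1
    vals = np.sum((a - X) ** 2, axis=1) + (alpha - 1) + 2 * X @ b
    claimed = b @ b / (1 - alpha) ** 2 + alpha + 2 * alpha / abs(1 - alpha) * np.linalg.norm(b)
    return abs(vals.max() - claimed)

gap = max(gap_at(al) for al in (0.3, 2.0, 5.0))
```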

# Violated norm bound

We now look at what happens if we play an algorithm designed to operate against data satisfying the constraint $$\|\boldsymbol{x}\| \le 1$$, and encounter data violating that constraint.

## Convex-concave

Say $$\alpha \le 0$$ and we play $$\boldsymbol{a}= \frac{\boldsymbol{b}}{1-\alpha}$$. Then the objective is $\frac{\|\boldsymbol{b}\|^2 }{1-\alpha} + \alpha \Big\|\frac{\boldsymbol{b}}{1-\alpha} - \boldsymbol{x}\Big\|^2 ~\le~ \frac{\|\boldsymbol{b}\|^2 }{1-\alpha}$ regardless of $$\boldsymbol{x}$$, since the second term is non-positive.
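The rewriting above is a completing-the-square identity, which one can verify numerically for unconstrained $$\boldsymbol{x}$$ (NumPy sketch, illustrative $$\alpha$$ and $$\boldsymbol{b}$$):

```python
import numpy as np

rng = np.random.default_rng(4)
b = np.array([0.3, -0.4, 0.2])
alpha = -1.5
a = b / (1 - alpha)

xs = 5 * rng.standard_normal((100, 3))   # data violating ||x|| <= 1
# original objective ||a - x||^2 + (alpha - 1)||x||^2 + 2 b.x ...
obj = np.sum((a - xs) ** 2, axis=1) + (alpha - 1) * np.sum(xs ** 2, axis=1) + 2 * xs @ b
# ... equals the completed square ||b||^2/(1-alpha) + alpha ||b/(1-alpha) - x||^2
rhs = b @ b / (1 - alpha) + alpha * np.sum((a - xs) ** 2, axis=1)
dev = np.abs(obj - rhs).max()
```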

## Convex-convex

Say $$\alpha \ge 0$$ and we play $$\boldsymbol{a}=\boldsymbol{b}$$. Then the objective is $\|\boldsymbol{b}\|^2 + \alpha \|\boldsymbol{x}\|^2 ,$ which scales gracefully with the actual norm of the data: the worst case over $$\|\boldsymbol{x}\| \le R$$ is simply $$\|\boldsymbol{b}\|^2 + \alpha R^2$$.
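Expanding the square confirms this for any $$\boldsymbol{x}$$, bounded or not (NumPy sketch, illustrative values):

```python
import numpy as np

rng = np.random.default_rng(5)
b = np.array([0.3, -0.4, 0.2])
alpha = 0.7

x = 10 * rng.standard_normal(3)   # data far outside the unit ball
# objective with a = b; the cross terms -2 b.x and +2 b.x cancel
obj = np.sum((b - x) ** 2) + (alpha - 1) * np.sum(x ** 2) + 2 * b @ x
```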

So in both cases we see that we do not need to assume the norm bound on the data. The convex-concave worst case is independent of any norm bound. The convex-convex case adapts to the norm bound without needing to know it.

Koolen, Wouter M., Alan Malek, Peter L. Bartlett, and Yasin Abbasi-Yadkori. 2015. “Minimax Time Series Prediction.” In Advances in Neural Information Processing Systems (NIPS) 28, to appear.