2016-09-01

Abstract

We study a simple type of power inequality that pops up in the analysis of learning algorithms. The inequality has no general analytic solution, and we find a tractable bound.

We look at inequalities of the form \begin{equation}\label{eq:first} x ~\le~ \alpha + \beta x^\kappa \end{equation} for \(\alpha \ge 0\), \(\beta \ge 0\) and \(\kappa \in (0,1)\). Intuitively, this inequality states that \(x\) cannot be too large, since \(x\) has to be below a smaller power of itself. But how large can \(x\) still be? Well, obviously, \[x ~\le~ v \qquad\text{where}\qquad v ~=~ \sup \left\{ x \middle| x \le \alpha + \beta x^\kappa \right\} .\] Now \(v\) satisfies \(\eqref{eq:first}\) with equality, i.e. \(v = \alpha + \beta v^\kappa\). But, unfortunately, that equality does not generally have an analytic solution. So we are interested in tight and tractable upper bounds on \(v\).

To set the stage, let’s start with a simple lower bound on \(v\). If we set \(\alpha = 0\), we can indeed solve for equality in \(\eqref{eq:first}\), yielding \[v ~\ge~ \beta^{\frac{1}{1-\kappa}} .\] We are hence (by monotonicity of \(v\) in \(\alpha\)) looking for an expression that is a little larger than this.

Since \(\kappa \in [0,1]\), the function \(x \mapsto x^\kappa\) is concave. The main idea will be to linearise it around some point \(y \ge 0\), i.e. \[x^\kappa ~\le~ y^\kappa + (x-y) \kappa y^{\kappa-1} ~=~ (1-\kappa) y^{\kappa} + \kappa x y^{\kappa-1}\] and hence \[v ~=~ \alpha + \beta v^\kappa ~\le~ \alpha + \beta \left( (1-\kappa) y^{\kappa} + \kappa v y^{\kappa-1} \right) .\] This inequality is trivial for \(y \le (\beta \kappa)^{\frac{1}{1-\kappa}}\). For larger \(y\) we may reorganise it to \[v ~\le~ \frac{ \alpha + \beta (1-\kappa) y^{\kappa} }{ 1 - \beta \kappa y^{\kappa-1} } .\] It then remains to pick \(y\). Optimising the above expression in \(y\) will not work, as it would solve \(\eqref{eq:first}\) exactly. But let us plug in the following reasonable choice (which would be optimal were \(\alpha=0\)) \[y ~=~ \arg\min_{y \ge 0} ~ \frac{ \beta (1-\kappa) y^{\kappa} }{ 1 - \beta \kappa y^{\kappa-1} } ~=~ \beta^{\frac{1}{1-\kappa}}\] and the final bound becomes \[v ~\le~ \frac{ \alpha }{ 1 - \kappa } + \beta^{\frac{1}{1-\kappa}} .\] That looks useful, especially for small \(\alpha\) (which happens to be the case that sparked my interest). Moreover, this bound is tight in the following way. If we write \(v(\alpha)\) for \(v\) regarded as a function of \(\alpha\), then \(v(0) = \beta^{\frac{1}{1-\kappa}}\) and \(v'(0) = \frac{1}{1-\kappa}\). So the expression above is the Taylor expansion of \(v(\alpha)\) around \(\alpha = 0\).