A converse to Jensen’s inequality for sums of powers

Wouter M. Koolen

2023-09-28

Abstract

We find a suitable converse to a certain instance of Jensen’s Inequality involving sums of powers.

Introduction

Fix any \(x \in \mathbb R^d\). By Jensen’s inequality for the square, \[\left(\sum_{i=1}^d x_i^6\right) \left(\sum_{i=1}^d x_i^2\right) ~\ge~ \left(\sum_{i=1}^d x_i^4\right)^2 .\] In this post we are looking for a converse. That is, we want to find a function \(f(d)\) such that for all \(x \in \mathbb R^d\) we have \[\left(\sum_{i=1}^d x_i^6\right) \left(\sum_{i=1}^d x_i^2\right) ~\le~ f(d) \left(\sum_{i=1}^d x_i^4\right)^2 .\] What type of inequality is this? We are asked to prove \[\|x^3\| \cdot \|x\| ~\le~ \sqrt{f(d)} \|x^2\|^2 ,\] where all powers of \(x\) are understood component-wise, and the norm is Euclidean. Equivalently, we may read this as a statement about the relation between \(p\)-norms for \(p \in \{2,4,6\}\), namely \[\|x\|_6^6 \cdot \|x\|_2^2 ~\le~ f(d) \|x\|_4^8 .\] We can further observe that \(f(d) \ge 1\) is necessary just by plugging in any constant \(x_i=c\) for all \(i\). It must be that \(f\) is increasing as we can pad \(x\) with zeroes to increase its dimension without changing the requirement. Finally, the question is invariant to scaling: scaling \(x\) by a constant \(\alpha\) scales both sides by \(\alpha^8\).

First step with max \(\to\) sum \(\to\) max

To see that the problem is not hopeless (i.e. forcing \(f(d)=\infty\)), we can use \begin{equation}\label{eq:max.and.sum} \max_i x_i^2 ~\le~ \sum_i x_i^2 ~\le~ d \max_i x_i^2. \end{equation} We find \[\left(\sum_i x_i^6\right) \left(\sum_i x_i^2\right) ~\le~ \left(d \max_i x_i^6\right) \left(d \max_i x_i^2\right) ~=~ d^2 \left(\max_i x_i^4\right)^2 ~\le~ d^2 \left(\sum_i x_i^4\right)^2\] showing that \(f(d) = d^2\) works. But is this tight? We should have some worry, as the two inequalities in \(\eqref{eq:max.and.sum}\) cannot both hold with equality for the same \(x\).

Hölder

We may apply Hölder with \(\|\cdot\|_1\) and \(\|\cdot\|_\infty\) to find \[\left(\sum_i x_i^6\right) \left(\sum_i x_i^2\right) ~\le~ \left(\sum_i x_i^4\right) \left( \max_i x_i^2 \right) d \max_i x_i^2 ~=~ d \left(\sum_i x_i^4\right) \left( \max_i x_i^4 \right) ~\le~ d \left(\sum_i x_i^4\right)^2\] Already a factor \(d\) won! But are we there now? Are there \(x\) for which this factor \(d\) actually manifests? What do hard \(x\) for our inequality look like?

Optimum

Let’s abbreviate \(y_i = x_i^2\) and consider the function defined on \(\mathbb R_+^d\) by \begin{equation}\label{eq:g} g(y) ~=~ \frac{\left(\sum_i y_i^3\right) \left(\sum_i y_i\right)}{\left(\sum_i y_i^2\right)^2} . \end{equation} With this notation, the optimal answer is \(f(d) = \max_{y \in \mathbb R_+^d} g(y)\).

An optimal point \(y\) for \(g\) needs to cancel the derivative, i.e. we must have for each \(i\) \[0 ~=~ 3 y_i^2 M_1 M_2 -4 y_i M_3 M_1 + M_3 M_2\] where \(M_r = \sum_i y_i^r\). Hence the solution must satisfy \[y_i ~=~ \frac{2 M_3}{3 M_2} \pm \sqrt{\left(\frac{2 M_3 }{3 M_2}\right)^2 - \frac{M_3}{3 M_1}}\] Cool, so there are only two levels in an optimal \(y\). Let’s call \(d_+\) and \(d_-\) the numbers of each type. To further find an optimal \(y\), we must then have \begin{align*} M_r ~=~ d_+ \left( \frac{2 M_3}{3 M_2} + \sqrt{\left(\frac{2 M_3 }{3 M_2}\right)^2 - \frac{M_3}{3 M_1}} \right)^r + d_- \left( \frac{2 M_3}{3 M_2} - \sqrt{\left(\frac{2 M_3 }{3 M_2}\right)^2 - \frac{M_3}{3 M_1}} \right)^r \end{align*} Let’s call these two levels \(x_i^2 \in \{1,y\}\). How many of each do we need? Let’s optimise over the fraction \(p = \frac{d_+}{d} \in [0,1]\). We have \[\max_p \frac{\left(p + (1-p) y^3\right) \left(p + (1-p) y\right)}{\left(p + (1-p) y^2\right)^2} ~=~ \frac{(1+y)^2}{4 y} \qquad \text{at} \qquad p ~=~ \frac{y^2}{1+y^2}\] Now an optimal \(p,y\) pair will hit the boundaries of \(p \in [1/d,1-1/d]\). For \(y \in [1, \sqrt{d-1}]\) the optimal \(p \le 1-1/d\), and the best value obtainable for such \(y\) is \[\frac{1}{4} \left( \sqrt{d-1} + \frac{1}{\sqrt{d-1}} + 2 \right).\] If we go higher with \(y\), we need to clamp \(p\) to \(p=1-1/d\). And indeed, we need to solve for \[\max_{y \ge 0}~ \frac{\left(d-1 + y^3\right) \left(d-1 + y\right)}{\left(d-1 + y^2\right)^2}\] We can bound this using \(\max\{a,b\} \le a+b \le 2 \max \{a,b\}\) by \[\max_{y \ge 0} \frac{ 4 \max \{d-1,y^3\} \max \{d-1,y\} }{ \max\{d-1,y^2\}^2 } ~=~ 4 \sqrt{d-1}\] which is maximised precisely at \(y = \sqrt{d-1}\) where it takes the value on the right. If we instead use the max vs sum relation to obtain a lower bound, we find that it is at least \(\frac{1}{4} \sqrt{d-1}\). So we sandwiched \(f(d)\) up to a small constant factor. Nice.