Multivariate Sub-Gaussian Deviation Inequality

Wouter M. Koolen



We derive a deviation inequality for multivariate sub-Gaussians using mixture supermartingales and Ville’s inequality.


Consider a random process \((X_t)_t\) taking values \(X_t \in \mathbb R^d\). Let us assume that \(X_t - \mu_t\) is \(\Sigma_t\) sub-Gaussian given the past. That is, at every \(t\) we have for every \(\eta \in \mathbb R^d\) that \begin{equation}\label{eq:subg} \operatorname*{\mathbb E}\left[ e^{ \left\langle\eta, X_t-\mu_t\right\rangle } \middle| \mathcal F_{t-1} \right] ~\le~ e^{\frac{1}{2} \left\|\eta\right\|^2_{\Sigma_t}} . \end{equation} Let us denote the total deviation vector from the mean by \(S_t = \sum_{s=1}^t (X_s-\mu_s)\) and the total covariance matrix by \(V_t = \sum_{s=1}^t \Sigma_s\). By \(\eqref{eq:subg}\), we can make a test super-martingale with value \[M_t^\eta ~=~ e^{ \left\langle\eta, S_t\right\rangle - \frac{1}{2} \left\|\eta\right\|^2_{V_t} } .\] Finally, by mixing over \(\eta\) (here we do the normal prior case) we can make a test super-martingale with value \[M_t ~=~ \operatorname*{\mathbb E}_{\eta \sim \mathcal N(0, \sigma^2 I)} \left[ M_t^\eta \right] ~=~ \int (2 \pi \sigma^2)^{-d/2} e^{ - \frac{1}{2 \sigma^2} \|\eta\|^2 + \left\langle\eta, S_t\right\rangle - \frac{1}{2} \left\|\eta\right\|^2_{V_t} } \,\textrm{d}\eta ,\] which results in \[\ln M_t ~=~ \frac{1}{2} \left\|S_t\right\|^2_{\left(V_t + \sigma^{-2} I\right)^{-1}} - \frac{1}{2} \ln \det\left( I + \sigma^2 V_t \right) .\] So all in all, using Ville’s inequality, we find \[\operatorname*{\mathbb P}\left\{ \exists t : \frac{1}{2} \left\|S_t\right\|^2_{\left(V_t + \sigma^{-2} I\right)^{-1}} > \ln \frac{1}{\delta} + \frac{1}{2} \ln \det\left( I + \sigma^2 V_t \right) \right\} \le \delta .\] This is a pretty nice result obtained in a clean way. We may go many ways from here. Trying to learn \(\sigma^2\). Or trying to choose \(\sigma^2\) of order \(1/V_t\) so that the log-determinant disappears. Going beyond sub-Gaussian. Or making things more interpretable. Next we will look at one way to simplify the result.

Special case of bounded deviations

In particular, if all \(X_t - \mu_t\) are bounded in a Euclidean norm ball of radius \(B\), then sub-Gaussianity holds with \(\Sigma_t = B^2 I\). So with probability at least \(1-\delta\), for all \(t\) \[\frac{1}{2 \left(t B^2 + \sigma^{-2}\right)} \left\|S_t\right\|^2 \le \ln \frac{1}{\delta} + \frac{d}{2} \ln \left( 1 + \sigma^2 B^2 t \right) .\] Setting \(\sigma = 1/B\) then gives \[\frac{1}{2 \left(t + 1\right)} \left\|S_t/B\right\|^2 \le \ln \frac{1}{\delta} + \frac{d}{2} \ln \left( t + 1 \right) .\]