2015-09-29
We compare two minimax strategies on the ball.
We consider minimax games on the ball. Each round, the learner plays a point \(\boldsymbol{a}\) with \(\left\|\boldsymbol{a}\right\| \le 1\), and the adversary responds with a point \(\boldsymbol{x}\) with \(\left\|\boldsymbol{x}\right\| \le 1\). We study two loss functions:
Linear loss. The learner incurs \(\ell(\boldsymbol{a},\boldsymbol{x}) = \boldsymbol{a}^\intercal\boldsymbol{x}\).
Quadratic loss. The learner incurs \(\ell(\boldsymbol{a}, \boldsymbol{x}) = \left\|\boldsymbol{a}-\boldsymbol{x}\right\|^2\).
Consider a \(T\)-round game where the players played \(\boldsymbol{a}_1, \boldsymbol{x}_1, \ldots, \boldsymbol{a}_T, \boldsymbol{x}_T\). Then the cumulative loss of the learner is \[\sum_{t=1}^T \ell(\boldsymbol{a}_t, \boldsymbol{x}_t),\] and the cumulative loss of the best fixed action in hindsight is \[\min_{\boldsymbol{a}} \sum_{t=1}^T \ell(\boldsymbol{a}, \boldsymbol{x}_t) .\]
The best action in hindsight is different for the two loss functions. In both cases, let us abbreviate \(\boldsymbol{s}_T = \sum_{t=1}^T \boldsymbol{x}_t\).
For linear loss the best action is \(\frac{-\boldsymbol{s}_T}{\left\|\boldsymbol{s}_T\right\|}\), that is, the point on the unit sphere least aligned with direction \(\boldsymbol{s}_T\).
For quadratic loss the best action is \(\frac{\boldsymbol{s}_T}{T}\), that is, the data mean.
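As a sanity check on both closed-form actions, here is a small numerical sketch. The data, game length and dimension are hypothetical illustration choices, not from the text: we draw random plays in the ball and verify that no random alternative action beats the claimed optima.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 50, 3  # illustrative game length and dimension

# Hypothetical adversary plays: random points projected into the unit ball.
xs = rng.normal(size=(T, d))
xs /= np.maximum(1.0, np.linalg.norm(xs, axis=1, keepdims=True))
s_T = xs.sum(axis=0)

def linear_loss(a):
    return float((xs @ a).sum())          # sum_t a^T x_t

def quadratic_loss(a):
    return float(((a - xs) ** 2).sum())   # sum_t ||a - x_t||^2

a_lin = -s_T / np.linalg.norm(s_T)  # claimed best for linear loss
a_quad = s_T / T                    # claimed best for quadratic loss: the mean

# No random point in the ball does better than the claimed optima.
for _ in range(1000):
    b = rng.normal(size=d)
    b /= max(1.0, np.linalg.norm(b))
    assert linear_loss(a_lin) <= linear_loss(b) + 1e-9
    assert quadratic_loss(a_quad) <= quadratic_loss(b) + 1e-9
```

Note that for quadratic loss the unconstrained minimizer is the mean, and a mean of points in the ball stays in the ball, so the constraint is inactive.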
We call regret the overhead of the learner compared to the best fixed action \[\text{Regret}_T ~=~ \sum_{t=1}^T \ell(\boldsymbol{a}_t, \boldsymbol{x}_t) - \min_{\boldsymbol{a}} \sum_{t=1}^T \ell(\boldsymbol{a}, \boldsymbol{x}_t) .\] The minimax regret of a \(T\)-round game is given by \[\min_{\boldsymbol{a}_1} \max_{\boldsymbol{x}_1} \cdots \min_{\boldsymbol{a}_T} \max_{\boldsymbol{x}_T} ~ \text{Regret}_T .\] The minimax algorithm for either game is known. In either case we abbreviate \(\boldsymbol{s}_t = \sum_{q=1}^t \boldsymbol{x}_q\).
For linear loss, the minimax strategy was found by Abernethy et al. (2008). It predicts with \[\boldsymbol{a}_t ~=~ \frac{- \boldsymbol{s}_{t-1}}{\sqrt{\left\|\boldsymbol{s}_{t-1}\right\|^2 + (T-t+1)}} .\]
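A minimal sketch of this play (the simulation loop, random adversary, and dimension are my own illustrative choices). The denominator exceeds \(\left\|\boldsymbol{s}_{t-1}\right\|\), so the play always lies strictly inside the ball.

```python
import numpy as np

def minimax_linear_play(s_prev, t, T):
    """Minimax play for linear loss (Abernethy et al. 2008):
    a_t = -s_{t-1} / sqrt(||s_{t-1}||^2 + (T - t + 1))."""
    return -s_prev / np.sqrt(s_prev @ s_prev + (T - t + 1))

# Run one game against a hypothetical random adversary.
rng = np.random.default_rng(1)
T, d = 20, 2
s = np.zeros(d)
for t in range(1, T + 1):
    a = minimax_linear_play(s, t, T)
    assert np.linalg.norm(a) < 1.0    # the play is always legal
    x = rng.normal(size=d)
    x /= max(1.0, np.linalg.norm(x))  # adversary stays in the ball
    s += x
```

At \(t = 1\) we have \(\boldsymbol{s}_0 = \boldsymbol{0}\), so the strategy opens at the center of the ball.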
For quadratic loss, the minimax strategy was found by Takimoto and Warmuth (2000). It predicts with \[\boldsymbol{a}_t ~=~ \alpha_t \boldsymbol{s}_{t-1},\] where the shrinkage \(\alpha_t\) is defined by the backward recursion \(\alpha_T = 1/T\) and \(\alpha_t = \alpha_{t+1}^2 + \alpha_{t+1}\). (Very roughly, \(\alpha_t \approx 1/t\).)
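The backward recursion is easy to tabulate; the helper name below is mine. One can also make the rough \(\alpha_t \approx 1/t\) claim one-sided: since \(\alpha_t = \alpha_{t+1}(1 + \alpha_{t+1})\), we have \(1/\alpha_t = 1/\alpha_{t+1} - 1/(1+\alpha_{t+1}) \ge 1/\alpha_{t+1} - 1\), and induction down from \(1/\alpha_T = T\) gives \(\alpha_t \le 1/t\).

```python
def shrinkage_schedule(T):
    """Tabulate the Takimoto-Warmuth shrinkages alpha_1..alpha_T via the
    backward recursion alpha_T = 1/T, alpha_t = alpha_{t+1}^2 + alpha_{t+1}."""
    alpha = [0.0] * (T + 1)   # 1-indexed: alpha[t] for t = 1..T
    alpha[T] = 1.0 / T
    for t in range(T - 1, 0, -1):
        alpha[t] = alpha[t + 1] ** 2 + alpha[t + 1]
    return alpha

alpha = shrinkage_schedule(100)
assert all(alpha[t] <= 1.0 / t for t in range(1, 101))  # alpha_t <= 1/t
```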
In both cases, we see that the minimax strategy plays a dampened version of the best offline action thus far.
We might also consider the loss \(\ell(\boldsymbol{a},\boldsymbol{x}) = - (\boldsymbol{a}^\intercal\boldsymbol{x})^2\), still with \(\boldsymbol{a}\) and \(\boldsymbol{x}\) from the unit ball, which relates to PCA. Is the minimax algorithm tractable? Now the best offline action is the eigenvector with the largest eigenvalue of the data second moment matrix, since \[\max_{\boldsymbol{a}} \sum_t (\boldsymbol{a}^\intercal\boldsymbol{x}_t)^2 ~=~ \max_{\boldsymbol{a}} \boldsymbol{a}^\intercal\left(\sum_t \boldsymbol{x}_t \boldsymbol{x}_t^\intercal\right) \boldsymbol{a} ~=~ \lambda_\text{max} \left(\sum_t \boldsymbol{x}_t \boldsymbol{x}_t^\intercal\right) .\] Despite the square, this problem has a linear rather than quadratic feel to it, and that is correct: it is the linear problem in the space of dyads (outer products \(\boldsymbol{a}\boldsymbol{a}^\intercal\) of unit vectors \(\boldsymbol{a}\)), where \((\boldsymbol{a}^\intercal\boldsymbol{x})^2 = \mathop{\mathrm{tr}}\left(\boldsymbol{a}\boldsymbol{a}^\intercal\boldsymbol{x}\boldsymbol{x}^\intercal\right)\) is the natural inner product.
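Both claims are easy to check numerically. The sketch below uses hypothetical random data (my own illustration): the top eigenvector of the second moment matrix attains \(\lambda_\text{max}\), and the dyad inner product reproduces \((\boldsymbol{a}^\intercal\boldsymbol{x})^2\).

```python
import numpy as np

rng = np.random.default_rng(2)
T, d = 30, 4
xs = rng.normal(size=(T, d))
xs /= np.maximum(1.0, np.linalg.norm(xs, axis=1, keepdims=True))

M = xs.T @ xs                     # second moment matrix sum_t x_t x_t^T
evals, evecs = np.linalg.eigh(M)  # eigenvalues in ascending order
lam_max, a_star = evals[-1], evecs[:, -1]

# The top eigenvector attains the maximum of sum_t (a^T x_t)^2.
assert np.isclose(((xs @ a_star) ** 2).sum(), lam_max)

# Dyad inner product identity: (a^T x)^2 = tr(a a^T x x^T).
a, x = a_star, xs[0]
assert np.isclose((a @ x) ** 2,
                  np.trace(np.outer(a, a) @ np.outer(x, x)))
```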