2015-02-01

# Introduction

We consider two-player zero-sum matrix games. Such games have a single round in which the two players independently choose a strategy. If player one chooses $$i$$ and player two chooses $$j$$ then the loss of player one and the gain of player two are given by the entry $$A_{i,j}$$ of the so-called payoff matrix $$\boldsymbol{A}$$ that completely specifies the game.

A matrix game has a saddle point in mixed strategies, where both players randomise. We investigate how many strategies are used in optimal play for each player. This is interesting e.g. if $$\boldsymbol{A}$$ is not square. It turns out that there always is a saddle point where both players randomise over the same number of strategies.

# Setup

Fix a matrix $$\boldsymbol{A}$$, not necessarily square. We consider the problem $\min_{\boldsymbol{p}} \max_{\boldsymbol{q}} \, \boldsymbol{p}^\intercal\boldsymbol{A}\boldsymbol{q}$ where $$\boldsymbol{p}$$ and $$\boldsymbol{q}$$ range over the simplex. We first rewrite it as $\min_{\substack{ \boldsymbol{p}, \lambda \\ \forall j: \boldsymbol{p}^\intercal\boldsymbol{A}\boldsymbol{e}_j \le \lambda }} \lambda$ By the KKT conditions, there are primal variables $$(\boldsymbol{p}, \lambda)$$ and dual variables $$(\mu, \boldsymbol{\nu}, \boldsymbol{q})$$ such that \begin{align*} p_i &~\ge~ 0 \\ \sum_i p_i &~=~ 1 \\ \boldsymbol{p}^\intercal\boldsymbol{A}\boldsymbol{e}_j &~\le~ \lambda \\ q_j &~\ge~ 0 \\ \nu_i &~\ge~ 0 \\ - \nu_i p_i &~=~ 0 \\ q_j (\boldsymbol{p}^\intercal\boldsymbol{A}\boldsymbol{e}_j - \lambda) &~=~ 0 \\ - \boldsymbol{\nu}+ \boldsymbol{A}\boldsymbol{q}- \mu \boldsymbol{1}&~=~ 0 \\ 1 - \sum_j q_j &~=~ 0 \end{align*} Eliminating $$\boldsymbol{\nu}= \boldsymbol{A}\boldsymbol{q}- \mu \boldsymbol{1}$$ and reorganising, we get \begin{align*} p_i &~\ge~ 0 \\ \sum_i p_i &~=~ 1 \\ q_j &~\ge~ 0 \\ \sum_j q_j &~=~ 1 \\ \boldsymbol{p}^\intercal\boldsymbol{A}\boldsymbol{e}_j &~\le~ \lambda \\ \boldsymbol{e}_i^\intercal\boldsymbol{A}\boldsymbol{q}&~\ge~ \mu \\ p_i (\boldsymbol{e}_i^\intercal\boldsymbol{A}\boldsymbol{q}- \mu) &~=~ 0 \\ q_j (\boldsymbol{p}^\intercal\boldsymbol{A}\boldsymbol{e}_j - \lambda) &~=~ 0 \end{align*} Summing the last two equations over $$i$$ and $$j$$ we find $\mu ~=~ \lambda ~=~ \boldsymbol{p}^\intercal\boldsymbol{A}\boldsymbol{q}$ So that $$\boldsymbol{p}$$ and $$\boldsymbol{q}$$ form a saddle point iff \begin{align*} p_i &~\ge~ 0 \\ \sum_i p_i &~=~ 1 \\ q_j &~\ge~ 0 \\ \sum_j q_j &~=~ 1 \\ \boldsymbol{p}^\intercal\boldsymbol{A}\boldsymbol{e}_j &~\le~ \boldsymbol{p}^\intercal\boldsymbol{A}\boldsymbol{q} \\ \boldsymbol{e}_i^\intercal\boldsymbol{A}\boldsymbol{q}&~\ge~ \boldsymbol{p}^\intercal\boldsymbol{A}\boldsymbol{q} \\ p_i (\boldsymbol{e}_i^\intercal\boldsymbol{A}\boldsymbol{q}- \boldsymbol{p}^\intercal\boldsymbol{A}\boldsymbol{q}) &~=~ 0 \\ q_j (\boldsymbol{p}^\intercal\boldsymbol{A}\boldsymbol{e}_j - \boldsymbol{p}^\intercal\boldsymbol{A}\boldsymbol{q}) &~=~ 0 \end{align*} Now what does this say about the support size of $$\boldsymbol{p}$$ and $$\boldsymbol{q}$$?

Say the support of $$\boldsymbol{q}$$ is larger than that of $$\boldsymbol{p}$$, and say that $$\boldsymbol{p}$$ and $$\boldsymbol{q}$$ are both strictly positive (otherwise trim $$\boldsymbol{A}$$). So we must have for all $$i,j$$ that \begin{align*} \boldsymbol{e}_i^\intercal\boldsymbol{A}\boldsymbol{q}&~=~ \boldsymbol{p}^\intercal\boldsymbol{A}\boldsymbol{q} \\ \boldsymbol{p}^\intercal\boldsymbol{A}\boldsymbol{e}_j &~=~ \boldsymbol{p}^\intercal\boldsymbol{A}\boldsymbol{q} \end{align*} If we can find a $$\boldsymbol{v}\neq 0$$ such that $$\sum_i v_i = 0$$ and $$\boldsymbol{e}_i^\intercal\boldsymbol{A}\boldsymbol{v}= 0$$ for all $$i$$ then $$\boldsymbol{q}+ \epsilon \boldsymbol{v}$$ also satisfies the conditions, and we can move through solution space until we hit a new zero. Now when $$\boldsymbol{A}$$ has full row rank there are $$d$$ equations. So we have a non-zero solution for $$d+1$$ variables. If $$\boldsymbol{A}$$ is not full row rank then there are at most $$d$$ equations, and again from $$d+1$$ variables we have redundancy.

It follows that there is a saddle point with identical support sizes.