We show that any two-player zero-sum matrix game has a saddle point at which both players randomise over the same number of pure strategies.

Introduction

We consider two-player zero-sum matrix games. Such a game has a single round in which the two players independently choose a strategy. If player one chooses \(i\) and player two chooses \(j\), then the loss of player one, which equals the gain of player two, is the entry \(A_{i,j}\) of the payoff matrix \({\boldsymbol{A}}\); this matrix completely specifies the game.
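As a concrete illustration, here is a minimal sketch assuming NumPy; the \(2 \times 3\) matrix is an arbitrary example, not one fixed by the text.

```python
import numpy as np

# Entry A[i, j] is the loss of player one (row i) and the gain of
# player two (column j); the matrix need not be square.
A = np.array([[ 1.0, -1.0, 0.0 ],
              [-1.0,  1.0, 0.5 ]])

# Pure strategies i = 0 and j = 1: player one "loses" A[0, 1] = -1,
# i.e. actually gains 1, and player two gains -1.
print(A[0, 1])
```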

Every matrix game has a saddle point in mixed strategies, where both players randomise. We investigate how many pure strategies each player uses in optimal play. This question is interesting, for example, when \({\boldsymbol{A}}\) is not square. It turns out that there is always a saddle point where both players randomise over the same number of strategies.

Setup

Fix a matrix \({\boldsymbol{A}}\), not necessarily square. We consider the problem
\[\min_{{\boldsymbol{p}}} \max_{{\boldsymbol{q}}} \, {\boldsymbol{p}}^{\intercal}{\boldsymbol{A}}{\boldsymbol{q}}\]
where \({\boldsymbol{p}}\) and \({\boldsymbol{q}}\) range over their respective probability simplices. We first rewrite it as
\[\min_{\substack{ {\boldsymbol{p}}, \lambda \\ \forall j: {\boldsymbol{p}}^{\intercal}{\boldsymbol{A}}{\boldsymbol{e}}_j \le \lambda }} \lambda.\]
By the KKT conditions, there are primal variables \(({\boldsymbol{p}}, \lambda)\) and dual variables \((\mu, {\boldsymbol{\nu}}, {\boldsymbol{q}})\), where \(q_j\) is the multiplier of the \(j\)th inequality, \(\nu_i\) that of \(p_i \ge 0\), and \(\mu\) that of the normalisation, such that
\begin{align*} p_i &~\ge~ 0 \\ \sum_i p_i &~=~ 1 \\ {\boldsymbol{p}}^{\intercal}{\boldsymbol{A}}{\boldsymbol{e}}_j &~\le~ \lambda \\ q_j &~\ge~ 0 \\ \nu_i &~\ge~ 0 \\ - \nu_i p_i &~=~ 0 \\ q_j ({\boldsymbol{p}}^{\intercal}{\boldsymbol{A}}{\boldsymbol{e}}_j - \lambda) &~=~ 0 \\ - {\boldsymbol{\nu}} + {\boldsymbol{A}}{\boldsymbol{q}} - \mu {\boldsymbol{1}} &~=~ 0 \\ 1 - \sum_j q_j &~=~ 0 \end{align*}
Eliminating \({\boldsymbol{\nu}} = {\boldsymbol{A}}{\boldsymbol{q}} - \mu {\boldsymbol{1}}\) and reorganising, we get
\begin{align*} p_i &~\ge~ 0 \\ \sum_i p_i &~=~ 1 \\ q_j &~\ge~ 0 \\ \sum_j q_j &~=~ 1 \\ {\boldsymbol{p}}^{\intercal}{\boldsymbol{A}}{\boldsymbol{e}}_j &~\le~ \lambda \\ {\boldsymbol{e}}_i^{\intercal}{\boldsymbol{A}}{\boldsymbol{q}} &~\ge~ \mu \\ p_i ({\boldsymbol{e}}_i^{\intercal}{\boldsymbol{A}}{\boldsymbol{q}} - \mu) &~=~ 0 \\ q_j ({\boldsymbol{p}}^{\intercal}{\boldsymbol{A}}{\boldsymbol{e}}_j - \lambda) &~=~ 0 \end{align*}
Summing the last two equations over \(i\) and \(j\) respectively, and using \(\sum_i p_i = \sum_j q_j = 1\), we find
\[\mu ~=~ \lambda ~=~ {\boldsymbol{p}}^{\intercal}{\boldsymbol{A}}{\boldsymbol{q}}.\]
Hence \({\boldsymbol{p}}\) and \({\boldsymbol{q}}\) form a saddle point iff
\begin{align*} p_i &~\ge~ 0 \\ \sum_i p_i &~=~ 1 \\ q_j &~\ge~ 0 \\ \sum_j q_j &~=~ 1 \\ {\boldsymbol{p}}^{\intercal}{\boldsymbol{A}}{\boldsymbol{e}}_j &~\le~ {\boldsymbol{p}}^{\intercal}{\boldsymbol{A}}{\boldsymbol{q}} \\ {\boldsymbol{e}}_i^{\intercal}{\boldsymbol{A}}{\boldsymbol{q}} &~\ge~ {\boldsymbol{p}}^{\intercal}{\boldsymbol{A}}{\boldsymbol{q}} \\ p_i ({\boldsymbol{e}}_i^{\intercal}{\boldsymbol{A}}{\boldsymbol{q}} - {\boldsymbol{p}}^{\intercal}{\boldsymbol{A}}{\boldsymbol{q}}) &~=~ 0 \\ q_j ({\boldsymbol{p}}^{\intercal}{\boldsymbol{A}}{\boldsymbol{e}}_j - {\boldsymbol{p}}^{\intercal}{\boldsymbol{A}}{\boldsymbol{q}}) &~=~ 0 \end{align*}
These conditions are easy to verify numerically, as sketched below.
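The following is a minimal sketch, assuming NumPy and SciPy's `linprog` (HiGHS backend); `solve_game` is a hypothetical helper name, and the matrix is the same arbitrary \(2 \times 3\) example as above. It solves each player's linear program and asserts the two inequality families from the last display.

```python
import numpy as np
from scipy.optimize import linprog

def solve_game(A):
    """Return a saddle point (p, q) and the value p^T A q of the game."""
    m, n = A.shape
    # Player one: min lambda  s.t.  p^T A e_j <= lambda for all j, p in the simplex.
    # Variables x = (p_1, ..., p_m, lambda), with lambda unbounded.
    res1 = linprog(
        c=np.r_[np.zeros(m), 1.0],
        A_ub=np.c_[A.T, -np.ones(n)], b_ub=np.zeros(n),
        A_eq=np.r_[np.ones(m), 0.0].reshape(1, -1), b_eq=[1.0],
        bounds=[(0, None)] * m + [(None, None)],
    )
    # Player two: max mu  s.t.  e_i^T A q >= mu for all i, q in the simplex,
    # written as minimising -mu.
    res2 = linprog(
        c=np.r_[np.zeros(n), -1.0],
        A_ub=np.c_[-A, np.ones(m)], b_ub=np.zeros(m),
        A_eq=np.r_[np.ones(n), 0.0].reshape(1, -1), b_eq=[1.0],
        bounds=[(0, None)] * n + [(None, None)],
    )
    p, q = res1.x[:m], res2.x[:n]
    return p, q, p @ A @ q

A = np.array([[ 1.0, -1.0, 0.0 ],
              [-1.0,  1.0, 0.5 ]])
p, q, value = solve_game(A)
# The saddle-point conditions from the last display, up to tolerance:
assert np.all(p @ A <= value + 1e-8)  # p^T A e_j <= p^T A q for all j
assert np.all(A @ q >= value - 1e-8)  # e_i^T A q >= p^T A q for all i
```

For this matrix the unique optimal pair is \({\boldsymbol{p}} = (0.6, 0.4)\) and \({\boldsymbol{q}} = (0.2, 0, 0.8)\) with value \(0.2\); both supports already have size two.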

Now what do these conditions say about the support sizes of \({\boldsymbol{p}}\) and \({\boldsymbol{q}}\)? Suppose the support of \({\boldsymbol{q}}\) is larger than that of \({\boldsymbol{p}}\). Trim \({\boldsymbol{A}}\) down to the rows in the support of \({\boldsymbol{p}}\) and the columns in the support of \({\boldsymbol{q}}\), so that both \({\boldsymbol{p}}\) and \({\boldsymbol{q}}\) are strictly positive; say the trimmed matrix has \(d\) rows and \(n > d\) columns. The complementary slackness conditions then force, for all \(i, j\),
\begin{align*} {\boldsymbol{e}}_i^{\intercal}{\boldsymbol{A}}{\boldsymbol{q}} &~=~ {\boldsymbol{p}}^{\intercal}{\boldsymbol{A}}{\boldsymbol{q}} \\ {\boldsymbol{p}}^{\intercal}{\boldsymbol{A}}{\boldsymbol{e}}_j &~=~ {\boldsymbol{p}}^{\intercal}{\boldsymbol{A}}{\boldsymbol{q}} \end{align*}
If we can find a \({\boldsymbol{v}}\neq 0\) with \(\sum_j v_j = 0\) and \({\boldsymbol{e}}_i^{\intercal}{\boldsymbol{A}}{\boldsymbol{v}}= 0\) for all \(i\), then \({\boldsymbol{q}}+ \epsilon {\boldsymbol{v}}\) satisfies the same conditions for small \(\epsilon\), and we can grow \(\epsilon\) until a coordinate of \({\boldsymbol{q}}+ \epsilon {\boldsymbol{v}}\) hits zero, shrinking the support of \({\boldsymbol{q}}\). Such a \({\boldsymbol{v}}\) always exists. The requirements amount to the \(d\) equations \({\boldsymbol{A}}{\boldsymbol{v}}= 0\) plus the normalisation \(\sum_j v_j = 0\), in \(n \ge d + 1\) unknowns, and the normalisation is redundant: adding a constant to every entry of \({\boldsymbol{A}}\) changes neither the saddle points nor the supports, so we may assume the value \(\lambda = {\boldsymbol{p}}^{\intercal}{\boldsymbol{A}}{\boldsymbol{q}}\) is non-zero; since \({\boldsymbol{p}}^{\intercal}{\boldsymbol{A}}= \lambda {\boldsymbol{1}}^{\intercal}\) on the trimmed columns, \({\boldsymbol{A}}{\boldsymbol{v}}= 0\) already implies \(\lambda \sum_j v_j = 0\). The system therefore has rank at most \(d < n\), whether or not \({\boldsymbol{A}}\) has full row rank, and a non-zero solution exists. A sketch of this reduction step follows.
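Here is a sketch of the reduction step, assuming NumPy and SciPy; `shrink_support` is a hypothetical name. It works inside the trimmed game, so rows outside the support of \({\boldsymbol{p}}\) are not re-checked, matching the trimming above. The toy game at the end is degenerate on purpose, so that an oversized support actually occurs.

```python
import numpy as np
from scipy.linalg import null_space

def shrink_support(A, p, q, tol=1e-9):
    """One reduction step: return q with a smaller support, or q itself
    if no direction v exists (the supports are already balanced)."""
    rows = np.flatnonzero(p > tol)               # support of p
    cols = np.flatnonzero(q > tol)               # support of q
    # v must satisfy sum_j v_j = 0 and e_i^T A v = 0 for i in supp(p),
    # with v supported on supp(q).
    M = np.vstack([np.ones(len(cols)), A[np.ix_(rows, cols)]])
    kernel = null_space(M)
    if kernel.shape[1] == 0:
        return q
    v = np.zeros_like(q)
    v[cols] = kernel[:, 0]
    # Largest eps keeping q + eps * v >= 0; some coordinate hits zero there.
    # v sums to zero and is non-zero, so it has a negative coordinate.
    neg = v < -tol
    eps = np.min(q[neg] / -v[neg])
    return q + eps * v

# Degenerate toy game: every q is optimal, so supp(q) can start out too big.
A = np.zeros((1, 3))
p, q = np.array([1.0]), np.full(3, 1 / 3)
while True:
    q_new = shrink_support(A, p, q)
    if q_new is q:
        break
    q = q_new
print(q)  # a single non-zero coordinate remains: both supports have size 1
```

Because \({\boldsymbol{v}}\) vanishes off the support of \({\boldsymbol{q}}\) and sums to zero on it, each step preserves the simplex constraint and the row equalities while zeroing at least one coordinate of \({\boldsymbol{q}}\).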

It follows, by iterating this reduction (swapping the roles of the two players when needed), that there is a saddle point with identical support sizes.
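As a quick empirical check, one can reuse `solve_game` from the sketch above on a random non-square game; for generic payoff matrices the LP solutions are vertices whose supports typically already match, while degenerate games need the reduction step.

```python
rng = np.random.default_rng(0)     # arbitrary seed for reproducibility
B = rng.standard_normal((3, 5))    # a random non-square game
p, q, value = solve_game(B)
print(np.sum(p > 1e-8), np.sum(q > 1e-8))  # support sizes of the two players
```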