2015-05-13

Abstract

We investigate what happens to simple matrix games and their solutions if we either randomly or strategically choose between two games.

In this note we consider two-player zero-sum matrix games. Such games
are specified by a payoff matrix. We assume that we have two such games
\(\boldsymbol{A}\) and \(\boldsymbol{B}\). As our running examples
we will use \begin{align*}
\boldsymbol{A}
&~=~
\begin{bmatrix}
8 & 0
\\
4 & 6
\end{bmatrix}
&
& \text{and}
&
\boldsymbol{B}
&~=~
\begin{bmatrix}
10 & 6
\\
2 & 8
\end{bmatrix}
.
\end{align*} To name the rows and columns we may also write
\(\boldsymbol{A}\) as follows: \[\begin{array}{l|cc}
\boldsymbol{A}& L & R
\\
\hline
T & 8 & 0
\\
B & 4 & 6
\end{array}
.\] The interpretation of such a game is as follows. The row
player chooses either strategy T(op) or B(ottom), and, independently,
the column player chooses either strategy L(eft) or R(ight). The payoff
to the row player is then as specified in the entry of the matrix
determined by their joint choice. The row player aims to maximize his
payoff, while the column player aims to minimize it. Von Neumann’s
minimax theorem states that for any such game there are randomized
strategies \(\boldsymbol{p}_*\)
(probability distribution on rows) and \(\boldsymbol{q}_*\) (probability
distribution on columns) such that for any strategies \(\boldsymbol{p}\) and \(\boldsymbol{q}\) \[\boldsymbol{p}^\intercal\boldsymbol{A}\boldsymbol{q}_*
~\le~
\boldsymbol{p}_*^\intercal\boldsymbol{A}\boldsymbol{q}_*
~\le~
\boldsymbol{p}_*^\intercal\boldsymbol{A}\boldsymbol{q}
.\] That is, by playing \(\boldsymbol{p}_*\) the row player can force
expected payoff at least \(\boldsymbol{p}_*^\intercal\boldsymbol{A}\boldsymbol{q}_*\),
whereas by playing \(\boldsymbol{q}_*\)
the column player can force expected payoff at most \(\boldsymbol{p}_*^\intercal\boldsymbol{A}\boldsymbol{q}_*\).
Such optimal strategies \(\boldsymbol{p}_*\) and \(\boldsymbol{q}_*\) are called a
*saddle-point pair*, and the expected payoff \(\boldsymbol{p}_*^\intercal\boldsymbol{A}\boldsymbol{q}_*\)
resulting under such optimal play is called the *value of the
game*. We will denote the value by \(V(\boldsymbol{A})\). For the examples
above, we have values and saddle point pairs \begin{align*}
V(\boldsymbol{A}) &~=~ 4 + \frac{4}{5},
&
\boldsymbol{p}_* &~=~ \left(\frac{1}{5}, \frac{4}{5}\right),
&
\boldsymbol{q}_* &~=~ \left(\frac{3}{5}, \frac{2}{5}\right), \quad
\text{and} \\
%\intertext{and}
V(\boldsymbol{B}) &~=~ 6 + \frac{4}{5},
&
\boldsymbol{p}_* &~=~ \left(\frac{3}{5}, \frac{2}{5}\right),
&
\boldsymbol{q}_* &~=~ \left(\frac{1}{5}, \frac{4}{5}\right).
\end{align*}

The minimax theorem tells us what will happen in a single matrix game \(\boldsymbol{A}\). We are interested in what happens when we play either of two games \(\boldsymbol{A}\) and \(\boldsymbol{B}\). We first look at what happens if we choose between games \(\boldsymbol{A}\) and \(\boldsymbol{B}\) randomly. We then look at having one of the players control the choice of game. In either case, we study the effect of whether the players know which game they are playing or not.

To construct payoff matrices the following compact notation comes in
handy. The *Kronecker product* of an \(r \times c\) matrix \(\boldsymbol{A}\) and a \(p \times q\) matrix \(\boldsymbol{B}\) is the \(r p \times c q\) matrix given by \[\boldsymbol{A}\otimes \boldsymbol{B}
~=~
\begin{pmatrix}
a_{11} \boldsymbol{B}& \ldots & a_{1c} \boldsymbol{B}\\
\vdots & \ddots & \vdots \\
a_{r1} \boldsymbol{B}& \ldots & a_{rc} \boldsymbol{B}
\end{pmatrix}
.\]

In this section we have a look at what happens if either \(\boldsymbol{A}\) or \(\boldsymbol{B}\) is played, as determined by a coin flip. Throughout, we will use a coin with known bias \(p\) on \(\boldsymbol{A}\) and \(q := 1-p\) on \(\boldsymbol{B}\). In our running computational example we will use \(p = q = \frac{1}{2}\). What game results depends on whether the players see the outcome of the coin flip before they choose their strategy or not. There are hence four cases, of which two are symmetric. So we study all three case.

A coin is flipped that both players observe. Then the corresponding game is played. This game corresponds to the following \(4 \times 4\) matrix game: \[\begin{array}{l|cccc} \boldsymbol{C}_\text{both} & L,L & L,R & R,L & R,R \\ \hline T,T & p A_{TL} + q B_{TL} & p A_{TL} + q B_{TR} & p A_{TR} + q B_{TL} & p A_{TR} + q B_{TR} \\ T,B & p A_{TL} + q B_{BL} & p A_{TL} + q B_{BR} & p A_{TR} + q B_{BL} & p A_{TR} + q B_{BR} \\ B,T & p A_{BL} + q B_{TL} & p A_{BL} + q B_{TR} & p A_{BR} + q B_{TL} & p A_{BR} + q B_{TR} \\ B,B & p A_{BL} + q B_{BL} & p A_{BL} + q B_{BR} & p A_{BR} + q B_{BL} & p A_{BR} + q B_{BR} \end{array}\] where the notation \(T,B\) designates the strategy that plays \(T\) in game \(\boldsymbol{A}\) and \(B\) in game \(\boldsymbol{B}\).

We may succinctly write this matrix as, \[\boldsymbol{C}_\text{both} ~=~ p \boldsymbol{A}\otimes \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} + q \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \otimes \boldsymbol{B} ~=~ \begin{bmatrix} 9&7&5&3 \\ 5&8&1&4 \\ 7&5&8&6 \\ 3&6&4&7 \end{bmatrix} .\] In a way this setting is somewhat trivial. As both players observe the coin flip, they can play the optimal strategy for the chosen game. And indeed, this setting has value \[V(\boldsymbol{C}_\text{both}) ~=~ p V(\boldsymbol{A}) + q V(\boldsymbol{B}) ~=~ 5 + \frac{4}{5}\] and saddle point pair \begin{align*} \boldsymbol{p}_* &~=~ \boldsymbol{p}_*(\boldsymbol{A}) \otimes \boldsymbol{p}_*(\boldsymbol{B}) %&& ~=~ \left( \frac{3}{25} , \frac{2}{25} , \frac{12}{25} , \frac{8}{25} \right) \\ \boldsymbol{q}_* &~=~ \boldsymbol{q}_*(\boldsymbol{A}) \otimes \boldsymbol{q}_*(\boldsymbol{B}) %&& ~=~ \left( \frac{3}{25} , \frac{12}{25} , \frac{2}{25} , \frac{8}{25} \right) . \end{align*}

Now we consider a coin flip that neither player observes. Then the corresponding game is played. This game is \[\boldsymbol{C}_\text{neither} ~=~ p \boldsymbol{A}+ q \boldsymbol{B} ~=~ \begin{bmatrix} 9&3 \\ 3&7 \end{bmatrix} ,\] that is, the average payoff matrix. The solution of \(\boldsymbol{C}_\text{neither}\) is \begin{align*} V(\boldsymbol{C}_\text{neither}) &~=~ 5 + \frac{2}{5}, & \boldsymbol{p}_* &~=~ \left(\frac{2}{5}, \frac{3}{5}\right), & \boldsymbol{q}_* &~=~ \left(\frac{2}{5}, \frac{3}{5}\right). \end{align*}

A coin is flipped that only the column player observes. Then the corresponding game is played.

Let us write down the payoff matrix. The strategies for row are top (T) and bottom (B) as before. The strategies for column are now conditioned on the coin flip. So they are left(L) or right(R) given either outcome. \[\begin{array}{l|cccc} \boldsymbol{C}_\text{col} & L,L & L,R & R,L & R,R \\ \hline T & p A_{TL} + q B_{TL} & p A_{TL} + q B_{TR} & p A_{TR} + q B_{TL} & p A_{TR} + q B_{TR} \\ B & p A_{BL} + q B_{BL} & p A_{BL} + q B_{BR} & p A_{BR} + q B_{BL} & p A_{BR} + q B_{BR} \end{array}\] That is, we have \[\boldsymbol{C}_\text{col} ~=~ p \boldsymbol{A}\otimes \begin{pmatrix} 1 & 1 \end{pmatrix} + q \begin{pmatrix} 1 & 1 \end{pmatrix} \otimes \boldsymbol{B} ~=~ \begin{bmatrix} 9&7&5&3\\ 3&6&4&7 \end{bmatrix} .\] The solution of \(\boldsymbol{C}_\text{col}\) is \begin{align*} V(\boldsymbol{C}_\text{col}) &~=~ 4 + \frac{3}{5}, & \boldsymbol{p}_* &~=~ \left(\frac{3}{5}, \frac{2}{5}\right), & \boldsymbol{q}_* &~=~ \left(0, 0, \frac{4}{5}, \frac{1}{5}\right). \end{align*}

In this section we look at what happens if one of the players determines which game is played. Let us give the choice to the column player. So now the column player has another component to his strategy, namely the choice of \(\boldsymbol{A}\) vs \(\boldsymbol{B}\).

The column player discloses his choice of game, which is then played. The game matrix becomes \[\begin{array}{l|cccc} \boldsymbol{D}_\text{seen} & \boldsymbol{A},L & \boldsymbol{A},R & \boldsymbol{B},L & \boldsymbol{B},R \\ \hline T,T & A_{TL} & A_{TR} & B_{TL} & B_{TR} \\ T,B & A_{TL} & A_{TR} & B_{BL} & B_{BR} \\ B,T & A_{BL} & A_{BR} & B_{TL} & B_{TR} \\ B,B & A_{BL} & A_{BR} & B_{BL} & B_{BR} \end{array}\] that is \[\boldsymbol{D}_\text{seen} ~=~ \begin{pmatrix}1 & 0 \end{pmatrix} \otimes \boldsymbol{A} \otimes \begin{pmatrix}1 \\ 1\end{pmatrix} + \begin{pmatrix}0 & 1 \end{pmatrix} \otimes \begin{pmatrix}1 \\ 1\end{pmatrix} \otimes \boldsymbol{B} ~=~ \begin{bmatrix} 8&0&10&6 \\ 8&0&2&8 \\ 4&6&10&6 \\ 4&6&2&8 \end{bmatrix} .\] The value of the game is the value of the best game \[V(\boldsymbol{D}_\text{seen}) ~=~ \min \big\{ V(\boldsymbol{A}), V(\boldsymbol{B})\big\} ~=~ 4 + \frac{4}{5} ,\] and the saddle point is \begin{align*} \boldsymbol{p}_* &~=~ \boldsymbol{p}_*(\boldsymbol{A}) \otimes \begin{pmatrix}1 \\ 0 \end{pmatrix} %&& ~=~ \left(\frac{1}{5}, 0, \frac{4}{5}, 0\right) , \\ \boldsymbol{q}_* &~=~ \begin{pmatrix}1 \\ 0 \end{pmatrix} \otimes \boldsymbol{q}_*(\boldsymbol{A}) %&& ~=~ \left(\frac{3}{5}, \frac{2}{5}, 0, 0\right) . \end{align*}

The column player picks a game without showing the row player. The payoff matrix of this setting is \[\begin{array}{l|cccc} \boldsymbol{D}_\text{unseen} & \boldsymbol{A},L & \boldsymbol{A},R & \boldsymbol{B},L & \boldsymbol{B},R \\ \hline T & A_{TL} & A_{TR} & B_{TL} & B_{TR} \\ B & A_{BL} & A_{BR} & B_{BL} & B_{BR} \end{array}\] i.e. the concatenation of the payoff tables \[\boldsymbol{D}_\text{unseen} ~=~ \begin{bmatrix} \boldsymbol{A}& \boldsymbol{B}\end{bmatrix} ~=~ \begin{bmatrix} 8&0&10&6 \\ 4&6&2&8 \end{bmatrix} .\] The solution is given by \begin{align*} V(\boldsymbol{D}_\text{unseen}) &~=~ 4 + \frac{2}{7}, & \boldsymbol{p}_* &~=~ \left(\frac{2}{7}, \frac{5}{7}\right), & \boldsymbol{q}_* &~=~ \left(0, \frac{4}{7}, \frac{3}{7}, 0\right) \end{align*} Now the column player randomizes over the worst columns from either game.

For our running example \begin{align*} V(\boldsymbol{C}_\text{both}) &~=~ 5 + \frac{4}{5} \\ V(\boldsymbol{C}_\text{neither}) &~=~ 5 + \frac{2}{5} \\ V(\boldsymbol{C}_\text{col}) &~=~ 4 + \frac{3}{5} \\ V(\boldsymbol{D}_\text{seen}) &~=~ 4 + \frac{4}{5} \\ V(\boldsymbol{D}_\text{unseen}) &~=~ 4 + \frac{2}{7} \end{align*} As more information is better, we expect \[V(\boldsymbol{C}_\text{col}) ~\le~ \min \{ V(\boldsymbol{C}_\text{neither}) , V(\boldsymbol{C}_\text{both}) \}\] (by symmetry the minimum on the right can go both ways) and \[V(\boldsymbol{D}_\text{unseen}) ~\le~ V(\boldsymbol{D}_\text{seen}) .\] And finally, because choosing the game trumps knowing which game we play at random, \[V(\boldsymbol{D}_\text{unseen}) ~\le~ V(\boldsymbol{C}_\text{col})\] and \[V(\boldsymbol{D}_\text{seen}) ~\le~ V(\boldsymbol{C}_\text{both}) .\] These inequalities are summarised in the figure below.