Unknown Games

Wouter M. Koolen

2015-05-13

Abstract

We investigate what happens to simple matrix games and their solutions if we either randomly or strategically choose between two games.

Introduction

In this note we consider two-player zero-sum matrix games. Such games are specified by a payoff matrix. We assume that we have two such games \(\boldsymbol{A}\) and \(\boldsymbol{B}\). As our running examples we will use \begin{align*} \boldsymbol{A} &~=~ \begin{bmatrix} 8 & 0 \\ 4 & 6 \end{bmatrix} & & \text{and} & \boldsymbol{B} &~=~ \begin{bmatrix} 10 & 6 \\ 2 & 8 \end{bmatrix} . \end{align*} To name the rows and columns we may also write \(\boldsymbol{A}\) as follows: \[\begin{array}{l|cc} \boldsymbol{A}& L & R \\ \hline T & 8 & 0 \\ B & 4 & 6 \end{array} .\] The interpretation of such a game is as follows. The row player chooses either strategy T(op) or B(ottom), and, independently, the column player chooses either strategy L(eft) or R(ight). The payoff to the row player is then as specified in the entry of the matrix determined by their joint choice. The row player aims to maximize his payoff, while the column player aims to minimize it. Von Neumann’s minimax theorem states that for any such game there are randomized strategies \(\boldsymbol{p}_*\) (probability distribution on rows) and \(\boldsymbol{q}_*\) (probability distribution on columns) such that for any strategies \(\boldsymbol{p}\) and \(\boldsymbol{q}\) \[\boldsymbol{p}^\intercal\boldsymbol{A}\boldsymbol{q}_* ~\le~ \boldsymbol{p}_*^\intercal\boldsymbol{A}\boldsymbol{q}_* ~\le~ \boldsymbol{p}_*^\intercal\boldsymbol{A}\boldsymbol{q} .\] That is, by playing \(\boldsymbol{p}_*\) the row player can force expected payoff at least \(\boldsymbol{p}_*^\intercal\boldsymbol{A}\boldsymbol{q}_*\), whereas by playing \(\boldsymbol{q}_*\) the column player can force expected payoff at most \(\boldsymbol{p}_*^\intercal\boldsymbol{A}\boldsymbol{q}_*\). Such optimal strategies \(\boldsymbol{p}_*\) and \(\boldsymbol{q}_*\) are called a saddle-point pair, and the expected payoff \(\boldsymbol{p}_*^\intercal\boldsymbol{A}\boldsymbol{q}_*\) resulting under such optimal play is called the value of the game. We will denote the value by \(V(\boldsymbol{A})\). For the examples above, we have values and saddle point pairs \begin{align*} V(\boldsymbol{A}) &~=~ 4 + \frac{4}{5}, & \boldsymbol{p}_* &~=~ \left(\frac{1}{5}, \frac{4}{5}\right), & \boldsymbol{q}_* &~=~ \left(\frac{3}{5}, \frac{2}{5}\right), \quad \text{and} \\ %\intertext{and} V(\boldsymbol{B}) &~=~ 6 + \frac{4}{5}, & \boldsymbol{p}_* &~=~ \left(\frac{3}{5}, \frac{2}{5}\right), & \boldsymbol{q}_* &~=~ \left(\frac{1}{5}, \frac{4}{5}\right). \end{align*}

The minimax theorem tells us what will happen in a single matrix game \(\boldsymbol{A}\). We are interested in what happens when we play either of two games \(\boldsymbol{A}\) and \(\boldsymbol{B}\). We first look at what happens if we choose between games \(\boldsymbol{A}\) and \(\boldsymbol{B}\) randomly. We then look at having one of the players control the choice of game. In either case, we study the effect of whether the players know which game they are playing or not.

Notation

To construct payoff matrices the following compact notation comes in handy. The Kronecker product of an \(r \times c\) matrix \(\boldsymbol{A}\) and a \(p \times q\) matrix \(\boldsymbol{B}\) is the \(r p \times c q\) matrix given by \[\boldsymbol{A}\otimes \boldsymbol{B} ~=~ \begin{pmatrix} a_{11} \boldsymbol{B}& \ldots & a_{1c} \boldsymbol{B}\\ \vdots & \ddots & \vdots \\ a_{r1} \boldsymbol{B}& \ldots & a_{rc} \boldsymbol{B} \end{pmatrix} .\]

Game determined at random

In this section we have a look at what happens if either \(\boldsymbol{A}\) or \(\boldsymbol{B}\) is played, as determined by a coin flip. Throughout, we will use a coin with known bias \(p\) on \(\boldsymbol{A}\) and \(q := 1-p\) on \(\boldsymbol{B}\). In our running computational example we will use \(p = q = \frac{1}{2}\). What game results depends on whether the players see the outcome of the coin flip before they choose their strategy or not. There are hence four cases, of which two are symmetric. So we study all three case.

Coinflip seen by both players

A coin is flipped that both players observe. Then the corresponding game is played. This game corresponds to the following \(4 \times 4\) matrix game: \[\begin{array}{l|cccc} \boldsymbol{C}_\text{both} & L,L & L,R & R,L & R,R \\ \hline T,T & p A_{TL} + q B_{TL} & p A_{TL} + q B_{TR} & p A_{TR} + q B_{TL} & p A_{TR} + q B_{TR} \\ T,B & p A_{TL} + q B_{BL} & p A_{TL} + q B_{BR} & p A_{TR} + q B_{BL} & p A_{TR} + q B_{BR} \\ B,T & p A_{BL} + q B_{TL} & p A_{BL} + q B_{TR} & p A_{BR} + q B_{TL} & p A_{BR} + q B_{TR} \\ B,B & p A_{BL} + q B_{BL} & p A_{BL} + q B_{BR} & p A_{BR} + q B_{BL} & p A_{BR} + q B_{BR} \end{array}\] where the notation \(T,B\) designates the strategy that plays \(T\) in game \(\boldsymbol{A}\) and \(B\) in game \(\boldsymbol{B}\).

We may succinctly write this matrix as, \[\boldsymbol{C}_\text{both} ~=~ p \boldsymbol{A}\otimes \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} + q \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \otimes \boldsymbol{B} ~=~ \begin{bmatrix} 9&7&5&3 \\ 5&8&1&4 \\ 7&5&8&6 \\ 3&6&4&7 \end{bmatrix} .\] In a way this setting is somewhat trivial. As both players observe the coin flip, they can play the optimal strategy for the chosen game. And indeed, this setting has value \[V(\boldsymbol{C}_\text{both}) ~=~ p V(\boldsymbol{A}) + q V(\boldsymbol{B}) ~=~ 5 + \frac{4}{5}\] and saddle point pair \begin{align*} \boldsymbol{p}_* &~=~ \boldsymbol{p}_*(\boldsymbol{A}) \otimes \boldsymbol{p}_*(\boldsymbol{B}) %&& ~=~ \left( \frac{3}{25} , \frac{2}{25} , \frac{12}{25} , \frac{8}{25} \right) \\ \boldsymbol{q}_* &~=~ \boldsymbol{q}_*(\boldsymbol{A}) \otimes \boldsymbol{q}_*(\boldsymbol{B}) %&& ~=~ \left( \frac{3}{25} , \frac{12}{25} , \frac{2}{25} , \frac{8}{25} \right) . \end{align*}

Coinflip seen by neither player

Now we consider a coin flip that neither player observes. Then the corresponding game is played. This game is \[\boldsymbol{C}_\text{neither} ~=~ p \boldsymbol{A}+ q \boldsymbol{B} ~=~ \begin{bmatrix} 9&3 \\ 3&7 \end{bmatrix} ,\] that is, the average payoff matrix. The solution of \(\boldsymbol{C}_\text{neither}\) is \begin{align*} V(\boldsymbol{C}_\text{neither}) &~=~ 5 + \frac{2}{5}, & \boldsymbol{p}_* &~=~ \left(\frac{2}{5}, \frac{3}{5}\right), & \boldsymbol{q}_* &~=~ \left(\frac{2}{5}, \frac{3}{5}\right). \end{align*}

Coinflip seen only by column player

A coin is flipped that only the column player observes. Then the corresponding game is played.

Let us write down the payoff matrix. The strategies for row are top (T) and bottom (B) as before. The strategies for column are now conditioned on the coin flip. So they are left(L) or right(R) given either outcome. \[\begin{array}{l|cccc} \boldsymbol{C}_\text{col} & L,L & L,R & R,L & R,R \\ \hline T & p A_{TL} + q B_{TL} & p A_{TL} + q B_{TR} & p A_{TR} + q B_{TL} & p A_{TR} + q B_{TR} \\ B & p A_{BL} + q B_{BL} & p A_{BL} + q B_{BR} & p A_{BR} + q B_{BL} & p A_{BR} + q B_{BR} \end{array}\] That is, we have \[\boldsymbol{C}_\text{col} ~=~ p \boldsymbol{A}\otimes \begin{pmatrix} 1 & 1 \end{pmatrix} + q \begin{pmatrix} 1 & 1 \end{pmatrix} \otimes \boldsymbol{B} ~=~ \begin{bmatrix} 9&7&5&3\\ 3&6&4&7 \end{bmatrix} .\] The solution of \(\boldsymbol{C}_\text{col}\) is \begin{align*} V(\boldsymbol{C}_\text{col}) &~=~ 4 + \frac{3}{5}, & \boldsymbol{p}_* &~=~ \left(\frac{3}{5}, \frac{2}{5}\right), & \boldsymbol{q}_* &~=~ \left(0, 0, \frac{4}{5}, \frac{1}{5}\right). \end{align*}

Game determined strategically

In this section we look at what happens if one of the players determines which game is played. Let us give the choice to the column player. So now the column player has another component to his strategy, namely the choice of \(\boldsymbol{A}\) vs \(\boldsymbol{B}\).

Choice seen by row player

The column player discloses his choice of game, which is then played. The game matrix becomes \[\begin{array}{l|cccc} \boldsymbol{D}_\text{seen} & \boldsymbol{A},L & \boldsymbol{A},R & \boldsymbol{B},L & \boldsymbol{B},R \\ \hline T,T & A_{TL} & A_{TR} & B_{TL} & B_{TR} \\ T,B & A_{TL} & A_{TR} & B_{BL} & B_{BR} \\ B,T & A_{BL} & A_{BR} & B_{TL} & B_{TR} \\ B,B & A_{BL} & A_{BR} & B_{BL} & B_{BR} \end{array}\] that is \[\boldsymbol{D}_\text{seen} ~=~ \begin{pmatrix}1 & 0 \end{pmatrix} \otimes \boldsymbol{A} \otimes \begin{pmatrix}1 \\ 1\end{pmatrix} + \begin{pmatrix}0 & 1 \end{pmatrix} \otimes \begin{pmatrix}1 \\ 1\end{pmatrix} \otimes \boldsymbol{B} ~=~ \begin{bmatrix} 8&0&10&6 \\ 8&0&2&8 \\ 4&6&10&6 \\ 4&6&2&8 \end{bmatrix} .\] The value of the game is the value of the best game \[V(\boldsymbol{D}_\text{seen}) ~=~ \min \big\{ V(\boldsymbol{A}), V(\boldsymbol{B})\big\} ~=~ 4 + \frac{4}{5} ,\] and the saddle point is \begin{align*} \boldsymbol{p}_* &~=~ \boldsymbol{p}_*(\boldsymbol{A}) \otimes \begin{pmatrix}1 \\ 0 \end{pmatrix} %&& ~=~ \left(\frac{1}{5}, 0, \frac{4}{5}, 0\right) , \\ \boldsymbol{q}_* &~=~ \begin{pmatrix}1 \\ 0 \end{pmatrix} \otimes \boldsymbol{q}_*(\boldsymbol{A}) %&& ~=~ \left(\frac{3}{5}, \frac{2}{5}, 0, 0\right) . \end{align*}

Choice hidden from row player

The column player picks a game without showing the row player. The payoff matrix of this setting is \[\begin{array}{l|cccc} \boldsymbol{D}_\text{unseen} & \boldsymbol{A},L & \boldsymbol{A},R & \boldsymbol{B},L & \boldsymbol{B},R \\ \hline T & A_{TL} & A_{TR} & B_{TL} & B_{TR} \\ B & A_{BL} & A_{BR} & B_{BL} & B_{BR} \end{array}\] i.e. the concatenation of the payoff tables \[\boldsymbol{D}_\text{unseen} ~=~ \begin{bmatrix} \boldsymbol{A}& \boldsymbol{B}\end{bmatrix} ~=~ \begin{bmatrix} 8&0&10&6 \\ 4&6&2&8 \end{bmatrix} .\] The solution is given by \begin{align*} V(\boldsymbol{D}_\text{unseen}) &~=~ 4 + \frac{2}{7}, & \boldsymbol{p}_* &~=~ \left(\frac{2}{7}, \frac{5}{7}\right), & \boldsymbol{q}_* &~=~ \left(0, \frac{4}{7}, \frac{3}{7}, 0\right) \end{align*} Now the column player randomizes over the worst columns from either game.

Summary

For our running example \begin{align*} V(\boldsymbol{C}_\text{both}) &~=~ 5 + \frac{4}{5} \\ V(\boldsymbol{C}_\text{neither}) &~=~ 5 + \frac{2}{5} \\ V(\boldsymbol{C}_\text{col}) &~=~ 4 + \frac{3}{5} \\ V(\boldsymbol{D}_\text{seen}) &~=~ 4 + \frac{4}{5} \\ V(\boldsymbol{D}_\text{unseen}) &~=~ 4 + \frac{2}{7} \end{align*} As more information is better, we expect \[V(\boldsymbol{C}_\text{col}) ~\le~ \min \{ V(\boldsymbol{C}_\text{neither}) , V(\boldsymbol{C}_\text{both}) \}\] (by symmetry the minimum on the right can go both ways) and \[V(\boldsymbol{D}_\text{unseen}) ~\le~ V(\boldsymbol{D}_\text{seen}) .\] And finally, because choosing the game trumps knowing which game we play at random, \[V(\boldsymbol{D}_\text{unseen}) ~\le~ V(\boldsymbol{C}_\text{col})\] and \[V(\boldsymbol{D}_\text{seen}) ~\le~ V(\boldsymbol{C}_\text{both}) .\] These inequalities are summarised in the figure below.