Alpha beta... gamma

14 December 1998

Hi, does anybody know about min-max and alpha-beta search extended to three or more players?

I often read that it's far less simple because of psychological matters, like temporary cooperations and sensitivity to others' blunders. But psychology was already part of the game in two-player games like chess, where you could heavily rely on assumptions about the opponent's psychology, and AI programs and nice human players often choose to assume 1) the others are as good and fair as they are -- with the same meaning for those adjectives -- and 2) the others assume the same about them and no more. I bet the same choice can be made with more players, resulting in a smart, solid, and still sensitive play style. I've played multi-player versions of Pente and Hexbo that way and am very happy with it.

A natural way to extend the min-max search -- classically used by computer players in two-player games -- to three (or more) players is to go on evaluating the board with a kind of additive measure from the point of view of each player. Let's say a position gets the triplet (x, y, z) where x measures how strong the first player is, in some way we would like to be as absolute as possible, y how strong the second is, and z how strong the third is. Then a better measure triplet for the position compares each value with the stronger of the two others, giving

( x - max(y, z),  y - max(x, z),  z - max(x, y) )

= ( min(x-y, x-z),  min(y-x, y-z),  min(z-x, z-y) )

and the first player, for example, must strive to maximize min(x-y, x-z). Well, she must if she doesn't look further than her first move and doesn't consider the other players' similar objectives -- and this can obviously be criticized, as it may well turn out that ignoring the weakest player's choices would be fatal. But that's only how things look at one ply deep in the search. With a deeper min-max search in the game tree, things get far more interesting, and the weakest player's moves are automatically taken into account, resulting in tight and subtle temporary implicit cooperations among the players.
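
In Python, for instance, the transform is just the following (the raw scores are made up for the example):

def relative(x, y, z):
    # each player's raw score minus the stronger of the two others' scores
    return (min(x - y, x - z), min(y - x, y - z), min(z - x, z - y))

print(relative(5, 3, 2))   # (2, -2, -3): only the leader gets a positive value
print(relative(4, 4, 1))   # (0, 0, -3): the two leaders neutralize each other

Note that any two of these relative values always sum to at most zero. That little fact is what the pruning conditions further down rely on.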

Sorry if this is well-known. I tend to think programmers are blind to it when I see how badly computers still play classic multi-player games such as Worms, Hearts, Chinese Checkers, or Starcraft, while they are so good at Chess and others.

Now we can even extend the alpha-beta pruning technique to save time and avoid exploring useless branches. Here is the search function. See how close it is to the classical two-player alpha-beta. You can bet it goes on like that with four or more players.

function FABC( Position, a, b, c, Depth ) returns a triplet of reals;
{  if Depth=0 or game over then
   {  (x, y, z) := some static additive evaluation of Position,
      x measuring how good Position is for the Player to move,
      y measuring how good it is for the next Player,
      z measuring how good it is for the further next Player;

      return ( min(x-y, x-z), min(y-x, y-z), min(z-x, z-y) );
   }
   Bestu := -infinite;
   for every valid move M on Position do
   {  (v, w, u) := FABC( M(Position), -c, -a, b, Depth-1 );
      if u > Bestu then
      {  Bestu := u;  BestValue := (u, v, w);
         if u >= b or u >= c then return(BestValue);
         if u > a then a := u;
      }
   }
   return(BestValue);
}
The parameters in the first call must get initial values as follows:
 FABC( PositionToAnalyze, -infinite, +infinite, +infinite, WantedDepth);
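
If you prefer something directly runnable, here is the same function sketched in Python, tried on a toy game where the players take turns picking one of the remaining numbers and the additive evaluation is simply each player's total so far. The helpers game_over(), valid_moves(), apply_move() and raw_eval() only stand for whatever your real game would provide.

INF = float('inf')

def relative(x, y, z):
    # raw scores -> each player's lead over the stronger of the two others
    return (min(x - y, x - z), min(y - x, y - z), min(z - x, z - y))

# Toy game: a position is (numbers_left, (t0, t1, t2)), always seen from the
# player to move; t0 is his total, t1 the next player's, t2 the third's.

def game_over(pos):
    return len(pos[0]) == 0

def valid_moves(pos):
    return list(pos[0])

def apply_move(pos, n):
    numbers, (t0, t1, t2) = pos
    rest = list(numbers)
    rest.remove(n)
    return (tuple(rest), (t1, t2, t0 + n))   # rotate to the next player's view

def raw_eval(pos):
    return pos[1]

# A direct transcription of FABC above; the returned triplet is always
# (player to move, next player, further next player).
def fabc(pos, a, b, c, depth):
    if depth == 0 or game_over(pos):
        return relative(*raw_eval(pos))
    best, best_u = None, -INF
    for m in valid_moves(pos):
        # the child's triplet starts with the next player's value,
        # so the value for the player to move here is its last component
        v, w, u = fabc(apply_move(pos, m), -c, -a, b, depth - 1)
        if u > best_u:
            best_u, best = u, (u, v, w)
            if u >= b or u >= c:   # same cutoff as in the pseudo-code above
                return best
            if u > a:
                a = u
    return best

print(fabc(((1, 2, 3, 4, 5, 6), (0, 0, 0)), -INF, INF, INF, 6))
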
Now here are some typical situations to show how pruning works with 3 players. They're a good start toward showing that FABC() above is sound and optimal in general.

Let's imagine Player 0 found a first move valued (r,s,t). Player 0 now tries a second move and Player 1 finds a first answer valued (u,v,w). Under which conditions doesn't Player 1 need to look for better answers, because Player 0 is already sure to prefer his first move once he sees this first answer from Player 1? Under the following equivalent conditions:

 -min(v-u, v-w)  <=  min(r-s, r-t)                (1)

 min(r-s, r-t) + min(v-u, v-w)  >=  0

Indeed, any answer Player 1 likes at least as much gives him at least min(v-u, v-w) and therefore leaves Player 0 with at most -min(v-u, v-w), which under (1) is no better than the min(r-s, r-t) his first move already guarantees.
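
For instance, with made-up numbers: if Player 0's first move is statically valued (r,s,t) = (5,3,2), he can already count on min(r-s,r-t) = 2. If Player 1's first answer to the second move comes out as (u,v,w) = (1,4,0), it is worth min(v-u,v-w) = 3 to Player 1, and since -3 <= 2, condition (1) holds: any answer Player 1 likes at least as much leaves Player 0 with at most -3, so Player 0 sticks to his first move and Player 1 can stop searching right there.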

Now, let's go one player further in the search tree.

First, let's consider that Player 2 hasn't fully evaluated Player 1's first answer to Player 0's second move. Player 2 first considered an answer valued (i,j,k) and is now looking for others. Under which conditions doesn't Player 2 need to search further, because any better answer would only result in a value (u,v,w) for Player 1's first answer satisfying condition (1) above? A better answer for Player 2 only tells us that Player 1 will end up with at most -min(k-i, k-j); it never guarantees him the -min(r-s, r-t) that (1) asks for, so this isn't a condition ever possible to satisfy. There is no general way to save time in this situation.

Then let's consider again that Player 1's first answer to Player 0's second move has been fully evaluated, but with a value (u,v,w) not satisfying (1). So Player 1 tries another answer and Player 2 is going to evaluate it. Player 2 finds a first answer to it valued (i,j,k). Let's assume that condition (1) transposed to Player 1's and Player 2's moves doesn't hold, i.e. -min(k-i, k-j) > min(v-u, v-w). Under which conditions doesn't Player 2 need to search any further, because any better answer for him would either get a value worse than (u,v,w) for Player 1 or worse than (r,s,t) for Player 0? Under conditions which, after some graphical reasoning for example, turn out to mean exactly

 -min(k-i, k-j)  <=  min(r-s, r-t)

that is, condition (1) again, this time between Player 2's current answer and Player 0's first move.
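
For instance, carrying on with made-up numbers: say Player 0's first move was valued (r,s,t) = (5,3,2), so he counts on min(r-s,r-t) = 2, and Player 1's first, fully evaluated answer to the second move is (u,v,w) = (6,1,2), worth min(v-u,v-w) = -5 to him, which indeed doesn't satisfy (1). Player 1 tries another answer and Player 2's first answer to it is valued (i,j,k) = (0,1,4), worth min(k-i,k-j) = 3 to Player 2. The transposed condition fails since -3 > -5, but -3 <= 2 holds, so Player 2 can stop right there: any answer he likes at least as much leaves Player 0 with at most -3, worse for Player 0 than the 2 his first move already guarantees.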

That's all for now. Does it ring a bell somewhere?

From: "Claude Chaunier" <chaunier@handy.univ-lyon1.fr>
Newsgroups: comp.ai.games,rec.games.abstract,rec.games.diplomacy
