Components of Multi-Agent Choice

An Example: The Battle of the Bismarck Sea

To begin our dialogue of multi-agent choice, let's look at an example that can be found in both Luce and Raiffa and in Casti. For a map of the relevant geography, go here. During World War II, the northern half of the island of New Guinea was controlled by the Japanese, while the Allies controlled the southern half. Intelligence reports indicated that the Japanese were assembling a troop and supply convoy that would move from the port of Rabaul, which lies on the eastern part of the island of New Britain, to Lae which lies on the western part of New Guinea. The convoy could take one of two different routes: (1) north of New Britain, where rain and bad visibility were predicted, or (2) south, where the weather was expected to be fair. It was estimated that the trip would take 3 days on either route.

The allied forces were commanded by General George C. Kenney. His objective was to do as much damage as possible to the Japanese convoy, but he had to find them first. He could either start searching north of New Britain or south.   Assume that it will take one day to either find the convoy or determine that they took another route under stormy conditions, and assume that it will take almost no time at all to find the convoy under sunny conditions, but one day to determine if the convoy took another route under sunny conditions.  With these assumptions, we know the following:


If you were the Japanese commander,  what do you do?  If you were the Allied commander, what would you do?  We will revisit this problem in a few days and try to apply multi-agent reasoning to it.

In the war, the Japanese sailed north and the Allies searched north.  The convoy was sighted about one day after it sailed and the Japanese suffered severe losses.  Eventually (1944), the Japanese forces on the island  were isolated and, though the force withered, the force continued guerilla warfare until the war ended in 1945.


Shared Consequences

The Battle of the Bismarck Sea illustrates the key element of multi-agent problems.  It also illustrates the primary difference between single agent and multiple agent decision-making.This key difference is that consequences depend on the actions of both agents. In other words, what I experience as a result of my actions depends not only on how I act but also on how my neighbor acts (kind of like working with a lab partner).

We can diagram the problem of making a choice in a world with multiple decision-makers; we will use this diagram as a general framework for much of what we discuss in the remainder of the class.  In this framework, there are states denoted by S's, actions denoted by A's, consequences denoted by C's, preferences denoted by P's, goals denoted by G's, and utilities denoted by U's.  There is also an implicit decision process by which preferences are turned into utilities; this process is denoted by a D, but we will postpone talking about this process until next lecture.

States, S, include those elements of the state of nature that affect my choice.  In the Battle of the Bismarck Sea, the set S was the set of possible weather patterns.  In the problem as described, the weather was known which means that a particular state s from S was chosen by nature.

The set A1 is the set of actions that are available to me, and set A2 are the set of actions that are available to you (or to some collection of other agents).  If the Japanese are agent 1 and the Allies are agent 2, then the sets of actions are A1={Sail North, Sail South} and A2={Search North, Search South}.

The set C is the set of consequences that can possibly emerge.  In the Battle, we presented these consequences as days spent searching and days spent bombing.  If we assume that the cost of searching is negligible, then the key aspect of the set of consequences is the number of days spent bombing.  Thus, when s={cloudy north, sunny south) the set of possible consequences was C={1 day, 2 days, 3 days}.  The consequence that would actually occur depended on the choices of the other agents and the state of nature.

The goals of the Allies and the Japanese are diametrically opposed.  For the Allies, the more days spent bombing the better, but for the Japanese, the less days spent bombing the better.  Thus, if the Japanese are agent 1 and the Allies are agent 2, then the goals are G1={Minimize Bombing} and G2={Maximize Bombing}.

These goals induce a preference pattern over the set of consequences for each of the agents.  The Allies prefer more days of bombing to fewer days, and the Japanese prefer fewer days of bombing to more days.  Thus, if the Japanese are agent 1 and the Allies are agent 2 P1= 1 day > 2 days > 3 days and P2= 3 days > 2 days > 1 day.

There are many ways that preferences could be turned into utilities, and the proper way depends on how strongly the agents prefer some consequences to others.  For example, perhaps the allies think that if they do not bomb for at least two days then there is no reason to bomb, but the difference between two days of bombing and three days of bombing is not great.  A utility function that reflects this preference strength might be U2 (3 days)=10,  U2 (2 days)=9, and U2 (1 day)=1.  This utility function preserves the order over consequences, but the difference in utility between 3 days and 2 days is smaller than the difference between 2 days and 1 day.
 


Course Objectives


The sharing of consequences has a funny effect on how I make my decisions; my future is now tied to the future of another agent.  This gets complex  for just two agents, but just imagine how complex it can get when we have three, four, or N agents.  If I am part of a society, I should obviously let the existence of other agents influence how I make my decisions.  We will work hard this semester to try and understand how to solve such multi-agent problems.

The primary objective of this class is to equip you to understand and research multi-agent systems.  We will talk about how to model multi-agent systems, review different definitions of what it means to solve such systems, and discuss algorithms for discovering these solutions.  We will not, however, spend very much time discussing agents as a programming paradigm.

You probably noticed that I mentioned that there are several different definitions of what is a proper solution.  This makes the field of multi-agent systems very broad.  Trying to cover all of multi-agent theory in class is a bit like trying to cover all animals except elephants in class.  Just like there is much more to say about other-than-elephant animals than there is to say about elephants, there is also much more to say about multi-agent systems than there is to say about single agent systems.