Frameworks For Decision Making

Last Update: Jan 7, 2003

Book Stuff


Decision Theory Framework

These decision components fit together as shown in the following figure.  Note how consequences are produced when actions and states are combined.  In essence, when an agent chooses action a from the set A and nature chooses state s from the set S, a consequence c from the set C is produced.  After this consequence is produced, the agent can evaluate how much it likes the consequence.
Figure: Elements of a decision
Note further that when we omit goal dependence, the utility function is a mapping from consequences to the real line, u: C --> R.  However, we often use a shorthand notation obtained by composing the consequence function c: A x S --> C with the utility function u, which gives
u'(a,s) = (u ∘ c)(a,s) = u(c(a,s)).
We can thus treat utility as a mapping from the state-action pair to the real line without loss of generality.  This is especially convenient in sequential decision problems.  A few examples will help illustrate these ideas.
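The composition can be sketched in a few lines of Python.  This is only an illustration: the umbrella problem, the names `consequence`, `utility`, and `u_prime`, and the numeric utilities are all assumptions that do not come from the notes.

```python
# Hypothetical toy problem; actions, states, and numbers are illustrative only.
ACTIONS = ["take_umbrella", "leave_umbrella"]   # the set A
STATES = ["rain", "sun"]                        # the set S

def consequence(a, s):
    """c: A x S --> C. Maps an action-state pair to a consequence."""
    if s == "rain":
        return "dry" if a == "take_umbrella" else "wet"
    return "dry_but_encumbered" if a == "take_umbrella" else "dry"

def utility(c):
    """u: C --> R. Assumed numeric preferences over consequences."""
    return {"dry": 1.0, "dry_but_encumbered": 0.8, "wet": 0.0}[c]

def u_prime(a, s):
    """u'(a,s) = u(c(a,s)): utility defined directly on action-state pairs."""
    return utility(consequence(a, s))

for a in ACTIONS:
    for s in STATES:
        print(a, s, u_prime(a, s))
```

Once `u_prime` is available, the set C never has to be enumerated explicitly, which is the convenience mentioned above.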
 
 
 

Example 1: Movement in a dynamic world
Consider a problem of trying to steer a spacecraft from some point in space to a docking station.  The relevant states in the world are the [x,y,z] positions of the spacecraft in 3 dimensions.  An action is a triple of forces in the [x,y,z] directions, which translate into accelerations in each of these directions.  The consequences that are produced are new [x,y,z] positions.  The goal is to make [x,y,z] approach zero in a smooth way (where we have assumed that the location of the docking station is at the origin).  We prefer consequences that are close to the docking station and that are far from collisions with things in space.  We can encode the utility of each decision via a distance metric where small distances are preferred.
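A minimal sketch of this example follows.  It assumes simple point-mass dynamics with unit mass and a unit time step, carries a velocity alongside the position, and ignores collisions and smoothness; the function names and the candidate forces are hypothetical.

```python
import math

def step(position, velocity, force, mass=1.0, dt=1.0):
    """Assumed point-mass dynamics: force -> acceleration -> new velocity and position."""
    accel = tuple(f / mass for f in force)
    new_velocity = tuple(v + a * dt for v, a in zip(velocity, accel))
    new_position = tuple(p + v * dt for p, v in zip(position, new_velocity))
    return new_position, new_velocity

def utility(position):
    """Negative Euclidean distance to the docking station at the origin,
    so that small distances are preferred."""
    return -math.sqrt(sum(p * p for p in position))

position, velocity = (3.0, 4.0, 0.0), (0.0, 0.0, 0.0)
candidate_forces = [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (-3.0, -4.0, 0.0)]

# Pick the force whose resulting position has the highest utility.
best = max(candidate_forces, key=lambda f: utility(step(position, velocity, f)[0]))
print(best)
```

A one-step greedy choice like this is only meant to show how actions, consequences, and utilities connect; it is not a docking controller.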

Example 2: Estimating the true state of the world
Another problem of interest is to try to figure out what the true state of the world is.  For this problem, the action is nothing more than our best guess (or estimate) of the world's state.  We can denote this action as a = s_g (where the subscript g indicates that our action is a guess about the state of the world).  For each possible state of the world, our guess is either right or wrong, so the consequences are being correct or being in error.  Our goal is to be correct, so a natural utility function is 1 if the guess is correct and 0 if it is not.
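The 0/1 utility for the estimation problem can be written out directly.  In this sketch the three state labels are placeholders chosen for illustration.

```python
STATES = ["s1", "s2", "s3"]  # placeholder state labels (assumed)

def consequence(guess, state):
    """The guess s_g either matches the true state or it does not."""
    return "correct" if guess == state else "error"

def utility(c):
    """1 for a correct guess, 0 for an error."""
    return 1.0 if c == "correct" else 0.0

def u_prime(guess, state):
    """Utility as a function of the (action, state) pair."""
    return utility(consequence(guess, state))

# Print the utility table: rows are guesses, columns are true states.
for g in STATES:
    print(g, [u_prime(g, s) for s in STATES])
```

The table is the identity matrix: every guess earns utility 1 in exactly one state and 0 in all others.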
 


Decisions in a Context

This formalism is worthless unless we put it in a context.  This context is the real world.  When an agent exists in the real world and must solve problems in this world, we refer to the agent as situated in the world.  Our objective in artificial intelligence is to translate something we sense into a choice: agents map the set X of things that can be sensed into the set A of actions.  We can do this in many ways, but the standard approach is to take a goal, consider possible actions and the consequences they might produce, and then choose the action that is most likely to produce a consequence that brings about the desired goal.
Figure: Decision elements in a context
For situated agents, it helps to segment the decision elements discussed above into three subcomponents:
  1. Sensory Perception: this processes what is sensed into a belief about likely states.  Thus, we introduce the set B of beliefs about the world.  A belief is usually encoded as a probability or set of probabilities, and we often assume that beliefs are updated according to Bayes' rule.
  2. World Model: this consists of what we know and believe about the world.  Note that it includes the set of beliefs produced by the sensory-perception module.
  3. Decision Maker: this processes consequences into utilities, and utilities plus beliefs into decisions.  The de facto model of decision making is the principle of maximizing expected utility.  The module labeled D is a function that takes utilities and beliefs and returns a choice (from the set of actions A).
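The three subcomponents above can be sketched as follows.  The prior, the likelihood table, the utility table, and all function names are assumptions made for illustration; the sketch only shows a Bayes' rule update feeding a maximum-expected-utility choice.

```python
# World model (assumed numbers): prior P(s), sensor model P(x | s), utilities u'(a, s).
PRIOR = {"rain": 0.3, "sun": 0.7}
LIKELIHOOD = {"rain": {"clouds": 0.8, "clear": 0.2},
              "sun":  {"clouds": 0.3, "clear": 0.7}}
UTILITY = {("take_umbrella", "rain"): 1.0, ("take_umbrella", "sun"): 0.8,
           ("leave_umbrella", "rain"): 0.0, ("leave_umbrella", "sun"): 1.0}
ACTIONS = ["take_umbrella", "leave_umbrella"]

def bayes_update(prior, likelihood, x):
    """Sensory perception: P(s | x) is proportional to P(x | s) P(s)."""
    unnormalized = {s: likelihood[s][x] * p for s, p in prior.items()}
    total = sum(unnormalized.values())
    return {s: v / total for s, v in unnormalized.items()}

def expected_utility(a, belief):
    """Average the utility of action a over the believed states."""
    return sum(p * UTILITY[(a, s)] for s, p in belief.items())

def decide(belief):
    """Decision maker D: return the action with maximum expected utility."""
    return max(ACTIONS, key=lambda a: expected_utility(a, belief))

for x in ["clouds", "clear"]:
    belief = bayes_update(PRIOR, LIKELIHOOD, x)
    print(x, belief, decide(belief))
```

With these numbers, the same utilities lead to different choices depending on what is sensed, which is the sense in which a situated agent maps X into A.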
We should note that purely deductive systems (predicate logic and first order logic) can be squished into this framework.  One way to squish these things is to let X be percepts in the world, let S be internal states, restrict beliefs in B to be either 1 or 0 (corresponding to predicates that return true and false, respectively), and let U be a set of deduced actions (that are encoded into our rule base via a system designer who has goals in mind).  A rule base is then invoked to say how actions produce consequences.  This means that the decision maker must either deduce which consequences will be produced by which actions or, alternatively, deduce a choice by observing the world and then deducing which action is compatible with the world (an implicit model of the world is often found in the rule base).  Deduction, in this context, is the process of taking the inputs, making internal inferences efficiently, and applying the correct action to the world, so D, the decision function, is tantamount to the deduction operation.
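As a rough illustration of this special case, the sketch below uses beliefs restricted to 0 or 1 and an ordered rule base in place of expected utility.  The percept fields, thresholds, predicates, and actions are all hypothetical.

```python
def perceive(x):
    """Map a percept x to 0/1 beliefs, one per predicate (1 = true, 0 = false)."""
    return {
        "at_goal": 1 if x["distance_to_goal_m"] < 0.1 else 0,
        "obstacle_ahead": 1 if x["range_m"] < 1.0 else 0,
    }

# Rule base written by a designer with goals in mind: (condition, action) pairs,
# checked in order. The last rule is a default that always fires.
RULES = [
    (lambda b: b["at_goal"] == 1, "stop"),
    (lambda b: b["obstacle_ahead"] == 1, "turn"),
    (lambda b: True, "move_forward"),
]

def decide(beliefs):
    """D as deduction: return the action of the first rule whose condition holds."""
    for condition, action in RULES:
        if condition(beliefs):
            return action

print(decide(perceive({"distance_to_goal_m": 5.0, "range_m": 0.5})))
```

The designer's preferences never appear as numbers here; they are implicit in which rules exist and in what order they are checked.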
 

Example 1 revisited: Movement in a dynamic world
In our spacecraft, we have sensors that return inertial navigation information.  We translate this information into beliefs about the state of the world.  We then choose a path that minimizes our cost to reach the goal, subject to constraints that exist on spacecraft and space station dynamics.

Example 2 revisited: Estimating the true state of the world
In our estimation problem, we make an observation x about the world, translate this observation into a belief about each possible state, and then choose (for example) the estimate s_g that maximizes the resulting probability function.
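A small sketch of this estimator is given below, with an assumed two-state world and an assumed sensor model.  It also checks that, under a utility of 1 for a correct guess and 0 otherwise, the expected utility of a guess equals its probability, so picking the most probable state and maximizing expected utility agree.

```python
# Assumed two-state world and sensor model; all numbers are illustrative.
PRIOR = {"low": 0.5, "high": 0.5}
LIKELIHOOD = {"low":  {"beep": 0.1, "silent": 0.9},
              "high": {"beep": 0.7, "silent": 0.3}}

def posterior(prior, likelihood, x):
    """Belief after observing x, via Bayes' rule."""
    unnormalized = {s: likelihood[s][x] * p for s, p in prior.items()}
    total = sum(unnormalized.values())
    return {s: v / total for s, v in unnormalized.items()}

def most_probable_state(belief):
    """The estimate s_g that maximizes the probability function."""
    return max(belief, key=belief.get)

def expected_zero_one_utility(guess, belief):
    """Expected utility of a guess when utility is 1 if correct and 0 otherwise."""
    return sum(p * (1.0 if guess == s else 0.0) for s, p in belief.items())

belief = posterior(PRIOR, LIKELIHOOD, "beep")
s_g = most_probable_state(belief)
print(belief, s_g, expected_zero_one_utility(s_g, belief))
```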


Uncertainty

Uncertainty can enter the world in several places.