Social Welfare: Fairness and Fairness Mechanisms
Last updated Aug 29, 2005.
In this section, we will try to answer the question of how we can
combine the preferences of a group of individuals into a choice that is
fair.
Part 1: Framework
Concept 1: Review of the multi-agent choice problem
Recall
how
we set up the multi-agent choice problem at the start of the
semester.
We defined a multi-agent choice problem as one where the consequences
experienced
by each agent are a function of the choices (and possibly the state of
nature)
of all the agents. If the game is fully cooperative and if
everybody
shares the same information, then this problem is just a single agent
problem
with a vector of choices. However, if goals differ or if agents
have
different information, then the problem gets interesting. This is
illustrated in the figure where two agents make choices, a consequence
results, and then each agent evaluates this consequence in terms of
its own
goals. Performing this evaluation for all possible consequences
allows
an agent to create an ordering over this set. For interesting
games,
this ordering differs between agents (either because of partial
information
or differing goals).
Concept 2: The Social Welfare problem: Generating consequences from
individual
choices
The social welfare problem is typically thought of in a canonical form
of the general problem outlined in the above figure. In this
canonization,
all consequences are represented by a set of choices. This
statement
probably doesn't make sense, but I'll use some examples to illustrate
what
I mean.
Example 1: When we talked about the prisoner's dilemma, we
discussed
how the payoff matrix was created. We did this by thinking about
how each player's actions would produce a consequence, and then we
looked
at how each player evaluated that consequence. For example, if
player
1 defects (talks to the police) and player 2 cooperates (remains
silent)
then the following consequence is produced: player 1 gets a very short
prison sentence and player 2 gets a very long prison sentence.
For
every action pair, we can identify what consequence is produced.
Each player then forms a utility function by assigning a numerical
value
to the consequence. If player 1 defects and player 2 cooperates,
then player 1 creates a utility function for the resulting consequence
by using the number of years he or she will spend in prison
as the numerical utility value: u1(D,C)=1, meaning player 1
will
spend one year in jail. Similarly, player 2 creates a utility
function
in the same manner, yielding u2(D,C)=4, meaning player 2 will
spend four years in jail.
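A minimal Python sketch of this construction may help. Only the (D,C) entry comes from the example above; the other three prison sentences are made-up values, and the names CONSEQUENCES, u1, and u2 are just labels for this illustration. As in the example, the numbers are years in prison, so smaller is better for the player.

```python
# Each action pair maps to a consequence: (years for player 1, years for player 2).
# Only the ("D", "C") entry is taken from the notes; the rest are assumed values.
CONSEQUENCES = {
    ("C", "C"): (2, 2),  # assumed
    ("C", "D"): (4, 1),  # assumed mirror image of ("D", "C")
    ("D", "C"): (1, 4),  # from the notes
    ("D", "D"): (3, 3),  # assumed
}

def u1(a1, a2):
    """Player 1's value: looks only at player 1's dimension of the consequence."""
    return CONSEQUENCES[(a1, a2)][0]

def u2(a1, a2):
    """Player 2's value: looks only at player 2's dimension of the consequence."""
    return CONSEQUENCES[(a1, a2)][1]

print(u1("D", "C"), u2("D", "C"))  # prints: 1 4
```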
In this example, notice how the consequence included a dimension for
what happened to player 1 and another dimension for what happened to
player
2. The utilities assigned by each player ignored one of these
dimensions
so that player 1 only paid attention to its own dimension and similarly
for
player 2. These consequence structures can get much more
complicated
than this. Consider the following example.
Example 2: In the repeated play prisoner's dilemma, the
consequences
of an action include the payoff for the current round and the set of
possible
future payoffs for subsequent rounds. For example, when an Always
Defect agent plays against a Tit for Tat agent, the choice to defect on
the first round not only yields the temptation payoff on that round, it
also guarantees that the Always Defect agent will get the punishment
payoff
on the next round. This is because the consequence of the Always
Defect agent's choice consisted not only in the immediate payoff, but also in a
change to the internal state of the Tit for Tat agent that caused this
agent to act differently in future rounds of play. Similarly, the
choice to play a Tit for Tat strategy has a consequence structure that
includes the payoff for the first round and for future possible
rounds.
This consequence structure also includes the likely survivability in a
society of agents meaning that the choice to be a Tit for Tat-er may
mean
that future societies are more compatible with this kind of play.
In summary, consequences can include different dimensions across agents
and different dimensions across time.
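To see the time dimension concretely, here is a small Python sketch of Always Defect playing Tit for Tat. The per-round payoff numbers (T, R, P, S) are assumed values where larger is better; they are not taken from these notes, and the function names are only for illustration.

```python
# Assumed per-round payoffs (larger is better): temptation, reward, punishment, sucker.
T, R, P, S = 5, 3, 1, 0
PAYOFF = {("D", "C"): (T, S), ("C", "D"): (S, T), ("C", "C"): (R, R), ("D", "D"): (P, P)}

def always_defect(opponent_history):
    return "D"

def tit_for_tat(opponent_history):
    # Cooperate first, then copy the opponent's previous move.
    return opponent_history[-1] if opponent_history else "C"

def play(strategy1, strategy2, rounds):
    """Return the list of per-round payoff pairs for two strategies."""
    moves1, moves2, payoffs = [], [], []
    for _ in range(rounds):
        m1 = strategy1(moves2)  # each strategy sees only the other's past moves
        m2 = strategy2(moves1)
        moves1.append(m1)
        moves2.append(m2)
        payoffs.append(PAYOFF[(m1, m2)])
    return payoffs

print(play(always_defect, tit_for_tat, 3))  # [(5, 0), (1, 1), (1, 1)]
```

The first pair shows the temptation payoff; every later pair is the punishment payoff, because the first defection changed what Tit for Tat does afterwards.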
Usually (in the stuff that I've read) social welfare of the type
addressed
by Arrow's impossibility theorem frames the social choice problem in
something
of a canonical form. In this canonical form, all consequences are
represented by an associated choice. Let me illustrate by an
example.
Example 3: Electing the President of the United States is a good
example of the canonization usually associated with social choice
theory.
Typically, there are between three and fifteen candidates for president
(with only two or three having any real chance of winning). The
choices
available to voters are precisely the candidates, but each candidate is
actually a representation of a potentially very complicated set of
consequences.
For example, in the 2000 presidential election, let's restrict
attention
to the two most popular choices, Al Gore and George Bush. A vote
for either candidate included the normal consequences on issues such as
abortion
rights, national security, labor, etc., but also included more subtle
consequences
such as whether Supreme Court appointees (and the next twenty years of
Supreme Court decisions) would have liberal leanings or conservative
leanings.
In summary, the social choice problem is typified by a set of
discrete
options available to each agent in the system. Each agent will
specify
a preference pattern or a utility function over this set of options,
and
then a social choice mechanism will be invoked that transforms these
patterns/functions
into a choice of one of the options.
Concept 3: Combining choices, types, and mechanisms
There are a zillion different ways of combining preferences and
utilities.
The conventional voting procedure used in presidential elections
translates
each voter's preference pattern into a single choice, and then counts
all
of the choices. This is not the only procedure that could be
used,
however. Elections could be decided instead by giving each voter
10 points and then letting them distribute these points among the
various
candidates; the candidate with the highest number of total points would
be the winner.
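The two procedures just described can be sketched in Python as follows. The three voters and their point allocations are hypothetical, ties are broken arbitrarily (an assumption, since the notes do not say how to break them), and the function names are my own.

```python
from collections import Counter

def plurality(rankings):
    """Reduce each voter's ranking to its top choice; most top choices wins."""
    tally = Counter(ranking[0] for ranking in rankings)
    return tally.most_common(1)[0][0]

def ten_point(allocations):
    """Each voter spreads 10 points over the candidates; most total points wins."""
    totals = Counter()
    for allocation in allocations:
        assert sum(allocation.values()) == 10, "each voter has exactly 10 points"
        totals.update(allocation)  # adds this voter's points to the running totals
    return totals.most_common(1)[0][0]

# Hypothetical electorate: two mild Bush supporters and one strong Gore supporter.
rankings = [["Bush", "Gore"], ["Bush", "Gore"], ["Gore", "Bush"]]
allocations = [{"Bush": 6, "Gore": 4}, {"Bush": 6, "Gore": 4}, {"Bush": 0, "Gore": 10}]

print(plurality(rankings))     # Bush
print(ten_point(allocations))  # Gore
```

With these made-up voters the two mechanisms pick different winners, which is the point: the choice of mechanism matters.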
The process of designing a system whereby agents can express their
preferences/utilities
and which then transforms what the agents express into a choice is
called mechanism design.
In other words, somebody has to design a mechanism by which
individuals'
preferences are translated into a choice among the options.
Arrow's
impossibility theorem essentially states that there is no fair way to
combine
the preference patterns of each agent into a social choice. In
other
words, there is no fair social choice mechanism. We will prove
this
by formalizing what we mean by fair and then showing that all
of
the dimensions of fairness cannot be simultaneously satisfied.
It is interesting to note that since no social choice mechanism is
fair,
in the sense of Arrow, the process of mechanism design is essentially
equivalent
to "picking a poison" --- one dimension of inequity must be admitted,
and
the decision of the mechanism designer is to choose the least
unpalatable
dimension. This difficulty is compounded by the fact that once
the
mechanism is chosen, agents may not tell the truth about how they
really
feel if they think lying will help their real preference pattern be
selected
by the mechanism. For example, consider the 10 point voting
system
that I described above. Suppose that a voter feels that Bush
deserves
six points and Gore deserves four, but this voter also believes that,
on
the average, other voters will give Bush and Gore five points
each.
This voter might want the choice to reflect his or her preference
pattern
so he or she may lie and give Bush all ten points. This doesn't
reflect
his or her true preferences, but in order to manipulate the mechanism
the
voter may lie. A type is what the voter chooses to give
as
input to the social choice mechanism; sometimes a type is the voter's
true
preference pattern and sometimes a type is some distortion of
this.
When we talk about different social choice functions, we will revisit
the
idea of a type again.
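Here is a rough Python sketch of the lying voter in the 10 point system. The two other voters are hypothetical and lean slightly toward Gore (an assumption I make so that the lie actually changes the outcome); the names winner, true_preference, and exaggerated_type are only for illustration.

```python
def winner(reports):
    """Sum each candidate's points over all reported types; return (winner, totals)."""
    totals = {}
    for report in reports:
        for candidate, points in report.items():
            totals[candidate] = totals.get(candidate, 0) + points
    return max(totals, key=totals.get), totals

others = [{"Bush": 5, "Gore": 5}, {"Bush": 3, "Gore": 7}]  # hypothetical other voters
true_preference = {"Bush": 6, "Gore": 4}    # what the voter really feels
exaggerated_type = {"Bush": 10, "Gore": 0}  # what the voter reports instead

honest = winner(others + [true_preference])
strategic = winner(others + [exaggerated_type])
print(honest)     # ('Gore', {'Bush': 14, 'Gore': 16})
print(strategic)  # ('Bush', {'Bush': 18, 'Gore': 12})
```

The mechanism only ever sees the reported type, so it cannot tell the exaggerated report from a sincere one.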
Part 2: Preferences and Utilities
Since we already talked about this in a previous lecture, you can skip
this if you want. I've chosen to keep it in the notes because (a)
the
notes evolve from semester to semester and so historical impetus keeps
them here and (b) these concepts are important to review at this time
in the semester. If you haven't mastered this material, please
take the time to do so now.
Concept 1: Preferences and preference patterns
Consider a set of choices {c1, c2, . . . , cn}.
(Note how I use c's to represent choices and c's to represent
consequences.
This is OK given that we are using the canonization described above
wherein
the choices encode consequences. In other words, since the
choices
are representations of consequences, using the same notation is no
problem.)
A preference pattern is a relation between pairs of choices such that
every
choice can be compared to every other. (I
will
use the greater than symbol to represent the notion of preferred to,
but note that this is not typically done. I'm stuck with >
because
I cannot figure out how to get netscape to give me the real symbol that
I want. I'll show you this symbol in class and I want you to use
it on exams and in our discussions.) A preference pattern is a
total
ordering that is transitive. This means that the following axioms
hold:
-
For all ci, cj, either
ci>cj, ci<cj, or ci~cj.
In other words, for all choices I prefer one to the other (>) or I'm
indifferent
(~) between them. The greater than or equal to symbol (written
>~ herein)
is read as at least as preferred as so that writing ci>~cj
is read ci is at least as preferred as cj.
This property is known as total ordering.
-
ci>cj and cj>ck
implies ci>ck. This is known as transitivity.
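Both axioms can be checked mechanically on a small set of choices. The sketch below assumes the preference pattern is supplied as a Python function at_least(a, b) that returns True when a >~ b; that representation and the function names are my own choices, not anything standard.

```python
from itertools import product

def strictly(at_least, a, b):
    """a > b means a >~ b holds but b >~ a does not."""
    return at_least(a, b) and not at_least(b, a)

def is_total(choices, at_least):
    """Total ordering: every pair is comparable in at least one direction."""
    return all(at_least(a, b) or at_least(b, a)
               for a, b in product(choices, repeat=2))

def is_transitive(choices, at_least):
    """Transitivity: a > b and b > c together imply a > c."""
    return all(strictly(at_least, a, c)
               for a, b, c in product(choices, repeat=3)
               if strictly(at_least, a, b) and strictly(at_least, b, c))

choices = ["A", "B", "C"]
rank = {"A": 0, "B": 1, "C": 2}  # a ranking: A > B > C
ranked = lambda a, b: rank[a] <= rank[b]

beats = {("A", "B"), ("B", "C"), ("C", "A")}  # a cycle: A > B > C > A
cyclic = lambda a, b: a == b or (a, b) in beats

print(is_total(choices, ranked), is_transitive(choices, ranked))  # True True
print(is_total(choices, cyclic), is_transitive(choices, cyclic))  # True False
```

The cyclic relation compares every pair but is not transitive, so it is not a preference pattern in the sense used here.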
Concept 2: Utilities and the principle of maximum expected utility
What we want to do is take a total preference ordering over a set of
consequences
and turn it into a numerical function. The way that we will do
this
is by realizing that preferences only tell us that we like one choice
more
than another. In reality, we may have strengths of preference
meaning that we like choice a much more than choice b, and choice b
just
a little more than choice c. There is a set of axioms that allows
us to encode how strong our preferences are in an almost unique utility
function. The assumption behind turning these axioms into a
utility
function is that we will maximize this utility function. More
explicitly,
we will maximize the expected value of this utility function. The
key to finding such a utility function is using lotteries.
Concept 3: Lotteries and preferences over lotteries
A lottery is simply a probabilistic outcome. Most concretely,
when
you buy a lottery ticket, you are buying a probabilistic outcome ---
with
a small probability you can win a huge prize, and with a large
probability
you win nothing. Formally, we will adopt Russell and Norvig's
notation
in Artificial Intelligence: A Modern Approach and write a
lottery
as follows: [p,A; 1-p,B]. This notation is read as "With
probability
p I get outcome A and with probability 1-p I get outcome B."
Concretely
put, [p,A; 1-p,B] is a lottery ticket with a probability p of winning A
(the jackpot) and probability 1-p of winning B (nothing).
Constructing
utility functions is based on finding probabilities such that the human
is indifferent between a sure outcome and a lottery ticket. We
will
now formalize this notion using the axioms explained in Russell and
Norvig.
I will quote them directly in what follows, except that I will use
>, >~,
and ~ instead of the correct shape for preference.
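As a concrete handle on the notation, a lottery [p,A; 1-p,B] can be stored in Python as a list of (probability, outcome) pairs. This is only a sketch: the utility numbers below are hypothetical, and expected_utility is my own helper name.

```python
def expected_utility(lottery, u):
    """lottery is a list of (probability, outcome) pairs, as in [p,A; 1-p,B]."""
    total_probability = sum(p for p, _ in lottery)
    assert abs(total_probability - 1.0) < 1e-9, "probabilities must sum to 1"
    return sum(p * u[outcome] for p, outcome in lottery)

u = {"A": 1.0, "B": 0.0}             # hypothetical utilities: jackpot and nothing
ticket = [(0.25, "A"), (0.75, "B")]  # the lottery [0.25,A; 0.75,B]
print(expected_utility(ticket, u))   # 0.25
```

An agent who is indifferent between this ticket and some sure outcome would assign that sure outcome the same number, 0.25, on this scale.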
Concept 4: The axioms of utility theory
- Total ordering. See concept 1 above.
-
Transitivity. See concept 1 above.
-
Continuity. If some state B is between A and C in
preference, then
there is some probability p for which the rational agent will be
indifferent
between getting B for sure and the lottery that yields A with
probability
p and C with probability 1-p.
A>B>C ==> there exists a p
in (0,1) such that [p,A; 1-p, C] ~ B
-
Substitutability: If an agent is indifferent between two
lotteries,
A and B, then the agent is indifferent between two more complex
lotteries
that are the same except that B is substituted for A in one of
them.
This holds regardless of the probabilities and the other outcome(s) in
the lotteries.
A~B ==> [p,A; 1-p,C] ~ [p,B; 1-p,C]
-
Monotonicity: Suppose that there are two lotteries
that have
the same two outcomes, A and B. If an agent prefers A to B, then
the agent must prefer the lottery that has a higher probability for A
(and
vice versa).
A>B ==> (p>=q <==> [p,A; 1-p,B] >~ [q,A; 1-q,B])
This is kind of a funny one, so I'm going to comment on it
here.
The form of this statement is that the left side (A is more preferred
than
B) implies an equivalence. This equivalence states that, for any
probabilities p and q, p being no less than q is equivalent to
saying that the lottery with odds equal to p is at least as preferred
as
the lottery with odds equal to q.
-
Decomposability: Compound lotteries can be reduced to
simpler ones
using the laws of probability. This has been called the "no fun
in
gambling" rule because it says that an agent should not prefer
one lottery just because it has more choice points than another.
[p,A; 1-p,[q,B; 1-q,C]] ~ [p,A; (1-p)q,B; (1-p)(1-q),C]
Again, this is kind of funny. The left side of the
equivalence
is a lottery consisting of an outcome A and another outcome which is a
lottery [q,B; 1-q,C]. This is a bit like buying a lottery ticket
where with probability p you win $5, and with probability 1-p you get
another
lottery ticket for different outcomes with different odds. The
right
side of the equation is a compound lottery ticket with odds p of
winning
A, odds (1-p)q of winning B, and odds (1-p)(1-q) of winning C.
At this point, we are no longer quoting from Russell and Norvig.
In the next concept, we will show how to use these axioms to construct
a utility function.
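Before moving on, the decomposability axiom is easy to illustrate in code. The sketch below assumes lotteries are Python lists of (probability, outcome) pairs and that a nested list stands for a lottery ticket won as a prize; flatten is a hypothetical helper name.

```python
def flatten(lottery):
    """Reduce a compound lottery to a simple one using the laws of probability.

    A lottery is a list of (probability, outcome) pairs; an outcome that is
    itself a list is treated as a nested lottery.
    """
    simple = {}
    for p, outcome in lottery:
        if isinstance(outcome, list):
            # Multiply through: P(inner outcome) = p * q.
            for q, inner_outcome in flatten(outcome):
                simple[inner_outcome] = simple.get(inner_outcome, 0.0) + p * q
        else:
            simple[outcome] = simple.get(outcome, 0.0) + p
    return [(prob, outcome) for outcome, prob in simple.items()]

p, q = 0.5, 0.25
compound = [(p, "A"), (1 - p, [(q, "B"), (1 - q, "C")])]  # [p,A; 1-p,[q,B; 1-q,C]]
print(flatten(compound))  # [(0.5, 'A'), (0.125, 'B'), (0.375, 'C')]
```

The result matches [p,A; (1-p)q,B; (1-p)(1-q),C], and the axiom says the agent must be indifferent between the two forms.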
Concept 5: Building a utility from preferences over lotteries
Given these axioms, can we build a utility function? The answer
is,
of course, yes. In fact, there is a theorem that says we
can.
The following is a statement of this theorem. It is partially
taken
from Russell and Norvig, and partially stated in my own terms.
Theorem. If an agent's preferences obey the axioms
of utility, then there exists a real-valued function u such that
u(A)>u(B) <==> A>B,
u(A)=u(B) <==> A~B.
How would we prove this? By constructing such a function.
Assuming
that we have a discrete set of choices (like in the canonization of the
social welfare problem), here's how.
-
Find the most preferred and least preferred outcome. We will call
these A and Z, respectively. We know that a most preferred and
least
preferred outcome exist because the set of outcomes is finite and totally ordered.
-
Make up a number u(A) and u(Z) that will represent the utility of the
most
preferred and least preferred outcome. If we are indifferent
between
A and Z (i.e., A~Z) then set u(A)=u(Z). Otherwise, we can choose
any number we want so long as u(A)>u(Z).
-
Choose any other option. Let's call it B. If B~A then set
u(B)=u(A).
If B~Z then set u(B)=u(Z). Otherwise, use continuity to find the
value of p such that [p,A; 1-p,Z]~B, and then set u(B)=pu(A)+(1-p)u(Z).
-
Repeat for all options.
The third step is really the key one because it is where we transform
our
preference A>B>Z into a probability that we can use to create the
utility.
You should probably know that scientists have used this technique for
figuring
out what people's utilities are, but that the resulting utilities don't
match what people actually choose; this means that utilities are nice
for
doing designs, but they probably don't (in their formal form) represent
how people make decisions.
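The four construction steps above can be sketched in Python. I assume the indifference probabilities have already been elicited from the agent and are handed in as a dictionary; the default endpoint values 1 and 0 and the example probabilities are arbitrary choices of mine.

```python
def build_utility(best, worst, indifference_p, u_best=1.0, u_worst=0.0):
    """Construct u from the probabilities p at which [p,best; 1-p,worst] ~ option.

    indifference_p maps every other option to its elicited probability p.
    An option tied with best gets p = 1; an option tied with worst gets p = 0.
    """
    assert u_best > u_worst, "step 2 requires u(A) > u(Z) unless A ~ Z"
    u = {best: u_best, worst: u_worst}
    for option, p in indifference_p.items():
        u[option] = p * u_best + (1 - p) * u_worst  # step 3
    return u

# Hypothetical elicited probabilities for two intermediate options.
u = build_utility("A", "Z", {"B": 0.75, "C": 0.5})
print(u)  # {'A': 1.0, 'Z': 0.0, 'B': 0.75, 'C': 0.5}
```

Picking different endpoint values, say u_best=10 and u_worst=2, gives different numbers but the same ordering of the options.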
Can you prove the theorem now? You should probably try since
I've
asked other students to do this proof on an exam.
Concept 6: Uniqueness: positive affine transformations
The philosophy behind these axioms, the theorem, and the algorithm is
that
the utility will be maximized in some way: Which outcome
should
I choose? The one that maximizes utility. What if the
outcome
is actually probabilistic? Then I maximize expected utility.
Thus, constructing a utility function assumes that this utility
function
will be used to make a choice as follows:
c* = arg max_c u(c),
a* = arg max_a Sum_s p(s) u(a,s) = arg max_a Sum_s p(s) u(c(a,s))
The first equation says that the choice I make is the one that has
highest
utility. The second equation says that the optimal action to
choose,
a*, is the one that, on average across a set of
probabilistic
states, produces consequences with the highest utility. I realize that
this
equation is a bit difficult to discern, but I believe it is worth your
time to really figure out what it means. The key is that the
outcomes
are consequences, and that these consequences are functions of actions
and states.
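If the second equation is hard to read, it may help to see it as code. The umbrella example, its probabilities, and its utilities below are all made up for illustration; best_action and consequence are hypothetical names.

```python
def best_action(actions, state_probs, consequence, u):
    """Return arg max over a of the sum over s of p(s) * u(c(a, s))."""
    def expected_utility(a):
        return sum(p * u[consequence(a, s)] for s, p in state_probs.items())
    return max(actions, key=expected_utility)

def consequence(action, state):
    """c(a, s): the consequence depends on both the action and the state."""
    if action == "umbrella":
        return "dry but encumbered"
    return "soaked" if state == "rain" else "dry and free"

u = {"dry and free": 1.0, "dry but encumbered": 0.5, "soaked": 0.0}
actions = ["umbrella", "none"]

print(best_action(actions, {"rain": 0.25, "sun": 0.75}, consequence, u))  # none
print(best_action(actions, {"rain": 0.75, "sun": 0.25}, consequence, u))  # umbrella
```

The utilities stay fixed in both calls; only p(s) changes, and that alone is enough to change the optimal action.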
Shifting focus a bit, look at the algorithm we used to find the
utility
function. In step 2, we just made up numbers for u(A) and
u(Z).
Since we are only going to take the argument that maximizes the
resulting
function, it doesn't really matter what the actual values of u() are as
long as they obey the axioms of preference. This does, however,
mean
that the utility function is not unique. In terms of finding the
argument that maximizes the utility (or expected utility), if u()
satisfies
the axioms then so does au+b
where a is greater than zero and where b
can
be anything. Another way of saying this is that the utility
function
is unique only up to a positive affine transformation. The
transformation
is positive since a is greater than zero
and
it is affine since b can be
anything.
Let's check and see if we get the same maximal solution for u and au+b.
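c* = arg max_c u(c) = arg max_c [a u(c)] = arg max_c [a u(c) + b].
The first equality holds because multiplying by a>0 preserves the
ordering of the values, and the second because adding the constant b
shifts every value by the same amount.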
Thus, we get the same solution even if the utility is unique only up to
a positive affine transformation. Unfortunately, this
nonuniqueness
causes problems when we get into the world of multi-agent choice.
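A quick numerical check of the single-agent claim, as a Python sketch. The utility values and the particular a and b are arbitrary numbers I picked; any a greater than zero and any b should behave the same way.

```python
def argmax(choices, u):
    """Return the choice c that maximizes u(c)."""
    return max(choices, key=u)

u = {"A": 0.9, "B": 0.4, "C": 0.1}  # hypothetical utilities
a, b = 3.0, -7.0                    # any a > 0 and any b

original = argmax(u, u.get)
transformed = argmax(u, lambda c: a * u[c] + b)
flipped = argmax(u, lambda c: -1.0 * u[c] + b)  # a < 0 is NOT allowed

print(original, transformed, flipped)  # A A C
```

The last case shows why the transformation has to be positive: a negative a reverses the ordering and picks the least preferred choice.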
Part 3: Social Welfare Concepts and Problems
Concept 1: Combining preference patterns
Suppose that you are given the job of finding a mechanism that takes
the
preference patterns of multiple agents and combines them fairly.
Let {P1,P2, . . ., Pn} denote the
preference
patterns for agents 1, 2, . . . n. Your job is to come up with a
way to combine these preference patterns into a preference pattern for
society. Let society's preference pattern be denoted P (with no
subscript),
and let Pset
denote the set of all possible preference
patterns for a given set of choices. In other words, your job is
to find a function that does the following:
f: Pset1 x Pset2 x . . . x Psetn --> Pset,
P = f(P1,P2, . . ., Pn),
where Psetk denotes agent k's copy of Pset.
It is helpful to give some examples of just what I mean here.
Suppose that we have a problem with two agents and three choices.
Denote these choices A, B, and C. Assuming, for simplicity, that
the agents will not be indifferent between any two options, there are
several types of preference patterns that an agent can have.
These possible preference patterns include A>B>C, A>C>B,
and so on. Thus,
Pset
= { A>B>C, A>C>B, B>A>C, B>C>A, C>A>B,
C>B>A }.
No matter which pair of preference patterns is held by the two agents,
the function f must be able
to take these and return a preference pattern for society.
What is the domain of f?
For the two agent, three choice problem the domain is Pset x Pset (the cross product of
the two sets). What is the range of f? It is Pset.
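For the two agent, three choice problem, the sets involved are small enough to enumerate. In the Python sketch below, a strict preference pattern is stored as a tuple from most to least preferred (my own representation), and f_agent1 is a deliberately unfair placeholder for f, included only to show the shape of the function.

```python
from itertools import permutations, product

choices = ["A", "B", "C"]

# Each strict preference pattern lists the choices from most to least preferred.
pset = list(permutations(choices))

# The domain of f for two agents: Pset x Pset.
domain = list(product(pset, repeat=2))

def f_agent1(p1, p2):
    """A placeholder for f: society simply adopts agent 1's pattern."""
    return p1

print(len(pset), len(domain))  # 6 36
print(f_agent1(("A", "B", "C"), ("C", "B", "A")))  # ('A', 'B', 'C')
```

Any candidate f has to return an element of Pset for each of the 36 input pairs; the hard part is doing so fairly.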
Concept 2: Arrow's impossibility theorem: a foreshadowing
Arrow's impossibility theorem states that there is NO function
f that "fairly"
combines these preferences. This means that if we are going to
use
preferences as the things that we use to make social choices, we are
going
to have to compromise a bit. We will talk about some different
kinds
of compromises.
Concept 3: Strengths of preference: overcoming the impossible
If we are willing to use something besides preferences, we may be able
to circumvent Arrow's impossibility. One way to do this is to use
the information about strengths of preference encoded as part of a
utility
function. This allows us to let those who feel most strongly
about
an issue have a greater say in the social outcome. For example,
we
can make a social choice by summing up all of the individual agents'
utilities
to create a social welfare utility
u(c) = Sum_k uk(c).
This social welfare utility now encodes the preferences of the agents.
However...
Concept 4: Interpersonal comparison of utility: uniqueness and
types revisited
Utilities are unique only up to a positive affine transformation.
This means that I can manipulate the system to mimic my preferences by
choosing a really big a that causes my
utilities
to dwarf the sum total of everybody else's utilities. In other
words,
if everybody else's utilities are in the range from 0 to 10, and if I
know
that there are only 10 agents, then an a=100,000
means
u(c) = Sum_k uk(c) = ui(c) + Sum_{k != i} uk(c)
u'(c) = a ui(c) + Sum_{k != i} uk(c) = 100,000 ui(c) + Sum_{k != i} uk(c)
which approximately equals 100,000 ui(c).
If I am player i, then I can manipulate the system by choosing a
representation
of my utilities which is out of scale with everybody else's. This
problem, caused by the nonuniqueness of utilities, is called the
problem
of interpersonal comparison of utilities. In essence, it
is
a problem of finding a common currency for expressing utilities that
makes
my utilities comparable to yours. You will get to play around in
a lab where you use different social choice functions that are based on
preferences and on utilities.
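Here is a small Python sketch of the scaling trick, under made-up numbers: nine hypothetical agents who prefer X, one agent i who prefers Y, and everybody honestly on a 0 to 10 scale. The function name social_choice is my own.

```python
def social_choice(utilities):
    """Pick the choice c that maximizes the sum over agents k of u_k(c)."""
    choices = utilities[0].keys()
    return max(choices, key=lambda c: sum(u_k[c] for u_k in utilities))

# Nine hypothetical agents prefer X; agent i prefers Y.
others = [{"X": 10, "Y": 0}] * 9
mine = {"X": 0, "Y": 10}

# Rescale my utilities by a > 0. This represents exactly the same preferences.
a = 100_000
rescaled = {c: a * value for c, value in mine.items()}

print(social_choice(others + [mine]))      # X
print(social_choice(others + [rescaled]))  # Y
```

Both of my reports are legitimate utility functions for the same preferences, yet the summed social welfare utility flips from X to Y when I change scale.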