608 research outputs found
Game theory of mind
This paper introduces a model of ‘theory of mind’, namely, how we represent the intentions and goals of others to optimise our mutual interactions. We draw on ideas from optimum control and game theory to provide a ‘game theory of mind’. First, we consider the representations of goals in terms of value functions that are prescribed by utility or rewards. Critically, the joint value functions and ensuing behaviour are optimised recursively, under the assumption that I represent your value function, your representation of mine, your representation of my representation of yours, and so on ad infinitum. However, if we assume that the degree of recursion is bounded, then players need to estimate the opponent's degree of recursion (i.e., sophistication) to respond optimally. This induces a problem of inferring the opponent's sophistication, given behavioural exchanges. We show it is possible to deduce whether players make inferences about each other and quantify their sophistication on the basis of choices in sequential games. This rests on comparing generative models of choices with, and without, inference. Model comparison is demonstrated using simulated and real data from a ‘stag-hunt’. Finally, we note that exactly the same sophisticated behaviour can be achieved by optimising the utility function itself (through prosocial utility), producing unsophisticated but apparently altruistic agents. This may be relevant ethologically in hierarchal game theory and coevolution
On the origin of utility, weighting, and discounting functions: How they get their shapes and how to change their shapes
We present a theoretical account of the origin of the shapes of utility, probability weighting, and temporal discounting functions. In an experimental test of the theory, we systematically change the shape of revealed utility, weighting, and discounting functions by manipulating the distribution of monies, probabilities, and delays in the choices used to elicit them. The data demonstrate that there is no stable mapping between attribute values and their subjective equivalents. Expected and discounted utility theories, and also their descendants such as prospect theory and hyperbolic discounting theory, simply assert stable mappings to describe choice data and offer no account of the instability we find. We explain where the shape of the mapping comes from and, in describing the mechanism by which people choose, explain why the shape depends on the distribution of gains, losses, risks, and delays in the environment
Occasional errors can benefit coordination
The chances of solving a problem that involves coordination between people are increased by introducing robotic players that sometimes make mistakes. This finding has implications for real-world coordination problems
Are groups more rational than individuals? A review of interactive decision making in groups
Many decisions are interactive; the outcome of one party depends not only on its decisions or on acts of nature but also on the decisions of others. In the present article, we review the past 25 years of literature on decision making by groups. Researchers have compared the strategic behavior of groups and individuals in many games: prisoner's dilemma, dictator, ultimatum, trust, centipede and principal-agent games, among others. Our review suggests that results are quite consistent in revealing that groups behave closer to the game-theoretical assumption of rationality and selfishness than individuals. We conclude by discussing future research avenues in this area
Are individuals more risk and ambiguity averse in a group environment or alone? Results from an experimental study
Most decision-making research in economics focuses on individual decisions. Yet, we know, from psychological research in particular, that individual preferences can be sensitive to social pressures. In this paper, we study the impact of a group environment on individual preferences for risky (i.e., known probabilities) and ambiguous (i.e., unknown probabilities) prospects. In our experiment, each participant was invited to make a series of lottery-choice decisions in two different conditions. In the Alone condition, individuals made private choices, whereas in the Group condition, individuals belonged to a three-person group and group members' choices were aggregated according to either a majority or unanimity rule. This design allows us to study the impact of a group environment on individuals' attitude towards both risky and ambiguous prospects, while controlling for the decision rule used in the group. Our experimental results show that when individuals are in the Group condition, they tend to be less risk averse and more ambiguity averse than when they are not part of a group (Alone condition). Our experiment also suggests that the decision rule matters as it shows that these two trends tend to be stronger when the group implements a unanimity rule. Specifically, we found that individuals who belong to a group implementing a unanimity rule are significantly less risk averse than individuals who belong to a group that relies on the majority rule. We obtained a similar, but non-significant, result under ambiguity
Reinforcement learning or active inference?
This paper questions the need for reinforcement learning or control theory when optimising behaviour. We show that it is fairly simple to teach an agent complicated and adaptive behaviours using a free-energy formulation of perception. In this formulation, agents adjust their internal states and sampling of the environment to minimize their free-energy. Such agents learn causal structure in the environment and sample it in an adaptive and self-supervised fashion. This results in behavioural policies that reproduce those optimised by reinforcement learning and dynamic programming. Critically, we do not need to invoke the notion of reward, value or utility. We illustrate these points by solving a benchmark problem in dynamic programming; namely the mountain-car problem, using active perception or inference under the free-energy principle. The ensuing proof-of-concept may be important because the free-energy formulation furnishes a unified account of both action and perception and may speak to a reappraisal of the role of dopamine in the brain
Time preferences and risk aversion: tests on domain differences
The design and evaluation of environmental policy requires the incorporation of time and risk elements as many environmental outcomes extend over long time periods and involve a large degree of uncertainty. Understanding how individuals discount and evaluate risks with respect to environmental outcomes is a prime component in designing effective environmental policy to address issues of environmental sustainability, such as climate change. Our objective in this study is to investigate whether subjects' time preferences and risk aversion differ across the monetary and environmental domains. Crucially, our experimental design is incentivized: in the monetary domain, time preferences and risk aversion are elicited with real monetary payoffs, whereas in the environmental domain, we elicit time preferences and risk aversion using real (bee-friendly) plants. We find that subjects' time preferences are not significantly different across the monetary and environmental domains. In contrast, subjects' risk aversion is significantly different across the two domains. More specifically, subjects (men and women) exhibit a higher degree of risk aversion in the environmental domain relative to the monetary domain. Finally, we corroborate earlier results, which document that women are more risk averse than men in the monetary domain. We show that this finding also holds in the environmental domain
Cooperation and Contagion in Web-Based, Networked Public Goods Experiments
A longstanding idea in the literature on human cooperation is that cooperation should be reinforced when conditional cooperators are more likely to interact. In the context of social networks, this idea implies that cooperation should fare better in highly clustered networks such as cliques than in networks with low clustering such as random networks. To test this hypothesis, we conducted a series of web-based experiments, in which 24 individuals played a local public goods game arranged on one of five network topologies that varied between disconnected cliques and a random regular graph. In contrast with previous theoretical work, we found that network topology had no significant effect on average contributions. This result implies either that individuals are not conditional cooperators, or else that cooperation does not benefit from positive reinforcement between connected neighbors. We then tested both of these possibilities in two subsequent series of experiments in which artificial seed players were introduced, making either full or zero contributions. First, we found that although players did generally behave like conditional cooperators, they were as likely to decrease their contributions in response to low contributing neighbors as they were to increase their contributions in response to high contributing neighbors. Second, we found that positive effects of cooperation were contagious only to direct neighbors in the network. In total we report on 113 human subjects experiments, highlighting the speed, flexibility, and cost-effectiveness of web-based experiments over those conducted in physical labs
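As a rough illustration of the local public goods game described in this abstract, the sketch below computes one round of payoffs on a network. The abstract does not state the payoff rule, so the function names, the `endowment` and `multiplier` parameters, and the form of the rule (each player keeps what they do not contribute and receives a fixed share of the pot formed by themselves and their direct neighbours) are assumptions made for illustration only.

```python
def local_public_goods_payoffs(neighbors, contributions,
                               endowment=10.0, multiplier=0.4):
    """Return one round of payoffs for a local public goods game.

    neighbors: dict mapping each player to an iterable of direct neighbours
    contributions: dict mapping each player to the amount contributed
    endowment, multiplier: hypothetical parameters, not taken from the source
    """
    payoffs = {}
    for player, nbrs in neighbors.items():
        # The local group is the player plus their direct neighbours.
        group = set(nbrs) | {player}
        pot = sum(contributions[p] for p in group)
        # Keep what was not contributed, plus a share of the local pot.
        payoffs[player] = endowment - contributions[player] + multiplier * pot
    return payoffs


def conditional_cooperator_contribution(player, neighbors, last_contributions):
    """Assumed rule: match the neighbours' mean contribution from last round."""
    nbrs = list(neighbors[player])
    return sum(last_contributions[p] for p in nbrs) / len(nbrs)


# A three-player clique in which player "c" free-rides.
clique = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
round_one = {"a": 10.0, "b": 10.0, "c": 0.0}
payoffs = local_public_goods_payoffs(clique, round_one)
next_a = conditional_cooperator_contribution("a", clique, round_one)
```

Under these assumed numbers the free-rider earns more than the contributors, and a conditional cooperator next to a free-rider lowers their next contribution, which is the kind of downward adjustment the abstract reports.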
Belief formation in a signaling game without common prior: an experiment
Using belief elicitation, the paper investigates the process of belief formation and evolution in a signaling game in which a common prior is not induced. Both prior and posterior beliefs of Receivers about Senders' types are elicited, as well as beliefs of Senders about Receivers' strategies. In the experiment, subjects often start with diffuse uniform beliefs and update them in view of observations. However, the speed of updating is influenced by the strength of initial beliefs. An interesting result is that beliefs about the prior distribution of types are updated more slowly than posterior beliefs, which incorporate Senders' strategies. In the medium run, for some specifications of game parameters, this leads to outcomes being significantly different from the outcomes of the game in which a common prior is induced. It is also shown that elicitation of beliefs does not considerably change the pattern of play in this game
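To make the distinction between prior and posterior beliefs concrete, here is a minimal sketch of how a Receiver's posterior belief about the Sender's type would follow from a prior and a believed Sender strategy under Bayes' rule. This is a textbook benchmark, not the paper's elicitation procedure or the subjects' actual updating; the type labels, message labels, and probabilities are hypothetical.

```python
def posterior_over_types(prior, sender_strategy, message):
    """Bayes' rule: P(type | message) is proportional to P(message | type) * P(type).

    prior: dict mapping type -> prior probability
    sender_strategy: dict mapping type -> dict of message -> probability
    message: the message the Receiver observed
    """
    unnormalised = {
        t: prior[t] * sender_strategy[t].get(message, 0.0) for t in prior
    }
    total = sum(unnormalised.values())
    if total == 0:
        raise ValueError("message has zero probability under these beliefs")
    return {t: weight / total for t, weight in unnormalised.items()}


# Hypothetical example: a diffuse uniform prior over two Sender types.
prior = {"type_A": 0.5, "type_B": 0.5}
strategy = {
    "type_A": {"m1": 0.8, "m2": 0.2},
    "type_B": {"m1": 0.2, "m2": 0.8},
}
posterior = posterior_over_types(prior, strategy, "m1")
```

The posterior combines the belief about the type distribution with the belief about Senders' strategies, so it can move even when the belief about the prior distribution itself stays close to uniform.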
