459 research outputs found
Dopamine signals for reward value and risk: basic and recent data.
BACKGROUND: Previous lesion, electrical self-stimulation and drug addiction studies suggest that the midbrain dopamine systems are part of the brain's reward system. This review provides an updated overview of the basic signals of dopamine neurons to environmental stimuli. METHODS: The described experiments used standard behavioral and neurophysiological methods to record the activity of single dopamine neurons in awake monkeys during specific behavioral tasks. RESULTS: Dopamine neurons show phasic activations to external stimuli. The signal reflects reward, physical salience, risk and punishment, in descending order of the fractions of responding neurons. Expected reward value is a key decision variable for economic choices. The reward response codes reward value, probability and their summed product, expected value. The neurons code reward value as it differs from prediction, thus fulfilling the basic requirement for a bidirectional prediction-error teaching signal postulated by learning theory. This response is scaled in units of standard deviation. By contrast, relatively few dopamine neurons show phasic activations following punishers and conditioned aversive stimuli, suggesting that the reward response is unrelated to general attention and arousal. Large proportions of dopamine neurons are also activated by intense, physically salient stimuli. This response is enhanced when the stimuli are novel; it appears to be distinct from the reward value signal. Dopamine neurons also show nonspecific activations to non-rewarding stimuli, possibly due to generalization from similar stimuli and pseudoconditioning by primary rewards. These activations are shorter than reward responses and are often followed by a depression of activity. A separate, slower dopamine signal informs about risk, another important decision variable. The prediction error response occurs only with reward; it is scaled by the risk of the predicted reward. CONCLUSIONS: Neurophysiological studies reveal phasic dopamine signals that transmit information related predominantly, but not exclusively, to reward. Although not entirely homogeneous, the dopamine signal is more restricted and stereotyped than neuronal activity in most other brain structures involved in goal-directed behavior.
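The prediction-error and risk-scaling account summarized above can be written down compactly. The following sketch is illustrative only (the gamble and parameter values are hypothetical, not taken from the recordings): a bidirectional reward prediction error, optionally rescaled by the standard deviation of the predicted reward distribution.

    def prediction_error(reward, expected_value, reward_sd=None):
        """Bidirectional reward prediction error; dividing by the SD expresses it in risk units."""
        delta = reward - expected_value
        if reward_sd:
            delta /= reward_sd
        return delta

    # Example: a 50/50 gamble between 0.0 ml and 1.0 ml of reward.
    p_win, win, lose = 0.5, 1.0, 0.0
    ev = p_win * win + (1 - p_win) * lose                                   # expected value = 0.5
    sd = (p_win * (win - ev) ** 2 + (1 - p_win) * (lose - ev) ** 2) ** 0.5  # risk (SD) = 0.5
    print(prediction_error(win, ev, sd))    # positive surprise, scaled in SD units -> 1.0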
Rewarding properties of visual stimuli
The behavioral functions of rewards comprise the induction of learning and approach behavior. Rewards are not only related to the vegetative states of hunger, thirst and reproduction but may also consist of visual stimuli. The present experiment tested the reward potential of different types of still and moving pictures in three operant tasks involving key presses, touches of a computer monitor and choice behavior in a laboratory environment. We found that all tested visual stimuli induced approach behavior in all three tasks, and that action movies sustained consistently higher rates of responding than changing still pictures, which in turn were more effective than constant still pictures. These results demonstrate that visual stimuli can serve as positive reinforcers for operant reactions of animals in controlled laboratory settings. In particular, the coherently animated visual stimuli of movies have considerable reward potential. These observations suggest that similar forms of visual reward could be used for neurophysiological investigations of mechanisms related to non-vegetative reward.
Dopamine responses comply with basic assumptions of formal learning theory
According to contemporary learning theories, the discrepancy, or error, between the actual and the predicted reward determines whether learning occurs when a stimulus is paired with a reward. The role of prediction errors is directly demonstrated by the observation that learning is blocked when the stimulus is paired with a fully predicted reward. Using this blocking procedure, we show that the responses of dopamine neurons to conditioned stimuli were governed by the occurrence of reward prediction errors rather than by stimulus-reward associations alone, as was the learning of behavioural reactions. Both behavioural and neuronal learning occurred predominantly when dopamine neurons registered a reward prediction error at the time of the reward. Our data indicate that analytical tests derived from formal behavioural learning theory provide a powerful approach for studying the role of single neurons in learning.
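The blocking logic can be illustrated with a standard Rescorla-Wagner simulation; this is a textbook sketch under generic assumptions (learning rate and trial counts are arbitrary), not the analysis code of the study.

    # Phase 1: stimulus A alone is paired with reward and acquires the prediction.
    # Phase 2: the compound AX is paired with the same, now fully predicted reward,
    # so the prediction error is ~0 and the added stimulus X learns almost nothing.
    alpha, reward = 0.2, 1.0
    V = {"A": 0.0, "X": 0.0}

    for _ in range(50):                       # Phase 1: A -> reward
        V["A"] += alpha * (reward - V["A"])

    for _ in range(50):                       # Phase 2: AX -> reward (blocking)
        delta = reward - (V["A"] + V["X"])    # prediction error at reward time
        V["A"] += alpha * delta
        V["X"] += alpha * delta

    print(V)    # V["A"] close to 1.0, V["X"] close to 0.0: learning about X is "blocked"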
The Human Brain Encodes Event Frequencies While Forming Subjective Beliefs
To make adaptive choices, humans need to estimate the probability of future events. In a Bayesian approach, probabilities are inferred by combining a priori, potentially subjective, knowledge with factual observations, but the precise neurobiological mechanism remains unknown. Here, we study whether neural encoding centers on subjective posterior probabilities, with data merely leading to updates of posteriors, or whether objective data are encoded separately alongside subjective knowledge. During fMRI, young adults acquired prior knowledge regarding uncertain events, repeatedly observed evidence in the form of stimuli, and estimated event probabilities. Participants combined prior knowledge with factual evidence according to Bayesian principles. Expected reward inferred from prior knowledge was encoded in the striatum. The BOLD response in specific nodes of the default mode network (angular gyri, posterior cingulate, and medial prefrontal cortex) encoded the actual frequency of stimuli, unaffected by prior knowledge. In this network, activity increased with frequency and thus reflected the accumulation of evidence. In contrast, Bayesian posterior probabilities, computed from prior knowledge and stimulus frequencies, were encoded in the bilateral inferior frontal gyrus, where activity increased for improbable events and thus signaled the violation of Bayesian predictions. Thus, subjective beliefs and stimulus frequencies were encoded in separate cortical regions. The advantage of such a separation is that objective evidence can be recombined with newly acquired knowledge when a reinterpretation of the evidence is called for. Overall, this study reveals the coexistence in the brain of an experience-based and a knowledge-based system of inference.
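A schematic Beta-Bernoulli example (the prior and the observation sequence are hypothetical, not the study's stimuli) shows the two quantities being dissociated: the raw stimulus frequency and the Bayesian posterior that combines it with prior knowledge.

    # Prior knowledge encoded as a Beta(8, 2) distribution: the event is believed to be likely.
    prior_a, prior_b = 8, 2
    observations = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0]      # factual evidence (1 = event occurred)

    k, n = sum(observations), len(observations)
    frequency = k / n                                           # objective event frequency: 0.3
    posterior_mean = (prior_a + k) / (prior_a + prior_b + n)    # subjective belief: 0.55

    print(frequency, posterior_mean)   # evidence and belief diverge when the prior is strong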
Dissociating the Role of the Orbitofrontal Cortex and the Striatum in the Computation of Goal Values and Prediction Errors
To make sound economic decisions, the brain needs to compute several different value-related signals. These include goal values that measure the predicted reward associated with the outcome generated by each action under consideration, decision values that measure the net value of taking each action, and prediction errors that measure deviations from individuals' previous reward expectations. We used functional magnetic resonance imaging and a novel decision-making paradigm to dissociate the neural bases of these three computations. Our results show that they are supported by different neural substrates: goal values are correlated with activity in the medial orbitofrontal cortex, decision values with activity in the central orbitofrontal cortex, and prediction errors with activity in the ventral striatum.
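The three computations can be made concrete with a small numerical sketch; the quantities and numbers below are hypothetical, and the cost-subtraction reading of "net value" is an assumption chosen only to show how the signals differ.

    # Goal value: predicted reward from the outcome of a candidate action.
    goal_value = 0.8 * 10.0                       # e.g. an 80% chance of a 10-unit outcome

    # Decision value: net value of taking the action, here goal value minus an action cost.
    action_cost = 2.0                             # e.g. effort or price attached to the action
    decision_value = goal_value - action_cost

    # Prediction error: deviation of the realised outcome from the prior expectation.
    obtained = 10.0
    prediction_error = obtained - goal_value

    print(goal_value, decision_value, prediction_error)   # 8.0  6.0  2.0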
Herding and Social Pressure in Trading Tasks: A Behavioural Analysis
We extend the experimental literature on Bayesian herding using evidence from a financial decision-making experiment. We identify significant propensities to herd that increase with the degree of herd consensus. We test various herding models to capture the differential impacts of Bayesian-style thinking versus behavioural factors, and we find statistically significant associations between herding and individual characteristics such as age and personality traits. Overall, our evidence is consistent with explanations of herding as the outcome of social and behavioural factors. Suggestions for further research are outlined, including verifying these findings and identifying the neurological correlates of the propensity to herd.
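For illustration only (this is a generic Bayesian-herding toy, not one of the models estimated in the paper), even a fully Bayesian trader's propensity to follow the herd rises with the degree of consensus when each observed choice is treated as an independent, equally accurate signal.

    import math

    def follow_probability(herd_size, signal_accuracy=0.6):
        """Posterior probability that the herd is right when `herd_size` traders all made
        the same choice and each choice counts as an independent signal of fixed accuracy."""
        log_odds = herd_size * math.log(signal_accuracy / (1 - signal_accuracy))
        return 1 / (1 + math.exp(-log_odds))

    for n in (1, 2, 4, 8):
        print(n, round(follow_probability(n), 3))   # 0.6, 0.692, 0.835, 0.962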
Neuronal Distortions of Reward Probability without Choice
Reward probability crucially determines the value of outcomes. A basic phenomenon, defying explanation by traditional decision theories, is that people often overweight small and underweight large probabilities in choices under uncertainty. However, the neuronal basis of such reward probability distortions and their position in the decision process are largely unknown. We assessed individual probability distortions with behavioral pleasantness ratings and brain imaging in the absence of choice. Dorsolateral frontal cortex regions showed experience-dependent overweighting of small, and underweighting of large, probabilities, whereas ventral frontal regions showed the opposite pattern. These results demonstrate distorted neuronal coding of reward probabilities in the absence of choice, stress the importance of experience with probabilistic outcomes, and contrast with linear probability coding in the striatum. Input of these distorted probability estimates into decision-making mechanisms is likely to contribute to the well-known inconsistencies in preferences formalized in theories of behavioral economics.
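The inverted-S distortion described here is commonly parameterised with a one-parameter weighting function; the form below follows Tversky and Kahneman's 1992 proposal, and the gamma value is illustrative, not one estimated from the ratings.

    def weight(p, gamma=0.65):
        """One-parameter probability weighting function with an inverted-S shape for gamma < 1."""
        return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

    for p in (0.01, 0.1, 0.5, 0.9, 0.99):
        print(p, round(weight(p), 3))   # small probabilities weighted up, large ones weighted down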
Choice mechanisms for past, temporally extended outcomes.
Accurate retrospection is critical in many decision scenarios, ranging from investment banking to hedonic psychology. A notoriously difficult case is integrating previously perceived values over the duration of an experience. Failure in retrospective evaluation leads to suboptimal outcomes when previous experiences are under consideration for a revisit. A biologically plausible mechanism underlying the evaluation of temporally extended outcomes is leaky integration of evidence. The leaky integrator favours positive temporal contrasts, in turn leading to undue emphasis on recency. To investigate the choice mechanisms underlying suboptimal outcomes based on retrospective evaluation, we used computational and behavioural techniques to model choice between perceived extended outcomes with different temporal profiles. Second-price auctions served to establish the perceived values of virtual coins offered sequentially to humans in a rapid monetary gambling task. The results show that lesser-valued options involving successive growth were systematically preferred to better options with declining temporal profiles. This disadvantageous inclination towards persistent growth was mitigated in some individuals, in whom a longer time constant of the leaky integrator resulted in fewer violations of dominance. These results demonstrate how focusing on immediate gains can be less beneficial than taking a longer perspective.
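The leaky-integration mechanism referred to above can be sketched in a few lines; the leak parameter and coin sequences are illustrative, not fitted values.

    def leaky_integral(values, leak=0.6):
        """Accumulate a sequence while only a fraction `leak` of the running total survives each step."""
        total = 0.0
        for v in values:
            total = leak * total + v
        return total

    rising  = [1, 2, 3, 4]   # total objective value 10
    falling = [5, 4, 3, 2]   # total objective value 14 -- the dominant option
    print(leaky_integral(rising), leaky_integral(falling))            # about 6.74 vs 6.32: recency wins
    print(leaky_integral(rising, 0.9), leaky_integral(falling, 0.9))  # about 9.05 vs 11.59

With the longer time constant (leak closer to 1) the declining but objectively better sequence is correctly preferred, mirroring the reduced dominance violations reported for some individuals.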
Economic choices reveal probability distortion in macaque monkeys.
Economic choices are largely determined by two principal elements, reward value (utility) and probability. Although nonlinear utility functions have been acknowledged for centuries, nonlinear probability weighting (probability distortion) was only recently recognized as a ubiquitous aspect of real-world choice behavior. Even when outcome probabilities are known and acknowledged, human decision makers often overweight low probability outcomes and underweight high probability outcomes. Whereas recent studies measured utility functions and their corresponding neural correlates in monkeys, it is not known whether monkeys distort probability in a manner similar to humans. We therefore investigated economic choices in macaque monkeys for evidence of probability distortion. We trained two monkeys to predict reward from probabilistic gambles with constant outcome values (0.5 ml or nothing). The probability of winning was conveyed by explicit visual cues (sector stimuli). Choices between the gambles revealed that the monkeys used the explicit probability information to make meaningful decisions. Using these cues, we measured probability distortion from choices between the gambles and safe rewards. Parametric modeling of the choices revealed classic probability weighting functions with an inverted-S shape: the animals overweighted low probability rewards and underweighted high probability rewards. Empirical investigation of the behavior verified that the choices were best explained by a combination of nonlinear value and nonlinear probability distortion. Together, these results suggest that probability distortion may reflect evolutionarily preserved neuronal processing.
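A compact sketch of the kind of model such choices are typically fitted with (the functional forms and parameter values below are generic assumptions, not the estimates reported for the animals): a nonlinear utility for reward volume, an inverted-S probability weight, and a softmax choice rule for gamble-versus-safe decisions.

    import math

    def utility(ml, rho=0.8):
        """Concave utility of reward volume (diminishing sensitivity)."""
        return ml ** rho

    def weight(p, gamma=0.6):
        """Inverted-S probability weighting: overweights low, underweights high probabilities."""
        return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

    def p_choose_gamble(p_win, gamble_ml=0.5, safe_ml=0.2, temperature=0.1):
        """Softmax probability of picking the gamble (p_win of gamble_ml, else nothing) over a safe reward."""
        value_gamble = weight(p_win) * utility(gamble_ml)
        value_safe = utility(safe_ml)
        return 1 / (1 + math.exp(-(value_gamble - value_safe) / temperature))

    for p in (0.1, 0.25, 0.5, 0.75, 0.9):
        print(p, round(p_choose_gamble(p), 2))   # choice probability rises with win probability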
