Journal of Modern Foreign Psychology
2016. Vol. 5, no. 4, 85–96
ISSN: 2304-4977 (online)
Reinforcement learning in probabilistic environment and its role in human adaptive and maladaptive behavior
General Information
Keywords: reinforcement learning, uncertainty, prediction error, frontal cortex, dopamine, serotonin, norepinephrine, mental disorders
Journal rubric: Educational Psychology and Pedagogical Psychology
For citation: Kozunova G.L. Reinforcement learning in probabilistic environment and its role in human adaptive and maladaptive behavior [Electronic resource]. Sovremennaia zarubezhnaia psikhologiia = Journal of Modern Foreign Psychology, 2016. Vol. 5, no. 4, pp. 85–96. DOI: 10.17759/jmfp.2016050409. (In Russ., abstr. in Engl.)
- Sagvolden T. et al. A dynamic developmental theory of attention-deficit/hyperactivity disorder (ADHD) predominantly hyperactive/impulsive and combined subtypes. Behavioral and Brain Sciences, 2005. Vol. 28, no. 3, pp. 397–418. doi: 10.1017/S0140525X05000075
- Steinberg E.E. et al. A causal link between prediction errors, dopamine neurons and learning. Nature neuroscience, 2013. Vol. 16, no. 7, pp. 966–973. doi: 10.1038/nn.3413
- Qi J. et al. A glutamatergic reward input from the dorsal raphe to ventral tegmental area dopamine neurons. Nature communications, 2014. Vol. 5, art. 5390. doi: 10.1038/ncomms6390
- Alloy L.B., Tabachnik N. Assessment of covariation by humans and animals: The joint influence of prior expectations and current situational information. Psychological review, 1984. Vol. 91, no. 1, pp. 112–149. doi: 10.1037/0033-295X.91.1.112
- Der-Avakian A. et al. Assessment of reward responsiveness in the response bias probabilistic reward task in rats: implications for cross-species translational research. Translational psychiatry, 2013. Vol. 3, no. 8. doi: 10.1038/tp.2013.74
- Aston-Jones G., Cohen J.D. An integrative theory of locus coeruleus-norepinephrine function: Adaptive gain and optimal performance. Annual Review of Neuroscience, 2005. Vol. 28, pp. 403–450. doi: 10.1146/annurev.neuro.28.061604.135709
- Balsam P.D., Drew M.R., Yang C. Timing at the start of associative learning. Learning and Motivation, 2002. Vol. 33, no. 1, pp. 141–155. doi: 10.1006/lmot.2001.1104
- Bayer H.M., Glimcher P.W. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron, 2005. Vol. 47, no. 1, pp. 129–141. doi: 10.1016/j.neuron.2005.05.020
- Bayer H.M., Lau B., Glimcher P.W. Statistics of midbrain dopamine neuron spike trains in the awake primate. Journal of Neurophysiology, 2007. Vol. 98, no. 3, pp. 1428–1439. doi: 10.1152/jn.01140.2006
- Bouret S., Richmond B.J. Sensitivity of locus ceruleus neurons to reward value for goal-directed actions. The Journal of Neuroscience, 2015. Vol. 35, no. 9, pp. 4005–4014. doi: 10.1523/JNEUROSCI.4553-14.2015
- Bourgeois A., Chelazzi L., Vuilleumier P. How motivation and reward learning modulate selective attention. Progress in Brain Research, 2016. Vol. 229, pp. 325–342. doi: 10.1016/bs.pbr.2016.06.004
- Cartoni E., Puglisi-Allegra S., Baldassarre G. The three principles of action: A Pavlovian-instrumental transfer hypothesis. Frontiers in behavioral neuroscience, 2013. Vol. 7, pp. 1–11. doi: 10.3389/fnbeh.2013.00153
- Conway C.M., Christiansen M.H. Sequential learning in non-human primates. Trends in cognitive sciences, 2001. Vol. 5, no. 12, pp. 539–546. doi: 10.1016/S1364-6613(00)01800-3
- Corbetta M., Patel G., Shulman G.L. The reorienting system of the human brain: From environment to theory of mind. Neuron, 2008. Vol. 58, no. 3, pp. 306–324. doi: 10.1016/j.neuron.2008.04.017
- Cytawa J., Trojniar W. The state of pleasure and its role in instrumental conditioning. Activitas nervosa superior, 1976. Vol. 18, no. 1–2, pp. 92–96.
- Dayan P., Berridge K.C. Model-based and model-free Pavlovian reward learning: Revaluation, revision, and revelation. Cognitive, Affective, & Behavioral Neuroscience, 2014. Vol. 14, no. 2, pp. 473–492. doi: 10.3758/s13415-014-0277-8
- Dickinson A., Watt A., Griffiths W.J.H. Free-operant acquisition with delayed reinforcement. The Quarterly Journal of Experimental Psychology Section B: Comparative and Physiological Psychology, 1992. Vol. 45, no. 3, pp. 241–258.
- Heinz A. et al. Dimensional psychiatry: Mental disorders as dysfunctions of basic learning mechanisms. Journal of Neural Transmission, 2016. Vol. 123, no. 8, pp. 809–821. doi: 10.1007/s00702-016-1561-2
- Roiser J.P. et al. Do patients with schizophrenia exhibit aberrant salience? Psychological medicine, 2009. Vol. 39, no. 2, pp. 199–209. doi: 10.1017/s0033291708003863
- Liu Z. et al. Dorsal raphe neurons signal reward through 5-HT and glutamate. Neuron, 2014. Vol. 81, no. 6, pp. 1360–1374. doi: 10.1016/j.neuron.2014.02.010
- Frank M.J., Seeberger L.C., O'Reilly R.C. By carrot or by stick: Cognitive reinforcement learning in parkinsonism. Science, 2004. Vol. 306, no. 5703, pp. 1940–1943. doi: 10.1126/science.1102941
- VanElzakker M.B. et al. From Pavlov to PTSD: The extinction of conditioned fear in rodents, humans, and anxiety disorders. Neurobiology of learning and memory, 2014. Vol. 113, pp. 3–18. doi: 10.1016/j.nlm.2013.11.014
- Gallistel C.R., Fairhurst S., Balsam P. The learning curve: Implications of a quantitative analysis. Proceedings of the National Academy of Sciences of the United States of America, 2004. Vol. 101, no. 36, pp. 13124–13131. doi: 10.1073/pnas.0404965101
- Gershman S.J. A Unifying Probabilistic View of Associative Learning. PLoS Computational Biology, 2015. Vol. 11, no. 11, pp. 1–20. doi: 10.1371/journal.pcbi.1004567
- Guillin O., Abi‐Dargham A., Laruelle M. Neurobiology of dopamine in schizophrenia. International review of neurobiology, 2007. Vol. 78, pp. 1–39. doi: 10.1016/S0074-7742(06)78001-1
- Hinson J.M., Staddon J.E.R. Matching, maximizing, and hill‐climbing. Journal of the experimental analysis of behavior, 1983. Vol. 40, no. 3, pp. 321–331. doi: 10.1901/jeab.1983.40-321
- Hofmeister J., Sterpenich V. A role for the locus ceruleus in reward processing: encoding behavioral energy required for goal-directed actions. The Journal of Neuroscience, 2015. Vol. 35, no. 29, pp. 10387–10389. doi: 10.1523/JNEUROSCI.1734-15.2015
- Holroyd C.B., Coles M.G.H. The neural basis of human error processing: reinforcement learning, dopamine, and the error-related negativity. Psychological review, 2002. Vol. 109, no. 4, pp. 679–709. doi: 10.1037/0033-295X.109.4.679
- Homberg J.R. Serotonin and decision making processes. Neuroscience & Biobehavioral Reviews, 2012. Vol. 36, no. 1, pp. 218–236. doi: 10.1016/j.neubiorev.2011.06.001
- Kirkpatrick K., Balsam P.D. Associative learning and timing. Current opinion in behavioral sciences, 2016. Vol. 8, pp. 181–185. doi: 10.1016/j.cobeha.2016.02.023
- Ma W.J., Jazayeri M. Neural coding of uncertainty and probability. Annual review of neuroscience, 2014. Vol. 37, pp. 205–220. doi: 10.1146/annurev-neuro-071013-014017
- Maia T.V., Frank M.J. From reinforcement learning models to psychiatric and neurological disorders. Nature neuroscience, 2011. Vol. 14, no. 2, pp. 154–162. doi: 10.1038/nn.2723
- Molet M., Miller R.R. Timing: An attribute of associative learning. Behavioural processes, 2014. Vol. 101, pp. 4–14. doi: 10.1016/j.beproc.2013.05.015
- Crone E.A. et al. Neural mechanisms supporting flexible performance adjustment during development. Cognitive, Affective, & Behavioral Neuroscience, 2008. Vol. 8, no. 2, pp. 165–177. doi: 10.3758/CABN.8.2.165
- Garbusow M. et al. Pavlovian-to-instrumental transfer in alcohol dependence: A pilot study. Neuropsychobiology, 2014. Vol. 70, no. 2, pp. 111–121. doi: 10.1159/000363507
- Palminteri S. et al. Pharmacological modulation of subliminal learning in Parkinson's and Tourette's syndromes. Proceedings of the National Academy of Sciences, 2009. Vol. 106, no. 45, pp. 19179–19184. doi: 10.1073/pnas.0904035106
- Reddy L.F. et al. Probabilistic reversal learning in schizophrenia: Stability of deficits and potential causal mechanisms. Schizophrenia bulletin, 2016. Vol. 42, no. 4, pp. 942–951. doi: 10.1093/schbul/sbv226
- Nieuwenhuis S. et al. Reinforcement-related brain potentials from medial frontal cortex: Origins and functional significance. Neuroscience & Biobehavioral Reviews, 2004. Vol. 28, no. 4, pp. 441–448. doi: 10.1016/j.neubiorev.2004.05.003
- Robinson J.S. Stimulus substitution and response learning in the earthworm. Journal of comparative and physiological psychology, 1953. Vol. 46, no. 4, pp. 262–266. doi: 10.1037/h0056151
- Saffran J.R., Aslin R.N., Newport E.L. Statistical learning by 8-month-old infants. Science, 1996. Vol. 274, no. 5294, pp. 1926–1928.
- Schultz W. Predictive reward signal of dopamine neurons. Journal of neurophysiology, 1998. Vol. 80, no. 1, pp. 1–27.
- Izquierdo A. et al. The neural basis of reversal learning: An updated perspective. Neuroscience, 2016. doi: 10.1016/j.neuroscience.2016.03.021
- Ferdinand N.K. et al. The processing of unexpected positive response outcomes in the mediofrontal cortex. The Journal of Neuroscience, 2012. Vol. 32, no. 35, pp. 12087–12092. doi: 10.1523/JNEUROSCI.1410-12.2012
- Thorndike E.L. Animal intelligence: Experimental studies. Transaction Publishers, 1965.
- Walsh M.M., Anderson J.R. Learning from delayed feedback: Neural responses in temporal credit assignment. Cognitive, Affective, & Behavioral Neuroscience, 2011. Vol. 11, no. 2, pp. 131–143. doi: 10.3758/s13415-011-0027-0
- Weismüller B., Bellebaum C. Expectancy affects the feedback‐related negativity (FRN) for delayed feedback in probabilistic learning. Psychophysiology, 2016. Vol. 53, no. 11, pp. 1739–1750. doi: 10.1111/psyp.12738
- Wolford G., Miller M.B., Gazzaniga M. The left hemisphere’s role in hypothesis formation [Electronic resource]. Journal of Neuroscience, 2000. Vol. 20, no. 6, pp. 1–4. URL: (Accessed 27.12.2016).
- Yellott J.I. Probability learning with noncontingent success. Journal of mathematical psychology, 1969. Vol. 6, no. 3, pp. 541–575. doi: 10.1016/0022-2496(69)90023-6