A general error-modulated STDP learning rule applied to reinforcement learning in the basal ganglia
Trevor Bekolay, Chris Eliasmith
{tbekolay,celiasmith}@uwaterloo.ca
Centre for Theoretical Neuroscience, University of Waterloo

Introduction

● Error-modulated learning rules have typically assumed a simple scalar representation of error.
● To learn complex tasks, previous models have employed attention or sparse local connectivity to make reinforcement learning tractable in high-dimensional spaces.
● We propose an STDP rule that exploits a multidimensional error signal to learn complex tasks without these additional constraints.

[Figure: learning network schematic. An Input population drives an Output population through learned connection weights; an Error population compares the output against the desired output and gates learning through a modulatory, dopamine-like connection.]

Simulation results from a 2-armed bandit task

Task structure, following [5]:

1. Animal waits (Delay phase).
2. Bridge lowers, allowing the animal to move (Go phase).
3. Animal reaches the decision point and turns either left or right (Approach phase).
4. Reward is stochastically delivered at the reward site (Reward phase).
5. Animal returns to the delay area (Return phase).

Reward probabilities per block, P(reward) as [left right]:

Block   Experimental rat data [5]   Simulated rat data
1       [.21 .63]                   [.21 .63]
2       [.63 .21]                   [.63 .21]
3       [.12 .72]                   [.12 .72]
4       [.72 .12]                   [.72 .12]

Simulated data are aggregated over 100 sessions for each of two simulated rats.
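As a rough sketch of this reward schedule (not the authors' code; the block length and the policy interface are assumptions), each block fixes P(reward) for the two arms:

import random

# Reward probabilities [left, right] for each block, matching the table above.
BLOCKS = [(0.21, 0.63), (0.63, 0.21), (0.12, 0.72), (0.72, 0.12)]

def run_session(choose, trials_per_block=100):
    """Run one session; `choose()` returns 0 (left) or 1 (right)."""
    history = []
    for probs in BLOCKS:
        for _ in range(trials_per_block):
            action = choose()
            reward = 1 if random.random() < probs[action] else 0
            history.append((action, reward))
    return history

# Example: a random-policy baseline.
history = run_session(lambda: random.randrange(2))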

Methods

● Simulations use the Neural Engineering Framework (NEF) [1].

Encoding: $a_i(x) = G_i\!\left[\alpha_i \langle e_i, x \rangle + J_i^{\text{bias}}\right]$

Decoding: $\hat{x} = \sum_i a_i(x)\, d_i$

where $e_i$ are encoders, $\alpha_i$ gains, $G_i$ the neuron nonlinearity, and $d_i$ linearly optimal decoders.
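A minimal numerical sketch of NEF encoding and decoding (NumPy rate neurons; the rectified-linear nonlinearity, random encoders, gains, biases, and regularization constant are illustrative assumptions, not the poster's parameters):

import numpy as np

rng = np.random.default_rng(0)
n_neurons, n_samples = 50, 200

# Random encoders e_i, gains alpha_i, and biases J_i^bias (assumed values).
encoders = rng.choice([-1.0, 1.0], size=n_neurons)
gains = rng.uniform(0.5, 2.0, size=n_neurons)
biases = rng.uniform(-1.0, 1.0, size=n_neurons)

def rates(x):
    """Encoding: a_i(x) = G[alpha_i <e_i, x> + J_i^bias], rectified-linear G."""
    J = gains * encoders * x + biases
    return np.maximum(J, 0.0)

# Solve for decoders d_i by regularized least squares over sample points.
xs = np.linspace(-1, 1, n_samples)
A = np.array([rates(x) for x in xs])        # (samples, neurons)
gamma = A.T @ A + 0.1 * np.eye(n_neurons)   # regularized Gram matrix
decoders = np.linalg.solve(gamma, A.T @ xs)

x_hat = rates(0.5) @ decoders               # Decoding: x_hat = sum_i a_i(x) d_i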

Theory: Learning mathematical functions

Reinforcement learning: value function and reward prediction error
● Dopamine release has been shown to closely resemble the reward prediction error ($\delta$); see the worked example below.
● Applied to the NEF, the error modulates learning of the decoded connection weights [2]:

  $\Delta\omega_{ij} = \kappa\, \alpha_j\, \langle e_j, E \rangle\, a_i$

  where $a_i$ is the presynaptic activity, $e_j$ the postsynaptic encoder, $\alpha_j$ its gain, and $E$ the (possibly multidimensional) error represented by a modulatory population.
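For concreteness, the reward prediction error in its standard temporal-difference form (the states, discount, and learning rate are illustrative, not from the poster):

# TD(0): the reward prediction error delta drives the value update.
gamma, lr = 0.95, 0.1                      # discount, learning rate (assumed)
V = {"approach": 0.0, "reward_site": 0.0}  # state values

def td_update(s, r, s_next):
    delta = r + gamma * V[s_next] - V[s]   # reward prediction error
    V[s] += lr * delta
    return delta

delta = td_update("approach", 1.0, "reward_site")  # a rewarded trial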



● A triplet-based STDP rule [3] keeps fast and slow traces of input spikes ($r_1$, $r_2$) and output spikes ($o_1$, $o_2$):

  at an input spike:  $\Delta\omega_{ij}^- = -o_1(t)\left[A_2^- + A_3^-\, r_2(t)\right]$
  at an output spike: $\Delta\omega_{ij}^+ = r_1(t)\left[A_2^+ + A_3^+\, o_2(t)\right]$

● Combined: scaling the triplet STDP updates by the projected error term $\alpha_j \langle e_j, E \rangle$ yields the proposed error-modulated STDP rule (sketched below).
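A sketch of the combined rule's per-synapse bookkeeping (trace constants and amplitudes follow the visual-cortex fit in [3]; the exact form of the error gating is our reading of the poster, not a definitive implementation):

# Triplet STDP traces for one synapse i -> j, gated by the projected error.
TAU_R1, TAU_R2 = 0.017, 0.101   # presynaptic trace time constants, s (from [3])
TAU_O1, TAU_O2 = 0.034, 0.125   # postsynaptic trace time constants, s (from [3])
A2P, A3P = 5e-10, 6.2e-3        # pair/triplet potentiation amplitudes (from [3])
A2M, A3M = 7e-3, 2.3e-4         # pair/triplet depression amplitudes (from [3])

def step(state, dt, pre_spike, post_spike, error_proj, kappa=1.0):
    """Advance the four traces by dt and return the error-gated weight change.

    error_proj stands for alpha_j * <e_j, E>: the error vector projected onto
    the postsynaptic neuron's encoder (gating by this term is an assumption).
    """
    r1, r2, o1, o2 = state
    # Exponential decay of all four spike traces.
    r1 -= dt * r1 / TAU_R1
    r2 -= dt * r2 / TAU_R2
    o1 -= dt * o1 / TAU_O1
    o2 -= dt * o2 / TAU_O2
    dw = 0.0
    if pre_spike:   # depression term, evaluated before the trace increments
        dw -= kappa * error_proj * o1 * (A2M + A3M * r2)
        r1 += 1.0
        r2 += 1.0
    if post_spike:  # potentiation term
        dw += kappa * error_proj * r1 * (A2P + A3P * o2)
        o1 += 1.0
        o2 += 1.0
    return (r1, r2, o1, o2), dw

# One 1 ms step with a presynaptic spike and a positive projected error.
state, dw = step((0.0, 0.0, 0.0, 0.0), 1e-3, True, False, error_proj=0.5)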

Application: Dynamic action selection in a biologically plausible basal ganglia model

Original model [4]:
[Figure: spiking basal ganglia circuit with excitatory and inhibitory connections, mapping input utilities to output spikes expressing the selected action.]

Augmented model, incorporating reward prediction error in a synaptic learning rule:
[Figure: the same circuit with learned input connections; a modulatory dopamine population projects the reward prediction error to each learning neuron, alongside the excitatory and inhibitory connections of the original model.]

[Figure: simulated spiking activity across the Delay, Approach, Reward, and Return phases of the task.]
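To connect the model to the bandit task, a non-spiking stand-in for the selection-and-learning loop (winner-take-all over utilities; the exploration rate and update constants are assumptions, not the spiking circuit's behaviour):

import random

utilities = [0.0, 0.0]  # learned utility of the left/right actions

def select(eps=0.1):
    """Winner-take-all over utilities, with occasional exploration."""
    if random.random() < eps:
        return random.randrange(2)
    return max(range(2), key=lambda a: utilities[a])

def learn(action, reward, lr=0.05):
    """Move the chosen action's utility toward the received reward."""
    utilities[action] += lr * (reward - utilities[action])

# One trial of block 1: select, sample a stochastic reward, update.
a = select()
r = 1 if random.random() < (0.21, 0.63)[a] else 0
learn(a, r)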



Conclusions

● With a multidimensional error signal, the rule can learn arbitrary transformations in vector spaces.
● The rule can be applied to a biologically plausible model of the basal ganglia without adding constraints.
● The augmented basal ganglia can replicate behavioural and neural results from an experimental reinforcement learning task.

References
[1] Eliasmith, Chris, and Charles H. Anderson. Neural Engineering: Computation, Representation, and Dynamics in Neurobiological Systems. MIT Press, 2003.
[2] MacNeil, David, and Chris Eliasmith. "Fine-tuning and stability of recurrent neural networks." Journal of Neuroscience, 2010 (submitted).
[3] Pfister, Jean-Pascal, and Wulfram Gerstner. "Triplets of spikes in a model of spike timing-dependent plasticity." Journal of Neuroscience, 2006.
[4] Stewart, Terrence C., Xuan Choo, and Chris Eliasmith. "Dynamic behaviour of a spiking model of action selection in the basal ganglia." ICCM 2010.
[5] Kim, Hoseok, Jung Hoon Sul, Namjung Huh, Daeyeol Lee, and Min Whan Jung. "Role of striatum in updating values of chosen actions." Journal of Neuroscience, 2009.