BMC Neuroscience

Poster presentation – Open Access

A network model that can learn reward timing using reinforced expression of synaptic plasticity

Jeffrey P Gavornik*1,2, Yonatan Loewenstein3 and Harel Z Shouval1

Address: 1Department of Neurobiology and Anatomy, The University of Texas Medical School at Houston, Houston, TX, USA; 2Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX, USA; 3Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA

Email: Jeffrey P Gavornik* - [email protected]

* Corresponding author

from Sixteenth Annual Computational Neuroscience Meeting: CNS*2007. Toronto, Canada, 7–12 July 2007

Published: 6 July 2007

BMC Neuroscience 2007, 8(Suppl 2):P103 doi:10.1186/1471-2202-8-S2-P103

Sixteenth Annual Computational Neuroscience Meeting: CNS*2007. Edited by William R Holmes.

Meeting abstracts – A single PDF containing all abstracts in this Supplement is available at http://www.biomedcentral.com/content/pdf/1471-2202-8-S2-info.pdf

© 2007 Gavornik et al; licensee BioMed Central Ltd.

Recent experimental results indicate that cells within the primary visual cortex can learn to predict the time of rewards associated with visual cues [1]. In these experiments, different visual cues were paired with rewards at specific temporal offsets. Before training, neurons in visual cortex were active only while the visual cue was present. After sufficient training, neurons developed persistent activity over a period correlated with the timing of reward. Recurrent connections in a neural network can be constructed to set a desired network time constant that differs from the time constants of the constituent neurons; however, it is not known how such a network can learn the appropriate recurrent weights. A plasticity model able to accomplish this must be sensitive to the timing of reward events that, at least initially, occur seconds after the activity in the network has returned to its basal level. In order to learn the appropriate dynamics, the network must therefore solve a temporal credit assignment problem. In our model, plasticity is an ongoing process that changes the recurrent synaptic weights as a function of their activity; in the absence of a reward signal this plasticity rapidly decays. External reward signals allow permanent expression of preceding plasticity events, reinforcing only those that predict the reward. As a result, the network dynamics are altered and the network develops time constants correlated with the timing of different rewards. As in other reinforcement learning models, the reward signal is inhibited by the network activity to produce a stable activity pattern. We have implemented these ideas both in abstract passive integrator networks and in more realistic integrate-and-fire networks, and obtained results that are qualitatively similar to the experimental results. Further, we examine the implications of different possible biophysical mechanisms and propose experiments to test which specific mechanisms are involved.

Support: NSF CRCNS grant number 0515285.
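To make the scheme concrete, the following minimal Python sketch (not the authors' implementation; all parameter values, variable names such as tau_p, eta and alpha, and the specific functional forms are illustrative assumptions) simulates a single passive-integrator unit whose recurrent gain w sets an effective decay time constant tau_m/(1 - w) for a linear unit. A transient weight change accumulates with activity and decays away unless a reward arrives; at the reward time the remaining transient change is consolidated, and the reward signal is inhibited by the ongoing network activity so that learning stabilizes once activity persists until the reward.

# Minimal sketch of reward-gated consolidation of otherwise-decaying plasticity.
# Hypothetical parameters and variable names; qualitative illustration only.
import numpy as np

dt       = 0.01      # s, simulation time step
tau_m    = 0.1       # s, intrinsic time constant of the unit
tau_p    = 5.0       # s, decay time constant of unconsolidated plasticity
cue_dur  = 0.5       # s, duration of the visual cue input
t_reward = 2.0       # s, cue-onset-to-reward interval to be learned
T_trial  = 4.0       # s, length of one trial
eta      = 0.02      # rate at which candidate (transient) weight changes accrue
alpha    = 1.0       # strength with which activity inhibits the reward signal

w = 0.0              # recurrent weight; effective time constant = tau_m / (1 - w)
n_steps = int(T_trial / dt)

for trial in range(200):
    r  = 0.0         # population firing rate (arbitrary units)
    dw = 0.0         # transient, decaying weight change ("candidate" plasticity)
    for step in range(n_steps):
        t = step * dt
        I = 1.0 if t < cue_dur else 0.0        # cue input
        r += dt / tau_m * (-r + w * r + I)     # leaky recurrent integrator
        dw += dt * (eta * r * r - dw / tau_p)  # activity-driven candidate plasticity
                                               # that decays without reward
        if abs(t - t_reward) < dt / 2:         # reward delivery
            R = max(0.0, 1.0 - alpha * r)      # reward signal inhibited by activity
            w += R * dw                        # express only reward-preceded plasticity
            w = min(w, 0.999)                  # keep the linear unit stable

print("learned w =", w, "-> effective time constant",
      tau_m / (1.0 - w), "s (cue-to-reward interval", t_reward, "s)")

In this sketch, repeated trials increase w until the activity persisting at the reward time suppresses the reward signal, halting further consolidation; the resulting effective time constant is of the order of the cue-to-reward interval, qualitatively mirroring the reward-timed persistent activity described above.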

References
1. Shuler MG, Bear MF: Reward timing in the primary visual cortex. Science 2006, 311:1606-1609.
