GQ(λ) Quick Reference Guide
Adam White and Richard S. Sutton
August 9, 2010
This document should serve as a quick reference for the linear GQ(λ) off-policy learning algorithm. We refer the reader to Maei and Sutton (2010) for a more detailed explanation of the intuition behind the algorithm and convergence proofs. If you have questions or concerns about the content of this document or the attached Java code, please email
[email protected].
1 Notation
For each use of GQ(λ) you need to provide the following four question functions. (In the following, S and A denote the sets of states and actions.)

• π : S × A → [0, 1]; target policy to be learned. If π is chosen as the greedy policy with respect to the learned value function, the algorithm will implement a generalization of the Greedy-GQ algorithm described in the recent ICML-10 paper (Maei, Szepesvari, Bhatnagar & Sutton, 2010).
• γ : S → [0, 1]; termination function (γ(s) = 1 − β(s) in the GQ paper).
• r : S × A × S → ℝ; transient reward function.
• z : S → ℝ; terminal reward (outcome) function.

Given the current feature vector φ, the expected next feature vector φ̄ under π, and the quantities λ, γ, z, r, ρ, and I for the current transition, the GQLearn function updates the primary weights θ, the secondary weights w, and the eligibility trace e as follows (α and η are positive step-size parameters):

    δ ← r + (1 − γ)z + γ θᵀφ̄ − θᵀφ
    e ← ρe + Iφ
    θ ← θ + α(δe − γ(1 − λ)(wᵀe)φ̄)
    w ← w + αη(δe − (wᵀφ)φ)
    e ← γλe
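For concreteness, here is a minimal Java sketch of these updates under a few assumptions: the class and method names (GQSketch, gqLearn, dot), the use of dense double arrays, and the zero initialization of θ are all illustrative, and this is not the interface of the distributed GQLambda.java.

public class GQSketch {
    private final int n;               // number of features
    private final double alpha, eta;   // step-size parameters (assumed supplied by the user)
    private final double[] theta;      // primary weights (zero here; "arbitrarily" in the pseudocode)
    private final double[] w;          // secondary weights, initialized to zero
    private final double[] e;          // eligibility trace

    public GQSketch(int n, double alpha, double eta) {
        this.n = n;
        this.alpha = alpha;
        this.eta = eta;
        this.theta = new double[n];
        this.w = new double[n];
        this.e = new double[n];
    }

    // phi: feature vector of the current state-action pair
    // phiBar: expected next feature vector under the target policy pi
    public void gqLearn(double[] phi, double[] phiBar, double lambda, double gamma,
                        double z, double r, double rho, double interest) {
        double delta = r + (1 - gamma) * z + gamma * dot(theta, phiBar) - dot(theta, phi);
        for (int i = 0; i < n; i++) e[i] = rho * e[i] + interest * phi[i];
        double wDotE = dot(w, e);      // w'e, computed before w is modified
        double wDotPhi = dot(w, phi);  // w'phi, computed before w is modified
        for (int i = 0; i < n; i++) {
            theta[i] += alpha * (delta * e[i] - gamma * (1 - lambda) * wDotE * phiBar[i]);
            w[i] += alpha * eta * (delta * e[i] - wDotPhi * phi[i]);
            e[i] *= gamma * lambda;
        }
    }

    private static double dot(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) sum += a[i] * b[i];
        return sum;
    }
}

Note that wᵀe and wᵀφ are computed before the element-wise loop so that every update uses the pre-update weight vectors, matching the equations above.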
The complete GQ(λ) algorithm, calling GQLearn on each step, is:

Initialize θ arbitrarily and w = 0
Repeat (for each episode):
    Initialize e = 0
    s ← initial state of episode
    Repeat (for each step of episode):
        a ← action selected by policy b in state s
        Take action a, observe next state s′
        φ̄ ← 0
        For all a′ ∈ A(s′): φ̄ ← φ̄ + π(s′, a′) φ(s′, a′)
        ρ = π(s, a) / b(s, a)
        GQLearn(φ(s, a), φ̄, λ(s′), γ(s′), z(s′), r(s, a, s′), ρ, I(s, a))
        s ← s′
    until s is terminal
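A Java sketch of one step of this loop is given below, reusing the GQSketch class from the previous sketch. The Problem interface and all of its methods are hypothetical placeholders for the environment and feature code the reader must supply; they are assumptions for illustration only, not part of the distributed GQLambda.java.

import java.util.List;

// Hypothetical interface standing in for the reader-supplied environment and feature code.
interface Problem {
    List<Integer> actions(int s);               // A(s)
    double[] features(int s, int a);            // phi(s, a)
    double pi(int s, int a);                    // target policy
    double b(int s, int a);                     // behavior policy
    double lambda(int s);
    double gamma(int s);
    double z(int s);
    double reward(int s, int a, int sNext);     // r(s, a, s')
    double interest(int s, int a);              // I(s, a)
}

class StepSketch {
    // One step of the loop above: form phiBar and rho, then call gqLearn.
    static void step(Problem p, GQSketch learner, int s, int a, int sNext, int numFeatures) {
        double[] phiBar = new double[numFeatures];   // expected next feature vector under pi
        for (int aNext : p.actions(sNext)) {
            double[] phiNext = p.features(sNext, aNext);
            double prob = p.pi(sNext, aNext);
            for (int i = 0; i < numFeatures; i++) phiBar[i] += prob * phiNext[i];
        }
        double rho = p.pi(s, a) / p.b(s, a);         // importance-sampling ratio
        learner.gqLearn(p.features(s, a), phiBar, p.lambda(sNext), p.gamma(sNext),
                        p.z(sNext), p.reward(s, a, sNext), rho, p.interest(s, a));
    }
}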
4 Code
The file GQLambda.java contains an implementation of the GQLearn function described above. We have deliberately excluded optimizations (e.g., binary features or efficient trace implementation) to ensure the code is simple and easy to understand. We leave it to the reader to provide environment code for interfacing to GQ(λ) (e.g., using RL-Glue).
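As one illustration of the kind of optimization that was omitted, the short sketch below assumes binary feature vectors stored as lists of active indices, so that inner products touch only the active features; it is an assumption for illustration, not code from GQLambda.java.

// Binary-feature optimization: a binary feature vector is stored as the indices of its
// active components, so an inner product costs O(active features) rather than O(n).
class SparseDot {
    static double dot(double[] weights, int[] activeIndices) {
        double sum = 0;
        for (int i : activeIndices) sum += weights[i];
        return sum;
    }
}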
5 References
Maei, H. R., Szepesvari, Cs., Bhatnagar, S., Sutton, R. S. (2010). Toward Off-Policy Learning Control with Function Approximation. In Proceedings of the 27th International Conference on Machine Learning, Haifa, Israel.
Maei, H. R., Sutton, R. S. (2010). GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces. In Baum, E., Hutter, M., Kitzelmann, E. (eds.), AGI 2010, pp. 91–96. Atlantis Press.

Sutton, R. S., Barto, A. G. (1998). Reinforcement Learning: An Introduction. MIT Press.