"Markov Decision Processes", by Lodewijk Kallenberg


reward at decision time point $t$ for an action $a$ in state $i$ will be denoted by $r_i^t(a)$; if the reward is independent of $t$
