Partitioning Input Space for Reinforcement Learning for Control

Dean F. Hougen, Maria Gini, and James Slagle
Department of Computer Science, University of Minnesota
Minneapolis, MN 55455
hougen—gini—[email protected]

Abstract

This paper considers the effect of input-space partitioning on reinforcement learning for control. In many such learning systems, the input space is partitioned by the system designer. However, input-space partitioning could be learned. Our objective is to compare learned and fixed input-space partitionings in terms of the overall system learning speed and proficiency achieved. We present a system for unsupervised control-learning in temporal domains with results for both fixed and learned input-space partitionings. The trailer-backing task is used as an example problem.

1. The learning system

Many classic control-learning systems, such as Michie and Chambers' BOXES [11] and Barto, Sutton, and Anderson's ASE/ACE system [2], rely on the partitioning of a space of continuous input variables into a fixed number of discrete regions. More recent systems, such as Fuzzy BOXES [18], have blurred the boundaries between the input regions but nonetheless rely on a partitioning of the input space. To use these systems, researchers have typically partitioned the space manually, prior to the application of the learning system. In these cases, the designer must analyze the problem to discover a suitable partitioning, or face a trade-off between a fine partitioning that permits accurate approximation of complex functions and a gross partitioning that allows for rapid learning.

The Self-Organizing Neural Network with Eligibility Traces (SONNET) scheme introduced by Hougen [4] is a general paradigm for the construction of connectionist networks that learn to control systems with a temporal component. To form mappings from input parameters to output responses, the system discretizes the input space by learning a partitioning of it, and learns an output response for each resulting discrete input region. SONNET systems have separate subsystems for learning input and output. The input subsystem learns input-space partitionings through self-organization and the output subsystem learns responses through the use of eligibility traces. Both input and output learning make use of a topological ordering of the neurons and associated neighborhoods, as in Kohonen's Self-Organizing Topological Feature Maps [8].
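To make the self-organization idea concrete, the following is a minimal one-dimensional Kohonen-style sketch of an input subsystem partitioning a continuous input range. This is an illustration, not the authors' implementation; the function name and all parameter values (learning rate, neighborhood width, decay constant) are assumptions.

```python
# Illustrative sketch (not the SONNET code): a 1-D Kohonen-style map
# whose units self-organize to partition a continuous input space.
import numpy as np

def som_update(weights, x, t, lr0=0.5, width0=2.0, tau=100.0):
    """One self-organization step: move the winning unit and its
    topological neighbors toward the input sample x."""
    winner = np.argmin(np.abs(weights - x))      # best-matching unit
    lr = lr0 * np.exp(-t / tau)                  # decaying learning rate
    width = max(width0 * np.exp(-t / tau), 0.5)  # shrinking neighborhood
    for i in range(len(weights)):
        if abs(i - winner) <= width:             # within the neighborhood
            weights[i] += lr * (x - weights[i])
    return weights

rng = np.random.default_rng(0)
w = rng.uniform(0.0, 1.0, size=8)                # 8 units on a line
for t in range(500):
    w = som_update(w, rng.uniform(0.0, 1.0), t)
# The sorted unit weights now define region boundaries over [0, 1].
```

The decreasing neighborhood width mirrors the time-varying neighborhoods described below: early updates drag whole stretches of the map into place, while later updates refine individual units.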

  !#"%$ &    &#'  Each SONNET subsystem consists of one or more artificial neural networks. For each network there is a topological ordering of the neurons that remains constant as the network learns. Let be the dimension of a network. Each neuron is associated with a -tuple that defines its coordinates. Each neuron is assigned an integer tuple of the same dimensionality as the network that uniquely defines its coordinates in topology space. The existence of a network topology allows for the definition of a distance function for the neurons. This is typically defined as the Euclidean distance between coordinates in topology space. For the sake of computational efficiency, however, we use the maximum difference between coordinates of the neurons. E.g. for neurons (1,1) and (2,3) in a planar network, the distance between them would be or 2. The distance function is used indirectly through the definition of neighborhoods. A neighborhood may have any width from zero (the neighborhood is restricted to the neuron itself) to the maximum distance between neurons in topology space (the entire network is in the neighborhood) and may vary with time, typically starting large and subsequently decreasing. Formally, if is the set of units in the network, the distance function defined on topology space, and a time dependent function specifying the width of the neighborhood, then the neighborhood of neuron at time is defined as

(

(

@

)+*-,.0/213546/278/213:9;/


A

AEDF.GCH
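These two definitions translate directly into code. A short sketch (function names are ours, chosen for illustration) of the maximum-coordinate-difference distance and the resulting neighborhood:

```python
# Sketch of the topology-space distance and neighborhood definitions.
def topo_distance(u, v):
    """Maximum coordinate difference (Chebyshev distance) in topology space."""
    return max(abs(a - b) for a, b in zip(u, v))

def neighborhood(units, u, width):
    """All units within the given width of u, including u itself."""
    return [v for v in units if topo_distance(u, v) <= width]

# A 3x3 planar network of units with integer coordinates:
units = [(i, j) for i in range(1, 4) for j in range(1, 4)]
print(topo_distance((1, 1), (2, 3)))        # max(1, 2) = 2, as in the text
print(len(neighborhood(units, (2, 2), 1)))  # the entire 3x3 network: 9 units
```

With width 0 the neighborhood of a unit is just the unit itself, and with width equal to the maximum pairwise distance it is the whole network, matching the two extremes described above.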
