Freeway Ramp-Metering Control based on Reinforcement Learning

Ahmed Fares and Prof. Walid Gomaa
Egypt-Japan University for Science and Technology (E-JUST)

June 11, 2014

Fares, Prof. Walid (E-JUST)

Freeway Ramp-Metering Control


Overview

1. Challenges and Motivation
2. Objectives
3. Introduction
4. Freeway Traffic Flow Model
5. Reinforcement Learning: Q-Learning
6. Reinforcement Learning Density Control Agent
7. Experiments: Results and Analysis
8. Conclusion and Future Direction


Challenges and Motivation

The challenges of the traffic control problem come from the inherently dynamic nature, non-linearity, and uncertainty of the traffic network system. Traffic congestion is a challenging problem faced in everyday life. It has multiple negative effects on:

- Average speed
- Overall total travel time
- Fuel consumption
- Safety: congestion is a primary cause of accidents
- Environment: air pollution

Hence the need for an intelligent, reliable traffic control system.


Objectives

The Aim of This Research
The aim of this research is to control the number of vehicles entering the mainstream freeway from the ramp merging area.

Consequence
This keeps the freeway density below the critical density. Consequently, the freeway is utilized to its maximum without becoming congested, and optimal freeway operation is maintained.


Introduction I

The huge increase in the number of vehicles all over the world leads to random occurrences of congestion. The sophistication of traffic network demands, as well as their severity, has also increased recently. Consequently, the need for optimal and reliable traffic control, for both urban and freeway networks, has become more and more critical.

The number of vehicles admitted to the freeway from the ramp per unit time is called the ramp metering rate, which can be measured by a ramp metering device. With optimal ramp metering control, the freeway flow can be improved. Congestion occurs on a section of the freeway when the demand flow exceeds the capacity of that section.


Introduction II

In this paper, we propose a new model of ramp metering based on a Markov Decision Process and an associated algorithm. The approach is meant to avoid two drawbacks of existing methods:

- the computational complexity of MPC and its risk of being trapped in a local optimum, and
- the solid knowledge of the system that is needed to extract the rules of a fuzzy control system.

Extensive analysis is conducted to assess the proposed definition of the state-action pairs and of the reward function. Under dynamic traffic demand, our approach aims to maximize utilization of the freeway capacity while avoiding congestion, and so meets the need for optimal freeway control.


Freeway Traffic Flow Model I

$$N(t + \Delta t) = N(t) + \Delta t \left[ \lambda q_u(t) - \lambda q_d(t) + q_r(t) \right] \tag{1}$$

Dividing both sides of equation (1) by λ∆x:

$$\rho(t + \Delta t) = \rho(t) + \frac{\Delta t}{\Delta x} \left[ q_u(t) - q_d(t) + \frac{q_r(t)}{\lambda} \right] \tag{2}$$

$$q_d(t) = v_{ff} \left( 1 - \frac{\rho(t)}{\rho_{jam}} \right) \rho(t) \tag{3}$$
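As a minimal sketch, equations (2) and (3) can be coded as a single update step. The function and parameter names are illustrative, not from the slides. The code assumes that λ is the number of mainstream lanes, that q_u and q_d are per-lane flows, and that q_r is the total ramp flow. The final clipping to [0, ρ_jam] is an added safeguard that the slides do not mention.

```python
def downstream_flow(rho, v_ff, rho_jam):
    """Equation (3): outflow q_d (veh/h/lane) at density rho (veh/km/lane)."""
    return v_ff * (1.0 - rho / rho_jam) * rho


def density_step(rho, q_u, q_r, dt, dx, lanes, v_ff, rho_jam):
    """Equation (2): advance the section density by one time step.

    rho   : current density (veh/km/lane)
    q_u   : upstream inflow (veh/h/lane, assumed per lane)
    q_r   : on-ramp inflow (veh/h, assumed total)
    dt    : time step (h)
    dx    : section length (km)
    lanes : lambda in the slides (assumed to be the lane count)
    """
    q_d = downstream_flow(rho, v_ff, rho_jam)
    rho_next = rho + (dt / dx) * (q_u - q_d + q_r / lanes)
    # Assumption, not in the slides: keep the density in its physical range.
    return min(max(rho_next, 0.0), rho_jam)


# Illustrative call with the second case study's parameters.
# The two inflow values are hypothetical.
rho_next = density_step(rho=30.0, q_u=600.0, q_r=600.0, dt=10 / 3600,
                        dx=0.5, lanes=3, v_ff=80.0, rho_jam=110.0)
```

With these inputs the outflow exceeds the combined inflow, so the density falls during the step.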


Freeway Traffic Flow Model II

Fundamental Diagram of Traffic Flow
1. At the jam density the flow is zero.
2. At the critical density the flow is maximum.
3. Below the critical density the flow increases with the density, and above the critical density the flow decreases with the density.

Equations (1), (2) and (3) complete the description of the freeway model. Equation (3) states that q_d = f(ρ), which can be substituted into (1) and (2). The objective of the ramp metering control is therefore to keep the freeway density around the critical density: ρ ∈ [ρ_cr − ε, ρ_cr + ε].
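The three properties of the fundamental diagram can be checked numerically against equation (3). The sketch below is illustrative only. The function names and the tolerance `eps` are assumptions, and the result that the flow peaks at ρ_jam / 2 holds for the parabolic form of equation (3) specifically.

```python
def flow(rho, v_ff, rho_jam):
    """Equation (3): flow (veh/h/lane) as a function of density (veh/km/lane)."""
    return v_ff * (1.0 - rho / rho_jam) * rho


def critical_density(rho_jam):
    """Density that maximizes equation (3).

    Setting d(flow)/d(rho) = v_ff * (1 - 2 * rho / rho_jam) to zero
    gives rho = rho_jam / 2.
    """
    return rho_jam / 2.0


def in_target_band(rho, rho_cr, eps):
    """True when rho lies in [rho_cr - eps, rho_cr + eps]."""
    return abs(rho - rho_cr) <= eps


# Second case study values: v_ff = 80 km/h, rho_jam = 110 veh/km/lane.
rho_cr = critical_density(110.0)
peak_flow = flow(rho_cr, 80.0, 110.0)
```

For these values the computed critical density is 55 veh/km/lane, which agrees with the ρ_cr quoted for the second case study.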


Reinforcement Learning

Reinforcement learning (RL) is a machine learning technique that addresses the question of how an autonomous agent can learn to choose optimal actions to achieve its goals. The control agent needs to optimize its behavior according to some objective function, which introduces the concept of a reward received from the environment.

The agent can choose any action from a set of actions A. It learns by trial-and-error interaction with the surrounding environment. Every time the agent performs an action a_t in some state s_t, the environment responds with a reward or penalty r_t that evaluates the quality of this immediate transition. These interactions produce a sequence of state-action pairs (s_i, a_i) and immediate rewards r_i.


Q-Learning

Algorithm 1: Q-learning
  Initialize the lookup table Q̂(s, a) for all s ∈ S, a ∈ A (for example, with small random values);
  repeat
    Start in an initial state s;
    repeat
      Choose action a and execute it;
      Receive reward r and successor state s′;
      Q̂(s, a) := r(s, a) + γ max_{a′} Q̂(s′, a′);
      s := s′;
    until s is a terminal state or the time limit is reached;
  until Q̂ converges;
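A small runnable sketch of Algorithm 1 follows. It is not the authors' implementation. The five-state chain, the zero-initialized table, the purely random action choice, and a fixed episode count in place of a convergence test are all assumptions made for illustration. The slides do not say how "Choose action a" is done.

```python
import random


def q_learning(states, actions, step, reward, is_terminal, start,
               gamma=0.9, episodes=500, max_steps=50, seed=0):
    """Tabular Q-learning with the update Q(s,a) := r(s,a) + gamma * max Q(s',a')."""
    rng = random.Random(seed)
    # Assumption: start from zeros rather than small random values.
    q = {(s, a): 0.0 for s in states for a in actions}
    for _ in range(episodes):          # stands in for "until Q converges"
        s = start
        for _ in range(max_steps):     # time limit per episode
            if is_terminal(s):
                break
            a = rng.choice(actions)    # assumption: random exploration
            s_next = step(s, a)
            # Terminal states are never updated, so their values stay at zero.
            best_next = max(q[(s_next, b)] for b in actions)
            q[(s, a)] = reward(s, a) + gamma * best_next
            s = s_next
    return q


# Hypothetical toy problem: walk along a chain 0..4, where state 4 is terminal.
STATES = [0, 1, 2, 3, 4]
ACTIONS = ["left", "right"]


def step(s, a):
    return min(s + 1, 4) if a == "right" else max(s - 1, 0)


def reward(s, a):
    return 1.0 if step(s, a) == 4 else 0.0


def is_terminal(s):
    return s == 4


q_table = q_learning(STATES, ACTIONS, step, reward, is_terminal, start=0)
```

Because the toy environment is deterministic, the update needs no learning rate. The learned values for moving right decay by a factor of γ per step away from the goal.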


RL Density Control Agent

State Space
1. The number of vehicles in the mainstream, N(t + 1) (Equation (1)).
2. The number of vehicles that entered the freeway from the ramp during the last time step, ∆N(t + 1).
3. The ramp traffic signal at the previous time step, Ts(t).

Action Space
The action space is modeled as consisting of only two actions: red and green.

Reward Function
The RLCA's goal is to keep the freeway density ρ around the critical density ρ_cr. The reward r is designed to depend on the current ρ and on how much it deviates from ρ_cr:

$$r(s, a) = \frac{1}{|\rho - \rho_{cr}|} \tag{4}$$
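The state, action, and reward definitions above might be represented as follows. This is a sketch under stated assumptions. The type and field names are hypothetical, and the small floor `eps` is an addition that keeps equation (4) finite when ρ equals ρ_cr exactly, a case the slides do not address.

```python
from typing import NamedTuple

# The two actions of the action space.
RED, GREEN = "red", "green"


class State(NamedTuple):
    """State of the control agent (field names are illustrative)."""
    n_mainstream: float  # N(t + 1): vehicles in the mainstream
    n_entered: float     # Delta N(t + 1): vehicles that entered from the ramp
    prev_signal: str     # Ts(t): ramp signal at the previous time step


def reward(rho, rho_cr, eps=1e-6):
    """Equation (4): r = 1 / |rho - rho_cr|.

    eps is an assumption added here to avoid division by zero.
    """
    return 1.0 / max(abs(rho - rho_cr), eps)


# Illustrative values: first case study densities, hypothetical vehicle counts.
s = State(n_mainstream=480.0, n_entered=3.0, prev_signal=GREEN)
r = reward(rho=40.0, rho_cr=37.5)
```

The reward grows as the density approaches ρ_cr from either side, so it is symmetric about the critical density.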


Experiments: Results and Analysis I

First Case Study

[Figure: network layout with six 1 km segments, origins O1 and O2, destination D1, sections S1 and S2, and the ramp metering point.]

This network consists of:
1. A mainstream freeway with two lanes and a metered on-ramp with one lane.
2. Two sources of inflow, O1 (mainstream) and O2 (on-ramp), and one discharge point D1.
3. Two sections, S1 and S2. S1 is 4 km long. S2 is 2 km long.

The network parameters are:
- The freeway capacity is 4000 veh/h. The on-ramp capacity is 2000 veh/h.
- Vff = 60 km/h.
- ρ(0) = 40 veh/km/lane, ρjam = 180 veh/km/lane, ρcr = 37.5 veh/km/lane.
- ∆X = 6 km.


Experiments: Results and Analysis II

[Figure: demand scenario. Upstream flow and ramp flow q versus simulation step (1 min).]

The demand scenario is as follows. The mainstream demand remains constant for 2 hours at a high level (3500 veh/h), near the capacity, and finally drops to a low level (1000 veh/h) in the last 15 minutes.

The on-ramp demand begins at a low level (500 veh/h), increases over 7.5 minutes to a high level (1500 veh/h) near the capacity, remains constant for 15 minutes, drops back to the low level (500 veh/h) over 7.5 minutes, and then remains constant for 2 hours.
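The on-ramp demand described above can be written as a piecewise-linear profile. This sketch assumes that the rise starts at t = 0 and that the transitions are linear. Neither is stated explicitly in the slides, and all names are illustrative. The mainstream profile is left out because the slide does not pin down when its final drop begins.

```python
def piecewise_linear(t, points):
    """Linearly interpolate (time, value) breakpoints, clamping outside the range."""
    if t <= points[0][0]:
        return points[0][1]
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return points[-1][1]


# Breakpoints as (minutes, veh/h): rise over 7.5 min, hold for 15 min,
# fall over 7.5 min, then hold at the low level for 2 hours.
ON_RAMP_POINTS = [
    (0.0, 500.0),
    (7.5, 1500.0),
    (22.5, 1500.0),
    (30.0, 500.0),
    (150.0, 500.0),
]


def on_ramp_demand(t_min):
    """On-ramp demand (veh/h) at time t_min (minutes since the start)."""
    return piecewise_linear(t_min, ON_RAMP_POINTS)
```

The breakpoints cover 150 minutes in total, which matches the 0 to 150 range of the 1-minute simulation steps on the demand plot.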


Experiments: Results and Analysis III

[Figure: first case study results, three panels. (a) Flow (veh/h), qt and qd, without control and with smart control, versus simulation step (1 min). (b) Density of mainline (veh/km), without control and with smart control, versus simulation step (1 min). (c) Signal state (1 means green, 2 means red) versus simulation step (10 s).]


Experiments: Results and Analysis IV

Second Case Study

[Figure: network layout with a single section of ∆X = 500 m, origins O1 and O2, destination D1, and the ramp metering point.]

This network consists of:
1. A mainstream freeway 500 m long and one metered on-ramp.
2. Two sources of inflow, O1 (mainstream) and O2 (on-ramp), and one source of outflow D1.
3. The freeway has 3 lanes and the ramp has only one lane.

The network parameters are:
- Vff = 80 km/h.
- ρ(0) = 30 veh/km/lane, ρjam = 110 veh/km/lane, ρcr = 55 veh/km/lane.
- ∆X = 500 m.


Experiments: Results and Analysis V

[Figure: second case study results, three panels. (a) Traffic inflow (veh/h), upstream flow and ramp flow, versus simulation step (10 s). (b) Density of mainline (veh/km), without control and with smart control, versus simulation step (10 s). (c) Signal state (1 means green, 2 means red) versus simulation step (10 s).]

Conclusion and Future Direction

A reinforcement-learning-based density control agent (RLCA) is proposed. The RLCA's objective is to optimize the density of the freeway mainstream in order to maximize the flow and minimize the total travel time (TTT). The RLCA is tested on two case studies with different network architectures and demands.

In the first case study (the dense network), the proposed RLCA achieves good performance in terms of keeping the flow close to the freeway capacity.

In the second case study (the light network), the optimal control sequence of actions is always green. This case study illustrates that the RLCA selects the optimal action.

Future directions:
- Extend this work to more complicated networks that contain several ramps.
- Incorporate dynamic speed limits into the model.
- Adapt to changing road conditions, such as the sudden occurrence of accidents and weather hazards.


Thank you. Questions?
