Conference Papers - Year 2020

W. Joe and H. C. Lau. Deep Reinforcement Learning Approach to Solve Dynamic Vehicle Routing Problem with Stochastic Customers. In Proc. 30th International Conference on Automated Planning and Scheduling (ICAPS 2020), Nice, France, June 2020.

In real-world urban logistics operations, changes to routes and tasks occur in response to dynamic events. To ensure customers’ demands are met, planners need to make these changes quickly (sometimes instantaneously). This paper formulates a dynamic vehicle routing problem with time windows and both known and stochastic customers as a route-based Markov Decision Process. We propose a solution approach, called DRLSA, that combines Deep Reinforcement Learning (specifically neural network-based Temporal-Difference learning with experience replay) to approximate the value function with a routing heuristic based on Simulated Annealing. Our approach enables optimized re-routing decisions to be generated almost instantaneously. Furthermore, to exploit the structure of this problem, we propose a state representation based on the total cost of the remaining routes of the vehicles. We show that the cost of the remaining routes can serve as a proxy for the sequence of the routes and the time window requirements. DRLSA is evaluated against the commonly used Approximate Value Iteration (AVI) and Multiple Scenario Approach (MSA). Our experimental results show that DRLSA achieves, on average, a 10% improvement over the myopic approach and outperforms AVI and MSA, even with a small number of training episodes, on problems with a degree of dynamism above 0.5.
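
As a rough illustration of the Temporal-Difference learning with experience replay mentioned in this abstract, the following minimal sketch trains a value function on a scalar state standing in for the total cost of the remaining routes. It is not the authors' implementation: the paper uses a neural network, whereas this sketch swaps in a linear approximator to stay dependency-free, and every name here (`LinearValue`, `td_replay_train`, the transition tuple layout, the hyperparameters) is a hypothetical choice made for the example.

```python
import random
from collections import deque


class LinearValue:
    """Linear stand-in for a neural value network: V(s) = w * s + b."""

    def __init__(self):
        self.w = 0.0
        self.b = 0.0

    def predict(self, state):
        return self.w * state + self.b

    def update(self, state, target, lr):
        # One semi-gradient step on the squared TD error 0.5 * (target - V(s))^2.
        error = target - self.predict(state)
        self.w += lr * error * state
        self.b += lr * error
        return error


def td_replay_train(transitions, gamma=0.9, lr=0.01, batch_size=8,
                    sweeps=2000, seed=0):
    """Fit V with TD(0) updates on mini-batches drawn from a replay buffer.

    Each transition is (state, reward, next_state, done), where state is an
    assumed scalar feature such as the total remaining route cost.
    """
    rng = random.Random(seed)
    buffer = deque(transitions, maxlen=10_000)  # bounded replay memory
    value = LinearValue()
    for _ in range(sweeps):
        # Sample past experience uniformly instead of learning in arrival order.
        batch = rng.sample(list(buffer), min(batch_size, len(buffer)))
        for state, reward, next_state, done in batch:
            # Bootstrapped TD target; terminal states contribute no future value.
            target = reward if done else reward + gamma * value.predict(next_state)
            value.update(state, target, lr)
    return value
```

For example, calling `td_replay_train` on a two-step toy episode such as `[(1.0, -0.5, 0.5, False), (0.5, -0.5, 0.0, True)]` with `gamma=1.0` and `lr=0.1` fits a value estimate for each remaining-cost level. In a full system, such an estimate would be queried by the routing heuristic when comparing candidate re-routing decisions.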

A. Singh, A. Kumar and H. C. Lau. Hierarchical Multiagent Reinforcement Learning for Maritime Traffic Management. In Proc. 19th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2020), Auckland, New Zealand, May 2020.

Increasing global maritime traffic, coupled with rapid digitization and automation in shipping, mandates the development of next-generation maritime traffic management systems to mitigate congestion, increase safety of navigation, and avoid collisions in busy and geographically constrained ports (such as Singapore’s). To achieve these objectives, we model the maritime traffic as a large multiagent system with individual vessels as agents and the VTS (Vessel Traffic Service) authority as a regulatory agent. We develop a hierarchical reinforcement learning approach in which vessels first select a high-level action based on the underlying traffic flow, and then select the low-level action that determines their future speed. We exploit the nature of collective interactions among agents to develop a policy gradient approach that can scale up to large real-world problems. We also develop an effective multiagent credit assignment scheme that significantly improves the convergence of policy gradient. Extensive empirical results on synthetic and real-world data from one of the busiest ports in the world show that our approach consistently performs significantly better than the previous best approach.
