Asynchronous Deep Q-Network

Implemented an asynchronous DQN in which multiple actor workers interact with parallel environment instances and push transitions to a shared replay buffer, while a separate learner thread performs Q-learning updates concurrently. Decoupling environment stepping from gradient computation hides simulator latency and keeps GPU utilisation high, giving substantially higher throughput than synchronous DQN on matched hardware. Validated on a 4-way intersection navigation scenario in the CARLA simulator: trained a DQN-based acceleration controller that consumes a bird’s-eye view of the intersection and selects longitudinal actions to safely negotiate cross-traffic. Training converged in fewer hours than the synchronous baseline.
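The actor/learner split described above can be illustrated with a minimal sketch. It is not the project's code, and several pieces are assumptions made for illustration: `ToyChain` is a hypothetical 5-state environment standing in for CARLA, a tabular Q-table stands in for the deep network, and the names `ReplayBuffer`, `actor`, `learner`, and `train` are invented here. Only the structure is meant to carry over: actors step their own environments and push transitions into a shared buffer while one learner samples from it and applies Q-learning updates.

```python
import random
import threading
import time
from collections import deque


class ReplayBuffer:
    """Thread-safe FIFO buffer shared by all actors and the learner."""

    def __init__(self, capacity):
        self._data = deque(maxlen=capacity)
        self._lock = threading.Lock()

    def push(self, transition):
        with self._lock:
            self._data.append(transition)

    def sample(self, batch_size):
        with self._lock:
            return random.sample(list(self._data), batch_size)

    def __len__(self):
        with self._lock:
            return len(self._data)


class ToyChain:
    """Hypothetical stand-in environment: action 1 moves right, action 0 moves
    left, and reaching the rightmost state ends the episode with reward 1."""
    N_STATES, N_ACTIONS = 5, 2

    def reset(self):
        self.s = 0
        return self.s

    def step(self, a):
        self.s = min(self.s + 1, self.N_STATES - 1) if a == 1 else max(self.s - 1, 0)
        done = self.s == self.N_STATES - 1
        return self.s, (1.0 if done else 0.0), done


def actor(q, buffer, stop, epsilon, seed):
    """Step a private environment with an epsilon-greedy policy and push
    transitions. Q is read without a lock, so values may be slightly stale."""
    rng, env = random.Random(seed), ToyChain()
    s = env.reset()
    while not stop.is_set():
        if rng.random() < epsilon:
            a = rng.randrange(env.N_ACTIONS)
        else:
            best = max(q[s])  # break ties randomly among the best actions
            a = rng.choice([i for i, v in enumerate(q[s]) if v == best])
        s2, r, done = env.step(a)
        buffer.push((s, a, r, s2, done))
        s = env.reset() if done else s2


def learner(q, buffer, stop, updates, batch_size=32, min_buffer=500, lr=0.1, gamma=0.9):
    """Apply Q-learning updates to sampled batches, then signal actors to stop."""
    try:
        while len(buffer) < min_buffer:
            time.sleep(0.001)  # warm-up: wait for actors to fill the buffer
        for _ in range(updates):
            for s, a, r, s2, done in buffer.sample(batch_size):
                target = r if done else r + gamma * max(q[s2])
                q[s][a] += lr * (target - q[s][a])
    finally:
        stop.set()


def train(n_actors=4, updates=2000):
    q = [[0.0] * ToyChain.N_ACTIONS for _ in range(ToyChain.N_STATES)]
    buffer, stop = ReplayBuffer(capacity=10_000), threading.Event()
    threads = [threading.Thread(target=actor, args=(q, buffer, stop, 0.5, i), daemon=True)
               for i in range(n_actors)]
    threads.append(threading.Thread(target=learner, args=(q, buffer, stop, updates)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return q


if __name__ == "__main__":
    for state, values in enumerate(train()):
        print(state, [round(v, 3) for v in values])
```

Because Q-learning is off-policy, the learner can train on transitions gathered under an older copy of the Q-values, which is what makes this decoupling workable. In CPython, threads share the interpreter lock, so this sketch shows the control flow rather than a real speed-up; the gain in practice comes when actors spend their time waiting on a simulator and the learner's work runs on a GPU.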

Dixant Mittal

My research interests include reinforcement learning, planning & search, large language models, and decision-making under uncertainty.