Learning to Search from Demonstration Sequences

Dixant Mittal, Liwei Kang, Wee Sun Lee

April 2025

Abstract

We study the problem of learning to search from demonstration sequences. Online search algorithms, such as Monte Carlo Tree Search (MCTS), iteratively simulate trajectories and update action-values to find good actions at decision time. Designing the right search heuristics to guide these simulations is challenging and typically requires substantial domain knowledge. In this work, we propose to learn the search heuristics directly from expert demonstration sequences, enabling the search algorithm to focus on the most promising parts of the search space without hand-crafted domain-specific knowledge. We show that our approach achieves strong performance on planning benchmarks including Sokoban and grid-world navigation, consistently outperforming baselines that do not leverage demonstration data.

Type

Conference paper

Publication

In International Conference on Learning Representations