Epistemically-guided forward-backward exploration

09 August 2025

Zero-shot RL methods such as forward-backward (FB) representations have so far been studied separately from the exploration problem. We design exploration policies that arise naturally from the FB representation: they collect data so as to minimize the representation's posterior variance, and hence its epistemic uncertainty. We show that this considerably improves sample complexity over other exploration methods.
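
To make the idea concrete, here is a minimal sketch of uncertainty-driven action selection on top of an FB-style value estimate. It is an illustration under stated assumptions, not the method from the paper: it assumes the posterior is approximated by a small ensemble of forward embeddings, uses the standard FB value estimate Q(s, a, z) = F(s, a, z)ᵀz, and treats the variance across ensemble members as a proxy for epistemic uncertainty. The names `F_ensemble`, `explore_action`, and `greedy_action` are hypothetical.

```python
import numpy as np


def q_values(F_ensemble: np.ndarray, z: np.ndarray) -> np.ndarray:
    """Per-member value estimates Q_k(s, a, z) = F_k(s, a, z)^T z.

    F_ensemble: shape (K, A, d), forward embeddings of A actions at one
                state for K ensemble members (the ensemble is an assumption).
    z:          shape (d,), the task/policy embedding.
    Returns an array of shape (K, A).
    """
    return F_ensemble @ z


def epistemic_variance(F_ensemble: np.ndarray, z: np.ndarray) -> np.ndarray:
    """Variance of the value estimate across ensemble members, per action."""
    return q_values(F_ensemble, z).var(axis=0)


def explore_action(F_ensemble: np.ndarray, z: np.ndarray) -> int:
    """Pick the action the ensemble disagrees on most.

    Collecting data where disagreement is highest is a common heuristic
    for shrinking the uncertainty of the learned representation.
    """
    return int(np.argmax(epistemic_variance(F_ensemble, z)))


def greedy_action(F_ensemble: np.ndarray, z: np.ndarray) -> int:
    """Pick the action with the highest ensemble-mean value (no exploration)."""
    return int(np.argmax(q_values(F_ensemble, z).mean(axis=0)))


if __name__ == "__main__":
    # Toy example: 3 ensemble members, 3 actions, 2-dimensional embeddings.
    F = np.zeros((3, 3, 2))
    F[:, 0] = [1.0, 0.0]            # all members agree, high value
    F[:, 1] = [0.5, 0.0]            # all members agree, lower value
    F[:, 2, 0] = [-1.0, 0.0, 1.0]   # members disagree about action 2
    z = np.array([1.0, 0.0])
    print("greedy:", greedy_action(F, z), "explore:", explore_action(F, z))
```

In the toy example the greedy policy prefers the action every member already agrees on, while the exploratory policy goes to the action whose value is still uncertain; that contrast is the point of the sketch.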