Energy-based models (EBMs) are powerful yet conceptually simple generative models used across scientific fields, including biology, neuroscience, and signal processing. These models encode statistical information in the effective Hamiltonian of a Boltzmann distribution, making them amenable to analysis with the methods of statistical physics. However, training and sampling from EBMs on highly structured (clustered) datasets pose significant challenges: the evolving probability landscape becomes increasingly difficult to explore, leading to slow convergence of Monte Carlo simulations.
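Concretely (in standard notation, with the inverse temperature absorbed into the effective Hamiltonian; the symbols below are generic, not taken from the abstract), an EBM with parameters theta assigns each configuration x a Boltzmann probability, and the intractable normalization Z_theta is precisely what makes both sampling and likelihood evaluation hard:

```latex
p_\theta(x) = \frac{e^{-H_\theta(x)}}{Z_\theta},
\qquad
Z_\theta = \sum_x e^{-H_\theta(x)} .
```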
Traditional techniques such as parallel tempering often fail to address this issue, both because of their high computational cost and because they cannot overcome the first-order phase transitions that emerge in clustered datasets as the temperature is lowered. In this talk, I will introduce the “trajectory annealing” paradigm, using the Restricted Boltzmann Machine (RBM) as a representative EBM. This approach significantly reduces the mixing time of Monte Carlo simulations by leveraging the model’s training trajectory.
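To make the idea concrete, here is a minimal Python sketch, not the speaker's implementation: the checkpoint format, the sweep count, and all function names are illustrative assumptions. Instead of annealing in temperature, the Monte Carlo chains are carried along the sequence of parameters saved during training, so each model they face is only a small perturbation of the previous one:

```python
import numpy as np
from scipy.special import expit  # numerically stable sigmoid

rng = np.random.default_rng(0)

def rbm_gibbs_step(v, W, a, b, rng):
    """One block-Gibbs sweep of a binary RBM with energy
    H(v, h) = -a.v - b.h - v.W.h (standard RBM conventions)."""
    h = (rng.random((v.shape[0], W.shape[1])) < expit(v @ W + b)).astype(float)
    return (rng.random(v.shape) < expit(h @ W.T + a)).astype(float)

def trajectory_anneal(checkpoints, n_chains, n_visible, sweeps_per_model=10):
    """Carry a population of chains along the training trajectory:
    early models are nearly flat and mix quickly, and consecutive
    checkpoints differ only slightly, so the chains stay close to
    equilibrium all the way to the final model."""
    v = (rng.random((n_chains, n_visible)) < 0.5).astype(float)  # trivial start
    for W, a, b in checkpoints:                 # ordered by training time
        for _ in range(sweeps_per_model):
            v = rbm_gibbs_step(v, W, a, b, rng)
    return v  # approximate equilibrium samples from the final model
```

In contrast with parallel tempering, the interpolating distributions here come for free from training rather than from a separately tuned temperature ladder.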
First, I will demonstrate how trajectory annealing improves sampling efficiency by drastically reducing equilibration times while mitigating the limitations of parallel tempering. Second, I will show how this method enables online estimation of the model’s log-likelihood at minimal computational cost, providing an effective metric for evaluating training quality. This approach is not only more accurate but also substantially faster than Annealed Importance Sampling.
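The log-likelihood estimate can then piggyback on the same trajectory: in the spirit of AIS, but with the training checkpoints playing the role of the annealing ladder, each parameter update contributes one importance-sampling increment to a running estimate of log Z. The sketch below is a hedged reconstruction under the same assumptions as above (it reuses rbm_gibbs_step and rng from the previous snippet; free_energy and online_log_z are hypothetical names):

```python
from scipy.special import logsumexp

def free_energy(v, W, a, b):
    """RBM free energy F(v) with hidden units summed out analytically:
    p(v) = exp(-F(v)) / Z,  F(v) = -a.v - sum_j log(1 + exp(b_j + (vW)_j))."""
    return -(v @ a) - np.logaddexp(0.0, v @ W + b).sum(axis=1)

def online_log_z(checkpoints, v, log_z0, sweeps_per_update=1):
    """Running estimate of log Z along the trajectory: with chains v ~ p_t,
    log(Z_{t+1} / Z_t) ~= log mean_k exp(F_t(v_k) - F_{t+1}(v_k))."""
    log_z = log_z0   # exact for the initial model, e.g. (n_v + n_h) * log(2) at zero weights
    W, a, b = checkpoints[0]
    for W2, a2, b2 in checkpoints[1:]:
        delta = free_energy(v, W, a, b) - free_energy(v, W2, a2, b2)
        log_z += logsumexp(delta) - np.log(len(delta))   # stable log-mean-exp increment
        W, a, b = W2, a2, b2
        for _ in range(sweeps_per_update):
            v = rbm_gibbs_step(v, W, a, b, rng)          # re-equilibrate to the new model
    return log_z   # log-likelihood per datum: -free_energy(data, W, a, b).mean() - log_z
```

Because these increments can be accumulated during training anyway, the estimate comes at essentially no extra cost, unlike a separate AIS run with its own annealing schedule.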
Marco Baiesi