Irreversible local Markov chains with rapid convergence towards equilibrium Sebastian C. Kapfer1, ∗ and Werner Krauth2, 3, † Theoretische Physik 1, FAU Erlangen-N¨ urnberg, Staudtstr. 7, 91058 Erlangen, Germany 2 Laboratoire de Physique Statistique, D´epartement de physique de l’ENS, Ecole Normale Sup´erieure, PSL Research University, Universit´e Paris Diderot, Sorbonne Paris Cit´e, Sorbonne Universit´es, UPMC Univ. Paris 06, CNRS, 75005 Paris, France 3 Department of Physics, Graduate School of Science, The University of Tokyo, 7-3-1 Hongo, Bunkyo, Tokyo, Japan (Dated: May 19, 2017)
We study the continuous one-dimensional hard-sphere model and present irreversible local Markov chains that mix on faster time scales than the reversible heatbath or Metropolis algorithms. The mixing time scales appear to fall into two distinct universality classes, both faster than for reversible local Markov chains. The event-chain algorithm, the infinitesimal limit of one of these Markov chains, belongs to the class presenting the fastest decay. For the lattice-gas limit of the hard-sphere model, reversible local Markov chains correspond to the symmetric simple exclusion process (SEP) with periodic boundary conditions. The two universality classes for irreversible Markov chains are realized by the totally asymmetric simple exclusion process (TASEP), and by a faster variant (lifted TASEP) that we propose here. Lifted Markov chains and the recently introduced factorized Metropolis acceptance rule extend the irreversible Markov chains discussed here to general pair interactions and to higher dimensions.
versible ones, and that they fall into two universality classes. The first one mixes in O(N 5/2 ) steps, and is related to the totally asymmetric simple exclusion process (TASEP, see [17–19]). The other mixes in O(N 2 log N ) and comprises the event-chain algorithm (ECMC) [20] and a modified lifted TASEP that we propose in this work. We refer to it as the lifted TASEP class. The conceptual framework for our approach to irreversible Markov chains is provided by the lifting concept [21, 22] together with a factorized Metropolis filter [23], which considerably extend the range of applications for irre-
a)
c)
N
N 3 logN
N 5/2 N 2 logN
τmix
4 Var u
The hard-sphere model plays a central role in statistical mechanics. In three spatial dimensions (3D), the classical hard-sphere crystal melts in a first-order phase transition [1, 2], whereas 2D hard spheres undergo a sequence of two phase transitions that have been characterized only recently [3, 4]. Hard spheres have established paradigms for order-from-disorder phenomena driven by the depletion interaction [5, 6], and for two-dimensional melting with its dissociation of orientational and positional order [7, 8]. The dynamics of the hard-sphere model has also been the focus of great attention, from the first algorithmic implementation of Newtonian mechanics through event-driven molecular dynamics [9] and the discovery of algebraically decaying velocity autocorrelations [10] to insights into the glass transition [11] and granular materials [12], and from the first definition of Markovchain Monte Carlo dynamics [13] to rigorous proofs of convergence rates towards equilibrium in some special cases [14]. In 1D, the thermodynamics and the static correlation functions of finite hard-sphere systems can be computed exactly [6, 15]. Newtonian dynamics is pathological for equal sphere masses, because colliding spheres simply exchange their velocities without mixing them. Stochastic dynamics, however, may converge to equilibrium. For example, reversible heatbath dynamics mixes (that is, converges towards equilibrium from an arbitrary starting configuration) in at most ∼ N 3 log N individual steps for N spheres [16]. In the present work, we study irreversible local Markov chains for 1D hard-sphere systems that violate the detailed-balance condition yet still converge towards equilibrium. We show by numerical simulation that these irreversible Markov chains typically mix faster than re-
2 0 0.0
1010 0.1
0.2
t /N3
Heatbath
108
b)
Seq. Heatbath
4 Var u
arXiv:1705.06689v1 [cond-mat.stat-mech] 18 May 2017
1
Metropolis
2 0 0.00
Forward Metropolis
106
Lifted Metropolis Event-chain
0.02
0.04
t / (N 3 logN )
102
103
104
N
FIG. 1. Mixing of local 1D Markov chains. a) Relaxation of Var ui from the compact initial state under heatbath dynamics (x axis rescaled by N 3 , y axis by equilibrium value). b) Rescaling of x axis with an additional logarithm illustrates O(N 3 log N ) time scale. c) Mixing times for the Markov chains discussed in this work. The step ε is uniformly sampled from (0; 2.5`free ) (for the event-chain algorithm, τmix is measured in lifting moves).
2 versible Markov chains (see Ref. [24] for a review). Some evidence for reduced mixing time scales has already been obtained [25]. For concreteness, we restrict ourselves to N hard spheres of diameter d on a circle of length L, so that the free space is Lfree = L − N d, and the mean gap between spheres `free = Lfree /N . A configuration a is defined by ordered particle positions a = (. . . xi−1 , xi , xi+1 , . . . ) with gaps δi = xi − xi−1 − d ≥ 0, using appropriate periodic boundary conditions. The partition function is analytic for all densities in the thermodynamic limit, and no phase transition takes place [6, 15]. The model is isomorphic to N point particles on a circle of length Lfree with the same gap variables and an interaction V (δi ≤ 0) = ∞ and V (δi > 0) = 0 implementing both the non-overlap and the ordering constraint. Because the 1D hard-sphere system can be mapped onto a gas of free particles with mean gap `free , and because all parameters of our Markov chains scale with `free , their dynamics and in particular their mixing times do not depend on density. This is despite the fact that the spatial correlation length of the hard-sphere model diverges in the close-packing limit. We first consider the reversible heatbath algorithm, which moves at each time step t = 0, 1, . . . , a random sphere i to a random position between i − 1 and i + 1 (xi is uniformly sampled from (xi−1 + d, xi+1 − d)). When studying the mixing dynamics, we ignore trivial uniform rotations of the configuration (which only mix in O(N 4 ) [16]). We thus restrict our attention to quantities that can be expressed in terms of the δi and focus on the slow, large-scale density fluctuations. The reversible heatbath algorithm is known to mix in at most ∼ N 3 log N steps. We assume that the slowest timescale is exposed by tracking the distribution π(ui ) of any half-system distance ui = δi + δi+1 + . . . + δi+N/2 from a compact initial configuration at t = 0, where the variance of ui equals L2free /4, towards equilibrium at t ∼ τmix , where Var ui = L2free /(4N + 4). Our simulations indeed show, in agreement with the rigorous bounds [16], that O(N 3 ) steps are insufficient for mixing (see Fig. 1a), while τmix = O(N 3 log N ) steps suffice to mix (see Fig. 1b). Our numerical method thus reliably identifies logarithmic corrections to the polynomial scaling laws. Plots analogous to Fig. 1a,b obtain the mixing time scales for all Markov chains studied in the present work (see Fig. 1c). For the heatbath algorithm, a simple scaling argument for the discrete lowest-k Fourier mode of the sphere positions U = (u1 +u2 +· · ·+uN/2 )/N 3/2 yields the dominant O(N 3 ) behavior: In the limit N → ∞, the standard deviation of π(U ) is O(1), and one heatbath step changes U by O(1/N 3/2 ). A random walk in t yields: 1 √ N 3/2
τmix ∼ 1 =⇒ τmix ∼ N 3 .
(1)
The reversible Metropolis algorithm mixes on the same O(N 3 log N ) scale as the reversible heatbath algorithm
FIG. 2. Metropolis flow into configuration a by moves of sphere i. For given ε, Fi± (ε), R∓ ∈ {[1, 0], [0, 1]} i (ε) as a move by ε is either accepted or rejected. Flows here are integrated over the step distribution p(ε). The rela tion Fi+ (ε), R+ (ε) ∈ {[1, 0], [0, 1]} justifies the forward i−1 Metropolis algorithm.
(see Fig. 1c). At each step, it considers a move from a towards a configuration a ˜ with x ˜i = xi + σε, where σ = ±1 samples the forward or backward direction and ε > 0 samples the step from some distribution p(ε) on the scale `free . If a ˜ is invalid, the move a → a results. The reversible Metropolis algorithm satisfies detailed balance between a and any a ˜, π(a)p(a → a ˜) = π(˜ a)p(˜ a → a), with weights π(a) = 1 for valid configurations and π(a) = 0 for all others, simply because the moves a ˜ → a and a → a ˜ are equally likely. The flow into a has four components for each sphere iR (see Fig. 2), namely accepted forward flow Fi+ = dε p(ε)Fi+ (ε), corresponding to σ = +1 and analogously accepted backward flow Fi− for σ = −1. Rejected − forward and backward flows R+ i and Ri from a towards invalid configurations a ˜ also contribute to the flow into a. For given ε, these flows, Fi+ (ε) = Θ(δi − ε)
R+ i (ε) = Θ(ε − δi+1 )
Fi− (ε) = Θ(δi+1 − ε)
R− i (ε) = Θ(ε − δi ),
(2)
(with Θ the Heaviside step function) are either unity or zero. Moreover, they add up to unity in pairs: Fi+ (ε) + − + R− i (ε) = 1, Fi (ε) + Ri (ε) = 1, because each move is accepted (or rejected) under the same condition as its return move. It follows that for any distribution of ε, the sum of the four flows equals 2. Global balance requires the total flow into a configuration a to equal π(a) and this trivially follows for the reversible Metropolis algorithm from 1 X + − − Fi + R+ = 1. i + Fi + Ri 2N i | {z } 2
Here, the normalization 2N corresponds to the sampling choices of the direction σ = ±1 for each i. Moreover, global balance is satisfied even if only the sphere i is updated (the flow into the configuration, after the up− − date, is then 12 Fi+ + R+ + F + R = 1). Together i i i with irreducibility and aperiodicity, this proves the validity of the sequential Metropolis algorithm, the historically first irreversible Markov chain [13], where spheres are updated sequentially, say, in ascending order in i. The sequential Metropolis algorithm mixes faster than
a) TASEP
3 Local 1D Hard-sphere Markov chains Mixing Continuous Discrete time scale Heatbath [16], Metropolis Symm. SEP N 3 log N Forward & Lifted Metrop.a TASEP [17, 26] N 5/2 Event-chain, Lifted Metrop. Lifted TASEP N 2 log N
t+1
b) Lifted TASEP
t
a
t+1
t
FIG. 3. Update rules for discrete models. a) The TASEP advances a random sphere, if possible, and it mixes in O(N 5/2 ). b) The lifted TASEP, without restarts, is deterministic. With restarts, it mixes in O(N 2 log N ).
the reversible Metropolis algorithm, but with the same O(N 3 log N ) scaling. This indicates that the extra logarithm is not due to the fact that it takes O(N log N ) trials to move each sphere at least once (the coupon collector’s problem). The relation Fi+ (ε) + R− i (ε) = 1 from Eq. 2 can be expressed as Fi+ (ε) + R+ i−1 (ε) = 1. Summed over i and integrated over an arbitrary step distribution p(ε), 1 X + Fi + R+ i−1 = 1, N i | {z }
(3)
=1
this provides the incoming flow for the forward Metropolis algorithm, which thus satisfies global balance. The algorithm attempts at each time step a forward move (σ ≡ +1) sampled from the probability distribution p(ε), for a randomly sampled sphere i. (In contrast, the sequential version of the forward Metropolis algorithm is invalid, as the incoming flow after the update of a sphere i, Fi+ + R+ i , in general differs from unity.) Remarkably, we find that the forward Metropolis algorithm mixes on a time scale O(N 5/2 ), that is, faster than the reversible local Markov chains (see Fig. 1c). The same O(N 5/2 ) time scale governs the relaxation of the totally asymmetric simple exclusion process (TASEP), a lattice transport model which converges to equilibrium under periodic boundary conditions. For the TASEP, the spectral gap can be computed explicitly [17, 26]. An individual TASEP step attempts to move a randomlysampled sphere one site to the right (Fig. 3a). Indeed, the TASEP agrees with the forward Metropolis algorithm restricted to integral positions and steps. By tracking the lattice equivalent of Var ui , we indeed find O(N 5/2 ) mixing for the TASEP, while a symmetric SEP, itself the lattice version of the Metropolis algorithm, mixes in O(N 3 log N ). The relation Fi+ (ε) + R+ i−1 (ε) = 1, without summing over i (see Eq. 3), corresponds to the total incoming flow of a special sequential Markov chain, the lifted Metropolis algorithm. In contrast to the forward Metropolis algorithm, the active sphere i at time t + 1 is determined from the outcome at time t (see Fig. 3b): If sphere i
without restarts
TABLE I. Mixing time scales for local 1D hard-sphere algorithms on the continuum and on the lattice. The Markov chains on the lowest row all incorporate restarts.
can move, it remains active for the next step. If it cannot, a lifting move i → i + 1 takes place instead and i + 1 becomes the active sphere. The incoming flow to a configuration (a, i), integrated over the step distribution p(ε), is Fi+ + R+ i−1 = 1. Global balance again holds. We find that the lifted Metropolis algorithm, run as an uninterrupted Markov chain without restarts (see below) mixes in O(N 5/2 ) steps (see Fig. 1c). It thus belongs to the TASEP universality class. It is advantageous to restart the lifted Metropolis algorithm, after O(N ) steps, by resampling the active sphere i. We then observe mixing on a time scale O(N 2 log N ), much faster than all previous Markov chains (see Fig. 1c). The ∼ N 2 time scale is again brought out by a scaling argument for the discrete Fourier mode U : One chain (sequence of O(N ) moves between restarts) of the lifted Metropolis algorithm can change U by O(1/N 1/2 ). This change is of random sign. A random walk in t/N then yields τmix ∼ N 2 . The lifted Metropolis algorithm also has a discrete counterpart in the lifted TASEP (see Fig. 3b): A single sphere (occupied lattice site) is active and attempts to advance in forward direction. The sphere remains active if its move is accepted. Otherwise, the lifting index advances to the right-hand neighbor site. The lifted TASEP satisfies global balance, but without restarts fails to be irreducible (it cannot separate adjacent holes, for example). With restarts every O(N ) steps, the lifted TASEP also mixes on a O(N 2 log N ) time scale. The infinitesimal limit of the lifted Metropolis algorithm, ε → 0, is the event-chain algorithm (ECMC), which also scales in O(N 2 log N ) lifting moves (see Fig. 1c). It thus also belongs to the lifted TASEP universality class. A synopsis of our results is shown in Table I. The lifted TASEP universality class is particularly intriguing, as with O(N 2 log N ) mixing each sphere only moves O(N log N ) times to reach equilibrium. This almost saturates a lower limit for local algorithms with O(`free ) moves, as O(N ) moves per sphere are required to detach from the compact initial state. The irreversible Markov chains presented here are best viewed from their deterministic roots, both on the lattice and in the continuum. Indeed, the lifted TASEP without restarts is a deterministic lattice-gas automaton satisfying global balance, rather than a Markov chain
4 (see Fig. 3b). Irreducibility and aperiodicity call for an element of randomness that can most easily be imported into the algorithms by restarts, that is, by resampling the active sphere i with probability pres. . Here, pres. ∼ 1/N (lifted TASEP) and pres. ∼ 1 (TASEP) represent different universality classes. In the continuum, the deterministic root is an algorithm without restarts and invariant step ε, which satisfies global balance. All presented Metropolistype Markov chains may be obtained from this root by resampling of i or ε. Algorithms belong to different universality classes, depending on the resampling rate: Resampling ε at every step leads to the lifted Metropolis algorithm without restarts which is in the TASEP class. The additional resampling of i with rate ∼ 1/N yields the lifted Metropolis algorithm with restarts which is in the lifted TASEP class. More frequent resampling of i, with rate ∼ 1, yields the forward Metropolis algorithm (TASEP class). A particular limit of the continuum root algorithm is ECMC without restarts. In 1D, this algorithm agrees with Newtonian dynamics with a single sphere of non-zero velocity, and never mixes. Resampling i with rate ∼ 1/N (corresponding to an occasional restart of Newtonian dynamics) turns the deterministic dynamics into a very fast local algorithm, the ECMC. ECMC is in the lifted TASEP class, and it mixes in O(N 2 log N ) lifting moves (see Fig. 1c). The infinitesimal displacements of ECMC are crucial for its generalization to higher dimensions, where particles simultaneously interact with several others. As in the present work, this generalization calls for modifications of the Metropolis algorithm [23, 27, 28]. ECMC has been applied successfully (see Ref. [24] for a review), but prior to the present work, its mixing behavior was not characterized in detail, beyond some partial evidence for faster mixing time scales [25]. In conclusion, it will be important to clarify whether the different universality classes identified in the present work carry over to higher dimensions and to general interactions, as well as to partially asymmetric step distributions. Furthermore, the propagation of excitations will have to be understood, in particular as the fast mixing necessarily involves the asymmetric motion of density shocks. Exact solutions of some of the models in Table I may be possible. This would help establish the paradigms of lifting and of irreversible Markov chains beyond the single-particle level [22], and provide a conceptual framework for Markov chains beyond the detailedbalance paradigm introduced by Metropolis et al. in 1953 [13].
ACKNOWLEDGEMENT
∗ †
[1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13]
[14]
[15] [16]
[17] [18] [19] [20] [21]
[22] [23] [24] [25] [26] [27] [28]
SCK thanks the Institut Philippe Meyer for support.
[email protected] [email protected] J. G. Kirkwood and E. Monroe, J. Chem. Phys. 9, 514 (1941). W. G. Hoover and F. H. Ree, J. Chem. Phys. 49, 3609 (1968). B. J. Alder and T. E. Wainwright, Phys. Rev. 127, 359 (1962). E. P. Bernard and W. Krauth, Phys. Rev. Lett. 107, 155704 (2011). S. Asakura and F. Oosawa, J. Chem. Phys. 22, 1255 (1954). W. Krauth, Statistical Mechanics: Algorithms and Computations (Oxford University Press, 2006). B. I. Halperin and D. R. Nelson, Phys. Rev. Lett. 41, 121 (1978). A. P. Young, Phys. Rev. B 19, 1855 (1979). B. J. Alder and T. E. Wainwright, J. Chem. Phys. 27, 1208 (1957). B. J. Alder and T. E. Wainwright, Phys. Rev. A 1, 18 (1970). G. Parisi and F. Zamponi, Rev. Mod. Phys. 82, 789 (2010). S. Torquato and F. H. Stillinger, Rev. Mod. Phys. 82, 2633 (2010). N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, J. Chem. Phys. 21, 1087 (1953). R. Kannan, M. W. Mahoney, and R. Montenegro, in Proc. 14th annual ISAAC , Lecture Notes in Computer Science (Springer, Berlin, Heidelberg, 2003) pp. 663–675. L. Tonks, Phys. Rev. 50, 955 (1936). D. Randall and P. Winkler, “Mixing points on a circle,” in Approximation, Randomization and Combinatorial Optimization, Lecture Notes in Computer Science, Vol. 3624, edited by C. Chekuri et al. (Springer, Berlin, Heidelberg, 2005). L.-H. Gwa and H. Spohn, Phys. Rev. Lett. 68, 725 (1992). T. Chou, K. Mallick, and R. K. P. Zia, Rep. Prog. Phys. 74, 116601 (2011). J. Baik and Z. Liu, J. Stat. Phys. 165, 1051 (2016). E. P. Bernard, W. Krauth, and D. B. Wilson, Phys. Rev. E 80, 056704 (2009). F. Chen, L. Lov´ asz, and I. Pak, Proceedings of the 17th Annual ACM Symposium on Theory of Computing , 275 (1999). P. Diaconis, S. Holmes, and R. M. Neal, Annals of Applied Probability 10, 726 (2000). M. Michel, S. C. Kapfer, and W. Krauth, J. Chem. Phys. 140, 054116 (2014). S. C. Kapfer and W. Krauth, To appear in EPJ B, 2017. Y. Nishikawa, M. Michel, W. Krauth, and K. Hukushima, Phys. Rev. E 92, 063306 (2015). D. Dhar, Phase Transitions 9, 51 (1987). E. A. J. F. Peters and G. de With, Phys. Rev. E 85, 026703 (2012). S. C. Kapfer and W. Krauth, Phys. Rev. E 94, 031302 (2016).