Engines for predictive work extraction from memoryful quantum stochastic processes

Quantum information-processing techniques enable work extraction from a system's inherently quantum features, in addition to the classical free energy it contains. Meanwhile, the science of computational mechanics affords tools for the predictive modeling of non-Markovian classical and quantum stochastic processes. We combine tools from these two sciences to develop a technique for predictive work extraction from non-Markovian stochastic processes with quantum outputs. We demonstrate that this technique can extract more work than non-predictive quantum work extraction protocols, on one hand, and predictive work extraction without quantum information processing, on the other. We discover a phase transition in the efficacy of memory for work extraction from quantum processes, which is without classical precedent. Our work opens up the prospect of machines that harness environmental free energy in an essentially quantum, essentially time-varying form.


Introduction
Ruo Cheng Huang: ruocheng001@e.ntu.edu.sg · Paul M. Riechers: pmriechers@gmail.com · Mile Gu: mgu@quantumcomplexity.org · Varun Narasimhachar: varun.achar@gmail.com

In the earliest heat engines, a combustible fuel was burned to maintain a temperature gradient between hot and cold heat reservoirs. The second law of thermodynamics holds that no engine can sustainably function with a single reservoir [1,2,3,4]. While thought experiments such as Maxwell's demon and Szilard's engine initially appear to defy this law [5], a more complete understanding of thermodynamics resolved the apparent paradox: the resource powering the engine need not be a temperature gradient, but may be any form of free energy-even information [6,7,8,9,10]. The emerging field of quantum thermodynamics has continued to expand the scope of "fuel" to increasingly general forms of free energy [11,12]. There has been both theoretical and experimental advancement in constructing engines that can harness the free energy locked up in quantum coherence, over and above classical free energy [13,14,15,16,17].
The story does not stop there-in addition to static fuel, there is also a dynamical fuel-like resource embodied by complex thermodynamic processes. The framework of computational mechanics in complexity science offers powerful techniques for the characterization and manipulation of stochastic processes. The future behaviour of such a process in general cannot be known perfectly using data from its past. Nevertheless, temporal correlations, i.e., patterns in a process's behaviour over time, enable prediction. These correlations may even be non-Markovian, whereby the future of a process depends not only on its present, but also on its distant past. Epsilon machines and their quantum extensions [18,19,20] perform memory-optimal predictive modelling of stochastic processes.

Our contributions
Here, we develop the theoretical prototype for a predictive quantum engine: a machine that charges a battery by feeding on a multipartite quantum system whose parts are temporally correlated via a classical stochastic process. In other words, the engine's fuel is a classical stochastic process with quantum outputs. It can extract free energy beyond what is accessible to current quantum engines or classical predictive engines. We present a systematic construction of such an engine for arbitrary classical processes and quantum output states. We illustrate its application on example stochastic processes of correlated non-orthogonal qubits (Fig. 1). We also use this test case to benchmark the performance of our engine against various alternatives, including one without coherent quantum information processing, and one without predictive functionality. Our predictive quantum engine outperforms these alternatives in terms of work output. We show that parametrized processes of correlated non-orthogonal quantum outputs exhibit phase boundaries between parametric regions where memory of past observations can and cannot enhance the work yield-despite the apparently smooth change of memoryful correlations in the process across this boundary. The sudden lack of memory advantage is thus fundamentally thermodynamic (since prediction per se is less constrained than prediction during work extraction) and fundamentally quantum (since classical engines can exploit all the process's inherent memory). Finally, we generalize the Information Processing Second Law (IPSL) to the quantum regime and derive the fundamental bounds on a quantum pattern engine's performance.
We begin by introducing stochastic processes as a source of quantum states that will fuel our engine in Section 1.2. Section 2 shows how general interactions with the correlated quantum states induce belief states that enable optimal prediction and work extraction. We then illustrate how a theoretical pattern engine can be constructed in Section 3. The performance of the engine on an example process is then evaluated in Section 4. This leads to the discovery and explanation of the aforementioned phase transition in the efficacy of memory. The fundamental limit of a pattern engine is then derived in Section 5, and finally we conclude in Section 6.

What fuels our engine
Typically in quantum information theory, sources of quantum states are assumed to be memoryless, producing independent and identically distributed (IID) quantum states at each discrete time [22,23]. On the contrary, nature abounds with rich dynamics. To go beyond the IID paradigm, we consider general finite-state sources of quantum states, which can create highly nontrivial correlations across time. Some simple examples are depicted in Fig. 1. Temporally patterned quantum outputs provide fuel for any entity capable of predicting them. Appendix A shows that any change in total correlation among outputs, as quantified by quantum relative entropy, changes nonequilibrium free energy proportionally. Correlations are therefore a source of free energy.
These memoryful sources of quantum states can be represented by a hidden Markov model (HMM) M = (S, {σ^{(x)}}_{x∈X}, {T^{(x)}}_{x∈X}). Here, S is the HMM's set of classical latent states. The random variable representing the latent state at time t shall be denoted by S_t. The transition-matrix element T^{(x)}_{s,s'} = Pr(S_t = s', X_t = x | S_{t-1} = s) represents the probability of transitioning from latent state s to s' and emitting the d-dimensional quantum state σ^{(x)}.
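Such an HMM source can be sketched in a few lines of code. This is a minimal illustration, not the paper's construction: the two-state machine, the bias p, and the output labels below are assumed placeholders, chosen only so that the labeled transition matrices sum to a row-stochastic latent-state dynamic.

```python
import numpy as np

# Illustrative two-state HMM with labeled transition matrices:
# T[x][s, s2] = Pr(S_t = s2, X_t = x | S_{t-1} = s). Parameter p is assumed.
p = 0.8
T = {
    0: np.array([[p, 0.0], [0.0, 1 - p]]),
    1: np.array([[0.0, 1 - p], [p, 0.0]]),
}

# Consistency check: summing over output labels x must give a
# row-stochastic transition matrix over latent states.
T_total = sum(T.values())
assert np.allclose(T_total.sum(axis=1), 1.0)

rng = np.random.default_rng(0)

def sample(n, s=0):
    """Walk the latent states for n steps, emitting one symbol per step."""
    out = []
    for _ in range(n):
        # Joint probabilities over (symbol x, next state s2) from state s.
        probs = np.array([[T[x][s, s2] for s2 in range(2)] for x in range(2)])
        k = rng.choice(4, p=probs.ravel())
        x, s = divmod(k, 2)
        out.append(x)
    return out

print(sample(10))
```

Each emitted label x would index the corresponding quantum output σ^{(x)}; the classical string alone already carries the memoryful statistics.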
For simple HMMs, the memoryful transition structure can be visualized as an annotated directed graph, as in Fig. 1. In this graphical representation, nodes correspond to latent states while the directed edges correspond to latent-state-to-state transitions that produce a certain quantum output with a prescribed probability.
More complicated HMMs can generate more complex memoryful structure in the quantum-output process. See Fig. 1(b) for a hint of this possible richness.
The HMM specifies the statistics of the non-Markovian classical variables X_t across time, which, in turn, induce the quantum outputs indexed by x ∈ X. The quantum output process is described by the (formal) density operator

ρ_{\overleftrightarrow{A}} = \sum_{\overleftrightarrow{x}} \Pr(\overleftrightarrow{X} = \overleftrightarrow{x}) \bigotimes_t \sigma^{(x_t)}_{A_t} ,

where each time step t is associated with a unique elementary physical system A_t, and \overleftrightarrow{x} = ...x_{-1} x_0 x_1... denotes a bi-infinite string over X. The joint quantum state is separable amongst the A_t's, but can have non-classical correlations in the form of discord [24]. These memoryful quantum sources generalize the kindred 'classically controlled qubit sources' of Ref. [25].
The time-indexed sequence of stochastically generated quantum outputs σ_{A_t} of such a process acts as a "fuel tape" that is fed to the engine (Fig. 2) one A_t at a time, in temporal order. If the quantum states could all be stored for a later time, then it would be straightforward to extract all nonequilibrium addition to free energy from the correlated quantum tape, by acting on the full sequence all at once [13], since the multipartite system could then be treated as a single quantum system. Rather, we consider a setting where each elementary physical system has an immediate expiration date-the engine can interact with A_t at the present time t, and then never again. This time-locality restriction is shared with the information-ratchet [8,21] and repeated-interaction [23] frameworks. The challenge here is to extract maximal free energy when forced to act only locally and sequentially on the correlated quantum pattern.
Except for being constrained by a stationary finite-state source, the fuel process can be arbitrary in its alphabet size, statistics, and quantum outputs' finite dimensionality and form-σ^{(x)} could be a pure or mixed state over any number of quantum degrees of freedom. Notice that the form of the set {σ^{(x)}}_{x∈X} does not vary with time, hence the lack of a t index. To simplify notation, we will omit the system label A_t unless required.
We assume that the source of the fuel tape is known exactly. This entails complete knowledge of the underlying classical statistics and of the indexed set of quantum states {σ^{(x)}}_{x∈X}, but not of which specific string \overleftrightarrow{x} is instantiated.

Synchronizing to a quantum source
To fully leverage the structure of the pattern, the engine must dynamically incorporate information from past interactions with the elementary quantum systems, so that the engine's memory becomes correlated with the latent state of the source.This can be done by tracking-within the internal memory M of the engine-an interaction-induced belief state η t about the latent state of the source.
The type of interaction between engine and fuel at each time can depend on the memory of the engine. The general interaction and observation at time t can be described by a positive operator-valued measure (POVM) on the Hilbert space of the elementary fuel system A_t; let O_t denote the random variable for the observed outcome thereof.
The optimal belief state η_t at time t is an observation-induced probability distribution over the latent states S of the source, with probability elements η_t(s) = Pr(S_t = s | O_{1:t} = o_{1:t}). This is the best knowledge that a local classical memory can have, as it represents the actual distribution over latent states as one would calculate via Bayes' rule. It is convenient to treat η_t as a length-|S| row vector for linear-algebraic manipulation. Given a sequence of observations, what is the probability P_t(x) that the source will next produce quantum state σ^{(x)}? It is P_t(x) = η_t T^{(x)} \mathbf{1}, where \mathbf{1} is the column vector of all ones. The belief state η_t thus determines an expectation of the next quantum state with more free energy than the local reduced state ξ_0 of ρ_{\overleftrightarrow{A}}. This memory enhancement to free energy is proven in App. B.
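The next-output probability P_t(x) = η_t T^{(x)} \mathbf{1} is a one-line matrix computation. A minimal sketch, reusing illustrative stand-in transition matrices (not the paper's parameters):

```python
import numpy as np

# Next-output probability P_t(x) = eta_t @ T^(x) @ 1 from a belief state.
# The labeled matrices below are illustrative stand-ins.
T = {0: np.array([[0.8, 0.0], [0.0, 0.2]]),
     1: np.array([[0.0, 0.2], [0.8, 0.0]])}
ones = np.ones(2)

def next_output_probs(eta):
    """Distribution over the next output symbol, given belief state eta."""
    return {x: float(eta @ Tx @ ones) for x, Tx in T.items()}

eta = np.array([0.5, 0.5])   # some belief state over the two latent states
P = next_output_probs(eta)
assert abs(sum(P.values()) - 1.0) < 1e-12   # valid probability distribution
print(P)
```

The expected next quantum state would then be the mixture ξ_t = Σ_x P_t(x) σ^{(x)}.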
The transition rules between these belief states are determined via Bayesian inference, based on the anticipated distribution of the observable O_t. If the source is known, but no observations have yet been made, then the optimal belief state is simply the stationary distribution over source states: η_0 = π, which satisfies π = π \sum_{x∈X} T^{(x)}.

Theorem 1. For any POVM on the quantum state of the system at time t, the optimal belief state-about the latent state of the quantum source-updates iteratively according to

η_t ∝ η_{t-1} \sum_{x∈X} \Pr(O_t = o_t | X_t = x, K_{t-1} = η_{t-1}) \, T^{(x)} ,

normalized so that η_t \mathbf{1} = 1, where K_t is the random variable for the belief state. We derive Thm. 1 in Appendix C, generalizing the so-called mixed-state presentation [26,27,28,29,30].
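The stationary belief π and the Bayesian update of Thm. 1 can both be sketched numerically. This is an assumption-laden toy: the transition matrices are illustrative, and the outcome likelihoods Pr(o | x) stand in for the POVM physics discussed below.

```python
import numpy as np

# Illustrative labeled transition matrices (stand-ins, not the paper's).
T = {0: np.array([[0.8, 0.0], [0.0, 0.2]]),
     1: np.array([[0.0, 0.2], [0.8, 0.0]])}
T_total = sum(T.values())

# Stationary belief: pi = pi @ T_total (left eigenvector with eigenvalue 1).
w, v = np.linalg.eig(T_total.T)
pi = np.real(v[:, np.argmin(np.abs(w - 1))])
pi = pi / pi.sum()
assert np.allclose(pi @ T_total, pi)

# Placeholder outcome likelihoods Pr(O_t = o | X_t = x) for two outcomes.
like = {0: {0: 0.9, 1: 0.4}, 1: {0: 0.1, 1: 0.6}}

def update(eta, o):
    """Thm.-1-style Bayesian belief update on observing outcome o."""
    unnorm = eta @ sum(like[o][x] * Tx for x, Tx in T.items())
    return unnorm / unnorm.sum()

eta1 = update(pi, o=0)
print(pi, eta1)
```

Repeated application of `update` with observed outcomes is exactly the iterative synchronization to the source described in the text.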
The probability Pr(O_t = o_t | X_t = x, K_{t-1} = η_{t-1}) appearing in Eq. (4) is typically a straightforward physics calculation of the probability that the observed POVM outcome o_t should be obtained, given that the system was prepared as σ^{(x)}. Conditioning on the previous state of knowledge K_{t-1} is important to the extent that it can influence the choice of POVM applied at time t.
Notice that, for each o, the update rule for the belief state can be interpreted as a nonlinear return map.If the return map enables prediction, then we have some access to the nonequilibrium free energy in correlations.
Observation 1. Memory can be thermodynamically advantageous when the belief-state update rule has an attractor other than the stationary fixed point π for at least one observable outcome.
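Observation 1 can be probed directly by iterating the outcome-conditioned belief-update map. A minimal sketch (illustrative matrices and placeholder likelihoods, as above): an uninformative outcome leaves the stationary belief π fixed, while an informative outcome drives the belief toward a different attractor.

```python
import numpy as np

# Illustrative labeled transition matrices and their stationary belief.
T = {0: np.array([[0.8, 0.0], [0.0, 0.2]]),
     1: np.array([[0.0, 0.2], [0.8, 0.0]])}
pi = np.array([0.8, 0.2])

def return_map(eta, like):
    """One application of the belief-update map for fixed outcome likelihoods."""
    unnorm = eta @ sum(like[x] * Tx for x, Tx in T.items())
    return unnorm / unnorm.sum()

# Uninformative outcome: Pr(o|x) identical for all x, so pi is a fixed point.
assert np.allclose(return_map(pi, like={0: 0.5, 1: 0.5}), pi)

# Informative outcome: repeated application drifts away from pi,
# settling on a different attractor.
eta = pi.copy()
for _ in range(50):
    eta = return_map(eta, like={0: 0.9, 1: 0.1})
print(eta)
assert not np.allclose(eta, pi)
```

In the memory-advantageous regime discussed later, it is precisely such non-π attractors of the work-conditioned maps that make memory thermodynamically useful.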

Engine construction
The engine is equipped with an internal classical memory M and access to a heat reservoir R at some fixed temperature T. Its objective is to extract work by raising the internal energy of a work reservoir or 'battery' B. To accomplish this, it can bring each system A_t closer to its equilibrium state γ = e^{-H/k_B T}/Z, where k_B is Boltzmann's constant and Z = Tr(e^{-H/k_B T}) is the associated equilibrium partition function, which yields the equilibrium free energy F = -k_B T ln Z. For simplicity of presentation, we assume that each subsystem A_t is subject to the same Hamiltonian H, but our results generalize in an obvious way if we allow different Hamiltonians for each subsystem. The construction of the pattern engine then requires only the description of the HMM M and the Hamiltonian H.
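These equilibrium quantities are elementary to compute. A minimal sketch for a single qubit, with an assumed level splitting E and units chosen so that k_B T = 1:

```python
import numpy as np

# Gibbs state, partition function, and equilibrium free energy for a qubit
# with assumed Hamiltonian H = diag(0, E). Units: k_B * T = 1.
kT = 1.0
E = 2.0                                            # assumed level splitting
levels = np.array([0.0, E])

Z = np.sum(np.exp(-levels / kT))                   # Z = Tr exp(-H / k_B T)
gamma = np.diag(np.exp(-levels / kT)) / Z          # gamma = exp(-H/k_B T)/Z
F_eq = -kT * np.log(Z)                             # F = -k_B T ln Z

assert abs(np.trace(gamma) - 1.0) < 1e-12
print(Z, F_eq, np.diag(gamma))
```

For nondegenerate H at finite temperature, γ is full-rank, which matters later: protocols reset systems toward γ, and rank-deficient targets would incur the divergences discussed in Section 4.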
To harvest the free energy locked up in correlations, the engine's internal memory should somehow become correlated with the latent state of the source during its energy-harvesting operation. However, directly measuring each quantum system would disturb the state and potentially cost energy. Rather, our engine updates its memory conditioned on the extracted-work value W_t at each time. The change in the energy of the battery thus serves as the observable O_t = W_t for updating the belief state. Theorem 2 (derived in the appendices) shows that any work-extraction protocol ideal for a state ρ*_t with eigenvalues {λ_n}_n yields one of a discrete set of work values {w^{(n)}}_n, with associated probabilities determined by the input. This set of values is independent of the actual d-dimensional quantum state σ input to the protocol, although the input state determines the probabilities of each outcome.
Notably, this yields the probability distribution for work extracted when the protocol optimized for ρ*_t actually operates on σ^{(x)}. Regardless of how the belief state influences the choice of ρ*_t, we can now leverage Thms. 1 and 2 to rewrite the belief update in terms of the observed work value, as in Eq. (7). With Eq. (7), the belief-state return maps now reflect the physics of the work-extraction protocol.
From Eqs. (3) and (6), we find that the work-induced transitions between belief states have the probabilities given in Eq. (8). Finally, to take thermodynamic advantage of this knowledge, the work-extraction protocol at each step is optimized for the expected state, so that ρ*_t = ξ_t. Indeed, extracting all work from the expected state ξ_t requires a protocol designed around this expectation [31]. In this case, the denominator in Eq. (7) simplifies to \sum_n λ_n δ_{w_{t+1}, w^{(n)}}. Similarly, Eq. (8) simplifies to \sum_n λ_n δ_{w, w^{(n)}}. Combining Eqs. (5) and (8) then yields the expected work-extraction rate of Eq. (10). Note that the expectation value on the right-hand side is taken over the instantaneous distribution over belief states. The meta-dynamic over belief states thus determines both the transient and asymptotic work-extraction rate. In particular, the stationary distribution over recurrent belief states allows closed-form expressions for the asymptotic work-extraction rate.
Both the belief states and the transitions between them derive from the HMM of the known source. Belief states can thus be explicitly represented in the memory M of an autonomous work-harvesting device. The memory states {(η, ε)}_{η,ε} should also store the last measured energy state ε of the battery. A memory-controlled unitary can implement memory-assisted quantum work extraction, as depicted in the circuit diagram of Fig. 2. Subsequent measurement of the battery state then gives access to the work extracted and allows an autonomous update of the memory, according to the above-outlined rules of Bayesian prediction. This prediction-extraction cycle continues repeatedly, as suggested in Fig. 3.
In short, at every time step, the engine performs both prediction and work-extraction subroutines. The prediction subroutine updates the belief state stored in M via pre-programmed transition rules conditioned on the energy state of B. Each unique η_t-up to the desired memory resolution-corresponds to a subset of distinguishable states in the memory. The engine's work-extraction subroutine is conditioned on the memory M and acts on A_t with a quantum work-extraction protocol tailored to extract work from ξ_t.

Example processes and alternative approaches
To demonstrate our memory-assisted quantum approach-using a quantum work-extraction protocol designed for the work-observation-induced expected state ρ*_t = ξ_t-we apply it to the quantum perturbed-coin and golden-mean processes depicted in Fig. 1. We compute our engine's long-term work output from this approach (i) and compare its performance with those of three alternative approaches (ii)-(iv): (i) memory-assisted quantum, where the quantum work-extraction protocol is optimized for the work-observation-induced expected state ρ*_t = ξ_t; (ii) memory-assisted classical, where the protocol, unable to extract work from quantum coherences, is optimized for the energy-dephased version ξ^{dec}_t of the expected state; (iii) memoryless quantum processing, where memory is never updated by observations, and the quantum work-extraction protocol is simply optimized for the time-averaged quantum state ρ* = ξ_0 = \sum_x (π T^{(x)} \mathbf{1}) σ^{(x)}; and (iv) overcommitment to the most probable quantum state, with the protocol optimized for ρ*_t = σ^{(\arg\max_x η_t T^{(x)} \mathbf{1})}. The asymptotic work-extraction rates lim_{t→∞} ⟨W_t⟩ from these approaches are compared in Fig. 4, both analytically and with numerical simulations, for the perturbed-coin process. In Fig. 4(a), memory-activated work is defined as the difference between the memory-assisted quantum approach (i) and the memoryless quantum approach (iii). In Fig. 4(b), quantum enhancement is defined as the difference between the memory-assisted quantum approach (i) and the memory-assisted classical approach (ii).
In this demonstration, the system extracts work from a sequence of qubits, each governed by the same Hamiltonian H. At each time step, the source produces one of two nonorthogonal quantum states, |0⟩ or |ψ⟩, according to the labeled transition matrices of the perturbed-coin process.
The memory-assisted quantum approach (i) always performs at least as well as all other approaches, and strictly outperforms them in many regions of parameter space. In these examples, the ideal input ρ*_t is a qubit density matrix with eigenvalues λ_±. From Eqs. (5) and (6), we thus expect to observe one of two possible work values, w^{(±)}, from each distinct belief state.
Approaches (i)-(iii) share some nice features. From Eq. (6), assuming distinct work values w^{(+)} ≠ w^{(-)}, we find that the probability of observing each possible work value is simply given by the corresponding eigenvalue of the optimal input, as in Eq. (11). Combining this with Eq. (5), we find that the rate of work extraction can be expressed in relative-entropy form, Eq. (12), for approaches (i)-(iii)-although, notably, both ρ*_t and the set of belief states will be different in each approach-with ρ*_t = ξ_t, ξ^{dec}_t, and ξ_0, respectively. The non-negativity of relative entropy thus guarantees the non-negativity of expected work extraction from these three approaches. Note that Eq. (12) is more general than Eq. (10) and allows us to compare the work-extraction performance of the three approaches.
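The quantum-over-classical ordering can be illustrated with a standard relative-entropy identity: when a protocol ideal for ρ* is fed ρ* itself, the expected work is k_B T D[ρ* ∥ γ], and energy-dephasing ρ* can only lower this value because dephasing raises entropy while leaving energy populations untouched. A minimal numeric sketch (the state and Gibbs populations below are illustrative, not taken from Fig. 4):

```python
import numpy as np

# Expected work k_B T * D[rho* || gamma] for a coherent state vs. its
# energy-dephased version. All numbers illustrative; units k_B T = 1.
kT = 1.0
gamma_diag = np.array([0.88, 0.12])    # assumed qubit Gibbs populations

def relent_vs_diag(a, bdiag):
    """Quantum relative entropy D[a || diag(bdiag)] in nats."""
    w = np.linalg.eigvalsh(a)
    S = -np.sum(w * np.log(w))                       # von Neumann entropy
    cross = -np.sum(np.diag(a).real * np.log(bdiag)) # -Tr[a ln(diag b)]
    return float(cross - S)

psi = np.array([np.cos(0.4), np.sin(0.4)])
xi = 0.5 * np.outer(psi, psi) + 0.5 * np.diag([1.0, 0.0])  # coherent mixture
xi_dec = np.diag(np.diag(xi))                              # energy-dephased

W_quantum = kT * relent_vs_diag(xi, gamma_diag)
W_classical = kT * relent_vs_diag(xi_dec, gamma_diag)
print(W_quantum, W_classical)
```

The gap W_quantum - W_classical is the coherence contribution, mirroring the "quantum enhancement" of Fig. 4(b).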
Without any memory, the extractable structure is limited to the time-averaged statistical bias of the output [21], which explains why memoryless work extraction varies with r but not p.Further details of the analytic solution for expected work extraction can be found in Appendix H.
Our numerical simulations use the quantum work-extraction protocol of Ref. [13] at each time-step, and agree very well with our more general analytical predictions. Indeed, the quantum work-extraction protocol of Ref. [13] provides an example of a ρ*-ideal work-extraction protocol, in the limit of many bath interactions. More details can be found in Appendices F and G.
It is tempting to commit to the most likely outcome. However, the overcommitment approach (iv) performs the worst, since any reset operation (to γ in this case) with minimal entropy production for a pure-state input leads to infinite heat dissipation when operating on any other input [31,32]. This translates to infinite negative work extraction ⟨W_t⟩ = -∞ in this case of ρ*_t ∈ {|0⟩⟨0|, |ψ⟩⟨ψ|}. This divergence can alternatively be seen from Eq. (5) as w^{(-)} ∼ ln λ_- → ln 0 = -∞. In our numerical simulations, following Ref. [13], this minimal eigenvalue λ_- is inversely proportional to the number N of bath interactions, so that the overcommitment work penalty diverges logarithmically with N.
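The logarithmic divergence is easy to see numerically. A sketch under the stated scaling assumption λ_- ∝ 1/N (units k_B T = 1; the specific N values are illustrative):

```python
import numpy as np

# Overcommitment penalty sketch: with w^(-) ~ k_B T * ln(lambda_-) and
# lambda_- assumed inversely proportional to the bath-interaction count N,
# the rare-outcome work value diverges like -ln N.
kT = 1.0
penalties = []
for N in [10, 100, 1000, 10000]:
    lam_minus = 1.0 / N                  # assumed scaling from the text
    penalties.append(kT * np.log(lam_minus))
print(penalties)                          # ever more negative, like -ln N
```

Even though this outcome is rare (probability λ_-), its magnitude grows without bound, which is why approach (iv) underperforms so badly.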

Phase transitions in efficacy of knowledge
Surprisingly, there exists a blue inner region of panel 4(a) where the memoryless quantum approach (iii) achieves the same performance as our memory-assisted quantum approach (i). There exists a sharp phase boundary within which the use of memory does not boost performance. As seen clearly in panels 4(c) and 4(d), the phase boundary exhibits a discontinuity in the first derivative of work extraction with respect to the process parametrization. This phase boundary is not unique to the perturbed-coin process and, indeed, also occurs in the 2-1 golden-mean process.
Such phase transitions originate from bifurcations of the attractors of the belief-state update maps. This is illustrated in Table 1, which shows the nonlinear return maps along with the consequences for belief dynamics and work-extraction dynamics. We focus for now on the first two rows of Table 1, which illustrate the nonlinear dynamics of our memory-assisted quantum approach (i), in the memory-apathetic and memory-advantageous regimes, before and after bifurcation respectively.
Recall from Fig. 1(a) that a two-state machine generates the perturbed-coin process. Hence, a single scalar ϵ_t suffices to parametrize the belief state η_t. The magnitude of this scalar ϵ_t indicates the strength of evidence that the process is in a particular hidden state.
The first column of Table 1 shows the return maps for ϵ_t → ϵ_{t+1} induced by either w^{(+)} (red solid graph) or w^{(-)} (blue dashed graph), when the work-extraction protocol is optimized for ρ*_t = ξ_t (for the first two rows) or ρ*_t = ξ^{dec}_t (for the last row). To aid the visual bifurcation and stability analysis, we include a dotted diagonal line with slope one-representing the identity map-and a dotted diagonal line with slope minus one-representing the swap map.
Intersections between a return map and the dotted identity line indicate fixed points of the map upon its repeated application. If the magnitude of the slope at the intersection is less than unity, then the fixed point is stable; if the magnitude of the slope at the intersection is greater than unity, then it is unstable. Within the memory-apathetic region of parameter space, both return maps share a single stable fixed point at the stationary distribution π: work extraction does not supply enough evidence to nudge an observer out of a state of complete ignorance. In this regime, memory does not enhance quantum work extraction.
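The stability criterion |f'(ϵ*)| < 1 can be checked numerically. A minimal sketch on a toy one-parameter family of return maps (not the maps of Table 1) with a fixed point at ϵ* = 0:

```python
import numpy as np

# Linear stability of a fixed point of a return map f: stable iff |f'| < 1.
def f(e, a):
    """Toy return map with a fixed point at e = 0; slope there is a."""
    return a * e + (1 - a) * e**3

def slope_at(e_star, a, h=1e-6):
    """Central-difference estimate of f'(e_star)."""
    return (f(e_star + h, a) - f(e_star - h, a)) / (2 * h)

for a in [0.5, 1.5]:
    s = slope_at(0.0, a)
    print(a, s, "stable" if abs(s) < 1 else "unstable")
```

Sweeping the map's parameter through |f'(π)| = 1 is precisely the bifurcation that separates the memory-apathetic and memory-advantageous phases described next.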
At the phase boundary in the process's parameter space, the fixed point at π becomes unstable, and new attractors emerge for each map. However, the coexistence of the maps introduces competition between the attractors, as the maps are selected stochastically with probabilities λ_±. The two maps (red solid and blue dashed) interact to induce a steady-state metadynamic over recurrent belief states, shown as a Markov process in each row of Table 1's last column. In this memory-advantageous region, work extraction supplies sufficient evidence to inform an observer about the hidden state of the process, which in turn avails more extractable work.
The elegance of memory-assisted quantum work extraction is reflected in the simple one- and two-state recurrent memory structures. In comparison, the classical extractor not only harvests less work, but requires more memory to achieve its relatively meager returns. Note the infinite number of recurrent memory states in the classical-processing case, in the last row of Table 1.

Fundamental limits of quantum pattern engines

Initial investment and ideal work-extraction rate
Has our engine performed optimally? Answering this requires some benchmark of optimality. Fortunately, this benchmark is furnished by an appropriate application of the thermodynamic second law.
If we treat the engine's memory and the multipartite quantum pattern as a single joint system, we can apply the second law of thermodynamics to (i) bound the maximal possible work extraction, and (ii) discover an initial thermodynamic investment required for the engine to become correlated with the pattern. The transient work investment is apparent in the 'work series' column of Table 1. It is a necessary price to harvest work at the optimal steady-state rate.
Let us denote the joint state of a d_M-dimensional memory and the length-L pattern as ρ^{(M,1:L)}_t, with corresponding reduced states ρ^{(M)}_t of the memory and ρ^{(1:L)}_t of the pattern. We will assume that the memory is fully energetically degenerate, so that unitary memory updates do not cost energy. Then (I/d_M) ⊗ γ^{⊗L} describes the equilibrium state of the joint supersystem. If we denote W^{ext}_t = \sum_{ℓ=1}^{t} W_ℓ as the net work extracted up to time t, then the second law of thermodynamics tells us that the reduction in nonequilibrium free energy F^{(M,1:L)}_t upper bounds the expectation value of extracted work [10]: ⟨W^{ext}_t⟩ ≤ F^{(M,1:L)}_0 - F^{(M,1:L)}_t. Since the equilibrium free energy is unchanged during our work-extraction process, the change in nonequilibrium free energy is proportional to the change in relative entropy between ρ^{(M,1:L)}_t and the joint equilibrium state. This relative entropy simplifies further when we invoke several features of our framework: since the memory is always initialized as ρ^{(M)}_0, the memory and pattern are initially uncorrelated, ρ^{(M,1:L)}_0 = ρ^{(M)}_0 ⊗ ρ^{(1:L)}_0; and, through the sequence of deterministic updates, the entropy of the memory is unchanging. Altogether, these features imply that the extractable work is upper bounded by

⟨W^{ext}_t⟩ ≤ k_B T ( D[ρ^{(1:L)}_0 ∥ γ^{⊗L}] - D[ρ^{(1:L)}_t ∥ γ^{⊗L}] - I_t ) ,

where I_t = I[ρ^{(M)}_t : ρ^{(1:L)}_t] is the mutual information that has built up between the memory and the quantum pattern.
Invoking two more features leads us to a more interpretable version of this result. We note that after t time-steps, the first t subsystems of the pattern have been brought to their equilibrium states, while the other subsystems remain unaltered: ρ^{(t+1:L)}_t = ρ^{(t+1:L)}_0. In the long-time limit, we find that the steady-state work-extraction rate is upper bounded by

lim_{t→∞} ⟨W_t⟩ ≤ k_B T ( S[ρ^{(ℓ)}_0] - s_vN + D[ρ^{(ℓ)}_0 ∥ γ] ) =: w_ideal ,

where the choice of ℓ ∈ {1, 2, ..., L} is arbitrary since we assume a stationary quantum pattern, and s_vN is the von Neumann entropy rate of the pattern. This is a special case of the Quantum Information Processing Second Law (QIPSL), derived as Eq. (77) in Appendix I. Whenever the input pattern is far from being fully consumed, we find that ⟨W^{ext}_t⟩ ≤ t w_ideal - k_B T I_t. By the data-processing inequality, the investment cost I_t is no more than the quantum excess entropy E = lim_{ℓ→∞} I[ρ^{(-ℓ+1:0)} : ρ^{(1:ℓ)}], which is the quantum mutual information between the sequences of past and future inputs. However, the engine will only be able to harvest work at the ideal steady-state rate if the memory stores all of the information from past inputs relevant for predicting future inputs, such that I_t = E. Only then will the process be seen at the true entropy rate s_vN; otherwise it will appear more random than it really is, with the hidden structure evading extraction.
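The investment term I_t is a quantum mutual information, I(M : pattern) = S(M) + S(pattern) - S(joint). As a self-contained numerical illustration of this quantity (the perfectly correlated two-qubit state below is a toy, not the engine's actual memory-pattern state):

```python
import numpy as np

# Quantum mutual information I(A:B) = S(A) + S(B) - S(AB) for a toy
# classically correlated memory-pattern pair.
def vn_entropy(rho):
    """Von Neumann entropy in nats, ignoring zero eigenvalues."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log(w)))

def partial_trace(rho, keep):
    """Reduce a two-qubit density matrix; keep = 0 (first) or 1 (second)."""
    r = rho.reshape(2, 2, 2, 2)
    return np.trace(r, axis1=1, axis2=3) if keep == 0 else np.trace(r, axis1=0, axis2=2)

# Perfectly correlated classical joint state: 0.5(|00><00| + |11><11|).
rho = np.zeros((4, 4))
rho[0, 0] = rho[3, 3] = 0.5

mi = (vn_entropy(partial_trace(rho, 0))
      + vn_entropy(partial_trace(rho, 1))
      - vn_entropy(rho))
print(mi)   # one bit (ln 2 nats) of memory-pattern correlation
```

For perfect classical correlation the result is ln 2 nats; the text's bound says such built-up correlation must be paid for as foregone work, k_B T I_t.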
On the other hand, when the entire pattern has been consumed, I_L = 0, yet ⟨W^{ext}_L⟩ ≤ L w_ideal - k_B T E. The quantum excess entropy E is a thermodynamic investment for any ideal extractor-even one that operates non-locally. However, for local extractors, this investment must be paid upfront as I_t.
In the current framework, it remains an open question whether the knowledge I_t about future inputs attains the fundamental limit E of what is knowable, and whether the work-extraction rate ⟨W_t⟩ attains w_ideal. Perhaps quantum discord prevents local extractors from performing optimally unless they are equipped with quantum memory. Unfortunately, there is no known closed-form solution for the von Neumann entropy rate s_vN, even for the simple latent-state models studied here; accordingly, we cannot analytically evaluate how close our engine is to optimal. Future studies could elucidate the path to optimality, by investigating the work-extraction rate as the agent (a) operates on longer subsequences or (b) explores non-greedy harvesting strategies.

Energetically-free memory updates in steady state
The initial thermodynamic investment I_t can be associated with the initial nonunitarity of the work-conditioned map on the memory space-the state compression needed for an observer to synchronize (to the extent possible) with the hidden state of the process. However, in the steady state, the work-conditioned maps are unitary for our memory-assisted quantum approach (i). This can be seen in the first two rows of the last column of Table 1: for the perturbed-coin process, the work-induced map is either the identity or the swap map on the restriction to the recurrent belief states. These asymptotic memory-update maps are unitary and so can be performed with zero dissipation. Further investigation is required to determine whether the steady-state work-conditioned maps of our approach are always asymptotically unitary for other processes. If not, it may be beneficial to merge the work-extraction and prediction subroutines into a single more sophisticated asymptotically-unitary transformation. We anticipate that the memory updates would then be energetically free in the steady state. This expectation aligns with previous related studies on local work-harvesting pattern extractors in the classical domain, where the ideal pattern extractor-which must be predictive of its input-can update its memory for zero energy cost during steady-state work extraction [33,16,34,35].

Discussion
We developed the theoretical prototype for a quantum pattern engine: a machine that can adaptively extract useful work from quantum stochastic processes by exploiting knowledge of the temporal patterns they contain. We witnessed that, in the presence of coherence, the memory-assisted quantum approach always outperforms the memory-assisted classical approach. We also demonstrated its advantage over engines that can only harness static quantum resources-although, surprisingly, we found a phase transition marking the onset of memory advantage. It is an open question whether this phase transition coincides with the onset of quantum discord.
In Thm. 1, we found how to update the state of knowledge about any latent-state generator of a quantum process, given any POVM on the current quantum output. In Thm. 2, we found the exact work distribution obtained from any ρ*-ideal work-extraction protocol operating on any quantum state. This enabled belief-state updates via observed values of work extraction. Sec. 5 developed the fundamental thermodynamic limits of work extraction from correlated multipartite quantum systems, which set the ultimate benchmark for our approach and any alternative.
Despite the advances presented here, many open questions remain for future work.
It is known that measurements themselves can cost some resource, energetic or otherwise [36,37]. Careful accounting of the resources required for (perfect or imperfect) measurement in our framework would provide a fuller picture of our engine's performance. Alternatively, it may be possible to use a quantum memory and fully unitary evolution (rather than projective measurements on the battery and subsequent conditional operations on the memory), which would avoid this issue completely.
Although designing the protocol for the observation-induced expected state, ρ * t = ξ t , guarantees maximal work extraction locally in time, it remains an interesting open question whether there is a superior steady-state approach that sacrifices short-term work extraction for greater knowledge and long-term returns.
It may be possible to extend our method to more complex quantum processes, e.g., those with entangled temporal correlations. This would, however, likely require a quantum memory. On the other hand, our method can immediately be adapted to applications where the pattern is spatial instead of temporal (e.g., states of many-body systems), and where the engine is constrained to operate locally on small regions at a time.

Acknowledgements
We acknowledge the support of the Singapore Ministry of Education Tier 1 Grants RG146/20 and RG77/22, the Singapore Ministry of Education Tier 2 Grant T2EP50221-0014, the Agency for Science, Technology and Research (A*STAR) under its QEP2.0 programme (NRF2021-QEP2-02-P06), and the FQXi Grant R-710-000-146-720, "Are quantum agents more energetically efficient at making predictions?", from the Foundational Questions Institute and Fetzer Franklin Fund (a donor-advised fund of Silicon Valley Community Foundation). VN also acknowledges support from the Lee Kuan Yew Endowment Fund (Postdoctoral Fellowship).

A Decomposition of free energy in quantum patterns
Consider a finite portion of the quantum pattern. If each subsystem is non-interacting and the ℓth subsystem has a reference equilibrium Gibbs state γ^(ℓ), then the nonequilibrium free energy for this portion of the pattern is given by
F(ρ^(1:L)) = F^(1:L)_eq + k_B T ( D[ρ^(1:L) ∥ ⊗_{ℓ=1}^L ρ^(ℓ)] + Σ_{ℓ=1}^L D[ρ^(ℓ) ∥ γ^(ℓ)] ),
where F^(1:L)_eq is the equilibrium free energy. We recognize that D[ρ^(1:L) ∥ ⊗_{ℓ=1}^L ρ^(ℓ)] is the total correlation within the quantum pattern, while D[ρ^(ℓ) ∥ γ^(ℓ)] is the local nonequilibrium addition to free energy. Each of these terms contributes uniquely to the free energy. When operating sequentially on each subsystem, quantum pattern engines must leverage past information to harvest the free energy in the correlations.
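The splitting used here, of the relative entropy to a product Gibbs state into total correlation plus local terms, follows in a few lines. A sketch of the omitted algebra, using only quantities already defined above:

```latex
\begin{aligned}
D\!\left[\rho^{(1:L)} \,\middle\|\, \textstyle\bigotimes_{\ell=1}^{L}\gamma^{(\ell)}\right]
&= -S\!\left(\rho^{(1:L)}\right)
   - \sum_{\ell=1}^{L}\operatorname{Tr}\!\left[\rho^{(\ell)}\ln\gamma^{(\ell)}\right] \\
&= \underbrace{\left[\sum_{\ell=1}^{L}S\!\left(\rho^{(\ell)}\right)
   - S\!\left(\rho^{(1:L)}\right)\right]}_{D\left[\rho^{(1:L)}\,\middle\|\,\bigotimes_{\ell}\rho^{(\ell)}\right]}
 \;+\; \sum_{\ell=1}^{L}\underbrace{\left[-S\!\left(\rho^{(\ell)}\right)
   - \operatorname{Tr}\!\left[\rho^{(\ell)}\ln\gamma^{(\ell)}\right]\right]}_{D\left[\rho^{(\ell)}\,\middle\|\,\gamma^{(\ell)}\right]} .
\end{aligned}
```

The first line uses the fact that the logarithm of a product state is a sum of local terms, so the cross term depends only on the marginals ρ^(ℓ); the second line adds and subtracts Σ_ℓ S(ρ^(ℓ)).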
In the main text, we suppose each non-interacting subsystem has the same local Hamiltonian, which implies that the reference Gibbs states are all the same: γ^(ℓ) = γ. The general decomposition here shows that both quantum and classical correlations contribute to the extractable nonequilibrium addition to free energy. However, in the main text, the quantum pattern is assumed to be classically generated, despite having non-orthogonal states. The inter-time quantum correlations are thus restricted to quantum discord, with no inter-time entanglement [24].

B Memory-enhanced free energy
By self-consistency, the local reduced state of ρ^↔ will simply be a probabilistic mixture of the σ^(x), in the form ⟨ξ_t⟩_{K_t} = Σ_{x∈X} (π T^(x) 1) σ^(x) = ξ_0. The predicted quantum state ξ_t thus has more free energy than the reduced state on average, and so more work can be extracted on average when memory is leveraged to predict sequential quantum states. The non-negativity of Eq. (23) can be seen either from the convexity of relative entropy in Eq. (21) or from the concavity of entropy in Eq. (22).
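The averaging argument can be written out as follows; this is a sketch consistent with the quantities referenced around Eqs. (21)-(23):

```latex
\begin{aligned}
\big\langle F(\xi_t) \big\rangle_{K_t} - F(\xi_0)
&= k_B T \left( \big\langle D[\xi_t \,\|\, \gamma] \big\rangle_{K_t}
   - D\!\big[\langle \xi_t \rangle_{K_t} \,\big\|\, \gamma\big] \right) \\
&= k_B T \left( S(\xi_0) - \big\langle S(\xi_t) \big\rangle_{K_t} \right)
\;\geq\; 0 .
\end{aligned}
```

The first gap is non-negative by convexity of relative entropy in its first argument; the second form follows because the terms Tr[ξ_t ln γ] are linear in the state and cancel under averaging, leaving the concavity-of-entropy statement.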
A nearly identical argument shows that the classically predicted state likewise has more free energy than the average classical state, where the states of knowledge K_t are now the classically induced ones.

C Synchronizing to a memoryful quantum source
Inferring the latent state of a known memoryful quantum source allows maximal work extraction when operating serially on the quantum states of the process. The optimal state of knowledge, given a sequence of observations o_1 o_2 … o_t obtained via interventions on the sequence of quantum systems σ^(x_1), σ^(x_2), …, σ^(x_t), is the conditional probability distribution induced by these interventions,
η_t = Pr(S_t | O_1 … O_t = o_1 … o_t, S_0 ∼ π).
The last condition, S_0 ∼ π, means that the initial latent state of the generator is distributed according to π, the stationary distribution over the states of the generator. Thus, η_0 = π.
If we introduce a new random variable K_t to denote the optimally updated state of knowledge about the latent state of the pattern generator, then we can replace the condition S_{t−1} ∼ η_{t−1} with K_{t−1} = η_{t−1}. The condition on the state of knowledge is relevant to the extent that the choice of POVM is influenced by the state of knowledge. We remind the reader that, in our framework, the POVM on the current quantum output is chosen as a function of the state of knowledge K_t.
Note that the current quantum output only depends on the current latent state of the process. Accordingly, the next observation, which is the outcome of the POVM on the current quantum output, is conditionally independent of all previous outputs, given the current latent state and given the state of knowledge induced by all previous outputs.
We will now show that the optimal state of knowledge is recursive, i.e., that η_t is a function of η_{t−1} and the latest observation o_t alone. This follows from marginalizing over intervening latent states, employing Bayes' rule, and recognizing that the belief state η_t is a function of the observations o_1 … o_t up to that time. Starting from Eq. (25) and applying these steps yields Eq. (26), as promised.
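In outline, and writing the labeled transition matrices with entries T^(x)_{s s′} = Pr(X_t = x, S_t = s′ | S_{t−1} = s), a hedged reconstruction of the omitted chain of equalities reads:

```latex
\begin{aligned}
\eta_t(s_t) \;=\; \Pr(S_t = s_t \mid o_{1:t})
&\;\propto\; \sum_{s_{t-1}} \sum_{x}
   \Pr(O_t = o_t \mid X_t = x)\,
   \eta_{t-1}(s_{t-1})\, T^{(x)}_{s_{t-1} s_t} ,
\end{aligned}
```

which depends on the earlier observations o_{1:t−1} only through η_{t−1}. Normalizing over s_t then gives η_t as a function of (η_{t−1}, o_t), which is the recursion asserted in the text; this is also term-by-term consistent with the work-conditioned memory update given in Appendix C.1.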
Figure 5: Bayesian network showing the structure of conditional independencies among latent states S t of the quantum source, the type X t of quantum state produced, the observable O t attained from interaction, and the state of knowledge K t that influences the work extraction protocol.
Further manipulations, using the rules of probability and the conditional independencies indicated in the Bayesian network depicted in Fig. 5, allow us to express the optimal state of knowledge in terms of both conditional work distributions and simple linear-algebraic manipulations of the generative HMM representing the memoryful source.

C.1 Using this to build a predictive work-extraction engine
Rather than repeatedly calculating these ideal belief states on the fly for a specific realization of the process, we can alternatively build up, systematically, the set of all such belief states, together with the observation-induced transitions among them, to inform the design of an autonomous engine. There will be both a set of transient belief states and a set of recurrent belief states. Either of these sets may be finite or infinite. In the case that only finitely many belief states are induced by observations, we can explicitly build out the transition structure among them. If there are infinitely many such states, then we would need to truncate unlikely states in the design of our finite physical engine [28]. The physical memory system of our proposed engine should have at least one distinguishable state corresponding to every observation-induced belief state. In fact, the memory must encode both the belief state and the most recent energy of the battery, so that conditioning on the new state of the battery is sufficient to supply the change in battery energy. These will likely be encoded with some finite precision, to avoid storing real numbers. Conditioned on the state of the memory encoding η, the work-extraction protocol will operate jointly on the quantum system, thermal reservoirs, and battery, to optimally extract work from the expected state ξ = Σ_{x∈X} (η T^(x) 1) σ^(x).
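As an illustration of building out this transition structure, the following sketch enumerates observation-induced belief states by graph search with truncation. The two-state labeled transition matrices `T` (a perturbed-coin-style generator) and the observation likelihoods `like_obs` are hypothetical stand-ins chosen for the example, not parameters from the paper:

```python
def update(eta, o, T, like_obs):
    """Bayes update of the belief eta over two latent states, given observation o.
    T[x][s][s2] = Pr(emit x, go to s2 | latent state s); like_obs[o][x] = Pr(o | x)."""
    new = [0.0, 0.0]
    for x, Tx in T.items():
        w = like_obs[o][x]                      # likelihood of o given emitted type x
        for s in range(2):
            for s2 in range(2):
                new[s2] += w * eta[s] * Tx[s][s2]
    z = sum(new)                                # normalize to a probability vector
    return tuple(round(v / z, 10) for v in new) # round to merge near-duplicates

def reachable_beliefs(eta0, T, like_obs, max_states=1000):
    """Search over all observation-induced belief states, truncated at max_states."""
    seen, frontier = {eta0}, [eta0]
    while frontier and len(seen) < max_states:
        eta = frontier.pop()
        for o in like_obs:
            nxt = update(eta, o, T, like_obs)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

p = 0.1                                          # assumed latent switching probability
T = {0: [[1 - p, p], [0.0, 0.0]],                # state 0 emits x=0, then may switch
     1: [[0.0, 0.0], [p, 1 - p]]}                # state 1 emits x=1, then may switch
like_obs = {0: {0: 0.9, 1: 0.1},                 # assumed two-outcome measurement
            1: {0: 0.1, 1: 0.9}}                 # with 10% confusion

beliefs = reachable_beliefs((0.5, 0.5), T, like_obs, max_states=50)
```

If the search saturates `max_states`, the induced belief set is (effectively) infinite, and one would truncate unlikely or nearly coincident states exactly as described above.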
The subsequently observed work value w uniquely updates the memory from the state encoding η to the state encoding
η′ = Σ_{x∈X} Pr(W_t = w | X_t = x, S_{t−1} ∼ η) η T^(x) / Σ_{x′∈X} Pr(W_t = w | X_t = x′, S_{t−1} ∼ η) η T^(x′) 1.
Once the next quantum system arrives, the predictive quantum work-extraction cycle begins again.

D Proof of Thm. 2: Work extraction in the limit of zero entropy production

Work extraction in the limit of zero entropy production is important since it extracts all extractable work from a quantum state. It thus indicates the best possible scenario, against which other efforts can be compared.
In the limit of zero-entropy-production work extraction from ρ*, the net unitary time evolution of the system-battery-baths supersystem must take a special form. In particular, the state of the battery will change deterministically when the initial state of the system is an eigenstate |λ_n⟩ of ρ* = Σ_n λ_n |λ_n⟩⟨λ_n|, almost surely independent of the initial realization of the reservoirs. This implies that the net unitary time evolution will be of the form
U (|λ_n⟩ ⊗ |r⟩ ⊗ |ε⟩) = |f_ε(n, r)⟩ ⊗ |ε + w^(n)⟩
for some w^(n) ∈ R. Above, |ε⟩ and |ε + w^(n)⟩ are energy eigenstates of the work reservoir, |r⟩ is an energy eigenstate of the thermal baths, and |f_ε(n, r)⟩ is a joint system-baths state. It will be useful in the following to note that ⟨f_ε(n, r)|f_ε(n′, r′)⟩ = δ_{n,n′} δ_{r,r′}, since unitary operations map orthogonal states to orthogonal states. The form of the unitary in Eq. (38) effectively assumes that the energy of the battery is well above its ground state. Some interesting nuances have recently been explored for batteries close to their ground state (see, e.g., Ref. [38]), which would affect the statistics of the work-extraction values, but we avoid that regime here to instead focus on the best-possible scenario.
One way to determine w^(n) is via the initial-state dependence of entropy production. Let ⟨Σ⟩_ρ denote the expectation value of entropy production, given initial system state ρ, under the fixed work-extraction protocol optimized for ρ*. In our case, with a single heat bath at temperature T, the expected entropy production can be defined as usual as ⟨Σ⟩_ρ = (⟨W⟩_ρ − ∆F_t)/T. This is the entropy production for a fixed protocol operating on the initial state ρ, where F_t is the nonequilibrium free energy at time t, ∆F_t is the change in nonequilibrium free energy over the course of the protocol, and W is the work exerted, which is just the negative of the extractable work [10]. Since all initial states map to γ by the end of the work-extraction protocol, we know from Ref. [31] that ⟨Σ⟩_σ = ⟨Σ⟩_{ρ*} + k_B D[σ ∥ ρ*]. In this case, ⟨Σ⟩_{ρ*} = 0, and so the expected exerted work satisfies ⟨W⟩_σ / (k_B T) = Tr(σ ln γ) − Tr(σ ln ρ*).
In particular, let σ = |λ_n⟩⟨λ_n|, and note that ln γ = −H/(k_B T) − ln Z, where Z is the partition function. The deterministic work-extraction value, given initial pure state |λ_n⟩, must be the same as its expected value, w^(n) = −⟨W⟩_{|λ_n⟩⟨λ_n|}, and is thus given by
w^(n) = ⟨λ_n| H |λ_n⟩ − F_eq + k_B T ln λ_n,
with the equilibrium free energy F_eq = −k_B T ln Z. The probability of obtaining the work-extraction value w, given any input state σ, is independent of the initial energy state |ε_0⟩ of the battery, and almost surely independent of the initial realization |r⟩ of the thermal reservoirs, in the probability-theoretic sense. We see that Pr(W = w | σ) = 0 unless w ∈ {w^(n)}_n. The probability distribution over these allowed work-extraction values is
Pr(W = w | σ) = Σ_{n : w^(n) = w} ⟨λ_n| σ |λ_n⟩.
When there is some entropy production, the probability density of work extraction will have more diffuse peaks. However, for sufficiently low entropy production, the peaks will still be well separated and, so, effectively discrete for the purpose of Bayesian updating.
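The work values and their distribution are simple to compute once ρ* is diagonalized. The sketch below specializes, for simplicity, to a qubit whose ρ* and Hamiltonian are both diagonal in the same basis (an assumed toy instance, so no eigensolver is needed); `work_outcomes` implements w^(n) = ⟨λ_n|H|λ_n⟩ − F_eq + k_B T ln λ_n and `work_distribution` returns Pr(W = w^(n) | σ) = ⟨λ_n|σ|λ_n⟩:

```python
import math

def work_outcomes(lams, energies, kT):
    """Deterministic work values w_n for the zero-entropy-production protocol
    designed for rho* with eigenvalues lams (taken diagonal in the energy
    eigenbasis here): w_n = E_n - F_eq + kT * ln(lam_n)."""
    Z = sum(math.exp(-E / kT) for E in energies)
    F_eq = -kT * math.log(Z)                       # equilibrium free energy
    return [E - F_eq + kT * math.log(lam) for lam, E in zip(lams, energies)]

def work_distribution(sigma_diag, w_vals):
    """Pairs (w_n, Pr(W = w_n | sigma)), with probabilities given by the
    diagonal of the input state sigma in the eigenbasis of rho*."""
    return list(zip(w_vals, sigma_diag))

# Sanity check: a protocol designed for the Gibbs state gamma extracts
# zero work from every eigenstate of gamma.
kT, energies = 1.0, [0.0, 1.0]
Z = 1.0 + math.exp(-1.0)
gibbs = [1.0 / Z, math.exp(-1.0) / Z]
print(work_outcomes(gibbs, energies, kT))          # both values are ~0
```

Feeding a different diagonal σ into `work_distribution` reproduces the discrete work statistics used for Bayesian updating in Appendix C.1.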
The above derivation is valid whether or not ρ* has degenerate eigenvalues. Notably, the above sums are taken over the eigenstates and their associated eigenvalues, rather than over the eigenvalues directly.

E Power and inefficiency at rapid operation
It is a familiar concept in the design of any engine that maximal thermodynamic efficiency requires sufficiently slow operation. Clearly, this has implications for the power output of the engine [39,40]. However, the relaxation timescales of an engine depend on the particular material properties of the system and baths, as well as on the particular interaction Hamiltonian, so there is no implementation-independent timescale that determines the practical operation speed of an engine.
Nevertheless, we can apply rather general principles to assess how power typically scales with increasingly fast operation. For example, under the assumptions of a Lindblad master equation, there will be a contribution to entropy production (and a corresponding decrease in extracted work per operation) that scales as 1/τ_0, where τ_0 is the duration of the work-extraction protocol [41]. Or, for unitary interactions with the bath, a similar statement can be made but with τ_0 proportional to the number of interactions with bath degrees of freedom (i.e., the 'circuit complexity') [42]. In either case, we expect entropy production to scale as Σ ≈ c/(κ + τ_0) ≈ c/τ_0 for τ_0 ≫ κ > 0, where c and κ are implementation-dependent positive quantities.
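The resulting power-versus-speed trade-off can be sketched numerically. The toy parameters below (`W_slow`, `c`, `T`, `tau`) are illustrative stand-ins, not values from the paper; the sketch simply assumes each of the t operations loses c·T/τ_0 of extractable work relative to the quasistatic value, per the scaling above:

```python
def power(t, tau, W_slow, c, T):
    """Average power from t work-extraction operations in total time tau,
    assuming per-operation entropy production Sigma ~ c / tau0."""
    tau0 = tau / t                       # duration allotted to each operation
    per_op = W_slow - c * T / tau0       # work per op, minus finite-time loss
    return t * per_op / tau

# With these toy numbers the optimum lies at t = W_slow * tau / (2 c T) = 50:
W_slow, c, T, tau = 1.0, 1.0, 1.0, 100.0
print(power(10, tau, W_slow, c, T),
      power(50, tau, W_slow, c, T),
      power(90, tau, W_slow, c, T))
```

Too few operations waste the available time; too many make each operation so fast that the 1/τ_0 dissipation dominates. In the slow regime discussed next (cT/τ_0 negligible), the quadratic penalty is invisible and power simply scales with the operation count.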
In a fixed time τ, the number of work-extraction operations t determines the maximal allowed duration τ/t ≥ τ_0 for each work-extraction protocol. Let ⟨W_ext⟩ be the steady-state work extracted per operation in the limit of very slow operation (i.e., in the limit of infinitely many relaxation steps per work-extraction operation). The power P achieved by finite-time operation is then sandwiched accordingly. We focus on the regime where each work-extraction protocol is of sufficiently long duration, τ_0 ≫ c/k_B, such that cT/τ_0 is negligible. In this regime, the power simply scales with the number of work-extraction operations per unit time.

For all times after t = 0, the update rule for belief states simplifies, and can be expressed explicitly in terms of p, r, and ϵ_t. When ϵ_t = 0, we find that η_{t+1} = (1/2, 1/2) = π; i.e., the stationary distribution is a fixed point of this dynamic over belief states. Because of this, we break the initial symmetry by setting ϵ to a small non-zero value to obtain useful knowledge. In other words, for the very first work-extraction protocol, we choose some ρ*_0 ≠ ξ_0 to avoid an unstable fixed point of the update rule. However, for all subsequent time steps, we choose ρ*_t = ξ_t. For the perturbed coin, the long-run metadynamic of the belief state yields two different results, depending on whether the system is in the "memory-apathetic" or the "memory-advantageous" regime. The reason for this separation lies in the shape of the update function. In the memory-apathetic regime, the update function has gradient of magnitude less than unity at the fixed point, making ϵ = 0 an attractor. In the memory-advantageous regime, the gradient of the update function exceeds unity, making ϵ = 0 a repellor, while two other points become part of a new attractor.
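The attractor-versus-repellor dichotomy is determined by the slope of the update map at the fixed point. The sketch below makes that criterion concrete; the two one-dimensional maps are illustrative stand-ins with the right qualitative shapes, not the paper's explicit update rule in p and r:

```python
def slope_at(f, x0, h=1e-6):
    """Central-difference estimate of f'(x0)."""
    return (f(x0 + h) - f(x0 - h)) / (2 * h)

def classify_fixed_point(f, x0=0.0):
    """A fixed point x0 of eps_{t+1} = f(eps_t) is an attractor if
    |f'(x0)| < 1 and a repellor if |f'(x0)| > 1."""
    return "attractor" if abs(slope_at(f, x0)) < 1 else "repellor"

apathetic = lambda e: 0.5 * e            # slope 0.5 at 0: eps = 0 attracts
advantageous = lambda e: 1.5 * e - e**3  # slope 1.5 at 0: eps = 0 repels

print(classify_fixed_point(apathetic), classify_fixed_point(advantageous))
```

For the second map, the repelled belief settles onto the new fixed points at ϵ = ±(0.5)^{1/2}, where the slope is again sub-unit, mirroring the two recurrent belief states of the memory-advantageous regime.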
In the long run, transient belief states die out, leaving only the steady-state dynamics among the recurrent states of knowledge; any initial distribution over belief states generically converges to the stationary measure π_K. Hence the steady-state rate of work extraction is obtained by averaging over π_K. The expected extracted work in the memory-apathetic regime coincides with that of memoryless extraction. In the regime where memory enhances the performance of the protocol, on the other hand, the stationary measure is supported on the two recurrent belief states η and η′, with corresponding expected quantum states ξ and ξ′, and the work-extraction rate follows by averaging over these two states.

H.2 Classical approach
The derivation for the memory-assisted classical approach is similar to that of the memory-assisted quantum approach illustrated above. However, rather than operating on the induced expected state ξ_t, the classical approach uses work-extraction protocols that are thermodynamically optimized for the decohered state ρ*_t = Σ_{i∈{0,1}} ⟨i|ξ_t|i⟩ |i⟩⟨i|. The eigenstates of ρ*_t are thus |0⟩ and |1⟩, independent of time in this case. In the classical approach, π is no longer a fixed point of the belief-state update maps, and the transition probabilities between belief states change accordingly. The metadynamic of belief in the classical case behaves as a reset process. Unlike the quantum case with only two recurrent belief states, the classical protocol induces an infinite set of recurrent belief states. To construct a finite-state autonomous engine, we could choose to truncate those states within some small δ distance of another recurrent state, or to truncate belief states with negligible probability, with vanishing work-extraction penalty.
We find that the work-extraction rate can again be computed by averaging the relative entropy, now between the decohered expected state and the thermal state, over all recurrent states of knowledge (Eq. (67)).

H.3 Overcommitment to the most likely outcome
The "overcommitment" approach used for comparison in the main text bets exclusively on the most likely outcome in {σ (x) } x .
The expected thermodynamic cost of misaligned expectations during work extraction can be quantified exactly via the relative entropy D[ρ_0 ∥ α_0] between the actual input ρ_0 and the anticipated input α_0 for which the protocol is optimal, if we assume that the final state is independent of the initial state [31,32]. Hence, if we design the protocol for a pure state but operate on a mixed state, we will encounter divergent thermodynamic penalties.
Accordingly, we can observe divergent thermodynamic costs when we design the Skrzypczyk work extraction protocol to be optimal for operation on a pure state.
Using the Skrzypczyk protocol (with N relaxation steps) to extract work from the pure state bet upon, we see that the first bath state γ_B swapped with the system for energy extraction is not exactly pure, but rather retains a residual population on |1⟩⟨1| proportional to 1/[N (e^{−βE_0} + e^{−βE_1})]. (Recall that H is the Hamiltonian for the system, not of the bath.) Any purity of the actual input beyond this initial bath purity is wasted. The input state leading to minimal entropy production under this protocol is thus a unitary rotation of γ_B.
Thus, for this use case of the Skrzypczyk protocol, the minimally dissipative state α_0 becomes pure as N → ∞. As N → ∞, we observe the battery's final expected energy diverging (but only logarithmically in N) to negative infinity when this protocol acts on any other state. I.e., ⟨W⟩ ∼ −k_B T ln N.
More specifically, we can leverage Eqs. (5) and (6) to calculate the expected value of work for the overcommitment approach. Noting that w^(−) ∼ −k_B T ln N and that min_x η_t T^(x) 1 ∼ min(p, 1 − p) when η_t is close to either latent state, we anticipate that the overcommitted work penalty diverges as −k_B T (1 − r) min(p, 1 − p) ln N, as observed.
Interestingly, for a finite number of bath interactions, some work can be extracted on average within certain regimes.But other regions of parameter space would yield very negative work-extraction averages.
Unlike the other approaches, the expectation value of work in the overcommitment approach cannot be written as a relative entropy. Hence, whereas the other approaches were guaranteed to have non-negative work extraction on average, the overcommitment approach enjoys no such guarantee of non-negativity. Indeed, in the limit of many bath interactions, the overcommitment approach leads to infinitely negative work extraction.
I Thermodynamic limits of memoryful quantum pattern manipulation

We consider the fundamental thermodynamic limits of a memoryful physical transducer that consumes one physical pattern and produces another. Both the input and output are assumed to be stationary quantum stochastic processes. Although much of the previous information-engines literature restricts itself to the energy-degenerate case (such that a '0' and a '1' have the same energy), here we generalize to allow the Hamiltonian for each subsystem to be either degenerate or non-degenerate.
Denote the joint state of a d_M-dimensional memory and the length-L pattern at time t as ρ^(M,1:L)_t, with corresponding reduced states ρ^(M)_t of the memory and ρ^(1:L)_t of the pattern. The decrease in nonequilibrium free energy upper bounds the expectation value of extracted work when the environment is at the fixed temperature T [10]. We assume cyclic transformations, where the joint Hamiltonian at time τ is the same as the joint Hamiltonian at time 0, although it may be modulated during the protocol to enable the work extraction. The equilibrium state γ of the joint pattern and memory is thus the same at times 0 and τ. For this reason, there is no net change in equilibrium free energy. Accordingly, the only change in nonequilibrium free energy is due to the change in the nonequilibrium addition to free energy: the quantum relative entropy k_B T D[ρ^(M,1:L)_t ∥ γ] between the actual state and the equilibrium state. We now make several assumptions relevant to energetically efficient memoryful pattern manipulation:

1. We will assume that the memory is fully energetically degenerate, so that unitary memory updates do not cost energy. Then γ = (I/d_M) ⊗ γ^⊗L describes the equilibrium state of the joint supersystem.

2. Since the memory is always initialized in a particular memory state ρ^(M)_0, the memory and pattern are initially uncorrelated: ρ^(M,1:L)_0 = ρ^(M)_0 ⊗ ρ^(1:L)_0.

3. Through the sequence of deterministic updates, and the equal (possibly zero) internal entropy of each memory state, the entropy of the memory is unchanging.
Altogether, these features imply that the extractable work is upper bounded by
⟨W^ext_τ⟩ ≤ −k_B T ∆ D[ρ^(1:L)_t ∥ γ^⊗L] − k_B T I_τ,
where I_t = I[ρ^(M)_t : ρ^(1:L)_t] is the mutual information that has built up between the memory and the quantum pattern. The operator ∆ evaluates the difference of each function from time t = 0 to time t = τ, such that ∆f_t ≡ f_τ − f_0 for any function f of t.
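The mutual-information term arises from splitting the joint relative entropy; a plausible reconstruction of the omitted step uses the identity D[ρ_AB ∥ σ_A ⊗ σ_B] = D[ρ_A ∥ σ_A] + D[ρ_B ∥ σ_B] + I[A : B]:

```latex
\begin{aligned}
D\!\left[\rho^{(M,1:L)}_{t} \,\middle\|\, (I/d_M)\otimes\gamma^{\otimes L}\right]
&= \ln d_M - S\!\left(\rho^{(M)}_{t}\right)
 + D\!\left[\rho^{(1:L)}_{t} \,\middle\|\, \gamma^{\otimes L}\right]
 + I_t ,
\end{aligned}
```

so that, with the entropy of the memory unchanging (assumption 3) and I_0 = 0 from the initially uncorrelated memory (assumption 2), the ∆ of the joint relative entropy reduces to ∆D[ρ^(1:L)_t ∥ γ^⊗L] + I_τ, which gives the bound stated in the text.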
The classical IPSL, and the quantum generalization we derive, address the scenario where a physical machine scans unidirectionally along the pattern. After τ time steps, the first τ subsystems of the pattern have interacted with the machine, while the other subsystems remain unaltered. More specifically, the IPSL assumes that both the input tape and the output tape are stationary stochastic processes, such that all marginal distributions of each (input or output) are shift-invariant within the domain of the existing pattern. We likewise consider the case where the input and output processes have well-defined von Neumann entropy rates s_vN and s′_vN, respectively. Each also has a myopic entropy rate s_ℓ = S(ρ^(1:ℓ)) − S(ρ^(1:ℓ−1)). The myopic entropy rate describes the convergence to irreducible randomness in pattern extension, as progressively more subsystems are prefixed. Relatedly, the myopic excess entropy for the input process is E_ℓ = Σ_{n=1}^ℓ (s_n − s_vN), with a similar relation for the primed output process. Asymptotically, the excess entropy describes the quantum mutual information between the past and future of a bi-infinite process: lim_{ℓ→∞} E_ℓ = E. With definitions and assumptions in place, we can expand Eq. (71) to derive the QIPSL of Eq. (77). To be thermodynamically efficient, it is known that a classical machine needs to be predictive or retrodictive in the case that it is consuming or generating a process, respectively [33,16,34,35]. From Eq. (77), we now see that an analogous result remains true in the more general quantum case.
It is worth reflecting on the behavior of the contributions to Eq. (77). Notably, E_L − E_{L−τ} ≥ 0, with E_L − E_{L−τ} ≈ 0 when L − τ ≫ 0, and E_L − E_{L−τ} → E_L ≈ E as τ → L. Meanwhile, the myopic excess entropy of the output process is a non-decreasing function of τ that approaches the past-future quantum mutual information of the output process: E′_τ ≈ E′ whenever τ ≫ 0.

I.1 Implications for work extraction
What are the implications for work extraction? We note that after τ time steps of work extraction from a quantum pattern, the first τ subsystems of the pattern have been brought to their equilibrium states, while the other subsystems remain unaltered. When there is still plenty of input pattern remaining, i.e., when L − τ is sufficiently large that E_L − E_{L−τ} ≈ 0, we find ⟨W^ext_τ⟩ ≤ τ w_ideal − k_B T I_τ. On the other hand, when the entire pattern has been consumed, I_L = 0 yet E_L − E_0 = E_L ≈ E for sufficiently large L, leading to ⟨W^ext_L⟩ ≤ L w_ideal − k_B T E.

J Non-Markovian generators
Unlike in the classical case, the classical control symbols X_t are hidden from direct observation when the process emits non-orthogonal quantum states. The latent-state generators may thus be referred to as 'doubly hidden Markov models'. Accordingly, even if the intermediary X_t process is Markovian, this would not directly imply any meaningful sense of quantum Markovianity of the outputs. Nevertheless, there is some sense in which processes with non-Markovian control outputs X_t have more deeply hidden structure.
To benchmark the performance of the memory-assisted protocol on a process with higher Markov order of the control symbols X_t, the 2-1 golden-mean process was chosen for comparison. The time-averaged density matrix of the memoryless approach is kept the same for both models: ξ_0 = (σ^(0) + σ^(1))/2. The comparison is shown in Fig. 8, where we see that more work is extracted from the non-Markovian generator. This example suggests that memory can become even more important for enabling work extraction from non-Markovian generators of quantum processes, since the extractable structure can be more deeply hidden.

Figure 2 :
Figure 2: Schematic diagram of a quantum-pattern engine. At each time step, the process takes as input a quantum system σ_{A_t} from the "fuel" tape, a reservoir qudit R, a battery B, and a memory M. The 'Work extraction' box should be interpreted as a memory-dependent unitary. The states of the battery and memory are recycled.

Figure 3 :
Figure 3: The protocol proceeds cyclically to fine-tune the belief state.

Figure 4 :
Figure 4: Comparison between average work-extraction rates of various approaches. p characterizes the transition probability between the two latent states of the perturbed-coin process, and r quantifies the overlap between the two quantum outputs. (a) Memory enhancement of work extraction. (b) Quantum enhancement of work extraction. Panels (c) and (d) reveal phase transitions in memory enhancement through cross-sections of parameter space. Analytic results (solid lines) and simulations (markers) are shown. Blue (squares) represents approach (i); black (circles) represents approach (ii); green (stars) represents approach (iii); red (triangles) represents approach (iv).

Let S(ρ) = −Tr(ρ ln ρ) denote the von Neumann entropy of ρ. If the engine operates on a very long quantum pattern with a von Neumann entropy rate s_vN := lim_{L→∞} S(ρ^(1:L))/L

Figure 6 :
Figure 6: Quasistatic evolution of the battery state over a finite number N of bath interactions, under the work-extraction protocol from Skrzypczyk et al. [13]. The red line represents the total nonequilibrium addition to free energy present in the initial input state.

Figure 7 :
Figure 7: Probability distribution of work extracted when using the Skrzypczyk work-extraction protocol with a total of N = 22 bath interactions. The red line represents the distribution when the protocol, thermodynamically ideal for some mixed state very close to |0⟩, acts on the pure state |0⟩. Blue represents the same protocol's work distribution when acting on a relatively non-orthogonal pure state |ψ⟩, where the fidelity between the two states is F(|0⟩, |ψ⟩) = |⟨0|ψ⟩|² = 4/5.

If we denote the net work extracted up to time τ as W^ext_τ = Σ_{t=1}^τ W_t, then the second law of thermodynamics tells us that the change in nonequilibrium free energy F^(M,1:L)_t

Figure 8 :
Figure 8: Comparison of average work extracted from the 2-1 golden-mean and perturbed-coin processes, varying the non-orthogonality parameter r. Blue and red dots represent the memory-assisted quantum approach on the golden-mean and perturbed-coin processes, respectively; the green line represents the memoryless approach.

Table 1 :
Summary of metadynamics in different regimes. The update function shows the nonlinear relationship between ϵ_t and ϵ_{t+1}. The belief evolution shows the evolution of ϵ_t over iterations, which gives rise to the corresponding work series, with two possible work values per belief state. The recurrent belief states show the recurrent metadynamics of the different regimes.