Charging a quantum battery with linear feedback control

Energy storage is a basic physical process with many applications. When considering this task at the quantum scale, it becomes important to optimise the non-equilibrium dynamics of energy transfer to the storage device or battery. Here, we tackle this problem using the methods of quantum feedback control. Specifically, we study the deposition of energy into a quantum battery via an auxiliary charger. The latter is a driven-dissipative two-level system subjected to a homodyne measurement whose output signal is fed back linearly into the driving field amplitude. We explore two different control strategies, aiming to stabilise either populations or quantum coherences in the state of the charger. In both cases, linear feedback is shown to counteract the randomising influence of environmental noise and allow for stable and effective battery charging. We analyse the effect of realistic control imprecisions, demonstrating that this good performance survives inefficient measurements and small feedback delays. Our results highlight the potential of continuous feedback for the control of energetic quantities in the quantum regime.


Introduction
The ability to control the quantum dynamics of mesoscopic systems has opened a broad research frontier in the physical sciences, spanning metrology [1], information processing [2], and non-equilibrium statistical mechanics [3]. With the growing complexity of experiments in these fields, a thorough understanding of the energetics of quantum systems is increasingly important, both as a diagnostic tool and to help design optimal control protocols. A fruitful way to gain insight is through specific examples of thermodynamic processes in the quantum regime, such as energy storage and extraction. To that end, quantum batteries (dynamical systems that receive and supply energy) have emerged as a useful paradigm to explore the fundamental limits and potential benefits of energy transduction with quantum degrees of freedom [4,5].
Mark T. Mitchison: mark.mitchison@tcd.ie

Even far from thermal equilibrium, the second law of thermodynamics (or an appropriately generalised version thereof) constrains the amount of useful work that can be extracted from a quantum system [6][7][8][9][10]. Nevertheless, various strategies have been developed to boost the extractable work, such as operating collectively on several systems [11,12] and exploiting quantum correlations [13,14] or coherences [15,16]. Likewise, it has been shown that collective operations can enhance the charging power of composite quantum batteries [17,18]. These predictions have inspired a substantial body of theoretical research aiming to harness quantum or many-body effects in order to improve the performance of energy storage devices. Numerous quantum battery architectures have since been proposed [19][20][21][22][23][24][25][26][27] and the effects of different physical phenomena, ranging from entanglement [28,29] to many-body localisation [30], have been extensively investigated.
One phenomenon that is especially relevant for practical energy storage is dissipation stemming from interactions with the environment. A good battery should be well isolated from its surroundings in order to prevent the loss of charge over time [31][32][33]. Yet a perfectly isolated battery is decoupled from all external energy sources and thus cannot be charged in the first place. A natural way to resolve this dilemma is to ensure that the power supply is physically separated from the system used for long-term energy storage. This can be achieved by supplying energy to an auxiliary system, the charger, which is allowed to interact with the battery in a controlled way [34,35]. However, the coupling between the charger and its external power supply necessarily introduces noise, which limits the charging process [36]. Recent proposals have shown how to mitigate the effect of environmental noise via measurements [37] or dark states [38,39], while other authors have instead suggested harnessing noise as a charging mechanism [40][41][42][43][44][45]. These approaches are notable for their stability, meaning that the battery's charge tends to a stationary value instead of oscillating over time [38,46]. This desirable feature mimics the behaviour of battery charging in everyday life and removes the need for precisely timed switching of the battery-charger coupling.
Here, we propose an alternative route based on continuous weak measurements and feedback control. This effectively recasts the problem of dissipative battery charging as a feedback stabilisation protocol [47][48][49][50]. More precisely, we consider a homodyne-like measurement scheme, which leads to a dynamical description in terms of diffusive quantum trajectories [51-53]. This framework applies to a variety of experimental settings, including optical, atomic, and electronic systems, where continuous quantum feedback control has already been implemented [54,55]. The thermodynamics of such feedback has recently been studied both theoretically [56][57][58] and experimentally [59,60].
In order to explore the effect of feedback on the charging of an open quantum battery, we adopt a simple model that is analytically and numerically tractable, as detailed in Sec. 2. We consider a two-level charger coupled to a finite-dimensional battery and specialise to direct, linear feedback, where the driving strength is directly proportional to the measurement signal. In Sec. 3, we introduce and analyse two different control protocols, based on stabilising either populations or coherences in the charger. Despite its simplicity, linear feedback is shown to enable highly stable and effective energy transfer from the charger to the battery. In the ideal case of efficient measurements and instantaneous feedback, the battery can be charged perfectly (i.e. to its maximum-energy state). We also analyse the effect of measurement inefficiency and feedback delay in detail, finding good performance even in the presence of realistic imperfections. We discuss our results and suggest interesting future directions in Sec. 4. The Appendix provides an analysis of the effect of thermal noise and detuning between the charger and battery. Units where ħ = 1 are used throughout.

Description of the system
The objective is to pump energy into a $d$-level quantum battery, B, via a two-level system (qubit), C, which acts as the charger. The battery is modelled as a ladder of equidistant states separated by energy $\omega_0$, with the same energy splitting characterising the charger. The bare Hamiltonian is thus given by

$$\hat{H}_0 = \frac{\omega_0}{2}\hat{\sigma}_z + \omega_0 \hat{N}, \qquad (1)$$

where $\hat{\sigma}_{x,y,z}$ are standard Pauli operators describing the qubit and we defined the number operator of the battery

$$\hat{N} = \sum_{n=0}^{d-1} n\,|n\rangle\langle n| . \qquad (2)$$

During the charging process, the charger resonantly exchanges energy with the battery via the interaction

$$\hat{H}_{\rm int} = g\left(\hat{\sigma}_+ \hat{A} + \hat{\sigma}_- \hat{A}^\dagger\right), \qquad (3)$$

where $\hat{\sigma}_\pm = \frac{1}{2}(\hat{\sigma}_x \pm i\hat{\sigma}_y)$ are raising and lowering operators for the qubit, $g$ is the coupling strength, and we defined the lowering operator for the battery

$$\hat{A} = \sum_{n=1}^{d-1} |n-1\rangle\langle n| . \qquad (4)$$

Note that, since $[\hat{H}_0, \hat{H}_{\rm int}] = 0$, the interaction can in principle be switched on and off without affecting the average local energy of the charger and battery, $\langle \hat{H}_0 \rangle$ [34].
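As a quick numerical check of the commutation property quoted above, the operators can be built as small matrices. This is only a sketch under stated assumptions: the bare Hamiltonian is taken as $\omega_0\hat{\sigma}_z/2 + \omega_0\hat{N}$, the interaction as $g(\hat{\sigma}_+\hat{A} + \hat{\sigma}_-\hat{A}^\dagger)$ with a lowering operator that has unit matrix elements, and the battery levels are labelled $n = 0, \dots, d-1$. The function names and the parameter values are illustrative, not taken from the text.

```python
import numpy as np

# Qubit operators in the basis (|e>, |g>), so that sigma_z = diag(+1, -1).
sz = np.diag([1.0, -1.0])
sp = np.array([[0.0, 1.0], [0.0, 0.0]])   # sigma_+ = |e><g|
sm = sp.T                                  # sigma_- = |g><e|

def battery_ops(d):
    """Number operator N and lowering operator A of a d-level ladder (assumed forms)."""
    N = np.diag(np.arange(d, dtype=float))
    A = np.diag(np.ones(d - 1), k=1)       # A|n> = |n-1>, and A|0> = 0
    return N, A

def hamiltonians(d, omega0=1.0, g=0.1):
    """Bare and interaction Hamiltonians on the charger-battery product space."""
    N, A = battery_ops(d)
    H0 = 0.5 * omega0 * np.kron(sz, np.eye(d)) + omega0 * np.kron(np.eye(2), N)
    Hint = g * (np.kron(sp, A) + np.kron(sm, A.T))
    return H0, Hint

H0, Hint = hamiltonians(d=5)
comm = H0 @ Hint - Hint @ H0
print(np.allclose(comm, 0.0))              # checks [H0, Hint] = 0 numerically
```

Each exchange term raises the qubit by $\omega_0$ while lowering the battery by the same amount (or vice versa), which is why the commutator vanishes for any $d$ and $g$.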

Open-system dynamics and feedback control
The battery is assumed to be a well isolated system, so that its direct interaction with the environment can be neglected on the timescales under consideration. In contrast, the charger is an open quantum system that couples to an external field. This coupling allows energy to be pumped into the charger by a coherent driving tone, but also necessarily entails dissipation due to the many-mode field acting as a reservoir. We focus on the low-temperature regime, where the thermal energy is much less than $\omega_0$ so that the probability of photon absorption from the reservoir field is negligible. Working in an interaction picture with respect to $\hat{H}_0$ and invoking the rotating-wave approximation, the coherent drive is described by the Hamiltonian

$$\hat{H}_{\rm drive}(t) = \Omega(t)\,\hat{\sigma}_y, \qquad (5)$$

where the Rabi frequency $\Omega(t)$ is proportional to the amplitude of the driving field. In the absence of any feedback, the driven-dissipative dynamics is then described by the master equation

$$\frac{d\hat{\rho}}{dt} = -i[\hat{H}(t), \hat{\rho}] + \Gamma\left(\hat{\sigma}_-\hat{\rho}\,\hat{\sigma}_+ - \frac{1}{2}\{\hat{\sigma}_+\hat{\sigma}_-, \hat{\rho}\}\right), \qquad (6)$$

where $\hat{H}(t) = \hat{H}_{\rm int} + \hat{H}_{\rm drive}(t)$ and $\Gamma$ is the spontaneous emission rate. Equation (6) is based on the Born-Markov and rotating-wave approximations, where the latter requires $\omega_0$ to be the largest energy scale in the system. In particular, we assume that $\omega_0 \gg |\Omega|, g, \Gamma$, and that $\Omega(t)$ varies slowly compared to the fast timescale $\omega_0^{-1}$ (see footnote 1).

An optimal driving protocol $\Omega(t)$ should maximise the energy deposited in the battery that can subsequently be extracted. A simple approach is to drive the qubit with a constant amplitude, i.e. $\Omega(t) = \Omega_0 = {\rm const.}$, as considered in Ref. [36]. However, this strategy is only effective at charging the battery transiently, with the extractable work oscillating over time and eventually decaying to a fraction of the maximum.

Footnote 1: This assumption is consistent with the feedback protocol introduced in Eq. (8) so long as the white noise in Eq. (7) is understood as an idealisation of the true noise at the detector output, which of course has a finite correlation time $t_c$. The rotating-wave approximation is justified so long as $\omega_0 \gg t_c^{-1}$.
We note that similar transient oscillatory effects have been identified in small absorption refrigerators [61][62][63][64]. The ineffectiveness of the charger at long times is caused by spontaneous emission, which randomises the phase coherence of the driven qubit and prevents a stationary population inversion from being established. This motivates our alternative approach, where phase information that is lost to the environment is partially regained by a weak measurement and then fed back into the control field.

We assume that some of the spontaneously emitted photons are collected and measured with a homodyne interferometer, as shown in Fig. 1. The total measurement efficiency is denoted by $\eta \leq 1$, which incorporates both the fraction of collected photons, $\eta_c$, and the detector efficiency, $\eta_d$, so that $\eta = \eta_c \eta_d$. The resulting homodyne current is represented by an appropriately normalised and shifted measurement record [52,65,66]

$$r(t)\,dt = {\rm Tr}[\hat{\sigma}_x \hat{\rho}_r(t)]\,dt + \frac{dw(t)}{\sqrt{\eta\Gamma}}, \qquad (7)$$

where $\hat{\rho}_r(t)$ is the quantum state (in the interaction picture) conditioned on the measurement record and $dw(t)$ is a Wiener increment that represents noise in the detector output. The average measurement signal yields an estimate of the qubit coherence as ${\rm E}[r(t)] = \langle\hat{\sigma}_x\rangle$, where ${\rm E}[\bullet]$ denotes an average over the noise (see footnote 2). We emphasise that the optical apparatus depicted in Fig. 1, similar to optomechanical realisations [67][68][69], is selected mainly for illustrative purposes. Analogous continuous weak measurements have been implemented on other platforms, for example, using microwave fields [59,70] or electrical currents [53,71,72].

Feedback is enacted by applying a driving field that depends on the results of the measurement. We consider the simplest case of direct feedback, in which the drive amplitude is proportional to the measurement record, i.e.

$$\Omega(t) = \Omega_0 - f\, r(t - \tau), \qquad (8)$$
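where $\Omega_0$ is a constant drive, $f$ parametrises the strength of feedback, and we have allowed for a small time delay $\tau > 0$ in the feedback loop. Since feedback is applied after the measurement record is read out, the evolution is described by the map [1]

$$\hat{\rho}_r(t+dt) = e^{-i\hat{H}(t)dt}\left\{\hat{\rho}_r(t) + \mathcal{K}[\hat{\rho}_r(t)]\,dt + \sqrt{\eta\Gamma}\,\mathcal{H}[\hat{\sigma}_-]\hat{\rho}_r(t)\,dw(t)\right\}e^{i\hat{H}(t)dt}, \qquad (9)$$

with $\hat{H}(t) = \hat{H}_{\rm int} + \Omega(t)\hat{\sigma}_y$ and $\Omega(t)$ given by Eq. (8). Here

$$\mathcal{K}[\hat{\rho}] = \Gamma\left(\hat{\sigma}_-\hat{\rho}\,\hat{\sigma}_+ - \frac{1}{2}\{\hat{\sigma}_+\hat{\sigma}_-, \hat{\rho}\}\right) \qquad (10)$$

represents the dissipative part of the evolution, while $\mathcal{H}[\hat{\sigma}_-]\hat{\rho} = \hat{\sigma}_-\hat{\rho} + \hat{\rho}\,\hat{\sigma}_+ - {\rm Tr}[\hat{\sigma}_x\hat{\rho}]\,\hat{\rho}$ is the innovation superoperator and $dw(t)$ is the same Wiener increment appearing in Eq. (7). Expectation values are computed from the ensemble-averaged density matrix $\hat{\rho}(t) = {\rm E}[\hat{\rho}_r(t)]$.

In the ideal case of negligible feedback delay, $\tau \to 0$, it is possible to recover a Markovian master equation for $\hat{\rho}(t)$. In particular, by expanding Eq. (9) to first order in $dt$ while applying the rules of Itô calculus (i.e. $dw^2 = dt$), one obtains [1]

$$\begin{aligned}\frac{d\hat{\rho}}{dt} = {} & -i[\hat{H}_{\rm int} + \Omega_0\hat{\sigma}_y, \hat{\rho}] + i f\,[\hat{\sigma}_y, \hat{\sigma}_-\hat{\rho} + \hat{\rho}\,\hat{\sigma}_+] \\ & + \Gamma\left(\hat{\sigma}_-\hat{\rho}\,\hat{\sigma}_+ - \frac{1}{2}\{\hat{\sigma}_+\hat{\sigma}_-, \hat{\rho}\}\right) + \frac{f^2}{\eta\Gamma}\left(\hat{\sigma}_y\hat{\rho}\,\hat{\sigma}_y - \hat{\rho}\right). \end{aligned} \qquad (11)$$

The first line above describes coherent evolution due to interactions and driving, while the second line describes the effect of spontaneous emission and the measurement noise that the drive feeds back into the system. For finite values of $\tau$, however, no such Markovian description is possible and it is necessary to solve the explicit Itô equation (9) and average over many trajectories.

It is straightforward to extend our model to describe finite-temperature dissipation and detuning between the charger and the battery. These effects are analysed in detail in the Appendix.

Footnote 2: Note that we assume that the measurement apparatus is arranged so that the measured field quadrature is orthogonal to the one driving the system. This definite phase relationship requires both fields to derive from the same source, as indicated in Fig. 1.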
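To make the structure of the Markovian feedback master equation concrete, the sketch below builds its generator for the charger qubit alone (battery decoupled, $g = 0$) and extracts the stationary value of $\langle\hat{\sigma}_z\rangle$. Several conventions are assumptions on our part rather than statements from the text: the drive is taken as $\Omega(t)\hat{\sigma}_y$, the feedback as $\Omega(t) = \Omega_0 - f\,r(t)$, and the noise in the record as $dw/\sqrt{\eta\Gamma}$, which together give a feedback term $i f[\hat{\sigma}_y, \hat{\sigma}_-\hat{\rho} + \hat{\rho}\hat{\sigma}_+]$ and a fed-back noise term $(f^2/\eta\Gamma)(\hat{\sigma}_y\hat{\rho}\hat{\sigma}_y - \hat{\rho})$. All function names are hypothetical.

```python
import numpy as np

# Pauli operators in the basis (|e>, |g>), so that sigma_z = diag(+1, -1).
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
sm = np.array([[0, 0], [1, 0]], dtype=complex)    # sigma_- = |g><e|
sp = sm.conj().T
I2 = np.eye(2, dtype=complex)

def left(A):
    """Superoperator for rho -> A rho (row-major vectorisation of rho)."""
    return np.kron(A, I2)

def right(B):
    """Superoperator for rho -> rho B (row-major vectorisation of rho)."""
    return np.kron(I2, B.T)

def dissipator(L):
    """Matrix of the Lindblad dissipator L rho L^dag - {L^dag L, rho}/2."""
    LdL = L.conj().T @ L
    return np.kron(L, L.conj()) - 0.5 * (left(LdL) + right(LdL))

def feedback_generator(f, Gamma, eta, Omega0=0.0):
    """Generator of the assumed Markovian feedback master equation (qubit only)."""
    comm_sy = left(sy) - right(sy)                       # rho -> [sigma_y, rho]
    drive = -1j * Omega0 * comm_sy
    feedback = 1j * f * comm_sy @ (left(sm) + right(sp))
    noise = (f**2 / (eta * Gamma)) * dissipator(sy)
    return drive + feedback + Gamma * dissipator(sm) + noise

def steady_sigma_z(f, Gamma, eta, Omega0=0.0):
    """Stationary <sigma_z> from the zero eigenvector of the generator."""
    vals, vecs = np.linalg.eig(feedback_generator(f, Gamma, eta, Omega0))
    rho = vecs[:, np.argmin(np.abs(vals))].reshape(2, 2)
    rho = rho / np.trace(rho)
    return float(np.real(np.trace(sz @ rho)))

for f in (0.25, 0.5, 1.0, 2.0):
    print(f, steady_sigma_z(f, Gamma=1.0, eta=0.3))
```

Under these assumptions the stationary $\langle\hat{\sigma}_z\rangle$ changes sign at $f = \Gamma/2$ and peaks at $f = \Gamma$, consistent with the behaviour described in Sec. 3.1.1; a different normalisation of the drive or record would shift these values.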

Energetics of the battery
At the end of the charging process, the charger is decoupled from the battery. The main energetic quantity of interest is then how much energy can subsequently be extracted from the battery as work. The mean energy of the battery is denoted by $E[\hat{\rho}_B] = {\rm Tr}[\hat{H}_B\hat{\rho}_B]$, where $\hat{H}_B = \omega_0\hat{N}$ is the battery Hamiltonian and $\hat{\rho}_B = {\rm Tr}_C[\hat{\rho}]$ is the battery's state, which may be far from equilibrium and exhibit significant energy fluctuations and entropy. As a result, the extractable work is generally less than $E[\hat{\rho}_B]$.
In order to quantify the useful portion of the deposited energy, we use the ergotropy [6], which upper-bounds the work that can be extracted from the battery by a cyclic variation of its Hamiltonian. Any such cyclic process generates a unitary operation $\hat{U}$, which reduces the average energy of the battery by an amount

$$W_{\rm ex} = {\rm Tr}[\hat{H}_B\hat{\rho}_B] - {\rm Tr}[\hat{H}_B\hat{U}\hat{\rho}_B\hat{U}^\dagger]. \qquad (12)$$

Since the unitary transformation is isentropic, an energy change $W_{\rm ex} > 0$ may be attributed entirely to extracted work. The ergotropy is defined as the maximum work extractable from the state $\hat{\rho}_B$ by any unitary, i.e. $\mathcal{E} = \max_{\hat{U}} W_{\rm ex}$.
To get an explicit form for the ergotropy, we write the state in its eigenbasis $\{|\psi_n\rangle\}$, as $\hat{\rho}_B = \sum_n p_n |\psi_n\rangle\langle\psi_n|$, with eigenvalues ordered so that $p_{n+1} \leq p_n$. The maximal work is extracted when $\hat{U}$ takes $\hat{\rho}_B$ to the corresponding passive state [73], $\hat{\pi}_{\rho_B} = \sum_n p_n |n\rangle\langle n|$, where $|n\rangle$ are the eigenvectors of $\hat{H}_B$ ordered by increasing energy, i.e. $\hat{N}|n\rangle = n|n\rangle$. The ergotropy is thus given by

$$\mathcal{E} = {\rm Tr}\left[\hat{H}_B\left(\hat{\rho}_B - \hat{\pi}_{\rho_B}\right)\right]. \qquad (13)$$

By definition, a passive state is diagonal in the energy eigenbasis, with more population in low-energy eigenstates than high-energy ones. Thermal equilibrium states are passive, for example. According to Eq. (13), any state possessing ergotropy must be non-passive, i.e. it must have population inversion or coherence in the energy eigenbasis. Ergotropy thus quantifies the degree to which the charger-mediated energy transfer is ordered (work-like), rather than entropic (heat-like) [74].
Following Ref. [16], we may explicitly distinguish the incoherent and coherent contributions to the ergotropy, so that $\mathcal{E} = \mathcal{E}_i + \mathcal{E}_c$. The incoherent ergotropy $\mathcal{E}_i$ can be defined operationally as the maximum work extractable by a coherence-preserving unitary operation, and it is specified by $\mathcal{E}_i = {\rm Tr}[\hat{H}_B(\hat{\delta}_{\rho_B} - \hat{\pi}_{\delta_{\rho_B}})]$, with $\hat{\delta}_{\rho_B} = \sum_n |n\rangle\langle n|\hat{\rho}_B|n\rangle\langle n|$ the dephased state in the energy eigenbasis. Because $\mathcal{E}_i$ depends only on the energy distribution of the state $\hat{\rho}_B$, it quantifies the work that is extractable solely by changing the populations in the energy eigenbasis. The remainder $\mathcal{E}_c = \mathcal{E} - \mathcal{E}_i$ thus isolates the contribution to ergotropy from coherence.
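The definitions above translate into a few lines of linear algebra. The following is a minimal sketch, not the authors' code: it computes the ergotropy as the mean energy minus the energy of the passive state, and takes the incoherent part to be the ergotropy of the dephased state, as described in the text. The helper names are ours, and the dephasing step assumes a non-degenerate Hamiltonian.

```python
import numpy as np

def ergotropy(rho, H):
    """Mean energy of rho minus the energy of its passive state."""
    p = np.sort(np.linalg.eigvalsh(rho))[::-1]   # populations, largest first
    e = np.sort(np.linalg.eigvalsh(H))           # energies, smallest first
    return float(np.real(np.trace(H @ rho)) - p @ e)

def dephase(rho, H):
    """Remove coherences of rho in the eigenbasis of H (assumed non-degenerate)."""
    _, V = np.linalg.eigh(H)
    diag = np.real(np.diag(V.conj().T @ rho @ V))
    return (V * diag) @ V.conj().T               # V diag(populations) V^dag

def ergotropy_split(rho, H):
    """Return (total, incoherent, coherent) ergotropy."""
    total = ergotropy(rho, H)
    incoherent = ergotropy(dephase(rho, H), H)
    return total, incoherent, total - incoherent

# Example: an equal superposition of the two levels of a qubit with H = diag(0, 1).
H = np.diag([0.0, 1.0])
psi = np.array([1.0, 1.0]) / np.sqrt(2)
print(ergotropy_split(np.outer(psi, psi.conj()), H))
```

In this example the state is pure, so all of its mean energy is extractable, but the dephased state is maximally mixed and therefore passive: the ergotropy is entirely coherent. A population-inverted diagonal state gives the opposite split.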

Optimal Markovian feedback
We begin by considering the ideal scenario of feedback with negligible delay, τ → 0. The ensemble dynamics in this case is defined by the Markovian master equation (11). Our aim is to choose the control parameters Ω 0 and f in order to maximise the final battery charge. In the following, we introduce two strategies, appropriate for different parameter regimes, based either on stabilising population inversion or coherence in the qubit charger.

Stabilising population inversion
The first control strategy aims to stabilise the state of the qubit charger as close to its excited state as possible [48]. This aim is met by setting $\Omega_0 = 0$ and $f > 0$ in Eq. (8), so that the Rabi frequency tends to have the opposite sign to the measurement record (7). The feedback mechanism can be understood intuitively by visualising the state of the qubit on the Bloch sphere, as depicted in the central inset of Fig. 1 (blue arrows). Whenever the Bloch vector rotates away from the vertical, the conditional state acquires a finite expectation value $\langle\hat{\sigma}_x\rangle_r = {\rm Tr}[\hat{\sigma}_x\hat{\rho}_r(t)]$, which is recorded in the homodyne signal. In response, the feedback applies a drive proportional to $-f\langle\hat{\sigma}_x\rangle_r$ that acts as a restoring force.
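To find the optimal value of $f$, we consider the state of the system at asymptotically long times, which is given by the stationary solution of the master equation (11), i.e. $d\hat{\rho}/dt = 0$. Following Ref. [75], we posit a product ansatz $\hat{\rho} = \hat{\rho}_C \otimes \hat{\rho}_B$ for the stationary state, where $\hat{\rho}_C$ and $\hat{\rho}_B$ are diagonal in the energy eigenbasis, viz.

$$\hat{\rho}_C = \frac{1}{2}\left(\hat{1} + \langle\hat{\sigma}_z\rangle\hat{\sigma}_z\right), \qquad \hat{\rho}_B = \sum_{n=0}^{d-1} p_n |n\rangle\langle n| , \qquad (14)$$

with successive populations related by a fixed ratio

$$\frac{p_{n+1}}{p_n} = R \equiv \frac{1 + \langle\hat{\sigma}_z\rangle}{1 - \langle\hat{\sigma}_z\rangle}. \qquad (15)$$

It is easy to check that such a state commutes with the interaction Hamiltonian, $[\hat{H}_{\rm int}, \hat{\rho}_C \otimes \hat{\rho}_B] = 0$. Writing the master equation (11) as $d\hat{\rho}/dt = -i[\hat{H}_{\rm int}, \hat{\rho}] + \mathcal{L}_C\hat{\rho}$, the stationary state is then obtained by choosing $\hat{\rho}_C$ so that $\mathcal{L}_C\hat{\rho}_C = 0$, with $\mathcal{L}_C$ acting only on the qubit degrees of freedom. The solution of this equation for $\Omega_0 = 0$ is fully characterised by the expectation value

$$\langle\hat{\sigma}_z\rangle = \frac{\eta\Gamma(2f - \Gamma)}{2f^2 - 2\eta\Gamma f + \eta\Gamma^2}, \qquad (16)$$

which is maximised by the choice $f = \Gamma$, where it takes the value

$$\langle\hat{\sigma}_z\rangle = \frac{\eta}{2 - \eta}. \qquad (17)$$

This condition is quite intuitive, as it balances the rate of dissipation, which tends to destroy population inversion, against the rate of feedback driving, which tends to restore it. The maximum asymptotic charge of the battery can now be found directly from Eqs. (14), (15), and (17). The corresponding energy and ergotropy are given by

$$E = \omega_0\left[d - 1 - \frac{1}{R - 1} + \frac{d}{R^d - 1}\right], \qquad \mathcal{E} = \max\{0,\; 2E - \omega_0(d-1)\}, \qquad (18)$$

with $R = (1 - \eta)^{-1}$ for the optimal feedback, $f = \Gamma$.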
The above solution can be interpreted in terms of thermalisation between the charger and the battery at a virtual temperature given by T v = −ω 0 / ln R [75,76]. For any non-zero efficiency we have R > 1 and T v < 0, implying that the battery is placed in a population-inverted state with finite ergotropy. This occurs irrespectively of the value of g, as a consequence of the property [Ĥ int ,Ĥ 0 ] = 0 of the interaction Hamiltonian (3). The maximum possible ergotropy is E max = ω 0 (d−1), which we adopt as a convenient reference energy scale. For concreteness, we take a particular representative value d = 20 in most of the following examples. Choosing another value of d would merely rescale the final battery charge and charging time; see the Appendix for a detailed analysis.
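The stationary charge implied by the geometric population ratio can be evaluated directly. The sketch below assumes battery levels $n = 0, \dots, d-1$ with energies $n\omega_0$, populations $p_n \propto R^n$, and the optimal-feedback ratio $R = (1-\eta)^{-1}$ quoted in the text; the function names are hypothetical and the case $\eta = 1$ (where $R$ diverges) is excluded.

```python
import numpy as np

def battery_populations(R, d):
    """Normalised populations p_n proportional to R**n for n = 0..d-1."""
    logw = np.arange(d) * np.log(R)
    w = np.exp(logw - logw.max())          # subtract the maximum to avoid overflow
    return w / w.sum()

def steady_state_charge(eta, d, omega0=1.0):
    """Stationary (energy, ergotropy) at f = Gamma, assuming R = 1/(1 - eta), eta < 1."""
    R = 1.0 / (1.0 - eta)
    p = battery_populations(R, d)
    levels = omega0 * np.arange(d)
    energy = float(p @ levels)
    passive = float(np.sort(p)[::-1] @ levels)   # largest population on lowest level
    return energy, energy - passive

for eta in (0.0, 0.3, 0.9):
    E, W = steady_state_charge(eta, d=20)
    print(eta, E / 19.0, W / 19.0)          # normalised by E_max = omega0 * (d - 1)
```

For $\eta = 0$ the ratio is $R = 1$, which gives a uniform (infinite-temperature) distribution with half the maximum energy and no ergotropy; any $\eta > 0$ yields an inverted distribution whose ergotropy is $2E - E_{\max}$.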
In Fig. 2 we plot the steady-state battery energy and ergotropy as a function of the feedback strength.
We see that efficient measurements allow for perfect charging, in the sense that $E = \mathcal{E} = E_{\max}$ at the maximum where $f = \Gamma$. Crucially, however, Fig. 2 shows that rather inefficient measurements with $\eta = 0.3$ still lead to significant energy and ergotropy deposited in the battery. Therefore, even when the majority of spontaneously emitted photons are irreversibly lost to the environment, it remains possible to exploit the information gained from the weak measurement to stabilise the battery in a charged state. Since this state is diagonal in the battery's Hamiltonian eigenbasis, the ergotropy $\mathcal{E} = \mathcal{E}_i$ is purely incoherent in this case.
Another notable feature of Fig. 2 is the sharp disappearance of the ergotropy at $f = \Gamma/2$. This reflects the fact that $\langle\hat{\sigma}_z\rangle$ changes sign to become negative for $f < \Gamma/2$, as can be seen from Eq. (16). Therefore, the population inversion of the charger and, correspondingly, the ergotropy of the battery both vanish when the drive is too weak. In the following section, we consider an alternative approach that is appropriate for this weak-driving regime.

Stabilising coherence
The second control strategy targets the coherence of the qubit charger. We will see that this approach allows for charging even when the driving Rabi frequency is much smaller than the dissipation rate. We therefore restrict our considerations to the regime where $|f| < \Gamma/2$, in which we find numerically that population inversion cannot be generated. Nevertheless, by choosing $\Omega_0 \neq 0$ and $f < 0$, it is possible to stabilise qubit states with finite coherence in the lower half of the Bloch sphere [47,48]. An intuitive picture of this feedback mechanism can be obtained by inspecting the inset of Fig. 1 (red arrows). The constant drive and the conditional feedback either counterbalance or reinforce each other depending on whether the Bloch vector lies in the left or right hemisphere, generating a finite value of $\langle\hat{\sigma}_x\rangle$ on average.
Unlike in the previous section, an analytical derivation of the optimal feedback parameters for arbitrary $d$ is hindered by the presence of coherences and correlations between the charger and the battery. We therefore proceed numerically by finding the zero eigenvector of the generator in Eq. (11), corresponding to the stationary solution of the master equation. In Fig. 3, we plot the asymptotic energy and ergotropy as a function of $\Omega_0$ for an example with $\Gamma = 10g$ and $\eta = 0.3$. We observe that the optimal drive strength is on the order of the coupling $g$. Crucially, adding feedback with $f < 0$ increases the maximum ergotropy that can be stored in the battery, even for inefficient measurements. We have found that larger values of $|f|$ lead to an increase in the peak ergotropy, and similar behaviour is found for other parameter choices satisfying $\Gamma \gg g, f, \Omega_0$. However, the attainable ergotropy is generally small in comparison to the control strategy discussed in Sec. 3.1.1. Perhaps unsurprisingly, stabilising the charger's coherence leads to a build-up of almost purely coherent ergotropy in the battery. This can be seen from the blue lines in Fig. 3, which demonstrate that the incoherent ergotropy $\mathcal{E}_i$ is negligibly small or zero and thus $\mathcal{E} \approx \mathcal{E}_c$. Interestingly, the inclusion of feedback leads to a small decrease in the battery's mean energy, even though the ergotropy is increased. This shows that the primary advantage of feedback in this case is to increase the purity and coherence of the battery's state.

Dynamics of the charging process
In the previous section we discussed optimal feedback strategies that maximise the final battery charge. We now focus on the dynamics of the charging process, assuming that both battery and charger are initialised in their respective ground states. First we consider the case where the control is set to optimally stabilise the charger's population inversion, f = Γ. Some representative results for the time-dependent energy and ergotropy are shown in Fig. 4, obtained by numerically solving the master equation (11) with efficiency η = 0.3. We see that both the energy and ergotropy grow monotonically towards their asymptotic value. This highlights the stability of the charging process, meaning that the precise instant at which the battery is extracted is unimportant so long as sufficient time has elapsed.
To emphasise the random character of the underlying measurement and feedback process, we include in Fig. 4 some trajectories obtained by solving the stochastic master equation (9) for the same parameters. We numerically integrate Eq. (9) using the Euler-Milstein scheme proposed in Ref. [77], which guarantees complete positivity. Each trajectory represents a possible outcome of a single run of the charging protocol. The battery energies for different trajectories show a significant dispersion during the transient energy transfer process, yet these fluctuations are strongly suppressed in the steady state by the stabilising effect of the feedback. This indicates that the charger works not only in an ensemble-averaged sense but also at the single-trajectory level, even at low measurement efficiency. This remarkable effectiveness is partly due to the assumption of Markovian feedback: the performance will be seen in Sec. 3.3 to deteriorate substantially when large time delays in the feedback loop are considered.
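For orientation, a single measurement-and-feedback trajectory can be sketched as follows. This is not the scheme of Ref. [77] as such, but a generic positivity-preserving update written in Kraus form, restricted to the charger qubit alone with zero feedback delay. It assumes the conventions used above as working hypotheses: drive Hamiltonian $\Omega(t)\hat{\sigma}_y$, feedback $\Omega = -f r$ with $\Omega_0 = 0$, and record $r\,dt = \langle\hat{\sigma}_x\rangle dt + dw/\sqrt{\eta\Gamma}$. All names and parameter values are illustrative.

```python
import numpy as np

# Pauli operators in the basis (|e>, |g>), so that sigma_z = diag(+1, -1).
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
sm = np.array([[0, 0], [1, 0]], dtype=complex)    # sigma_- = |g><e|
I2 = np.eye(2, dtype=complex)

def feedback_step(rho, dt, Gamma, eta, f, rng):
    """One measurement-then-feedback step for the charger qubit (zero delay)."""
    L = np.sqrt(Gamma) * sm
    dw = rng.normal(0.0, np.sqrt(dt))
    # Unnormalised homodyne increment; dy / sqrt(eta*Gamma) plays the role of r*dt.
    dy = np.sqrt(eta * Gamma) * np.real(np.trace(sx @ rho)) * dt + dw
    # Measurement update in Kraus form: a sum of positive terms, then renormalise.
    M = I2 - 0.5 * (L.conj().T @ L) * dt + np.sqrt(eta) * L * dy
    new = M @ rho @ M.conj().T + (1.0 - eta) * dt * (L @ rho @ L.conj().T)
    new = new / np.real(np.trace(new))
    # Feedback unitary exp(i*theta*sigma_y), generated by H_fb = -f * r * sigma_y.
    theta = f * dy / np.sqrt(eta * Gamma)
    U = np.cos(theta) * I2 + 1j * np.sin(theta) * sy
    return U @ new @ U.conj().T

def run_trajectory(T, dt, Gamma=1.0, eta=1.0, f=1.0, seed=0):
    """Integrate one trajectory from the ground state; return final state and <sigma_z>(t)."""
    rng = np.random.default_rng(seed)
    rho = np.array([[0, 0], [0, 1]], dtype=complex)   # ground state |g><g|
    sz_path = []
    for _ in range(int(round(T / dt))):
        rho = feedback_step(rho, dt, Gamma, eta, f, rng)
        sz_path.append(np.real(np.trace(sz @ rho)))
    return rho, np.array(sz_path)

rho_final, sz_path = run_trajectory(T=5.0, dt=1e-3, eta=0.3)
print(sz_path[-1])
```

Because the update is a completely positive map followed by normalisation and a unitary, the state stays a valid density matrix at every step regardless of the step size; accuracy, as opposed to positivity, still requires $\Gamma\,dt \ll 1$. Averaging many such trajectories with different seeds approximates the ensemble dynamics.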
For a fixed efficiency, the time taken for the battery to reach its maximum charge depends on a competition between the rate of dissipation $\Gamma$ and the coupling strength $g$. In order to quantify the charging time precisely, we identify the time $T$ at which the battery's energy differs from its asymptotic value by a fractional error $\epsilon$, i.e.

$$\frac{|E(T) - E(\infty)|}{E(\infty)} = \epsilon,$$

where we choose a small (arbitrary) value $\epsilon = 10^{-2}$. The behaviour of the charging time is shown in Fig. 5 as a function of the driving and dissipation rate $f = \Gamma$, for three different values of the measurement efficiency. Naturally, the charging process is fastest for efficient measurements, becoming progressively slower as $\eta$ is reduced below unity. We also find that for each $\eta$ there exists an optimal value of the coupling $\Gamma$ that minimises the charging time. Such an optimum is expected, since for $\Gamma = f \ll g$ the charging speed is limited by the small power input, while for $\Gamma \gg g$ the system enters a quantum Zeno regime where energy transfer from charger to battery is inhibited by frequent spontaneous emissions. Since efficient measurements use every emitted photon to improve the feedback, the decohering effect of spontaneous emission is most dramatic at low efficiencies. The optimum driving and dissipation strength is therefore $f = \Gamma \sim g$, with the optimal $\Gamma$ decreasing with $\eta$, as shown in Fig. 5.
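Given a simulated energy curve, the charging time defined above can be read off numerically. The sketch below takes the last sample as the asymptotic value (so it assumes the simulation has run long enough to converge) and returns the first sampled time after which the fractional deviation stays within $\epsilon$. The function name and the test curve are illustrative only.

```python
import numpy as np

def charging_time(t, energy, eps=1e-2):
    """First sampled time after which |E(t) - E(inf)| <= eps * |E(inf)| holds for good."""
    t = np.asarray(t, dtype=float)
    energy = np.asarray(energy, dtype=float)
    e_inf = energy[-1]                          # assumes the run has converged
    outside = np.abs(energy - e_inf) > eps * abs(e_inf)
    if not outside.any():
        return t[0]
    last = np.nonzero(outside)[0][-1]           # last sample still outside the band
    return t[min(last + 1, len(t) - 1)]

# Illustrative curve: exponential saturation with unit rate.
t = np.linspace(0.0, 20.0, 2001)
print(charging_time(t, 1.0 - np.exp(-t)))       # close to ln(100) for this curve
```

Requiring the deviation to remain inside the band, rather than merely touch it once, makes the definition usable for oscillating curves such as those of the coherence-based protocol as well as for monotonic ones.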
For comparison, in Fig. 6 we plot the battery dynamics obtained for the coherence-based charging protocol defined in Sec. 3.1.2. We focus on parameters near the optimal point of Fig. 3 where the final ergotropy is maximised. In the absence of feedback, a constant drive is seen in Fig. 6 to generate transient oscillations, with the ergotropy peaking at short times before settling to a smaller steady-state value. Feedback tends to reduce the magnitude of these oscillations and stabilise asymptotic states with greater ergotropy. However, the feedback also substantially increases the time taken to reach the stationary state. Overall, we observe that the timescale of the coherence-based charging protocol seen in Fig. 6 is significantly longer than in Figs. 4 and 5, because the large impedance mismatch when $\Gamma \gg f, g$ suppresses energy transport.

Effect of feedback delay
So far we have assumed that the feedback control is applied instantaneously, but any real feedback loop has some delay due to the finite response time of the detector and controller. In this section, we examine how this delay influences the efficacy of battery charging, focussing on the optimal case where $f = \Gamma$. Since the dynamics is no longer Markovian, it is necessary to simulate the stochastic master equation (9) explicitly [77] and average over many trajectories. We find that the effect of time delay in the feedback loop is negligible so long as $\Gamma\tau \ll 1$, in accordance with previous studies [50]. As the time delay $\tau$ increases, shot-to-shot fluctuations grow and attaining numerical convergence of the trajectory average becomes increasingly demanding. Our examples are therefore restricted to relatively small time delays, $\Gamma\tau \lesssim 0.5$, in order to obtain reliable results with moderate resources. We take up to 500 trajectories for each parameter set.
The charging dynamics is plotted in Fig. 7 for an example with $\Gamma\tau = 0.1$. The ensemble-averaged behaviour is qualitatively similar to the case with no delay, although the timescale to reach stationarity is increased, as can be seen by comparing the solid and dashed black lines in Fig. 7. However, the battery energy along individual trajectories exhibits significant dispersion around the average even in the steady state, marking a clear departure from Markovian feedback [cf. Fig. 4]. These fluctuations tend to increase the entropy of the ensemble and the achievable ergotropy is correspondingly reduced.
To examine the effect of delay on the final battery charge in more detail, we plot the steady-state energy and ergotropy as a function of τ in Fig. 8. With efficient measurements, η = 1, the delay has essentially no effect for Γτ < 0.1, while the energy and ergotropy are seen to progressively decrease for Γτ > 0.1. For inefficient measurements, in contrast, the attainable charge begins to deteriorate as soon as any finite delay is introduced, as shown by the dashed lines in Fig. 8.
As the feedback delay increases, very large differences arise between individual trajectories and the ergotropy of the ensemble-averaged state decays to zero. This indicates a complete breakdown of the feedback loop due to lag between the measurement backaction and the control response. Indeed, the measured value of $\langle\hat{\sigma}_x\rangle_r$ drifts over a timescale on the order of $\Gamma^{-1}$, by which time the delayed feedback is likely to drive the qubit away from the inverted state instead of towards it. This randomises the direction of the Bloch vector and leads the charger to a maximally mixed state, which is equivalent to infinite temperature. The battery then effectively thermalises with the qubit to a fully passive state with $E \approx 0.5 E_{\max}$. This tendency can be seen in the dashed curves of Fig. 8 for larger $\tau$.

Discussion
In this work, we have explored the use of linear feedback control to power a qubit charger coupled to a quantised, finite-dimensional battery. We have introduced two different control protocols based either on stabilising population inversion or quantum coherence, which respectively generate incoherent or coherent ergotropy. Both kinds of feedback have been shown to improve the stability and the asymptotic battery charge, as compared to the case with unconditional driving (no feedback). However, the approach of Sec. 3.1.1 based on population inversion has superior performance overall. In particular, this strategy allows for effective charging even under the realistic constraints of inefficient measurements and a small time delay in the feedback loop. Incoherent ergotropy is arguably preferable for energy storage because it is robust against dephasing and can be extracted without coherence-changing operations. In contrast, extracting work from coherent ergotropy, as generated by the protocol of Sec. 3.1.2, requires a coherent drive with a definite phase relationship to the original charging field.
Although we have simplified our model by taking the charger-battery interaction Hamiltonian to have energy-independent matrix elements, we expect our conclusions to apply in other settings, e.g. spin systems or mechanical oscillators, whenever near-resonant interactions are a good approximation. As discussed in the Appendix, detuning between the charger and battery frequencies increases the charging time but does not affect the final charge achievable under optimal feedback. We also show in the Appendix that our scheme tolerates a small amount of finite-temperature noise. More generally, our results highlight the potential of feedback, already well established in the context of refrigeration [67][68][69], for the manipulation of energy and ergotropy at the quantum scale. We have demonstrated that these quantities can be stabilised by controlling an auxiliary qubit with intuitive feedback strategies, while avoiding the need for complex time-dependent control over the battery system itself. While we have considered the simplest case of direct, linear feedback, the energetic and ergotropic capabilities of more sophisticated feedback protocols based on quantum state estimation [78,79] deserve further investigation.
It is worth emphasising that the metrics we use to assess performance refer to an ensemble of many identical, independent batteries. In particular, the ergotropy only quantifies the energy that can be extracted on average. We leave the important issue of extractable work fluctuations and charging precision to future research [80][81][82][83]. We also note that an experimenter could conceivably exploit their knowledge of the measurement record to optimise the work extraction step differently for each individual battery. The tools of single-shot statistical mechanics appear well suited to analyse this more involved scenario [84,85].
Our main focus has been the dynamics and energetics of the charging process, leaving open several interesting questions regarding the thermodynamic value of quantum measurements [86][87][88][89][90] and feedback [91][92][93] in this setting. Energetic constraints on discrete quantum feedback operations can be rigorously formulated by generalising the notion of entropy production to incorporate information gained by the controller [94][95][96][97][98]. Similar ideas have been applied to jump-like unravellings of open quantum system dynamics [99][100][101][102]. However, a comprehensive formulation of information thermodynamics along continuous, stochastic trajectories between quantum superposition states appears to be considerably more elusive, notwithstanding some notable recent progress [56][57][58][103][104][105]. Entropic considerations aside, any serious thermodynamic account of a putative quantum battery should include the energy needed for work extraction [106], not to mention the power consumption of the classical control apparatus, which typically exceeds quantum scales by dozens of orders of magnitude. These problems naturally motivate [107] a fully autonomous approach to feedback control in quantum thermodynamics, e.g. along the lines of Ref. [108].

A Appendix: Variations of the model
In this appendix, we analyse the role of the battery dimension and extend the model to include the effect of finite temperature and detuning. We focus for simplicity on the limit of Markovian feedback.

A.1 Generalised model
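In general, we write the charger and battery Hamiltonians as

$$\hat{H}_C = \frac{\omega_0}{2}\hat{\sigma}_z, \qquad \hat{H}_B = \omega_B \hat{N},$$

where the energy scales $\omega_0$ and $\omega_B$ may be different. We assume the driving field is resonant with the qubit at frequency $\omega_0$. In a frame rotating at this frequency, the total Hamiltonian then reads

$$\hat{H}(t) = \Delta_B \hat{N} + \hat{H}_{\rm int} + \hat{H}_{\rm drive}(t),$$

with $\Delta_B = \omega_B - \omega_0$ the charger-battery detuning, and where $\hat{H}_{\rm int}$ and $\hat{H}_{\rm drive}(t)$ are given by Eqs. (3) and (5), respectively.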
To model the effect of thermal noise, we assume that the uncollected photons are emitted into a thermal radiation bath with inverse temperature $\beta$, corresponding to a mean occupation $\bar{n} = (e^{\beta\omega_0} - 1)^{-1}$. For $\bar{n} \neq 0$, this opens the additional possibility of photon absorption by the qubit charger. The decay channel corresponding to photons collected by the detector is assumed to remain at effectively zero temperature. Therefore, the dissipative evolution is described by a superoperator (cf. Eq. (10)) that combines zero-temperature decay into the detection channel with the standard Lindblad dissipator describing thermal emission and absorption [1] for the uncollected photons. We recall that $\eta_c$ denotes the fraction of photons collected in the detection channel, leading to a total efficiency $\eta = \eta_c \eta_d$, where $\eta_d$ is the detector efficiency. Linear, Markovian feedback is described by Eq. (8) in the limit $\tau \to 0$. Following the procedure described in Sec. 2.2, we obtain a Markovian master equation for the ensemble-averaged density operator, Eq. (24). This reduces to Eq. (11) in the absence of detuning, $\omega_B = \omega_0$, and in the limit of zero temperature, $\beta\omega_0 \to \infty$.
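The two scalar quantities entering the thermal model can be evaluated directly. The following is a minimal sketch in units where $\hbar = k_B = 1$; the function names are illustrative and do not come from the paper.

```python
import numpy as np

def thermal_occupation(beta, omega0):
    """Mean bath occupation nbar = 1 / (exp(beta * omega0) - 1), with hbar = k_B = 1."""
    x = beta * omega0
    if x <= 0:
        raise ValueError("beta * omega0 must be positive")
    # expm1 stays accurate when beta * omega0 is small (high temperature).
    return 1.0 / np.expm1(x)

def total_efficiency(eta_c, eta_d):
    """Total measurement efficiency eta = eta_c * eta_d."""
    return eta_c * eta_d

# beta * omega0 = ln(3/2) corresponds to nbar = 2.
nbar = thermal_occupation(np.log(1.5), 1.0)
```

For large $\beta\omega_0$ the occupation vanishes exponentially, which is the zero-temperature limit in which the model reduces to that of the main text.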

A.2 Battery dimension
Let us first briefly address the role of the battery dimension $d$. For this analysis, it is sufficient to assume that the charger and battery frequencies are resonant, $\omega_B = \omega_0$, and take the zero-temperature limit, as in the main text. Fig. 9 shows the time evolution of the battery charge for three values of $d$. Here, we do not normalise by $E_{\max} = \omega_0(d-1)$, in order to better highlight the differences between the three cases. At short times, i.e. when $E \ll E_{\max}$, the battery energies and ergotropies evolve identically for all three values of $d$, while at later times the curves depart from each other due to the upper-bounded spectrum of the battery. Larger values of $d$ allow for a greater steady-state energy and ergotropy, although the battery takes longer to fully charge. Overall, the evolution is qualitatively similar for all values of $d$, and this conclusion also holds for other parameter choices.
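For reference, the two figures of merit can be computed for any battery state as follows. This is a minimal sketch that uses the standard definition of ergotropy (mean energy minus the energy of the passive state with the same spectrum) and assumes a battery Hamiltonian $\omega_0 \hat{N}$ with levels $0, \omega_0, \dots, \omega_0(d-1)$; the function names are illustrative.

```python
import numpy as np

def battery_hamiltonian(d, omega0=1.0):
    """Assumed battery Hamiltonian omega0 * N with equally spaced levels 0 .. omega0*(d-1)."""
    return omega0 * np.diag(np.arange(d, dtype=float))

def energy_and_ergotropy(rho, H):
    """Return (mean energy, ergotropy) of the state rho with respect to H."""
    energy = float(np.real(np.trace(rho @ H)))
    # The passive state pairs the largest populations with the lowest energies.
    populations = np.sort(np.linalg.eigvalsh(rho))[::-1]  # descending
    levels = np.sort(np.linalg.eigvalsh(H))               # ascending
    passive_energy = float(np.dot(populations, levels))
    return energy, energy - passive_energy

d = 3
H = battery_hamiltonian(d)
full = np.zeros((d, d)); full[d - 1, d - 1] = 1.0  # fully charged pure state
mixed = np.eye(d) / d                              # maximally mixed state
```

A fully charged pure state has energy and ergotropy both equal to $E_{\max}$, whereas the maximally mixed state stores energy $E_{\max}/2$ but has zero ergotropy.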

A.3 Finite temperature and detuning
Now we examine the effect of finite temperature, allowing also for a finite detuning, $\Delta_B \neq 0$. We focus on the control strategy where population inversion is stabilised, and thus set $\Omega_0 = 0$. In this case, it is possible to solve exactly for the steady state following the method described in Sec. 3.1.1. We again posit the product ansatz $\hat{\rho} = \hat{\rho}_C \otimes \hat{\rho}_B$, with factors given formally by Eq. (14). It is then straightforward to check that $[\Delta_B \hat{N} + \hat{H}_{\rm int}, \hat{\rho}] = 0$ for this state. The state of the charger follows from the solution of $\mathcal{L}_C \hat{\rho}_C = 0$, where the dissipator $\mathcal{L}_C$ acts only on the qubit and is given explicitly by the second and third lines of Eq. (24). The resulting steady-state population inversion is given in Eq. (25). The steady-state properties are manifestly independent of the detuning, $\Delta_B$, which affects only the transient dynamics under this control strategy. For a given temperature and collection efficiency, the optimum feedback strength is found by maximising Eq. (25) with respect to the ratio $f/\Gamma$. The solution is given by Eq. (26), and the corresponding population inversion can be compared with the zero-temperature result (cf. Eq. (17)).

Therefore, finite-temperature dissipation generally reduces the maximal population inversion of the qubit charger, and increases the feedback strength necessary to stabilise it. In particular, perfect charging is no longer possible because $\langle\hat{\sigma}_z\rangle < 1$. The corresponding steady-state battery charge is plotted in Fig. 10 as a function of the thermal occupation. The ergotropy and energy are both seen to decrease monotonically with increasing temperature. A similar reduction in charging performance with temperature is observed for other parameters, including coherent stabilisation strategies with $\Omega_0 \neq 0$. Physically, these results can be understood in terms of the randomising effect of thermal noise. Even though thermal absorption injects energy into the charger, it does so incoherently and therefore reduces the attainable purity of the qubit state. This ultimately prevents the feedback loop from stabilising ergotropy in the battery. As a consequence, finite temperature also increases the time taken to reach the steady state.

Figure 11: Dynamics of the charging process under optimal feedback, with $\Delta_B = 0$ and $\bar{n} = 0$ (solid lines), $\Delta_B = g$ and $\bar{n} = 0$ (dashed lines), and $\Delta_B = 0$ and $\bar{n} = 2$ (dot-dashed lines). We also take $\Gamma = 2g$, $\Omega_0 = 0$ and $f$ given by Eq. (26).
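The randomising effect of thermal noise on the charger can be illustrated numerically by solving $\mathcal{L}_C \hat{\rho}_C = 0$ for a simplified dissipator. The sketch below is not the feedback dissipator of Eq. (24): it assumes no feedback ($f = 0$) and that all photons are emitted into the thermal bath, so that only emission at rate $\Gamma(\bar{n}+1)$ and absorption at rate $\Gamma\bar{n}$ remain. The function names and the row-major vectorisation are choices made here for illustration.

```python
import numpy as np

# Qubit operators in the basis (|e>, |g>).
SM = np.array([[0, 0], [1, 0]], dtype=complex)  # sigma_- = |g><e|
SP = SM.conj().T                                # sigma_+ = |e><g|
SZ = np.diag([1.0, -1.0]).astype(complex)

def dissipator(L):
    """Matrix of D[L] rho = L rho L^dag - {L^dag L, rho}/2 acting on row-major vec(rho)."""
    I = np.eye(L.shape[0])
    LdL = L.conj().T @ L
    return np.kron(L, L.conj()) - 0.5 * np.kron(LdL, I) - 0.5 * np.kron(I, LdL.T)

def steady_state(liouvillian):
    """Null vector of the Liouvillian, reshaped and normalised to unit trace."""
    n = int(round(np.sqrt(liouvillian.shape[0])))
    _, _, vh = np.linalg.svd(liouvillian)
    rho = vh[-1].conj().reshape(n, n)  # right singular vector of the smallest singular value
    return rho / np.trace(rho)

def thermal_inversion(nbar, gamma=1.0):
    """Steady-state <sigma_z> for thermal emission and absorption only (no feedback)."""
    liou = gamma * (nbar + 1) * dissipator(SM) + gamma * nbar * dissipator(SP)
    rho = steady_state(liou)
    return float(np.real(np.trace(rho @ SZ)))
```

In this feedback-free limit the result agrees with the textbook value $\langle\hat{\sigma}_z\rangle = -1/(2\bar{n}+1)$: thermal absorption raises the excited-state population but yields a mixed state rather than an inverted one. The same null-space approach could be applied to the full feedback dissipator once its explicit form is supplied.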
We illustrate the charging dynamics in Fig. 11, which compares the evolution of the energy and ergotropy under the optimal feedback (26) with and without thermal noise and detuning. We see that both of these effects slow down the charging process. However, detuning has no effect whatsoever on the steady-state battery charge. Note that this conclusion holds only under the assumption of an excitation-preserving coupling in the form of Eq. (3). A large value of the detuning, $\Delta_B \gg g$, would activate any counter-rotating terms that are neglected when assuming an interaction of this form. These contributions open up additional transition pathways that could significantly modify the dynamics for large $\Delta_B$.
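The role of the excitation-preserving assumption can be checked with a small numerical example. The sketch below assumes an interaction of the form $g(\hat{\sigma}_+ \hat{A} + \hat{\sigma}_- \hat{A}^\dagger)$, where $\hat{A}$ lowers the battery by one level with unit matrix elements; this explicit form is an assumption made here, consistent with the energy-independent matrix elements mentioned in the text, and the function names are illustrative.

```python
import numpy as np

def build_operators(d):
    """Operators on qubit (basis |e>, |g>) tensor a d-level battery."""
    sp = np.array([[0.0, 1.0], [0.0, 0.0]])  # sigma_+ = |e><g|
    sm = sp.T                                # sigma_-
    A = np.diag(np.ones(d - 1), k=1)         # A|n> = |n-1>, unit matrix elements (assumed)
    N = np.diag(np.arange(d, dtype=float))   # battery number operator
    I2, Id = np.eye(2), np.eye(d)
    return {
        "N_B": np.kron(I2, N),
        "N_tot": np.kron(sp @ sm, Id) + np.kron(I2, N),    # total excitation number
        "H_rot": np.kron(sp, A) + np.kron(sm, A.T),        # excitation-preserving coupling
        "H_counter": np.kron(sp, A.T) + np.kron(sm, A),    # counter-rotating terms
    }

def commutator_norm(X, Y):
    """Frobenius norm of the commutator [X, Y]."""
    return float(np.linalg.norm(X @ Y - Y @ X))

ops = build_operators(d=4)
g, delta_B = 1.0, 0.7
H = delta_B * ops["N_B"] + g * ops["H_rot"]
conserved = commutator_norm(ops["N_tot"], H)               # vanishes
violated = commutator_norm(ops["N_tot"], ops["H_counter"]) # does not vanish
```

Under these assumptions the total excitation number commutes with $\Delta_B \hat{N} + \hat{H}_{\rm int}$ for any detuning, whereas counter-rotating terms change it by two and therefore break the conservation law.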