Quasi-probability distributions for observables in dynamic systems

We develop a general framework to investigate fluctuations of non-commuting observables. To this end, we consider the Keldysh quasi-probability distribution (KQPD). This distribution provides a measurement-independent description of the observables of interest and their time-evolution. Nevertheless, positive probability distributions for measurement outcomes can be obtained from the KQPD by taking into account the effect of measurement back-action and imprecision. Negativity in the KQPD can be linked to an interference effect and acts as an indicator for non-classical behavior. Notable examples of the KQPD are the Wigner function and the full counting statistics, both of which have been used extensively to describe systems in the absence as well as in the presence of a measurement apparatus. Here we discuss the KQPD and its moments in detail and connect it to various time-dependent problems including weak values, fluctuating work, and Leggett-Garg inequalities. Our results are illustrated using the simple example of two subsequent, non-commuting spin measurements.


Introduction
Quasi-probability distributions such as the Wigner function have been an important tool in quantum mechanics since the early days of the theory [1]. Indeed, the Wigner function lies at the heart of the phase-space representation of quantum mechanics which is tightly connected to classical statistical mechanics and provides an alternative route to quantization [2]. In classical statistical mechanics, averages of observables can be described as averages taken with respect to some underlying probability distribution. This probability distribution fully describes the system under consideration and reflects the lack of knowledge of the observer. If one has perfect knowledge of the behavior of all degrees of freedom, there would be no need for probability distributions and measurement outcomes could be predicted in a deterministic manner. This is not possible in quantum mechanics due to the probabilistic nature of the theory. Similarly to classical statistical mechanics, the fluctuations in quantum mechanics can be encoded in quasi -probability distributions. These distributions can take on negative values which is ultimately a consequence of the non-commutativity of operators, reflecting the fact that a measurement inevitably disturbs all subsequent measurements of observables which do not commute with the first one. Negative quasi-probability distributions therefore reflect the impossibility of accessing observables without disturbing them, indicating a non-classical behavior of the system which requires to take into account the action of the measurement apparatus to predict its outputs. The concept of negative quasi-probability distributions, or negativity in short, is therefore closely related to the concept of contextuality which deals with the dependence of measurement outcomes on the experimental setup [3][4][5][6].
So far, most quasi-probability distributions describe the quantum state of a system. This holds true for the Wigner function and its generalizations to finite dimensional Hilbert spaces [6][7][8], as well as a number of other distributions used in quantum optics [9,10]. Negativity in these distributions

The Keldysh quasi-probability distribution
In this section we briefly introduce the Keldysh formalism, a powerful framework for tackling outof-equilibrium problems, where the KQPD arises quite naturally through its moment generating function. Here we only touch upon the ideas in the Keldysh formalism which are relevant for our discussion. For a detailed introduction to the subject we refer the reader to Ref. [23]. We will start by considering the somewhat trivial example of the statistics of a single observable before we generalize the results to the highly non-trivial scenario of multiple, non-commuting observables at different times.

Single observable
The main idea that we will be concerned with is the evolution along the closed time contour illustrated in Fig. 1, an idea that goes back to Schwinger [24]. In this section, we are interested in the statistics of a single observableÂ at time τ . Its average value can be written as whereÛ (t , t) =Û † (t, t ) is the time-evolution operator from time t to time t andρ 0 denotes the density matrix at time t = 0. Throughout this article, we consider unitary time evolution.
Generalizing the results to non-unitary time evolution is in principle straightforward and discussed in App. A. Writing the density matrix as a mixture of pure states, the first line of the last expression can be interpreted as time evolving the states up to time τ , evaluating the operatorÂ, and evolving the states backward to the initial time t = 0 [cf. the left-hand side of Fig. 1]. On the second line, we inserted identities 1 =Û (τ, T )Û (T , τ ), extending the contour along which the state is time evolved. We call the resulting contour, from t = 0 to t = T and back again, the Keldysh contour. We can evaluate the operatorÂ either along the forward-in-time or along the backward-in-time branch of the contour. As discussed below, the most natural choice is to treat both contours on an equal footing as illustrated graphically in Fig. 1.
More information on the observableÂ can be obtained by considering its moment generating function given by All higher moments can be obtained from this generating function by taking derivatives with respect to λ In anticipation of the discussion below, we wrote Eq. (2) symmetrically in terms of the Keldysh contour. Taking the derivative with respect to λ at λ = 0, we recover Eq. (1). Since the moment generating function is the average of the exponential ofÂ, it corresponds to the Fourier transform of a probability distribution associated to the fluctuations ofÂ where the sum over A goes over all eigenstatesÂ|A = A |A and has to be replaced by an integral ifÂ has a continuous spectrum. In the case of a single observable, the KQPD P(A; τ ) is a positive probability distribution which predicts the outcome of projectively measuringÂ at time τ . The symmetric treatment with respect to the forward-in-time and the backward-in-time branches of the Keldysh contour might seem like an unnecessary complication at this point. Below, we show how such a treatment for multiple observables leads to a time-ordering prescription which is relevant for von Neumann type measurements. Before discussing multiple observables in detail, we express the moment generating function in terms of the classical and quantum fields obtained by a Keldysh rotation [cf. Eq. (8)]. To this end, we assume that the observableÂ has a discrete spectrum and that its eigenstates span the whole Hilbert space, i.e. 1 = A |A A|. We then divide the time evolution along the Keldysh contour in Eq. (2) into 2N steps (N for the forward-, N for the backward-in-time branch) of infinitesimal length δt and insert identities in between the steps. Writing the trace in terms of the eigenstates ofÂ, we then find Here, A denotes the state that was inserted at time t on the forward-in-time (backward-in-time) branch, the vectors A ± gather all A ± t and the amplitudes along the branches read The Kronecker delta in Eq. (5) is due to the fact that the point t = T connects the two branches (cf. Fig. 1). We can write the moment generating function in the form of a path integral The moment generating function thus describes the fluctuations of A cl τ and the probability density P(A; τ ) is given by the sum over all paths for which A cl τ = A, weighted by the complex exponential of the action in complete analogy to Feynman path integrals [25]. In the case of a single observable, the quantum field is zero (i.e. A + τ = A − τ ), making its introduction seem like an unnecessary complication. However, when multiple observables are involved, the quantum fields are no longer bound to be zero. Considering the fluctuations of the classical fields then corresponds to a given time-ordering prescription. We now generalize the results of this section to an arbitrary number of (possibly non-commuting) observables at arbitrary times.

Multiple observables
The KQPD, denoted by P(A), can be defined via its moment generating function which is obtained by generalizing Eq. (2) and reads HereT +(−) is the (anti) time-ordering operator, the variables λ l and A l are grouped in vectors λ and A, and the operators are given in the Heisenberg picture, i.e.
whereÂ denotes the operator in the Schrödinger picture (where all considered operators are timeindependent). In the last line of Eq. (11),T K denotes time-ordering along the Keldysh contour (from t = 0 to t = T and back again). The functions f l (τ ) take into account a possibly timedependent coupling to the observablesÂ l . When using the Keldysh contour, the functions f l take on the same values on the forward-and the backward-in-time branch. The case of N observables probed instantaneously at different times τ l is obtained by taking f l (τ l ) = δ(t − τ l ) (for l ≤ N ) resulting in the moment generating function The apparent symmetry between terms on the left and on the right of the density matrix is equivalent to treating the forward-and the backward-in-time branch of the time evolution symmetrically and determines the time-ordering of operators when evaluating higher moments (see below).

Examples
We now introduce two well-known examples of the KQPD: the Wigner function and the Full counting statistics (FCS). To recover the Wigner function from the moment generating function in Eq. (11), we set which results in where we identifiedx =x(0) and similarly forp. It is straightforward to verify that the Wigner function is indeed given by The Wigner function is thus recovered as the KQPD corresponding to a simultaneous observation ofx andp. In the next section, we discuss how the Wigner function can be used to predict a simultaneous (but imprecise) measurement of position and momentum [26]. The moments obtained from the Wigner function are determined by the Weyl order [27] obtained by complete symmetrization, i.e. every possible order is given the same weight, e.g.
This is a general result for the KQPD; when considering multiple operators at equal times, the moments generated by the KQPD are determined by the quantum averages of the completely symmetrized operators. The next example we consider in this section is the FCS which is obtained from the moment generating function in Eq. (11) upon setting which results in The FCS is given by the Fourier transform of the last expression and gives information on the fluctuations of the time integral ofÂ. The FCS has been investigated intensively in the context of electron transport [14], whereÂ corresponds to the current operator and one is interested in the fluctuations of the charge that is transferred through a conductor during a time interval [0, T ].
The moments of the FCS can conveniently be expressed using the Keldysh contour We thus find that the time-ordering corresponding to the KQPD is determined by the Keldysh timeordering, implying that operators are ordered along the Keldysh contour. The Wigner function as well as the FCS have been discussed extensively in the literature. Explicit evaluations exhibiting negativity can be found in, e.g., Refs. [19,28].

Negativity
As mentioned above, negativity in the KQPD indicates that measuring the considered observables necessarily influences the outcomes (i.e. measuring all of the operators results in different statistics than measuring each of them in a different realization). Following Ref. [19], we corroborate this claim by connecting negative values in the KQPD to coherent superpositions of different eigenstates of the observables. In the presence of such superpositions, any measurement (partly) collapses the state, possibly influencing subsequent measurement outcomes. For clarity, we consider the case of N observables at subsequent times as described by the moment generating function in Eq. (13).
Assuming that the eigenstates of each operatorÂ l span the whole Hilbert space, we replace whereÂ l |A α l = A α l |A α l and the superscript α = L, R labels if the replacement happened on the left (L) or on the right (R) of the density matrix in Eq. (13), i.e. α labels the branch (forward-or Figure 2: Keldysh quasi-probability distribution for two subsequent spin measurements on a pure qubit state. While the second argument is restricted to the eigenvalues ofσ2, the first argument can also be zero. (a) σ0 =σ2 =σx andσ1 =σz. In this case, a measurement ofσ1 influences a subsequent measurement of σ2 which results in a negative value forP(0, −1). (b)σ0 =σx,σ1 =σy, andσ2 = σz. In this case, the measurement outcome ofσ2 is completely random, no matter if the first measurement is performed or not. A measurement ofσ1 does therefore not influence a subsequent measurement ofσ2 and the KQPD is purely positive.
backward-in-time). This allows us to take the λ-dependent factors out of the trace and obtain the KQPD by a Fourier transformation Here the vectors A L/R have as their entries A L/R l and thus denote a particular manifestation of eigenvalues that we call a trajectory. The amplitudes associated with such a trajectory read the Dirac delta for vectors is defined as andρ τ1 =Û (τ 1 , 0)ρ 0Û (0, τ 1 ). Equation (28) expresses the KQPD as a sum over pairs of trajectories. These pairs, here denoted left (L) and right (R) trajectory, are equivalent to a single trajectory along the Keldysh contour. A given pair of trajectories contributes with a weight that is given by the interference term A(A L )A * (A R ) times a density matrix element A L 1 |ρ τ1 |A R 1 corresponding to the "starting point" of the trajectories. Each pair contributes to the KQPD at the argument which corresponds to the mean value of the observable on the two trajectories, again underlining the fact that the KQPD considers the fluctuations of the classical field in the Keldysh language (see below). Finally, the Kronecker delta in Eq. (28) results from the trace and reflects the fact that the disturbance of the last measurement is irrelevant.
From Eq. (28), we see that as long as A L = A R , the KQPD is strictly greater or equal to zero. It is therefore the interference terms of trajectories that differ from each other which allow the KQPD to become negative. This is a genuine quantum effect since these interference terms are a direct result of a coherent superposition between different eigenstates of the observablesÂ l . Although superposition can be encountered in classical (wave) theories, a coherent superposition between two states which correspond to two different values for an observable does not exist in classical theories (indeed, the presence of such superpositions violates the assumptions made by Leggett and Garg to obtain their inequalities [29], see Sec. 4.3).
In analogy to Eqs. (8) and (10), we can introduce a Keldysh rotation and write the KQPD in Eq. (28) as a path integral with the action and the integration measure In this language, the KQPD is given by the sum over all paths for which A = A cl , weighted by the complex exponential of the action. From Eqs. (32) and (33), we see that any negativity in the KQPD necessarily implies the presence of a non-zero quantum field A q = 0. In contrast to Eq. (10), which describes a single observable, the quantum field in Eq. (32) is not necessarily zero implying that the statistics of A cl , determined by the Keldysh time-ordering, are not necessarily equal to the statistics of A L or A R .
We now illustrate these ideas with the KQPD for two subsequent qubit measurements introduced in the last section. In terms of trajectories, the discrete KQPD in Eq. (26) can be expressed asP We thus see that the terms where σ L 1 = σ R 1 are responsible for the termsP(0, σ 2 ). Note that these terms are only non-zero if the initial state is in a coherent superposition of the two eigenstates of σ 1 . One can further show that σ1,σ2=±1P which directly implies thatP(0, +1) = −P(0, −1). Thus any contribution to Eq. (34) from pairs of trajectories with σ L 1 = σ R 1 results in negativity in the KQPD as illustrated in Fig. 2. We close this section with a brief comparison to other non-classicality indicators based on negative quasi-probabilities. As mentioned above, the Wigner function (at time t) is the KQPD for the observablesx(t) andp(t). A negative Wigner function implies an impossibility of describing the corresponding state classically in the canonical position-momentum phase space (i.e. as a mixture of states with well defined positions and momenta). The KQPD generalizes this concept to arbitrary observables. Negativity in the KQPD implies that it is impossible to describe the system and its time evolution as a mixture of states with well defined values for the observable outcomes. We note that negative values in a discrete analogue of the Wigner function have recently been connected to the computational speed-up of a quantum computer [30][31][32].
In quantum optics, one often considers negativity, or strong singularity, in the Glauber-Sudarshan P function as an indicator of non-classicality [33,34]. This criterion captures all states with nonpositive Wigner functions as well as squeezed states and entangled states [35]. A non-classical P function implies that the state can not be interpreted as a mixture of coherent states which indicates that the state can not be fully described by classical optics [34]. We note that the P function has recently been extended to multiple times [36][37][38]. In contrast to the KQPD, this distribution is relevant for photo-detection and its moments correspond to normal ordering and regular time ordering.
While these QPDs only contain information on the states and possibly their time-evolution, quasi-probability representations also describe measurements with QPDs [3][4][5][6]. It has been shown that a positive representation (i.e. a representation that only uses positive probability distributions) of all states and measurements is impossible. Restricting the states and measurements to a given experimental procedure, a procedure is then said to require non-classical resources if no positive representation can be found [4]. It is an interesting question if negativity in the KQPD implies the absence of a positive representation for a corresponding experimental procedure.

von Neumann measurements
In this section, we consider von Neumann type measurements and we show that the corresponding probability distributions can be obtained by convolving the KQPD with the back-action and imprecision associated with the measurement. We start by considering the instantaneous measurement of a single observable before moving on to subsequent instantaneous measurements, simultaneous measurements, and measurements of finite duration.

Instantaneous measurements
Following von Neumann [21], we model the measurement of an observableÂ at time τ by coupling it to a detector described by the canonically conjugate observablesr andπ via the interaction HamiltonianĤ where χ denotes the coupling strength. This interaction induces the time-evolutionÛ m = exp(−iχÂπ), which displaces the detector coordinater by an amount that depends on the state of the system. Projectively measuring the position of the detector then yields the measurement ofÂ. Measuring a single observable in this way yields the probability distribution (see App. B) whereρ det is the density matrix of the detector, |χ(A − A ) is an eigenstate of the operator r, and P(A ; τ ) is the KQPD for a single observable given in Eq. (4). Note that the position uncertainty in the initial detector state directly translates into a measurement uncertainty. A projective measurement can be obtained by choosing a position eigenstate as the initial state for the detector. In this case, the measured distribution coincides with the KQPD. This is due to the fact that the back-action of the measurement is irrelevant for a single observable since we are not interested in the post-measured state. For multiple subsequent (instantaneous) measurements, we couple each observable to a different detector. The measurement Hamiltonian then readŝ In contrast to a single observable, each measurement can now influence all subsequent measurements. To capture this back-action effect, we evaluate the KQPD with the additional term l δ(t − τ l )γ lÂl in the Hamiltonian, where the γ j are random variables which are distributed according to the detector states (see below). This yields the modified moment generating function which can be obtained by replacing λ l → λ l ± 2γ l in Eq. (13), where the + (−) sign corresponds to terms on the left (right) side of the density matrix.
The probability distribution which describes subsequent von Neumann measurements can then be shown to read (for a derivation, see App. B) where W l (r, γ) denotes the Wigner function of detector l and P(A ; γ) is the KQPD defined as the Fourier transform of the moment generating function in Eq. (39). The measurement outcomes are determined by a convolution of the KQPD and the Wigner functions of the detectors. We can thus interpret the KQPD as describing the intrinsic fluctuations of the observables of interest which is being influenced by the effects of the measurement [17]. The position distributions of the detectors determine the precision of the measurement while the momentum distributions determine the backaction via the γ l -dependent terms in the KQPD. The Heisenberg uncertainty relation implies a trade-off between these two effects, indicating that we can never eliminate both influences.
As an example, we consider a measurement of positionx at τ = 0 followed immediately by a momentump measurement. The probability distribution describing such a measurement reads Here W x/p denotes the Wigner function of the detector measuring x/p and W without a subscript denotes the KQPD which in this case is given by the Wigner function of the system. The position measurement disturbs the momentum of the system as reflected by the second argument of the KQPD. Note that it is the momentum distribution of the x-detector which determines the effect of this disturbance. This reflects the trade-off between the precision of the position measurement (which requires a peaked position distribution for the x-detector) and the disturbance of the subsequent momentum distribution (which is minimized for a peaked momentum distribution for the x-detector). Note that the last expression is independent of the momentum distribution of the p-detector (this can be seen by performing the integral over γ p ), reflecting the fact that the backaction of the momentum measurement is irrelevant since it is the last measurement. Inverting the order of the measurements yields a probability distribution which is given by the last expression upon exchanging Returning to arbitrary observables and choosing Gaussian Wigner functions for the detectors we can write the probability distribution in Eq. (40) in terms of trajectories [cf. Eq. (28)] yielding where σ imp,l = σ r l /χ l and σ BA,l = σ γ l χ l . Comparing the last expression with the KQPD in Eq. (28), we can clearly identify the effects of the measurement imprecision and back-action. The first Gaussian in the last expression replaces the Dirac delta in Eq. (28). It implies that even terms with A l = (A L l + A R l )/2 contribute to the probability of measuring A l . This effect captures the imprecision of the measurement which becomes more pronounced as the uncertainty in the position of detector l grows. The second Gaussian in the last expression reduces the weight of terms which have A L l = A R l , i.e. a finite quantum field in the Keldysh language. This effect is due to the measurement back-action which (partly) collapses the coherent superpositions between eigenstates ofÂ l . The coupling Hamiltonian in Eq. (38) entangles the system with the detector leading to dephasing in the reduced state of the system. As the uncertainty in the detector momentum grows, this effect becomes more pronounced. From the Heisenberg uncertainty relation, we find reflecting the trade-off between measurement imprecision and back-action (see also Ref. [39]) which ensures that the measured distribution is positive even if the KQPD exhibits negativities. We note that coupling the system to additional degrees of freedom, such as a thermal reservoir, introduces dephasing analogously to the measurement back-action [19].
To illustrate the effect of the measurement, we return to our example of two subsequent spin measurements [cf. Eq. (23)]. Since the back-action of the second measurement is irrelevant, we consider a measurement of intermediate strength ofσ 1 followed by a projective measurement of σ 2 . In terms of the discrete KQPD given in Eq. (26), we find for the distribution describing a von Neumann measurement (with Gaussian detectors) The effect of the measurement is illustrated in Fig. 3. We note that the distribution of measurement outcomes for the first measurement depends very strongly on the outcome of the second measurement. Specifically, we see from Fig. 3 that only considering events where the second (projective) measurement yields −1 allows one to distinguish the outcomes of the first measurement much better (for a given measurement strength). This effect, which arises from the negativity of the KQPD, is responsible for the occurrence of anomalous weak values as discussed in Sec. 4.2.
Note that subsequent von Neumann measurements can also be used to model continuous measurements, a subject which has received a lot of attention recently, see e.g. [40][41][42][43][44]. Depending on the measurement outcome of the second (projective) measurement, the distribution for the first measurement results in two clearly distinguishable outcomes or not. This is directly related to the termsP(0, σ2) which are responsible for the negative values in the KQPD. Note that the peaks are shifted with respect to the eigenvalues ofσ1 (dashed lines). This effect allows for the occurrence of anomalous weak values as discussed in Sec. 4.2.

Simultaneous measurements of position and momentum
We now turn to the simultaneous measurement of position and momentum. In analogy to Eq. (36) (and following Arthurs and Kelly [26]), we model the measurement by the simultaneous coupling of two detectors to the position and momentum of the system respectivelŷ It can be shown that this measurement is described by the probability distribution [45] W (x, p) = dx dp dγ The arguments of the last Wigner function reflect the fact that in a simultaneous measurement, both position and momentum are being disturbed. Note the close similarity between the last expression and the probability distribution describing subsequent measurements of position and momentum [cf. Eq. (41)]. Notably, the KQPD is in both cases given by the Wigner function. This is a consequence of the commutation relation [x,p] = i and does not hold in general for non-commuting observables. For a detailed discussion on the von Neumann type simultaneous measurement of position and momentum, we refer the reader to Ref. [22]. Here we only discuss the case of Gaussian states of minimal uncertainty for the detectors (k = x, p) with the additional constraint σ x σ p /(χ x χ p ) = 1/2. This constrained minimizes the disturbance of the measurement and yields the Husimi Q-function as the measured probability distribution [22,46] W (x, p) = 1 π dx dp e − (x−x ) 2 where σ = σ x /χ x . In addition to this von Neumann type measurement, the Husimi Q-function also describes heterodyne measurements of two field quadratures [47].

Measuring the FCS
In this section, we consider a von Neumann type measurement of the time-integral of an observable [15]. To this end, we couple a detector to the observable of interest for a finite amount of timê This measurement can be shown to be described by the probability distribution [15] where F(m; γ) is obtained through Eqs. (20) and (21) upon adding the term Θ(t)Θ(T − t)γÂ to the Hamiltonian. Again, we find a trade-off between measurement imprecision, depending on the spread in the detector position, and back-action which depends on the spread in the detector momentum. As discussed in Ref. [19], the last equation can be expressed in terms of trajectories yielding (for a Gaussian detector) where the trajectory amplitudes are obtained by dividing the time evolution into N time-slices of infinitesimal duration δt and inserting identities resolved in the eigenstates ofÂ after each slice The discrete time-integral over such a trajectory is denoted by m α = l A α l δt. The Heisenberg uncertainty relation applied to the detector variables again enforces σ imp σ BA ≥ 1/2, reflecting the trade-off between smearing out the negativities in the FCS by measurement imprecision and reducing the weight of negative terms (which necessarily have m L = m R ) by measurement backaction in complete analogy to Eq. (43).
This concludes our treatment on von Neumann measurements. Expressing the measured probability distributions in terms of the KQPD suggests that the latter should be interpreted as describing the intrinsic fluctuations of the observables of interest. In the next section, we will build on this interpretation and demonstrate the utility of studying the KQPD to understand the quantum behavior of dynamic systems.

Applications of the KQPD
We now turn to situations where it is beneficial to study the KQPD in the absence of any measurement device. We are especially interested in situations where the KQPD fails to be positive. As discussed in Sec. 2.4, this indicates non-classical behavior. The most prominent KQPD is by far the Wigner function, which is a representation of the density matrix and is used extensively to illustrate the state of a quantum system and to capture its non-classicality [28,48]. An example for a KQPD which goes beyond instantaneous states is provided by the FCS which describes the fluctuations of an observable over a certain time-interval. While the FCS has mostly been employed to describe electronic transport in systems where it is a purely positive function, its utility for detecting non-classical behavior has been recognized [16-19, 39, 49]. Here we discuss three additional applications of the KQPD, stressing its versatility and its utility in capturing non-classical behavior in the dynamics of quantum systems.

Work fluctuations
Since work is not a state function, it is in general not sufficient to interrogate a system at a single time in order to determine the work performed by the system. As we have seen above, performing measurements at multiple times generally results in a probability distribution which is influenced by the measurement itself. As a consequence, the observed amount of work will in general depend on the way it is measured [50]. Here we use the KQPD to describe work as a measurementindependent, fluctuating quantity. We note that the KQPD has been used before to describe work fluctuations [51][52][53].
In classical statistical mechanics, work is defined as the energy change in the macroscopic, accessible degrees of freedom, while heat is the energy change in the microscopic, inaccessible degrees of freedom. In quantum systems, all degrees of freedom are in principle microscopic. We can however still define work as being the energy change mediated by the accessible degrees of freedom. Here we consider two different scenarios: In the first, one has access to the energy stored in a part of the system that we call the work storage device. The energy increase of this device is then interpreted as the work performed by the system. This is the analog of determining the work performed on a suspended weight by measuring its height. In the second scenario, one has access to the power produced by the system and the work is defined as the time-integral of the power. This captures the situation in thermoelectric heat engines, where one can access the power through a measurement of the electrical current. In terms of the suspended weight, this corresponds to the analog of measuring the power by measuring the velocity of the weight.
For the first scenario, we write the total Hamiltonian asĤ =Ĥ w +Ĥ rest , whereĤ w denotes the Hamiltonian of the work storage device. Work fluctuations can then be described by the KQPD corresponding to the difference of two subsequent energy measurements on the work storage device (54) HereĤ w |E j = E j |E j and τ denotes the time interval between the two measurements. For initial states which commute withĤ w , we find E L = E R and we recover the probability for obtaining w as the difference of two projective energy measurements [54]. Any negativity in the above distribution is thus due to the coherence in the initial state. We note that the probability distribution obtained by performing two Gaussian energy measurements, as discussed in Ref. [55], is obtained from Eq. (54) by including the imprecision and back-action of the measurement as discussed in Sec. 3. The moments generated by the KQPD are given by whereĤ w (τ ) =Û (0, τ )Ĥ wÛ (τ, 0). For the first two moments, i.e. k = 1, 2 in the last expression, this reduces to The mean work obtained through the KQPD is thus just given by the difference of the average energy in the work storage device at t = τ and t = 0. This is in contrast to the probability distribution obtained by performing two projective energy measurements, where the mean value only coincides with Eq. (56) for initial states that commute withĤ w . For an explicit evaluation of Eq. (54), we refer to Ref. [53]. See Ref. [56] for an experimental reconstruction of the work distribution by measuring its moment generating function (for an initial state that commutes with the Hamiltonian).
In case one has access to a power operatorP , the KQPD describing work fluctuations is given by the FCS of the power operator. The KQPD and its moments are then given by Eqs. (20)(21)(22) upon replacingÂ withP . This scenario is relevant for thermoelectric devices, where the power operator is given by the electrical current operator times the voltage across the deviceP =ÎV . In this case, the work fluctuations reduce to the problem of charge FCS [14]. We now return to systems described byĤ =Ĥ w +Ĥ rest , where we take the Hamiltonians to be time-independent for simplicity (see Ref. [57] for a discussion on the power operator for a time-dependent Hamiltonian). In this case, the power operator is given by the energy change of the work storage device per unit time, i.e.P = i[Ĥ,Ĥ w ]. If the power operator commutes with the total Hamiltonian, i.e.
[P ,Ĥ] = 0, we find This implies that the FCS of the power operator is equal to the KQPD for two energy measurements given in Eq. (54) [51]. The difference of the two scenarios considered here is then only relevant when taking into account the disturbance from an actual measurement. For a thermoelectric device, the energy in the work storage device corresponds to the electric potential energy stored in an electronic contact, i.e.Ĥ w = VQ, whereQ denotes the charge operator in the contact withÎ = i[Ĥ,Q]. The work provided by a thermoelectric device can thus either be determined by measuring the electrical current (determining the power) or by measuring the number of electrons in an electric contact (determining the energy in the work storage device). We note that both approaches have been considered in the theory of charge FCS [14].

Weak values
The weak value of an observableÂ is obtained by preparing the system in some state |I , performing a weak measurement ofÂ, and post-selecting the measurement outcomes on the final state |F [58]. Describing the weak measurement as a von Neumann measurement, the post-measured state of the detector is determined by the weak value and the average of the measurement outcome is given by the real part of the last expression [58,59]. Any possible time-evolution between pre-selection, measurement and post-selection is absorbed in the definition of the initial and final states. In the KQPD framework, we can describe the weak value measurement using the distribution describing two subsequent measurements, one for the weak measurement and one for the post-selection. For simplicity, we do not treat the pre-selection as an additional measurement but impose it by settingρ 0 = |I I|. The relevant KQPD then reads where |B F are eigenstates ofB. The KQPD corresponding to the measurement of the weak value is then obtained by taking into account the post-selection, i.e. fixing the argument corresponding to the second measurement The statistics of weak value measurements (with post-selection performed by a projective measurement) can be described using the last expression by including measurement imprecision and back-action as discussed in Sec. 3. In the weak measurement limit, the disturbance of the measurement does not alter the average value of the KQPD, which is equal to the real part of Eq. (58) From Eq. (60), we see that the KQPD vanishes for arguments that lie outside the spectrum ofÂ. However Eq. (61) can take on values which lie outside the spectrum ofÂ. In such cases, the weak value is called anomalous. We now show that an anomalous weak value implies that the distribution P w (A) takes on negative values. To see this, assume that P w (A) ≥ 0. In that case, we find where A min (max) is the smallest (largest) eigenvalue ofÂ and a similar inequality restricts A w ≥ A min . Note that the inequality in Eq. (62) is only valid for a strictly positive KQPD. An anomalous weak value thus necessarily implies that the KQPD P w (A) [and also P (A, B), cf. Eq. (60)] exhibits negativity indicating non-classical behavior. This is an agreement with Ref. [60], where anomalous weak values have been shown to imply contextuality. We note that weak values have been connected to quasi-probability distributions before [59]. In particular, the average momentum at a given position that is obtained from the Wigner function has been shown to be determined by the weak value of the momentum dp pW(x, p) dpW(x, p) = Re x|p|I x|I , (63) which follows from our approach when takingÂ =p andB =x.
Finally, we note that the distribution in Eq. (60) is relevant for a measurement ofÂ with an arbitrary strength. It has the variance ForÂ =p and |F = |x , the last expression is proportional to the quantum potential energy [59], a central object in Bohmian mechanics [61]. See also Ref. [62] for a connection between weak values and von Neumann measurements of arbitrary strength and post-selection. For our example of two subsequent spin measurements, we find [using Eqs. (23), (26), and (60)] where we chose |− 2 as the final state |F . We see that for a sufficiently small denominator, the probability density associated with σ 1 = ±1 can become very large as long as the presence of the negative term ensures the normalization of P w (σ 1 ). In this way, the weak value, determined by P w (1) − P w (−1) can be amplified resulting in values far outside the range [−1, +1]. This can however only occur for a sufficiently large negative term in the KQPD as can be seen from Eq. (65) which implies

Leggett-Garg inequalities
In Ref. [29], Leggett and Garg derived inequalities which hold under the assumption of macroscopic realism [i.e., a (macroscopic) system will at all times be in one (and only one) of the states available to it] and non-invasive measurability. Clearly quantum systems do not fulfill either of these assumptions and can thus violate Leggett-Garg inequalities. In contrast to Bell inequalities, where multiple spatially separated parties apply local operations, Leggett-Garg inequalities are obtained by probing the same system at multiple times. We note that even if a Leggett-Garg inequality is violated, one can always argue that this happened because the measurement did influence the system, and therefore all subsequent measurements, in an unexpected (but in principle avoidable) way. This is known as the clumsiness loophole. Since it is not possible to certify the non-invasiveness of a measurement, the clumsiness loophole can never be closed [63].
We will now show that the violation of a particular Leggett-Garg inequality directly implies the occurence of negative values in the corresponding KQPD. The considered inequality reads where the correlators are given by Here P ij (Q i , Q j ) is the probability of obtaining the outcome Q i at time t i and Q j at t j if the system is measured at these two times only. The measurement outcomes are restricted to Q i = ±1. Violation of the inequality in Eq. (67) implies that the correlators C ij can not be obtained from a positive probability distribution which describes all three measurements and is independent of the choice of measurements that are performed [29,63], i.e.
For projective measurements of a dichotomic observableQ, the correlator to be used in Eq. (67) is given by [64] C ij = Tr We now introduce the KQPD for three subsequent measurements ofQ It is straightforward to show that the correlators in Eq. (70) can directly be obtained from the KQPD If the KQPD was strictly positive, it would thus provide a positive probability distribution which describes all three measurements and is independent of the choice of measurements that are performed. The existence of such a probability distribution prevents the Leggett-Garg inequality to be violated [63] (see also App. C). Inverting the argument, we conclude that a violation of the Leggett-Garg inequality in Eq. (67) implies negative values for the KQPD in Eq. (71) and thus non-classical behavior in the time-evolution ofQ. While we discussed projective measurements, the KQPD could also find applications when considering Leggett-Garg inequalities obtained from weak continuous measurements [65,66].

Conclusions
The KQPD provides a generalization of the Wigner function and the FCS to describe arbitrary observables within a quasi-probability formalism. This provides a unified framework for the description of fluctuations in dynamic quantum systems. Negative values of the KQPD can be directly linked to an interference between paths through Hilbert space which visit different eigenstates of the observables of interest. This requires coherent superpositions of states that correspond to different measurement outcomes, implying that negativity in the KQPD indicates non-classical behavior. As we have shown, this negativity is witnessed by anomalous weak values and violated Leggett-Garg inequalities (see also Refs. [67,68] for a connection between anomalous weak values and Leggett-Garg inequalities). The versatility of the KQPD allows to describe the problem at hand with a tailor-made quasi-probability distribution which focuses on the observables of interest. When measuring these observables in von Neumann measurements, the outcome is determined by the KQPD, corrupted by measurement imprecision and back-action. However, the underlying KQPD can in principle be reconstructed tomographically by coupling the observables of interest to qubits and performing state tomography on the qubits. This has been discussed extensively for the FCS [13,17,19,69,70].
While we presented the point of view of the KQPD being a measurement-independent quantity that captures non-classicality of the system, a complementary viewpoint is provided by only considering von Neumann measurement setups. The appearance of negative values in the KQPD then reflects the presence of non-trivial correlations between the system and the detectors during the measurement.
Here we only discussed a small number of applications for the KQPD. Other scenarios where the KQPD might be of interest include: measurement based quantum computing [71]; could the KQPD in this case capture the non-classical features that allow for outperforming a classical computer? Quantum-to-classical transition; how much noise is necessary for ensuring positivity of the KQPD and in which cases can the back-action in a measurement be neglected? Bell nonlocality, entanglement, and steering; are these notions of non-classicality captured by a negative KQPD? In addition to these open questions, we believe that there are many useful applications for the KQPD yet to be discovered.

Acknowledgments
I acknowledge stimulating discussions with A. Clerk, N. Brunner, C. Flindt, and M. Perarnau-Llobet. I gratefully acknowledge the hospitality of McGill University where part of this work has been done. This work was funded by the Swiss National Science foundation.

A Non-unitary time evolution
In the main text, only unitary time evolution is considered. Generalizing the results to nonunitary time evolution is straightforward but leads to cumbersome expressions. Considering first N operators observed at subsequent times, Eq. (13) has to be replaced by Λ(λ) = Tr {X N (λ N )U(τ N , τ N −1 ) · · · X 2 (λ 2 )U(τ 2 , τ 1 )X 1 (λ 1 )U(τ 1 , 0)ρ 0 } , where the time evolution superoperator is defined for τ ≥ τ and When considering operators over finite time-intervals, we divide the time interval into N infinitesimal slices of width δt and insert a superoperator akin to X after each time slice. The completely general expression for the moment generating function for non-unitary time evolution then reads where t j = jδt and

B von Neumann measurements
In this section, we calculate the probability distribution that describes a von Neumann measurement of subsequent instantaneous measurements. To this end, we first calculate the probability distribution for finding the detectors at the positions r j , grouped in the vector r, after interacting with the system P (r) = Tr r|e −iχ NÂNπNÛ (τ N , τ N −1 )e −iχ N −1ÂN −1πN −1 · · · e −iχ1Â1π1ρ τ1 e iχ1Â1π1 · · · · · · e iχ N −1ÂN −1πN −1Û (τ N −1 , τ N )e iχ NÂNπN |r .
Inserting identities resolved in the eigenstates of theÂ j operators then yields Using the identity we find P (r) = dA dγ l W l (r l − χ l A l , γ l /χ l )/χ l P(A ; γ).

C Leggett-Garg inequality violation implies negativity in the KQPD
We consider the KQPD given in Eq. (71). First, we convert the probability density distribution into a probability distribution ( < 1) Note that the KQPD can also be finite for Q 1/2 = 0 due to the terms where Q L 1/2 = Q R 1/2 . We will now show that a purely positive KQPD implies that the Leggett-Garg inequality in Eq. (67) can not be violated. First, note that Q1,Q2,Q3=±1P (Q 3 , Q 2 , Q 1 ) = 1. (84) Together with the normalization of the KQPD, this implies For a positive KQPD,P thus provides a positive probability distribution which describes all three measurements and fulfills the assumptions of macroscopic realism and non-invasive measurements, preventing a violation of the Leggett-Garg inequality [29,63].