Sequential hypothesis testing for continuously-monitored quantum systems

We consider a quantum system that is being continuously monitored, giving rise to a measurement signal. From such a stream of data, information needs to be inferred about the underlying system's dynamics. Here we focus on hypothesis testing problems and put forward the usage of sequential strategies where the signal is analyzed in real time, allowing the experiment to be concluded as soon as the underlying hypothesis can be identified with a certified prescribed success probability. We analyze the performance of sequential tests by studying the stopping-time behavior, showing a considerable advantage over currently-used strategies based on a fixed predetermined measurement time.

Traditional statistical inference approaches rely on processing measurement data only after the experiment, or multiple repeated experiments, have been completed [26]. However, in the context of continuously monitored sensors and for many real-life applications, it is highly pertinent to consider sequential strategies that process resources on the fly and make decisions based on the single stream of data accumulated so far [27].
The primary objective of this article is to shed light on the application of sequential analysis methodologies in continuously monitored quantum systems, with a specific focus on one of the most fundamental primitives in statistical inference: binary hypothesis testing.
This work thus represents the sequential counterpart of previous works on binary hypothesis testing in continuously monitored quantum systems [15,18], which were so far restricted to detection strategies of a fixed predetermined duration. By leveraging the benefits of these sequential methods, we aim to overcome the limitations of traditional postexperiment analysis approaches and enable more efficient and accurate extraction of information from continuously monitored quantum systems.
In a broad sense, standard hypothesis testing aims to assess the minimum probability of error ϵ when identifying the true hypothesis after using a fixed number n of samples. For independent and identically distributed (IID) samples the error scales exponentially with the number of observations, ϵ_n ≐ e^{−nR}, where the error rate R depends on the precise setting and on the two probability distributions p_{0(1)} underlying each of the hypotheses.¹ Instead, in sequential hypothesis testing, data is sampled sequentially until the moment when the true hypothesis can be identified with a prescribed probability of error ϵ. Since different measurement records convey different information, the number of samples n required to reach an accurate enough guess, the so-called stopping time, is itself a stochastic variable. Wald [28] proved that, for IID sampling, the mean stopping time scales as E_0[n] ∼ −log ϵ / D(p_0∥p_1) as ϵ → 0, where D(p_0∥p_1) is the relative entropy, which satisfies D(p_0∥p_1) > R. This finding highlights that sequential strategies offer substantial resource savings already in the classical IID scenario.

¹ In this work we give several asymptotic results and adopt the following notation from asymptotic analysis. Given two sequences a_n, b_n: equality to first order in the exponent, a_n ≐ b_n, if lim_{n→∞} (1/n) log(a_n/b_n) = 0; asymptotic equivalence, a_n ∼ b_n, if lim_{n→∞} a_n/b_n = 1; a_n = o(b_n) if lim_{n→∞} a_n/b_n = 0; and a_n = O(b_n) if there exist constants c, m such that |a_n| ≤ c|b_n| for all n ≥ m.
The problem of sequential hypothesis testing has been recently tackled in the quantum realm [29,30], in a setting where copies of a quantum system (either in ρ = ρ_0 or in ρ = ρ_1) are provided on demand. The ultimate quantum bound on the mean stopping time (or the mean number of sampled copies) has been shown to follow the "quantized version" of Wald's result, obtained by naively exchanging the probability distributions p_{0(1)}(x) for the quantum states ρ_{0(1)}, i.e. E_0[n] ∼ −log ϵ / D(ρ_0∥ρ_1). Here, we study a very different quantum setting where, instead of performing a sequence of measurements on an increasing number of copies, we perform a (continuous) sequence of measurements on the very same quantum system; the question is then to discriminate between two possible internal dynamics of the monitored system. We envision a quantum sensor, in particular an optomechanical sensor, whose dynamics are affected by external factors such as the presence of an external mass or other forces, and the task is to detect this presence by observing a single (possibly long) measured signal.
As previously mentioned, the characteristic trait of sequential problems is that the horizon of observations is not fixed in advance; it is a stochastic variable. Under quite mild assumptions we will give an expression for the mean stopping time and characterize the stopping-time distribution for a wide class of continuously monitored quantum systems, with special focus on those described by Gaussian quantum states and Gaussian measurement statistics.
This work is structured as follows. In Sec. 2 we revisit the hypothesis testing scenario and introduce both the deterministic (fixed horizon) strategy and the sequential probability ratio test (SPRT), which are illustrated in Sec. 2.1 by means of a simple Gaussian IID case. We then move to continuously monitored quantum systems, described in Sec. 3, and introduce the Gaussian model under study in Sec. 3.1. Our results are presented in Sec. 4. We begin by analyzing the stochastic evolution of the relevant statistic for hypothesis testing, i.e. the log-likelihood ratio (LLR), and in Sec. 4.1 we present our analytical results for general continuously monitored quantum systems, proving some general results on the properties of the SPRT stopping times under mild assumptions on the statistical properties of the LLR. The case of Gaussian quantum systems is then studied in detail, first by providing analytical results for the first moments of the LLR distribution, and secondly by numerically studying its behavior in order to assess the advantage of sequential strategies over deterministic ones. Finally, an outlook is given in Sec. 5.

Hypothesis testing
Let us now introduce the basic framework of standard binary hypothesis testing. Here one is given a sequence of n observations Y_n = (y_1, ..., y_n) and the goal is to identify, with a minimum probability of error, which of two given hypotheses, called the null hypothesis (h_0) and the alternative hypothesis (h_1), is responsible for generating this string of data. It is inherent in this formulation that there exists a (known) model that gives the probabilities P_0(E) and P_1(E) for an event E to occur under hypothesis h_0 or h_1, respectively. An inference strategy is determined by a decision function d: Y_n → {0, 1} that assigns a guessed hypothesis to each of the possible strings Y_n.
It is customary to assess the performance of such strategies by the so-called type I error or false positive, α_1 = P_0(d = 1), occurring when h_0 is rejected despite holding true, and the type II error or false negative, α_0 = P_1(d = 0), occurring when h_0 is accepted while the data has been generated in accordance with h_1. Since there is a trade-off between these two types of error, one cannot minimize them independently. For this reason, one defines a single figure of merit, i.e. an optimality criterion for the different strategies, based on two standard approaches:

• Symmetric hypothesis testing: follows a Bayesian approach, where prior probabilities {π_0, π_1 = 1 − π_0} are assigned to each hypothesis, so that the total probability of error can be computed as

P_err = π_0 α_1 + π_1 α_0,    (1)

whose minimum is attained by deciding for the most likely hypothesis given the observed data.
We use the lowercase notation (p) to denote probability densities, and reserve uppercase (P) for the probabilities of events.
• Asymmetric hypothesis testing: one is rather interested in minimizing only one of the two types of error whilst keeping the other smaller than a predefined constant.
A central quantity in binary hypothesis testing is the log-likelihood ratio (LLR)

ℓ(Y_n) = log [p_1(Y_n) / p_0(Y_n)].    (2)

Indeed, the Neyman-Pearson theorem [31] singles out the LLR test as an optimal one in the following sense: for a > 0, define the decision function (called the likelihood test) d(Y_n) = 1 if ℓ(Y_n) ≥ a and d(Y_n) = 0 otherwise, with error probabilities

α_1 = P_0(ℓ(Y_n) ≥ a),   α_0 = P_1(ℓ(Y_n) < a).    (4)

Given any other decision function d′ with error probabilities α′_0 and α′_1, if α′_1 ≤ α_1 then α′_0 ≥ α_0. The likelihood test with a suitable choice of threshold a is also optimal for the symmetric and asymmetric hypothesis testing scenarios defined above. That is, the log-likelihood ℓ(Y_n) is the relevant statistic to define the acceptance (and rejection) regions in the observation space.
Equipped with these definitions one can study the behavior of the probability of error ϵ_n as we increase the number of observations n. For IID sampling, the error can be seen to decay exponentially, and closed expressions can be found for the error rate R := −lim_{n→∞} (1/n) log ϵ_n in terms of the underlying PDFs; see the Chernoff bound and the Stein lemma, e.g. in [31], for the rates corresponding to the symmetric and asymmetric settings, respectively.
Let us now move to sequential hypothesis testing. Here a general inference strategy is defined as a pair S = {τ, d(Y_τ)} of a stopping time and a decision function. The stopping time shall be determined solely by the information available at each step of the process, or more succinctly: τ is a stopping time if, for any n, the occurrence of the event τ ≤ n can be determined from Y_n. The decision function takes as input the observed sequence up to the stopping time τ and produces a guess for the hypothesis, d(Y_τ) ∈ {0, 1}.
Note that this set of strategies includes the traditional deterministic strategies, i.e. those where the stopping time is fixed in advance, τ = n, for a predetermined value n. By relaxing this constraint and enabling samples to be provided on demand, we can explore new scenarios and utilize resources more efficiently. In contrast to the standard (deterministic) setting, where given a number of observations one assesses the expected probability of error, we will assess the mean number of observations required by a sequential strategy to attain given error bounds.
We start by considering the strong error conditions [29,32,33], a Bayesian scenario where both decisions (d = 0, 1) have a certified error below a given threshold, for each possible measurement record. We will see that this leads to the sequential probability ratio test (SPRT), a test that turns out to be optimal also in other settings (including non-Bayesian, asymmetric ones).
If hypothesis h_k is given with prior probability π_k, after observing a measurement sequence Y_t the posterior probability is given by Bayes' update rule,

P(h_k|Y_t) = π_k p_k(Y_t) / [π_0 p_0(Y_t) + π_1 p_1(Y_t)],

where p(Y_t|h_k) = p_k(Y_t). Hence, the strong error conditions mean that a decision can be taken only if either

P(h_0|Y_t) ≥ 1 − ϵ_0   or   P(h_1|Y_t) ≥ 1 − ϵ_1    (6)

holds². We will denote by S(ϵ_0, ϵ_1) the class of inference strategies S = {d, τ} which satisfy the prescribed strong error bounds (6). The shortest permissible stopping time is the earliest moment at which either of these conditions is met. It is easy to see that the conditions in (6) can be expressed in terms of the log-likelihood ratio (2) as

ℓ_t ≥ a_1 := log [(1 − ϵ_1) π_0 / (ϵ_1 π_1)]   or   ℓ_t ≤ −a_0 := −log [(1 − ϵ_0) π_1 / (ϵ_0 π_0)],    (7)

with ℓ_t := ℓ(Y_t). Therefore the optimal strategy (i.e. the one with the shortest stopping time) respecting the strong error conditions is given by an SPRT with the threshold values given in (7). An SPRT is defined by the stopping time τ_s = inf{t : ℓ_t ∉ (−a_0, a_1)} and the decision function d(Y_{τ_s}) = 1 if ℓ_{τ_s} ≥ a_1 and d(Y_{τ_s}) = 0 if ℓ_{τ_s} ≤ −a_0, for some threshold values a_i > 0. In plain words, a sequential probability ratio test operates by following these rules:

• if ℓ(Y_t) ≥ a_1, the test stops and h_1 is accepted;
• if ℓ(Y_t) ≤ −a_0, the test stops and h_0 is accepted;
• if −a_0 < ℓ(Y_t) < a_1, the test continues by asking for new samples.
Clearly, if the strong error bounds (6) are the same for both hypotheses, ϵ_0 = ϵ_1 = ϵ, then the unconditional total error probability P_err in (1) is also bounded by ϵ. In the seminal works introducing the SPRT [28,32], Wald showed that for an SPRT with thresholds (−a_0, a_1) the type I and type II errors (which are defined with no need of Bayesian priors) satisfy

α_1 ≤ e^{−a_1} (1 − α_0)   and   α_0 ≤ e^{−a_0} (1 − α_1).

In case of no (or negligible) overshooting, i.e. when the sampling process stops exactly when the decision contour is reached (ℓ_τ = −a_0 or ℓ_τ = a_1), the inequalities are saturated and the probabilities of type I and II errors are given by

α_1 = e^{−a_1} (1 − e^{−a_0}) / (1 − e^{−a_0−a_1}),   α_0 = e^{−a_0} (1 − e^{−a_1}) / (1 − e^{−a_0−a_1}).

Throughout this work, we will denote by C(α_0, α_1) the class of inference strategies S which satisfy prescribed bounds on the type I and type II error probabilities, P_0(d = 1) ≤ α_1 and P_1(d = 0) ≤ α_0. To distinguish those strategies from the ones obeying the Bayesian (single-trajectory) strong error conditions, we will say that strategies in C(α_0, α_1) obey the weak error conditions. The Wald-Wolfowitz theorem [34] establishes the optimality of the SPRT in the IID setting, in the sense that it minimizes the expected stopping times under both hypotheses, E_0[τ] and E_1[τ], among all tests in C(α_0, α_1) (sequential or not). In addition, for the IID setting the mean stopping time E_k[τ] can be computed, and can be seen to be significantly shorter than the time (number of samples) required by a deterministic test to attain the same error bounds.
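The three stopping rules above are straightforward to implement. The following minimal sketch runs an SPRT on an illustrative IID Gaussian model y ∼ N(u_k, 1); the symbols u_0, u_1 and the target error level alpha are our own illustrative choices, not taken from any particular experiment, and the thresholds are set via Wald's approximation a_k ≈ −log ϵ_k.

```python
import numpy as np

rng = np.random.default_rng(0)

def sprt(sample, llr_one, a0, a1, max_steps=100_000):
    """Run one SPRT: draw samples, accumulate the LLR, stop at a boundary."""
    ell = 0.0
    for n in range(1, max_steps + 1):
        ell += llr_one(sample())
        if ell >= a1:
            return 1, n           # stop, accept h1
        if ell <= -a0:
            return 0, n           # stop, accept h0
    return (1 if ell > 0 else 0), max_steps   # fallback (practically unreachable)

# Illustrative IID model: y ~ N(u_k, 1) under hypothesis h_k
u0, u1 = 0.0, 1.0
llr_one = lambda y: (u1 - u0) * (y - (u0 + u1) / 2)   # log p1(y)/p0(y)

# Wald's approximate thresholds for target errors alpha0 = alpha1 = 1e-3
alpha = 1e-3
a0 = a1 = -np.log(alpha)

runs = [sprt(lambda: rng.normal(u0, 1.0), llr_one, a0, a1) for _ in range(2000)]
type_I = np.mean([d for d, _ in runs])     # empirical P_0(d = 1)
mean_tau = np.mean([n for _, n in runs])   # empirical E_0[tau]
print(type_I, mean_tau)
```

With data generated under h_0, the empirical fraction of d = 1 outcomes estimates the type I error; by Wald's inequalities it stays below the target alpha (overshooting only makes the test more conservative, at the cost of a slightly longer stopping time).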
In the next section, we give an explicit example demonstrating the advantage of sequential hypothesis testing and showcasing some features that will also be present in continuously monitored quantum systems.

Gaussian Distribution IID
Here, we offer a straightforward illustrative example in the IID case, where all computations can be performed analytically, which will also serve to further introduce some essential results in hypothesis testing.
Let us assume that our observations are described by a random variable that, depending on the hypothesis h_k (k = 0, 1), obeys

y_i = u_k + ζ,

where ζ is a Gaussian stochastic variable with zero mean and variance s². Since the probability density function (PDF) of the observations obeys p_k(Y_t) = ∏_{i=1}^{t} p_k(y_i), where the p_k(y_i) are Gaussian for both hypotheses, it immediately follows that the LLR is given by

ℓ_t = (Δu/s²) Σ_{i=1}^{t} [y_i − (u_0 + u_1)/2],

where Δu := u_1 − u_0. That is, the LLR at a given time t, ℓ_t = ℓ(Y_t), is itself a Gaussian random variable with mean and variance given by

E_k[ℓ_t] = (−1)^{k⊕1} μ t,   Var_k[ℓ_t] = ν_k t,    (12)

where, in order to present the results for both hypotheses k ∈ {0, 1} in a unified manner, here and throughout this paper we use addition modulo 2, so that k ⊕ 1 represents the complementary hypothesis of k. We note that both the mean and the variance grow linearly with the number of samples. More interestingly, note that all rates depend on a single parameter,

μ := Δu² / (2s²),

and ν_0 = ν_1 = 2μ. This relation is not accidental; it can be seen as a necessary condition for any setting with a Gaussian LLR (see Theorem 5 below).
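These Gaussian LLR statistics can be checked numerically. The sketch below assumes an illustrative shift model y_i = u_k + ζ with ζ ∼ N(0, s²) (the symbols u_0, u_1, s are ours); it generates records under h_0 and verifies that ℓ_t has mean −μt and variance 2μt.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative IID shift model: under h_k, y_i = u_k + zeta, zeta ~ N(0, s^2)
u0, u1, s = 0.0, 1.0, 1.0
mu = (u1 - u0) ** 2 / (2 * s ** 2)   # the single parameter governing all rates

t, n_traj = 200, 5000
y0 = rng.normal(u0, s, size=(n_traj, t))   # records generated under h_0

# LLR of each whole record Y_t
ell0 = ((u1 - u0) / s ** 2) * np.sum(y0 - (u0 + u1) / 2, axis=1)

# Under h_0 the LLR should be Gaussian with mean -mu*t and variance 2*mu*t
print(ell0.mean(), ell0.var())
```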
Having a full characterization of the distribution of the LLR statistic ℓ_t, we can easily compute the type I and type II errors after a fixed number of samples t. From (4) these errors can be obtained by integrating the tails of a Gaussian:

α_1(t) = P_0(ℓ_t ≥ a) = Q((a + μt)/√(2μt)),   α_0(t) = P_1(ℓ_t < a) = Q((μt − a)/√(2μt)),    (14)

where Q denotes the Gaussian tail function. By choosing a threshold value adapted to the number of samples available, a = ξt (with −μ < ξ < μ), we get the asymptotic error rates

R_1 = (μ + ξ)² / (4μ),   R_0 = (μ − ξ)² / (4μ),    (15)

where we have used that, for b > 0, lim_{t→∞} (1/t) log Q(b√t) = −b²/2. Figure 1 shows, shaded in green, the attainable asymptotic error rates (R_0, R_1) for deterministic strategies. The optimal deterministic strategies are given by the curve (R_0(ξ), R_1(ξ)) defined by (15), which describes the optimal trade-off between the false positive and false negative error rates.
The point where both error rates coincide, R_0 = R_1 = μ/4, corresponds to the symmetric (Bayesian) error rate, i.e. P_err ≐ e^{−μt/4}, independently of the choice of priors. Indeed, from (1) it follows that the optimal threshold in this setting is a = log(π_0/π_1) for all t, which means that ξ = a/t → 0.
For ξ → ±μ we get the optimal error rates for asymmetric hypothesis testing, in agreement with Stein's lemma [31]: if α*_0(t) = min_d {α_0(t) such that α_1(t) ≤ ϵ} for some fixed ϵ > 0, then the corresponding error rate is given by the relative entropy, R*_0 = D(P_0∥P_1). Similarly, if we bound the type II error, α_0(t) ≤ ϵ, the optimal rate for the type I error is R*_1 = D(P_1∥P_0). It is worth mentioning that the relative entropy gives the fastest error decay rate one can attain, in the sense that any strategy that claims to detect hypothesis h_k at a larger rate, R_k > D(p_k∥p_{k⊕1}), will effectively always decide for h_k, i.e. α_{k⊕1}(t) → 1 as t → ∞, as can also be seen from (14) by taking |ξ| > μ.
Let us now compute the average number of resources needed to achieve the same error probabilities with the SPRT. In the IID case, one can relate the average stopping time to the average LLR of a single sample by first writing

E_k[ℓ_τ] = E_k[τ] E_k[ℓ_1],    (18)

where the Wald identity [28] is used to compute the expected value of a summation whose range is determined by a stopping time (itself a random variable).
In addition, if we neglect the overshooting, we can assume that at the stopping time the LLR takes one of the two possible values at the decision boundary, ℓ_τ = (−1)^{k⊕1} a_k. Under hypothesis h_k, hitting the boundary (−1)^k a_{k⊕1} is associated with a wrong identification, i.e. P_k(ℓ_τ = (−1)^k a_{k⊕1}) = α_{k⊕1}, while hitting the boundary (−1)^{k⊕1} a_k is associated with a successful identification, i.e. P_k(ℓ_τ = (−1)^{k⊕1} a_k) = 1 − α_{k⊕1}. Therefore the expectation value of ℓ at the stopping time is approximately (up to the possible overshooting) given by

E_k[ℓ_τ] ≈ (1 − α_{k⊕1}) (−1)^{k⊕1} a_k + α_{k⊕1} (−1)^k a_{k⊕1}.

Combining the above expressions with E_k[ℓ_1] = (−1)^{k⊕1} μ (see (17)) we find that, in the regime where the errors (under either strong or weak error conditions) are asymptotically small, i.e. a_k ≫ 1, the mean stopping time reduces to

E_k[τ] ∼ a_k/μ ∼ −log ϵ_k / μ,

for both weak and strong error conditions, i.e. for S ∈ C(ϵ_0, ϵ_1) and S ∈ S(ϵ_0, ϵ_1). Here we used that the SPRT thresholds a_k required to meet the weak (strong) conditions satisfy a_k ∼ −log ϵ_k. Now we are in a position to compare the performances of sequential and deterministic strategies. The time required by a deterministic strategy to guarantee a small error probability is given by the asymptotic error rates studied above. For instance, if we need to impose a small error ϵ for both hypotheses, we have

T_det ∼ −(4/μ) log ϵ ∼ 4 E_k[τ].    (22)

That is, the use of sequential strategies allows us to save a significant amount of resources: a deterministic strategy requires sample times four times longer than the expected time of the SPRT to attain the same error bounds. The advantage is also clear in the asymmetric setting: while deterministic strategies are inevitably bound by the trade-off between type I and type II errors (green area in Fig. 1), sequential strategies allow both error rates to be minimized simultaneously, up to the non-trivial optimal value given by Stein's lemma (orange area in Fig. 1).
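The factor of 4 in (22) can be reproduced in simulation. The following sketch assumes an illustrative IID Gaussian model y ∼ N(u_k, 1) (symbols u_0, u_1 ours); it estimates E_0[τ] for the SPRT and compares it with the deterministic time −(4/μ) log ϵ required for the same symmetric error level.

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative IID Gaussian model: y ~ N(u_k, 1) under h_k
u0, u1 = 0.0, 1.0
mu = (u1 - u0) ** 2 / 2       # relative entropy per sample, D(p0||p1) = D(p1||p0)

eps = 1e-4
a = -np.log(eps)              # symmetric SPRT thresholds

def sprt_time():
    """Stopping time of one SPRT run with data generated under h_0."""
    ell, n = 0.0, 0
    while -a < ell < a:
        n += 1
        y = rng.normal(u0, 1.0)
        ell += (u1 - u0) * (y - (u0 + u1) / 2)
    return n

mean_tau = np.mean([sprt_time() for _ in range(2000)])

t_seq = -np.log(eps) / mu      # predicted E_0[tau] ~ a / mu
t_det = -4 * np.log(eps) / mu  # deterministic time for the same error (rate mu/4)
print(mean_tau, t_seq, t_det / t_seq)   # the last ratio is the factor of 4
```

The empirical mean stopping time slightly exceeds a/μ because of overshooting, but remains roughly a quarter of the deterministic time.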
Having introduced the hypothesis testing framework and analyzed the IID Gaussian example in detail, we now shift our focus to the main subject of this work: sequential hypothesis testing in continuously monitored quantum systems. We start by defining the physical systems that we have in mind and deriving the statistics that govern the observed signals.

Continuously-monitored Quantum Systems
Continuously monitored quantum systems are systems from which a certain amount of information is extracted at each instant of time. The act of extracting information from the system, in other words the act of measuring, perturbs the system by an amount that increases with the information extracted. At one extreme, under a fully informative measurement, corresponding to a sharp (rank-1) POVM at each given time, the system collapses to the same measured state consistently, preventing any evolution (the quantum Zeno effect). At the other extreme, enforcing no perturbation of the system dynamics results in a completely uninformative measurement. Hence, in a continuously monitored system, a balance is sought between the amount of information extracted at any one time and the measurement-induced perturbation that can be tolerated on the dynamics. This balance between perturbation and informativeness may be controlled by tuning the coupling of the system of interest to an external environment, on which a sharp measurement is performed. A Markovian approximation concerning the system dynamics is often assumed, i.e. the environment is assumed either to reset on a very fast time scale or to be large enough to guarantee that the information leaked from the system does not kick back. If both the coupling with the bath and the measurements are weak, and the observational model is assumed to be Gaussian, then the dynamics of a continuously monitored system is well described by the Belavkin-Zakai equation [12,13,17,35], which here we review:

dρ_t = L_θ(ρ_t) dt + Σ_j √η_j [ĉ_j ρ_t + ρ_t ĉ_j† − Tr((ĉ_j + ĉ_j†)ρ_t) ρ_t] dw_{j,t},    (23)

where L_θ(ρ_t) = −i[Ĥ, ρ_t] + Σ_j D_{ĉ_j}(ρ_t) is the generator of the unconditional dynamics (in which no measurement is performed or, equivalently, no measurement record is registered), Ĥ is the free Hamiltonian and D_ĉ(ρ_t) = ĉρ_t ĉ† − ½{ĉ†ĉ, ρ_t} is a diffusive term due to the presence of an external bath (where ĉ = (ĉ_1, ..., ĉ_k)ᵀ are generic operators defined by the structure of the Markovian environment). The second line of (23) represents the measurement back-action, where

dw_t = dy_t − √η Tr[(ĉ + ĉ†)ρ_t] dt    (25)

is the so-called innovation term, dy_t is the measurement outcome at the given time t, and η is a positive semi-definite matrix representing the detection efficiency, which also controls which modes are effectively monitored. Plugging (25) into (23) allows us to write a stochastic equation of motion for the current state of the system conditioned on a particular measurement record Y_t = (dy_0, ..., dy_t). The non-linear term appearing in the resulting equation describes the re-normalization of the state after each measurement, i.e. it incorporates the Born rule into the system state dynamics, which can be understood as the quantum counterpart of a Bayesian update on the statistical operator.
The probability of obtaining the measurement outcome dy_t at time t is described by the Gaussian observational model

p(dy_t|ρ_t) = P_dw(dy_t − √η Tr[(ĉ + ĉ†)ρ_t] dt) = P_dw(dw_t),    (26)

where in the second equality we have used (25), and P_dw is the probability distribution of k independent Wiener increments, such that E[dw_t] = 0 and dw_t dw_tᵀ = 𝟙 dt. The probability distribution of the measurement record Y_t, conditioned on the system state (whose dynamics is governed by (23)), is readily obtained as the product of the probabilities (26) describing the measurement outcome dy_t at each given time, and reads

p(Y_t|θ) = P_W(Y_t) e^{λ(Y_t|θ)},    (27)

where the Wiener-process measure P_W(Y_t) ≡ ∏_{s≤t} P_dw(dy_s) can be understood as the noise process driving the system, and is what mathematically defines the probability measure on the space of continuous measurement signals.
Here, λ(Y_t|θ) is the log-likelihood associated with the sequence of measurements Y_t and is described by the following stochastic equation³

dλ(Y_t|θ) = (√η Tr[(ĉ + ĉ†)ρ_t^{θ,Y_t}])ᵀ dy_t − ½ ∥√η Tr[(ĉ + ĉ†)ρ_t^{θ,Y_t}]∥² dt,    (28)

where, for clarity, we have made the dependence of the state on the parameters θ and on the measurement record Y_t explicit.
The above equation allows us to easily keep track of the likelihood of a particular trajectory Y_t, as it can be computed recursively, knowing only its current value, the conditional state of the system, and the measurement outcome at each given time.
Before moving further, a few remarks are in order. First, notice that (28) can also be understood as the equation governing the evolution of the trace of the unnormalized state ρ̃ described by the linear Belavkin-Zakai equation, i.e. λ(Y_t|θ) = log Tr[ρ̃_t^{θ,Y_t}],
as first shown in [15] and subsequently rediscovered in [16]⁴. This fact should not be surprising, since the linear Belavkin-Zakai equation can also be understood as the quantum counterpart of the classical Duncan-Mortensen-Zakai equation [37], i.e. the equation describing the dynamics of the unnormalized conditional probability density function for a classical non-linear filtering problem [37,38].
The above equations are usually hard to handle, especially in the case of continuous, infinite-dimensional systems. Even conducting numerical simulations often demands an extensive allocation of computational resources. However, there is a full class of experimentally relevant systems whose behavior can be approximated by the evolution of a Gaussian state. Here, the system is described by the first two statistical moments of the quadratures, and its evolution is fully characterized by a system of linear stochastic differential equations. Such Gaussian models are employed to describe several real-world scenarios with great success [1,39,40]. They are not only a useful tool to describe real experimental scenarios, but can also be employed to draw a more transparent connection between the dynamics of continuously monitored systems and classical filtering theory [26,41].

Gaussian systems
A quantum Gaussian system of n modes is described by the quadratures q and p with [q_i, p_j] = iδ_ij, whose unitary part of the dynamics is generated by a quadratic Hamiltonian of the form Ĥ = ½ x̂ᵀH x̂ + bᵀΩx̂, where x̂ = (q_1, p_1, q_2, p_2, ..., q_n, p_n)ᵀ, H is a 2n × 2n matrix, b is a 2n-dimensional vector accounting for a time-dependent linear driving, and Ω is the n-mode symplectic matrix [42,43]. (Footnote 4: There is a discrepancy of a factor 1/2 in the second term on the r.h.s. of (28) with respect to (26) in [16]. However, it is not difficult to show that, starting from (26) in [20] and with the help of Itô calculus, the correct Eq. (28) is obtained.) The effect of
the environment is described by Lindblad generators that are linear in the system's quadratures, as is the noisy measurement, which is described by a linear function of the quadratures, so that the dynamics preserves the Gaussianity of the state [44]. This means that, if the initial state is also Gaussian, the first moments r_t = ⟨x̂⟩ and the covariance matrix (CM) σ_t = ⟨{x̂, x̂ᵀ}⟩/2 − ⟨x̂⟩⟨x̂⟩ᵀ are enough to characterize the system at any time. The evolution of the first two moments is thus obtained through (23) and reads

dr_t = (A_θ r_t + b_θ,t) dt + χ(σ_t) dw_t,
dσ_t/dt = A_θ σ_t + σ_t A_θᵀ + D_θ − χ(σ_t) χ(σ_t)ᵀ,    (30)

and the model describing the measured signal simplifies to

dy_t = C r_t dt + dw_t,    (31)

where A_θ is the drift matrix, which takes into account the unitary interaction between the system and the environment as well as the internal dynamics of the system, b_θ,t describes the effects of a (possibly time-dependent) force on the system, D_θ is the diffusive part of the dynamics due to the interaction with the environment, and χ(σ) := σCᵀ − Γ accounts for the measurement back-action. The subindex θ denotes the dependence on the parameters that characterize the different hypotheses. Note that, on the one hand, the dynamics of the first moments is perturbed by the measurement back-action by an amount proportional to the innovation term dw_t, and hence explicitly depends on the measured signal. On the other hand, the dynamics of the covariance matrix, while influenced by the measurement process, does not depend on the particular measured signal; it is reduced by an amount proportional to the averaged fluctuations of the first moments induced by the monitoring.
It is furthermore worth noticing that Eqs. (30) are mathematically equivalent to the Kalman-Bucy equations [41] solving the classical filtering problem of estimating the internal state of a linear dynamical system from a series of noisy measurements [45]. The quantum formalism effectively includes both the Bayesian update (changes in the state of knowledge) and the quantum measurement back-action. In turn, Eqs. (30) can be understood as those governing the evolution of the first two moments of the normalized probability distribution obtained through the Bayesian update of a classical linear system under the information acquired through a sequence of linear noisy measurements [41].
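As a numerical illustration of the moment dynamics, and of the fact that the covariance evolution is deterministic, here is a minimal Euler sketch. It assumes the Kalman-Bucy form dr = (Ar + b)dt + χ(σ)dw, dσ/dt = Aσ + σAᵀ + D − χ(σ)χ(σ)ᵀ with χ(σ) = σCᵀ − Γ, as in the text; all parameter values (a weakly damped mode with its position monitored) are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
dt, steps = 1e-3, 5000

# Illustrative single-mode parameters (not taken from any specific experiment)
A = np.array([[0.0, 1.0], [-1.0, -0.2]])   # drift: weakly damped oscillator
D = 0.5 * np.eye(2)                        # diffusion from the bath
C = np.array([[1.0, 0.0]])                 # noisy readout of the position q
Gamma = np.zeros((2, 1))

r = np.zeros(2)          # conditional first moments
sigma = np.eye(2)        # conditional covariance matrix
for _ in range(steps):
    chi = sigma @ C.T - Gamma              # back-action term chi(sigma)
    dw = rng.normal(0.0, np.sqrt(dt), size=1)
    dy = (C @ r) * dt + dw                 # measured signal
    r = r + (A @ r) * dt + (chi * (dy - (C @ r) * dt)).ravel()
    sigma = sigma + (A @ sigma + sigma @ A.T + D - chi @ chi.T) * dt

print(sigma)  # deterministic: relaxes towards the Riccati fixed point
```

Note that the covariance update never touches the noise dw, in line with the observation above that σ does not depend on the particular measured signal; it relaxes to the fixed point of the corresponding Riccati equation.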
Having discussed continuously monitored quantum systems, we now apply the statistical inference tools discussed in Sec. 2 to this kind of system.

Sequential Hypothesis testing in continuously-monitored systems
Now that the basic notions and formalism have been set, let us discuss sequential hypothesis testing in a quantum system that is being continuously monitored. Contrary to previous studies in quantum sequential analysis [29,46], where copies of a quantum state are provided on demand, here one has a single system that evolves due to its own internal dynamics and the effect of the measurements. The measurement record Y_t cannot be described as a sequence of IID samples, but is instead generated by a specific hidden quantum Markov model. In any case, what is clear from the previous section is that the central quantity to be studied is the LLR.
Exploiting (28), it is immediate to show that ℓ(Y_t) is characterized by the following differential equation:

dℓ_t = [√η (⟨x̂⟩_1 − ⟨x̂⟩_0)]ᵀ dy_t − ½ (∥√η ⟨x̂⟩_1∥² − ∥√η ⟨x̂⟩_0∥²) dt,    (33)

where ⟨x̂⟩_k := Tr[(ĉ + ĉ†) ρ_{θ_k}] and ρ_{θ_0} (ρ_{θ_1}) is determined by solving (23) under the null (alternative) hypothesis. This expression was derived by Tsang in [15], where (deterministic) hypothesis testing in continuously monitored quantum systems was discussed for the first time.
It is worth noticing that (33) can be further simplified by assuming that the sequence of outcomes Y_t is generated under the null/alternative hypothesis, denoted by k (i.e. one of the two models correctly describes the system's dynamics). Indeed, under this assumption we can express dy_t as

dy_t = √η ⟨x̂⟩_k dt + dw_t,    (34)

and rewrite the LLR ℓ(Y_t|θ_k) as a function of the Wiener noise dw_t,

dℓ_t = (−1)^{k⊕1} ½ ∥√η Δ⟨x̂⟩∥² dt + [√η Δ⟨x̂⟩]ᵀ dw_t,   Δ⟨x̂⟩ := ⟨x̂⟩_1 − ⟨x̂⟩_0.    (35)

In the Gaussian case, the equation for the LLR is

dℓ_t = [C Δr_t]ᵀ dy_t − ½ (∥C r_1(t)∥² − ∥C r_0(t)∥²) dt,

where r_k(t) are the first moments of the Gaussian state generated through the conditional dynamics (30) fed with Y_t|θ_k, and Δr_t := r_1(t) − r_0(t). Similarly, in terms of the innovations or Wiener noises,

dℓ_t = (−1)^{k⊕1} ½ ∥C Δr_t∥² dt + [C Δr_t]ᵀ dw_t.    (37)

The above expression shows that the likelihood ratio has a positive or negative drift depending on whether the measured signal Y_t is generated by the alternative or the null hypothesis, showing the tendency to correctly discriminate between the two hypotheses with increasing confidence over time. Notice that the same conclusion can be obtained in full generality by means of the identity p_1(ℓ_t)/p_0(ℓ_t) = e^{ℓ_t}. Indeed, E_k[e^{(−1)^k ℓ_t}] = 1 and, with the help of Jensen's inequality, e^{E[X]} ≤ E[e^X], one finds (−1)^{k⊕1} E_k[ℓ_t] ≥ 0. We note that equation (33) allows for an immediate implementation of the SPRT in a real experiment, while (35) is better suited to theoretically studying the performance of the SPRT in the context of continuously monitored systems, as we will see next.
We observe that integrating (37) would result in a Gaussian-distributed LLR (compare with (12)) if Δr were constant. Such a scenario would greatly simplify our analysis. Unfortunately, this quantity is far from constant: it is a stochastic variable itself, strongly correlated with the noise process. The next section is devoted to presenting the main formal results of this paper, where, under some mild assumptions on the statistical properties of the LLR, we derive the statistics of the most relevant quantity in sequential methodologies, namely the stopping time τ.
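To illustrate the sign of the LLR drift and its correlation with the noise, here is a minimal one-dimensional sketch. It assumes a scalar analogue of the conditional dynamics and LLR recursion above (drift matrices reduced to damping rates A_0, A_1, unit measurement matrix, and, for simplicity, filters run with their steady-state gains); all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
dt, T = 2e-3, 600.0
steps = int(T / dt)

# Two hypothetical internal dynamics (scalar damping rates) and bath diffusion
A = {0: -2.0, 1: -0.5}
D = 1.0
# Steady-state conditional variance: fixed point of 2*a*s + D - s^2 = 0
sig = {k: A[k] + np.sqrt(A[k] ** 2 + D) for k in (0, 1)}

def llr_drift(k_true):
    """Feed a signal generated under h_{k_true} to both filters; return ell_T / T."""
    r = {0: 0.0, 1: 0.0}   # conditional means under each candidate model
    ell = 0.0
    for _ in range(steps):
        dw = rng.normal(0.0, np.sqrt(dt))
        dy = r[k_true] * dt + dw                  # measured signal (innovation form)
        ell += (r[1] - r[0]) * dy - 0.5 * (r[1] ** 2 - r[0] ** 2) * dt
        for k in (0, 1):                          # both conditional filters
            r[k] += A[k] * r[k] * dt + sig[k] * (dy - r[k] * dt)
    return ell / T

d1, d0 = llr_drift(1), llr_drift(0)
print(d1, d0)   # positive drift under h1, negative under h0
```

Even in this toy setting the increment of ℓ_t is driven by the stochastic difference of the two conditional means, so the LLR is not Gaussian; only its long-time drift has a definite sign.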
Before that, let us conclude this section with Figure 2, illustrating the SPRT strategy in continuously monitored systems, where we anticipate some of the general features that we will formally prove in the next section.

General results on Discrimination in Quantum Continuously Monitored Systems
We now present an analytical study of the performance of the SPRT in hypothesis testing for continuously monitored quantum systems. We give a general theorem providing a tool to upper-bound the SPRT stopping time, show the optimality of the test under some assumptions on the underlying stochastic process, and derive some relevant statistical properties of the SPRT stopping time. We further provide the optimal error rate for the asymmetric deterministic setting, i.e. the Stein lemma for continuously monitored systems. All the details and the proofs of the following theorems can be found in the Supplemental Material.
Along with the asymptotic analysis notation introduced in previous sections (see footnote 1), let us introduce notation for convergence in probability for random sequences X_t, Y_t taking values in a normed space. Theorem 1 (stated in full in the Supplemental Material), with a_k > 0 for k = 0, 1, T a generic time, and α* = max{α_0, α_1}, bounds the stopping time τ of a generic hypothesis test in the class S ∈ C(α_0, α_1) by the stopping time τ_s^{(δ)} of the SPRT in the class S_δ ∈ C(α_0^{1+δ}, α_1^{1+δ}), with δ ∈ (0, 1). This first theorem is fairly general and allows us to asymptotically lower-bound the stopping time of a generic test in the class C(α_0, α_1) by that associated to an SPRT. A less abstract result and an asymptotic optimality condition can be obtained if some reasonable assumptions on the SPRT stopping time are made.

Corollary 1.1. If, under the hypothesis h
with τ s the SPRT stopping time in the class C(α 0 , α 1 ), then the SPRT is asymptotically optimal in a weak sense, i.e.
Corollary 1.2. If, furthermore, the SPRT is not only asymptotically optimal in the weak sense but also in moments, i.e.
On the other hand, the following theorem sets a bound on the optimal error rate that the (asymmetric) deterministic setting can achieve.

Theorem 2 (Continuous-monitoring Stein lemma). Let P_k be the probability under the hypothesis h_k with k = 0, 1, d = d(Y_T) a decision function taking values in the set {0, 1}, and T a fixed time at which the decision is taken. Let α_0 and α_1 respectively be the type-I and type-II errors, with ϵ ∈ (0, 1), i.e. the minimum error achievable in the asymmetric scenario, and with µ_k > 0; then the minimum error rate R_k that can be attained by a deterministic test is given by the following equation. The proof of the theorem can be found in the Supplemental Material (Theorem 10). As shown there, this is indeed the fastest error rate we can achieve using a deterministic strategy, in the sense that any faster rate will lead with certainty to a false positive (negative), i.e. α_0(1) → 1. Also notice that if ℓ_t converges in mean, then µ_k assumes the role of the regularized Kullback-Leibler divergence, i.e.
Theorem 3. Let ℓ(Y_t) be the LLR described by (33) and τ_s a stopping time associated with the SPRT with thresholds a_k. If with µ_k > 0, then, on the one hand, in the limit of small error bounds (i.e. a_k ≫ 1) the stopping time converges in probability to a constant: where, recall, a_k ∼ −log(ϵ_k) as max(ϵ_0, ϵ_1) → 0 (or similarly for the weak error conditions, replacing ϵ by α).
On the other hand, under weak error conditions the SPRT is asymptotically optimal in a weak sense, i.e.

Corollary 3.1. If we further assume
then the (asymptotic) mean stopping time is given by while for weak error conditions the SPRT is asymptotically optimal in mean, i.e.
Theorem 4. Let ℓ(Y_t) be the LLR described by (33) and τ_s the stopping time associated with the SPRT. If and the SPRT is asymptotically optimal in moments, i.e.
The proof can be found in the Supplemental Material (Theorem 8 and subsequent corollaries).

Continuously-monitored Gaussian systems
We recall that under the Gaussian assumption, the dynamics is fully characterized by the evolution of the first two cumulants, and reads By picking the parameters θ corresponding to hypotheses h_0 and h_1 respectively, we can write two decoupled sets of equations describing the quantum state of the system conditioned on an arbitrary measurement signal Y_t for each of the candidate hypotheses. This suffices to keep track of the LLR and implement the SPRT algorithm. However, in order to assess its performance and compute the statistical properties of the LLR and the stopping time τ_s, it is important to take into account the statistical properties of the true measurement signal, which is governed by (31) with r_t corresponding to the true hypothesis. Since this dependence is fed into the stochastic equation for the other (false) hypothesis, it results in a coupled system of equations. To solve this system it is convenient to treat the problem in an extended vector space where the state of the system is defined as X_t = (r_0(t), r_1(t))^T and Σ_t = σ_0(t) ⊕ σ_1(t), whose evolution is given by where k ∈ {0, 1} denotes the hypothesis under which the signal Y_t is generated. In this notation, the LLR reads: with ∆ = (1, −1)^T. Under the assumption that the covariance Σ_t admits an asymptotic steady state Σ_st = σ_0 ⊕ σ_1, ℜ[A − χ(Σ_st)Π_k C] < 0 and B_t = B, i.e.
the affine term of the equation is constant, then the probability distribution of X_t admits an asymptotic solution of the form [47]: From this it is easy to see that: with ω̃ = (ω + dd^T). Now, following a procedure similar to that in the proof of Wald's identity [28], we have, in the limit of large a_k's (i.e. large stopping times), where we have defined the indicator function I_C = 1 if condition C is fulfilled and I_C = 0 otherwise. In the third equality, we have used that I_{t<τ} = I_{ℓ_t ∈ (−a_0, a_1)} and dℓ_t are independent stochastic variables. In the fourth equality, we have used that E_k[dℓ_t] is bounded and becomes constant, equal to µ_k, at a fast enough rate to guarantee convergence of the integral. Finally, as in the IID case, we can use the fact that ℓ_τ ∈ {−a_0, a_1} is a binary random variable (continuity of ℓ_t guarantees that there is no overshooting) and, proceeding along the same lines as in (20), obtain an asymptotic expression for the average stopping time: where, recall, a_k ∼ log α_k^{−1} (a_k ∼ log ϵ_k^{−1}) for the strong (weak) error conditions.
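The chain of equalities above is, in essence, Wald's identity E_k[ℓ_τ] = µ_k E_k[τ_s] combined with the binary boundary value ℓ_τ ∈ {−a_0, a_1}. As a sanity check, here is a hedged Monte Carlo sketch for a toy LLR with constant drift and diffusion (all numerical values are illustrative, not from the paper):

```python
import random

random.seed(11)
mu, sigma, a0, a1 = 0.5, 1.0, 3.0, 6.0    # drift, diffusion, thresholds (illustrative)
dt = 2e-3
step = sigma * dt ** 0.5
l_taus, taus = [], []
for _ in range(300):
    l, t = 0.0, 0.0
    while -a0 < l < a1:                    # run until the LLR leaves (-a0, a1)
        l += mu * dt + step * random.gauss(0.0, 1.0)
        t += dt
    l_taus.append(l)   # boundary value: close to -a0 or a1 (no overshoot in the continuum limit)
    taus.append(t)
lhs = sum(l_taus) / len(l_taus)            # E[l_tau]
rhs = mu * sum(taus) / len(taus)           # mu * E[tau]
print(lhs, rhs)                            # Wald's identity: the two agree
```

The discretized walk satisfies Wald's identity exactly in expectation, so the two printed estimates differ only by sampling noise.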
If, furthermore the variance of the LLR E then the SPRT is also asymptotically optimal in mean, i.e.

E[τ] ≥ E[τ_s]
where τ is the stopping time of a generic test in the class C(α_0, α_1), in accordance with Theorem 4.
To conclude this work, in the next section we numerically investigate the behavior of the LLR and the tests' performance in a specific model, and explain the results in light of the theory developed so far.

Optomechanical Sensors
In this section, we study the performance of sequential hypothesis testing on an optomechanical system under homodyne measurement operating in the linear regime. We also assume that the system operates in the unresolved-sideband regime, which enforces a separation of the time scales of the optical and mechanical modes, allowing the cavity mode to be eliminated adiabatically [41]. In this regime, the fluctuations evolve according to a quadratic Hamiltonian, resulting in a Gaussian evolution described by (30), where the number of modes is reduced to one, i.e. the mechanical mode.
As a first case study, we use the SPRT to discriminate two different values of the decoherence rate γ, via a demodulated homodyne signal in the rotating frame of the mechanical frequency [48,49]. This setting is experimentally achievable [50] and analytically treatable.
Within the rotating-wave approximation, the coefficient matrices in (30) reduce to where σ_uc = n_th + 1/2 + κ/γ is the steady-state covariance of the unconditional dynamics, n_th is the average number of photons in the thermal environment, γ is a dissipative rate due to the presence of an environment in thermal equilibrium interacting with the system, η describes the measurement efficiency, and κ is a decoherence rate induced by the measurement.
In Fig. 3 we show the evolution of the mean value of the LLR, along with several realizations of the stochastic process, for this particular system. For this case study, we have considered discriminating between damping rates γ_1 = 440 Hz and γ_0 = 100 Hz, and fixed the remaining parameters to n = 1, κ = 10 Hz and η = 1 for both hypotheses.
As discussed in Sec. 2, the LLR is the main object to study in testing scenarios. As we show in the Supplemental Material, for this system we can prove that where closed-form expressions for µ_k and ν_k can be found. This is corroborated by Fig. 4, which shows good agreement between the theoretical predictions and our numerics.
With the help of the Chebyshev inequality and (68), we can readily show that the conditions of Theorems 2 and 3 are satisfied, guaranteeing both weak and mean asymptotic optimality of the SPRT. This is again confirmed in Figure 5, which shows how the LLR histogram under each hypothesis approaches a Gaussian distribution with the predicted drift and variance.
In order to illustrate how to implement the SPRT, and to demonstrate the advantage of sequential over deterministic tests, we have carried out a numerical experiment [51]: we simulated a large number N of stochastic trajectories by integrating (59), and used (36) to keep track of the LLR for each measurement record, as one would do in a real experiment. We consider a symmetric hypothesis-testing scenario with equal priors; therefore we generate N/2 trajectories under h_0 and the other half under h_1.
To implement the optimal deterministic strategy, at every time t we check whether the LLR is above or below the threshold value: ℓ_t ≥ a = 0 (decide in favour of h_1) or ℓ_t < 0 (decide in favour of h_0). We keep a record of the number of incorrect guesses N_F and estimate P_err ≈ P_e := N_F/N. In Fig. 6 we show the values (−log P_e, t) so obtained for various times.
To implement the optimal sequential strategy, we apply the SPRT, fixing equal upper and lower thresholds a_0 = a_1 = log[(1 − ϵ)/ϵ], and for each trajectory i we keep a record of the time τ_i when the LLR first hits a boundary, as well as of the wrong guesses (cases where the upper threshold is hit but the true hypothesis was h_0, or vice versa). The mean stopping time is estimated as E(τ_s) ≈ τ̄ = (1/N) Σ_i τ_i, and in Fig. 6 we plot the points with coordinates (−log ϵ, t = τ̄) for a range of values of ϵ.
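The sequential protocol just described can be sketched end to end on a Gaussian surrogate for the LLR, with constant drift µ and variance rate ν = 2µ as in Theorem 5. This stand-in replaces the quantum-filter equations (59) and (36) and is only meant to illustrate the bookkeeping; all parameter values are illustrative:

```python
import math
import random

random.seed(7)
mu = 1.0                 # LLR drift under h1 (illustrative)
nu = 2.0                 # LLR variance rate; nu = 2*mu mimics a Gaussian LLR (Theorem 5)
eps = 1e-2               # target error probability
a = math.log((1 - eps) / eps)   # symmetric thresholds a0 = a1 = a, cf. alpha = (1 + e^a)^-1
dt = 1e-3
sq = math.sqrt(nu * dt)

taus, errors = [], 0
for _ in range(300):                # trajectories generated under h1
    l, t = 0.0, 0.0
    while abs(l) < a:               # SPRT: stop when l_t leaves (-a, a)
        l += mu * dt + sq * random.gauss(0.0, 1.0)
        t += dt
    taus.append(t)
    errors += (l <= -a)             # lower boundary hit => wrong guess h0
mean_tau = sum(taus) / len(taus)
print(mean_tau, a / mu, errors)     # mean stopping time vs asymptotic a/mu; ~eps*300 errors
```

The empirical mean stopping time lands close to the asymptotic value a/µ, and the fraction of wrong guesses stays near the certified level ϵ.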
Before discussing the theoretical curves shown in Fig. 6, we can already highlight the distinct advantage of the sequential strategy over the deterministic one: the duration of the experiment required to identify the true hypothesis with a given error probability is about three times longer for the deterministic strategy than the (average) elapsed time of the sequential one. In addition, the sequential strategy is able to certify the error probability for each single trajectory, while the deterministic protocol only guarantees the error bound when averaging over many trajectories. An important caveat of sequential strategies is that the exact duration of the experiment is unpredictable. The mean stopping time is bounded (and may be known beforehand); however, as shown in Fig. 7, the stopping-time distribution has quite long tails, so a particular experiment may take substantially longer than expected. The figure also shows an excellent fit with the theoretical curve given in (57).
We now proceed to discuss the theoretical curves displayed in Fig. 6. Since the mean stopping time depends on the underlying hypothesis and we have assumed equally likely hypotheses, the average stopping time is written as with where we have used (11) to obtain α_k = (1 + e^a)^{−1} = ϵ. This is an asymptotic result; however, it is already an excellent approximation when the required stopping times are large compared to the relaxation time of dE[ℓ_t]/dt to its stationary value µ_k (see Section 4.1).
A naive interpretation of our previous results and of the histograms shown in Fig. 5 might lead to the conclusion that, for all practical purposes, one can assume the LLR to be Gaussian distributed with mean µ_k t and variance ν_k t. Under this assumption we can easily compute the probability of error of the deterministic strategy using the results of Sec. 2.1. Indeed, using the first equalities in (14) with a = 0, we obtain However, this result is bluntly wrong, as is apparent from the mismatch shown in Fig. 6 between the numerical simulation results and the (magenta) curve (−log P_err, t) obtained from (71). The following simple theorem shows that the underlying assumption above, namely the Gaussianity of ℓ, is the reason for this discrepancy.
Theorem 5. If the log-likelihood ratio ℓ(x) := log[p_1(x)/p_0(x)] is a Gaussian distributed random variable under one of the hypotheses, then it is Gaussian distributed under both hypotheses, and their means and variances must fulfill the following relation with µ > 0, i.e. the PDF of ℓ is given by Proof. Take p_1(ℓ) to be the Gaussian probability distribution with mean µ > 0 and variance σ². From the definition of ℓ(X) it immediately follows that p_0(ℓ) = e^{−ℓ} p_1(ℓ), which means that the distribution p_0(ℓ) also has a quadratic exponent.
Imposing the normalization condition on e^{−ℓ} p_1(ℓ), we readily obtain the condition σ² = 2µ, and the rest of the claims follow. More succinctly, the following relation between the moment-generating functions holds true: Recalling the moment-generating function of a Gaussian distribution to be χ(q) = exp(qµ + q²σ²/2), taking, e.g., q = 0, 1 (i.e. the normalization conditions) together with q = 1/2 leads to the desired results.
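The normalization step can be checked numerically: ∫ e^{−ℓ} p_1(ℓ) dℓ equals the MGF value χ(−1) = exp(−µ + σ²/2), which is 1 precisely when σ² = 2µ. A small quadrature sketch (the function name and parameter values are illustrative):

```python
import math

def mass_under_h0(mu, sigma2, lo=-80.0, hi=80.0, n=200_000):
    """Trapezoidal quadrature of e^{-l} * N(l; mu, sigma2) dl.
    This equals 1 iff p0(l) = e^{-l} p1(l) is a normalized density,
    which for Gaussian p1 forces sigma2 = 2*mu (Theorem 5)."""
    h = (hi - lo) / n
    norm = math.sqrt(2 * math.pi * sigma2)
    total = 0.0
    for i in range(n + 1):
        l = lo + i * h
        p1 = math.exp(-(l - mu) ** 2 / (2 * sigma2)) / norm
        w = 0.5 if i in (0, n) else 1.0   # trapezoid endpoint weights
        total += w * math.exp(-l) * p1 * h
    return total

print(mass_under_h0(2.0, 4.0))   # sigma2 = 2*mu: integrates to 1
print(mass_under_h0(2.0, 3.0))   # sigma2 != 2*mu: exp(-mu + sigma2/2) ~ 0.6065
```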
It is easy to verify that these conditions do not hold in general; in the current case under study it suffices to check that µ_0 ≠ µ_1, as shown in (S5) of the Supplemental Material. Therefore, it is clear that the underlying assumption that led to (72) does not hold. It is important to emphasize that this fact neither contradicts the results presented in this paper nor undermines the theorems stated in the previous section: convergence in probability of a stochastic variable does not necessarily imply convergence in moments. While the theorems address the concentration properties (i.e. the bulk) of the LLR distribution, which contribute for instance to the mean stopping time, the calculation of the deterministic error rates involves large-deviation properties of the PDF (i.e. the tails), which are not expected to follow a Gaussian distribution.
It is worth mentioning that the seminal paper on (deterministic) hypothesis testing in continuously monitored systems [15], specifically its Supplementary Material, provides integral expressions for the Chernoff information (see also [26,52]), which upper-bounds the deterministic probability of error.
As a second case study, we consider the discrimination between two different values of the oscillation frequency ω_m of the mechanical mode. Within the rotating-wave approximation, the system is described by the set of differential equations in (30), with coefficient matrices given by where ω is the mechanical-mode frequency and the remaining parameters are defined above. The frequency values have been set to ω_1 = 1.02 ω_0 and ω_0 = 10⁴ Hz, while the values of the remaining parameters are γ = 500 Hz, κ = 10³ Hz, n = 1 and η = 1 for both hypotheses. In this case, we are not able to find an analytical expression for the variance σ_k: already the Riccati equation has a transcendental solution. However, numerical results show that the variances σ_{k,t} admit an asymptotic stationary state, similar to Eqs. (68).
Similarly to the damping-rate discrimination, we have carried out a comparison between deterministic and sequential tests, as shown in Fig. 8.A clear advantage in favor of the latter strategy is also observed.
Force discrimination. As a final example, we consider how well a test can do when it comes to detecting the presence of an external force. This enters as a linear term in the first-moment dynamics, as described by b_{θ,t} = (0, b_θ)^T in (30).
Here, we have considered a constant force, with values b_1 = 40 Hz and b_0 = 0; the remaining parameters were set to γ = 500 Hz, κ = 10 Hz, ω = 10² Hz, n = 1 and η = 1. Figure 9 shows the corresponding numerical simulations in an error-vs-time plot, where the sequential test gives a similar advantage as in the other cases. In this particular instance, hypothesis h_1 is indistinguishable from hypothesis h_0 except for a displacement of the origin of the harmonic oscillator. Consequently, we expect the multivariate Gaussians characterizing the statistical properties of the observed signal to be identical for both hypotheses, except for their means. From here, it is not hard to show that, as in the example of Sec. 2.1, the log-likelihood ratio is a linear function of the (Gaussian) signal Y_t and is therefore itself a Gaussian stochastic variable. The Gaussianity of ℓ_t allows one to use (70) to calculate the mean stopping time for the optimal sequential strategy and (71) to determine the probability of error for the optimal deterministic strategy. Furthermore, the Gaussianity of ℓ_t also implies that all its statistical properties under both hypotheses are fixed by a single parameter µ (as indicated in Theorem 5). The analytical curves resulting from these considerations are illustrated in Figure 9. One can easily derive the asymptotic expression of these curves to find that, in the limit of small error probability ϵ, the mean stopping time for the sequential strategy is four times shorter than the required time for a deterministic strategy:
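The factor of four can be traced directly to Theorem 5: Gaussianity pins the variance rate to the drift, ν = 2µ. A short derivation sketch in the notation of the main text, using Q(x) ≐ e^{−x²/2} for the Gaussian tail to first order in the exponent:

```latex
% Deterministic test at threshold a = 0, Gaussian LLR with mean \mu t and variance 2\mu t under h_1:
P_{\rm err} = P_1(\ell_t < 0) = Q\!\left(\frac{\mu t}{\sqrt{2\mu t}}\right)
            = Q\!\left(\sqrt{\mu t/2}\right) \doteq e^{-\mu t/4}
\;\Rightarrow\; t_{\rm det} \sim \frac{4}{\mu}\log\frac{1}{\epsilon} .
% Sequential test with thresholds a_0 = a_1 = \log[(1-\epsilon)/\epsilon]:
\mathbb{E}[\tau_s] \sim \frac{a}{\mu} \sim \frac{1}{\mu}\log\frac{1}{\epsilon}
\;\Rightarrow\; \frac{t_{\rm det}}{\mathbb{E}[\tau_s]} \to 4 .
```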

Discussion and outlook
This study represents a first exploration of sequential hypothesis testing in open, continuously monitored quantum systems. We have introduced the sequential framework and methodologies, deriving general results on the statistical properties of stopping times, a key figure of merit in sequential hypothesis testing. Moreover, we have established the optimality of the Sequential Probability Ratio Test (SPRT) for hypothesis testing under weak error conditions. Explicit closed-form expressions for the (asymptotic) mean stopping time have been provided for ubiquitous Gaussian systems, under stationarity conditions on the dynamics. Additionally, we have conducted case studies in optomechanical systems to supplement our analysis, demonstrating a clear advantage of sequential strategies over deterministic ones, with a reduction of the required measuring time by a factor between 3 and 4.
Current research efforts are focused on various extensions of this work, in particular studying the performance of the SPRT for other detection schemes, such as performing photon-counting measurements on the leaked cavity modes instead of the homodyne measurements considered in this work. A dynamical equation for the log-likelihood ratio in such scenarios has already been derived in [15]. Furthermore, we aim to determine the ultimate quantum limits for general measurement schemes, in the spirit of recent works on sequential hypothesis testing for IID quantum states [29,30], or of [18] in the context of deterministic hypothesis-testing strategies for continuously monitored quantum systems.
Incorporating physically sound feedback schemes into sequential methodologies is another area of interest, as it has the potential to enhance the power of the tests.Lastly, we would like to highlight the utility of sequential analysis tools in other relevant primitives or applications, such as quickest change point detection [46] or anomaly detection.These tasks can be considered as genuine sequential problems, as they cannot be adequately addressed with fixed-horizon strategies.
Finally, a compelling avenue for future investigation lies in exploring model-free schemes, such as machine-learning approaches, to infer the Log-Likelihood Ratio (LLR) value without relying on a perfect model [53].
A Asymptotic behavior of the log-likelihood ratio in the Gaussian regime

The results presented in Sec. 4.1 of the main text exploit the standard treatment of an Ornstein-Uhlenbeck process to obtain, under some easy-to-verify conditions, an asymptotic expression for the log-likelihood drift and consequently for the mean value of the Sequential Probability Ratio Test (SPRT) stopping time.
However, having knowledge solely about the asymptotic behavior of the log-likelihood drift is insufficient to guarantee the optimality of the test in the C(α 0 , α 1 ) class or to determine the probability distribution of the SPRT stopping time in the asymptotic regime.
To achieve this goal, it is necessary to possess at least some understanding of the asymptotic behavior of the variance of the log-likelihood ratio (LLR). This information is crucial for bounding the speed of convergence of the LLR by a deterministic function of time.
An expression for the mean and the variance of the LLR that can be computed numerically (and in some cases analytically) can be obtained from the formal integral solution of the stochastic differential equation for X_t, i.e.
Under the assumption that Σ_t admits an asymptotic steady state Σ_st and ℜ[A − χ(Σ_st)Π_k C] < 0, the asymptotic mean and variance of the log-likelihood can be written as: where with Q = (1, 1)^T. As one can see from the above expression, the variance of the LLR does not, in general, fulfill the condition in (73) that any LLR with Gaussian statistics must obey. The complexity of the equations makes it challenging to derive sufficient conditions for (73) to hold, so we move to study the specific cases analyzed in the main text.
A.0.1 Case study in Section 4.2

For the first case study it is not difficult to check the stability condition Hence, the Riccati equation describing the evolution of the covariance matrix under each of the hypotheses admits a stationary solution, which is given by the diagonal matrix: where σ_{uc,k} = n_{th,k} + 1/2 + κ_k/γ_k is the steady-state covariance of the unconditional dynamics. This fact guarantees that Σ_t tends to the steady state Σ_st = σ_0 ⊕ σ_1. With this in hand, and after some calculations, one arrives at the following solution for the mean where χ_k = cσ_k; a similar expression for the variance can be obtained (see Ref. [51]), showing that the conditions in (68) are satisfied.

B Stopping time probability for a 1-D bounded stochastic process
Let Y_t be a 1-D stochastic process described by the stochastic differential equation in Itô form: The cumulative distribution gives the probability of the process being in the interval (a_0, a_1), i.e. P(τ > t) = ∫_{a_0}^{a_1} P(y, t) dy, where P(y, t) ≡ E[δ(Y_{min(t,τ)} − y)] is the probability distribution of the stopped process Y_{min(t,τ)}. From this one obtains: The evolution of the probability P(y, t) associated with the stopped process Y_{min(t,τ)} is described by the Fokker-Planck equation of the stochastic process Y_t supplemented with absorbing boundary conditions, i.e. by: 1. ∂_t P(y, t) = −∂_y[µ(y, t) P(y, t)] + (1/2) ∂_y²[σ(y, t)² P(y, t)], i.e. the Fokker-Planck equation associated with the stochastic process described by (S6).
It is worth noticing that conditions 1 and 3 can be used together with (S8) to obtain the following expression for the stopping-time probability distribution

B.1 Stopping time for a stochastic process with deterministic drift and diffusion
In this section, we restrict to the case where the drift and diffusion terms in (S6) are continuous real deterministic functions of time, i.e. µ(y, t) = µ(t) and σ(y, t) = σ(t). In this case, the problem of finding the stopping-time probability of the stochastic process Y_t bounded in the region Ω = (a_0, a_1) can be mapped to the equivalent problem of finding the stopping-time probability of the stochastic process X_t = Y_t − ∫_0^t µ_s ds confined to the moving region [a_0(t), a_1(t)], where a_i(t) = a_i + ∫_0^t µ_s ds. The Fokker-Planck equation associated with the process X_t is symmetric under reflection operations of the form x → 2β − x with β ∈ ℝ, allowing the use of the image-charge method to find an explicit solution, in the form of an infinite series, for the survival probability P(x, t), i.e.

P(x, t) is the solution of the differential problem ∂_t P(x, t) = (1/2) σ_t² ∂_x² P(x, t) with initial condition P(x, 0) = δ(x). Substituting (S10) into (S9), after some manipulations one obtains the following expression for the stopping-time probability distribution In the case where the diffusion and drift are constant functions of time, i.e. µ_t = µ and σ_t = σ, Eq. (S12) further simplifies to Let us now define a* = min(a_0, a_1) and take the limit a* → ∞. Under the assumption that µ > 0, the asymptotic behavior of the stopping-time probability is described by the inverse Gaussian distribution (also known as the Wald distribution) Assuming instead µ < 0, the asymptotic behavior of the stopping-time probability distribution is described by the same inverse Gaussian distribution with a_1 replaced by a_0, i.e.
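For constant µ > 0 and σ, the limiting inverse Gaussian law above has mean a_1/µ and variance a_1σ²/µ³. A hedged Euler-discretization check of this single-boundary first-passage statement (all parameter values illustrative):

```python
import random

random.seed(3)
a, mu, sigma = 4.0, 1.0, 1.0        # boundary, drift, diffusion (illustrative)
dt = 1e-3
step = sigma * dt ** 0.5
samples = []
for _ in range(400):
    y, t = 0.0, 0.0
    while y < a:                     # first passage of y_t = mu*t + sigma*W_t to level a
        y += mu * dt + step * random.gauss(0.0, 1.0)
        t += dt
    samples.append(t)
m = sum(samples) / len(samples)
v = sum((s - m) ** 2 for s in samples) / (len(samples) - 1)
print(m, a / mu)                     # inverse-Gaussian mean:     a/mu
print(v, a * sigma**2 / mu**3)       # inverse-Gaussian variance: a*sigma^2/mu^3
```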
P_0(X_t) the LLR. The SPRT is defined by the test S_s = (d_s, τ_s), with and where T ∈ [0, ∞) is a generic time, α* = max{α_0, α_1} and τ_s^ϵ denotes the SPRT stopping time in the class S_ϵ ∈ C(α_0^{1−ϵ}, α_1^{1−ϵ}) with ϵ ∈ (0, 1) and ε = ϵ/(1+ϵ). In addition where, to move from the second to the third line, we have used the following set of inequalities Exploiting the inequalities in (S20), and recalling that α_i ≥ P_i(d = j) with j ≠ i, one obtains the following bound on the stopping time of a generic test in the class C(α_0, α_1): What is left to do is to show that the r.h.s. term can be bounded by the stopping-time cumulative distribution of the SPRT. We recall that in the SPRT: and that the cumulative distribution of the SPRT stopping time τ_s is described by: The term P_1(τ < T ∩ (ℓ_τ ∉ Ω)) resembles the above expression for P(τ_s ≤ T), suggesting to use the inequalities in (S22) to characterize b_0 and b_1; however, setting them equal to the r.h.s. of that inequality would make the term e^{b_1} α_0 converge to one in the asymptotic regime, producing a trivial result. To guarantee that lim_{α*→0} e^{b_1} α_0 = 0 we choose b with ε ∈ (0, 1). Under this choice for b_0 and b_1, the following inequality holds and one obtains where ϵ = ε/(1−ε). Under the further assumption T = τ_s^ϵ the inequality simplifies as follows which proves the theorem for h_1 once the limit α* → 0 is taken. The case h_0 is proved similarly.
which concludes the proof.
Proof. Let us write ℓ_t in its integral form and make use of (S48) to obtain the following set of identities From the above expression one obtains: what is left to prove is that ζ_{a_k} goes to zero in the limit a* → ∞. and the SPRT is asymptotically optimal in moments, i.e.
where τ represents the stopping time of a generic test in the class C(α 0 , α 1 ).
Proof. We first prove Eq. (S58). Let a* = min(a_0, a_1); Lemma 7 shows that lim_{a*→∞} P_k(ℓ_t ∈ Ω | t ≤ t_{k,−}) = 1, from which it immediately follows that for t < t_{k,−} where P(τ_s) is the normalized stopping-time probability distribution associated with l̃_t, and reduces to the inverse Gaussian distribution in Eq. (S58) once the limit a* → ∞ is taken (see Section B.1 for further details), concluding the proof. Weak asymptotic optimality of the SPRT is guaranteed by Theorem 8; to prove optimality in moments, we notice that the probability distribution (S58) allows for the following asymptotic result and that log(a_k) ∼ log(α_k) for a* → ∞, which is a sufficient condition for asymptotic optimality as stated in Corollary 6.2.
The limit of (S76) reduces to: which proves the claim. Since the Neyman-Pearson lemma guarantees that the log-likelihood test is the deterministic test with the smallest weak error, one can also conclude that µ_k is a lower bound in the deterministic scenario.

Figure 1: Attainable asymptotic error rates (ratio between log ϵ and sampling time) for type-I and type-II errors, for deterministic strategies (in green) and for sequential strategies (in orange).

Figure 2: We illustrate how the stopping-time distribution arises from the stochastic continuous trajectories of ℓ_t; here we show some realizations of the ℓ process, along with time slices at t_0 and t_1 (shown in blue) together with the corresponding distributions p_1(ℓ_t), which close to their peak value are well approximated by a Gaussian. The horizontal red line corresponds to the fixed threshold ℓ_τ = a, leading to an arrival-time distribution p_1(τ). Such a distribution appears as a consequence of the difference in arrival times between trajectories of ℓ_t.

Figure 3: Damping-rate discrimination. We show the evolution of the mean value of the LLR, obtained by sampling Y_t under hypothesis h_1(0) in red (blue), along with different realizations of the stochastic process. We observe the corresponding tendency towards positive (negative) values according to the underlying true hypothesis h_1(0).

Figure 4: Damping-rate discrimination. We show the ratio between theoretical and numerically computed values of the mean value (variance) in the first (second) column, as a function of time, when sampling from hypothesis h_1(0) in the first (second) row; these results were computed over 2·10⁴ quantum trajectories. As observed, after a transient time, the first and second moments converge to the predicted analytical values.

Figure 5: Damping-rate discrimination. We show the histogram of the LLR at time t = 2 s (t = 8 s) in the upper (lower) panel, under hypothesis h_1(0) in red (blue). We compare this with the theoretically predicted Gaussian distributions.

Figure 6: Damping-rate discrimination. We compare the time required for the optimal deterministic and sequential tests to reach a given symmetric error-probability threshold. The performance is computed by averaging stopping times (sequential) and estimating the error made at each fixed time (deterministic) over N = 4·10⁴ trajectories, averaged over both hypotheses. Moreover, we compare the numerics with the theoretical predictions (solid lines) using Eqs. (69) and (70) for the sequential test, and the Gaussian model for the deterministic one, here denoted by τ^G_det, using Eqs. (71) and (72). The latter model is clearly seen to be invalid (see main text).

Figure 7: Damping-rate discrimination. We compute the histogram of the stopping times for the sequential test and compare it with the inverse Gaussian probability distribution of Theorem 4, showing good agreement. Results are shown when h_1 holds true (similar results are obtained when swapping the underlying hypothesis).

Figure 8: Frequency discrimination. We compare the average time required for the sequential test to reach a predefined error threshold P_err with the time a deterministic strategy would need to attain the same threshold. These quantities are estimated using 2·10⁴ simulated quantum trajectories for each hypothesis, averaging the performance over both hypotheses.

Figure 9: Force discrimination. We compare the sequential and deterministic tests over 4·10⁴ trajectories. The simulated performance of both tests is compared with the theoretical predictions (solid lines) using Eqs. (69) and (70) for the sequential test, and the Gaussian model for the deterministic one, here denoted by τ^G_det, using Eqs. (71) and (72). The latter model is seen to be valid in this case (see main text).
C Theorems and Proofs

Let P_i and E_i denote the probability and the expectation under the hypothesis h_i, let S = (d, τ) denote a generic hypothesis test, where τ is a Markov stopping time and d = d(X_0^τ) is a terminal decision function with values in the set {0, 1}, and let X_t be the sample of length t. Let C(α_0, α_1) = {S : P_0(d = 1) ≤ α_0, P_1(d = 0) ≤ α_1}.

Theorem 6.
If ℓ_t describes the LLR, then τ_s represents the SPRT stopping time and the test is weakly asymptotically optimal. Proof. One only needs to notice that for the SPRT a_k ∼ −log(α_k); then Corollary 1.1 guarantees the optimality.
Let us study ζ_{a_k} and rewrite the integral over dt as the sum of the following integrals with δ ∈ (0, 1). Since P(τ_s) is a bounded positive function, one is allowed to replace P(τ_s ≥ t) by its limit as a* → ∞ inside the integral and, making use of Lemma 7, obtain where for the last inequality we have used the fact that the averaged increment of the log-likelihood is a bounded function, i.e. E_k[dℓ_t/dt] < C with C < ∞. Then, with the help of (S55), it is not difficult to show that

Let P_i be the probability distribution under the hypothesis h_i, and let the LLR ℓ_t be described by a continuous stochastic function under the probability measure P_i. Let τ_s be the stopping time defined as τ(Y_t) = inf{t ≥ 0 | ℓ_t(Y_t) ∉ Ω} with Ω = (−a_0, a_1), i.e. the SPRT stopping time. If ℓ_t = l̃_t + O_{p_k}(, where l̃_t ≡ (−1)^{k⊕1} µ_k t + σ_k ζ_t, µ_k ∈ ℝ⁺ and ζ_t is a standard Wiener process, i.e. E[ζ_t] = 0 and E[ζ_t ζ_τ] = min(t, τ), then the probability distribution of the rescaled stopping time τ̃_s := τ_s/a_k is asymptotically approximated by the inverse Gaussian distribution: