Phase diffusion and the small-noise approximation in linear amplifiers: Limitations and beyond

1 Centre for Quantum Technologies, National University of Singapore, 3 Science Drive 2, Singapore 117543
2 ICTP, Strada Costiera 11, I-34151 Trieste, Italy
3 Dipartimento di Fisica, Università di Napoli "Federico II", Monte S. Angelo, I-80126, Italy
4 MajuLab, CNRS-UNS-NUS-NTU International Joint Research Unit, UMI 3654, Singapore
5 National Institute of Education, Nanyang Technological University, 1 Nanyang Walk, Singapore 637616
6 Department of Physics, University of Oxford, Parks Road, Oxford, OX1 3PU, UK


Introduction
It was shown in the 1960s that linear amplification of light is an inherently noisy process [1][2][3]. As a result, the input phase suffers from an increased uncertainty during the course of amplification. The process by which this occurs in a phase-insensitive linear amplifier is phase diffusion, the same process responsible for the natural linewidth of single-mode lasers [4][5][6][7][8]. Aside from fundamental interests, phase diffusion has also been shown to be useful for quantum random number generation [9][10][11][12]. More recently it has also been used to demonstrate the applicability of phase squeezing as a means to reduce phase noise in linear amplifiers [13].
Phase diffusion in linear amplifiers has been studied with a variety of methods [14][15][16][17][18][19]. In general, one would like to characterise phase diffusion by calculating the diffusion coefficient, or by calculating a sensible measure of phase uncertainty (often, but not exclusively, the phase variance [17][20][21][22]). However, phase diffusion is inversely proportional to the photon number, which makes the diffusion coefficient or variance difficult to calculate without approximations. The most widely used approximation assumes the input light to be sufficiently intense that its photon number has an average much greater than its standard deviation (see e.g. [15,16]). The simplification gained from such an assumption is that one may replace the photon number by its average. Although this approximation is extremely useful and continues to be a standard method of analysis in quantum optics, it restricts one to field states with a narrow photon-number distribution. For example, weak coherent states, which may be produced by attenuating the output of a laser, may not satisfy the small-noise approximation. These states have been shown to be useful for quantum cryptography [23][24][25]. In addition, states with a low number of photons are useful in numerous fields outside of quantum information science, such as single-molecule spectroscopy, light detection, and many others [26][27][28]. We note that the small-noise approximation is, in spirit, similar to what others might call a "semiclassical" or "mean-field" approximation in quantum optics.¹
A primary function of linear amplifiers is to boost the strength of a weak signal to detectable levels [29]. For quantum linear amplifiers, increasing the signal strength comes at a cost: the amplifier must add noise if it is to operate according to the principles of quantum mechanics. The quantum limit on how much noise such an amplifier must add was investigated in the early days of linear amplifiers [1][2][3], and has been extended recently [30]. If the input to the amplifier is a noisy signal (which is typically the case), this noise also gets amplified, so in general the output consists of the amplified input signal, the amplified input noise, and the noise the amplifier adds. As a result, the signal-to-noise ratio degrades after amplification. This suggests that the small-noise approximation tends to get worse over the course of amplification, with the breakdown becoming more acute for weaker inputs. One can then try to optimise the signal-to-noise ratio by using an ideal linear amplifier, that is, an amplifier which adds the least amount of noise required by quantum mechanics. It is therefore particularly interesting to know how well the small-noise approximation performs for an ideal linear amplifier, as this is the most favourable scenario for the approximation to hold. In this paper we show that even for an ideal linear amplifier, the small-noise approximation is still inadequate in the regime of low-intensity inputs (few photons on average) and large amplifier gain. This is our main result, which essentially demarcates the limit of the small-noise approximation in treating phase diffusion in linear amplifiers.
An important aspect of the small-noise approximation is that it allows one to calculate the phase noise analytically. Naturally, one wonders whether it is still possible to make analytical progress when the small-noise approximation fails, or whether one has to resort to numerics. In this paper we show that it is possible to generalise the small-noise approximation and obtain the phase uncertainty for weak inputs. In this treatment the phase uncertainty is given by a so-called inverse-number expansion. The expansion is checked against numerics and is also valid for nonideal linear amplifiers. We also derive a closed-form expression for the output phase noise under the small-noise approximation, which is then compared with the inverse-number expansion. An appreciable difference between the phase noise obtained from the small-noise approximation and the inverse-number expansion can be seen for low-intensity coherent inputs where only a few photons are present on average (and more generally for intensities on the order of unity). However, the applicability of the small-noise approximation is seen to be restored by including, on average, an additional ten photons or slightly fewer. This highlights the sensitivity of the small-noise approximation to the discreteness (or photon nature) of weak coherent inputs.
The paper is organised as follows. We first introduce our linear-amplifier model in Sec. 2 (in terms of a master equation), and define our model of phase diffusion. Here we also review some essential features of the standard linear amplifier, state our measure of phase uncertainty, and motivate both the parameter regime in which to operate the amplifier and the use of the coherent state as a simple model of the amplifier input. The small-noise approximation is then explained in Sec. 3, where we shall see how this approximation leads to a closed-form expression for the phase uncertainty. The details are left to Appendix A. The inverse-number expansion, which is our proposed method for going beyond the small-noise approximation, then follows quite naturally from this. The idea behind the inverse-number expansion can be simply explained, so we leave its derivation and checks to Appendices B and C. Using the inverse-number expansion we then present our main result in Sec. 4, where the phase uncertainties obtained by the two approaches are compared against each other. Here we shall see that a treatment of the phase noise beyond the small-noise approximation is necessary for a coherent input whose photon number is on the order of unity. This is despite the fact that such an input has as little noise as possible added to it (a condition under which the small-noise approximation is least likely to break down during amplification). Of course, our chosen path to this result is not without drawbacks, so both the advantages and disadvantages of our approach are discussed in Appendix D. We then conclude in Sec. 5 with a summary of our work and a discussion of the road ahead for extending the results of this paper.

Linear-amplifier model
There are a number of linear-amplifier models that one can choose from. A universal model in fact exists (a two-mode squeezer [30]) provided the noise introduced by the amplifier is additive [31,32]. However, for phase diffusion, the focus of attention has been on a master-equation realisation of the linear amplifier and we will follow suit. An important element of the master-equation model is that it can be recast as a Fokker-Planck equation (the quintessential equation for diffusive processes), which justifies calling the phase evolution of the signal being amplified phase diffusion. The master equation for a linear amplifier is given by (1), where $\kappa_\downarrow$ and $\kappa_\uparrow$ are positive real numbers, and $\hat{a}$ and $\hat{a}^\dagger$ are annihilation and creation operators satisfying the canonical commutation relation $[\hat{a},\hat{a}^\dagger] = \hat{1}$. Such a master equation may be derived by considering the interaction of a bosonic mode with a collection of two-level atoms. The terms proportional to $\kappa_\downarrow$ and $\kappa_\uparrow$ describe photon loss and gain respectively. Hence the coefficients $\kappa_\downarrow$ and $\kappa_\uparrow$ can be thought of as the rates at which photons are lost or gained. An input bosonic field prepared in the state $\rho(0)$ then experiences amplification when there is net gain, i.e. $\kappa_\uparrow > \kappa_\downarrow$, a condition usually achieved via population inversion in the two-level atoms. The simplest way to see this is a direct calculation of the output photon number using (1), which gives (2) in terms of the photon-number gain $G_t$. The output can be seen to comprise one part which is simply the input amplified by a factor $G_t$, and another part, independent of the input, interpreted as noise due to spontaneous emission. Note that a given value of the photon-number gain $G_t$ is determined by the product of the amplification time $t$ and the amplification rate $\kappa_-$. When $\kappa_\downarrow = 0$, the model is said to describe an ideal linear amplifier, so called because in this case the amplifier adds the least amount of noise consistent with quantum mechanics [33]. From (2), the noise can be seen to be minimised when $\kappa_\downarrow = 0$.
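The split of the output into amplified signal and added noise can be illustrated numerically. The sketch below is not taken from the paper: it assumes the master equation has the standard Lindblad form with loss rate $\kappa_\downarrow$ and gain rate $\kappa_\uparrow$, so that the mean photon number obeys $d\langle n\rangle/dt = \kappa_-\langle n\rangle + \kappa_\uparrow$ with $\kappa_- = \kappa_\uparrow - \kappa_\downarrow$ and $G_t = e^{\kappa_- t}$. The function names are hypothetical.

```python
import math

def gain(kappa_up, kappa_down, t):
    """Photon-number gain G_t = exp(kappa_minus * t), assuming kappa_minus = kappa_up - kappa_down."""
    return math.exp((kappa_up - kappa_down) * t)

def added_noise_at_gain(kappa_up, kappa_down, g):
    """Input-independent (spontaneous-emission) photons at gain g, under the assumed convention."""
    return kappa_up / (kappa_up - kappa_down) * (g - 1.0)

def mean_photon_number(n0, kappa_up, kappa_down, t):
    """Mean output photon number: amplified input plus added noise."""
    g = gain(kappa_up, kappa_down, t)
    return g * n0 + added_noise_at_gain(kappa_up, kappa_down, g)

if __name__ == "__main__":
    t = math.log(10.0)  # amplification time giving G_t = 10 when kappa_minus = 1
    print("ideal    :", mean_photon_number(2.25, 1.0, 0.0, t))
    print("nonideal :", mean_photon_number(2.25, 2.0, 1.0, t))
```

At a fixed gain, the added-noise term in this sketch is smallest when `kappa_down` is zero, which is the sense in which the ideal amplifier is the least noisy one.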

Amplifier parameter regime and input signal model
As we mentioned above, a linear amplifier serves primarily to increase a weak signal to measurable strengths [3,29]. The high-gain limit of linear amplifiers ($G_t \to \infty$) is thus of special interest in research, both theoretically (such as addressing the amount of noise an amplifier must add [3,29]) and experimentally (such as the existing experimental efforts to build linear amplifiers in circuit QED [29,34,35]). We therefore focus on the high-gain limit here as well, since this is the most relevant regime. Aside from a suitable amplifier model we also need a suitable model for the amplifier input. For this we consider coherent states. Coherent states provide a particularly good model for the input signal because their signal-to-noise ratio is easily tunable via their average photon number. Another reason to consider coherent states is that quantum optical experiments are typically performed with lasers, and the state of laser light can be effectively modelled as coherent. In fact, a recent experiment demonstrating phase diffusion and how it can be countered using squeezing was performed with a coherent-state input [13]. Aside from these advantages of coherent states, there are also reasons for not considering other commonly used quantum-optical states. First, for phase diffusion, states with perfect angular symmetry (the vacuum, thermal, or number states) cannot be used as input models as they have a maximally diffused phase. Second, for the phase-preserving amplifier model of (1), one would not expect to observe the effects of nonclassical features on phase diffusion such as squeezing, sub-Poissonian statistics, or macroscopic superpositions (corresponding to the squeezed, number, and Schrödinger-cat states). This is because such features will be completely lost even at moderate gains ($G_t = 2$) [33][36][37][38][39]. It has also been shown that some nonclassicality can persist [40][41][42][43]. Despite this, one should keep in mind that one usually cares about nonclassical states because they are useful for quantum information processing (e.g. see Refs. [44,45]). In these cases, most nonclassical states are interesting only when the state is maintained (e.g. a Schrödinger-cat state remains a cat state). This is a stronger requirement than just having any nonclassical state, and it is a requirement which cannot be satisfied by our amplifier model.² Hence we do not consider nonclassical inputs in this paper. Our objective here is to make clear the breakdown of the small-noise approximation in phase diffusion.

Phase diffusion from the Glauber-Sudarshan distribution
Coherent states are simplest to describe using the P (also called the Glauber-Sudarshan) representation of $\rho(t)$. This is defined by a $P(\alpha,\alpha^*,t)$ such that the state at any time $t$ may be expanded as $\rho(t) = \int_{\mathbb{C}} d^2\alpha \, P(\alpha,\alpha^*,t) \, |\alpha\rangle\langle\alpha|$, where the integral runs over the entire complex plane $\mathbb{C}$. It is well known that $P(\alpha,\alpha^*,t)$ satisfies a Fokker-Planck equation which can be derived from (1) [46]. It is also well known that the Fokker-Planck equation can be converted to a set of stochastic differential equations for $\alpha(t)$ and $\alpha^*(t)$ whose dynamics sample phase space so as to be consistent with $P(\alpha,\alpha^*,t)$ [46,47]. In this paper we adopt the latter approach of stochastic differential equations. That is, instead of considering an evolving P distribution, we consider $\alpha$ and $\alpha^*$ to be changing in time, described by where $dZ(t)$ is a complex Wiener increment given by However, to describe phase diffusion we ought to change variables from $\alpha$ and $\alpha^*$ to $N$ and $\Phi$, where $\alpha = \sqrt{N}\exp(i\Phi)$. Our model for phase diffusion is thus given by (9) and (10). These equations may be derived by converting (1) to its equivalent Fokker-Planck equation in terms of the Glauber-Sudarshan distribution [46]. For phase diffusion one then needs to transform the Fokker-Planck equation into polar coordinates [15,16]. It is then simple to show that (9) and (10) are equivalent to the Fokker-Planck equation in polar form. Since we are only interested in amplification, $\kappa_-$ will be a positive real number. Equations (9) and (10) are to be interpreted as Itô equations, with $dV_N(t)$ and $dV_\Phi(t)$ being independent Wiener increments satisfying $[dV_N(t)]^2 = [dV_\Phi(t)]^2 = dt$ and $dV_N(t)\,dV_\Phi(t) = 0$.
² For example, a probabilistic amplifier may be used to boost a cat state if we want to combat photon loss and at the same time have the state remain a cat state for whatever information-processing protocol is to be used.
It is now clear from (9) that the phase $\Phi(t)$ is coupled nonlinearly to $N(t)$. It is essentially this nonlinear coupling to $N(t)$ that makes the phase uncertainty difficult to obtain in closed form. However, as Itô equations, (9) and (10) can be manipulated according to the rules of Itô calculus [48]. It is this aspect of our approach that allows us to go further than the small-noise approximation in quantifying the amount of phase noise appearing at the amplifier's output. Existing treatments of the phase uncertainty use Heisenberg equations of motion [18], the Pegg-Barnett phase operator [15,19], or the phase distribution $\wp_\Phi(\varphi,t)$ obtained from quasiprobability distributions [16,17]. While each treatment can be argued to have its own appeal, they all make the small-noise approximation.
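Sample paths of the number and phase can be generated with a simple Euler-Maruyama scheme. The sketch below is illustrative only and rests on assumptions not confirmed by the text above: it integrates the amplitude equation in Cartesian form, $d\alpha = (\kappa_-/2)\,\alpha\,dt + \sqrt{\kappa_\uparrow}\,dZ$ with $\kappa_- = \kappa_\uparrow - \kappa_\downarrow$ and $dZ^*dZ = dt$, and accumulates the phase from the angle between successive values of $\alpha$. The coherent input is taken to be a single point $\alpha(0) = \sqrt{N(0)}$ in phase space. All function names are hypothetical.

```python
import math
import random

def simulate_phase_paths(n0, kappa_up, kappa_down, t_final, n_steps, n_paths, seed=0):
    """Euler-Maruyama paths of alpha = x + i*y; returns final N = |alpha|^2 and accumulated phase."""
    rng = random.Random(seed)
    dt = t_final / n_steps
    drift = 0.5 * (kappa_up - kappa_down)   # assumed amplitude growth rate kappa_minus / 2
    sigma = math.sqrt(0.5 * kappa_up * dt)  # assumed noise strength per quadrature per step
    final_n, final_phase = [], []
    for _ in range(n_paths):
        x, y, phase = math.sqrt(n0), 0.0, 0.0
        for _ in range(n_steps):
            x_new = x + drift * x * dt + sigma * rng.gauss(0.0, 1.0)
            y_new = y + drift * y * dt + sigma * rng.gauss(0.0, 1.0)
            # Phase increment arg(alpha_new * conj(alpha_old)), which lies in (-pi, pi]
            phase += math.atan2(x * y_new - y * x_new, x * x_new + y * y_new)
            x, y = x_new, y_new
        final_n.append(x * x + y * y)
        final_phase.append(phase)
    return final_n, final_phase

def sample_variance(samples):
    """Unbiased sample variance."""
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)

if __name__ == "__main__":
    n, phi = simulate_phase_paths(2.25, 1.0, 0.0, 2.0, 200, 2000, seed=1)
    print("mean N:", sum(n) / len(n), " phase variance:", sample_variance(phi))
```

Working in Cartesian coordinates avoids evaluating $1/\sqrt{N}$ when a path wanders close to the origin. Such paths still produce large phase jumps, so the sample variance of the phase can be sensitive to the step size and to the number of paths.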
In order to study the effectiveness of the small-noise approximation we will calculate the time-dependent phase variance $\mathrm{V}[\Phi(t)] = \mathrm{E}[\Phi^2(t)] - (\mathrm{E}[\Phi(t)])^2$, referred to as (13). The expectation value of any function of phase $f(\Phi)$ at time $t$ is defined formally by (14), where the phase distribution $\wp_\Phi(\varphi,t)$ is defined as the marginal distribution of $P(n,\varphi,t)$ (the P function parametrised by $n$ and $\varphi$) with $n$ integrated over. Note that we are following the convention of using a capital letter to denote a random variable and the corresponding small letter for its realisation. There are several measures of phase uncertainty that one can choose from [21,22]. Given that phase is a cyclic variable, one may wonder why we have not used other measures which are better at handling the cyclic property of phase. The reason is that the cyclic nature of phase never really shows up here, at least for the simple example of a coherent-state input considered here. We will see that the phase distribution of the amplified field is always a single-peaked function which can always be placed in the centre of an appropriately chosen interval $[\varphi_0, \varphi_0 + 2\pi]$. In this case the usual variance as given by (13) can be used [21]. It is widely known that $P(\alpha,\alpha^*,t)$ can be solved in closed form for Gaussian input states. However, this does not imply that the phase noise can be derived in closed form. This is because on reparametrising $P(\alpha,\alpha^*,t)$ in terms of $n$ and $\varphi$, the nonlinear dependence of phase on the photon number gets introduced into the problem. The phase variance then involves calculating the first and second moments with respect to the marginal distribution $\wp_\Phi(\varphi,t)$. It is shown in Appendix D [see (103)] that even when restricted to the high-gain limit, the phase distribution $\wp_\Phi(\varphi,t)$ is very complicated for a coherent-state input, which makes the moments of the phase impossible to derive.

The small-noise approximation and beyond
The small-noise approximation is first explained in Sec. 3.1, where we show how an expression for the phase variance can be obtained under this approximation. This naturally sets the stage for the main method of analysis, the inverse-number expansion, in Sec. 3.2. The expansion can be stated quite simply and the idea behind it is in fact quite straightforward. Therefore we leave the details of its derivation to Appendix B.

Phase variance within the small-noise approximation
To obtain the phase variance we need its second moment. Using the Itô chain rule and (9) we obtain (15). The second moment of phase thus depends on $\mathrm{E}[1/N(t)]$. As alluded to in Sec. 1, the small-noise approximation assumes the field being amplified has a narrow photon-number distribution so that one may regard $N(t)$ as essentially a sure variable (a deterministic quantity) whose value is equal to $\mathrm{E}[N(t)]$. Under this assumption, one may thus replace (15) by (16).
We can then derive $\mathrm{E}[N(t)]$ directly from (10) [or take it from (2) with the appropriate change of notation, since the photon number is already a normally ordered quantity], giving (17). Here we see explicitly that under the small-noise approximation the fluctuations in $N(t)$ are ignored, as it is only the mean of $N(t)$ that is used to determine the phase variance. This is clearly quite different from the exact $\mathrm{E}[1/N(t)]$, where both a deterministic part and a noisy part contribute to the denominator inside the expectation value. As we shall see, the inverse-number expansion on the other hand attempts to deal with the $1/N(t)$ dependence as a whole, which allows the fluctuations in $N(t)$ as given in (10) to influence the phase variance. We can now derive the phase uncertainty under the small-noise approximation by substituting (17) into (16). The derivation is a little tedious so we leave the details to Appendix A. The result is (18). Later we will compare the phase variance obtained from the inverse-number expansion to (18).
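The small-noise recipe (replace $1/N(t)$ by $1/\mathrm{E}[N(t)]$ and integrate) can be checked numerically. The sketch below assumes, beyond what the text states, that the second moment of the phase grows at the rate $(\kappa_\uparrow/2)\,\mathrm{E}[1/N(t)]$ and that $\mathrm{E}[N(t)] = G_t N(0) + (\kappa_\uparrow/\kappa_-)(G_t - 1)$ with $\kappa_- = \kappa_\uparrow - \kappa_\downarrow$. Under those assumptions the integral has the closed form coded in `small_noise_phase_variance`, which may differ in appearance from the paper's own expression; the function names are hypothetical.

```python
import math

def small_noise_phase_variance(n0, kappa_up, kappa_down, t, initial_variance=0.0):
    """Closed-form small-noise phase variance under the assumed conventions."""
    k_minus = kappa_up - kappa_down
    ratio = kappa_up / (k_minus * n0)
    return initial_variance + 0.5 * math.log(1.0 + ratio * (1.0 - math.exp(-k_minus * t)))

def small_noise_phase_variance_numeric(n0, kappa_up, kappa_down, t, steps=2000):
    """Midpoint-rule integral of (kappa_up / 2) / E[N(s)] over [0, t], as a cross-check."""
    k_minus = kappa_up - kappa_down
    h = t / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        g = math.exp(k_minus * s)
        mean_n = g * n0 + (kappa_up / k_minus) * (g - 1.0)
        total += 0.5 * kappa_up / mean_n * h
    return total

if __name__ == "__main__":
    for n0 in (2.25, 13.0):
        print(n0, small_noise_phase_variance(n0, 1.0, 0.0, 5.0),
              small_noise_phase_variance_numeric(n0, 1.0, 0.0, 5.0))
```

For an ideal amplifier in the high-gain limit this expression tends to $\tfrac{1}{2}\ln[1 + 1/N(0)]$, so the small-noise phase variance saturates rather than growing without bound.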

Inverse-number expansion
In order to go further than the small-noise approximation we must try to deal with the $\mathrm{E}[1/N(t)]$ appearing in (15). To this end, let us work out how $\mathrm{E}[1/N(t)]$ changes in time.
Using (10) and Itô calculus we find (19). Hence we see that $\mathrm{E}[1/N(t)]$ depends on $\mathrm{E}[1/N^2(t)]$. However, since amplification increases the photon number, we expect $N(t)$ to increase on average, and hence $1/N(t)$ to decrease. This is illustrated in Fig. 1, where different realisations of $N(t)$ corresponding to an ideal linear amplifier are shown. If at time $t$ we find that on average $1/N^2(t) \ll 1/N(t)$, then we can neglect the second term in (19) and obtain (20) as a first-order approximation. Clearly, one can continue in a similar fashion and improve on (20) by taking into account the $\mathrm{E}[1/N^2(t)]$ term, whose equation of motion is simple to obtain and is given by (21). On neglecting $\mathrm{E}[1/N^3(t)]$ we can then solve (21) straightforwardly. The second-order approximation to $\mathrm{E}[1/N(t)]$ is then obtained by first solving (21) with $\mathrm{E}[1/N^3(t)]$ neglected and then using the resulting expression for $\mathrm{E}[1/N^2(t)]$ to solve (19). Performing these steps we arrive at (22). From this we can now appreciate the essential simplification made by the small-noise approximation, namely that it avoids the coupling of $\mathrm{E}[1/N(t)]$ to higher-order moments in $1/N(t)$. In general we will find that obtaining $\mathrm{E}[1/N(t)]$ exactly requires us to solve an infinite set of coupled differential equations involving all the moments of $1/N(t)$. In practice however, we would expect (and in fact we shall find) that truncating the set of coupled differential equations at some finite order $K$ approximates $\mathrm{E}[1/N(t)]$ sufficiently accurately. Our objective is thus to first obtain a general expression for $\mathrm{E}[1/N(t)]$ for any $K$ (i.e. any truncation) and subsequently an expression for the phase variance $\mathrm{V}[\Phi(t)]$. It will be convenient to define the inverse number $\Upsilon(t) = 1/N(t)$, where, using Itô's lemma, one can show that $\Upsilon(t)$ is the stochastic process defined by (24). Based on (20) and (22) we propose the expansion (25) for $\mathrm{E}[\Upsilon(t)]$. We call (25) the inverse-number expansion up to $K$th order. Equations (20) and (22) are thus examples of the inverse-number expansion up to order one and order two respectively.
The first two expansion coefficients can thus be read off from these equations as (26) and (27). Note that to avoid convergence issues in (25) the realisations of $N(0)$ need to be greater than one. The problem of obtaining the phase variance now requires one to find the expansion coefficients $\{g_n(t)\}_{n=1}^{K}$ up to any order $K$, and for arbitrary parameter values $\kappa_\uparrow$ and $\kappa_\downarrow$. In essence, the important step is the application of Itô's lemma to the $n$th power of $\Upsilon(t)$, giving (28), where we have defined (29) for ease of writing. We refer the reader to Appendix B for the proof of (28) and the subsequent derivation of the time-dependent expansion coefficients. The result is (30), where $\beta_{1,1} = 1$ and $\beta_{k,n}$ is otherwise given by (31). Note from (30) that $g_n(t) \to 0$ as $t \to \infty$. This means that the right-hand side of (15) eventually goes to zero and the second moment of phase reaches a constant value. We can write out, as an example, the first three coefficients of the inverse-number expansion in (32), (33), and (34). It is simple to see that (26) and (27) are reproduced on substituting (29) into (32) and (33). The advantage of the inverse-number expansion can also be seen in the complexity of $g_3(t)$, which would have been somewhat tedious to derive by solving a set of three coupled differential equations. Of course, after obtaining (25) one still has to calculate $\mathrm{E}[\Phi^2(t)]$ according to (15), and then the variance defined by (13). The expression for $\mathrm{E}[\Phi^2(t)]$ thus entails an integral of $g_n(t)$. We provide a discussion on the validity of the inverse-number expansion in Appendix C.
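The truncation idea can be sketched numerically without the closed-form coefficients $g_n(t)$. The code below is not the paper's procedure. It integrates the moment hierarchy directly, assuming (from an Itô calculation under the convention $dN = (\kappa_- N + \kappa_\uparrow)\,dt + \sqrt{2\kappa_\uparrow N}\,dV_N$, which is itself an assumption) that $x_n = \mathrm{E}[1/N^n]$ obeys $\dot{x}_n = -n\kappa_- x_n + n^2\kappa_\uparrow x_{n+1}$, closes the hierarchy by setting $x_{K+1} = 0$, and accumulates the phase variance at the assumed rate $(\kappa_\uparrow/2)\,x_1$. A coherent input is treated as a single point in phase space, so $x_n(0) = N(0)^{-n}$. The function name is hypothetical.

```python
def inverse_number_phase_variance(n0, kappa_up, kappa_down, t_final, order, steps=4000):
    """Phase variance from the inverse-moment hierarchy truncated at `order` (RK4 integration)."""
    k_minus = kappa_up - kappa_down

    def rhs(state):
        # state = [x_1, ..., x_K, V]: x_n approximates E[1/N^n], V is the accumulated variance
        out = []
        for n in range(1, order + 1):
            x_next = state[n] if n < order else 0.0  # truncation: x_{K+1} := 0
            out.append(-n * k_minus * state[n - 1] + n * n * kappa_up * x_next)
        out.append(0.5 * kappa_up * state[0])
        return out

    def shifted(state, slope, h):
        return [s + h * k for s, k in zip(state, slope)]

    state = [n0 ** (-n) for n in range(1, order + 1)] + [0.0]
    dt = t_final / steps
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs(shifted(state, k1, 0.5 * dt))
        k3 = rhs(shifted(state, k2, 0.5 * dt))
        k4 = rhs(shifted(state, k3, dt))
        state = [s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
    return state[-1]

if __name__ == "__main__":
    for order in (1, 2, 4):
        print(order, inverse_number_phase_variance(2.25, 1.0, 0.0, 20.0, order))
```

Under these assumptions the first-order truncation for an ideal amplifier saturates at $1/[2N(0)]$, which already exceeds the small-noise value $\tfrac{1}{2}\ln[1 + 1/N(0)]$.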

Phase uncertainty
We now have everything to present our main result. It is shown that for relatively weak inputs (whose number average is a few photons), the small-noise approximation inadequately captures the high-gain output phase noise even for an ideal linear amplifier. Such a linear amplifier is optimal with respect to the small-noise approximation, so if the approximation fails here, one can expect it to fail more severely for either nonideal linear amplifiers, or ideal amplifiers but with less gain. We have also provided a discussion on the use of the Glauber-Sudarshan distribution in describing the phase of the amplified field in Appendix D. In essence this validates our result in the high-gain regime.

General expression
Substituting (25) back into (15) and integrating we have To obtain the variance of Φ(t) we also require its mean. This is simple to get from (9) as the phase is entirely driven by a Wiener process so that on averaging (9) we have Squaring (36) and recalling the definition of the variance from (13) we arrive at where we have defined, on using (30),

Ideal linear amplifier
As explained earlier, an ideal linear amplifier is an amplifier that adds the least amount of noise. Such an amplifier corresponds to the model in Sec. 2.1 with $\kappa_\downarrow = 0$, so that $\kappa_- = \kappa_\uparrow$. Thus the ideal linear amplifier is entirely characterised by a single parameter, $\kappa_\uparrow$ (assuming a fixed $t$). This sufficiently simplifies the expression for the phase variance so that we may write it as an explicit function of $\kappa_\uparrow$. From (29) we have It then follows that $\beta_{k,n}$ is independent of $\kappa_\uparrow$. Recall that we defined $\beta_{1,1} = 1$ while and the phase variance then simplifies to (44), where $\chi_n(t)$ is now given by (45). Note that the product in the denominator of (45) skips over terms for which $j = k$, so that for $k = 1$ we have the empty product $\prod_{j=1, j \neq 1}^{1} (j - 1) = 1$. The first time-dependent coefficient for the ideal linear amplifier is therefore Let us now compare the phase variance obtained from integrating the inverse-number expansion to the variance obtained with the small-noise approximation [recall (18) from Sec. 3.1]. This is illustrated in Fig. 2 for an ideal linear amplifier, which we now explain. First, in Fig. 2(a) we show 500 sample paths of the phase, which gives us a visualisation of how the phase of an input signal diffuses as it gets amplified. In Fig. 2(b) we plot the phase variance from the 500 phase realisations (noisy blue curve) along with the variance obtained from (44) for $K = 4$ (red, solid line), $K = 1$ (cyan, dotted line), and the small-noise approximation (black dashed line). Recall that we defined the small-noise approximation in (16) back in Sec. 3.1. The same is true for Fig. 2(c) and (d), with the only difference being the value of $N(0)$. For Fig. 2(a) and (b) we took $N(0) = 2.25$, while for (c) and (d) we used $N(0) = 13$. This allows us to see how the small-noise approximation performs when applied to the amplification of two inputs of different strengths. Clearly, the inverse-number expansion does a much better job of estimating the output phase noise for relatively weak inputs. Another interesting observation from Fig. 2 is that even the lowest-order approximation ($K = 1$ in the inverse-number expansion), as explained in Sec. 3.2, performs visibly better than the small-noise approximation.
With the comparison shown in Fig. 2 at hand, let us now try to understand some of its essential features. First note how in both Fig. 2(b) and (d) the small-noise approximation appears to underestimate $\mathrm{V}[\Phi(t)]$.
It is in fact a general property that the small-noise approximated phase variance provides a lower bound for $\mathrm{V}[\Phi(t)]$. We can show this by considering the evolution of the variance given by (47), where we have used the fact that $\mathrm{E}[\Phi(t)]$ is constant. Now we note that Jensen's inequality states that $\mathrm{E}[f(X)] \geq f(\mathrm{E}[X])$ whenever $f$ is a convex function of the random variable $X$. It can then be shown that $1/N$ is convex for $N > 0$, and hence it follows from Jensen's inequality that (48) holds.
The right-hand side of the inequality (48) is exactly the rate of change of the phase variance in the small-noise approximation. Thus the exact phase variance must always rise at least as fast as, and hence will be greater than or equal to, the phase variance in the small-noise approximation. Note that (47) also says that as $N(t)$ increases without bound, the rate of change of the phase variance goes to zero. This means that the phase variance approaches a constant value and is therefore upper bounded. This is precisely what we see in both Fig. 2(b) and (d), where the phase variance can be seen to level off after some transient dynamics.
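The Jensen step can be checked on any positive sample: the mean of the reciprocals is never smaller than the reciprocal of the mean, with equality only when there is no spread. The snippet below is a generic illustration with made-up numbers, not data from the paper, and the function name is hypothetical.

```python
def jensen_gap(samples):
    """Return E[1/N] - 1/E[N] for a list of positive samples; non-negative by Jensen's inequality."""
    mean = sum(samples) / len(samples)
    mean_inverse = sum(1.0 / s for s in samples) / len(samples)
    return mean_inverse - 1.0 / mean

if __name__ == "__main__":
    print(jensen_gap([1.0, 3.0]))       # spread-out sample: strictly positive gap
    print(jensen_gap([2.0, 2.0, 2.0]))  # no spread: the gap vanishes
```

The gap grows with the relative spread of the sample, which is why replacing $\mathrm{E}[1/N(t)]$ by $1/\mathrm{E}[N(t)]$ loses more for weak inputs with broad number distributions.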
We can also try to understand why the small-noise approximation performs worse for smaller inputs. Recall from Sec. 3.1 that we expect the small-noise approximation to be good when the photon-number distribution is narrow. If the phase variance obtained from making such an approximation is to be accurate for all times then this condition must be met for all times as well. The condition of a narrow number distribution can also be understood as having a strong signal-to-noise ratio $\sigma(t)$, defined in (50). In fact, a Taylor expansion around the mean of $N(t)$ up to second order gives (51). Therefore to understand why the discrepancy between the $\mathrm{V}[\Phi(t)]$ obtained from the small-noise approximation and the inverse-number expansion is larger in Fig. 2(b) compared to Fig. 2(d), we should consider how $1/\sigma(t)$ evolves for different $N(0)$. From (51) we expect that the larger $1/\sigma(t)$ is, the worse the small-noise approximation performs. A closed-form expression for $1/\sigma(t)$ can be derived. Remember from (17) of Sec. 3.1 that we already have $\mathrm{E}[N(t)]$ for any $\kappa_\uparrow$ and $\kappa_\downarrow$, which we restate as (52). So to obtain $\sigma(t)$ we only need $\mathrm{E}[N^2(t)]$. It is not too difficult to derive this in general (i.e. for arbitrary $\kappa_\uparrow$ and $\kappa_\downarrow$), with the result given in (53). The number variance $\mathrm{V}[N(t)] = \mathrm{E}[N^2(t)] - (\mathrm{E}[N(t)])^2$ is thus given explicitly by (52) and (53), and hence so is $\sigma(t)$. The analytic result for $1/\sigma(t)$ is plotted in Fig. 3 to illustrate its dependence on $N(0)$. In particular, the $N(0) = 2$ and $N(0) = 13$ curves correspond to the two situations shown in Fig. 2 (ignoring the small difference of 0.25 for the weaker input). We have also plotted $1/\sigma(t)$ for intermediate values of $N(0)$ to illustrate how the evolution of the signal-to-noise ratio depends on $N(0)$. As can be seen from Fig. 3, $1/\sigma(t)$ rises much faster for $N(0) = 2$ than for $N(0) = 13$. This is consistent with the phase variance observed in Fig. 2(b), where the small-noise phase variance diverges from the inverse-number expansion result early on during transience and severely underestimates the stationary phase variance thereafter. On the other hand, the $1/\sigma(t)$ curve for $N(0) = 13$ rises much more slowly and remains at a much lower level compared to the $N(0) = 2$ case. Thus we now understand why in Fig. 2(d) the small-noise phase variance stays close to the inverse-number formula during transience and continues to do so at later times, in contrast to Fig. 2(b). It is also interesting to consider the case of a fixed input strength for different levels of non-ideality. This is shown in Fig. 4(a) for $N(0) = 3$. For nonideal amplifiers, $1/\sigma(t)$ takes longer to reach its steady-state value, so we have plotted the curves for a longer time. At long times (i.e. large $G_t$), the more ideal the amplifier is, the lower its $1/\sigma(t)$ value. This ordering is in fact preserved for all input strengths at large $G_t$. With the help of (52) and (53), we obtain a simple expression for $1/\sigma(t)$ when $G_t \gg 1$ (assuming a coherent-state input with amplitude $\alpha$), given in (54). From this we see that $\Sigma(|\alpha|^2) \leq 1$ (with equality when $|\alpha|^2 = 0$, i.e. no input), and that $\Sigma(|\alpha|^2) \to 0$ for $|\alpha|^2 \to \infty$, so that nonideal and ideal amplifiers become indistinguishable in so far as $\sigma(t)$ is concerned. For all nonzero and finite $|\alpha|^2$ the behaviour of $\Sigma(|\alpha|^2)$ is shown in Fig. 4(b). We plot $\Sigma(|\alpha|^2)$ for the same nonideal amplifiers as in Fig. 4(a). We find that the more ideal the amplifier is, the better it performs for any input strength. This again corroborates the discrepancy seen in Fig. 2 between the phase variance obtained from the small-noise approximation and the inverse-number expansion.
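The dependence of the inverse signal-to-noise ratio on input strength and non-ideality can be explored with a few lines of code. The sketch below makes several assumptions that the text above does not confirm: $\sigma(t)$ is taken to be $\mathrm{E}[N(t)]/\sqrt{\mathrm{V}[N(t)]}$; the coherent input is a point $\alpha$ in phase space with $N(0) = |\alpha|^2$; and the amplifier adds complex Gaussian noise of mean intensity $m = (\kappa_\uparrow/\kappa_-)(G_t - 1)$ to $\sqrt{G_t}\,\alpha$, which gives $\mathrm{V}[N(t)] = m^2 + 2 m G_t N(0)$. The function names are hypothetical.

```python
import math

def number_moments(n0, kappa_up, kappa_down, t):
    """Mean and variance of N(t) for a point-like coherent input (assumed Gaussian added noise)."""
    k_minus = kappa_up - kappa_down
    g = math.exp(k_minus * t)
    m = (kappa_up / k_minus) * (g - 1.0)  # mean intensity of the added noise
    mean = g * n0 + m
    variance = m * m + 2.0 * m * g * n0
    return mean, variance

def inverse_snr(n0, kappa_up, kappa_down, t):
    """1/sigma(t) with sigma assumed to be mean over standard deviation of N(t)."""
    mean, variance = number_moments(n0, kappa_up, kappa_down, t)
    return math.sqrt(variance) / mean

if __name__ == "__main__":
    for n0 in (2.0, 5.0, 13.0):
        print(n0, [round(inverse_snr(n0, 1.0, 0.0, t), 3) for t in (0.5, 1.0, 2.0, 40.0)])
```

In this sketch the high-gain value of $1/\sigma$ is larger for weaker inputs and, at fixed input strength, larger for a nonideal amplifier than for an ideal one, in line with the qualitative trends described for Figs. 3 and 4.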

Conclusion
In this paper we have shown the often-used small-noise approximation to be inadequate for capturing phase diffusion in linear amplifiers when the input contains a few photons on average. However, the phase noise of a 'slightly' stronger input (say ten photons or more) may be described reasonably well within the small-noise approximation. We have demonstrated this even under the conditions most favourable to the small-noise approximation: amplification of a single-mode field by an ideal linear amplifier with large gain (equivalent to amplifying an input for a long time with a fixed amplification rate). This is also the regime where one would ideally like to operate a linear amplifier. We obtained such a result by proposing the inverse-number expansion using the P (or Glauber-Sudarshan) function.
From our examples, we saw that the inverse-number expansion with a reasonably small set of terms was sufficient to capture the phase uncertainty beyond transience. In terms of the P function, a coherent state is a completely noise-free point in phase space, which makes such inputs extremely easy to handle. In this case, all of the noise seen at the amplifier's output is interpreted as added noise due to the amplifier being quantum-mechanical. The disadvantage is that the P function cannot predict the correct transient phase noise (though this is actually the less interesting case). The extended discussion of Appendix D implies that the W (i.e. Wigner) function is a better choice for capturing phase diffusion because it has the special property of giving true marginals. One possible extension which can be explored is to recast the inverse-number expansion in terms of the W function. This would then allow us to obtain a valid estimate for the phase noise even at small gain, or short times. The cost of this extension is that the initial inverse-number statistics become much more difficult to calculate, even for a simple input such as the coherent state.

A Phase variance within the small-noise approximation
We have already said in Sec. 3.1 that the small-noise approximation replaces the field intensity by its mean value. The second moment of Φ(t) then obeys (55), in which E[N(t)] is given by (56). This is a good approximation provided that the mean photon number is much greater than its standard deviation. Here we shall derive the phase variance in closed form within the small-noise approximation.
To start, we note that the phase is coupled to N(t) but it is purely noisy (i.e. it has no deterministic evolution), being driven by a Wiener process. The average of Φ(t) is therefore equal to its initial value E[Φ(0)]. Integrating (55) then gives us (57). Using (56), we write the integral of 1/E[N(t)] in a more compact form in terms of constants a and b, defined by (59) and (60). The exponential in the denominator can be dealt with by a substitution u, after which the corresponding indefinite integral can be written entirely in terms of u. The definite integral then follows on plugging in the definition of u and evaluating the limits. Substituting the result back into (57) and using (59) and (60) for a and b, we arrive at (18) in the main text.
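The substitution step can be checked numerically. The sketch below is illustrative only and does not reproduce the constants a and b of (59) and (60). It assumes a mean photon number of the form E[N(t)] = (n0 + c) exp(γt) − c, where the names n0, gain_rate (γ) and c are hypothetical labels introduced here. With c = 1 this is the familiar normally ordered mean G n0 + G − 1 of an ideal amplifier with gain G = exp(γt). Under that assumption the substitution u = (n0 + c) − c exp(−γs) turns the integrand into du/(γ c u), and the closed form is compared with a Simpson estimate.

```python
import math

def mean_number(t, n0, gain_rate, c):
    """Assumed (hypothetical) form E[N(t)] = (n0 + c) * exp(gain_rate * t) - c."""
    return (n0 + c) * math.exp(gain_rate * t) - c

def integral_closed_form(t, n0, gain_rate, c):
    """Integral of 1/E[N(s)] over 0 <= s <= t via u = (n0 + c) - c*exp(-gain_rate*s).

    The integrand becomes du / (gain_rate * c * u), which integrates to a logarithm.
    """
    u_t = (n0 + c) - c * math.exp(-gain_rate * t)
    u_0 = n0  # value of u at s = 0
    return math.log(u_t / u_0) / (gain_rate * c)

def integral_numeric(t, n0, gain_rate, c, steps=20000):
    """Composite Simpson estimate of the same integral (steps must be even)."""
    h = t / steps
    total = 1.0 / mean_number(0.0, n0, gain_rate, c)
    total += 1.0 / mean_number(t, n0, gain_rate, c)
    for k in range(1, steps):
        weight = 4.0 if k % 2 else 2.0
        total += weight / mean_number(k * h, n0, gain_rate, c)
    return total * h / 3.0

# Illustrative parameter values only: five input photons, unit rate, c = 1.
closed = integral_closed_form(2.0, 5.0, 1.0, 1.0)
numeric = integral_numeric(2.0, 5.0, 1.0, 1.0)
print(closed, numeric)
```

The two numbers should agree to many digits, which supports the logarithmic closed form for this assumed E[N(t)].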

B Derivation of the inverse-number expansion
B.1 Main proof
Recall the stochastic differential equation for Υ(t), given by (24) of Sec. 3.2. Applying Itô's lemma to Υ^n(t) gives (69), and taking the average of (69) gives (70). This proves that E[Υ^n] couples only to E[Υ^{n+1}]. Let us now define a K × 1 vector x(t) by x(t) = (E[Υ(t)], E[Υ^2(t)], ..., E[Υ^K(t)])^T.
This defines the mean of the inverse number Υ as the first component of x(t). We can then write (70) in the form of a linear matrix differential equation for x, given by (73), with coefficient matrix A. Furthermore, A is a triangular matrix with K distinct eigenvalues given directly by {b_n}_{n=1}^K. Letting D = diag(b), an eigendecomposition of A then gives A = S D S^{-1}, where the nth column of S is the eigenvector corresponding to b_n. The solution to (73) can then be written in component form as (79). The inverse-number expansion is then given by (79) for r = 1. On comparison to (25), this gives us the general expression (80) for the time-dependent coefficients in terms of S and its inverse.
It can be shown that S is given by (81), which we prove in Appendix B.2. Note that S is not an orthogonal matrix, so its transpose does not equal its inverse. Nevertheless, because S is an upper-triangular matrix it is fairly straightforward to obtain its inverse analytically using standard symbolic software such as Mathematica. The inverse of S is again upper triangular.
It is interesting to note that S has a structure such that, for any K, its inverse can be constructed directly by taking matrix powers. This is due to S being an upper-triangular matrix whose main diagonal consists only of ones. We can then write it in terms of a nilpotent matrix N as S = I + N, where I is the K × K identity matrix and N is strictly upper triangular such that N^p = 0 for all p ≥ K. The inverse of S can then be obtained as a finite (geometric) series in N, S^{-1} = I − N + N^2 − ... + (−1)^{K−1} N^{K−1}. This provides an alternative way to compute S^{-1}. Since N is strictly upper triangular, its powers are not too difficult to compute for small values of K. Note that even though we need only the first row of S for g_n(t), the remaining elements of S are still required to obtain S^{-1}.
Taking the first row of S and substituting (82)-(85) into the definition of β_{k,n} in (80), it is not too difficult to show that the coefficients take the form (86), where β_{1,1} = 1 and the remaining β_{k,n} are given by (87). Note that the top limit of the sum in (86) is now n rather than K as a result of (82) for m > n.
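The construction of the inverse from powers of the nilpotent part can be sketched in a few lines. The example below is a minimal illustration, not the paper's S of (81). The matrix S_example has ones on the diagonal and arbitrary, randomly chosen entries above it, which is an assumption made purely to exercise the identity S^{-1} = I − N + N^2 − ... for S = I + N.

```python
import numpy as np

def unit_upper_triangular_inverse(S):
    """Invert S = I + N, with N strictly upper triangular, via the finite series
    S^{-1} = I - N + N^2 - ... + (-1)^(K-1) N^(K-1)."""
    K = S.shape[0]
    identity = np.eye(K)
    N = S - identity              # nilpotent part: N^p = 0 for p >= K
    inverse = identity.copy()
    term = identity.copy()
    for _ in range(1, K):
        term = -term @ N          # builds (-N)^p for p = 1, ..., K-1
        inverse += term
    return inverse

# Illustrative matrix only: unit diagonal plus random strictly upper entries.
rng = np.random.default_rng(0)
K = 5
S_example = np.eye(K) + np.triu(rng.normal(size=(K, K)), k=1)
S_inv = unit_upper_triangular_inverse(S_example)
print(np.allclose(S_example @ S_inv, np.eye(K)))
```

The series terminates after K − 1 powers because N is nilpotent, so no convergence condition on the size of the entries is needed.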

B.2 Eigenvectors of A
The goal here is to prove that S has the form given in (81). This is equivalent to showing that the nth column of S, which we denote by v^(n), satisfies the eigenvalue equation A v^(n) = b_n v^(n), which is (88). Writing (88) in component form leads to (91), with the case m < n following from (92). For m > n it is clear that (81) gives zero for the left-hand side of (91) and also zero for the right-hand side. Finally, for m = n it is also trivial to see that (81) gives b_m for both sides of (91).

C Validity of the inverse-number expansion
C.1 An iterative derivation of the inverse-number expansion and
To show that (25), (30), and (31) of Sec. 3.2 are indeed correct we consider an alternative way to derive the inverse-number expansion. Let us return to (28) and note that it could have been solved iteratively. That is, it has the formal solution (97). This gives us a recursive relation between any two consecutive moments of Υ(t), so we may write E[Υ(t)] up to any desired order in moments of Υ(0) by repeated application of (97). For example, application of (97) for n = 1, 2, 3 gives (98). Using the definition of the expansion coefficients in (25), we find on repeated application of (98) the general result given by (99) and (100). Equations (99) and (100) may also be seen as an alternative form for the time-dependent coefficients of the inverse-number expansion. Compared to (30) and (31) they are rather clumsy to use for large K, as (100) requires one to compute an n-fold integral for g_{n−1}(t). However, the integrals only ever involve simple exponentials, so they are not too difficult to evaluate for small n. We can at least show that they match the results in (33) and (34). Considering n = 2 and n = 3 gives (101) and (102) respectively. It is clear that (101) is identical to (33). To see that (102) is consistent with (34) we only have to add the coefficients of exp(b_1 t) in (102).
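The agreement between the iterative route and the eigendecomposition route of Appendix B.1 can be illustrated on the smallest nontrivial truncation, K = 2. The sketch below assumes a generic hierarchy dx_n/dt = b_n x_n + c_n x_{n+1} in which each moment couples only to the next. The numerical values of b and c_1 are hypothetical and are not the paper's coefficients. The initial moments are taken as powers of 1/|α|^2, as appropriate for a noise-free coherent-state input with |α|^2 = 5.

```python
import numpy as np

def solve_by_eigendecomposition(A, x0, t):
    """x(t) = S exp(D t) S^{-1} x(0) for a diagonalisable matrix A."""
    eigenvalues, S = np.linalg.eig(A)
    propagator = S @ np.diag(np.exp(eigenvalues * t)) @ np.linalg.inv(S)
    return (propagator @ x0).real

def solve_iteratively_K2(b, c1, x0, t):
    """Iterative (formal) solution of the K = 2 hierarchy.

    x_2 evolves freely; x_1 is the free solution plus the integral of
    exp(b_1 (t - s)) * c_1 * x_2(s) over 0 <= s <= t, done analytically.
    """
    x2 = np.exp(b[1] * t) * x0[1]
    driven = c1 * x0[1] * (np.exp(b[1] * t) - np.exp(b[0] * t)) / (b[1] - b[0])
    x1 = np.exp(b[0] * t) * x0[0] + driven
    return np.array([x1, x2])

# Hypothetical coefficients, chosen only so that b_1 and b_2 are distinct.
b = np.array([-1.0, -2.0])
c1 = 0.5
A = np.array([[b[0], c1],
              [0.0, b[1]]])
x0 = np.array([1.0 / 5.0, 1.0 / 25.0])   # E[Y(0)] and E[Y^2(0)] for |alpha|^2 = 5

x_eig = solve_by_eigendecomposition(A, x0, 1.5)
x_iter = solve_iteratively_K2(b, c1, x0, 1.5)
print(x_eig, x_iter)
```

Both routes give the same vector, and the first component is a sum of exp(b_1 t) and exp(b_2 t) terms weighted by the initial moments, which is the structure of the expansion in (25).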

C.2 Number and phase as stochastic processes
We can illustrate how the inverse-number expansion works by comparing it to the average of Υ(t) over many realisations obtained from simulating (24). This comparison is made in Fig. 5 for the simple case of an ideal linear amplifier. We first show what different realisations of Υ(t) look like in Fig. 5(a). For visual clarity only five realisations of Υ(t) are shown. We then illustrate the analytic expression of E[Υ(t)] for a few values of K. If the claimed time-dependent coefficients given in (30) and (31) are correct then we ought to find a value of K for which there is good agreement between the analytic expression for E[Υ(t)] and that obtained from a large sample of Υ(t). In Fig. 5(b) we show how the sum in (25) changes for K = 1 (green dashed curve), K = 2 (blue solid curve), and K = 3 (black dotted curve). In Fig. 5(c) we compare the inverse-number expansion with K = 3 with the average of the five realisations of Υ(t) shown in (a). The same comparison is made with 200 realisations in Fig. 5(d). As can be seen, the sample average (solid blue line) is much smoother due to the larger sample size and is in good agreement with the analytic result (red dashed line). We note that in practice there is some threshold value of K beyond which the inverse-number expansion starts to deviate from the E[Υ(t)] calculated by averaging over many sample paths, though in theory we expect this K to be arbitrarily large. We find that for a coherent-state input |α⟩, this threshold tends to be larger for larger α. As long as K is less than the threshold we find good agreement between the analytic result given by the inverse-number expansion and the stochastic simulations.

D Phase in quantum optics and the use of the P distribution
Here we comment on the nature of our treatment of phase and its relation to other work on phase diffusion in linear amplifiers. We have chosen to work with the stochastic differential equations that are consistent with the Glauber-Sudarshan function as opposed to other quasiprobability distributions. It has been shown explicitly for linear amplifiers that it is the Fokker-Planck equation for the Wigner function (the W), rather than the Husimi (the Q) or the Glauber-Sudarshan (the P), which gives the correct phase diffusion coefficient in the small-noise approximation [15,16]. This is perhaps not hard to accept given that W(q, p), i.e. when parametrised by the canonical position and momentum, guarantees true marginal distributions when either q or p is integrated out. The idea of using the Wigner function is that one would expect this to still be true if the Wigner function was reparametrised by number and phase (a more complete discussion of this point is given in Sec. 5 in connection to possible future directions). Then in what sense does the Glauber-Sudarshan distribution provide a valid description of the phase noise in linear amplifiers? If we use the P rather than the W then we forgo the transient dynamics of phase during amplification. The stationary statistics of phase, however, remain correct. It is already known that P, W, and Q all give the same limiting phase distribution for an ideal linear amplifier in the high-gain regime [17,49] (though there is no reason to expect this to break down for nonideal amplifiers). For a fixed amplification rate κ_↑, the high-gain regime simply means that we amplify the field for sufficiently long. With the assistance of Ref. [17] we can actually compute the output phase probability density from the P distribution explicitly for a coherent-state input |α⟩ in the high-gain limit. The result, for α = |α| exp(iθ), is given by (103), where erf(x) = (2/√π) ∫_0^x exp(−y^2) dy is the error function. Note that here G_t = exp(κ_↑ t) since an ideal linear amplifier is assumed.
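To make the role of the error function concrete, the following sketch evaluates the phase marginal of a Gaussian P function. It is an illustration under stated assumptions and is not claimed to reproduce (103) term by term. It assumes that, after rescaling the output amplitude by 1/√G, the P function is the Gaussian (1/(πv)) exp(−|β − α|^2/v) with v = 1 − 1/G. Integrating over the radial coordinate then gives a density containing erf, written here through erfc for numerical stability. The function and variable names are hypothetical.

```python
import math
import numpy as np

def p_function_phase_density(phi, alpha_abs, theta, gain):
    """Phase marginal of an assumed Gaussian P function with variance v = 1 - 1/gain.

    Radial integration of (1/(pi v)) exp(-|beta - alpha|^2 / v) gives
    (1/2pi) [exp(-a^2/v) + sqrt(pi) c exp(-a^2 sin^2(d)/v) (1 + erf(c))],
    with a = alpha_abs, d = phi - theta and c = a cos(d) / sqrt(v).
    """
    v = 1.0 - 1.0 / gain
    delta = phi - theta
    c = alpha_abs * np.cos(delta) / math.sqrt(v)
    one_plus_erf = np.vectorize(math.erfc)(-c)      # 1 + erf(c) = erfc(-c)
    background = np.exp(-alpha_abs**2 / v)
    peak = math.sqrt(math.pi) * c * np.exp(-(alpha_abs * np.sin(delta))**2 / v) * one_plus_erf
    return (background + peak) / (2.0 * math.pi)

def phase_variance(alpha_abs, theta, gain, points=4096):
    """Variance of phi about theta on the window [theta - pi, theta + pi)."""
    phi = theta + np.linspace(-math.pi, math.pi, points, endpoint=False)
    density = p_function_phase_density(phi, alpha_abs, theta, gain)
    step = 2.0 * math.pi / points
    return float(np.sum((phi - theta)**2 * density) * step)

# Illustrative values only: |alpha| = 1.5, theta = 0.3, gain exp(4).
theta = 0.3
phi = theta + np.linspace(-math.pi, math.pi, 4096, endpoint=False)
density = p_function_phase_density(phi, 1.5, theta, math.exp(4.0))
normalisation = float(np.sum(density) * 2.0 * math.pi / 4096)
print(normalisation, phase_variance(1.5, theta, 1.05), phase_variance(1.5, theta, math.exp(4.0)))
```

Under these assumptions the density integrates to one and peaks at φ = θ, and the variance at small gain is smaller than at large gain, consistent with diffusion away from an initial delta function.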
Using (103) the phase statistics beyond transience can now be seen explicitly. To show that it converges to the correct phase distribution we use the Pegg-Barnett probability density as a reference. The Pegg-Barnett phase probability corresponding to some arbitrary state ρ(t) is defined on a truncated (s + 1)-dimensional Hilbert space spanned by the Fock basis {|n⟩}_{n=0}^s [50], in terms of phase states |φ_m⟩ defined for each m. Generally, all phase statistics obtained using the Pegg-Barnett operator must be calculated with a finite value of s, i.e. on the truncated space first. To obtain the correct statistics we must then take the s → ∞ limit at the end (and only at the end). An exception to this limiting procedure applies for physical states, which Pegg and Barnett defined to be states with finite average photon number. This is often true and is also true here. We obtained the Pegg-Barnett phase probability distribution numerically from the exact model given by the master equation (1) in Sec. 2.1. We compare (103) with the Pegg-Barnett distribution in Fig. 6 at two times. In both Fig. 6(a) and (b) the Pegg-Barnett phase probability is shown as the blue dashed curve while (103) is shown as the red solid curve. At t = 0 the phase variance from the P distribution would be zero. This is because the coherent state has a delta-function representation in terms of P. The Pegg-Barnett distribution, on the other hand, would correspond to the actual phase distribution, which would have some finite width at t = 0. After a short time the two distributions then broaden a little as shown in Fig. 6(a). The P-function phase density given by (103) can be seen to be much more sharply peaked than the Pegg-Barnett phase density. This makes sense since it diffused from an initial delta function δ(φ − θ) at t = 0.
As time progresses we find the P-function phase density to diffuse further, eventually converging to the Pegg-Barnett distribution at time T (the value of T is taken to be four). We also show how the phase variance evolves in the interval [ , T ] in Fig. 7. It can be seen that the phase diffuses much more quickly when described in terms of the P distribution (red solid curve). In relation to this we note that previous analyses have shown the Pegg-Barnett phase diffusion coefficient to be smaller than that from the P function, but under the small-noise assumption [15]. In the end, it is clear that at t = T = 4 the two phase variances differ only slightly, suggesting that the Glauber-Sudarshan distribution provides a good estimate for the phase variance after transience, or equivalently, in the high-gain regime (which is the limit in which one would typically like to operate the amplifier [29]). The P distribution then has the advantage that coherent-state inputs are easily handled, as it amounts to specifying deterministic initial conditions for the number and phase. In this case, all of the output noise is interpreted as added noise due to amplification, as the input is completely noise-free.
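The finite width of the Pegg-Barnett distribution at t = 0 can be illustrated for the input coherent state alone. The sketch below does not solve the master equation (1) and so does not reproduce the curves of Fig. 6. It uses the standard pure-state form of the Pegg-Barnett density for a physical state, (1/2π)|Σ_n c_n exp(−inφ)|^2, with coherent-state Fock amplitudes c_n and a truncation level s chosen large enough that the neglected tail is negligible. The function name and parameter values are hypothetical.

```python
import math
import numpy as np

def pegg_barnett_density_coherent(phi, alpha_abs, theta, s=80):
    """Pegg-Barnett phase density of a coherent state, truncated at Fock level s.

    Requires alpha_abs > 0. The Fock amplitudes
    c_n = exp(-|alpha|^2 / 2) alpha^n / sqrt(n!) are built from logarithms
    to avoid overflow in alpha^n and n!.
    """
    n = np.arange(s + 1)
    log_factorial = np.array([math.lgamma(k + 1) for k in n])
    magnitude = np.exp(-0.5 * alpha_abs**2 + n * math.log(alpha_abs) - 0.5 * log_factorial)
    amplitudes = magnitude * np.exp(1j * n * theta)
    overlap = np.exp(-1j * np.outer(phi, n)) @ amplitudes   # sum_n c_n exp(-i n phi)
    return np.abs(overlap)**2 / (2.0 * math.pi)

# Illustrative values only: |alpha| = 1.5 and theta = 0.3.
theta = 0.3
points = 4096
phi = theta + np.linspace(-math.pi, math.pi, points, endpoint=False)
density = pegg_barnett_density_coherent(phi, 1.5, theta)
step = 2.0 * math.pi / points
normalisation = float(np.sum(density) * step)
variance = float(np.sum((phi - theta)**2 * density) * step)
print(normalisation, variance)
```

The density is normalised and peaked at φ = θ, and its variance is strictly positive, in contrast to the delta function δ(φ − θ) that the P description assigns to the same input.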