Emergence of the Born rule in quantum optics

The Born rule provides a fundamental connection between theory and observation in quantum mechanics, yet its origin remains a mystery. We consider this problem within the context of quantum optics using only classical physics and the assumption of a quantum electrodynamic vacuum that is real rather than virtual. The connection to observation is made via classical intensity threshold detectors that are used as a simple, deterministic model of photon detection. By following standard experimental conventions of data analysis on discrete detection events, we show that this model is capable of reproducing several observed phenomena thought to be uniquely quantum in nature, thus providing greater elucidation of the quantum-classical boundary.


Introduction
Since the appearance of Bell's inequality, it has become apparent that local hidden variable models cannot be compatible with the complete mathematical formalism of quantum mechanics [1,2,3,4]. Indeed, recent loophole-free experiments appear to be consistent with this conclusion [5,6,7,8]. Nevertheless, there remains the open question of which observed phenomena, in particular, are truly quantum in nature and have no classical analogue. This question of elucidating the quantum-classical boundary is of practical importance, as many new and emerging technologies, such as quantum computing, quantum communication, and quantum sensing, rely upon this distinction for their efficacy and security [9].
The field of quantum optics would seem to be a good place to explore this question, as the systems of interest are relatively simple to describe in terms of discrete field modes, while the important light-matter interactions may be restricted to the physics of photodetection devices. One of the more curious aspects of quantum optics is the concept of the vacuum or zero-point field (ZPF). In quantum electrodynamics (QED), a vacuum state is defined simply to be the lowest energy state of a given field mode [10]. The number of photons in this state is taken to be zero, yet its energy is nonzero, giving rise to the notion of "virtual" photons. Although the quantum vacuum is viewed as being only virtual, its effects are quite real. Phenomena such as the Casimir force, van der Waals attraction, Lamb shifts, and spontaneous emission are all believed to have their origin in the quantum vacuum [11].
The prominence of vacuum states in quantum optics suggests that they may be useful in developing a physical theory that explores the quantum-classical boundary. In this work, we shall proceed by supposing that the quantum vacuum of QED is real, not virtual. In doing so, we shall abandon all formal reference to quantum theory and consider a world governed solely by classical physics, albeit one in which the presence of a reified vacuum field is unavoidable. Our connection to quantum theory will lie solely in the demand that the statistical description of the real vacuum field match that of the virtual one. Our goal in doing so will be to explore which observed quantum phenomena can be explained under this supposition. In particular, we shall explore in this work the emergence of the Born rule as a statistical prediction that is applicable only within a certain regime of validity and application.
Several previous attempts have been made to derive the Born rule from first principles [12]. Max Born, in his original 1926 paper, considered the problem of perturbative scattering and suggested that the resulting energy may be interpreted as a statistical average if the scattering amplitudes, when properly squared, are interpreted as probabilities [13]. Gleason provided the first attempt at a mathematical derivation of the Born rule but relied on an assumed association of Hermitian operators with measurement observables [14]. David Deutsch, in 1999, went further to argue that elementary decision theory may be used to deduce the Born rule as a necessary consequence of the other quantum axioms [15]. This argument has since been criticized as circular, as it requires the assumption of an agent with a particular predilection for $L^2$ norms [16]. Zurek has suggested decoherence as an explanation of the Born rule [17], although this view has likewise been criticized as insufficient [18]. More recently, Masanes et al. have claimed to derive the Born rule by assuming, among other things, that measurements consist of well-defined trials and always produce one of a pre-defined set of outcomes [19]. While seemingly innocuous, this assumption does not always hold in real, experimental settings where, for example, photons are detected at random times or, more often, not at all. An interesting result from Allahverdyan et al. provides a derivation of the Born rule from the dynamical law of quantum mechanics within the context of spin systems [20].
Working within the confines of the formalism does not seem a promising approach to deriving physical laws. What these and other attempts to derive the Born rule lack is any attempt to model the actual physics of measurement. This paper seeks to address that point by considering a deterministic model of measurement together with a reified quantum vacuum.
A reified quantum vacuum is the premise behind the theory of stochastic electrodynamics (SED), and we adopt a similar outlook here [21]. Previous work in SED considered the statistical behavior of physical systems immersed in the zero-point field. These included classical descriptions of the quantum harmonic oscillator ground state as well as spontaneous parametric downconversion [22]. Although these efforts were successful insofar as they predicted probability density functions identical to the corresponding quantum Wigner function, they failed to fully appreciate the critical role of measurement and experimental procedure in the observation of quantum phenomena. In particular, the role of post-selection and its relation to contextuality has received little attention within SED.
To address this deficiency, we shall consider here a local, deterministic model of photon detection wherein the only random variables determining the outcome of a measurement are those associated with the relevant vacuum states incident upon the device. This approach differs from previous work in stochastic optics, an offshoot of SED focused on quantum optics, wherein the intensity of incident waves above a given threshold determines only the probability of an outcome, leaving the actual realization to be determined by yet another, implicit, hidden variable [23]. Our approach uses a deterministic amplitude threshold crossing scheme to define detection events and is similar to the work of other researchers in this regard [24,25,26]. A key difference from previous work is the use of post-selection and the examination of asymptotic behavior to approximate ideal quantum predictions.
The structure of the paper is as follows. In section 2 we describe the mathematical model used to describe the reified vacuum field and use the single-mode approximation to make the correct correspondence with quantum optics. The connection to observation is made in section 3, where we describe a deterministic model of quantum measurement using amplitude threshold detection. From this, the Born rule is shown to arise as an emergent and approximate property of the model in the presence of measurements. Finally, in section 4 we consider the general problem of transformations of multiple vacuum modes under linear optics to arrive at a model approximating single-photon, multi-mode quantum states. Conclusions are summarized in section 5.

Continuum Description
Any classical electric field may be written in terms of a continuum of plane wave modes. Thus, the electric field at a point x and time t may be written, in Gaussian units, as where E(k) ≥ 0 is a scale factor related to the modal energy for wave vector k ∈ R 3 , a(k) ∈ C 3 gives the field direction and phase, and ω(k) ≥ 0 is the angular frequency. For a classical vacuum, ω(k) = k c, where k is the magnitude of k and c is the speed of light. The term "c.c." represents the complex conjugate of the term to the left. The magnetic field is similarly described, with a(k) replaced by k × a(k), so that specifying E(k), a(k), and ω(k) for all k ∈ R 3 provides a complete description of the electromagnetic field. Without loss of generality, we shall take a(k) to be stochastic, while E(k) is assumed fixed.
We now turn to the correspondence with quantum theory. Consistency with quantum electrodynamics will require that, at zero temperature, where we have now introduced $\hbar$, Planck's constant divided by $2\pi$, as setting the fundamental scale of the vacuum field. For nonzero temperatures, $E_0(\omega)$ is replaced by the expression
$$E_T(\omega) = E_0(\omega)\sqrt{\coth\!\left(\frac{\hbar\omega}{2k_BT}\right)},$$
where $\omega > 0$, $k_B$ is Boltzmann's constant, and $T > 0$ is the absolute temperature. Since the density of states is given by $\omega^2/(\pi^2c^3)$, the spectral energy density is
$$\rho_T(\omega) = \frac{\omega^2}{\pi^2c^3}\left[\frac{\hbar\omega}{2} + \frac{\hbar\omega}{e^{\hbar\omega/k_BT}-1}\right],$$
which corresponds to Planck's "second quantum theory" of blackbody radiation, with a zero-point energy term included [27]. Note also that, at zero temperature, $\rho_0(\omega) = \hbar\omega^3/(2\pi^2c^3)$ is Lorentz invariant, owing to the cubic dependence on frequency, so the spectral energy density is the same in all inertial reference frames [28].
The stochastic nature of the field is described entirely by a(·), and consistency with QED requires that it be a complex Gaussian random vector field such that, for any choice of polarization vectors, E[a µ (k)] = 0 and where E[·] denotes an expectation value [29]. More generally, the n-point correlations of the field are given by with all other combinations giving a zero expectation value. Of course, this mathematical correlation structure is only an idealization; on some spatio-temporal scale, the field must surely be correlated. We would furthermore expect that the statistical character of the field, its scale and correlations, might also change over time and space. Nevertheless, we shall proceed with this modest idealization of the zero-point field, as it will provide a useful model for the quantum vacuum.

Discrete-Mode Approximation
One can approximate the continuum of wave vector modes by a set of closely spaced discrete modes in a notional box. Given a cube of length L > 0, we define a set K of discrete-mode wave vectors as follows: The continuum wave vector space may now be decomposed into notional discrete cells C(k) = k + [− π L , π L ) 3 , each centered on a wave vector k ∈ K. Since the cells are disjoint and their union comprises all of R 3 , we may rewrite equation (1) as follows: Furthermore, if the cells are small (i.e., L is large), we may make the approximation This last integral yields a complex Gaussian random variable with zero mean and a variance of ∆k = (2π/L) 3 = 8π 3 /V corresponding to the volume of each cell [30]. We may therefore write where z µ,k is a standard complex Gaussian random variable (i.e., a complex Gaussian random variable such that E[z µ,k ] = 0, E[|z µ,k | 2 ] = 1, and E[z 2 µ,k ] = 0). Equivalently, we may write z µ,k in the form z µ,k = (x+iy)/ √ 2, where x, y are independent, real-valued standard normal random variables.
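As a quick numerical check of the defining moments, the following Python sketch draws samples of a standard complex Gaussian using the construction $z = (x + iy)/\sqrt{2}$ described above and estimates $E[z]$, $E[|z|^2]$, and $E[z^2]$. The function names, sample size, and seed are illustrative choices only and are not part of the model.

```python
import math
import random


def sample_standard_complex_gaussian(rng):
    """Draw z = (x + i*y)/sqrt(2) with x, y independent standard normals."""
    x = rng.gauss(0.0, 1.0)
    y = rng.gauss(0.0, 1.0)
    return complex(x, y) / math.sqrt(2.0)


def empirical_moments(n_samples=100_000, seed=0):
    """Return sample estimates of E[z], E[|z|^2], and E[z^2]."""
    rng = random.Random(seed)
    sum_z = 0j
    sum_abs2 = 0.0
    sum_z2 = 0j
    for _ in range(n_samples):
        z = sample_standard_complex_gaussian(rng)
        sum_z += z
        sum_abs2 += abs(z) ** 2
        sum_z2 += z * z
    return sum_z / n_samples, sum_abs2 / n_samples, sum_z2 / n_samples


if __name__ == "__main__":
    mean_z, mean_abs2, mean_z2 = empirical_moments()
    # Expect values near 0, 1, and 0, up to sampling error.
    print(mean_z, mean_abs2, mean_z2)
```

The estimates should approach 0, 1, and 0, respectively, with a sampling error that shrinks as the inverse square root of the number of samples.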
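We note that all discrete modes differing in either wave vector or polarization are independent since, for $\mathbf{k}, \mathbf{k}' \in K$,
$$E[z_{\mu,\mathbf{k}}\, z^*_{\mu',\mathbf{k}'}] = \delta_{\mu\mu'}\,\delta_{\mathbf{k}\mathbf{k}'}.$$
We shall chiefly be concerned with descriptions in terms of the discrete-mode approximation, as this affords the clearest correspondence with quantum optics. In particular, the lowering operator $\hat{a}_{\mu,\mathbf{k}}$ for discrete mode $(\mu, \mathbf{k})$ may be associated with the random variable $z_{\mu,\mathbf{k}}/\sqrt{2}$ in the sense that the vacuum expectation of the symmetrized number operator equals the variance of the corresponding random variable. To see this, observe that
$$\langle 0|\tfrac{1}{2}\big(\hat{a}^\dagger_{\mu,\mathbf{k}}\hat{a}_{\mu,\mathbf{k}} + \hat{a}_{\mu,\mathbf{k}}\hat{a}^\dagger_{\mu,\mathbf{k}}\big)|0\rangle = \tfrac{1}{2}$$
and, similarly,
$$E\big[\,|z_{\mu,\mathbf{k}}/\sqrt{2}|^2\,\big] = \tfrac{1}{2}.$$
Note that, since $z_{\mu,\mathbf{k}}$ and $z^*_{\mu,\mathbf{k}}$ commute, whereas $\hat{a}_{\mu,\mathbf{k}}$ and $\hat{a}^\dagger_{\mu,\mathbf{k}}$ do not, symmetrization of the operators is important to achieve the correct correspondence.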
The connection to quantum optics can be further elucidated by examining the modal energy. Quantum mechanically, the energy of the vacuum state is given by the expectation value of the Hamiltonian $\hat{H}$, which is simply the symmetrized number operator scaled by $\hbar\omega(\mathbf{k})$. Thus, $\langle 0|\hat{H}|0\rangle = \tfrac{1}{2}\hbar\omega(\mathbf{k})$ is identified as the average energy per vacuum mode. To find the corresponding classical value, we begin by computing the energy density of the electromagnetic field for the selected mode, as given by
$$u(\mathbf{x},t) = \frac{\Delta\mathbf{E}(\mathbf{x},t)^2 + \Delta\mathbf{B}(\mathbf{x},t)^2}{8\pi},$$
where the single-mode electric field is and the single-mode magnetic field is Using the fact that, for any complex vector v, v and, since $\Delta\mathbf{E}(\mathbf{x},t)^2 = \Delta\mathbf{B}(\mathbf{x},t)^2$, we conclude that
$$u(\mathbf{x},t) = \frac{\Delta\mathbf{E}(\mathbf{x},t)^2}{4\pi}.$$
Now consider the time average of $u(\mathbf{x},t)$, a spatially independent random variable given by $\bar{u}$ The expectation value of this time average over realizations of the ZPF is therefore
$$E[\bar{u}] = \frac{\hbar\omega(\mathbf{k})}{2V}.$$
This result matches the quantum mechanical prediction if one integrates over a box of volume $V$ to find the total expected energy. Of course, this volume is only notional and arises as an artifact of our discrete-mode approximation. It describes the degree to which the single-mode approximation is valid rather than any physical volume. For, say, a conical beam with a small half-angle of $\Delta\theta$ and a filtered bandwidth of $\Delta\omega$, we have $\Delta k = \pi\,\Delta\theta^2\,\Delta\omega/c$. Thus, as the beam is narrowed, the notional volume increases and the energy density decreases proportionally. An equivalent, and perhaps more physically meaningful, interpretation of the quantum mechanical energy, then, might be that the quantity $\langle 0|\hat{H}|0\rangle/V$ gives the expected energy density of a single vacuum mode when the wave vector is filtered and collimated to a resolution of $\Delta k = 8\pi^3/V$.

Coherent States
In quantum optics, coherent states are considered the closest analogue to a classical state. Previous work in SED has identified coherent states as arising from, for example, classical driven harmonic oscillators coupled to the ZPF [31]. Here we shall consider an optical analogue in which we add a classical plane wave to a single mode of the ZPF.
Recall that, previously, we had defined the ZPF to be of the form We now add to this a plane wave with wave vector k 0 and polarizationê 0 of the form where E 0 ∈ C is a complex number representing the amplitude and phase of the external plane wave. The total electric field is now In the single-mode approximation with ∆k = 8π 3 /V , the total field becomes where z is a standard complex Gaussian random variable. For reasons that will soon become apparent, we shall express E 0 in the form where α ∈ C is a complex number that will later be identified as the coherent state parameter. The combined field in the single-mode approximation may now be written The energy density of the corresponding electromagnetic field is where Taking the time average of u(x, t) gives and the expectation value ofū over realizations of the ZPF is This result matches the familiar energy density α|Ĥ |α /V of a quantum optical coherent state |α .
For general thermal states, $E_0$ is replaced by $E_T$ and, hence, $z/\sqrt{2}$ is replaced by the scaled quantity $\sigma z$, where
$$\sigma^2 = \frac{1}{2}\coth\!\left(\frac{\hbar\omega_0}{2k_BT}\right).$$
In this case, the average energy density becomes $(|\alpha|^2 + \sigma^2)\,\hbar\omega_0/V$. Note that nonzero temperatures merely have the effect of rescaling the ZPF for the given mode. At high temperatures ($\sigma \gg |\alpha|$), the coherent state becomes indistinguishable from thermal noise. Conversely, at large amplitudes ($|\alpha| \gg \sigma$), the coherent state becomes indistinguishable from a classical plane wave of fixed amplitude and phase.

Amplitude Threshold Detection
We have described a mathematical model for the QED vacuum in terms of a stochastic electromagnetic field. To make the important connection to observation and discrete detection events, we now introduce a simple deterministic model of photon detection based on amplitude threshold crossings and motivated by the observed behavior of real detectors.
Suppose that we have, to arbitrary precision, isolated a single angular frequency $\omega_0$, polarization mode $\hat{\mathbf{e}}_0$, and wave vector mode $\mathbf{k}_0$ of the vacuum in the discrete-mode approximation with wave vector resolution $\Delta k$. For the vacuum and coherent states, the energy density $u(\mathbf{x},t)$ at position $\mathbf{x}$ and time $t$ varies sinusoidally in time and space. We imagine a detection device that reacts slowly enough to be sensitive only to the time average, $\bar{u}$, of the energy density and note that this averaging eliminates both the temporal and spatial dependence of the energy density. Using a time average is justified by the fact that a typical period of light is orders of magnitude shorter than the corresponding lag time for the photoelectric effect [32]. Now, although $\bar{u}$ is constant across space and time, it varies from one vacuum field realization to another due to the presence of the random variable $z$. We now ask whether this time-averaged energy density falls above some threshold $\Gamma^2 \ge 0$. Such an outcome will be deemed a detection event or "click" of a detector, and the probability of such an event occurring will be denoted $\Pr[\bar{u} > \Gamma^2]$. Note that the vacuum realization $z$ is the only source of randomness in determining this probability, and, in the single-mode limit, the coherence time of the given vacuum mode is infinite.

Dark Counts
The time-averaged energy density of the vacuum is given by equation (32) with $\alpha = 0$ and, being the sum of two squared independent normal random variables, has an exponential distribution with a mean of $\tfrac{1}{2}\hbar\omega_0/V$. The probability of a detection event is therefore
$$\Pr[\bar{u} > \Gamma^2] = \exp\!\left(-\frac{2V\Gamma^2}{\hbar\omega_0}\right).$$
Since we have assumed $V$ is large, we may take $\Gamma^2$ to be comparably small. In particular, we shall adopt the single-mode limit, analogous to the thermodynamic limit, in which $V \to \infty$ and $\Gamma^2 \to 0$ such that
$$\frac{V\Gamma^2}{\hbar\omega_0} \to \gamma^2$$
for some $\gamma \ge 0$. (Note that $\gamma$ may be specific to a particular polarization, frequency, and wave vector resolution.) In the single-mode limit, the probability of a detection event is $\exp(-2\gamma^2)$, which we interpret as the probability of a dark count for the vacuum state at zero temperature. In a thermal state ($T > 0$), we replace $\tfrac{1}{2}\hbar\omega_0$ with $\sigma^2\hbar\omega_0$, so the probability of a dark count becomes $\exp(-\gamma^2/\sigma^2)$. This, again, amounts to an effective rescaling of the detection threshold, so there is no loss of generality in supposing $T = 0$.
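The dark-count probability $\exp(-2\gamma^2)$ can be checked by direct simulation. The sketch below works in scaled units, assuming the click condition for the vacuum reduces to $|z/\sqrt{2}|^2 > \gamma^2$ with $z$ a standard complex Gaussian, which is what the stated exponential distribution and dark-count probability imply. The function name, trial count, and seed are illustrative.

```python
import math
import random


def dark_count_fraction(gamma, trials=200_000, seed=1):
    """Fraction of vacuum realizations whose scaled energy exceeds gamma^2."""
    rng = random.Random(seed)
    clicks = 0
    for _ in range(trials):
        x = rng.gauss(0.0, 1.0)
        y = rng.gauss(0.0, 1.0)
        # Vacuum amplitude a = z/sqrt(2) with z = (x + i*y)/sqrt(2),
        # so |a|^2 = (x^2 + y^2)/4.
        if (x * x + y * y) / 4.0 > gamma ** 2:
            clicks += 1
    return clicks / trials


if __name__ == "__main__":
    for gamma in (0.5, 1.0, 1.5):
        print(gamma, dark_count_fraction(gamma), math.exp(-2.0 * gamma ** 2))
```

Each simulated fraction should agree with $\exp(-2\gamma^2)$ to within the Monte Carlo sampling error.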
The prediction of a nonzero dark count rate at zero temperature is, strictly speaking, at variance with quantum mechanical predictions. Even under ideal conditions, our model predicts a nonzero probability of a vacuum detection event; quantum mechanically this probability should be exactly zero. However, even at extremely low temperatures, nonzero dark count rates are experimentally observed [33], and there is no reason to believe this does not hold generally.
For an explicit, albeit notional, example of a physical detection mechanism, one may consider a classical charged particle in a bifurcating harmonic potential. Such a potential has the quadratic form $\tfrac{1}{2}m\omega^2x^2$ for mass $m$ and displacement $x$, so long as $|x|$ does not exceed a critical displacement. Beyond that critical displacement, the potential is strongly repulsive and the particle quickly accelerates away, thereby creating an observable event. Since the trapped particle behaves as a high-Q linear filter, its behavior will closely match that of the resonant vacuum mode. If the polarization is linear and aligned with the displacement of the potential, the particle's motion will bifurcate and run away if the modal amplitude is sufficiently high.
Despite some similarities, the adoption of a threshold detection scheme for modeling photon detection should not be construed as a semi-classical treatment, as we are still completely within the confines of classical physics. Although we have adopted a very simple model of single-photon detection, these general qualitative observations are expected to hold in a more detailed physical model. In what follows, we shall make no further reference to the particular physical mechanism used for detection and will instead focus on the more abstract notion of threshold detection in the single-mode limit.

Emergence of the Born Rule
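For coherent states, a detection event in the single-mode limit may be written
$$\left|\alpha + \frac{z}{\sqrt{2}}\right|^2 > \gamma^2.$$
With detection events so defined, we may identify the complex amplitude $a$, given by
$$a = \alpha + \frac{z}{\sqrt{2}},$$
and note that $4|a|^2$ follows a non-central $\chi^2$ distribution with two degrees of freedom and a noncentrality parameter of $4|\alpha|^2$. Thus, the cumulative distribution function (cdf) of $|a|^2$ is given by the expression [34]
$$\Pr[|a|^2 \le \gamma^2] = 1 - Q_1(2|\alpha|, 2\gamma),$$
where $Q_1(\cdot,\cdot)$ is the Marcum Q-function, defined by
$$Q_1(\mu,\nu) = \int_\nu^\infty x\,\exp\!\left(-\frac{x^2+\mu^2}{2}\right) I_0(\mu x)\,dx,$$
and $I_0(\cdot)$ is the zeroth-order modified Bessel function of the first kind.
The probability of a detection event is therefore given by
$$\Pr[|a|^2 > \gamma^2] = Q_1(2|\alpha|, 2\gamma). \tag{41}$$
We now note that, to second order in $|\alpha|$,
$$\Pr[|a|^2 > \gamma^2] \approx e^{-2\gamma^2}\left(1 + 4\gamma^2|\alpha|^2\right). \tag{42}$$
The presence of $|\alpha|^2$ in this lowest-order approximation is the first indication of the emergence of the Born rule, although the correspondence is subtle and requires some discussion.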
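For readers who wish to evaluate the detection probability numerically, the following sketch uses the fact stated above that $4|a|^2$ is noncentral $\chi^2$ with two degrees of freedom and noncentrality $4|\alpha|^2$. It relies on the standard Poisson-mixture series for the noncentral $\chi^2$ survival function (an implementation choice, not taken from the text) and compares the result with a second-order expansion, which under these definitions works out to $e^{-2\gamma^2}(1 + 4\gamma^2|\alpha|^2)$. Function names and the truncation length are illustrative.

```python
import math


def detection_probability(alpha_abs, gamma, terms=200):
    """Pr[|a|^2 > gamma^2] for a = alpha + z/sqrt(2), z standard complex Gaussian.

    Uses the Poisson-mixture series for the survival function of a
    noncentral chi-square variable with 2 degrees of freedom.
    """
    lam = 2.0 * alpha_abs ** 2   # half the noncentrality parameter 4|alpha|^2
    x = 2.0 * gamma ** 2         # half the chi-square threshold 4*gamma^2
    poisson = math.exp(-lam)     # Poisson weight for j = 0
    inner_term = 1.0             # x^i / i! for i = 0
    inner_sum = 1.0              # sum of x^i / i! for i <= j
    total = 0.0
    for j in range(terms):
        total += poisson * inner_sum
        poisson *= lam / (j + 1)
        inner_term *= x / (j + 1)
        inner_sum += inner_term
    return math.exp(-x) * total


def second_order_approximation(alpha_abs, gamma):
    """Small-|alpha| expansion of the detection probability."""
    return math.exp(-2.0 * gamma ** 2) * (1.0 + 4.0 * gamma ** 2 * alpha_abs ** 2)


if __name__ == "__main__":
    for alpha_abs in (0.0, 0.05, 0.2, 0.707):
        print(alpha_abs,
              detection_probability(alpha_abs, 1.0),
              second_order_approximation(alpha_abs, 1.0))
```

For $\alpha = 0$ the series collapses to the dark-count probability $e^{-2\gamma^2}$, and for small $|\alpha|$ the exact and approximate values should agree closely.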
According to quantum mechanics, the probability of observing $n$ photons given a coherent state $|\alpha\rangle$ is $p_n = |\langle n|\alpha\rangle|^2 = e^{-|\alpha|^2}|\alpha|^{2n}/n!$. Hence, the probability of observing no photons at all is $p_0 = e^{-|\alpha|^2}$, while the probability of observing at least one photon is $1 - p_0 = 1 - e^{-|\alpha|^2} \approx |\alpha|^2$ for $|\alpha| \ll 1$. According to equation (41), for $\alpha = 0$ we have $\Pr[|a|^2 > \gamma^2] = e^{-2\gamma^2}$, which we interpret as the dark count probability of the vacuum state. For $\alpha \neq 0$, equation (42) will be a good approximation for $|\alpha|^2 \ll 1/(4\gamma^2)$. Furthermore, for $\gamma^2 \gg \tfrac{1}{2}$ we will have low dark counts. So, for $|\alpha|^2 \ll 1/(4\gamma^2) \ll \tfrac{1}{2}$ (i.e., $|\alpha|$ small and $\gamma$ large) we expect to be in the near-single-photon regime.
Figure 1 shows an example using $\alpha = 0.707\cos\theta$, $\gamma = 1$, and $N = 10^4$ random realizations. Examining $N\Pr[|a|^2 > \gamma^2]$ as a function of $\theta$, we observe a near-perfect sinusoidal pattern with a period of $\pi$ that has a minimum of $Ne^{-2\gamma^2} \approx 1353$ and a maximum of $NQ_1(1.414, 2) \approx 3942$. Subtracting the dark counts and renormalizing by the resulting maximum value, as one normally does in practice, gives a good approximation to the $\cos^2\theta$ probability law one would expect for an application of the Born rule to single-photon detection. Furthermore, reducing the magnitude of $\alpha$, and of course ignoring the many non-detection events, gives arbitrarily good agreement. (If $\alpha$ is identically zero we will have a constant dark count rate which, when subtracted out, gives the quantum mechanical prediction of zero.)
Now, for a general coherent state $|\alpha\rangle$, quantum mechanics does not actually predict a probability of $\cos^2\theta$, as our detector only indicates the presence of one or more photons. The actual predicted probability is $1 - e^{-|\alpha|^2}$, which is only approximately sinusoidal. Comparing this to equation (41), suitably normalized, we observe a subtle difference. For $\alpha = \cos\theta$ and $\gamma = 1$, our model predicts a slightly lower probability than the quantum prediction of $1 - e^{-|\alpha|^2}$. (See figure 2.) For $\gamma = 0.5$ it is slightly higher. Treating $\gamma$ as an adjustable parameter, then, allows for an arbitrarily good fit.
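The dark-count subtraction and renormalization described above can be sketched as follows. The code evaluates exact detection probabilities rather than simulating $N$ random realizations, so it illustrates the procedure without reproducing figure 1 itself. The helper `detection_probability` is a series evaluation of the noncentral $\chi^2$ survival function, and all function names are illustrative.

```python
import math


def detection_probability(alpha_abs, gamma, terms=200):
    """Pr[|a|^2 > gamma^2] via the noncentral chi-square (2 dof) series."""
    lam = 2.0 * alpha_abs ** 2
    x = 2.0 * gamma ** 2
    poisson, inner_term, inner_sum, total = math.exp(-lam), 1.0, 1.0, 0.0
    for j in range(terms):
        total += poisson * inner_sum
        poisson *= lam / (j + 1)
        inner_term *= x / (j + 1)
        inner_sum += inner_term
    return math.exp(-x) * total


def normalized_fringe(theta, alpha_max, gamma):
    """Subtract the dark-count level and rescale by the resulting maximum."""
    dark = math.exp(-2.0 * gamma ** 2)
    peak = detection_probability(alpha_max, gamma)
    p = detection_probability(abs(alpha_max * math.cos(theta)), gamma)
    return (p - dark) / (peak - dark)


def max_deviation_from_cos2(alpha_max, gamma, n_angles=181):
    """Largest gap between the normalized fringe and cos^2(theta) on a grid."""
    worst = 0.0
    for i in range(n_angles):
        theta = math.pi * i / (n_angles - 1)
        gap = abs(normalized_fringe(theta, alpha_max, gamma) - math.cos(theta) ** 2)
        worst = max(worst, gap)
    return worst


if __name__ == "__main__":
    print(max_deviation_from_cos2(0.707, 1.0))
    print(max_deviation_from_cos2(0.2, 1.0))
```

Reducing `alpha_max` should shrink the deviation from $\cos^2\theta$, in line with the statement that smaller $|\alpha|$ gives better agreement.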
A further comparison to experimental observations can be made. For Poisson-distributed photon statistics, experimentalists often use a parametric model of the form
$$p = 1 - (1-\delta)\,e^{-\eta|\alpha|^2}, \tag{43}$$
where $p$ is the probability of a detection event, $\delta$ is the dark count probability, and $\eta \in [0, 1]$ is the detection efficiency [35,36]. Our model conforms with this general expression in the small $|\alpha|$ limit if we take $\delta = e^{-2\gamma^2}$ and
$$\eta = \frac{4\gamma^2}{e^{2\gamma^2} - 1}.$$
Note that, in this interpretation, the effective detection efficiency increases as the threshold $\gamma$ is decreased, attaining near unit efficiency for $\gamma \approx 0.8$; however, for values much lower than this the efficiency is over unity and this interpretation is no longer valid.
Figure 2 (caption, partially recovered): ... equation (41). The dotted blue line is the quantum mechanical prediction for detecting one or more photons.
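To make the efficiency interpretation concrete, the sketch below assumes the parametric model takes the common form $p = 1 - (1-\delta)e^{-\eta|\alpha|^2}$ and matches it to the small-$|\alpha|$ behavior of the threshold model, which gives $\delta = e^{-2\gamma^2}$ and $\eta = 4\gamma^2/(e^{2\gamma^2}-1)$. Both the assumed form and the function names are illustrative and should be checked against the cited references.

```python
import math


def detection_probability(alpha_abs, gamma, terms=200):
    """Pr[|a|^2 > gamma^2] via the noncentral chi-square (2 dof) series."""
    lam = 2.0 * alpha_abs ** 2
    x = 2.0 * gamma ** 2
    poisson, inner_term, inner_sum, total = math.exp(-lam), 1.0, 1.0, 0.0
    for j in range(terms):
        total += poisson * inner_sum
        poisson *= lam / (j + 1)
        inner_term *= x / (j + 1)
        inner_sum += inner_term
    return math.exp(-x) * total


def effective_efficiency(gamma):
    """Efficiency eta obtained by matching the first-order term in |alpha|^2."""
    return 4.0 * gamma ** 2 / math.expm1(2.0 * gamma ** 2)


def parametric_probability(alpha_abs, gamma):
    """Assumed parametric form p = 1 - (1 - delta) * exp(-eta * |alpha|^2)."""
    delta = math.exp(-2.0 * gamma ** 2)
    eta = effective_efficiency(gamma)
    return 1.0 - (1.0 - delta) * math.exp(-eta * alpha_abs ** 2)


if __name__ == "__main__":
    for gamma in (0.3, 0.8, 1.0, 1.5):
        print(gamma, effective_efficiency(gamma))
    print(detection_probability(0.05, 0.8), parametric_probability(0.05, 0.8))
```

Under this assumption the efficiency is close to one near $\gamma \approx 0.8$, exceeds one for much smaller thresholds, and falls off as the threshold is raised.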
Finally, another important quantity in experimental quantum optics is the interferometric visibility, which measures the degree of coherence in the prepared state. This may be defined as the ratio of the difference in maximum and minimum probabilities to their sum, which in our case is
$$\mathcal{V} = \frac{Q_1(2|\alpha|, 2\gamma) - e^{-2\gamma^2}}{Q_1(2|\alpha|, 2\gamma) + e^{-2\gamma^2}}.$$
Taking $\gamma$ to be large, with $\alpha$ fixed, therefore gives a fringe visibility arbitrarily close to unity.
For larger values of $|\alpha|$, corresponding more closely to the classical regime, the convergence to unity occurs more rapidly. We illustrate this in figure 3, plotting visibility as a function of the threshold for different values of $\alpha$. It is important to note that the visibility described here is in terms of the probability of detection, $\Pr[|a|^2 > \gamma^2]$, not the intensity, $|a|^2$, which is random, nor the expected intensity, $E[|a|^2]$, which would give a visibility of one half. This point is important for a proper comparison with quantum mechanics, which predicts visibilities as high as one for actual measured counts, not classical intensities.
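The dependence of the visibility on the threshold can be explored with a short calculation. The sketch below takes the maximum detection probability to occur at full amplitude $|\alpha|$ and the minimum to be the dark-count level $e^{-2\gamma^2}$, following the definition of visibility given above. The function names and the sampled parameter values are illustrative.

```python
import math


def detection_probability(alpha_abs, gamma, terms=200):
    """Pr[|a|^2 > gamma^2] via the noncentral chi-square (2 dof) series."""
    lam = 2.0 * alpha_abs ** 2
    x = 2.0 * gamma ** 2
    poisson, inner_term, inner_sum, total = math.exp(-lam), 1.0, 1.0, 0.0
    for j in range(terms):
        total += poisson * inner_sum
        poisson *= lam / (j + 1)
        inner_term *= x / (j + 1)
        inner_sum += inner_term
    return math.exp(-x) * total


def visibility(alpha_abs, gamma):
    """(P_max - P_min) / (P_max + P_min) for the detection probability."""
    p_max = detection_probability(alpha_abs, gamma)
    p_min = math.exp(-2.0 * gamma ** 2)   # dark-count level at alpha = 0
    return (p_max - p_min) / (p_max + p_min)


if __name__ == "__main__":
    for alpha_abs in (0.5, 0.707, 1.0):
        print(alpha_abs, [round(visibility(alpha_abs, g), 4) for g in (0.5, 1.0, 2.0, 3.0)])
```

For each fixed amplitude the visibility rises toward unity as the threshold grows, and it does so faster for larger $|\alpha|$.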

Dual-mode Detection
Previously, we considered measurements along a single polarization mode and found that the associated probabilities follow the Born rule, albeit with a threshold-dependent rescaling and fixed offset in accordance with equation (43). Such measurements cannot distinguish between a missed detection and an event that would have resulted in a detection in an orthogonal polarization. Dual-mode detection provides an alternative method for comparing against the Born rule that overcomes this deficiency.
Let $\hat{\mathbf{e}}_H$ and $\hat{\mathbf{e}}_V$ denote the horizontal and vertical polarization unit vectors for a given wave vector. A linearly polarized coherent state for this wave vector may be described by the complex amplitude vector
$$\mathbf{a} = \begin{pmatrix} a_H \\ a_V \end{pmatrix} = \begin{pmatrix} \alpha\cos\theta + z_H/\sqrt{2} \\ \alpha\sin\theta + z_V/\sqrt{2} \end{pmatrix},$$
where $z_H$ and $z_V$ are independent standard complex Gaussian random variables. Note that $\theta = 0$ corresponds to a vacuum state in the orthogonal polarization mode, which is always assumed to be present.
As a consequence of the independence of $z_H$ and $z_V$, the random variables $|a_H|^2$ and $|a_V|^2$ are also independent, and their joint cdf is given by the product of their marginal distributions. Let us suppose a dual-mode detector that will register separate events if either $|a_H| > \gamma$ or $|a_V| > \gamma$ is true. This would be the case if the detector were, say, a pair of bifurcating harmonic oscillators oriented in the horizontal and vertical polarization directions. Equivalently, we may consider a polarizing beam splitter that separates the components to two single-mode detectors. Writing $q_H = Q_1(2|\alpha\cos\theta|, 2\gamma)$ and $q_V = Q_1(2|\alpha\sin\theta|, 2\gamma)$ for the two single-mode detection probabilities, the probability of no detection occurring is then
$$P_0 = (1 - q_H)(1 - q_V).$$
Likewise, the probabilities for the three possible detection events are
$$P_H = q_H(1 - q_V), \qquad P_V = (1 - q_H)\,q_V, \qquad P_{HV} = q_H\,q_V,$$
where $P_H$ is the probability of a single detection of $H$, $P_V$ is the probability of a single detection of $V$, and $P_{HV}$ is the probability of both.
In actual experiments with coherent light it is common to reject events in which there are two detections and, of course, ignore those with none. Out of a notional, unknown number $N$ of independent trials, one measures $S_H = NP_H$ single counts for $H$, $S_V = NP_V$ single counts for $V$, and $NP_{HV}$ "accidental" coincidence counts. If we post-select on single detection events, the conditional probability, $p_H$, of detecting $H$ is
$$p_H = \frac{P_H}{P_H + P_V} = \frac{S_H}{S_H + S_V}.$$
We may now compare $p_H$ with the Born rule prediction of $\cos^2\theta$. An example is plotted in figure 4 for $|\alpha|^2 = 0.5$ and $\gamma = 1$. The agreement is perfect when $\theta = 45^\circ, 135^\circ$ (diagonal and anti-diagonal polarization, respectively), resulting in balanced probabilities and a conditional probability of $\tfrac{1}{2}$. For other values of $\theta$, we find
$$\tfrac{1}{2}(1 - \mathcal{V}) \le p_H \le \tfrac{1}{2}(1 + \mathcal{V}),$$
where $\mathcal{V}$ is the visibility. For our particular case, $\mathcal{V} = 0.61$, so $0.19 \le p_H \le 0.81$.
The maximum discrepancy arises when $\theta$ is $0^\circ$ or $90^\circ$. For $\theta = 0^\circ$, the polarization of the wave is horizontal, but we are still not guaranteed an $H$ outcome, even conditionally, because the probability of a "false" $V$ detection is still nonzero. Similarly, for $\theta = 90^\circ$, the polarization of the wave is vertical, but an $H$ outcome is still possible due to dark counts. In any realistic experiment, such events will be unavoidable and are quantified by a visibility below unity. Such anomalous events are effectively removed by renormalization, resulting in a modified conditional probability of the form
$$\tilde{p}_H = \frac{p_H - \tfrac{1}{2}(1 - \mathcal{V})}{\mathcal{V}}.$$
This renormalized conditional probability gives excellent agreement with the Born rule prediction, as shown in figure 4.
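The post-selection procedure for dual-mode detection can be sketched numerically. The code below assumes the two polarization amplitudes are $\alpha\cos\theta$ and $\alpha\sin\theta$, each with independent vacuum noise, and that the renormalization is the affine map taking the observed range of $p_H$ onto $[0, 1]$. That last choice is an assumption made for illustration, and all function names are hypothetical.

```python
import math


def detection_probability(alpha_abs, gamma, terms=200):
    """Pr[|a|^2 > gamma^2] via the noncentral chi-square (2 dof) series."""
    lam = 2.0 * alpha_abs ** 2
    x = 2.0 * gamma ** 2
    poisson, inner_term, inner_sum, total = math.exp(-lam), 1.0, 1.0, 0.0
    for j in range(terms):
        total += poisson * inner_sum
        poisson *= lam / (j + 1)
        inner_term *= x / (j + 1)
        inner_sum += inner_term
    return math.exp(-x) * total


def conditional_h_probability(theta, alpha_abs, gamma):
    """p_H: probability of H given exactly one of the two detectors clicked."""
    q_h = detection_probability(abs(alpha_abs * math.cos(theta)), gamma)
    q_v = detection_probability(abs(alpha_abs * math.sin(theta)), gamma)
    p_h_only = q_h * (1.0 - q_v)
    p_v_only = (1.0 - q_h) * q_v
    return p_h_only / (p_h_only + p_v_only)


def renormalized_h_probability(theta, alpha_abs, gamma):
    """Affine rescaling of p_H onto [0, 1] (assumed form of renormalization)."""
    vis = 2.0 * conditional_h_probability(0.0, alpha_abs, gamma) - 1.0
    p_h = conditional_h_probability(theta, alpha_abs, gamma)
    return (p_h - 0.5 * (1.0 - vis)) / vis


if __name__ == "__main__":
    alpha_abs, gamma = math.sqrt(0.5), 1.0
    for degrees in (0, 30, 45, 60, 90):
        theta = math.radians(degrees)
        print(degrees,
              conditional_h_probability(theta, alpha_abs, gamma),
              renormalized_h_probability(theta, alpha_abs, gamma),
              math.cos(theta) ** 2)
```

With $|\alpha|^2 = 0.5$ and $\gamma = 1$ the raw conditional probability stays between roughly 0.19 and 0.81, while the rescaled version tracks $\cos^2\theta$ much more closely.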

Particle-like Behavior
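Consider a coherent state prepared in some polarization mode $\hat{\mathbf{e}}_0$ and spatial mode $\mathbf{k}_R$ traveling to the right that is incident upon a 50/50 beam splitter (BS). The outgoing beams have orthogonal spatial modes of $\mathbf{k}_R$ and $\mathbf{k}_D$ traveling right and down, respectively, each with the same polarization mode. The initial state may be described by the vector
$$\mathbf{a} = \begin{pmatrix} a_R \\ a_D \end{pmatrix} = \begin{pmatrix} \alpha + z_R/\sqrt{2} \\ z_D/\sqrt{2} \end{pmatrix},$$
where $z_R$ and $z_D$ are independent standard complex Gaussian random variables corresponding to the ZPF components of the two spatial modes. For simplicity, we ignore the orthogonal polarization modes.
The beam splitter acts as a Hadamard gate $H$, transforming $\mathbf{a}$ into
$$\mathbf{a}' = H\mathbf{a} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\mathbf{a} = \begin{pmatrix} \alpha/\sqrt{2} + z'_R/\sqrt{2} \\ \alpha/\sqrt{2} + z'_D/\sqrt{2} \end{pmatrix}.$$
Note that $z'_R = (z_R + z_D)/\sqrt{2}$ and $z'_D = (z_R - z_D)/\sqrt{2}$ are again independent standard complex Gaussian random variables, so the noise term for $\mathbf{a}'$ has the same form as that for $\mathbf{a}$.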
If we place single-mode detectors at each output port of the beam splitter, there will be four possible outcomes with four corresponding probabilities: no detections ($P_0$), a single detection for mode $\mathbf{k}_R$ ($P_R$), a single detection for mode $\mathbf{k}_D$ ($P_D$), and coincident detections on both modes ($P_{RD}$). Writing $q = Q_1(\sqrt{2}|\alpha|, 2\gamma)$ for the probability that either individual detector fires, these probabilities are as follows:
$$P_0 = (1-q)^2, \qquad P_R = P_D = q(1-q), \qquad P_{RD} = q^2.$$
Note that $P_{RD} \ge P_R P_D$, since each detection event is independent of the other. A similar result is found in the semiclassical treatment of photon detection [37]. In the single-photon regime ($|\alpha| \ll 1$) one would expect particle-like behavior, so coincident detections should be quite rare. Quantum mechanically, the probability of a coincident detection for a true, single-photon state would be exactly zero.
Experimentally, one counts the number of single-detection events, $S_R = NP_R$ and $S_D = NP_D$, for transmitted and reflected light, respectively, as well as the number of coincidences, $C = NP_{RD}$, where $N$ is the nominal number of trials. The difficulty with such experiments is that $N$ is often unknown or perhaps unknowable. If $N$ is known, the ratio $R = CN/(S_RS_D)$, more commonly associated with the degree of second-order temporal coherence $g^{(2)}(0)$, would be expected to have a value no less than one, since
$$R = \frac{CN}{S_RS_D} = \frac{P_{RD}}{P_RP_D} \ge 1.$$
If $C = 0$, as predicted by quantum mechanics, and $S_R, S_D > 0$, then $R = 0$, thereby violating the inequality. Early experiments of this sort were performed by Grangier et al. using both a light-emitting diode (LED) [38] and a heralded photon source [39]. The LED light source was turned on briefly using a controlled electronic trigger, allowing $N$ to be known precisely.
Since the LED light was strongly attenuated, a value of R near unity, and consistent with the inequality R ≥ 1, was measured, as one might expect.
In the case of the heralded photon source, N was taken to be the number of trigger events, N t , each of which was taken to indicate the presence of a single, heralded photon. Under this assumption, the experimenters obtained a value of R t = CN t /(S R S D ) significantly less than one. A value less than one is generally considered to be evidence of photon antibunching. The true value of N , however, could not be known and may well have been much larger than N t , in which case a value below unity would not be surprising. A similar experiment, also using heralded events, was performed recently by Thorn et al. using a modern parametric downconversion source and avalanche photodiodes, with similar results [40].
For our model, the single-photon regime provides a good approximation to a true, single-photon state, so long as we ignore non-detection events. Taking $N$ to be $N_d = S_R + S_D + C$, the total number of detection events, we obtain a result similar to heralding. From this we may compute the ratio
$$R_d = \frac{CN_d}{S_RS_D} = \frac{P_{RD}\,(1 - P_0)}{P_RP_D}.$$
This may equivalently be seen as replacing the absolute probabilities $P_R$, $P_D$, $P_{RD}$ in the expression for $R$ with the conditional probabilities $p_R = P_R/(1-P_0)$, $p_D = P_D/(1-P_0)$, $p_{RD} = P_{RD}/(1-P_0)$. Such conditioning is similar to the experimental procedure of using heralding to define the number of trials. It is now easy to show that $R_d$ can be less than unity when either $|\alpha| \ll 1$ or $\gamma \gg 1$. As an example, figure 5 shows the values of $R$ and $R_d$ as a function of $|\alpha|$ for $\gamma = 1$. In this way, a purely classical model of light, when analyzed in a similar way, can exhibit the same anomalous quantum behavior.
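The effect of conditioning on detection events can be illustrated with a short calculation. The sketch assumes that each output port of the 50/50 beam splitter carries amplitude $\alpha/\sqrt{2}$ plus independent vacuum noise, so that the two detectors fire independently with a common probability $q$. It then evaluates $R = P_{RD}/(P_RP_D)$ and $R_d = R\,(1 - P_0)$. The function names are illustrative.

```python
import math


def detection_probability(alpha_abs, gamma, terms=200):
    """Pr[|a|^2 > gamma^2] via the noncentral chi-square (2 dof) series."""
    lam = 2.0 * alpha_abs ** 2
    x = 2.0 * gamma ** 2
    poisson, inner_term, inner_sum, total = math.exp(-lam), 1.0, 1.0, 0.0
    for j in range(terms):
        total += poisson * inner_sum
        poisson *= lam / (j + 1)
        inner_term *= x / (j + 1)
        inner_sum += inner_term
    return math.exp(-x) * total


def coincidence_ratios(alpha_abs, gamma):
    """Return (R, R_d) for independent detectors behind a 50/50 beam splitter."""
    q = detection_probability(alpha_abs / math.sqrt(2.0), gamma)
    p_none = (1.0 - q) ** 2      # no detection on either port
    p_single = q * (1.0 - q)     # exactly one given port fires
    p_both = q * q               # coincidence
    r = p_both / (p_single * p_single)
    r_d = r * (1.0 - p_none)     # number of trials replaced by number of detection events
    return r, r_d


if __name__ == "__main__":
    for alpha_abs in (0.1, 0.5, 1.0, 2.0):
        print(alpha_abs, coincidence_ratios(alpha_abs, 1.0))
```

With $\gamma = 1$ the unconditioned ratio $R$ never drops below one, whereas $R_d$ falls below one at small $|\alpha|$ and rises above one as the amplitude grows.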

Single-Photon, Four-Mode Entanglement
In quantum mechanics, a single photon can be entangled across multiple modes. Similar behavior can be modeled classically. Consider a coherent state prepared with polarization $\hat{e}_H$ traveling to the right and incident upon a 50/50 beam splitter (BS). The initial, four-mode state may be written
$$a = \alpha\,|R,H\rangle + \frac{1}{\sqrt{2}}\left( z_{RH}\,|R,H\rangle + z_{RV}\,|R,V\rangle + z_{DH}\,|D,H\rangle + z_{DV}\,|D,V\rangle \right),$$
where $z_{RH}, z_{RV}, z_{DH}, z_{DV}$ are independent and identically distributed (iid) standard complex Gaussian random variables arising from the zero-point field and corresponding to the two spatial modes ($k_R$ and $k_D$) and polarization modes ($\hat{e}_H$ and $\hat{e}_V$).
After the beam splitter, the state becomes

To perform a measurement of all four modes, each spatial mode is put into a dual-mode detector and threshold detection is performed. There are four modes and, hence, 16 possible outcomes, including multiple detections. For $|\alpha| \ll 1$, the most likely outcome is no detections at all, with single detections being the next most likely outcome. At the opposite extreme, for $|\alpha| \gg 1$, the most likely outcome is detection on all four modes. For small values of $|\alpha|$, a single detection on either $|R,H\rangle$ or $|D,V\rangle$ (both equally likely) is much more likely than a single detection on $|D,H\rangle$ or $|R,V\rangle$.
These probabilities are illustrated in figure 6 for $\alpha = 1$. We see that the two dominant modes peak in probability at threshold values somewhat greater than 1 and are much more probable than the other two modes. This comports with the general behavior one would expect of a single-photon state that is hyperentangled in spatial and polarization modes. If we consider only single-mode detection events (i.e., detections on one spatial mode and one polarization mode), then the conditional probability of each dominant mode converges to 0.5, the ideal quantum prediction, when $\gamma$ is large. Conversely, the conditional probability converges to a nonzero value, which is dependent on $\alpha$, when $\gamma$ is small. Qualitatively similar behavior is found when $|\alpha|$ is varied while holding $\gamma$ fixed. Thus, a correspondence with quantum mechanical predictions is achieved, but only in the limit of larger threshold values and only when one post-selects on single-mode detection events.

Wave/Particle Duality
In quantum mechanics, photons can exhibit both particle- and wave-like behavior. This, too, can be modeled classically. Consider an initial quantum state $|R,H\rangle$ that undergoes a transformation via a 50/50 beam splitter and a phase shifter in the $|D,H\rangle$ mode. Using a pair of mirrors, the two paths are recombined in a second beam splitter to form a Mach-Zehnder interferometer. The two output ports are then measured with detectors. Quantum mechanically, the final state (before measurement) is

where $R_\phi$ is the phase shift gate

Accordingly, the probability of finding a photon in the $|R,H\rangle$ mode is $\cos^2(\phi/2)$.
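The stated probability can be checked under one common choice of conventions. The matrices below are assumptions and are not taken from the text: the beam splitter is modeled as a Hadamard-type matrix $B$, and the phase gate acts only on $|D,H\rangle$, both written in the ordered basis $(|R,H\rangle, |D,H\rangle)$. Other phase conventions for the beam splitter can exchange the roles of the two output ports.

```latex
\begin{align}
B &= \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad
R_\phi = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\phi} \end{pmatrix}, \\
B R_\phi B \,|R,H\rangle
  &= \frac{1 + e^{i\phi}}{2}\,|R,H\rangle + \frac{1 - e^{i\phi}}{2}\,|D,H\rangle, \\
\left| \frac{1 + e^{i\phi}}{2} \right|^2
  &= \frac{1 + \cos\phi}{2} = \cos^2(\phi/2).
\end{align}
```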
We can model the problem classically by starting with an initial coherent state $a = \alpha\,|R,H\rangle + z/\sqrt{2}$ and transforming it via the same linear operations into

The conditional probability of a detection in mode $|R,H\rangle$, given a single detection in either mode $|R,H\rangle$ or $|D,H\rangle$, is now found to be

The resulting interference pattern, as shown in figure 7, is similar to what one would expect from classical light if one were observing intensities; here, however, the quantities shown are detection probabilities. The pattern also reflects the nonlocal nature of the interferometer: light travels along both arms and interferes only when recombined. In this way, our classical model exhibits the wave-like nature of light in terms of discrete detection events.
To recover the particle-like nature of the system, we may create a Wheeler delayed-choice experiment by removing the final beam splitter. The resulting state is now

The conditional probability of a detection in mode $|R,H\rangle$, given a single-photon detection in either mode $|R,H\rangle$ or $|D,H\rangle$, is now $1/2$, independent of $\phi$. In addition, the probability of a double detection in both modes may be made arbitrarily small by decreasing $|\alpha|$ or, equivalently, increasing $\gamma$. This is the behavior one would expect from a localized particle. Note that it does not matter when the choice to remove the final beam splitter is made.
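The interferometer with and without its final beam splitter can be sketched numerically. This is a minimal illustration under stated assumptions, not the authors' code: the beam splitters are taken to be Hadamard-type, so the coherent amplitudes at the two outputs are $\alpha(1 \pm e^{i\phi})/2$; each mode is assumed to cross threshold independently with probability $Q_1(2|\text{amplitude}|, 2\gamma)$; and the conditional probability is formed from the two exclusive single-detection outcomes. The names `marcum_q1`, `conditional_p_right`, and `interferometer` are hypothetical.

```python
import cmath
import math


def marcum_q1(a: float, b: float, terms: int = 200) -> float:
    """Marcum Q-function Q_1(a, b), summed as a Poisson mixture of gamma tails."""
    x, y = a * a / 2.0, b * b / 2.0
    weight = math.exp(-x)   # Poisson weight exp(-x) x^j / j!
    term = math.exp(-y)     # exp(-y) y^j / j!
    tail = term             # exp(-y) * sum_{i <= j} y^i / i!
    total = 0.0
    for j in range(terms):
        total += weight * tail
        weight *= x / (j + 1)
        term *= y / (j + 1)
        tail += term
    return total


def conditional_p_right(amp_r: complex, amp_d: complex, gamma: float) -> float:
    """P(detection on |R,H> only | exactly one of the two modes fires)."""
    q_r = marcum_q1(2.0 * abs(amp_r), 2.0 * gamma)
    q_d = marcum_q1(2.0 * abs(amp_d), 2.0 * gamma)
    only_r = q_r * (1.0 - q_d)
    only_d = q_d * (1.0 - q_r)
    return only_r / (only_r + only_d)


def interferometer(alpha: float, phi: float, gamma: float,
                   final_beam_splitter: bool = True) -> float:
    """Conditional probability at the |R,H> port for phase shift phi."""
    phase = cmath.exp(1j * phi)
    if final_beam_splitter:
        # Hadamard-type recombination (assumed convention).
        amp_r = alpha * (1 + phase) / 2
        amp_d = alpha * (1 - phase) / 2
    else:
        # Delayed-choice variant: the two arms go straight to the detectors.
        amp_r = alpha / math.sqrt(2.0)
        amp_d = alpha * phase / math.sqrt(2.0)
    return conditional_p_right(amp_r, amp_d, gamma)


if __name__ == "__main__":
    for k in range(5):
        phi = k * math.pi / 4
        with_bs = interferometer(0.5, phi, 1.0)
        without_bs = interferometer(0.5, phi, 1.0, final_beam_splitter=False)
        print(f"phi = {phi:5.3f}   with BS: {with_bs:5.3f}   without BS: {without_bs:5.3f}")
```

With the final beam splitter in place the conditional probability varies with $\phi$; without it, the two amplitudes have equal magnitude for every $\phi$, so the conditional probability stays at one half.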
A similar result is obtained if we simply provide "which way" information by marking one of the two arms with, say, a change in polarization. If we apply an X (NOT) gate on the lower arm before the final beam splitter, the resulting state will be

The interference pattern is once again lost (i.e., the detection probabilities are independent of $\phi$). Each of the four modal outcomes now occurs with equal probability, with the likelihood of multiple detections again vanishing as $|\alpha|$ is decreased or $\gamma$ is increased. Replacing the NOT gate with a unitary gate that is close to the identity results in a diminished but still discernible interference pattern, so the path information may also be measured weakly: with only partial which-way information, the interference pattern simply diminishes by degrees.

General Multimodal States
The transformation of coherent light via a sequence of linear optical components can be described, in general, by a $d \times d$ unitary matrix $U$. Without loss of generality, we may suppose that the initial state is of the form
$$a = \alpha\, e_1 + z/\sqrt{2},$$
where $e_1$ is the first standard unit vector, $z$ is a $d$-dimensional vector of iid standard complex Gaussian random variables, and $\alpha \in \mathbb{C}$. Following the transformation, the new state is
$$a' = U a = \alpha\,\psi + z'/\sqrt{2},$$
where $\psi = U e_1$ and $z' = Uz$ is, again, a $d$-dimensional vector of iid standard complex Gaussian random variables, owing to the unitarity of $U$.
Detection measurements on the $d$ modes will result in one of $2^d$ possible outcomes. Let $(n_1, \ldots, n_d) \in \{0,1\}^d$ denote the outcome in which mode 1 has $n_1$ detections, mode 2 has $n_2$, etc., and let $P(n_1, \ldots, n_d)$ denote the probability of this outcome occurring. Since the random variables $z'_1, \ldots, z'_d$ are statistically independent, this probability is given by
$$P(n_1, \ldots, n_d) = \prod_{i=1}^{d} q_i^{\,n_i} (1 - q_i)^{1 - n_i},$$
where $q_i = Q_1(2|\alpha\psi_i|, 2\gamma)$ is the probability of a threshold crossing event for mode $i$. (We assume, for simplicity, that all detectors have the same threshold.) We will be most particularly concerned with single-detection events (i.e., those for which $n_1 + \cdots + n_d = 1$), as these would be interpreted as single-photon detections. Although such events occur with vanishingly small probability as $|\alpha|$ becomes large (and low probability for $|\alpha|$ small), we may condition, via post-selection, on only such events and thereby obtain a nonvanishing probability. Specifically, let $p_i$ denote the probability that a single-detection event occurs on mode $i$, given that a single-detection event occurs on any one mode. It follows that
$$p_i = \frac{q_i/(1 - q_i)}{\sum_{k=1}^{d} q_k/(1 - q_k)}, \tag{77}$$
provided that $q_k \neq 1$ for all $k$. Note that, if there exists an $i$ such that $|\psi_i| > |\psi_j|$ for all $j \neq i$, then $p_i \to 1$ as $|\alpha| \to \infty$. In other words, for bright light only the most probable mode will have a single detection. For states with no unique maximum, the asymptotic probability is spread uniformly amongst the maxima. The latter case is consistent with the quantum mechanical predictions, while the former is not. The correct correspondence with quantum mechanics is therefore to be expected for small or intermediate values of $|\alpha|$.
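The outcome probabilities and the post-selected single-detection probabilities can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the Marcum Q-function is evaluated with a simple truncated series, and the example state `psi` is an arbitrary choice made here for demonstration. The conditional probability uses the fact that, for independent crossings, the chance of a lone detection on mode $i$ is proportional to the odds $q_i/(1-q_i)$.

```python
import math
from itertools import product


def marcum_q1(a: float, b: float, terms: int = 200) -> float:
    """Marcum Q-function Q_1(a, b), summed as a Poisson mixture of gamma tails."""
    x, y = a * a / 2.0, b * b / 2.0
    weight = math.exp(-x)   # Poisson weight exp(-x) x^j / j!
    term = math.exp(-y)     # exp(-y) y^j / j!
    tail = term             # exp(-y) * sum_{i <= j} y^i / i!
    total = 0.0
    for j in range(terms):
        total += weight * tail
        weight *= x / (j + 1)
        term *= y / (j + 1)
        tail += term
    return total


def crossing_probs(alpha: complex, psi: list, gamma: float) -> list:
    """q_i = Q_1(2 |alpha psi_i|, 2 gamma) for every mode."""
    return [marcum_q1(2.0 * abs(alpha * c), 2.0 * gamma) for c in psi]


def outcome_prob(q: list, outcome: tuple) -> float:
    """Probability of a detection pattern (n_1, ..., n_d) with n_i in {0, 1}."""
    p = 1.0
    for q_i, n_i in zip(q, outcome):
        p *= q_i if n_i else 1.0 - q_i
    return p


def all_outcome_probs(q: list) -> dict:
    """Probabilities of all 2^d detection patterns."""
    return {n: outcome_prob(q, n) for n in product((0, 1), repeat=len(q))}


def single_detection_probs(q: list) -> list:
    """p_i: lone detection on mode i, given exactly one mode fires (needs q_i < 1)."""
    odds = [q_i / (1.0 - q_i) for q_i in q]
    total = sum(odds)
    return [o / total for o in odds]


if __name__ == "__main__":
    # Example amplitudes (an arbitrary illustrative state, not from the paper).
    psi = [math.sqrt(0.5), math.sqrt(0.3), math.sqrt(0.2), 0.0]
    for alpha in (0.0, 0.5, 1.0, 2.0):
        p = single_detection_probs(crossing_probs(alpha, psi, gamma=1.0))
        print(f"|alpha| = {alpha:3.1f}  p = {[round(v, 3) for v in p]}")
    print("Born rule          ", [round(abs(c) ** 2, 3) for c in psi])
```

Printing the Born-rule probabilities alongside the post-selected ones makes it easy to see how the agreement depends on $|\alpha|$ and $\gamma$ for a given state.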
To examine the validity of our model, we performed linear quantum state tomography (QST) on a random sample of pure states formed by applying unitary matrices drawn from the Haar measure [41]. For a given transformed state $\psi$, QST was performed using a complete set of $d$-dimensional Hermitian basis matrices $B_1, \ldots, B_{d^2}$ that are orthonormal in the Hilbert-Schmidt inner product. Each $B_k$ is diagonalized by a unitary matrix $U_k$ such that $(U_k^\dagger B_k U_k)_{ij} = \beta_{ki}\,\delta_{ij}$. To measure in this basis, we therefore transformed $\psi$ to $\psi' = U_k^\dagger \psi$ and computed $p_i$ according to equation (77). The expectation value of $B_k$ was taken to be $p_1 \beta_{k1} + \cdots + p_d \beta_{kd}$, so the inferred density matrix for $\psi$ is defined to be
$$\rho = \sum_{k=1}^{d^2} \left( p_1 \beta_{k1} + \cdots + p_d \beta_{kd} \right) B_k,$$
where the $p_i$ are computed separately for each $k$. Since the basis matrices are such that $\mathrm{Tr}[B_j^\dagger B_k] = \delta_{jk}$, the trace $\mathrm{Tr}[\rho B_k]$ gives the expectation value of $B_k$, as defined above. This allows us to identify $\rho$ as playing the role of a quantum mechanical density operator.
With $\rho$ so computed, we examined the fidelity, defined by the vector inner product
$$F = \psi^\dagger \rho\, \psi,$$
as a function of $d$, $|\alpha|$, and $\gamma$ over an ensemble of pure states $\psi$. In figure 8 we have plotted $F$ versus $|\alpha|$ for $d = 4$ and $\gamma = 1$ for an ensemble of 30 randomly drawn pure states. For this case, tensor products of the Pauli matrices were used for the orthonormal basis. We observe that $F = 1/d$ (corresponding to $p_i = 1/d$) for $|\alpha| = 0$, as expected for pure vacuum noise. As $|\alpha|$ increases, $F$ increases monotonically to a value near unity. However, for sufficiently large values of $|\alpha|$ the inferred density matrix acquires negative eigenvalues and becomes invalid. For general quantum states, taking $|\alpha| \sim \gamma d/2$ tends to give near unity fidelity, albeit with invalid density matrices. For "classical" states (i.e., those for which $|\psi_i| = 1$ for exactly one index $i$) the density matrix remains valid for all $|\alpha|$ and the fidelity asymptotically approaches unity as $|\alpha| \to \infty$. Qualitatively similar behavior is found when $\gamma$ is varied while holding $\alpha$ fixed.
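The tomography procedure can be illustrated in the smallest case. The sketch below is a simplified stand-in, not the authors' code: it assumes $d = 2$ with the normalized Pauli matrices $\sigma_k/\sqrt{2}$ as the orthonormal basis (the paper's figure uses $d = 4$ with tensor products of Pauli matrices), stores each basis matrix by its eigenvalues and eigenvectors, and uses the post-selected single-detection probabilities in place of Born-rule probabilities. All function names are hypothetical.

```python
import math


def marcum_q1(a: float, b: float, terms: int = 200) -> float:
    """Marcum Q-function Q_1(a, b), summed as a Poisson mixture of gamma tails."""
    x, y = a * a / 2.0, b * b / 2.0
    weight = math.exp(-x)
    term = math.exp(-y)
    tail = term
    total = 0.0
    for j in range(terms):
        total += weight * tail
        weight *= x / (j + 1)
        term *= y / (j + 1)
        tail += term
    return total


S = 1.0 / math.sqrt(2.0)
# B_k = sigma_k / sqrt(2), stored as [(eigenvalue beta_ki, eigenvector u_i), ...].
BASES = [
    [(S, (1, 0)), (S, (0, 1))],                 # identity
    [(S, (S, S)), (-S, (S, -S))],               # Pauli X
    [(S, (S, 1j * S)), (-S, (S, -1j * S))],     # Pauli Y
    [(S, (1, 0)), (-S, (0, 1))],                # Pauli Z
]


def inner(u, v) -> complex:
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def single_detection_probs(amplitudes, alpha: float, gamma: float) -> list:
    """Post-selected single-detection probabilities for the given mode amplitudes."""
    q = [marcum_q1(2.0 * abs(alpha * c), 2.0 * gamma) for c in amplitudes]
    odds = [q_i / (1.0 - q_i) for q_i in q]
    total = sum(odds)
    return [o / total for o in odds]


def linear_tomography(psi, alpha: float, gamma: float) -> list:
    """Inferred 2x2 density matrix: sum over k of <B_k> B_k."""
    rho = [[0j, 0j], [0j, 0j]]
    for basis in BASES:
        amps = [inner(u, psi) for _, u in basis]          # psi' = U_k^dagger psi
        p = single_detection_probs(amps, alpha, gamma)
        expectation = sum(p_i * beta for p_i, (beta, _) in zip(p, basis))
        for beta, u in basis:                             # add <B_k> * B_k
            for r in range(2):
                for c in range(2):
                    rho[r][c] += expectation * beta * u[r] * u[c].conjugate()
    return rho


def fidelity(psi, rho) -> float:
    """F = psi^dagger rho psi."""
    return sum(psi[r].conjugate() * rho[r][c] * psi[c]
               for r in range(2) for c in range(2)).real


if __name__ == "__main__":
    psi = (0.6, 0.8j)   # arbitrary normalized example state
    for alpha in (0.0, 0.5, 1.0, 2.0):
        print(f"|alpha| = {alpha:3.1f}   F = {fidelity(psi, linear_tomography(psi, alpha, 1.0)):.4f}")
```

At $|\alpha| = 0$ every basis yields uniform probabilities, so the inferred matrix is the maximally mixed state and $F = 1/d$. Nothing in the linear inversion constrains $\rho$ to be positive semi-definite, so $F$ is not bounded by one in this sketch.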
Negative eigenvalues in density matrices obtained through linear state tomography are a common occurrence in experimental quantum optics, particularly for low-entropy, high-fidelity states. Their presence might be interpreted as an observed deviation from the Born rule, but they are more commonly ascribed to mere "experimental inaccuracies and statistical fluctuations" [42]. According to our model, such results are an inevitable consequence of the parameter regime investigated and the data analysis methods used to infer the quantum state. Since we compute the probabilities exactly to perform state tomography, we may also conclude that the potential for negative eigenvalues is an intrinsic property of the model and not one due to mere sampling errors.
To address the problem of invalid density matrices obtained from linear state tomography, maximum-likelihood estimation (MLE) methods are often used [43,44,45]. In this approach, one parameterizes a general, positive semi-definite density matrix and estimates the parameters of this matrix using an optimization scheme based on an assumption of Gaussian errors. By construction, this approach always yields a valid density matrix. We reexamined our results using the MLE-based state tomography tools provided by the Kwiat Quantum Information Group [46]. The results of five randomly sampled states with $d = 4$ are shown in figure 9. For $|\alpha|$ less than or close to unity, the MLE results agree with the previous linear tomography results. However, for larger values of $|\alpha|$, the curves peak near unity and then slowly decrease as we approach the classical regime of $|\alpha| \gg 1$. Qualitatively similar behavior is found when $\gamma$ is varied while holding $\alpha$ fixed. This shows that it is possible to infer density matrix estimates from our model that are both valid and of high fidelity.
The density matrix derived from QST may also be used to examine entanglement. According to the Peres-Horodecki positive partial transpose (PPT) criterion, if a density operator acting on a tensor product Hilbert space $H_A \otimes H_B$ is separable with respect to $H_A$ and $H_B$, then all the eigenvalues of its partial transpose are nonnegative [47]. In our case, $\otimes$ is the Kronecker product, $H_A = \mathbb{C}^{d_A}$, and $H_B = \mathbb{C}^{d_B}$, for $d_A, d_B \in \mathbb{N}$. If we write the density matrix $\rho$ as
$$\rho = \sum_{i,j,k,\ell} \rho_{ij,k\ell} \left( e^A_i e^{A\dagger}_j \right) \otimes \left( e^B_k e^{B\dagger}_\ell \right),$$
where $e^A_i, e^A_j$ and $e^B_k, e^B_\ell$ are the standard unit vectors in $\mathbb{C}^{d_A}$ and $\mathbb{C}^{d_B}$, respectively, then the partial transpose with respect to $H_B$ is
$$\rho^{T_B} = \sum_{i,j,k,\ell} \rho_{ij,k\ell} \left( e^A_i e^{A\dagger}_j \right) \otimes \left( e^B_\ell e^{B\dagger}_k \right).$$
A negative eigenvalue of the partial transpose is therefore a sufficient, though not in general necessary, condition for the density matrix to be nonseparable (i.e., entangled). For certain cases, such as $d_A = d_B = 2$, this condition is also necessary, so the minimum eigenvalue of the partial transpose may be used as an entanglement witness [48]. In figure 10 we have plotted the minimum eigenvalue of the partial transpose for $d_A = d_B = 2$ as a function of $|\alpha|$ for a maximally entangled Bell state using a detection threshold of $\gamma = 1$ and the aforementioned MLE method to infer the quantum state. It is perhaps surprising that, although our inferred density matrix does not have perfect fidelity, it is nevertheless entangled (i.e., nonseparable), as witnessed by the negative eigenvalues of the partial transpose for values of $|\alpha|$ above 0.6. The behavior for large $|\alpha|$ shows an asymptotic approach to $-0.5$, the value predicted by quantum mechanics for an ideal Bell state. Qualitatively similar behavior is found when $\gamma$ is varied while holding $\alpha$ fixed.
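The partial-transpose test can be sketched for an ideal Bell state, which is the limiting case the figure approaches. This is an illustrative sketch only: it operates on an exact density matrix rather than one inferred from the model, the function names are hypothetical, and the eigenvalue routine is a simple shifted power iteration that assumes a real symmetric matrix whose eigenvalues do not exceed the shift.

```python
import math


def partial_transpose_b(rho: list, d_a: int, d_b: int) -> list:
    """Transpose the H_B indices: element (i k, j l) moves to (i l, j k)."""
    n = d_a * d_b
    out = [[0.0] * n for _ in range(n)]
    for i in range(d_a):
        for j in range(d_a):
            for k in range(d_b):
                for l in range(d_b):
                    out[i * d_b + l][j * d_b + k] = rho[i * d_b + k][j * d_b + l]
    return out


def min_eigenvalue(m: list, shift: float = 1.0, iterations: int = 200) -> float:
    """Smallest eigenvalue of a real symmetric matrix via power iteration.

    Iterates on (shift * I - m), whose dominant eigenvector is the eigenvector
    of m with the smallest eigenvalue, provided all eigenvalues of m are at
    most `shift` and the start vector overlaps that eigenvector.
    """
    n = len(m)
    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        w = [shift * v[r] - sum(m[r][c] * v[c] for c in range(n)) for r in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(m[r][c] * v[c] for c in range(n)) for r in range(n)]
    return sum(v[r] * mv[r] for r in range(n))   # Rayleigh quotient, |v| = 1


def projector(state: list) -> list:
    """Density matrix |state><state| for a real state vector."""
    return [[a * b for b in state] for a in state]


if __name__ == "__main__":
    s = 1.0 / math.sqrt(2.0)
    bell = projector([s, 0.0, 0.0, s])           # (|00> + |11>) / sqrt(2)
    separable = projector([1.0, 0.0, 0.0, 0.0])  # |00>
    print("Bell state:     ", min_eigenvalue(partial_transpose_b(bell, 2, 2)))
    print("Separable state:", min_eigenvalue(partial_transpose_b(separable, 2, 2)))
```

For complex density matrices, or for production use, a Hermitian eigensolver from a numerical library would be the more robust choice.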

Conclusion
Assuming a classical zero-point field and deterministic threshold detectors, we have shown that one is able to reproduce many of the experimentally observed phenomena attributed to single photons and thought to be uniquely quantum in nature. In so doing, we have established that such phenomena do, in fact, have classical analogues. Weak coherent light in combination with a reified zero-point field, considered in the single-mode regime, is found to give probabilistic outcomes that are in close agreement with the Born rule for single-photon, multi-mode states when post-selection on single-detection events is performed. This agreement was verified explicitly by performing quantum state tomography and computing the fidelity of the resulting density matrix. The model results are not always in perfect agreement with the idealized quantum mechanical predictions, but they are consistent with experimental observations and data analysis methods in the appropriate parameter regimes. This model therefore provides a local, realistic picture of wave/particle duality and single-photon entanglement that is grounded in a physical and wholly classical model. A similar classical description of homodyne measurements, temporal behavior, and multi-photon entanglement is left for future work.