Linear optics and photodetection achieve near-optimal unambiguous coherent state discrimination

Coherent states of the quantum electromagnetic field, the quantum description of ideal laser light, are prime candidates as information carriers for optical communications. A large body of literature exists on their quantum-limited estimation and discrimination. However, very little is known about the practical realizations of receivers for unambiguous state discrimination (USD) of coherent states. Here we fill this gap and outline a theory of USD with receivers that are allowed to employ: passive multimode linear optics, phase-space displacements, auxiliary vacuum modes, and on-off photon detection. Our results indicate that, in some regimes, these currently-available optical components are typically sufficient to achieve near-optimal unambiguous discrimination of multiple, multimode coherent states.


Introduction
Quantum mechanics places fundamental limits on the distinguishability of non-orthogonal quantum states. This fact underpins applications in quantum information science and technology, most notably in quantum cryptography [1], and constrains the performance of quantum sensors [2,3], quantum communications [4,5], and of probabilistic algorithms in computation [6]. However, quantum mechanics also provides the tools to identify these limits and approach them through the design of practical receivers [7][8][9]. Within the framework of optical quantum information science and technology, coherent states of the electromagnetic field play a prominent role as information carriers in terrestrial and space-based quantum networks [5,10]. This is essentially because these states are relatively easy to prepare and control experimentally, and are robust to loss. They are in fact the only pure states that have a classical limit, yet they exhibit fundamental quantum properties when sufficiently attenuated [11,12]. This paper focuses on the quantum-limited detection of coherent states. We present a few explicit and practical schemes for unambiguous state discrimination (USD) of coherent states. In particular, instead of considering global bounds, our goal is to design practical receivers that can be realized with linear optics and on-off photodetection. For brevity, we refer to them as linear receivers. In some cases, we show that linear receivers nearly saturate the global bounds, which can be computed using the general theory of USD [13][14][15][16][17][18][19][20][21][22].
Jasminder S. Sidhu: jsmdrsidhu@gmail.com. Cosmo Lupo: cosmo.lupo@poliba.it.
A well-designed quantum receiver combines practicality with high performance, where the latter is quantified through a suitable task-dependent figure of merit. In ambiguous state discrimination (ASD), the receiver is designed to always provide an output, with the goal of minimizing the average error probability. The latter is ultimately limited by the Yuen-Kennedy-Lax (YKL) conditions [23] (which reduce to the Helstrom bound for two states [24]). Alternative figures of merit are preferred for quantum key distribution [25][26][27][28] and quantum digital signatures [29], where an exact state identification improves the transfer of secure information that cannot be forged or repudiated. This framework is known as USD, where the receiver either identifies the state without error, or outputs an inconclusive result, and the goal is to minimize the average probability of such an inconclusive result [13,14]. The ultimate bounds to USD can be computed using the theory of Peres and Terno [15] or that of Sun et al. [21] and Bergou et al. [22], which in a sense provide the analogue of the YKL conditions for USD. These bounds can be computed efficiently through semidefinite programming [16], and sometimes allow for an exact analytical solution [21,22]. However, experimentally attaining the bounds, in particular for the discrimination of coherent states, may require strong non-linearities that may be inaccessible even with state-of-the-art or near-term optical technologies.
There is a wide body of literature devoted to establishing the global bound for USD for different families of quantum states [17][18][19][20][21][22]. However, despite these advances, very little is known about practical USD receivers for coherent states, and whether they can achieve the global bounds [27,30,31,32,33], with most of the available works focusing on phase-shift keying. For classical communications using coherent-state modulation, a joint quantum measurement acting on the received coherent-state code word that performs USD achieves the optimal communication capacity allowed by quantum mechanics [34], known as the Holevo capacity [35]. Furthermore, when acting on a finite-length inner code comprising tensor products of coherent states, the USD measurement can even attain a higher channel capacity (the Shannon capacity of the super channel induced by the inner code and the receiver) than the optimal ASD measurement that minimizes the average probability of error in choosing between the modulated-received inner code words [36][37][38]. For phase-shift encoded signals, the optimal USD measurement in the limit of small photon numbers consists of mode-wise displacement operations followed by photon-number-resolving detectors [27]. For the so-called binary phase-shift-keying (BPSK) alphabet, this scheme is sufficient to attain the optimal USD performance without adaptive pre-detection displacements [39]. Post-selecting the measurement result can further reduce the error rate for a fixed probability of inconclusive results [40]. For single-mode coherent state constellations with more than two states, there is a substantial gap between the optimal USD performance and that of non-adaptive receivers [27,30]. In an effort to close this gap, an adaptive receiver for quadrature phase-shift-keying (QPSK) has demonstrated an improved probability of correct state identification [31].
Here we outline a general theory of USD of coherent states with a receiver that leverages only limited resources, such as multi-mode linear passive optics, phase-space displacement operations, auxiliary vacuum modes, and mode-wise on-off photon detection. We apply our theory to a number of examples of different coherent-state modulations, using as a benchmark the global bounds on USD. The latter are computed explicitly using the theory of Refs. [21,22]. We demonstrate that in some regimes this practical, yet restricted, set of physical operations is typically sufficient to deliver near-optimal performance. This work establishes a theoretical framework to understand and master the design of receivers to enable near-optimal USD of coherent states.

Summary of results
Previous works in USD have primarily focused on developing efficient computational methods to determine the global bounds [15,16,22], providing necessary and sufficient conditions that define optimal measurement schemes [17,18,21], or considering specific examples of quantum states to construct measurement operators [19,20,32,41,42].
Here we focus on the design of practical receivers. We introduce a family of receivers based on multimode linear optics and on-off photodetection. A particular strength of these receivers is that they do not require complex adaptive strategies or non-linear optics, which may be challenging to implement. Our work readily addresses the optimal design to minimize the average probability of an inconclusive event. Further, for multimode coherent states, we provide intuitive insights into how the performance of linear receivers depends on the number of modes and on the number of photons detected. This intuition is useful to provide an understanding of how to experimentally realize improvements when using just linear optics.
We test our receiver design on randomly generated multimode coherent states, showing that linear receivers provide near-optimal performance for a range of coherent states. Non-typical, highly-degenerate constellations of coherent states may necessitate coincidence measurements to achieve USD. We also address an alternative figure of merit, the communication capacity, showing that linear receivers achieve near-optimal performance both in the asymptotic regime and for finite block length. Table 1 provides a high-level summary of the different receiver designs that we introduce in this work, along with their optimal performance bounds, conditions required to saturate these bounds, and how the designs can be adapted to handle general codes. Our class 1 receivers are comprised of vacuum auxiliaries, LOP transformations, and on-off photon detection and are capable of discriminating any full-rank codebook. Our class 2 receivers improve the performance beyond class 1 by using additional mode-wise displacement operations and also extend their applications to codebooks with single degeneracy. Finally, class 3 receivers use the same resources as class 2 but measure detection events across multiple modes. Our results significantly advance the field by clarifying the receiver designs that are both near-optimal for many codes and immediately accessible with current technologies.

Outline
We begin in Section 2 by introducing our notation and the basic theoretical tools to describe our receivers based on linear optics and photodetection. We provide qualitative bounds that introduce key insights into the performance of different detection schemes in Section 3. Then, in Section 4 we establish a theory to formalize in a quantitative way the problem of USD with limited resources. In Section 5, we discuss applications of linear receivers to different constellations of coherent states, including the pulse-position modulation codes, random codes, and non-typical degenerate codes with single and double degeneracy. These applications demonstrate the strengths of linear receivers when benchmarked against the global bound for USD. Finally, conclusions and open questions are provided in Section 6.

Table 1 (caption): [...] 1, c]. Each class implements on-off photodetection and differs in either the detection strategy or resources used. Note that class 3 receivers can readily be generalized to code words with arbitrary degeneracy by implementing multimode detection events (greater than two). The performance for each class of our linear optical receivers is defined through an optimization problem. We developed a numerical optimizer to address each optimization and derive the optimal solutions. Intuition into the performance of each receiver is provided in Section 3. Note the construction of each receiver class uses linear optics that can be readily implemented, which is in contrast to the globally optimal scenario summarised in the final row.

Linear optics in phase space
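Before delving into the details of the USD receivers, we first need to introduce our notation and a few basic elements from the toolbox of linear optics [43]. Consider a collection of $m$ bosonic modes with annihilation and creation operators $\{a_j, a_j^\dagger\}$, for $j = 1, \dots, m$, satisfying the canonical commutation relations $[a_j, a_{j'}^\dagger] = \delta_{jj'}$. These modes may represent a number of physical degrees of freedom, e.g., polarization, transverse wave vector, time of arrival, orbital angular momentum, as long as they are all degenerate in frequency. A coherent state on mode $j$ is denoted as $|\alpha\rangle_j$ and is characterized by its complex amplitude $\alpha$. We have
$$|\alpha\rangle_j = e^{-|\alpha|^2/2} \sum_{k=0}^{\infty} \frac{\alpha^k}{\sqrt{k!}}\, |k\rangle_j , \tag{1}$$
where
$$|k\rangle_j = \frac{(a_j^\dagger)^k}{\sqrt{k!}}\, |0\rangle \tag{2}$$
is the Fock state with $k$ photons on mode $j$, and $|0\rangle$ is the vacuum state. A multimode coherent state is the direct product of $m$ coherent states:
$$|\boldsymbol{\alpha}\rangle = \bigotimes_{j=1}^{m} |\alpha_j\rangle_j . \tag{3}$$
Such a state is uniquely identified by its amplitude vector $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \dots, \alpha_m)$. As $|\alpha_j|^2$ is the mean photon number in mode $j$,
$$n = \sum_{j=1}^{m} |\alpha_j|^2 \tag{4}$$
is the total mean photon number in the state. The fundamental mathematical structure in quantum mechanics is the scalar product defined in the Hilbert space. The scalar product between two multimode coherent states $|\boldsymbol{\alpha}\rangle$ and $|\boldsymbol{\beta}\rangle$ reads
$$\langle \boldsymbol{\alpha} | \boldsymbol{\beta} \rangle = \exp\!\left( -\tfrac{1}{2}|\boldsymbol{\alpha}|^2 - \tfrac{1}{2}|\boldsymbol{\beta}|^2 + \boldsymbol{\alpha}^* \cdot \boldsymbol{\beta} \right) , \tag{5}$$
where $|\boldsymbol{\alpha}|^2 = \sum_j |\alpha_j|^2$ and
$$\boldsymbol{\alpha}^* \cdot \boldsymbol{\beta} = \sum_{j=1}^{m} \alpha_j^* \beta_j \tag{6}$$
is the scalar product between the amplitude vectors. The Hilbert-space scalar product is invariant under the action of general unitary transformations in the Hilbert space and is the central mathematical structure underlying the global bounds on USD [15]. Note that this global bound is known to be achieved by some measurements. However, for generic states, we do not expect any particular form for such an optimal measurement.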
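As a minimal numerical sketch (not part of the original analysis), the overlap between two multimode coherent states can be evaluated directly from their amplitude vectors using $\langle\boldsymbol{\alpha}|\boldsymbol{\beta}\rangle = \exp[-(|\boldsymbol{\alpha}|^2 + |\boldsymbol{\beta}|^2)/2 + \sum_j \alpha_j^*\beta_j]$. The function name `coherent_overlap` and the sample amplitudes are illustrative choices of ours. The script also checks the standard identity $|\langle\boldsymbol{\alpha}|\boldsymbol{\beta}\rangle|^2 = e^{-|\boldsymbol{\alpha}-\boldsymbol{\beta}|^2}$.

```python
import cmath
import math

def coherent_overlap(alpha, beta):
    """Hilbert-space scalar product <alpha|beta> of two multimode coherent states."""
    n_alpha = sum(abs(a) ** 2 for a in alpha)  # total mean photon number of |alpha>
    n_beta = sum(abs(b) ** 2 for b in beta)    # total mean photon number of |beta>
    # Scalar product between amplitude vectors (conjugate on the first argument).
    inner = sum(complex(a).conjugate() * b for a, b in zip(alpha, beta))
    return cmath.exp(-0.5 * (n_alpha + n_beta) + inner)

# Hypothetical two-mode example amplitudes.
alpha = [0.5, 0.2j]
beta = [-0.5, 0.1]

fidelity = abs(coherent_overlap(alpha, beta)) ** 2
distance_sq = sum(abs(a - b) ** 2 for a, b in zip(alpha, beta))
print(fidelity, math.exp(-distance_sq))  # the two numbers should agree
```

The overlap never vanishes for finite amplitudes, which is why coherent states can never be discriminated perfectly.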
Since our focus here is on achieving USD using a particular subset of measurements, which includes linear optics and photodetection, the Hilbert-space scalar product may not be the most useful mathematical tool. Therefore, we consider an alternative notion of scalar product that seems more naturally suited to describe quantum mechanics under a restricted set of allowed measurements. To identify this alternative scalar product, we need to consider in more detail the set of unitary linear optics operations, which preserve the total mean photon number. These operations identify the group of Linear Optical Passive (LOP) unitary transformations [44,45].
LOP unitaries map coherent states into coherent states. The components of the amplitude vector transform as follows,
$$\alpha_j \to \sum_{k=1}^{m} U_{jk}\, \alpha_k , \tag{7}$$
where $[U_{jk}]$ is an $m \times m$ unitary matrix. Multimode LOP transformations can be implemented physically by combining linear optics elements such as beam splitters and phase shifters, and are mathematically described as unitary matrices. Given a unitary matrix description of a LOP transformation, there are known, efficient procedures to realize it as a network of beam splitters and phase shifters [46][47][48].
Note that the scalar product between amplitude vectors in Eq. (6) is invariant under the action of LOP unitaries. Therefore, given a pair of $m$-mode coherent states $|\boldsymbol{\alpha}\rangle$, $|\boldsymbol{\beta}\rangle$, with amplitude vectors $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \dots, \alpha_m)$ and $\boldsymbol{\beta} = (\beta_1, \beta_2, \dots, \beta_m)$, we define the phase-space scalar product as
$$(\boldsymbol{\alpha}, \boldsymbol{\beta}) = \sum_{j=1}^{m} \alpha_j^* \beta_j . \tag{8}$$
The phase-space scalar product will play in our analysis a role similar to that played by the Hilbert-space scalar product in the theory of USD developed by Peres and Terno [15], and it will guide us in designing our linear receivers for USD of coherent states.
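The invariance of the phase-space scalar product under LOP unitaries can be checked numerically. The sketch below assumes, for illustration only, a two-mode beam splitter parameterized by a mixing angle `theta` and a phase `phi`; the function names and sample amplitudes are ours.

```python
import cmath
import math

def beam_splitter(theta, phi):
    """2x2 unitary of a beam splitter with mixing angle theta and phase phi."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -cmath.exp(-1j * phi) * s],
            [cmath.exp(1j * phi) * s, c]]

def apply_lop(U, alpha):
    """Transform an amplitude vector: alpha_j -> sum_k U[j][k] * alpha[k]."""
    return [sum(U[j][k] * alpha[k] for k in range(len(alpha)))
            for j in range(len(U))]

def phase_space_product(alpha, beta):
    """Phase-space scalar product (alpha, beta) = sum_j conj(alpha_j) * beta_j."""
    return sum(complex(a).conjugate() * b for a, b in zip(alpha, beta))

U = beam_splitter(0.7, 1.3)
alpha = [0.4 + 0.1j, -0.3j]
beta = [0.2, 0.5 - 0.2j]

before = phase_space_product(alpha, beta)
after = phase_space_product(apply_lop(U, alpha), apply_lop(U, beta))
print(abs(before - after))  # zero up to rounding error
```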
In the rest of the paper, we will consider examples of codes comprising $c \geq 2$ $m$-mode coherent states, identified by $c$ amplitude vectors $\boldsymbol{\alpha}^1, \boldsymbol{\alpha}^2, \dots, \boldsymbol{\alpha}^c$. We can arrange these vectors into a rectangular matrix with $c$ rows and $m$ columns,
$$R = \begin{pmatrix} \alpha^1_1 & \alpha^1_2 & \cdots & \alpha^1_m \\ \alpha^2_1 & \alpha^2_2 & \cdots & \alpha^2_m \\ \vdots & \vdots & \ddots & \vdots \\ \alpha^c_1 & \alpha^c_2 & \cdots & \alpha^c_m \end{pmatrix} . \tag{9}$$
It is well known that pure states can be unambiguously discriminated if and only if they are linearly independent [17]. In our setting, where we only allow for limited resources, this condition needs to be modified. As a matter of fact, we need to look at the notion of linear independence in phase space, not in the Hilbert space. This alternative notion of linear independence, which we simply call phase-space linear independence, is strictly related to the phase-space scalar product introduced above. Therefore, we will say that the given coherent states are linearly independent in phase space if their associated amplitude vectors are linearly independent, i.e., if the matrix $R$ in Eq. (9) has maximum rank. Note that this is possible only if $c \leq m$. Below we describe a scheme for USD of coherent states that works when the associated $R$ matrix is full rank. In general, our receivers will also exploit a number of auxiliary modes. We will then introduce $m'$ additional optical modes (as we see below, it is sufficient to use $m' \leq m$ auxiliary modes), characterized by the canonical operators $\{a_{m+j}, a_{m+j}^\dagger\}$, for $j = 1, \dots, m'$. All $m + m'$ modes are mixed by a LOP unitary, which is represented by a unitary matrix $U$ of size $m + m'$. If all modes are initially prepared in a coherent state, with amplitude vector $\boldsymbol{\alpha} = (\alpha_1, \dots, \alpha_m, \alpha_{m+1}, \dots, \alpha_{m+m'})$, then the output is also a coherent state, with amplitude vector $\boldsymbol{\beta} = (\beta_1, \dots, \beta_m, \beta_{m+1}, \dots, \beta_{m+m'})$, where the $i$th output amplitude reads
$$\beta_i = \sum_{j=1}^{m} M_{ij}\, \alpha_j + \sum_{j=m+1}^{m+m'} N_{ij}\, \alpha_j , \tag{10}$$
where $M$ and $N$ are submatrices of $U$, with $M_{ij} = U_{ij}$ for $j = 1, \dots, m$, and $N_{ij} = U_{ij}$ for $j = m+1, \dots, m+m'$.
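For our applications, we are often interested in the case where the $m'$ ancillary modes are prepared in the vacuum state, i.e., $\alpha_{m+j} = 0$. In this setting, we simply have
$$\beta_i = \sum_{j=1}^{m} M_{ij}\, \alpha_j . \tag{11}$$
When the output coherent state is measured by mode-wise photodetection, for any $i$ we will have a certain probability of detecting a photon in the $i$th mode. This probability is readily computed as
$$P(i) = 1 - e^{-|\beta_i|^2} = 1 - e^{-|\boldsymbol{M}_i \cdot \boldsymbol{\alpha}|^2} , \tag{12}$$
where
$$\boldsymbol{M}_i = (M_{i1}, M_{i2}, \dots, M_{im}) \tag{13}$$
and
$$\boldsymbol{M}_i \cdot \boldsymbol{\alpha} = \sum_{j=1}^{m} M_{ij}\, \alpha_j . \tag{14}$$
The photon detection probability in Eq. (12) is the key quantity to characterize our linear receivers, as we discuss in Section 4. Finally, we recall that the displacement operator, denoted as $D(\gamma)$, maps a coherent state into a coherent state with a displaced amplitude (up to an irrelevant global phase):
$$D(\gamma)\, |\alpha\rangle = |\alpha + \gamma\rangle . \tag{15}$$
In the $m$-mode case, the displacement operator is identified by a complex displacement vector, $\boldsymbol{\gamma} = (\gamma_1, \gamma_2, \dots, \gamma_m)$, such that
$$D(\boldsymbol{\gamma})\, |\boldsymbol{\alpha}\rangle = |\boldsymbol{\alpha} + \boldsymbol{\gamma}\rangle . \tag{16}$$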
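To make the detection probability concrete, the following sketch evaluates the click probability $1 - e^{-|\boldsymbol{M}_i\cdot\boldsymbol{\alpha}|^2}$ on each output mode for vacuum ancillas. The example assumes, purely for illustration, a single signal mode mixed with one vacuum ancilla on a balanced beam splitter, so that $M$ is the first column of a $2\times 2$ unitary; the helper name `click_probabilities` is ours.

```python
import math

def click_probabilities(M, alpha):
    """On-off detection probabilities for each output mode.

    M is the list of rows M_i (signal-mode columns of the LOP unitary);
    alpha is the input amplitude vector. Ancillas are assumed to be vacuum.
    """
    probs = []
    for row in M:
        beta_i = sum(m_ij * a_j for m_ij, a_j in zip(row, alpha))  # output amplitude
        probs.append(1.0 - math.exp(-abs(beta_i) ** 2))
    return probs

r = 1.0 / math.sqrt(2.0)
M = [[r], [r]]           # first column of the balanced beam splitter [[r, -r], [r, r]]
probs = click_probabilities(M, [1.0])
print(probs)             # both modes click with probability 1 - exp(-1/2)
```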

Qualitative analysis of linear receivers for USD of coherent states
This work addresses the following general question: Given a collection of $c$ $m$-mode coherent states $|\boldsymbol{\alpha}^j\rangle$, with amplitude vectors $\boldsymbol{\alpha}^j = (\alpha^j_1, \alpha^j_2, \dots, \alpha^j_m)$ and prior probabilities $p_j$, is it possible to discriminate them unambiguously using linear optics, $m'$ auxiliary vacuum modes, and mode-wise photodetection? In this Section, we provide a qualitative answer to this question. These qualitative results will give some insight into the numerical analysis discussed later in the paper.
Consider a set of $m$-mode coherent states with mean photon number $n$. If we use $m'$ ancillary vacuum modes, then the mean photon number per mode is $\nu = n/(m + m')$. These states are processed through a LOP unitary, which preserves the photon number, and then measured by mode-wise photodetection. On average, the probability of a single photodetection event in a given mode is expected to be about
$$P_1 \simeq 1 - e^{-f\nu} \simeq f\nu , \tag{17}$$
where the factor $f$ depends on the details of the input states and of the LOP transformation applied. For weak signals, it is unlikely to observe more than one photodetection event. Therefore, we cannot discriminate more than $m + m'$ states, i.e., no more than the number of signal plus auxiliary modes, and the receiver yields an inconclusive result every time no photon is detected. This happens with probability
$$P_0 \simeq (1 - P_1)^{m+m'} = e^{-fn} \simeq 1 - fn . \tag{18}$$
Therefore, we expect the probability of an inconclusive event to decrease linearly with $n$ for $n \ll 1$. This probability can be further decreased if instead of vacuum auxiliary modes we use auxiliary coherent states. In fact, this would increase the mean photon number from $n$ to $n' > n$. This strategy is equivalent to introducing phase-space displacement in the pool of allowed resources [49]. In Section 5.3, see Fig. 4(a) in particular, we will show that this qualitative behaviour is confirmed quantitatively, both with and without phase-space displacement.
For brighter coherent states, or when the first order term in $n$ vanishes, we need to consider joint detection events on pairs of modes. In this case, the number of distinct outcomes is
$$\binom{m+m'}{2} = \frac{(m+m')(m+m'-1)}{2} ,$$
which gives the maximum number of states that can be unambiguously discriminated by observing coincidence detection on two modes. From Eq. (17), and recalling that a multimode coherent state is a product state, see Eq. (3), we obtain the probability of a coincidence
$$P_2 \simeq P_1^2 \simeq f^2 \nu^2 .$$
By ignoring joint detection events in more than two modes, the probability of obtaining an inconclusive result is
$$P_0 \simeq 1 - \binom{m+m'}{2} P_2 .$$
Recalling that $\nu = n/(m + m')$, we obtain
$$P_0 \simeq 1 - \frac{m+m'-1}{2(m+m')}\, f^2 n^2 .$$
In this case, the probability of the inconclusive event decreases quadratically with $n$ and, for $m + m'$ large enough, is essentially independent of the number of modes and ancillas. As for the previous case, using coherent-state ancillas instead of vacuum ones, we may further decrease $P_0$, as this is equivalent to replacing $n$ with $n' > n$. This qualitative result will be confirmed by the quantitative analysis of Section 5.6, see Fig. 7(b) in particular, where we will discuss examples of degenerate codes where the linear term in Eq. (18) vanishes.
In principle, a similar qualitative analysis can be applied to USD schemes based on an arbitrary number of joint detection events. We naturally expect that events of joint detection of $n_d$ photons will contribute to $P_0$ with a term proportional to $n^{n_d}$. However, in this work, we only discuss receivers based on single and double-detection events. In the following Section, we will focus on photodetection events on at most one mode, and provide quantitative analysis and explicit USD schemes. Later in Section 5.6 we will consider an example of USD based on joint detection on two modes.
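The scaling argument can be illustrated with a small numerical sketch. We assume, for illustration only, that the output is a product coherent state whose mean photon number $n$ is spread evenly over $K = m + m'$ modes (corresponding to $f = 1$ in the notation above), so each mode clicks independently with probability $1 - e^{-n/K}$. The function name `click_statistics` is ours.

```python
import math

def click_statistics(n, modes):
    """Probabilities of zero, exactly one, and two or more clicking modes.

    Assumes a product coherent state with n/modes mean photons in each mode
    and ideal on-off detectors.
    """
    p = 1.0 - math.exp(-n / modes)              # click probability of one mode
    none = (1.0 - p) ** modes                   # no detector clicks
    one = modes * p * (1.0 - p) ** (modes - 1)  # exactly one detector clicks
    multi = 1.0 - none - one                    # coincidences on two or more modes
    return none, one, multi

for n in (0.01, 0.1):
    none, one, multi = click_statistics(n, modes=4)
    print(n, none, one, multi)
```

For weak signals the single-click probability grows linearly with $n$, whereas coincidences appear only at order $n^2$, in line with the qualitative discussion.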

Explicit schemes for USD of coherent states
Below we present three schemes for linear receivers. The first two, in Sections 4.1 and 4.2, exploit a single detection event to achieve USD of coherent states. The third scheme in Section 4.3 exploits double detection events. These schemes are not necessarily optimal for the given resources; however, we show that they are optimal or near-optimal in a number of instances.

Single detection events
We now introduce an explicit scheme for USD of multimode coherent states, based on single photodetection events. Consider a set of $c$ coherent states over $m$ modes, identified by the amplitude vectors $\boldsymbol{\alpha}^1, \boldsymbol{\alpha}^2, \dots, \boldsymbol{\alpha}^c$. In order to discriminate these states, we first introduce $m'$ auxiliary vacuum modes, then mix them on a LOP transformation identified by the $(m + m') \times (m + m')$ unitary matrix $U$, and finally apply mode-wise on-off photodetection. This setup is shown schematically in Fig. 1. As recalled in Section 2, the probability that, given the state $|\boldsymbol{\alpha}^j\rangle$ in input, a photon is detected on mode $i$ in output, is
$$P(i|j) = 1 - e^{-|\boldsymbol{M}_i \cdot \boldsymbol{\alpha}^j|^2} ,$$
where the vectors $\boldsymbol{M}_i$ are determined by the elements of the matrix $U$ as discussed in Section 2.
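If we want to achieve USD using the information obtained from a single photodetection event in one of the output modes, then we need to impose that the input states are in one-to-one correspondence with the output modes. Without loss of generality, this means that for the $j$th state in input all the output modes are in the vacuum except the $j$th mode. That is, the matrix $U$ needs to be chosen in such a way that
$$\sum_{k=1}^{m} U_{ik}\, \alpha^j_k = 0 \quad \text{for } i \neq j .$$
Expressed in terms of the vectors $\boldsymbol{M}_i$, this condition reads
$$\begin{aligned} \boldsymbol{M}_1 \cdot \boldsymbol{\alpha}^j &= 0 \quad \text{for } j \neq 1 , \\ \boldsymbol{M}_2 \cdot \boldsymbol{\alpha}^j &= 0 \quad \text{for } j \neq 2 , \\ &\;\;\vdots \\ \boldsymbol{M}_c \cdot \boldsymbol{\alpha}^j &= 0 \quad \text{for } j \neq c . \end{aligned} \tag{28}$$
Note that the LOP unitaries can be efficiently implemented through a network of beam splitters and phase shifters [46][47][48].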
We note that a non-zero matrix $M$ that solves these equations exists if and only if the amplitude vectors are linearly independent, i.e., the matrix $R$ in Eq. (9) is full rank.
The conditions in Eq. (28) ensure that the detection of a photon in output mode $j$ unambiguously identifies the coherent state $|\boldsymbol{\alpha}^j\rangle$ in input. On the other hand, no conclusion can be drawn in the event that no photon is detected in any of the output modes. This is the inconclusive event. If the input coherent state $|\boldsymbol{\alpha}^j\rangle$ has probability $p_j$, then the average probability of the inconclusive event is
$$P_0 = \sum_{j=1}^{c} p_j\, e^{-|\boldsymbol{M}_j \cdot \boldsymbol{\alpha}^j|^2} .$$
Our goal is to design an explicit receiver that minimizes this probability. This corresponds to performing an optimization on the $c \times m$ matrix $M$, keeping in mind that the latter is by construction a submatrix of a larger unitary matrix. This latter condition is expressed by the matrix inequality
$$M^\dagger M \leq I ,$$
where $I$ is the identity matrix (see Appendix C for a derivation of this condition). In solving the constrained optimization, we parameterize the elements of the matrix $M$ as
$$M_{ij} = k_i\, v_{ij} ,$$
where $k_i \geq 0$ are $c$ non-zero coefficients, and $v_{ij}$ are the components of $c$ unit vectors
$$\boldsymbol{v}_i = (v_{i1}, v_{i2}, \dots, v_{im}) ,$$
for $i = 1, \dots, c$. The unit vectors $\boldsymbol{v}_i$ are proportional to the vectors $\boldsymbol{M}_i$, therefore condition (28) becomes
$$\boldsymbol{v}_i \cdot \boldsymbol{\alpha}^j = 0 \quad \text{for all } i \neq j .$$
In conclusion, using this parameterization, the optimal linear receiver is determined by solving the constrained optimization:
$$\min_{\{k_i, \boldsymbol{v}_i\}} \; \sum_{j=1}^{c} p_j\, e^{-k_j^2 |\boldsymbol{v}_j \cdot \boldsymbol{\alpha}^j|^2} \quad \text{subject to} \quad \boldsymbol{v}_i \cdot \boldsymbol{\alpha}^j = 0 \;\; \forall\, i \neq j , \quad M^\dagger M \leq I . \tag{34}$$
The minimal value represents the minimum probability of obtaining an inconclusive event for distinguishing the given set of coherent states using only vacuum auxiliary modes, LOP unitaries, and on-off photodetection. This value can be compared to the global bound obtained when general operations and measurements are allowed, which can be computed analytically or numerically using results already available in the literature [22].
Finally, we remark that the optimization problem (34) is not explicitly dependent on the number of auxiliary modes m ′ . However, once an optimal form for the matrix M is obtained, one needs to find a unitary matrix that extends it. In general, such a matrix exists only if m ′ is chosen sufficiently large. However, one can assume m ′ ≤ m without loss of generality. An explicit construction is given in Appendix C.
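As a hedged illustration of the construction above, the sketch below builds a feasible (not necessarily optimal) single-click receiver for two code words with real amplitudes over two modes. It is not the numerical optimizer used in this work: instead of optimizing the coefficients $k_i$, it assumes a common scale $k$ chosen as large as the constraint $M^\dagger M \leq I$ allows. All function names are ours.

```python
import math

def unit(v):
    """Normalize a real two-component vector."""
    norm = math.hypot(v[0], v[1])
    return (v[0] / norm, v[1] / norm)

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def feasible_receiver_p0(a1, a2, p1=0.5, p2=0.5):
    """Inconclusive probability of a feasible receiver for two real 2-mode code words."""
    v1 = unit((a2[1], -a2[0]))  # v1 is orthogonal to a2: mode 1 never clicks for state 2
    v2 = unit((a1[1], -a1[0]))  # v2 is orthogonal to a1: mode 2 never clicks for state 1
    # With M = k * [v1; v2], the largest eigenvalue of M^T M is k^2 * (1 + |v1.v2|),
    # so k^2 = 1 / (1 + |v1.v2|) saturates the constraint M^T M <= I.
    k_sq = 1.0 / (1.0 + abs(dot(v1, v2)))
    return (p1 * math.exp(-k_sq * dot(v1, a1) ** 2)
            + p2 * math.exp(-k_sq * dot(v2, a2) ** 2))

phi = math.pi / 3
a1 = (1.0, 0.0)
a2 = (math.cos(phi), math.sin(phi))

p0 = feasible_receiver_p0(a1, a2)
# Modulus of the Hilbert-space overlap, exp(-|a1 - a2|^2 / 2).
overlap = math.exp(-0.5 * ((a1[0] - a2[0]) ** 2 + (a1[1] - a2[1]) ** 2))
print(p0, overlap)
```

In this particular example (equal priors, equal energies, non-negative phase-space overlap) the value of `p0` coincides with the modulus of the overlap between the two states, which is the known optimum for two equiprobable pure states. For generic codes the coefficients $k_i$ must be optimized individually.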

Phase-space displacement improves USD
Better USD schemes, i.e., with a lower probability of the inconclusive event, can be obtained by enlarging the set of allowed resources. Here we describe a USD scheme obtained by adding the operation of phase-space displacement. We consider a setup where multimode displacement is first applied to the input modes, followed by the LOP unitary and mode-wise photodetection. This is shown in Fig. 2.
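The optimal setup is then obtained by minimizing the probability of the inconclusive event, where now there are $m$ additional complex degrees of freedom, corresponding to the components of the displacement vector $\boldsymbol{\gamma}$:
$$\min_{\{k_i, \boldsymbol{v}_i\},\, \boldsymbol{\gamma}} \; \sum_{j=1}^{c} p_j\, e^{-k_j^2 |\boldsymbol{v}_j \cdot (\boldsymbol{\alpha}^j + \boldsymbol{\gamma})|^2} \quad \text{subject to} \quad \boldsymbol{v}_i \cdot (\boldsymbol{\alpha}^j + \boldsymbol{\gamma}) = 0 \;\; \forall\, i \neq j , \quad M^\dagger M \leq I . \tag{35}$$
The examples of Section 5 will show that the introduction of phase-space displacement may improve the performance of USD substantially.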
As we noted above, the constraint can be satisfied if and only if the displaced amplitude vectors $\boldsymbol{\beta}^j = \boldsymbol{\alpha}^j + \boldsymbol{\gamma}$ are linearly independent. The use of displacement operations has the added value of casting a linearly dependent system with single degeneracy (see footnote 2) into an independent one. This extends the range of coherent states that can be discriminated unambiguously. To see this, consider a linearly dependent set of coherent states with $c \leq m$, and rank-deficient $R$ matrix with $\mathrm{rank}(R) = c - 1$. The matrix can be made full rank by adding to each row a linearly independent vector $\boldsymbol{\gamma}$. The new matrix,
$$R' = \begin{pmatrix} \boldsymbol{\alpha}^1 + \boldsymbol{\gamma} \\ \boldsymbol{\alpha}^2 + \boldsymbol{\gamma} \\ \vdots \\ \boldsymbol{\alpha}^c + \boldsymbol{\gamma} \end{pmatrix} ,$$
has full rank and represents the set of displaced coherent states.
Footnote 2: We call degenerate code a set of coherent states whose amplitude vectors are linearly dependent in phase space. Therefore, the associated $R$ matrix is not full rank. We say that the code has single degeneracy if the rank of the $R$ matrix is one unit below the maximum value, and has double degeneracy if it is two units below the maximum value.
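Displacement also comes to the rescue when the number of states is larger than the number of modes, in the case $c = m + 1$ and $\mathrm{rank}(R) = c - 1$. To make the matrix full rank, we first add one auxiliary mode, then displace the $m + 1$ modes by $(\boldsymbol{\gamma}, \gamma_{m+1})$. The new matrix reads
$$R' = \begin{pmatrix} \boldsymbol{\alpha}^1 + \boldsymbol{\gamma} & \gamma_{m+1} \\ \boldsymbol{\alpha}^2 + \boldsymbol{\gamma} & \gamma_{m+1} \\ \vdots & \vdots \\ \boldsymbol{\alpha}^c + \boldsymbol{\gamma} & \gamma_{m+1} \end{pmatrix}$$
and has rank $c$ for suitable choices of $(\boldsymbol{\gamma}, \gamma_{m+1})$. An explicit example of this method is presented in Section 5.7.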
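The rank-restoring effect of a displacement can be checked on a toy example. The sketch below uses a hypothetical code of three states over three modes with real amplitudes, in which the third amplitude vector is the sum of the first two (single degeneracy), and a hand-picked displacement vector `gamma`; none of these values are taken from the examples in this work.

```python
def det3(rows):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = rows
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def displace(R, gamma):
    """Add the same displacement vector gamma to every row (code word) of R."""
    return [[x + g for x, g in zip(row, gamma)] for row in R]

# Degenerate code: third row = first row + second row, so rank(R) = 2.
R = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [1.0, 1.0, 0.0]]
gamma = [0.0, 0.0, 1.0]  # displacement outside the span of the rows of R

print(det3(R), det3(displace(R, gamma)))  # zero before, non-zero after displacement
```

A non-zero determinant after displacement signals that the displaced amplitude vectors are linearly independent, so the single-click scheme applies to them.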

Double detection events
When the amplitude vectors are not linearly independent and the matrix R has multiple degeneracies, auxiliary modes and phase-space displacements are not sufficient to make the matrix full rank. Note that the only way to do that is to use a state-dependent displacement, which would imply some prior knowledge of the code word.
To bypass this problem, here we focus on an alternative approach based on multiple detection events.
As an example, we consider the simplest family of doubly degenerate codes, which is obtained for c = 3 and m = 1, i.e., a code of three coherent states over one mode: |α 1 ⟩, |α 2 ⟩, |α 3 ⟩. To discriminate these states, we introduce an explicit linear receiver design that makes use of two auxiliary vacuum modes. The three modes (one signal and two auxiliary modes) are first mixed in LOP unitary, then displaced by γ 1 , γ 2 , γ 3 , and finally detected as schematically shown in Fig. 3. Unlike the receivers of Sections 4.1, 4.2, here two joint detection events unambiguously determine the input state. Without loss of generality, we require that input state |α 1 ⟩ yields the vacuum in the first output mode, whereas the other two output modes both have a non-zero probability of photon detection. Similarly, we require that input state |α 2 ⟩ yields the vacuum state on the second output mode, and state |α 3 ⟩ yields the vacuum on the third output mode.
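To write this condition explicitly, consider the unitary matrix $[U_{ki}]$ that represents the LOP transformation. Since only the first input mode carries the signal, the amplitude on the output mode $k$, given the input state $|\alpha_j\rangle$, is
$$\beta_k^{(j)} = U_{k1}\, \alpha_j + \gamma_k .$$
We require that the output amplitude vanishes for $j = k$, i.e.,
$$\beta_j^{(j)} = 0 ,$$
from which we obtain three constraints:
$$\gamma_j = -U_{j1}\, \alpha_j , \tag{41}$$
for $j = 1, 2, 3$. Given the input state $|\alpha_1\rangle$, the probability of obtaining two joint photodetection events on modes 2 and 3 is
$$P(2,3|1) = \left(1 - e^{-|\beta_2^{(1)}|^2}\right)\left(1 - e^{-|\beta_3^{(1)}|^2}\right) = \left(1 - e^{-|U_{21}|^2 |\alpha_1 - \alpha_2|^2}\right)\left(1 - e^{-|U_{31}|^2 |\alpha_1 - \alpha_3|^2}\right) ,$$
where in the last equation we have used condition (41). Note that $P(2,3|1)$ is the probability of identifying the input state $|\alpha_1\rangle$, as we use double photodetection events to unambiguously discriminate the input states. From this expression, given the prior probability $p_j$ for coherent state $|\alpha_j\rangle$, we compute the average probability of obtaining an inconclusive event:
$$P_0 = 1 - p_1 P(2,3|1) - p_2 P(1,3|2) - p_3 P(1,2|3) .$$
The optimal performance of linear receivers based on double detection events is then obtained by minimizing this expression for $P_0$ under the constraint that the matrix $U$ is unitary:
$$\min_{U} P_0 \quad \text{subject to} \quad U^\dagger U = I .$$
This constrained optimization problem can be solved using Lagrange multipliers. Using the same setup, we can generalize this approach to codes with higher degeneracies.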

Examples
In this Section, we explore with a few examples the performance of our linear optical USD receivers relative to the global bound, for different coherent states and figures of merit. We illustrate that, despite their simplicity, decoders constructed using only linear components and on-off photodetection can achieve near-optimal USD: this is observed for randomly generated coherent states, where phase-space displacement is necessary to achieve near-optimal performance. The situation appears to be different for non-typical coherent states with multiple degeneracies, in which case we need to exploit double-detection events.

Discrimination of PPM codes and equivalent codes
A set of orthogonal vectors can always be discriminated perfectly. Since coherent states with finite energy are never orthogonal, they cannot be perfectly discriminated (though their discrimination can be improved by increasing their distance in phase space). In this Section, we explore a particular family of coherent states that share some formal features with orthogonal states. A pulse-position modulation (PPM) code is a set of $c = m$ coherent states over $m$ modes:
$$\begin{aligned} |\boldsymbol{\alpha}^1\rangle &= |\alpha, 0, 0, \dots, 0\rangle , \\ |\boldsymbol{\alpha}^2\rangle &= |0, \alpha, 0, \dots, 0\rangle , \\ &\;\;\vdots \\ |\boldsymbol{\alpha}^m\rangle &= |0, 0, \dots, 0, \alpha\rangle , \end{aligned}$$
such that the matrix $R = \alpha I$ is full rank. Note that these coherent states are mutually orthogonal with respect to the phase-space scalar product (defined in Eq. (8)), with Gram matrix $(\boldsymbol{\alpha}^i, \boldsymbol{\alpha}^j) = |\alpha|^2 \delta_{ij}$. The PPM code is often discussed in the context of quantum communications [36,37,50]. More broadly, it represents a case study for quantum state discrimination and hypothesis testing, see for example Ref. [51]. In general, there is a gap between the global bound and the performance of linear receivers. However, for a PPM code, it is easy to show that the gap vanishes, as the PPM code can be optimally discriminated using on-off photodetection only. In fact, to unambiguously discriminate the states in the PPM code, it is sufficient to apply mode-wise photodetection. If a photon is detected on mode $j$, then we know the input state necessarily was $|\boldsymbol{\alpha}^j\rangle$ with no ambiguity. The inconclusive event is when no click is recorded, which happens with probability $P_0 = e^{-|\alpha|^2}$. It is well known that this inconclusive probability saturates the global bound. To show this, one can apply the results of Bergou et al. in Ref. [22] (reviewed in Appendix B).
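The optimality of linear receivers extends to a larger class of codes beyond PPM. First note that, since the phase-space scalar product is invariant under LOP unitaries, it immediately follows that LOP unitaries and on-off photodetection are sufficient to optimally discriminate any set of coherent states that has the same Gram matrix as the PPM code,
$$(\boldsymbol{\alpha}^i, \boldsymbol{\alpha}^j) = |\alpha|^2 \delta_{ij} .$$
If the Gram matrix is not diagonal, we may try to make it diagonal by applying a phase-space displacement $\boldsymbol{\alpha}^j \to \boldsymbol{\beta}^j = \boldsymbol{\alpha}^j + \boldsymbol{\gamma}$, which changes the Gram matrix into
$$(\boldsymbol{\beta}^i, \boldsymbol{\beta}^j) = (\boldsymbol{\alpha}^i, \boldsymbol{\alpha}^j) + (\boldsymbol{\alpha}^i, \boldsymbol{\gamma}) + (\boldsymbol{\gamma}, \boldsymbol{\alpha}^j) + (\boldsymbol{\gamma}, \boldsymbol{\gamma}) .$$
Therefore, the code can be optimally discriminated with LOP unitaries, displacements, and photodetection, if there exist $\boldsymbol{\gamma}$ and $\tau > 0$ such that
$$(\boldsymbol{\alpha}^i, \boldsymbol{\alpha}^j) + (\boldsymbol{\alpha}^i, \boldsymbol{\gamma}) + (\boldsymbol{\gamma}, \boldsymbol{\alpha}^j) + |\boldsymbol{\gamma}|^2 = \tau\, \delta_{ij} .$$
This is a system of $c^2$ real equations and $2m + 1$ real unknowns (the components of the complex displacement vector $\boldsymbol{\gamma}$ and $\tau$). Therefore, in general, we expect this system of equations to have solutions if $c^2 \leq 2m + 1$.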
As we have seen above, the use of an auxiliary mode can sometimes improve the effectiveness of linear receivers. In fact, adding an auxiliary mode allows us to introduce one additional real degree of freedom, i.e., $|\gamma_{m+1}|^2$. The new system of equations reads $(\beta_i, \beta_j) + |\gamma_{m+1}|^2 = \tau\, \delta_{ij}$ and comprises $c^2$ equations and $2m + 2$ unknowns. Therefore, we expect this to generally admit a solution if $c^2 \leq 2m + 2$. Note that there is no benefit in adding more than one auxiliary mode.³

Dual of the PPM code
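An example of a code that can be reduced to PPM by applying a phase-space displacement is the following:
$$|\alpha_1\rangle = |0, \alpha, \alpha, \dots, \alpha\rangle, \quad |\alpha_2\rangle = |\alpha, 0, \alpha, \dots, \alpha\rangle, \quad \dots, \quad |\alpha_m\rangle = |\alpha, \alpha, \dots, \alpha, 0\rangle,$$
where the state $|\alpha_j\rangle$ has the vacuum on mode j and a coherent state of given amplitude $\alpha$ in the other modes. Note that this code is mapped into a PPM code by displacing each mode by $-\alpha$. Therefore, the linear receiver is optimal and saturates the global bound on the inconclusive event probability, $P_0 = e^{-|\alpha|^2}$.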
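The reduction of this code to PPM can be verified directly on the amplitude vectors. The sketch below is illustrative only: the function names and the numerical values of m and α are assumptions, and the Gram matrix again uses the convention with the conjugate on the first argument.

```python
import numpy as np

def gram(R):
    """Phase-space Gram matrix of the rows of R (conjugate on the first argument)."""
    return R.conj() @ R.T

def dual_ppm_code(m, alpha):
    """Code word j has vacuum on mode j and amplitude alpha on every other mode."""
    return alpha * (np.ones((m, m), dtype=complex) - np.eye(m))

# Illustrative (hypothetical) parameters.
m, alpha = 3, 0.5 + 0.2j
R = dual_ppm_code(m, alpha)

# Displace every mode by -alpha; broadcasting adds gamma to each code word (row).
gamma = -alpha * np.ones(m, dtype=complex)
R_displaced = R + gamma

gram_before = gram(R)
gram_after = gram(R_displaced)

# After the displacement only mode j of code word j is populated, so a click
# on mode j identifies code word j and the no-click probability is exp(-|alpha|^2).
p0 = float(np.exp(-abs(alpha) ** 2))
```

Before the displacement the Gram matrix has non-zero off-diagonal entries; afterwards it equals $|\alpha|^2 I$, as for a PPM code with amplitude $-\alpha$.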

Random codes
In this Section, we illustrate the performance of linear receivers on random states. Random codes are particularly useful for benchmarking the performance of receivers [52]. As an example, we consider c = 3 coherent states over m = 3 modes. We sample their amplitudes $(\alpha_1, \alpha_2, \alpha_3)$ uniformly from a sphere of radius $\sqrt{n}$, where $n = \sum_i |\alpha_i|^2$ and, for simplicity, we restrict to real-valued amplitudes. Figure 4(a) illustrates the minimized inconclusive probability $P_0$ achieved from our optimized framework without (34) and with (35) displacement. Notice that the dependence of $P_0$ on the average photon number is compatible with the exponential law obtained in the qualitative analysis of Section 3. Without displacement, the linear receivers perform poorly. However, when equipped with displacement, the performance improves significantly and nearly matches the global bound. To see this more clearly, Fig. 4(b) illustrates the statistical distributions of $P_0$ for n = 0.6 with N = 6600 random codes. The distribution of the inconclusive probability for the linear receivers with displacement closely matches the global bound, with small variations at smaller values of $P_0$. The global bound has been computed using the method of Ref. [22]. This result indicates that practical receivers based on linear optics are near-optimal for these random codes, closely approaching the global bound as long as phase-space displacement operations are accessible.
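The sampling step can be sketched as follows. This is one possible reading of the procedure, stated as an assumption: each code word is a real amplitude vector over the m modes, drawn uniformly from the sphere of radius $\sqrt{n}$ by normalizing a standard Gaussian vector. The function name, the seed, and the use of the matrix rank as a non-degeneracy check are illustrative choices, not taken from the paper.

```python
import numpy as np

def sample_random_code(c, m, n, rng):
    """Draw c real amplitude vectors uniformly from the sphere of radius sqrt(n) in R^m.

    A standard Gaussian vector has a rotation-invariant distribution, so
    rescaling it to a fixed norm gives a uniform point on the sphere.
    """
    g = rng.standard_normal((c, m))
    return np.sqrt(n) * g / np.linalg.norm(g, axis=1, keepdims=True)

# Illustrative parameters matching the example in the text (c = m = 3, n = 0.6).
rng = np.random.default_rng(seed=0)
R = sample_random_code(c=3, m=3, n=0.6, rng=rng)

# Mean photon number of each code word and rank of the amplitude matrix.
photon_numbers = np.sum(R ** 2, axis=1)
rank = np.linalg.matrix_rank(R)
```

Each row of `R` carries exactly n photons on average, and the rank check reflects that a random code is almost surely non-degenerate.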

Codes with single degeneracy
Random codes are typically non-degenerate. This means the coherent states in the random codes are almost surely linearly independent, i.e., the associated R is full rank. We now consider examples of degenerate codes. As discussed above, our receiver, endowed with phase-space displacement, can map a code with single degeneracy into a non-degenerate one. As an example, consider the code with single degeneracy [53] of Eq. (54). In Fig. 5(a), we illustrate the optimized inconclusive probabilities for linear receivers with displacement. Since these amplitude vectors are not linearly independent and have single degeneracy, receivers limited to LOP unitaries alone perform poorly, and we must leverage displacement operations. By allowing for phase-space displacement, our receivers match the global bound for small intensities ($e^{|\alpha|^2} < \sqrt{2}$). More details on the optimal receiver design are provided in Appendix D. The analytic expression of the global bound, Eq. (55), can be obtained using the method of Ref. [22]. The numerical optimality of linear receivers extends to larger signal intensities for alternative figures of merit, as discussed in the next Section.

Alternative figures of merit
The gap between linear receivers and the global bound can be further reduced for alternative figures of merit. Consider, for example, the communication capacity associated with the given input code words and a given detection strategy. The coherent states in Eq. (54) can be used as code words for a communication protocol where the sender (Alice) uses these coherent states to encode a random variable X that takes values x = 1, 2, 3 with associated probabilities $p_X(x)$. The receiver (Bob) decodes this information by applying either the linear receiver or a globally optimal USD receiver. The outcome of the receiver is described by a random variable Y that takes four possible values, y = 0, 1, 2, 3, with probability $p_Y(y)$, where y = 0 is the inconclusive event. The maximum asymptotic communication rate achievable in this way is given by the Shannon capacity [54]
$$C = \max_{p_X} \left[ H(Y) - H(Y|X) \right],$$
where
$$H(Y) = -\sum_y p_Y(y) \log p_Y(y)$$
is the Shannon entropy of Y, and
$$H(Y|X) = -\sum_x p_X(x) \sum_y p_Y(y|x) \log p_Y(y|x)$$
is the conditional entropy, where $p_Y(y|x)$ is the conditional probability of Y = y given X = x.
Since the receivers are unambiguous, $P_0(x) := p_Y(y = 0|x)$ is the probability of an inconclusive event for a given input, and $p_Y(y|x) = \delta_{yx} [1 - P_0(x)]$ for y = 1, 2, 3. The key quantity that determines the capacity is thus the conditional inconclusive probability $P_0(x)$. For linear receivers with displacement, this is given by From this we obtain where $P_0 = \sum_x p_X(x) P_0(x)$ is the average inconclusive probability. A comparison between our scheme (with LOP and displacement) and the global bound is shown in Fig. 5(b). The two schemes yield nearly equal capacities, the gap being too small to be visible on the scale of the plot. In addition to the inconclusive probability and the Shannon capacity, we explore the optimality of our receiver using the finite communication block length rate, F. This is the communication rate attainable when the code length, L, is finite and a block error probability threshold, ϵ, is imposed on the communication [55]. The normal approximation to the finite block length rate is given by [56] where $Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^\infty dt\, \exp[-t^2/2]$, and V denotes the variance of the information transition probabilities of the channel (see Eq. (63)).
Figure 6 illustrates the outcome of this maximization for different code lengths L and for different receivers. We find that linear receivers match the finite-code-length performance generated from the global bounds. Note that in the asymptotic regime of large code block lengths, the finite block length rate tends towards the channel capacity (with ϵ = 0), which is indicated by the horizontal lines.
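To make the structure of the USD channel concrete, the sketch below evaluates the mutual information H(Y) − H(Y|X) for a fixed input distribution, using only the unambiguity property $p_Y(y|x) = \delta_{yx}[1 - P_0(x)]$ stated above. It is a hedged illustration: the function name, the choice of base-2 logarithms (bits), and the numerical values of the prior and of $P_0(x)$ are assumptions, and the capacity would further require a maximization over $p_X$ that is not performed here.

```python
import numpy as np

def usd_mutual_information(prior, p_inconclusive):
    """Mutual information (in bits) between input X and USD outcome Y.

    prior[x]          : p_X(x) for x = 1..c
    p_inconclusive[x] : P_0(x), probability of the inconclusive outcome y = 0
    Conclusive outcomes are never wrong: p_Y(y|x) = delta_{yx} [1 - P_0(x)].
    """
    prior = np.asarray(prior, dtype=float)
    p0x = np.asarray(p_inconclusive, dtype=float)
    c = len(prior)

    # Channel matrix: row x, column y (column 0 is the inconclusive outcome).
    channel = np.zeros((c, c + 1))
    channel[:, 0] = p0x
    channel[np.arange(c), np.arange(1, c + 1)] = 1.0 - p0x

    def entropy(p):
        p = p[p > 0]  # convention 0 log 0 = 0
        return float(-np.sum(p * np.log2(p)))

    h_y = entropy(prior @ channel)
    h_y_given_x = sum(px * entropy(row) for px, row in zip(prior, channel))
    return h_y - h_y_given_x

# Hypothetical example: uniform prior over three code words, equal P_0(x) = 0.25.
rate = usd_mutual_information([1 / 3, 1 / 3, 1 / 3], [0.25, 0.25, 0.25])
```

When all $P_0(x)$ are equal the channel behaves like an erasure channel, and for a uniform prior the result reduces to $(1 - P_0)\log_2 c$.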

Codes with double degeneracy
To discriminate codes with higher degeneracy, we may exploit joint detection events on pairs of output modes. Consider the single-mode code with double degeneracy of Eq. (64). Single-mode codes find important applications in quantum sensing and communications protocols where individual measurements are preferred at each instance. Given the double degeneracy, our displacement-based receiver is not able to unambiguously discriminate signal states in this code. However, we can still discriminate these coherent states with LOP unitaries, vacuum auxiliary modes, displacement, and on-off detection by constructing receivers that exploit double detection events, as described in Section 4.3. The minimal inconclusive event probability of this class of receivers is determined as the solution to optimization (47). By noting the equivalence between arbitrary auxiliary modes and vacuum auxiliary modes followed by mode-wise displacement operations, an explicit receiver that saturates the performance of linear receivers for this code is illustrated in Fig. 7(a). It requires two auxiliary modes: one prepared in the vacuum and the other in a coherent state with amplitude $\alpha/\sqrt{2}$ (as noted above, using coherent-state ancillas is equivalent to using vacuum ancillas and phase-space displacements). The modes are then mixed at two 50/50 beam splitters and the output modes are measured by on-off photodetection. For an input amplitude $\alpha_j$ chosen from Eq. (64), we write the output coherent state amplitude at mode k as $\zeta^j_k$. This input-output transformation on the coherent state amplitudes is summarized in Table 2. By inspection of the table, it is evident that the coherent states can be unambiguously discriminated if two detectors click. For example, a joint detection on modes one and two identifies the input code word $|\alpha_1\rangle = |-\alpha\rangle$ without error.
From Table 2, the probability of a joint photodetection event in modes k and l can be determined through

Table 2: Input-output transformation of the coherent state amplitudes through our linear receiver with two-mode photon detection. The receiver is shown in Fig. 7(a).
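As a generic illustration of how joint click probabilities follow from output amplitudes, the sketch below uses the standard fact that an ideal on-off detector illuminated by a coherent state of amplitude ζ fires with probability $1 - e^{-|\zeta|^2}$, independently across modes. The function names are assumptions, and the amplitudes in the example are hypothetical placeholders, not the entries of Table 2.

```python
import numpy as np

def click_probability(zeta):
    """Probability that an ideal on-off detector fires on a coherent state |zeta>."""
    return 1.0 - np.exp(-abs(zeta) ** 2)

def joint_click_probability(zetas, k, l):
    """Probability that the detectors on modes k and l both fire.

    The output is a product of coherent states, so the detectors on
    different modes fire independently of each other.
    """
    return click_probability(zetas[k]) * click_probability(zetas[l])

# Hypothetical output amplitudes for one input code word (placeholders only).
zetas = np.array([0.7, -0.7, 0.0])

p_01 = joint_click_probability(zetas, 0, 1)  # both populated modes
p_02 = joint_click_probability(zetas, 0, 2)  # mode 2 is in the vacuum
```

A joint event involving a mode that is in the vacuum for a given code word has zero probability, which is what allows certain pairs of clicks to rule out that code word.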
For this receiver, an inconclusive event corresponds to the absence of a joint photodetection event. Given a prior distribution p j , the average probability of obtaining an inconclusive event is then given by which for a uniform prior, p j = 1/3, yields This result can be compared with the global bound, which we compute analytically using Ref. [22] (see Appendix E for proof): The performance of the linear receivers is benchmarked with the global bound in Fig. 7(b). We note that the behavior of P 0 for small |α| 2 is compatible with the quadratic law in Eq. (21). Finally, we note that while photon number resolving detectors (PNRDs) could help improve the performance of our detectors, we do not expect their use to replace the reliance on detection events across multiple modes. In the case of the example in Eq. (64), α 1 and α 2 differ only in phase and have the same photon statistics. Therefore PNRDs, by themselves, cannot be sufficient to discriminate these two states.

Phase-shift keying
A common way to encode information into coherent states is by phase modulation, i.e., phase-shift keying. Previous works have considered practical schemes for USD of phase-shifted coherent states, both with and without feedback [27,30,31]. In this Section we analyze USD of phase-shifted coherent states using our linear receivers, re-obtaining some of the results of van Enk [27]. Binary phase-shift keying (BPSK) is a code with two states on one mode (m = 1, c = 2): $\{|\alpha\rangle, |-\alpha\rangle\}$. It is well known how to optimally discriminate BPSK; here we express this method in the framework of our linear receivers. Indeed, to discriminate these states we can apply the strategy described in Section 4.2.
Note that these states are not linearly independent in phase space. To make them independent, first, we add an auxiliary vacuum mode, generating the two-mode states $|\alpha, 0\rangle$, $|-\alpha, 0\rangle$. Second, we displace the auxiliary mode by $\alpha$, which yields two coherent states with linearly independent amplitude vectors, namely $|\alpha, \alpha\rangle$ and $|-\alpha, \alpha\rangle$. Finally, we note that the Gram matrix of these latter states is diagonal, thus the code is equivalent to the PPM code. Applying the argument of Section 5.1, we conclude that the BPSK code is optimally discriminated with linear receivers.
In general, we may consider M-PSK codes with M states over a single mode: $|\alpha_j\rangle = |\alpha e^{2\pi i j/M}\rangle$ for j = 0, . . . , M − 1. Note that for M ≥ 3 the code has multiple degeneracies. This means that it can be discriminated only using multiple detection events. For example, the 3PSK (M = 3) ensemble of coherent states can be discriminated by exploiting double detection events. For this, we make use of two auxiliary vacuum modes and construct linear receivers according to the strategy described in Section 4.3. The best linear receiver saturates optimization (47) and its performance is shown in Fig. 8 along with the global bound. Note that we recover the quadratic law of $P_0$ for small $|\alpha|^2$ as predicted by Eq. (21). For USD of 3PSK, it is easy to find the analytical expression for the optimized probability of the inconclusive event for linear receivers Our numerical search suggests that using more than two vacuum auxiliary modes delivers no additional advantage to USD. Finally, the global bound, obtained from [22], reads
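Returning to the BPSK construction described above (auxiliary vacuum mode, then a displacement of that mode by α), the following sketch checks that the resulting two-mode code has a diagonal Gram matrix and that a single 50/50 beam splitter maps it onto a PPM-like code. The beam-splitter matrix convention and the value of α are assumptions made for illustration; the text itself only states that the code is equivalent to PPM.

```python
import numpy as np

alpha = 0.6  # hypothetical amplitude

# Step 1: append a vacuum auxiliary mode. Step 2: displace it by alpha.
# Rows are the amplitude vectors of |alpha, alpha> and |-alpha, alpha>.
code = np.array([[alpha, alpha],
                 [-alpha, alpha]], dtype=complex)

# Phase-space Gram matrix (conjugate on the first argument).
gram = code.conj() @ code.T

# Assumed 50/50 beam-splitter convention (real, Hadamard-like unitary).
bs = np.array([[1, 1],
               [1, -1]]) / np.sqrt(2)

# Each amplitude vector v is mapped to bs @ v; for row vectors this is v @ bs.T.
out = code @ bs.T

# No-click probability for each code word after the beam splitter.
p0 = np.exp(-np.sum(np.abs(out) ** 2, axis=1))

# Overlap |<alpha|-alpha>| of the two BPSK states.
overlap = np.exp(-2 * abs(alpha) ** 2)
```

With this convention the first code word populates only output mode 0 and the second only output mode 1, each with amplitude of modulus $\sqrt{2}|\alpha|$, so a click identifies the input and the inconclusive probability is $e^{-2|\alpha|^2}$, equal to the overlap of the two BPSK states.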

Conclusions
While quantum states cannot in general be discriminated without error, they can be discriminated unambiguously if we allow for an inconclusive outcome. The theory of USD allows us to identify the globally optimal measurements and to compute the ultimate bounds on the probability of obtaining an inconclusive outcome, see for example [15,21,22]. However, existing literature has rarely explored USD under realistic experimental constraints and limitations in what measurements are currently feasible, especially for the discrimination of coherent states of the quantum electromagnetic field.
Here we have outlined a theory of USD for multimode coherent states that focuses on practical receivers that can be feasibly realized with current technologies. Our receivers operate under physical resources described entirely through linear optical passive (LOP) unitaries, phase-space displacement operations, auxiliary vacuum modes, and on-off mode-wise photon detection. We have benchmarked the performance of these linear receivers against the global bound and found examples where they are optimal or near-optimal. In particular, this happens for random codes. Our findings show that high-performance USD receivers can be readily realized with currently available technologies, and suggest that, at least for typical, non-degenerate codes, high-order non-linearities, feedback, or more advanced quantum technologies may only provide small improvements.
We do not have a complete theoretical explanation of why linear receivers perform well. However, we think that the reason may be related to the fact that, at least for small amplitudes, the probability of multiple detection events is suppressed. Therefore, using a photon number resolving detector may not be particularly useful unless the code is degenerate. Furthermore, as the photon number is subject to quantum fluctuations in coherent states, it is unlikely that more detailed information on photon statistics can improve USD (whereas it may be useful for ASD). Also, we have seen that linear receivers achieve the global bound for the PPM codes, and for all codes that can be reduced to them by linear optics. By symmetry, and by an argument of concentration of measure, we may indeed expect that, up to statistical fluctuations, a random code is not too different from a PPM code: this is at least true in the regime where the number of modes (m) is much larger than the number of code words (c). In fact in this regime the scalar product (α i , α j ) (which is zero on average) has fluctuations of order 1/ √ m and the Gram matrix becomes nearly diagonal if c ≪ m. However, for small values of m, the displacement operation seems to play an important role that is not captured by this argument.
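The concentration-of-measure statement can be illustrated with a short numerical experiment. This is a sketch under simplifying assumptions: the amplitude vectors are taken to be real and of unit norm, drawn uniformly from the sphere, and the function name, sample size, and seed are illustrative choices.

```python
import numpy as np

def overlap_spread(m, samples, rng):
    """Standard deviation of the scalar product of two random unit vectors in R^m."""
    u = rng.standard_normal((samples, m))
    v = rng.standard_normal((samples, m))
    u /= np.linalg.norm(u, axis=1, keepdims=True)
    v /= np.linalg.norm(v, axis=1, keepdims=True)
    return float(np.std(np.sum(u * v, axis=1)))

rng = np.random.default_rng(seed=1)
modes = (4, 16, 64, 256)

# For independent uniform unit vectors the overlap has zero mean and
# variance 1/m, so spread * sqrt(m) should be close to 1 for every m.
rescaled = {m: overlap_spread(m, 4000, rng) * np.sqrt(m) for m in modes}
```

The rescaled spread stays close to 1 as m grows, consistent with off-diagonal Gram entries of order $1/\sqrt{m}$ relative to the diagonal ones.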
A number of questions remain open. First, an analytical expression for the inconclusive probability, beyond our numerical results, would characterize more clearly the comparison with the global bound, and allow us to apply our analysis to an arbitrary number of modes and code words. In the regime of weak signals, this problem may be more naturally framed within the language of Poisson quantum information [57]. Second, an extension of our theory is required to handle highly degenerate codes, for which feedback operations could be useful. This is especially important when the number of states is much larger than the number of optical modes. In this regime, which is of particular interest for quantum communications, USD may be achieved only by exploiting joint photodetection events on multiple modes. Finally, our work might be formulated as a resource theory, similar to Refs. [58,59]. This approach may provide insight and help in comparing different sets of resources, for example by including homodyne or heterodyne detection, photon addition and subtraction, or some mild non-linear interactions.

Acknowledgments
This work was supported by the EPSRC Quantum Communications Hub, Grant No. EP/T001011/1, and by the European Union's Horizon Europe research and innovation programme under the project "Quantum Secure Networks Partnership" (QSNP, grant agreement No. 101114043). SG acknowledges the NSF Center for Quantum Networks, awarded under grant number EEC-1941583. MSB acknowledges the University of Arizona's Information in a Photon course for being introduced to the problem of optimal linear-optic USD receiver designs. CL acknowledges financial support from PNRR MUR project PE0000023-NQSTI. We are grateful to the anonymous referees for their insightful and stimulating comments.

A Peres-Terno theory for USD
In this Appendix we review the theory of Peres and Terno [15] for the USD of a set of linearly independent vectors $|u_j\rangle$ with $j = 1, \dots, c$, associated with prior probabilities $p_j$ (satisfying $\sum_{j=1}^{c} p_j = 1$). These vectors span a c-dimensional Hilbert space $H_c$. We define a unique set of (not necessarily normalized) vectors $|v_j\rangle \in H_c$ such that
$$\langle v_i | u_j \rangle = \delta_{ij}.$$
We use these vectors to define a POVM with elements
$$A_j = k_j |v_j\rangle\langle v_j|$$
for $j = 1, \dots, c$. The POVM element corresponding to an inconclusive event is
$$A_0 = I - \sum_{j=1}^{c} A_j.$$
The parameters $k_j$ are chosen in such a way as to ensure $A_0 \geq 0$, and I is the identity in $H_c$. For a suitable choice of the parameters $k_j$, this POVM allows for unambiguous discrimination. The corresponding probability of an inconclusive outcome is
$$P_0 = \sum_{j=1}^{c} p_j \langle u_j | A_0 | u_j \rangle = 1 - \sum_{j=1}^{c} p_j k_j.$$
A globally optimal USD measurement is one that minimizes $P_0$ subject to the positivity of $A_0$. This is obtained as the solution to the constrained maximization problem: maximize $\sum_{j=1}^{c} p_j k_j$ over $k_j \geq 0$ subject to $A_0 \geq 0$, which defines the globally optimal bound on the probability of the inconclusive outcome.
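The construction reviewed above can be sketched numerically. In the example below, the reciprocal vectors are obtained from the inverse Gram matrix, and, as a simplifying assumption that is not optimal in general, all $k_j$ are set equal to a common value k, chosen as the smallest eigenvalue of the Gram matrix so that $A_0 \geq 0$. The test code (three unit vectors with equal pairwise overlap s), the helper name, and the numerical values are hypothetical.

```python
import numpy as np

def reciprocal_vectors(U):
    """Columns v_j in the span of the columns u_j of U, with <v_i|u_j> = delta_ij."""
    G = U.conj().T @ U
    return U @ np.linalg.inv(G)

# Hypothetical code: three unit vectors with equal pairwise overlap s.
s = 0.3
G = (1 - s) * np.eye(3) + s * np.ones((3, 3))
L = np.linalg.cholesky(G)      # G = L L^dagger
U = L.conj().T                 # columns u_j satisfy U^dagger U = G
V = reciprocal_vectors(U)
priors = np.full(3, 1 / 3)

# Equal-weight choice k_j = k (an assumption made here for simplicity).
# A_0 = I - k V V^dagger is positive iff k <= smallest eigenvalue of G.
k = float(np.linalg.eigvalsh(G).min())
A0 = np.eye(3) - k * (V @ V.conj().T)

# Conditional outcome probabilities P(i|j) = k |<v_i|u_j>|^2.
conclusive = k * np.abs(V.conj().T @ U) ** 2

# Average inconclusive probability P_0 = sum_j p_j <u_j|A_0|u_j>.
p0 = float(sum(p * (U[:, j].conj() @ A0 @ U[:, j]).real
               for j, p in enumerate(priors)))
```

The matrix `conclusive` is diagonal, so outcome i never occurs for input j ≠ i, and the resulting $P_0 = 1 - k$ equals s for this symmetric example.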

B Method of Bergou, Futschik, and Feldman
Consider the c × c matrix C with elements
$$C_{jj} = q_j, \qquad C_{ij} = \langle u_i | u_j \rangle \quad (i \neq j).$$
Note that the diagonal elements are the probabilities of an inconclusive event conditioned on the vector $|u_j\rangle$, and the off-diagonal elements are simply the overlaps between the code words. If the vectors $|u_j\rangle$ are linearly independent, the condition of non-negativity of the operator $A_0$, $A_0 \geq 0$, is equivalent to $C \geq 0$. Therefore, following the theory of Bergou et al. [22], the optimal average inconclusive probability is obtained by solving the constrained optimization: minimize $\sum_{j=1}^{c} p_j q_j$ over $q_1, \dots, q_c$ (with $0 \leq q_j \leq 1$) subject to $C \geq 0$. Note that the solution to this optimization when $q_j = q$ for all j is known to be determined by the minimal eigenvalue $\lambda_{\min}$ of the Gram matrix, that is, $q = 1 - \lambda_{\min}$ [18]. More generally, for non-equal $q_j$, this problem can be solved analytically or semi-analytically. The case of c = 3 is discussed in detail in Ref. [22].
We can immediately apply this method to show that PPM codes are optimally discriminated by mode-wise photodetection. To prove this, recall that the PPM codes are defined such that the matrix R is square, with $R = \alpha I$ and I the c-dimensional identity matrix. The off-diagonal entries of the matrix C are all equal to $e^{-|\alpha|^2}$, while the diagonal entries are all equal, i.e., $C_{jj} = q$. Therefore, the objective function in the minimization (81) is equal to q. The eigenvalues of C are $(c - 1)e^{-|\alpha|^2} + q$ (with multiplicity 1) and $q - e^{-|\alpha|^2}$ (with multiplicity c − 1). The smallest value of q such that the eigenvalues remain non-negative is therefore $q = e^{-|\alpha|^2}$, which matches the inconclusive event probability obtained through mode-wise photodetection in Section 5.1.
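The eigenvalue argument for the PPM code can be checked numerically. The sketch below builds the matrix C with equal diagonal entries q and off-diagonal entries $e^{-|\alpha|^2}$, as described above; the function name and the values of c and α are illustrative assumptions.

```python
import numpy as np

def bergou_matrix(q, overlap, c):
    """c x c matrix with q on the diagonal and the pairwise overlap elsewhere."""
    C = overlap * np.ones((c, c))
    np.fill_diagonal(C, q)
    return C

# Hypothetical parameters.
c, alpha = 4, 0.7
overlap = np.exp(-abs(alpha) ** 2)  # overlap between two distinct PPM code words

# At q = overlap the smallest eigenvalue is zero (C is on the boundary of C >= 0).
eig_at_bound = np.linalg.eigvalsh(bergou_matrix(overlap, overlap, c))

# Slightly below, C acquires a negative eigenvalue, so q cannot be reduced further.
eig_below = np.linalg.eigvalsh(bergou_matrix(0.99 * overlap, overlap, c))
```

The spectrum at the bound consists of 0 (with multiplicity c − 1) and $c\,e^{-|\alpha|^2}$, in agreement with the eigenvalues quoted in the text.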

C Framework for optimized USD receivers
In this Appendix, we determine a condition for a c × m matrix M (with c ≤ m) to be a submatrix of a larger unitary matrix $\mathcal{U}$. First, if c < m, extend M into a square, m × m matrix $M_0$ by appending m − c rows of zeros. Second, apply the singular value decomposition $M_0 = U D V$, where U and V are unitary matrices, and D is diagonal with non-negative entries. For $D \leq 1$, the following 2m × 2m matrix is unitary:
$$\mathcal{V} = \begin{pmatrix} D & \sqrt{I - D^2} \\ \sqrt{I - D^2} & -D \end{pmatrix},$$
where I is the identity matrix. By multiplying $\mathcal{V}$ by U and V we obtain another unitary matrix, which is a unitary extension of $M_0$,
$$\mathcal{U} = \begin{pmatrix} U & 0 \\ 0 & I \end{pmatrix} \mathcal{V} \begin{pmatrix} V & 0 \\ 0 & I \end{pmatrix}.$$
As $M_0$ is an extension of M, it follows that $\mathcal{U}$ is a unitary extension of M. We conclude that a matrix M can be extended into a unitary if and only if its singular values are not larger than 1. This condition can equivalently be written as $M M^\dagger \leq I$. Note that with this construction, the unitary matrix has at most size 2m. For our application, this means that we need at most m auxiliary modes to implement the receiver. In particular, the minimum number of auxiliary modes equals the number of singular values of $M_0$ that are strictly smaller than 1.
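The extension procedure of this Appendix can be sketched with a standard unitary dilation built from the singular value decomposition. The specific block form used below, the function name, the tolerance, and the test matrix are assumptions made for illustration; other equivalent block arrangements exist.

```python
import numpy as np

def unitary_extension(M, tol=1e-12):
    """Embed a c x m matrix M (c <= m) as the top-left block of a 2m x 2m unitary."""
    c, m = M.shape
    # Pad with rows of zeros to obtain the square matrix M0.
    M0 = np.vstack([M, np.zeros((m - c, m), dtype=complex)])
    U, d, Vh = np.linalg.svd(M0)  # M0 = U @ diag(d) @ Vh
    if np.any(d > 1 + tol):
        raise ValueError("singular values exceed 1: no unitary extension exists")
    d = np.clip(d, 0.0, 1.0)
    D = np.diag(d)
    S = np.diag(np.sqrt(1.0 - d ** 2))
    # Core dilation: real, symmetric, and squares to the identity, hence unitary.
    core = np.block([[D, S], [S, -D]])
    I, Z = np.eye(m), np.zeros((m, m))
    left = np.block([[U, Z], [Z, I]])
    right = np.block([[Vh, Z], [Z, I]])
    return left @ core @ right

# Hypothetical 2 x 3 matrix with all singular values below 1.
M = np.array([[0.6, 0.0, 0.3],
              [0.1, 0.5, 0.0]], dtype=complex)
W = unitary_extension(M)
```

The top-left c × m block of `W` reproduces M, and the remaining m rows and columns play the role of auxiliary modes.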

D Receiver for a code with single degeneracy
Here we discuss in more detail the optimal linear-optics receiver for the code in Eq. (54). By applying a displacement $\gamma = (-\alpha, -\alpha)$ we can map the third coherent state into the vacuum, and the first two into a PPM code. Each time a photodetection event is observed, we have unambiguous discrimination. This scheme is shown in Fig. 9. Note that in this way, the third coherent state is never discriminated. It is easy to check that the probability of the inconclusive event achieved in this way is $P_0 = \frac{1}{3}\left(2e^{-4|\alpha|^2} + 1\right)$ and thus, comparing with Eq. (55), the scheme is optimal for $e^{|\alpha|^2} < \sqrt{2}$.

Figure 9: Optimal receiver (in the low-photon regime with $|\alpha|^2 \leq \ln\sqrt{2}$) for the two-mode, single-degeneracy CB (85), consisting of a displacement of each mode by $-\alpha$ before on-off photodetection.
To frame this scheme into our theory, we need to add an ancillary mode, initially in the vacuum state, then displace the three modes by γ = (−α, −α, z). We then add another vacuum ancillary mode and swap the third and fourth modes. USD is then achieved by the photodetection of the first three modes. This is shown in Fig. 10. Note that the third detector never clicks, which is expected as the third code word is not discriminated in this scheme.
Figure 10: Optimal receiver (in the low-photon regime with $|\alpha|^2 \leq \ln\sqrt{2}$) for the two-mode, single-degeneracy CB (85), comprised of two vacuum auxiliary modes, displacement across the first three modes, and a swap operation between modes three and four, before on-off photodetection.

For larger values of α, i.e., for $e^{|\alpha|^2} > \sqrt{2}$, LOP receivers are only near-optimal. Our numerical search indicates that the optimal displacement is of the form $\gamma = (-x, -x, z)$, including one ancillary vacuum mode, where the positive parameter x is in general smaller than |α|. It follows that the unit vectors $v_i$ in where M and N are normalization factors. Numerically, we find we can assume without loss of generality $k_1 = k_2 = k_3 =: k$. Therefore, the matrix M reads which in turn can be completed into a 6 × 6 (real-valued) unitary matrix following for example the procedure of Appendix C. This scheme is shown in Fig. 11.

Figure 11: Near-optimal receiver (in the higher-photon regime with $|\alpha|^2 > \ln\sqrt{2}$) for the two-mode, single-degeneracy CB (85).
Note that, in Eq. (95), the first constraint follows from the positivity of $q_j$, the second from the positivity of the failure POVM element $\Pi_0$, and the third from the linear independence (LI) of the states. Collectively, these constraints ensure the non-negativity of C in Eq. (81).
A comparison of the two values is illustrated in Fig. 13(a), which shows that $P_0(A) \leq P_0(B)$. Therefore, we discard the point B from the search for the minimized $P_0$ value. Next, we minimize $P_0$ over the interior points between A and B along $q_3 = 2(q + 1)^{-1}$: which amounts to a single-parameter optimization. This function has a stationary point at $q_{\rm in} = e^{3|\alpha|^2/2} - 1$ yielding The feasibility of this solution is conditioned on the stationary point $q_{\rm in}$ residing within the allowed domain between vertices A and B, which holds if and only if $e^{3|\alpha|^2/2} - 1 \in [2e^{|\alpha|^2} - 1,\; e^{2|\alpha|^2}] \implies e^{|\alpha|^2} \geq 4$.
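The feasibility condition can be checked numerically from the interval quoted above. In this sketch the function name is an assumption, and n stands for $|\alpha|^2$; only the membership of the stationary point in the interval is tested, since the objective function itself is not reproduced here.

```python
import numpy as np

def stationary_point_is_feasible(n):
    """Check whether q_in = exp(3n/2) - 1 lies in [2 exp(n) - 1, exp(2n)], with n = |alpha|^2."""
    q_in = np.exp(1.5 * n) - 1.0
    lower = 2.0 * np.exp(n) - 1.0
    upper = np.exp(2.0 * n)
    return bool(lower <= q_in <= upper)

threshold = np.log(4.0)  # predicted crossover |alpha|^2 = ln 4

below = stationary_point_is_feasible(threshold - 0.01)
above = stationary_point_is_feasible(threshold + 0.01)
```

The lower limit is the binding one: $e^{3n/2} \geq 2e^{n}$ is equivalent to $e^{n/2} \geq 2$, i.e., $e^{n} \geq 4$, while the upper limit holds for every n ≥ 0.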
This solution is illustrated together with $P_0(A)$ in Fig. 13(b). The solution $P_0^{\rm in}$ is plotted only when $e^{|\alpha|^2} \geq 4$. The two solutions cross exactly at $|\alpha|^2 = \ln(4)$. Hence, the global minimum for the double-degeneracy (DD) CB is characterized by $P_0(A)$ for $e^{n} < 4$, and $P_0^{\rm in}$ for $e^{n} \geq 4$: which concludes our proof.