The role of coherence theory in attractor quantum neural networks

We investigate attractor quantum neural networks (aQNNs) within the framework of coherence theory. We show that: i) aQNNs are associated with non-coherence-generating quantum channels; ii) the depth of the network is given by the decohering power of the corresponding quantum map; and iii) the attractor associated with an arbitrary input state is the one minimizing their relative entropy. Further, we examine faulty aQNNs described by noisy quantum channels, derive their physical implementation, and analyze under which conditions their performance can be enhanced by using entanglement or coherence as external resources.

such map. An exponential increase of the storage memory of an aQNN, with respect to its classical counterpart, was already shown in [7] by means of quantum search algorithms. Also, in [8], the same result was recovered by using a feed-forward interpretation of the quantum Hopfield neural networks (for a recent development of this model see [9,10]). More recently, in [11], the explicit form of the CPTP maps possessing the maximal number of stationary states was derived.
Interestingly, such CPTP maps correspond to non-coherence-generating operations. Such observation motivates our choice of addressing aQNNs from a coherence-theoretic approach. Within this framework, we characterize the properties of aQNNs, such as their physical implementation and the depth of the network, that is, the number of times the map has to be applied to retrieve faithfully the state which is closest to the initial input. We then focus on the realistic scenario of faulty aQNNs, i.e., the case when some error in the realization of the network is taken into account.
In what follows we briefly review some basic notions regarding both aQNNs and the resource theory of coherence. In Section 2, we present our results regarding error-free aQNNs. We show that the evolution of aQNNs with maximal storage capacity is described by a genuinely incoherent operation (GIO). Besides, we demonstrate that, for such aQNNs, the equivalent of the Hamming distance is the quantum relative entropy. After deriving their physical implementation, we define their depth and establish a relation to the concept of decohering power. Further, we show that, in the case of noiseless aQNNs, neither coherence nor entanglement can be exploited as resources to enhance their performance. In Section 3, we address the above issues in the context of faulty aQNNs, i.e., when we consider the presence of some source of error in the CPTP maps that implement the aQNNs. In this case, we demonstrate that the corresponding aQNNs are described either by strictly incoherent operations (SIOs), or by maximally incoherent operations (MIOs), thus opening the possibility, in the latter case, to an enhancement of their performance by using coherence as an external resource.

1.1 Attractor quantum neural networks
An attractor QNN of the Hopfield type consists of a network of $n$ $d$-dimensional artificial neurons (qudits) which evolve under a quantum channel, i.e., a non-trivial CPTP map $\Lambda : \mathcal{B}(\mathcal{H}_{\rm in}) \to \mathcal{B}(\mathcal{H}_{\rm out})$. The stored memories correspond to the stationary states of the map, that is, the states $\rho_S$ such that $\Lambda(\rho_S) = \rho_S$ [11]. For an arbitrary input state $\rho \neq \rho_S$, successive applications of the map bring the state to one of the stationary states $\rho_S$. In what follows, we restrict to the case where $\dim(\mathcal{H}_{\rm in}) = \dim(\mathcal{H}_{\rm out}) = N = d^n$. As demonstrated in [11,12], a non-trivial CPTP map can have up to $N$ stationary states, $\Lambda(|\mu\rangle\langle\mu|) = |\mu\rangle\langle\mu|$, where $\{|\mu\rangle\}_{\mu=0}^{N-1}$ forms an orthonormal basis of $\mathcal{H}_{\rm in}$. Such a map has the form of a generalized decohering map, i.e.,
$$\Lambda(\rho) = \sum_{\mu,\nu=0}^{N-1} (1+\alpha_{\mu\nu})\,\rho_{\mu\nu}\,|\mu\rangle\langle\nu| , \tag{1}$$
where $\rho_{\mu\nu} = \langle\mu|\rho|\nu\rangle$, $\alpha_{\mu\nu} \in \mathbb{C}$ and $\alpha_{\mu\mu} = 0$.
To determine the complete positivity of the map, it is easier to work in terms of its Choi state, $J_\Lambda \in \mathcal{B}(\mathcal{H}_{\rm in} \otimes \mathcal{H}_{\rm out})$, obtained by means of the Choi-Jamiołkowski isomorphism, so that $\Lambda$ is CPTP iff $J_\Lambda \geq 0$ and $\mathrm{Tr}_{\rm out}(J_\Lambda) = \mathbb{1}_{\rm in}$, where $\mathrm{Tr}_{\rm out}$ denotes the partial trace over the subsystem $\mathcal{H}_{\rm out}$. The Choi state of the map of Eq. (1) reads
$$J_\Lambda = \sum_{\mu,\nu} |\mu\rangle\langle\nu| \otimes \Lambda(|\mu\rangle\langle\nu|) = \sum_{\mu,\nu} (1+\alpha_{\mu\nu})\,|\mu\mu\rangle\langle\nu\nu| .$$
The positivity requirement, $J_\Lambda \geq 0$, demands that $|1+\alpha_{\mu\nu}|^2 \leq 1$ for all $\alpha_{\mu\nu}$ (and $\alpha_{\mu\mu} = 0$ for all $\mu$), as well as the non-negativity of all principal minors of $J_\Lambda$, which can be checked by, e.g., Sylvester's criterion [11]. In the most general case checking positivity is evidently hard, but here we simplify our analysis by restricting to the particular cases where $\alpha_{\mu\nu} = \alpha_{\nu\mu} = \alpha \in \mathbb{R}$ for every $\mu \neq \nu$. Under this requirement, we find that $\Lambda$ is CPTP whenever
$$-\frac{N}{N-1} \leq \alpha \leq 0 .$$
We remark that our results are, nevertheless, general and apply also when this restriction is lifted, as long as the map $\Lambda$ is CPTP. Throughout this work, we will only consider aQNNs with maximal storage capacity, that is, those whose evolution is given by Eq. (1). With an abuse of language, we will sometimes refer to the map $\Lambda$ in Eq. (1) metonymically as aQNN.
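The uniform case can be explored with a few lines of Python. This is a minimal sketch, not the authors' code: it assumes the decohering form of Eq. (1) with $\alpha_{\mu\nu} = \alpha \in \mathbb{R}$ (diagonal entries untouched, off-diagonal entries multiplied by $1+\alpha$), and the function names are hypothetical. The positivity test relies on the observation that the only non-zero block of $J_\Lambda$, supported on the vectors $|\mu\mu\rangle$, has 1 on the diagonal and $1+\alpha$ elsewhere, so its eigenvalues are $1+(N-1)(1+\alpha)$ (once) and $-\alpha$ ($N-1$ times).

```python
def apply_aqnn(rho, alpha):
    """Apply the decohering map: keep the diagonal, scale off-diagonals by (1 + alpha)."""
    n = len(rho)
    return [[rho[m][k] * (1 if m == k else 1 + alpha) for k in range(n)]
            for m in range(n)]


def choi_block_eigenvalues(n, alpha):
    """Eigenvalues of the n x n matrix with 1 on the diagonal and (1 + alpha) elsewhere."""
    return [1 + (n - 1) * (1 + alpha)] + [-alpha] * (n - 1)


def is_cptp(n, alpha, tol=1e-12):
    """The uniform-alpha map is CPTP iff its Choi block is positive semidefinite."""
    return all(ev >= -tol for ev in choi_block_eigenvalues(n, alpha))


# Maximally coherent qutrit (all entries 1/N) as an example input.
N = 3
rho = [[1 / N] * N for _ in range(N)]
out = apply_aqnn(rho, alpha=-0.5)
trace_out = sum(out[m][m] for m in range(N))  # the trace is preserved
```

Scanning `is_cptp(N, alpha)` over a grid of `alpha` values reproduces the window between $-N/(N-1)$ and $0$ for any chosen $N$.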

1.2 Coherence theory
In any resource theory, one should first introduce the sets of free states and free operations. Given a Hilbert space $\mathcal{H}$ of dimension $N$, we denote by $\mathcal{B}(\mathcal{H})$ the set of bounded operators acting on $\mathcal{H}$. The set of free states in the resource theory of coherence, denoted as $\mathcal{I}$, comprises the so-called incoherent states, that is, all the states $\delta \in \mathcal{B}(\mathcal{H})$ that are diagonal in a fixed basis $\{|i\rangle\}_{i=0}^{N-1}$,
$$\delta = \sum_{i=0}^{N-1} \delta_i\,|i\rangle\langle i| .$$
Free operations are the CPTP maps, $\mathcal{E}$, that leave incoherent states incoherent, i.e., $\mathcal{E}(\mathcal{I}) \subset \mathcal{I}$. Stated differently, $\mathcal{E}$ fulfills $\Delta \circ \mathcal{E} \circ \Delta = \mathcal{E} \circ \Delta$, where $\circ$ denotes the composition of two maps and $\Delta$ is the complete-dephasing map in the chosen basis, i.e., $\Delta(\cdot) = \sum_i |i\rangle\langle i| \cdot |i\rangle\langle i|$ [13]. Operations satisfying the above relation are said to be non-coherence-generating, since they are unable to create coherence on any incoherent state. In contrast to what happens in the resource theories of asymmetry, athermality or entanglement [14], in coherence theory the set of free operations is not unique. This can be grasped by looking at the Kraus structure of the corresponding CPTP maps, $\mathcal{E}(\cdot) = \sum_\alpha K_\alpha \cdot K_\alpha^\dagger$. The largest class of non-coherence-generating operations are the maximally incoherent operations (MIOs), whose Kraus operators, $\{K_\alpha\}$, fulfill $\sum_\alpha K_\alpha \mathcal{I} K_\alpha^\dagger \subset \mathcal{I}$ [15]. A subset of MIOs are the incoherent operations (IOs) [16], consisting of all MIOs whose Kraus operators satisfy the relation $K_\alpha \mathcal{I} K_\alpha^\dagger \subset \mathcal{I}$ for all $\alpha$. Inside the set of IOs we find the strictly incoherent operations (SIOs) [17], for which the Kraus operators further fulfill $K_\alpha^\dagger \mathcal{I} K_\alpha \subset \mathcal{I}$ for all $\alpha$. Finally, genuinely incoherent operations (GIOs) [18] are SIOs preserving every incoherent state, i.e., $\mathcal{E}_{\rm GIO}(\delta) = \delta$ for all $\delta \in \mathcal{I}$. As a consequence, GIOs have diagonal Kraus operators.
Besides free states and free operations, one should also introduce a proper measure of the resource considered. To quantify the amount of coherence present in an arbitrary state $\rho \in \mathcal{B}(\mathcal{H})$, a coherence measure [16] must be defined as a functional $C : \mathcal{B}(\mathcal{H}) \to \mathbb{R}_{\geq 0}$ satisfying two main conditions: (i) faithfulness, meaning that $C(\delta) = 0$ for all incoherent $\delta \in \mathcal{I}$, and (ii) monotonicity, i.e., $C(\rho) \geq C(\mathcal{E}(\rho))$ for all non-coherence-generating operations $\mathcal{E}$. Among the most typical coherence measures we find the robustness of coherence [19], the relative entropy of coherence, i.e.,
$$C_{\rm r.e.}(\rho) = S(\Delta(\rho)) - S(\rho) ,$$
where $S(\rho) = -\mathrm{Tr}(\rho \log \rho)$ is the von Neumann entropy [16], and the $l_1$-coherence measure, which is a valid measure under IOs, but not MIOs [20]. Finally, every coherence measure achieves its maximum value on the set of maximally coherent states ($S_{\rm MCS}$), defined, in dimension $N$, as
$$S_{\rm MCS} := \Big\{ |\Psi_N\rangle = \frac{1}{\sqrt{N}} \sum_{\mu=0}^{N-1} e^{i\theta_\mu} |\mu\rangle ,\ \theta_\mu \in [0, 2\pi) \Big\} .$$

2 Error-free aQNNs

2.1 aQNNs as non-coherence-generating operations
A direct inspection of Eq. (1) shows that, in the basis $\{|\mu\rangle\}_{\mu=0}^{N-1}$, the condition $\Delta \circ \Lambda \circ \Delta = \Lambda \circ \Delta$ holds, implying that aQNNs are not able to generate coherence on any input state. In particular, since $\Lambda(\delta) = \delta$ for all $\delta \in \mathcal{I}$, it follows that the set of attractors of the aQNN is equivalent to the set of incoherent states $\mathcal{I}$, and that:
Remark 1. aQNNs are described by GIOs.
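The two defining properties used above can be checked on a small example. The following sketch assumes the uniform-$\alpha$ decohering map (off-diagonals multiplied by $1+\alpha$) and the standard definition of the $l_1$-coherence as the sum of the absolute values of the off-diagonal entries; all function and variable names are illustrative.

```python
def dephase(rho):
    """Complete dephasing map Delta: keep only the diagonal entries."""
    n = len(rho)
    return [[rho[m][k] if m == k else 0.0 for k in range(n)] for m in range(n)]


def apply_aqnn(rho, alpha):
    """Decohering map: diagonal untouched, off-diagonals scaled by (1 + alpha)."""
    n = len(rho)
    return [[rho[m][k] * (1 if m == k else 1 + alpha) for k in range(n)]
            for m in range(n)]


def l1_coherence(rho):
    """l1-coherence: sum of |rho_mn| over all off-diagonal entries."""
    n = len(rho)
    return sum(abs(rho[m][k]) for m in range(n) for k in range(n) if m != k)


# Pure qubit state with amplitudes (0.6, 0.8i).
psi = [complex(0.6), 0.8j]
rho = [[a * b.conjugate() for b in psi] for a in psi]
alpha = -0.5

# Non-coherence-generating condition: Delta o Lambda o Delta == Lambda o Delta.
lhs = dephase(apply_aqnn(dephase(rho), alpha))
rhs = apply_aqnn(dephase(rho), alpha)
is_non_coherence_generating = lhs == rhs

# GIO property: every incoherent state is left unchanged.
preserves_incoherent = apply_aqnn(dephase(rho), alpha) == dephase(rho)

coherence_before = l1_coherence(rho)
coherence_after = l1_coherence(apply_aqnn(rho, alpha))
```

In this example the coherence is halved by one application of the map, in line with the monotonicity requirement.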
As stated in the introduction, this observation justifies addressing aQNNs from a coherence-theoretic perspective. Moreover, analogously to the case of aNNs, aQNNs are bona fide models for associative memory. Indeed, in the asymptotic limit, they are able to retrieve the stored attractor which is closest to the input state in terms of their relative entropy. We show this fact in the following lemma:
Lemma 2. After $r \to \infty$ iterations, an aQNN outputs the stored attractor that minimizes the relative entropy with respect to the input state $\rho$, i.e.,
$$S\big(\rho \,\|\, \lim_{r\to\infty} \Lambda^r(\rho)\big) = \min_{\delta \in \mathcal{I}} S(\rho \| \delta) ,$$
where $S(\rho\|\sigma) = \mathrm{Tr}(\rho \log \rho) - \mathrm{Tr}(\rho \log \sigma)$ is the quantum relative entropy of $\rho$ with respect to $\sigma$. Equivalently, $C_{\rm r.e.}(\rho)$ quantifies the minimum relative entropy between $\rho$ and the set of the attractors of the aQNN.
Proof. From Eq. (1) one notices that applying $\Lambda$ a sufficient number of times on an input state $\rho$ results in a complete dephasing of $\rho$, i.e., $\lim_{r\to\infty} \Lambda^r(\rho) = \Delta(\rho)$. Now, let us write the relative entropy between $\rho$ and an incoherent state $\delta$ as $S(\rho\|\delta) = S(\Delta(\rho)) - S(\rho) + S(\Delta(\rho)\|\delta)$. Since $S(\Delta(\rho)\|\delta) \geq 0$, with equality iff $\delta = \Delta(\rho)$, it is immediate to see that
$$\min_{\delta \in \mathcal{I}} S(\rho\|\delta) = S(\Delta(\rho)) - S(\rho) = C_{\rm r.e.}(\rho) ,$$
that is, the minimum relative entropy between an input state $\rho$ and the set of incoherent states (or attractors) is achieved on $\Delta(\rho)$, i.e., the state retrieved by the aQNN after a sufficient number of applications. As proven above, such minimum distance between the input state and the retrieved attractor is quantified by the relative entropy of coherence of the input.
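Lemma 2 can be illustrated numerically without any matrix logarithm. The sketch below assumes the uniform-$\alpha$ decohering map and uses the decomposition in the proof: for a diagonal $\delta$, the excess $S(\rho\|\delta) - S(\rho\|\Delta(\rho))$ equals $S(\Delta(\rho)\|\delta)$, which is the classical Kullback-Leibler divergence between the diagonal of $\rho$ and the diagonal of $\delta$. The state, the number of iterations and the random candidates are arbitrary illustrative choices.

```python
import math
import random


def apply_aqnn(rho, alpha):
    """Decohering map: diagonal untouched, off-diagonals scaled by (1 + alpha)."""
    n = len(rho)
    return [[rho[m][k] * (1 if m == k else 1 + alpha) for k in range(n)]
            for m in range(n)]


def excess_relative_entropy(rho, delta_diag):
    """S(rho||delta) - S(rho||Delta(rho)) for diagonal delta (a classical KL divergence)."""
    return sum(rho[m][m] * math.log(rho[m][m] / delta_diag[m])
               for m in range(len(rho)) if rho[m][m] > 0)


# Real pure qutrit state with populations (0.5, 0.3, 0.2).
psi = [math.sqrt(0.5), math.sqrt(0.3), math.sqrt(0.2)]
rho = [[a * b for b in psi] for a in psi]

# Repeated application of the map drives rho towards its dephased version.
state = rho
for _ in range(60):
    state = apply_aqnn(state, alpha=-0.5)
max_offdiag = max(abs(state[m][k]) for m in range(3) for k in range(3) if m != k)

# No randomly drawn incoherent state is closer to rho than Delta(rho).
random.seed(0)
excesses = []
for _ in range(200):
    w = [random.random() + 1e-3 for _ in range(3)]
    total = sum(w)
    excesses.append(excess_relative_entropy(rho, [x / total for x in w]))
min_excess = min(excesses)
excess_at_dephased = excess_relative_entropy(rho, [rho[m][m] for m in range(3)])
```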

2.2 Physical realization of aQNNs
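Physical operations on a system can always be understood as unitary dynamics and projective measurements on a larger system. Indeed, given a quantum channel $\mathcal{E} : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$, there always exists an ancillary Hilbert space $A$ of arbitrary dimension and a unitary operation $U \in \mathcal{B}(\mathcal{H} \otimes A)$ such that
$$\mathcal{E}(\rho) = \mathrm{Tr}_A\big[ U (\rho \otimes |a_0\rangle\langle a_0|) U^\dagger \big]$$
for any $\rho \in \mathcal{B}(\mathcal{H})$, where $\mathrm{Tr}_A$ denotes the partial trace on the subsystem $A$ and $|a_0\rangle\langle a_0|$ is the initial state of the ancilla. The corresponding unitary $U$ is known as the Stinespring dilation of the map $\mathcal{E}$ [22]. Thus, aQNNs can be physically realized by appending an ancillary qudit to the network qudits, letting the composite system evolve under the corresponding Stinespring dilation, and finally discarding the ancilla. Knowing that aQNNs are associated to GIOs allows us to derive the Stinespring dilation of the former in a straightforward way:
The Stinespring dilation of an $N$-dimensional aQNN is given by
$$U_{\rm aQNN} = \sum_{\mu=0}^{N-1} |\mu\rangle\langle\mu| \otimes U_\mu ,$$
where $\{|\mu\rangle\}_{\mu=0}^{N-1}$ is an orthonormal basis and $U_\mu$ is a unitary operator such that
$$\langle a_0| U_\nu^\dagger U_\mu |a_0\rangle = 1 + \alpha_{\mu\nu} .$$
Proof. Let $\{|\mu\rangle \otimes |a_i\rangle\}$ be an orthonormal basis of the composite Hilbert space $\mathcal{H} \otimes A$. In [23] it was proven that the action of the Stinespring dilation of a GIO can be expressed as
$$U (|\mu\rangle \otimes |a_0\rangle) = |\mu\rangle \otimes |c_\mu\rangle ,$$
where $|c_\mu\rangle = \sum_i c_\mu^{(i)} |a_i\rangle$ and $\{|c_\mu\rangle\}$ is a set of normalized but not necessarily orthogonal states. Expressing the state $\rho$ in the basis $\{|\mu\rangle\}_{\mu=0}^{N-1}$, i.e., $\rho = \sum_{\mu\nu} \rho_{\mu\nu} |\mu\rangle\langle\nu|$, and making use of the expression above, we find that the channel takes the form
$$\mathcal{E}_{\rm GIO}(\rho) = \sum_{\mu\nu} \rho_{\mu\nu}\, \langle c_\nu | c_\mu \rangle\, |\mu\rangle\langle\nu| .$$
Let us observe that, due to the normalization of the states $|c_\mu\rangle$, we have $\langle c_\mu | c_\mu \rangle = 1$ and $|\langle c_\nu | c_\mu \rangle| \leq 1$, so that $\mathcal{E}_{\rm GIO}(\rho) = \rho$ for any diagonal state $\rho$, and the action of $\mathcal{E}_{\rm GIO}$ does not increase the absolute value of the off-diagonal elements. A direct comparison between this expression and the map of Eq. (1) shows that the two maps are equivalent if
$$\langle c_\nu | c_\mu \rangle = 1 + \alpha_{\mu\nu} , \qquad |c_\mu\rangle = U_\mu |a_0\rangle ,$$
which completes the proof.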
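A concrete ancilla construction makes the dilation tangible. The sketch below is one possible choice, not taken from the source: it assumes uniform real $\alpha \in [-1, 0]$, an ancilla of dimension $N+1$, and ancilla states $|c_\mu\rangle = \sqrt{1+\alpha}\,|a_0\rangle + \sqrt{-\alpha}\,|a_{\mu+1}\rangle$, which are normalized and have overlaps $\langle c_\nu|c_\mu\rangle = 1+\alpha$ for $\mu \neq \nu$. The code applies the isometry $|\mu\rangle|a_0\rangle \mapsto |\mu\rangle|c_\mu\rangle$ to a pure input, traces out the ancilla, and compares the result with the decohering map applied directly.

```python
import math


def ancilla_states(n, alpha):
    """Vectors |c_mu> with <c_nu|c_mu> = 1 + alpha for mu != nu (requires -1 <= alpha <= 0)."""
    states = []
    for mu in range(n):
        c = [0.0] * (n + 1)
        c[0] = math.sqrt(1 + alpha)
        c[mu + 1] = math.sqrt(-alpha)
        states.append(c)
    return states


def dilated_channel(psi, alpha):
    """Attach the ancilla, map |mu>|a_0> to |mu>|c_mu>, then trace out the ancilla."""
    n = len(psi)
    c = ancilla_states(n, alpha)
    # joint[mu][a]: amplitude of |mu>|a> after the dilation acts on |psi>|a_0>
    joint = [[psi[mu] * c[mu][a] for a in range(n + 1)] for mu in range(n)]
    return [[sum(joint[m][a] * joint[k][a].conjugate() for a in range(n + 1))
             for k in range(n)] for m in range(n)]


def apply_aqnn(rho, alpha):
    """Decohering map: diagonal untouched, off-diagonals scaled by (1 + alpha)."""
    n = len(rho)
    return [[rho[m][k] * (1 if m == k else 1 + alpha) for k in range(n)]
            for m in range(n)]


psi = [complex(0.6), 0.48j, complex(0.64)]  # normalized: 0.36 + 0.2304 + 0.4096 = 1
rho = [[a * b.conjugate() for b in psi] for a in psi]
alpha = -0.3

via_dilation = dilated_channel(psi, alpha)
direct = apply_aqnn(rho, alpha)
max_diff = max(abs(via_dilation[m][k] - direct[m][k])
               for m in range(3) for k in range(3))
```

The restriction to $\alpha \geq -1$ is a limitation of this particular ancilla choice, which only produces non-negative overlaps.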

2.3 Depth of aQNNs and decohering power
Consider the simple case of a maximally coherent qubit $|\Psi_2\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$ suffering decoherence under the action of an aQNN, i.e.,
$$\Lambda(\Psi_2) = \frac{1}{2}\big[ |0\rangle\langle 0| + |1\rangle\langle 1| + (1+\alpha_{01}) |0\rangle\langle 1| + (1+\alpha_{10}) |1\rangle\langle 0| \big] ,$$
where, from now on, we use the notation $\Psi := |\Psi\rangle\langle\Psi|$. From here it is easy to see that an aQNN with a smaller value of $|1+\alpha_{01}|$ needs to be applied fewer times on a state in order to completely destroy its coherences. To quantify the ability of operations to cause decoherence, the notion of decohering power is invoked. The decohering power of a map $\mathcal{E} : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$, $\dim(\mathcal{H}) = N$, with respect to some coherence measure $C$ was introduced in [24]:
$$D_C(\mathcal{E}) := \max_{\rho \in S_{\rm MCS}} \big[ C(\rho) - C(\mathcal{E}(\rho)) \big] .$$
When considering the $l_1$-coherence measure we immediately find:
The $l_1$-decohering power of an $N$-dimensional aQNN described by the CPTP map $\Lambda$ is given by
$$D_{C_{l_1}}(\Lambda) = N - 1 - \frac{1}{N} \sum_{\mu \neq \nu} |1 + \alpha_{\mu\nu}| .$$
We define the depth of an aQNN as the minimum number of times, $r$, that the map $\Lambda$ has to be applied on a state until it becomes stationary up to some tolerable error $\eta$, that is, until the classification process is accomplished with sufficient accuracy (see Fig. 1a). At that moment, the coherence of the input is small, i.e., $C_{l_1}(\Lambda^r(\rho)) = \eta$, with $0 < \eta \ll 1$.
Note that the lower bound for $r$ is tight, since in this case $C_{l_1}(\Lambda^r(\Psi_N))$ and $D_{C_{l_1}}(\Lambda)$ are exactly related. Fig. 2 shows the minimum number of layers that a 100-dimensional aQNN of this kind needs to have in order for stationarity to be achieved within an error $\eta = 0.01$. Moreover, it illustrates how the depth of an aQNN decreases with its decohering power. Turning to generic aQNNs, however, it is not possible to find a tight lower bound for the depth, since under non-uniform decoherence the main quantities $C_{l_1}(\Lambda^r(\Psi_N))$ and $D_{C_{l_1}}(\Lambda)$ are not equivalent.
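The relation between depth and decohering power can be made explicit in the uniform case. The following sketch rests on stated assumptions rather than on the source's own formulas: uniform real $\alpha$, a maximally coherent input $\Psi_N$ with $C_{l_1}(\Psi_N) = N-1$, so that $C_{l_1}(\Lambda^r(\Psi_N)) = (N-1)|1+\alpha|^r$, and the depth taken as the smallest integer $r$ for which this quantity does not exceed $\eta$. Under the same assumptions the $l_1$-decohering power evaluated on $\Psi_N$ is $(N-1)(1-|1+\alpha|)$. Function names are illustrative.

```python
import math


def l1_decohering_power_uniform(n, alpha):
    """Coherence lost by the maximally coherent state in one step: (N-1)(1 - |1+alpha|)."""
    return (n - 1) * (1 - abs(1 + alpha))


def depth_uniform(n, alpha, eta):
    """Smallest integer r such that (N - 1) * |1 + alpha|**r <= eta."""
    q = abs(1 + alpha)
    if q >= 1:
        raise ValueError("the map does not decohere: |1 + alpha| must be below 1")
    if q == 0:
        return 1  # complete dephasing after a single application
    return max(0, math.ceil(math.log(eta / (n - 1)) / math.log(q)))


# 100-dimensional network, tolerance eta = 0.01, two decoherence strengths.
depth_weak = depth_uniform(100, -0.5, 0.01)    # |1 + alpha| = 0.5
depth_strong = depth_uniform(100, -0.9, 0.01)  # |1 + alpha| = 0.1
power_weak = l1_decohering_power_uniform(100, -0.5)
power_strong = l1_decohering_power_uniform(100, -0.9)
```

The stronger decohering map reaches the tolerance in fewer layers, which is the qualitative trend the text attributes to Fig. 2.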

2.4 Performance of aQNNs cannot be enhanced either by coherence or entanglement
In both quantum and classical neural computing two main stages are distinguished: the training phase and the inference phase. Throughout this work, we have considered already trained aQNNs and we have investigated their properties during the inference phase. In this section, we analyse whether the performance of an aQNN can be improved at the inference stage itself. As is common in the literature on neural computing, enhancing the performance of a neural network implies: i) increasing its accuracy, and ii) accelerating the inference process, that is, reducing the depth of the network (as defined in Section 2.3). Here we want to investigate whether the performance of an aQNN can be improved by resorting exclusively to quantum resources. To that aim, one can begin by implementing some channel $\mathcal{N}_i$ on a given layer $i$ capable of mitigating some of the errors that occurred in previous layers, which results in an increased overall accuracy, and/or of reducing the number of layers left until the inference process is accomplished, i.e., decreasing the overall $r$. We consider a scenario where an aQNN of arbitrary dimension is coupled to a coherent ancilla. The procedure would be as follows: i) append a coherent ancilla $\omega_i \in \mathcal{B}(A)$ to the input state $\rho_i \in \mathcal{B}(\mathcal{H})$, ii) apply $\Lambda$ on the composite system, and iii) discard the ancillary state (see Fig. 1b). Formally, we can express this process as
$$\mathcal{N}_i(\rho_i) = \mathrm{Tr}_A\big[ \Lambda(\rho_i \otimes \omega_i) \big] .$$
As discussed in [25], a non-coherence-generating operation $\mathcal{M}$ is able to realize a coherent channel in this way only if it can activate coherence, i.e., if it fulfills $\Delta \circ \mathcal{M} \circ \Delta \neq \Delta \circ \mathcal{M}$. Noting that GIOs violate this condition [13], the following no-go result holds:
Proposition 6. Coherence cannot be used to enhance the performance of aQNNs.
Since GIOs are unable to exploit the coherence of $\omega_i$ to help implement $\mathcal{N}_i(\rho_i)$ (unlike MIOs [26] or IOs), aQNNs cannot use coherence to boost their own performance.
Another strategy to reduce the depth of an aQNN by increasing its decohering power relies on the exploitation of initial correlations [27]. Consider an input state $\rho_i \in \mathcal{B}(\mathcal{H})$, purified by the entangled state $\psi_i \in \mathcal{B}(\mathcal{H} \otimes A)$, i.e., $\mathrm{Tr}_A(\psi_i) = \rho_i$. The question is whether using such an entangled input state causes a stronger decoherence in the output, thus reducing the number of times that the map $\Lambda : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$ has to be applied before the classification task is completed. Stated differently, we want to investigate whether $C(\mathrm{Tr}_A\{(\Lambda \otimes \mathrm{id})(\psi_i)\}) < C(\Lambda(\rho_i))$ (see Fig. 1c). We hereby show this is not possible:
Proposition 7. Initial entanglement cannot be used to reduce the depth of aQNNs.
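Proof. Consider a generally mixed input state $\rho = \sum_k p_k |\phi_k\rangle\langle\phi_k|$, where $p_k \in [0, 1]$ and the states $|\phi_k\rangle$ are not necessarily orthonormal. A purification of $\rho$ is given by $|\psi\rangle = \sum_k \sqrt{p_k}\, |\phi_k\rangle |k\rangle$, where $\{|k\rangle\}$ is an orthonormal basis of $A$. Expressing $\psi$ in this basis, i.e., $\psi = \sum_{k,l} \sqrt{p_k p_l}\, |\phi_k\rangle\langle\phi_l| \otimes |k\rangle\langle l|$, we find
$$\mathrm{Tr}_A\{(\Lambda \otimes \mathrm{id})(\psi)\} = \sum_k p_k\, \Lambda(|\phi_k\rangle\langle\phi_k|) = \Lambda(\rho) .$$
Therefore, $C(\Lambda(\rho)) = C(\mathrm{Tr}_A\{(\Lambda \otimes \mathrm{id})(\psi)\})$ and initial correlations cannot produce a faster decoherence in the input state $\rho$.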
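The identity at the heart of this proof can be verified entry by entry on a toy example. The sketch below assumes the uniform-$\alpha$ decohering map and a two-term ensemble of real qutrit states chosen purely for illustration; it applies $\Lambda \otimes \mathrm{id}$ to the purification, traces out the ancilla, and compares the result with $\Lambda$ applied to the mixed state directly.

```python
import math


def apply_aqnn(rho, alpha):
    """Decohering map: diagonal untouched, off-diagonals scaled by (1 + alpha)."""
    n = len(rho)
    return [[rho[m][k] * (1 if m == k else 1 + alpha) for k in range(n)]
            for m in range(n)]


alpha = -0.4
n, K = 3, 2
p = [0.7, 0.3]
# Two normalized, non-orthogonal real states (illustrative choice).
phi = [[math.sqrt(0.5), math.sqrt(0.5), 0.0], [0.0, 0.6, 0.8]]

# Mixed input state rho = sum_k p_k |phi_k><phi_k|.
rho = [[sum(p[k] * phi[k][m] * phi[k][j] for k in range(K)) for j in range(n)]
       for m in range(n)]

# Purification amplitudes: |psi> = sum_k sqrt(p_k) |phi_k>|k>, stored as amp[m][k].
amp = [[math.sqrt(p[k]) * phi[k][m] for k in range(K)] for m in range(n)]


def factor(m, j):
    """Multiplier that the map applies to the system entry (m, j)."""
    return 1.0 if m == j else 1.0 + alpha


# Entry (m,k ; j,l) of (Lambda x id)(|psi><psi|) is factor(m,j) * amp[m][k] * amp[j][l];
# tracing out the ancilla keeps the terms with k == l.
reduced = [[sum(factor(m, j) * amp[m][k] * amp[j][k] for k in range(K))
            for j in range(n)] for m in range(n)]

direct = apply_aqnn(rho, alpha)
max_diff = max(abs(reduced[m][j] - direct[m][j]) for m in range(n) for j in range(n))
```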

3 Faulty aQNNs
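We now examine the realistic scenario of non-error-free aQNNs, that is, the case where some error in the implementation of the network is taken into account. In particular, we call faulty an aQNN whose associated map, $\Lambda_\epsilon$, preserves the stationary states up to a certain error $\epsilon \in [0, 1]$. According to this definition, it is clear that there exist many maps satisfying the above requirement. In what follows we consider one, denoted as $\Lambda_{\epsilon,\gamma}$, whose action over a generic quantum state $\rho$ can be written as
$$\Lambda_{\epsilon,\gamma}(\rho) = \sum_{\mu} \rho_{\mu\mu} \Big[ (1-\epsilon)\, |\mu\rangle\langle\mu| + \frac{\epsilon}{N-1} \sum_{\nu \neq \mu} |\nu\rangle\langle\nu| \Big] + \sum_{\mu \neq \nu} (1+\alpha_{\mu\nu})\, \rho_{\mu\nu}\, |\mu\rangle\langle\nu| + \sum_{\mu < \nu} \big[ \gamma\, \rho_{\mu\nu}\, |\nu\rangle\langle\mu| + \mathrm{h.c.} \big] , \tag{17}$$
with $\epsilon \in [0, 1]$ and $\gamma \in \mathbb{C}$. Notice that Eq. (17) corresponds to a faulty map where $\epsilon$ represents the error on achieving the stationary states and $\gamma$ is a damping factor in the off-diagonal terms. We define a faulty $(\epsilon, \gamma)$-aQNN as that associated to the map $\Lambda_{\epsilon,\gamma}$ of Eq. (17), whose corresponding Choi state is
$$J_{\Lambda_{\epsilon,\gamma}} = \sum_{\mu} \Big[ (1-\epsilon)\, |\mu\mu\rangle\langle\mu\mu| + \frac{\epsilon}{N-1} \sum_{\nu \neq \mu} |\mu\nu\rangle\langle\mu\nu| \Big] + \sum_{\mu \neq \nu} (1+\alpha_{\mu\nu})\, |\mu\mu\rangle\langle\nu\nu| + \sum_{\mu < \nu} \big[ \gamma\, |\mu\nu\rangle\langle\nu\mu| + \bar{\gamma}\, |\nu\mu\rangle\langle\mu\nu| \big] . \tag{18}$$
Notice that Eq. (18) can be cast as a direct sum, i.e.,
$$J_{\Lambda_{\epsilon,\gamma}} = J_0 \oplus \bigoplus_{\mu < \nu} J_{\mu\nu} , \qquad J_{\mu\nu} = \frac{\epsilon}{N-1} \big( |\mu\nu\rangle\langle\mu\nu| + |\nu\mu\rangle\langle\nu\mu| \big) + \gamma\, |\mu\nu\rangle\langle\nu\mu| + \bar{\gamma}\, |\nu\mu\rangle\langle\mu\nu| ,$$
where $J_0$ collects the terms supported on the vectors $|\mu\mu\rangle$, the bar symbol denotes complex conjugation, and the $2 \times 2$ matrices $J_{\mu\nu}$ appear $N(N-1)/2$ times, once for each pair $\mu < \nu$. Following the same arguments of Section 1.1, $\Lambda_{\epsilon,\gamma}$ is a CPTP map iff $J_{\Lambda_{\epsilon,\gamma}} \geq 0$. Choosing $|\gamma| \in [0, \epsilon/(N-1)]$ guarantees that $J_{\mu\nu} \geq 0$ for every $\mu \neq \nu$, but finding analytical conditions on the parameters such that $J_{\Lambda_{\epsilon,\gamma}} \geq 0$ is, in general, a cumbersome task. Nevertheless, recalling that $\alpha_{\mu\nu} = \alpha_{\nu\mu} = \alpha \in \mathbb{R}$ for every $\mu \neq \nu$, $\Lambda_{\epsilon,\gamma}$ is a CPTP map whenever $\alpha \in [(\epsilon - N)/(N-1), -\epsilon]$ and $|\gamma| \in [0, \epsilon/(N-1)]$.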
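The map $\Lambda_{\epsilon,\gamma}$ is a strictly incoherent operation (SIO), as we now show.
Proof. First, we derive the Kraus operators, defined as $K_i = \sqrt{\lambda^{(i)}}\, \mathrm{mat}(\lambda^{(i)})$, where $\lambda^{(i)}$ is an eigenvalue of the Choi state and $\mathrm{mat}(\lambda^{(i)})$ the row-by-row matrix representation of the corresponding eigenvector $|\lambda^{(i)}\rangle$. The diagonalization of $J_{\Lambda_{\epsilon,\gamma}}$ can be made simpler thanks to the direct sum decomposition of Eqs. (19)-(20). Notice that the eigenvectors of the first block of the decomposition are supported only on the basis elements of the form $|\mu\mu\rangle$. Hence, when converting the extended eigenvector into a matrix, it is immediate to find that this operation always yields a diagonal Kraus operator, regardless of the particular eigenvector considered. Let us now inspect the eigenvectors of the operator $J_{\mu\nu}$ of Eq. (19). First notice that, for any $0 \leq \mu < \nu \leq N-1$, $J_{\mu\nu}$ can be written in the basis $\{|\mu\nu\rangle, |\nu\mu\rangle\}$ as
$$J_{\mu\nu} = \begin{pmatrix} \frac{\epsilon}{N-1} & \gamma \\ \bar{\gamma} & \frac{\epsilon}{N-1} \end{pmatrix} .$$
The diagonalization of $J_{\mu\nu}$ yields a pair of eigenvectors of the form $|\lambda^{(\pm)}_{J_{\mu\nu}}\rangle = \big( \pm(\lambda_{J_{\mu\nu}})_0 , (\lambda_{J_{\mu\nu}})_1 \big)^T$. However, differently from the previous case, when extending these vectors to dimension $N^2$, we need to add $N^2 - 2$ zeroes whose position will vary according to the specific matrix $J_{\mu\nu}$ considered. It is easily found that the zeroes of the extended eigenvector correspond to the elements of the basis of the form $|\mu'\nu'\rangle$ with $(\mu', \nu') \neq (\mu, \nu), (\nu, \mu)$. Thus, the Kraus operators are given by $K_{\mu\nu} = \kappa^{(1)}_{\mu\nu} |\mu\rangle\langle\nu| + \kappa^{(2)}_{\mu\nu} |\nu\rangle\langle\mu|$, for some $\kappa^{(1)}_{\mu\nu}, \kappa^{(2)}_{\mu\nu} \in \mathbb{C}$. For every incoherent state $\delta$ it holds
$$K_{\mu\nu}\, \delta\, K_{\mu\nu}^\dagger = |\kappa^{(1)}_{\mu\nu}|^2\, \delta_{\nu\nu}\, |\mu\rangle\langle\mu| + |\kappa^{(2)}_{\mu\nu}|^2\, \delta_{\mu\mu}\, |\nu\rangle\langle\nu| ,$$
and $K_{\mu\nu}^\dagger \delta K_{\mu\nu}$ is obtained by relabelling $\mu \leftrightarrow \nu$. Hence, for every $\delta \in \mathcal{I}$, it holds that $K_{\mu\nu} \delta K_{\mu\nu}^\dagger \in \mathcal{I}$ and $K_{\mu\nu}^\dagger \delta K_{\mu\nu} \in \mathcal{I}$, which completes the proof.
Further, we provide the expression of the distance between the two quantum channels $\Lambda$ and $\Lambda_{\epsilon,\gamma}$. In order to do so, we introduce the diamond distance, denoted by $D_\diamond$, which is formally defined, for any pair of CPTP maps, as [28,29]
$$D_\diamond(\mathcal{E}, \mathcal{F}) = \frac{1}{2} \max_{\rho_{AB}} \big\| (\mathcal{E} \otimes \mathrm{id}_B)(\rho_{AB}) - (\mathcal{F} \otimes \mathrm{id}_B)(\rho_{AB}) \big\|_1 ,$$
where $\rho_{AB} \in \mathcal{B}(\mathcal{H}_A \otimes \mathcal{H}_B)$ and $\|X\|_1 = \mathrm{Tr}\sqrt{X X^\dagger}$ is the usual trace norm. Operationally, the diamond distance quantifies how well one can discriminate between two quantum maps. Indeed, it is possible to show that $\mathcal{E}$ and $\mathcal{F}$ become perfectly distinguishable whenever $D_\diamond(\mathcal{E}, \mathcal{F}) = 1$ [30].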
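The computation of the diamond distance between two CPTP maps can be cast as a semidefinite program (SDP) which admits a simple formulation in terms of their Choi states [26], i.e.,
$$D_\diamond(\mathcal{E}, \mathcal{F}) = \min \big\{ \lambda \ :\ Z \geq J_{\mathcal{E}} - J_{\mathcal{F}},\ Z \geq 0,\ \lambda \mathbb{1}_A \geq \mathrm{Tr}_B(Z) \big\} . \tag{22}$$
Taking into account $\Lambda$ and $\Lambda_{\epsilon,\gamma}$, the solution of the above SDP does not admit, in general, a simple analytical expression. However, under suitable conditions, we prove the following result:
Proposition 9. Let $\alpha_{\mu\nu} = \alpha_{\nu\mu} \equiv \alpha \in \mathbb{R}$ for all $\mu \neq \nu$ and $\gamma = 0$. Then, the diamond distance between $\Lambda$ and $\Lambda_\epsilon$ is given by $D_\diamond(\Lambda, \Lambda_\epsilon) = \epsilon$.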
Proof. Notice that, since the difference between the Choi states yields a diagonal matrix, the SDP (22) can be solved by restricting to the diagonal matrices $Z = \mathrm{diag}(z_{00}, \ldots, z_{N-1,N-1})$ satisfying the constraints $Z \geq J_{\Lambda_{\epsilon,\gamma}} - J_\Lambda$ and $\lambda \mathbb{1}_A \geq \mathrm{Tr}_B(Z)$. The former condition is easily satisfied by choosing $z_{ii} = \epsilon/(N-1)$ whenever $(J_{\Lambda_{\epsilon,\gamma}} - J_\Lambda)_{ii} = \epsilon/(N-1)$ and $z_{ii} = 0$ elsewhere. With this choice, we find $\mathrm{Tr}_B(Z) = \epsilon \mathbb{1}_A$, so that the latter condition reduces to $\lambda \geq \epsilon$. Hence, the minimization over $\lambda$ yields $\epsilon$, which completes the proof.
Notice that, when restricting to the case of Proposition 9, for $\epsilon = 1$ it is possible to fully discriminate between $\Lambda$ and $\Lambda_\epsilon$. Moreover, we have numerically found that Proposition 9 holds true also when $\gamma \neq 0$, thus implying that the diamond distance is independent of the choice of $\gamma$.
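The value $\epsilon$ in Proposition 9 can be bracketed from both sides with elementary arithmetic. The sketch below relies on an assumed form of the faulty map for $\gamma = 0$, inferred from the constraints quoted in the text rather than copied from the source: a basis state $|\mu\rangle\langle\mu|$ keeps weight $1-\epsilon$ and leaks $\epsilon/(N-1)$ to each other basis state, while the off-diagonal action is that of the error-free map. Under this assumption the difference of Choi states is diagonal, with entry $-\epsilon$ on $|\mu\mu\rangle$ and $\epsilon/(N-1)$ on $|\mu\nu\rangle$, $\mu \neq \nu$. The upper bound evaluates the feasible point $Z$ used in the proof; the lower bound is half the trace norm of the output difference on the single input $|0\rangle\langle 0|$.

```python
def choi_diag_difference(n, eps):
    """Diagonal of J_faulty - J_ideal, indexed by (mu, nu), assuming gamma = 0."""
    return {(m, k): (-eps if m == k else eps / (n - 1))
            for m in range(n) for k in range(n)}


def upper_bound(n, eps):
    """Value of the SDP objective at the feasible diagonal Z = max(difference, 0)."""
    diff = choi_diag_difference(n, eps)
    z = {key: max(val, 0.0) for key, val in diff.items()}
    # Tr_B(Z) is diagonal with entries sum_nu z[(mu, nu)]; lambda must dominate all of them.
    return max(sum(z[(m, k)] for k in range(n)) for m in range(n))


def lower_bound(n, eps):
    """Half the trace norm of (faulty - ideal) applied to |0><0| (a diagonal matrix)."""
    out = [(-eps if k == 0 else eps / (n - 1)) for k in range(n)]
    return 0.5 * sum(abs(x) for x in out)


N, EPS = 4, 0.3
ub = upper_bound(N, EPS)
lb = lower_bound(N, EPS)
```

Since the two bounds coincide, the diamond distance for this assumed map equals $\epsilon$ without the need for a numerical SDP solver.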
Regarding the physical realization of $(\epsilon, \gamma)$-aQNNs, the following result holds:
Proposition 10. The Stinespring dilation of an $N$-dimensional $(\epsilon, \gamma)$-aQNN is given by where $\pi_k$ is a permutation function swapping two states $|\mu\rangle$ and $|\nu\rangle$ for all $\mu \neq \nu$, i.e., $|\pi_k(\mu)\rangle = |\nu\rangle$, $|\pi_k(\nu)\rangle = |\mu\rangle$, and {|c is a set of normalized states fulfilling
Proof. The Stinespring dilation of a SIO is given by [14] where $\pi_k$ is a permutation function labelled by the index $k$ and the coefficients {c In order to relate the above expression with the one of Eq. (17), we rewrite it as Let us now denote by $k = 0$ the identical permutation that leaves unchanged the elements of the chosen basis, i.e., $|\pi_0(\mu)\rangle = |\mu\rangle$ for all $\mu = 0, \ldots, N-1$. Hence, the first term of Eq. (27) can be cast as Comparing Eq. (28) with the diagonal terms in Eq. (17), we find To find the rest of the conditions let us rewrite the second term of Eq. (27) as A direct comparison between Eq. (29) and the off-diagonal terms of Eq. (17) shows that we need to impose some restrictions on the permutation function. Choosing $\pi_k$ to be a swap between any pair of orthogonal states, i.e., $|\pi_k(\mu)\rangle = |\nu\rangle$ and $|\pi_k(\nu)\rangle = |\mu\rangle$ with $\mu \neq \nu$, we find:
As a consequence of Proposition 10, an error-free aQNN may turn faulty if the unitary operator that physically implements it degrades from $U_{\rm aQNN}$ to $U_{(\epsilon,\gamma)\text{-aQNN}}$. We conclude this section by observing that SIOs are also non-coherence-activating operations [13], which results in the following no-go proposition:
Proposition 11. Coherence cannot be used to enhance the performance of $(\epsilon, \gamma)$-aQNNs.
In addition, entanglement cannot be exploited either to accelerate the inference process in this case:
Proposition 12. Initial entanglement cannot be used to reduce the depth of $(\epsilon, \gamma)$-aQNNs.
So far, we have considered the case when faulty aQNNs are described by SIOs, showing that neither coherence nor entanglement can be used to enhance the performance of the associated aQNN. Nevertheless, it is possible to show that, when other sources of error are considered, this is not necessarily the case. In particular, we consider the map $\Lambda_{\epsilon,\gamma,\lambda}$ defined as
$$\Lambda_{\epsilon,\gamma,\lambda}(\rho) = \Lambda_{\epsilon,\gamma}(\rho) + \sum_{\mu < \nu} \big[ \rho_{\mu\nu}\, \lambda\, |\mu+1\rangle\langle\nu+1| + \mathrm{h.c.} \big] ,$$
where $\lambda \in \mathbb{C}$ and $\Lambda_{\epsilon,\gamma}(\rho)$ is the map of Eq. (17). It can be checked numerically that $\Lambda_{\epsilon,\gamma,\lambda}$ corresponds to a MIO, but not to an IO, thus possibly allowing the use of coherence to enhance the performance of the related aQNN, as proven in [26].

4 Discussion
In this work we have shown the usefulness of coherence theory in the characterization of aQNNs of the Hopfield type. Such networks are always described by quantum channels that do not generate coherence. In the case of error-free aQNNs, the associated CPTP maps, Λ, correspond to genuinely incoherent operations (GIOs), and the network retrieves the stationary state (attractor) which minimizes its relative entropy with the input state.
Further, using the concept of decohering power of a channel, we have provided the analytical expression of the depth of the network, that is, the number of layers required to complete a classification task. For error-free aQNNs, coherence theory shows that neither coherence nor entanglement can act as catalysts to improve their performance. Finally, we have studied the effect of errors in these networks. We have found that the associated quantum channels, $\Lambda_{\epsilon,\gamma,\lambda}$, correspond either to strictly incoherent operations (SIOs), or to maximally incoherent operations (MIOs). While GIOs and SIOs do not allow the use of external resources to improve the performance of the related aQNNs, in the case of MIOs an enhancement could be obtained by using an external source of coherence. As such, we believe that our coherence-theoretic analysis of aQNNs represents a valuable tool to address the most relevant questions regarding the performance of quantum neural networks. This novel perspective should also be considered when inspecting other classes of more complex quantum neural networks, where this approach could bring key insights in the characterization of their performance.