A Converse for Fault-tolerant Quantum Computation

As techniques for fault-tolerant quantum computation keep improving, it is natural to ask: what is the fundamental lower bound on redundancy? In this paper, we obtain a lower bound on the redundancy required for $\epsilon$-accurate implementation of a large class of operations that includes unitary operators. For the practically relevant case of sub-exponential depth and sub-linear gate size, our bound on redundancy is tighter than the known lower bounds. We obtain this bound by connecting fault-tolerant computation with a set of finite blocklength quantum communication problems whose accuracy requirements satisfy a joint constraint. The lower bound on redundancy obtained here leads to a strictly smaller upper bound on the noise threshold for non-degradable noise. Our bound directly extends to the case where the noise at the outputs of a gate is non-i.i.d. but the noise across gates is i.i.d.


Introduction
The idea of a quantum computer was proposed in [3]. The computational advantage of a quantum computer over its classical counterpart was first shown mathematically in [5], followed by [6]. However, this initial excitement for quantum computers was confronted with a practical issue: noise in quantum circuits.
To tackle noise in quantum circuits, the area of quantum fault-tolerance emerged, thanks to the work in [18], followed by [19,1,14] and many others. The seminal works in [18,19,1,14] essentially showed that if the strength of the noise at the gates is below a threshold, almost accurate quantum computation can be realized. However, this is achieved at the cost of a poly-logarithmic increase in the size of the circuit with respect to the ideal circuit made of noiseless gates. In subsequent works, this poly-logarithmic space overhead has been improved. In [10,8], it was shown that a constant space overhead can be achieved if the noise strength is below a threshold. This naturally raises the question: what is the minimum space overhead requirement?
Relatively few works have addressed the question of the minimum space overhead requirement. However, as better and better codes are found, the interest in understanding the fundamental limit on space overhead is growing [17,12,9].
In this paper, we obtain a lower bound on the required space overhead for a broad class of noise models. In the practically likely regime of sub-exponential (in the number of input qubits) depth and sub-linear gate size [10,8], our bound is strictly better than the existing bounds in [11,17,12,9]. This bound is obtained by connecting the fault-tolerant computation problem to a set of finite blocklength communication problems whose accuracy requirements satisfy a joint constraint.

Related Work
Influenced by the gradual improvements in the space overhead of error-correcting schemes for quantum circuits [19,1,10,8], improved lower bounds on space overhead were sought.
Harrow and Nielsen obtained a threshold of 0.74 for depolarizing noise [11]. In [17], Razborov obtained an improved gate-size-dependent threshold $1 - \frac{1}{g}$, where $g$ is the gate size, i.e., the maximum number of inputs to a gate. Kempe et al. [12] improved this bound to $1 - \sqrt{2^{1/g} - 1}$ for mixtures of unitary gates of size $g$. Buhrman et al. [4] and Virmani et al. [20] showed classical simulability of noisy quantum circuits beyond a threshold under assumptions on special gate operations and noise. These results indirectly showed that, under certain assumptions, a quantum computer loses its edge over a classical computer if the noise exceeds a threshold.
Recently, Fawzi et al. [9] obtained a threshold for quantum computation in terms of the quantum capacity of a channel with the same noise. Their model includes arbitrary gate operations and allows free noiseless classical computation. Though their threshold does not depend on the gate size $g$, it is strictly better than the threshold in [12] for $g \ge 2$. Moreover, they provide a lower bound on the space overhead when the noise is below the threshold.
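To make the gate-size dependence of the earlier thresholds concrete, the following minimal sketch tabulates them for a few gate sizes. It assumes that the bound of [17] reads $1 - 1/g$ and that the bound of [12] reads $1 - \sqrt{2^{1/g} - 1}$; the function names are hypothetical and chosen only for this illustration.

```python
import math


def razborov_threshold(g: int) -> float:
    """Gate-size dependent threshold 1 - 1/g, as attributed to [17]."""
    return 1.0 - 1.0 / g


def kempe_threshold(g: int) -> float:
    """Threshold 1 - sqrt(2**(1/g) - 1), as the bound of [12] is read here (assumption)."""
    return 1.0 - math.sqrt(2.0 ** (1.0 / g) - 1.0)


if __name__ == "__main__":
    # A smaller upper bound on the noise threshold is the tighter one.
    for g in (2, 3, 4, 8):
        print(g, round(razborov_threshold(g), 4), round(kempe_threshold(g), 4))
```

Under these assumed formulas, both thresholds approach 1 as $g$ grows, and the second expression stays below the first for every $g \ge 2$.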

Our Contribution
The main result in this paper is a lower bound on the number of physical qubits needed for fault-tolerant quantum computation with a quantum state of $d$ (logical) qubits. The bound is obtained by first establishing a connection between fault-tolerant computing and a set of finite blocklength quantum communication problems [13] whose accuracy requirements share a joint constraint. Then, using results from finite blocklength quantum communication [13], we obtain a converse, which is then optimized to obtain the final lower bound. Our approach is inspired by the information-theoretic converse for classical noisy circuits in [16,7], but the techniques used are quite different.
In the likely practical scenario of sub-exponential circuit depth and constant gate size, our bound is tighter than the best known bound from [9]. Specifically, if the input has $d$ qubits and the gate noise is given by the quantum channel $\mathcal{N}$, then [9] shows that the number of physical qubits needed is at least $\frac{d}{Q(\mathcal{N})}$, where $Q(\mathcal{N})$ is the quantum capacity of the channel $\mathcal{N}$. We show that for sub-linear gate size, i.e., $g = o(d)$, the lower bound is $\frac{d\,g}{I_c(\mathcal{N}^{\otimes g})}$, where $I_c(\cdot)$ denotes the coherent information of the corresponding quantum channel. Since $Q(\mathcal{N}) = \sup_{k \ge 1} \frac{I_c(\mathcal{N}^{\otimes k})}{k} \ge \frac{I_c(\mathcal{N}^{\otimes g})}{g}$, our bound is tighter. An implication of our bound is an improved upper bound on the noise threshold for a broad class of noise and gate models. For noise that is not degradable, this upper bound is strictly lower than the known ones and depends on the gate size $g$, as in [17,12]. Interestingly, in the case where the noise at a gate is correlated but the noise across gates is i.i.d., bounds on space overhead and noise threshold can be obtained by replacing $I_c(\mathcal{N}^{\otimes g})$ with $I_c(\mathcal{N}^{(g)})$ in our bound. Here, $\mathcal{N}^{(g)}$ is an arbitrary non-i.i.d. noise at the output of a gate of size $g$.

Organization
In Sec. 2, the quantum computation model is presented. The main result is presented in Sec. 3, followed by a discussion of its implications. The proof of the main result is presented in Sec. 4, which draws on a few intermediate results whose proofs are presented in Appendices A and B. We conclude in Sec. 5 with a short discussion of interesting directions for further exploration.

Quantum Computation Model
We consider a fault-tolerant quantum circuit whose goal is to compute a function $f(\cdot)$ on a $d$-qubit input, i.e., a $2^d$-dimensional input. At the input to the circuit, there are a total of $N$ physical qubits on which quantum operations are performed sequentially in layers (Fig. 1). The initial input, $\rho^{(d)}$, is placed on $d$ physical qubits and the remaining $N - d$ physical qubits are ancillas, which can be used for error correction. Thus, the initial state of the $N$ physical qubits is given by $\rho^{(N)} = \rho^{(d)} \otimes \bigotimes_{a=1}^{N-d} \sigma_a$, where $\sigma_a$ are ancilla qubits.
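The structure of the initial state can be illustrated numerically. The sketch below builds $\rho^{(d)} \otimes \bigotimes_a \sigma_a$ with NumPy, under the assumption (made only for this example) that every ancilla is prepared in $|0\rangle\langle 0|$; the helper name `initial_state` is hypothetical.

```python
import numpy as np


def initial_state(rho_d, num_ancillas):
    """Return rho_d tensored with `num_ancillas` ancilla qubits.

    Assumption for illustration: each ancilla sigma_a is |0><0|.
    """
    zero = np.array([[1, 0], [0, 0]], dtype=complex)
    state = np.asarray(rho_d, dtype=complex)
    for _ in range(num_ancillas):
        state = np.kron(state, zero)  # append one ancilla qubit
    return state


if __name__ == "__main__":
    rho_d = np.eye(2, dtype=complex) / 2      # d = 1, maximally mixed input
    rho_N = initial_state(rho_d, num_ancillas=2)  # N = 3 physical qubits
    print(rho_N.shape, np.trace(rho_N).real)
```

The dimension grows as $2^N$, so such a direct construction is only feasible for very small $N$.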
A quantum circuit of depth $D$ has $D$ layers and performs $D$ operations on $\rho^{(N)}$ in a sequential manner. That is, given the quantum operations $O_l$ corresponding to layers $1 \le l \le D$, the output at the end of the final layer is $O_D \circ O_{D-1} \circ \cdots \circ O_1(\rho^{(N)})$. The operation $O_l$ in layer $l$ is realized using noisy quantum gates of size $g$. If a set of qubits is not processed by any gate at a particular layer, we model them as being processed by a dummy $g$-input identity gate. Thus, in the first layer, there are at least $\lceil \frac{N}{g} \rceil$ gates, including the dummy identity gates. As in [17,9], at each layer, introduction of fresh ancilla qubits and noise-free classical computations are allowed.
Following prior work [17,12,9], a noisy gate is modeled by a perfect gate followed by $g$ i.i.d. noisy quantum channels $\mathcal{N}$, given by $\mathcal{N}^{\otimes g}$. Noise across gates and layers is assumed to be i.i.d.
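The noisy-gate model above (a perfect gate followed by $\mathcal{N}^{\otimes g}$) can be sketched as follows. The example assumes single-qubit depolarizing noise with Kraus operators $\sqrt{1-p}\,I$ and $\sqrt{p/3}\,X, Y, Z$, and uses a CNOT as the ideal gate; both choices, and all function names, are illustrative assumptions rather than part of the model.

```python
import itertools
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def depolarizing_kraus(p):
    """Kraus operators of single-qubit depolarizing noise (assumed convention)."""
    return [np.sqrt(1 - p) * I2, np.sqrt(p / 3) * X,
            np.sqrt(p / 3) * Y, np.sqrt(p / 3) * Z]


def apply_iid_noise(rho, p, g):
    """Apply the g-fold product channel N^{(x)g} to a g-qubit state."""
    rho = np.asarray(rho, dtype=complex)
    out = np.zeros_like(rho)
    for ops in itertools.product(depolarizing_kraus(p), repeat=g):
        kraus = ops[0]
        for op in ops[1:]:
            kraus = np.kron(kraus, op)  # tensor-product Kraus operator
        out += kraus @ rho @ kraus.conj().T
    return out


def noisy_gate(rho, unitary, p):
    """Perfect gate `unitary` followed by i.i.d. noise on each output qubit."""
    g = unitary.shape[0].bit_length() - 1  # number of qubits the gate acts on
    rho = np.asarray(rho, dtype=complex)
    ideal = unitary @ rho @ unitary.conj().T
    return apply_iid_noise(ideal, p, g)


CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
                 [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)
rho_00 = np.zeros((4, 4), dtype=complex)
rho_00[0, 0] = 1.0

if __name__ == "__main__":
    print(np.round(noisy_gate(rho_00, CNOT, 0.1).real, 4))
```

In this convention, $p = 3/4$ corresponds to complete depolarization of each output qubit.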
Finally, an arbitrary noiseless quantum operation $\mathcal{D}$ is allowed after the final layer. It maps to a quantum state of the same dimension as that of $f(\rho^{(d)})$. This operation is equivalent to the noiseless decoder in [9] and the partial trace operation in [17]. We refer to the final output of the noisy circuit after the operation $\mathcal{D}$ as $QC(\rho^{(d)})$.

Criteria for fault-tolerance
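In the above computational model, the criterion for $\epsilon$-accurate (or fault-tolerant) computation of $f(\cdot)$ using $QC(\cdot)$ is C0: $F\big(f(\rho^{(d)}), QC(\rho^{(d)})\big) \ge 1 - \epsilon$ for every $d$-qubit input state $\rho^{(d)}$ (1), where $F(\cdot, \cdot)$ is the standard fidelity [13].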
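As an illustration of a fidelity-based accuracy check, the sketch below computes $F(\rho, \sigma) = \big(\mathrm{Tr}\sqrt{\sqrt{\rho}\,\sigma\sqrt{\rho}}\big)^2$ with NumPy and tests it against $1 - \epsilon$. It assumes that the criterion requires the fidelity between the ideal and the actual output to be at least $1 - \epsilon$ and that the squared convention for fidelity is the one intended; the function names are hypothetical.

```python
import numpy as np


def psd_sqrt(m):
    """Square root of a positive semidefinite matrix via eigendecomposition."""
    m = (m + m.conj().T) / 2                 # remove numerical asymmetry
    vals, vecs = np.linalg.eigh(m)
    vals = np.clip(vals, 0.0, None)          # clip tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.conj().T


def fidelity(rho, sigma):
    """Fidelity (Tr sqrt(sqrt(rho) sigma sqrt(rho)))^2 (assumed convention)."""
    rho = np.asarray(rho, dtype=complex)
    sigma = np.asarray(sigma, dtype=complex)
    s = psd_sqrt(rho)
    inner = psd_sqrt(s @ sigma @ s)
    return float(np.real(np.trace(inner)) ** 2)


def is_eps_accurate(ideal_output, actual_output, eps):
    """Check the assumed criterion F >= 1 - eps for one input state."""
    return fidelity(ideal_output, actual_output) >= 1.0 - eps


if __name__ == "__main__":
    zero = np.array([[1, 0], [0, 0]], dtype=complex)
    mixed = np.eye(2, dtype=complex) / 2
    print(fidelity(zero, mixed), is_eps_accurate(zero, mixed, 0.1))
```

A check of this kind covers a single input state only; the criterion itself is a requirement over all inputs.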
The usual arguments for achieving fault-tolerance against sufficiently small but constant noise using concatenated codes [15, Ch. 10.6.1] imply that criterion C0 can be achieved. In particular, it can be achieved using concatenated codes and poly-logarithmic space overhead for any sufficiently small $\epsilon > 0$, where for any state $\sigma$ and a completely positive trace-preserving map $\mathcal{E}$, $\mathcal{N}(\sigma) = (1 - \epsilon)\sigma + \epsilon\, \mathcal{E}(\sigma)$. Space overhead may be improved using better fault-tolerance schemes like [10,8]. However, the focus of this work is on converse results: a lower bound on the required space overhead and an upper bound on the noise threshold. Our computational model, fault-tolerance criterion and objective are similar to those in [9,17].

Converse for ϵ-accurate Computing
The following proposition gives a lower bound on the number of physical qubits needed for fault-tolerant computation.

Proposition 1. There exists a class of functions on d-qubit states such that the number of physical qubits needed for implementing ϵ-accurate circuits for computing these functions is lower bounded by
for $\epsilon \in (0, 0.11)$ and $d \ge 2g$. Here, $I_c(\mathcal{N}^{\otimes g})$ is the coherent information of the channel $\mathcal{N}^{\otimes g}$.
Proof. The proof of Prop. 1 is given in Sec. 4.
First, we would like to mention that the class of functions in Proposition 1 includes all unitary transformations. Hence, this class of functions can simulate evolutions of physical systems and implement important quantum computation modules like the quantum Fourier transform.
The bound in Proposition 1 directly extends to circuits where the noise at the outputs of a gate is not independent, but the noise is i.i.d. across gates. In that case, instead of $I_c(\mathcal{N}^{\otimes g})$, the bound would have $I_c(\mathcal{N}^{(g)})$, where $\mathcal{N}^{(g)}$ represents the potentially correlated noise acting on the $g$-qubit output of a gate. This is evident from the proof of Proposition 1.

Lower bound on space overhead
Next, we consider circuits where $D$ is sub-exponential in $d$ and the gate size $g = o(d)$, since the probable practical implementations lie in this regime. Corollary 1, which follows directly from Proposition 1, gives a lower bound on the minimum required space overhead in this regime. When $D$ is sub-exponential and $g = o(d)$, the best known lower bound on the limiting space overhead is $\inf_{k \ge 1} \frac{k}{I_c(\mathcal{N}^{\otimes k})}$ [9]. Note that this is equal to the bound in Corollary 1 when $\mathcal{N}$ is degradable [13, §3.25]. However, for depolarizing noise, and in general for noise that is not degradable, the bound in Corollary 1 is strictly higher.
Another interesting aspect of the bound in Corollary 1 is that it captures the effect of the gate size on the minimum space overhead, which is not captured by the existing best bound $\inf_{k \ge 1} \frac{k}{I_c(\mathcal{N}^{\otimes k})}$.

Upper bound on noise threshold
The noise threshold for a noise model parameterized by a single parameter (e.g., depolarizing) is defined as the strength of noise beyond which quantum computation is not possible. Its dependence on the gate size was captured in [17] and [12]. On the other hand, in [9], a tighter bound involving only quantum capacity was given. Here, we obtain a tighter upper bound on the noise threshold in terms of both the gate size and a quantity closely related to quantum capacity.
We state results for generic noise that may have multiple parameters and characterize the parameter region where fault-tolerant computation is not possible using reasonable space overhead. The following is a direct corollary of Proposition 1.
Corollary 2. For $g = o(d)$ and $\epsilon \in (0, 0.11)$, there exists a class of functions that includes unitary transformations, such that for any parameter region of the noise $\mathcal{N}$ where $I_c(\mathcal{N}^{\otimes g}) = 0$, $\epsilon$-accurate computation requires $\frac{N}{d} = \Omega\left(\frac{d}{\ln d}\right)$, i.e., a space overhead of order at least $\frac{d}{\ln d}$ (sub-linear in $d$ only by a logarithmic factor) is necessary.
This corollary of Proposition 1, unlike the threshold results in [9,17,12], does not give an impossibility result. Despite that, it says something quite useful: constant or poly-logarithmic space overhead, which are the gold standards for fault-tolerant schemes, cannot be achieved when $I_c(\mathcal{N}^{\otimes g}) = 0$. In comparison, the best existing result says that linear or poly-logarithmic space overhead is not possible when $\sup_{k \ge 1} \frac{I_c(\mathcal{N}^{\otimes k})}{k} = 0$. Thus, for noise that is not degradable, Corollary 2 gives a strictly larger parameter region where no scheme can achieve poly-logarithmic space overhead. An immediate implication of Corollary 2 is an upper bound on the noise threshold for depolarizing noise which is strictly lower than the best existing upper bound.
Next, we derive a much stronger phase transition of the required space overhead for the same parameter region as given by Corollary 2. However, this result does not directly follow from Proposition 1 and is derived using intermediate results from the proof of Proposition 1.
Proposition 2. For $g = o(d)$ and $\epsilon \in (0, 0.11)$, there exists a class of functions that includes unitary transformations, such that for any parameter region of the noise $\mathcal{N}$ where $I_c(\mathcal{N}^{\otimes g}) = 0$, $\epsilon$-accurate computation is not possible for $N = o(\exp(d))$.
The proof of this proposition is presented in Appendix C. This result also extends to noise that is correlated at a gate but independent across gates.
For all practical purposes, Proposition 2 is an impossibility result since it proves that in the parameter region where $I_c(\mathcal{N}^{\otimes g}) = 0$, accurate computation requires at least exponential space overhead. An even stronger impossibility result is discussed in Appendix D.
In summary, Proposition 2 gives an upper bound on the noise threshold and Proposition 1 gives a lower bound on the minimum space overhead when the noise strength is below that threshold. These bounds are applicable to a broad class of noise and gate models, and for circuits with sub-exponential depth, they are tighter than the respective existing bounds.
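To give a feel for where a coherent-information quantity can vanish, the sketch below locates the depolarizing strength at which the coherent information evaluated at the maximally mixed input, $1 - h_2(p) - p \log_2 3$, reaches zero. This covers only $g = 1$ and one fixed input state, and assumes the depolarizing convention in which $X$, $Y$ and $Z$ errors each occur with probability $p/3$. The quantity $I_c(\mathcal{N}^{\otimes g})$ in Propositions 1 and 2 involves an optimization over inputs and may vanish at a different $p$. All names are hypothetical.

```python
import math


def h2(x):
    """Binary entropy in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)


def coherent_info_mixed_input(p):
    """Coherent information of depolarizing noise at the maximally mixed input."""
    return 1.0 - h2(p) - p * math.log2(3)


def zero_crossing(lo=0.0, hi=0.5, tol=1e-10):
    """Bisection for the p in [lo, hi] where the value above changes sign."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if coherent_info_mixed_input(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(round(zero_crossing(), 4))
```

The bisection relies on the expression being decreasing in $p$ on $[0, 0.5]$, which holds because $h_2$ is increasing there.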

Discussions
The gate-size-dependent bounds on the noise threshold and the minimum space overhead in Propositions 2 and 1, respectively, can be useful in choosing the right experimental implementation. Consider the scenario where there are multiple options for experimental implementations using different approaches. These possible implementations have different gate sizes and encounter different types of noise. In this scenario, the bound in Proposition 1 can serve as a rule of thumb for choosing the best experimental implementation in terms of space overhead requirement. As the bounds in Propositions 1 and 2 extend to any qudit circuit, they can also be used as a rule of thumb for comparison across different qudit and qubit technologies.
When the noise at the outputs of a gate is correlated, the minimum required space overhead and the noise threshold are decided by $I_c(\mathcal{N}^{(g)})$, where $\mathcal{N}^{(g)}$ is the generic non-i.i.d. noise at the outputs of a gate. In this context, it is important to understand the kind of correlations that hurt the computation most and to plan to avoid such correlations in experimental realizations.
Unlike [17,12], the bound in Proposition 1 and that in [9] require knowledge of the coherent information of a $k$-fold channel, $I_c(\mathcal{N}^{\otimes k})$. This often does not have a closed form, and obtaining $I_c(\mathcal{N}^{\otimes k})$ requires solving a $2^k$-dimensional non-convex optimization problem. With the increasing interest in understanding non-convex optimization in the machine learning community, a search for a provably efficient algorithm for computing $I_c(\mathcal{N}^{\otimes k})$ can be of independent interest.
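The inner step of that optimization, evaluating the coherent information of $\mathcal{N}^{\otimes k}$ for one fixed input state, can be sketched as below. It uses the standard expression $S(\mathcal{N}(\rho)) - S(\mathcal{N}^c(\rho))$, with the complementary-channel output built from the Kraus operators as $[\mathrm{Tr}(K_i \rho K_j^\dagger)]_{ij}$. Depolarizing noise and the function names are assumptions made for illustration; the maximization over input states, which is the hard non-convex part, is not attempted.

```python
import itertools
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def depolarizing_kraus(p):
    """Kraus operators of single-qubit depolarizing noise (assumed convention)."""
    return [np.sqrt(1 - p) * I2, np.sqrt(p / 3) * X,
            np.sqrt(p / 3) * Y, np.sqrt(p / 3) * Z]


def tensor_kraus(kraus, k):
    """Kraus operators of the k-fold product channel."""
    ops = []
    for combo in itertools.product(kraus, repeat=k):
        full = combo[0]
        for op in combo[1:]:
            full = np.kron(full, op)
        ops.append(full)
    return ops


def entropy_bits(m):
    """Von Neumann entropy in bits of a density matrix."""
    vals = np.linalg.eigvalsh((m + m.conj().T) / 2)
    vals = vals[vals > 1e-12]
    return float(-np.sum(vals * np.log2(vals)))


def coherent_information(rho, kraus):
    """S(N(rho)) - S(N^c(rho)) for a fixed input state rho."""
    out = sum(K @ rho @ K.conj().T for K in kraus)
    env = np.array([[np.trace(Ki @ rho @ Kj.conj().T) for Kj in kraus]
                    for Ki in kraus])
    return entropy_bits(out) - entropy_bits(env)


if __name__ == "__main__":
    p, k = 0.1, 2
    rho = np.eye(2 ** k, dtype=complex) / 2 ** k  # maximally mixed input
    print(coherent_information(rho, tensor_kraus(depolarizing_kraus(p), k)))
```

The number of Kraus operators grows as $4^k$ and the state dimension as $2^k$, so even this fixed-input evaluation quickly becomes expensive as $k$ increases.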

Proof of Proposition 1
For proving the lower bound in Proposition 1, we first state a converse for computing a class of functions $f$ on $d$-qubit states, i.e., states of dimension $2^d$.
Theorem 1. Suppose the function $f$ to be computed using the noisy quantum circuit satisfies the following conditions.
(i) $f$ has an inverse $f^{-1}$ that can be accurately computed if we have access to a noiseless quantum circuit,
(ii) $f^{-1}$ exists and for any two $d$-qubit states η
Then, for $\epsilon$-accurate computation of $f$, for $d \ge 2g$, the required number of physical qubits is lower bounded by where $G = \lceil \frac{N}{g} \rceil$, and $h_2(\cdot)$ is the binary entropy function.
The proof of this theorem is presented in Appendix A. The proof of Proposition 1 follows from this theorem.
Proof of Proposition 1. For proving Proposition 1, we first consider the second term in the bound in Theorem 1: On the other hand, Hence, the second term in the bound in Theorem Next, we consider the first term in the bound in Theorem 1: , and obtain a lower bound for this term. Note that an upper bound on $h_2\left(\frac{2g}{d} \ln \frac{1}{1 - \epsilon L}\right)$ will give a lower bound on this term.
The last inequality follows from the fact that $\frac{G}{G-1}$ is monotonically decreasing in $G$ and $G \ge \frac{d}{g}$. If $f$ is unitary, then $f^{-1}$ satisfies all three conditions in Theorem 1 for $L = 1$.
Hence, the condition on $\epsilon$ in Theorem 1 is satisfied by a large class of $f$, including unitary transformations, if $0 < \epsilon < 1 - e^{-1/8} \approx 0.1175$, which holds in particular for every $\epsilon \in (0, 0.11)$. Thus, the derived bound is applicable to all practically important $f$, including the well-known quantum Fourier transform, which is the fundamental building block of many interesting algorithms.
This completes the proof of Proposition 1.

Conclusion and Future Work
In this paper, inspired by the information-theoretic bounds for noisy classical circuits [16,7], a connection between fault-tolerant quantum computation and finite blocklength communication is established. This leads to a lower bound on the required space overhead for fault-tolerant computation, given by the gate size $g$ divided by the coherent information of a $g$-fold noisy channel. This bound is tighter than the existing bounds and can be extended to the case where the noise on the outputs of a gate is correlated. It also gives a tighter upper bound on the noise threshold.
In the future, we would like to combine our approach with the techniques developed in [17,12,9] to tighten the bound, and to design fault-tolerant schemes that achieve that bound.
As discussed in Sec. 3, obtaining $I_c(\mathcal{N}^{\otimes k})$ involves a $2^k$-dimensional non-convex optimization problem. Since understanding non-convex optimization is of mathematical interest due to its usefulness in modern machine learning, an exploration of efficient computation of $I_c(\mathcal{N}^{\otimes k})$ can be of independent mathematical interest.

A Proof of Theorem 1
For proving Theorem 1, we first state an important lemma.

Lemma 1. Suppose the conditions (i)-(iii) in Theorem 1 hold. Then, for ϵ-accurate computation we need
, where $\{\epsilon_i\}$ are tunable auxiliary variables taking values in $[0, 1]$ and satisfying the condition:
The proof of this lemma is presented in Appendix B. Here, we prove Theorem 1 using this lemma.
First, by using the fact that $1 - x \le \exp(-x)$, we obtain the following upper bound. max Clearly, the above upper bound is bounded by the sum of the maxima of the following optimization problems: P1 and P2.

P1: max
P2: max
As P1 is a maximization of a sum of convex functions subject to a linear constraint, it is a convex maximization and hence is maximized at an extreme point. Under condition (iii) in Theorem 1, $2 \ln \frac{1}{1 - \epsilon L} < \frac{1}{2}$. Thus, the optimum in P1 is obtained when $\epsilon_1 = 2 \ln \frac{1}{1 - \epsilon L}$ and $\epsilon_i = 0$ for $i \ge 2$. Note that the optimum of P2 is upper-bounded by the optimum of the following problem.

P3: max
As P3 is a maximization of a sum of concave functions subject to a linear constraint, by the symmetry of the problem, the optimum solution is Summing the optimum of P1 and P3, the resulting upper bound on $d$ becomes Note that as every physical qubit goes through a gate of size $g$ (including dummy identity gates as discussed in Sec. 2), $N \ge (G - 1)g$ and hence, This implies This completes the proof of Theorem 1.
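The symmetry argument used for a concave objective can be checked numerically. Since the objective of P3 is not reproduced here, the sketch below uses the binary entropy $h_2$ as a stand-in concave function and a fixed budget on $\sum_i \epsilon_i$ as the linear constraint; both are assumptions for illustration, as are the function names and the numerical values.

```python
import math
import random


def h2(x):
    """Binary entropy in bits, used here as a stand-in concave function."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)


def symmetric_value(num_gates, budget):
    """Objective value when the budget is split equally across gates."""
    return num_gates * h2(budget / num_gates)


def random_allocation(num_gates, budget, rng):
    """Random nonnegative allocation whose entries sum to `budget`."""
    weights = [rng.random() for _ in range(num_gates)]
    total = sum(weights)
    return [budget * w / total for w in weights]


if __name__ == "__main__":
    rng = random.Random(0)
    G, budget = 5, 0.4  # hypothetical values
    best_random = max(
        sum(h2(e) for e in random_allocation(G, budget, rng))
        for _ in range(1000)
    )
    print(best_random <= symmetric_value(G, budget) + 1e-12)
```

By Jensen's inequality, no allocation with the same total can exceed the equal split for a concave function, which is what the random search is expected to confirm.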

B Proof of Lemma 1
For $f^{-1}$ satisfying conditions (i)-(iii) in Theorem 1, satisfaction of the $\epsilon$-accuracy condition C0 (Eq. 1) implies satisfaction of the following condition. C1: Let us denote ) and hence, $f^{-1} \circ QC(\rho^{(d)})$ can be written as $H \circ \mathcal{N}^{\otimes N} \circ \bar{O}_1(\rho^{(N)})$ for some $H$. Here, $\bar{O}_1$ is the noiseless operation at layer 1, i.e., $O_1 = \mathcal{N}^{\otimes N} \circ \bar{O}_1$. Hence, satisfaction of condition C1 implies satisfaction of the following condition.
The input to $\bar{O}_1$ is $\rho^{(d)} \otimes \bigotimes_{a=1}^{N-d} \sigma_a$, which is a state of $N$ physical qubits, where $d$ of them are input qubits (from $\rho^{(d)}$) and $N - d$ of them are ancilla qubits used for error correction.
Suppose that, in an $\epsilon$-accurate implementation of $QC$, $O_1$ is implemented using $G$ gates of size $g$ in parallel. In that implementation, let the $i$th gate of $O_1$, for $1 \le i \le G$, take $d_i$ qubits from $\rho^{(d)}$ and the rest from ancilla qubits as inputs.
Without loss of generality, let us assume that in $O_1$, the first $d_1$ qubits are input to gate 1, the next $d_2$ qubits are input to gate 2, and so on. Let the noiseless computation by gate $i$ in layer 1 be given by $G_i(\cdot)$. Then, for an input of the form ρ , where $\sigma_a$ are the ancilla inputs to gate $i$. So, for an $\epsilon$-accurate implementation, the following special case of Eq. 12 must be satisfied: $\exists H$ s.t. for all ρ , and $\rho^{(d)} = \bigotimes_{i=1}^{G} \rho_{d_i}$, if there exists an $H$ satisfying the condition in Eq. 13, then there also exists an $H = \bigotimes_{i=1}^{G} H_i$ satisfying the same condition. This is because is independent of $G_j(\cdot)$ and $\rho_{d_j}$ for $j \ne i$, and the noise is independent across gates.
Thus, a necessary condition for satisfaction of C0 is: By the fact that 14) is equivalent to: for some $\{\epsilon_i \in (0, 1) : i\}$, for each $i$, where $\prod_{i=1}^{G} (1 - \epsilon_i) \ge 1 - \epsilon L$. For satisfaction of condition C2, a necessary condition is the following: for some $\{\epsilon_i \in (0, 1) : i\}$, for each $i$, where and hence, a necessary condition is $\epsilon_i < \frac{1}{2}$ for $1 \le i \le G$. Note that condition C3 is equivalent to the $\epsilon_i$-accuracy criterion for one-shot communication of quantum information over the quantum channel $\mathcal{N}^{\otimes g}$ [13]. Hence, by [13, Sec. 9.1.2], a necessary condition for satisfaction of C3 is: for where $h_2$ is the binary entropy function.
From this, the following upper bound on $d$ for $\epsilon$-accurate computation is obtained: for some The same proof goes through after replacing $\mathcal{N}^{\otimes g}$ by a general $2^g$-dimensional channel $\mathcal{N}^{(g)}$. Hence, the bound applies if the noise at the outputs of a gate is not independent, but the noise is independent across gates.

C Proof of Proposition 2
Consider the bound (8) obtained in the proof of Theorem 1. When $I_c(\mathcal{N}^{\otimes g}) = 0$ and $\epsilon < 0.11$, this bound becomes Then, using the facts that $h_2(x)$ is monotonic over $[0, 0.5]$ and h 2 (x) ln 2 ≤ x ln 1 x + 1 1−x , we obtain the following bound.
The first inequality follows because 2 ln This completes the proof of Proposition 2.

D Towards a Stronger Impossibility Result
Using the same proof as that of Lemma 1 and the results in [13, Sec. 9.1.2], one can obtain another bound on $d$, in terms of the $\alpha$-sandwiched Rényi coherent information $I_c(\mathcal{N}^{\otimes g}; \alpha)$ of the channel. Here, $\alpha$ is a parameter greater than 1. $d = \sum_{i=1}^{G} d_i \le \sum_{i=1}^{G} \left[ I_c(\mathcal{N}^{\otimes g}; \alpha) + \frac{\alpha}{\alpha - 1} \log_2 \frac{1}{1 - \epsilon_i} \right]$, where $\prod_{i=1}^{G} \left(1 - \frac{\epsilon_i}{2}\right) \ge 1 - \epsilon L$. Therefore, the bound becomes $G\, I_c(\mathcal{N}^{\otimes g}; \alpha) + \frac{\alpha}{\alpha - 1} \max_{\{\epsilon_i\}} \sum_{i=1}^{G} \log_2 \frac{1}{1 - \epsilon_i}$, where the maximum is over $\{\epsilon_i\}$ satisfying the above constraint.
By allowing noiseless forward classical communication, for the same rate of quantum communication, a fidelity of $1 - \eta$ can at most improve to $1 - \frac{\eta}{2}$ [2, Sec. VIII]. Using this fact, it follows that by allowing noiseless classical computation and classical buffers, the bound can at most be the following.
It is known that $I_c(\mathcal{N}^{\otimes g}; \alpha) \downarrow I_c(\mathcal{N}^{\otimes g})$ as $\alpha \downarrow 1$ [13] and hence, it can be possible to take the impossibility threshold arbitrarily close to $I_c(\mathcal{N}^{\otimes g}) = 0$.
The fact that $I_c(\mathcal{N}^{\otimes g}; \alpha) \downarrow I_c(\mathcal{N}^{\otimes g})$ as $\alpha \downarrow 1$ implies for $\alpha \le \alpha_{th}$, for some $\alpha_{th} > 1$, $\frac{I_c(\mathcal{N}^{\otimes k}; \alpha)}{k}$ . This implies that no computation with number of qubits d > α th 3 2(α th −1) is possible for certain noise strengths, which include the noise strengths for which the quantum capacity is zero, i.e., $\sup_{k \ge 1} \frac{I_c(\mathcal{N}^{\otimes k}; \alpha)}{k} = 0$.

Corollary 1. For $g = o(d)$ and $\epsilon \in (0, 0.11)$, there exists a class of functions that includes unitary transformations, such that the limiting space overhead, $\lim_{d \to \infty} \frac{N}{d}$, required for $\epsilon$-accurate implementation of those functions is lower bounded by $\frac{g}{I_c(\mathcal{N}^{\otimes g})}$.