Computational power of one- and two-dimensional dual-unitary quantum circuits

Quantum circuits that are classically simulatable tell us when quantum computation becomes less powerful than or equivalent to classical computation. Such classically simulatable circuits are of importance because they illustrate what makes universal quantum computation different from classical computers. In this work, we propose a novel family of classically simulatable circuits by making use of dual-unitary quantum circuits (DUQCs), which have been recently investigated as exactly solvable models of non-equilibrium physics, and we characterize their computational power. Specifically, we investigate the computational complexity of the problem of calculating local expectation values and the sampling problem of one-dimensional DUQCs, and we generalize them to two spatial dimensions. We reveal that a local expectation value of a DUQC is classically simulatable at an early time, which is linear in a system length. In contrast, in a late time, they can perform universal quantum computation, and the problem becomes a BQP-complete problem. Moreover, classical simulation of sampling from a DUQC turns out to be hard.


Introduction
Quantum computation is widely believed to be intractable by classical computers. However, there also exist certain types of quantum circuits that can be efficiently simulated classically despite being able to generate highly entangled states. Famous examples are quantum circuits which consist of Clifford gates [1] or matchgates, corresponding to free-fermionic dynamics [2][3][4][5]. Such classically simulatable quantum circuits are of importance because they illustrate what makes universal quantum computation different from classical computers. Moreover, they have practical applications such as randomized benchmarking [6][7][8], simulation by stabilizer sampling [9], and estimation of an error threshold of a quantum error correction code [10].
Classically simulatable or exactly solvable quantum circuits are also important in the study of dynamics of isolated quantum systems [11][12][13]. For example, Clifford circuits include quantum dynamics which are both integrable and non-integrable 1 , and they have been investigated from the perspective of quantum thermodynamics, and especially thermalization [14][15][16]. Moreover, physical quantities, such as entanglement entropy and out-of-time-ordered correlators, of an ensemble of one and higher dimensional Haar random unitary circuits are calculated exactly [17][18][19]. Because, in general, it is notoriously difficult to treat quantum dynamics analytically except for one-dimensional integrable systems [20], above quantum circuit representations of quantum dynamics are powerful tools to analyze their physical properties. In addition, since two-dimensional quantum dynamics are not explored in comparison with one-dimensional cases, solvable models in two dimensions are in great demand.
Recently, a new class of quantum gates called "dual-unitary gates" has been introduced [21,22]. Dual-unitary gates are unitary gates which remain unitarity under reshuffling their indices. The dynamics consisting of dual-unitary gates can be either integrable or non-integrable. In Refs. [22,23], it has been shown that dynamical correlation functions and time evolution of operator entanglement entropy under dual-unitary quantum circuits (DUQCs) are calculated exactly. Because these quantum circuits have one spatial dimension, we call them one-dimensional (1D) DUQCs. Moreover, it has been shown that when the system size is infinite, time evolution of local observables, correlation functions, and entanglement entropy of 1D DUQCs arising from certain initial states can be calculated exactly [24]. As we will define later, initial states are described by matrix product states (MPSs) whose matrices satisfy certain conditions. They are called solvable initial states [24]. The simplest example is a chain of EPR pairs, and the simplest counter-example is a product state.
Interestingly, despite the above property, dual-unitary gates contain arbitrary single-qubit gates and a certain class of two-qubit entangling gates. Note that these gates form a universal gate set if we can apply them freely [25]. Nevertheless, the carefully constructed initial states and the infinite size limit allow us to compute the expectation values efficiently. Here, we ask whether they are classically simulatable or allows universal quantum computation if the system size is finite. The finiteness of the system size enables us to consider the circuit depth which scales with the system size and characterize their computational power.
In this paper, we investigate quantum computational power of 1D and two-dimensional (2D) DUQCs. Specifically, we characterize the computational complexity of the problem of calculating expectation values of local observables and the sampling problem of 1D and 2D DUQCs with finite system sizes. Additionally, we study classical simulatability of correlation functions of 2D DUQCs. Here, 2D DUQCs takes certain initial states which are product states of solvable initial states. Note that the generalization to the 2D lattice is of interest not only from the viewpoint of quantum computing but also from the viewpoint that exactly solvable quantum dynamics in two spatial dimensions is limited.
Our results are summarized in Fig. 1. For the first problem, we show that expectation values of local operators O of 1D and 2D DUQCs are exponentially close to Tr(O) until time, or circuit depth, t ∼ 1 2 N and t ∼ N , respectively, where N is a system length. In other words, local expectation values do not depend on specific choices of dual-unitary gates in those time regions, and they are classically simulatable. On the other hand, in later time, local expectation values of DUQCs can depend on dual-unitary gates, and we find that DUQCs can simulate universal quantum computation after time poly(N ). It means that the problem becomes BQP-complete after time poly(N ). As we will discuss later, the depth overhead for simulating quantum circuits consisting of nearest-neighbor CZ gates and single-qubit gates with DUQCs is O(N ). This contrasts to conventional classically simulatable quantum circuits with a fixed gate set, such as Clifford or matchgate circuits, where classical simulatability does not change depending on the circuit depth. In addition, we show that sampling of the output of 1D and 2D DUQCs is intractable for classical computers after time t ∼ 1 2 (N − √ N ) and t ≥ 4, respectively, unless polynomial hierarchy (PH) collapses to its third level. This result is based on the fact that if a quantum circuit with post-selection can simulate universal quantum computation, the output probability distribution cannot be sampled by a classical computer efficiently unless PH collapses to its third level [26]. It implies that, especially in the case of two dimensions, the sampling problem of constant-depth DUQCs is as hard as that of general constant-depth quantum circuits.
Moreover, we find that correlation functions of 2D DUQCs along a special direction, which is determined by the initial state and defined later, become the trace of operators in linear depth, and hence they are classically simulatable. In contrast, the value of correlation functions along the other direction can depend on a choice of unitary gates, and they do not seem to be classically simulatable. We leave as an open problem whether or not the problem of calculating correlation functions of 2D DUQCs in linear depth is BQP-hard. Finally, we also show a sufficient condition of 2D lattices on which local observables of 2D DUQCs at an early time are classically simulatable. For instance, the lattices satisfying this condition include honeycomb lattices. In summary, we reveal that the computational power of DUQCs strongly depend on their circuit depth and problem settings. This provides a novel quantum computational model to investigate both classical simulatability of quantum computation and physical properties of non-equilibrium quantum systems.
The rest of the paper is organized as follows. In Sec. 2, we introduce and review 1D DUQCs. In Sec. 3, we characterize the complexity of the problem calculating local expectation values and the sampling problem of 1D DUQCs. In Sec. 4, we generalize 1D DUQCs to two spatial dimensions, and characterize the complexity as with the 1D case. After that, we discuss classical simulatability of correlation functions of 2D DUQCs and a generalization of lattices of qubits. Sec. 5 is devoted to conclusion and discussion.

Dual-unitary gates
We consider a 2N -qubit system. Its computational basis is denoted by |i 1 i 2 . . . i 2N , where i j = 0, 1 indicates a state of the j-th qubit. A dual-unitary gate is a two-qubit gate in the following form: where SWAP is the swap gate, CZ is the controlled-Z gate, 1 and v 2 are arbitrary single-qubit gates, and both φ and α are arbitrary real numbers. Alternatively, Eq. (1) is rewritten as It has been shown that these gates can describe both integrable and non-integrable periodically driven quantum systems, or Floquet systems [22]. For example, the time evolution operator of a periodically driven quantum system with XXZ interaction, is one of the dual-unitary gates. The time evolution operator of a self-dual kicked Ising chain can also be written in terms of dual-unitary gates as follows: where T is the time ordered product, δ(t) is the Dirac delta function, and h is a real number. Dual-unitary gates have the following nice property. Let U be a nearest-neighbor two-qubit gate. DefineŨ , called the dual gate of U , such that k| l|Ũ |i |j = j| l|U |i |k .
Then,Ũ is a unitary gate if and only if U is a dual-unitary gate [22]. This property can be expressed graphically by using a tensor-network representation of quantum circuits. We represent a two-qubit gate U and U † as where (i, j)-legs and (k, l)-legs of U and U † serve as inputs and outputs, respectively. The unitarity, U U † = U † U = I, can be written as Similarly, the dual gate of U and its Hermitian conjugate are represented as where (i, j)-legs and (k, l)-legs ofŨ andŨ † serve as input and outputs, respectively. The property of a dual-unitary gate, namelyŨŨ † =Ũ †Ũ = I, can be written as

Dual-unitary quantum circuits
1D DUQCs are quantum circuits with 2N qubits which consist of nearest-neighbor dual-unitary gates. They are defined as follows: where t is an even number, and U i,j (τ ) is a dual-unitary gate acting on qubit i and j at time τ , or graphically, We note that dual-unitary gates can differ from each other, that is, the quantum dynamics can be inhomogeneous in space and time. In Eqs. (12) to (15), we assume a periodic boundary condition (PBC) in space, that is, U 2N,2N +1 = U 2N,1 .

Solvable initial states
In a 1D DUQC, when an initial state satisfies certain conditions and a system size is infinite, it has been shown that time evolution of local observables, correlation functions and entanglement entropy can be calculated exactly [24]. Such an initial state is called a solvable initial state and can be described in terms of a matrix product state (MPS) [24,39].
Here we describe two conditions that make a two-site shift invariant MPS, where A (i,j) is a χ-dimensional square matrix, a solvable initial state. |Ψ N (A) can alternatively be represented by a tensor-network as, , … .

(18)
The first condition is that its transfer matrix has a unique eigenvector with a maximum eigenvalue λ 0 . A transfer matrix associated with |Ψ N (A) is defined as or graphically, .
Note that Ψ N |Ψ N = Tr(E N ) ≈ λ N 0 for large N , which implies λ 0 = 1 is needed in order to normalize |Ψ N (A) , namely Ψ N |Ψ N = 1 in the limit of N → ∞. The second condition is that A satisfies the following condition: Furthermore, Eqs. (21) and (23) imply that the transfer matrix has as right and left eigenvectors with an eigenvalue 1. Strictly speaking, the second condition considered here is more restrictive than that of Ref. [24]. However, they are equivalent in the thermodynamic limit (see Theorem 1 in Ref. [24]), and we adopt Eq. (22) for clarity. The simplest example of solvable states is a chain of EPR pairs |EPR ⊗N , where |EPR = 1 √ 2 (|00 + |11 ), or graphically, whose χ is zero. This example has been studied to analyze the entanglment dynamics of self-dual kicked Ising chains [40].

Local expectation values of 1D DUQCs
Let us briefly describe how expectation values of local observables can be calculated for dual unitary circuits with solvable initial states [24]. We consider a time-evolved transfer matrix where each number of the right-side 2t + 2 legs indicates the input space which E(t) acts on. It can be shown by Eqs. (11) and (22) that is both right and left eigenvectors of E(t) with eigenvalue 1. In fact, |I(t) is the unique eigenvector with maximum eigenvalue. This leads to the following equality: By virtue of Eq. (28), one can calculate an expectation value of a local observable O as follows: where is a solvable initial state evolved to time t. It means that an expectation value of a local observable does not depend on a specific choice of dual-unitary gates. From a similar argument, it has been also shown that correlation functions of 1D DUQCs are classically simulatable [24].

Local expectation values
Although local observables in a 1D DUQC are calculated exactly when the system size is infinite, a dual-unitary gate can contain arbitrary single-qubit gates and the CZ gate, which can form a universal gate set in principle [41]. Thus, it is natural to ask whether or not DUQCs can perform universal quantum computation when their system size is finite. Finiteness of the system size enables us to consider the circuit depth which is scaling with the system size and characterize their computational power.
In this section, we answer the above question affirmatively and characterize computational complexity of the problem of calculating local expectation values for dual-unitary circuits. More precisely, we consider a local observable with length l where i 0 is an integer, and O i 0 +i is an observable on qubit Here, we note that |Ψ N (A) is not generally normalized for a finite N . Then, we define the following decision problem, which has a parameter t.

Problem 1 (local expectation values of 1D DUQCs). Consider a 1D DUQC in time
We show that Problem 1 with time t ≤ (1 − δ)N − l/2, for an arbitrary 0 < δ < 1, is in P. In contrast, with time t = poly(N ), it becomes BQP-complete. We first prove the above statements in the case where the initial state is a chain of EPR pairs Eq. (25), which is normalized for any N , for simplicity. Then, we extend to general solvable initial states.

1D DUQC is classically simulatable at an early time
First, we prove the former statement. In this case, an expectation value where l and i 0 are respectively assumed to be even for clarity, and the normalization coefficient which arises from Eq. (25)  We can easily generalize the above results to general solvable initial states. Let us note that a transfer matrix of a solvable MPS to the power M can be written as where ε M is a matrix such that leading order of non-zero elements are O(|λ 1 | M ) and λ 1 is the second largest eigenvalue of E. By using Eq. (32), an expectation value of O with t ≤ (1 − δ)N − l/2 can be calculated as follows: where , deriving from the matrix ε M , is O |λ 1 | δN . Then, by applying Eqs.
Here, we note that if an initial state is a chain of EPR pairs, δ can be chosen as zero, that is, On the other hand, for time t > N − l/2, the expectation value can be written as In contrast to Eq. (31), we cannot use Eqs. (9) and (11)

1D DUQC is universal in late time
The inclusion of the problem in BQP is trivial; we can just execute the dual-unitary circuit to obtain an expectation value. What is left to be shown is that the problem is BQP-hard. To do that, we consider a BQP-complete problem calculating whether an local expectation value , and U is a 1D quantum circuit consisting of poly(n) nearest-neighbor two-qubit gates. Then, we show that Problem 1 is as hard as the BQP-complete problem after time poly(N ).
Firstly, we construct the CZ gate acting on an even numbered site and an odd numbered site by cancelling swap gates from a DUQC, as follows: where Remember that the index 2N + 1 is treated as 1 according to a PBC. Graphically, Eq. (35) can be written as Fig. 2. Additionally, one can apply CZ gates in parallel by substituting CZ · SWAP gates for some of SWAP gates in Eqs. (39) and (40).
Secondly, it can be easily shown that an arbitrary product of single-qubit gates can be implemented by substituting suitable u 1 ⊗ u 2 · SWAP gates for some of SWAP gates in Eqs. (35) and (36). A quantum circuit consisting of arbitrary one-qubit gates and CZ gates has capability to efficiently simulate arbitrary quantum circuit consisting of poly(n) nearest-neighbor two-qubit gates [41]. Therefore, we can construct a DUQC U 1D with poly(N ) depth such that holds, which means that Problem 1 with time t = poly(N ) and input a chain of EPR pairs is a BQP-complete problem. Here, we remark that, from our construction, if U is written as a depth-d quantum circuit consisting of nearest-neighbor CZ gates and single-qubit gates, the depth of U 1D which simulates U is O(dN ). This is because U 1D requires O(N ) simulation cost for CZ gates in U in each time. It means that the depth overhead for simulating U with U 1D is O(N ).

Sampling problem
We now move on to discuss the sampling complexity of 1D DUQCs. We show that classical simulation of sampling from linear depth 1D DUQCs is hard. Let {p z } be the probability distribution with p z being the probability of obtaining output z ∈ {0, 1} 2N when |Ψ t is measured in computational basis. We define that a probability distribution {p z } is sampled by classical computers efficiently with a multiplicative error c if there exists a classical probabilistic polynomial-time algorithm that outputs z with probability q z such that |p z − q z | ≤ cp z , for all z. It has been shown that if an n-qubit quantum circuit with post-selection can simulate universal quantum computation, the measurement output of n qubits cannot be sampled by classical computers efficiently with a multiplicative error unless polynomial hierarchy collapses to its third level [26]. On the other hand, measuring a square lattice cluster state in an arbitrary measurement basis with post-selection can perform universal quantum computation [42]. A square lattice cluster state is defined as where |+ is H |0 , we assigned a qubit in |+ state to each vertex of a square lattice, and E is the set of all edges of the lattice.
Here, we prove that a 1D DUQC can generate a square lattice cluster state with an EPRchain initial state in time N − √ 2N /2 + 1, which implies that sampling from U 1D |EPR ⊗N is classically hard.
We assume that 2N is a square of an even number 2m, which enables us to make a one-toone correspondence between the 1D qubits and ones on square lattice. Note that the following equation holds: where H is the Hadamard gate. With this in mind, we obtain a square lattice cluster state by applying the following dual-unitary circuit to the initial state |EPR : where, and U cluster (t) at other odd and even times are SWAP 1 and SWAP 2 , respectively. Graphically, the above DUQC can be written as Fig. 3 (a), and it is equivalent to Fig. 3 (b). As shown in Fig. 3 (b) and (c), by rearranging the 1D qubit array (b) to the square lattice (c), the final state of the DUQC is equivalent to the square lattice cluster state. Moreover, because dual-unitary gates include arbitrary single-qubit gates, we can measure the final state in an arbitrary measurement basis. Therefore, sampling from depth-(N − √ 2N /2 + 1) 1D DUQCs is unlikely to be classically simulatable.

Generalization to 2D DUQCs
In this section, we generalize DUQCs to two spacial dimensions and characterize their computational power.
We define U (1) , U (2) , U (3) , and U (4) as where U D (i,j),(k,l) is a dual-unitary gate acting on qubit (i, j) and qubit (k, l), and U (i,j),(k,l) is an arbitrary two-qubit unitary gate acting on qubit (i, j) and qubit (k, l). We also define U {2,4} as a matrix which is an arbitrary product of U (2) and U (4) , for example, U (2) , U (2) U (4) , or U (2) U (4) U (2) . We note that the fact that a unitary gate in U (2) or U (4) can be an arbitrary unitary gate is a significant difference between one and two spatial dimentions. In Eqs. (49) to (52), we assumed a PBC in space, that is, U (2N,k),(2N +1,k) = U (2N,k), (1,k) and U (j,2M ),(j,2M +1) = U (j,2M ),(j,1) for all k and j. Then, we define 2D DUQCs are quantum circuits with 2N × 2M qubits as follows: where t is a multiple of four and U i (τ ), i = 1, {2, 4}, 3, is a unitary U i at time τ . In the following subsections, we consider initial states |Ψ A which are product states of solvable initial states |Ψ N A aligned in rows: or graphically, where i j,k indicates the state of (j, k)-th qubit. We call these initial states 2D solvable initial states in the sense that, as we show in the next section, local expectation values are classically simulatable at an early time.

Local expectation values
We characterize computational complexity of the problem of calculating local expectation values for 2D DUQCs. We consider a local observable in an l × l square region where O i 0 +i,j 0 +j is an observable of qubit (i 0 + i, j 0 + j) for some integers i 0 and j 0 . Hereafter, for clarity, we assume l, i 0 , and j 0 to be odd number, even number, and even number, respectively. Note that the discussion of the following subsections are not limited to this choce, and similar argument can be applied to other choice of parities. We denote by |Ψ A (t) a 2D solvable initial state Eq. (55) evolved at time t, that is, |Ψ A (t) = U 2D (t) |Ψ A . Then, we define the following decision problems as with the 1D case.

and a local observable O whose operator norm is 1 with length
Similary to the 1D case, we show that Problem 2 is in P with time t ≤ 2(1 − δ)N − l, for 0 < δ < 1, and BQP-complete with time t = poly(N ).

2D DUQC is classically simulatable at an early time
The way to calculate local expectation values is similar to the 1D case except for contractions of unitary gates at even time. The procedure is to contract dual-unitary gates at odd time by using Eq. (9) and unitary gates at even time by using Eq. (11). As a result, if t ≤ 2(1 − δ)N − l and t ≥ l + 1, we obtain Origins of the conditions t ≥ l are the same as those of 1D cases. We explain the procedure graphically in the case that an initial state is rows of EPR pairs (Fig. 4). Firstly, we remove unitary gates out-side of the causal-cone. Then, the local expectation value can be represented as Fig. 4 (a). A tensor network in dotted line of Fig. 4 (a) is depicted in further detail in Fig. 4 (b). In the first equality in Fig. 4 (b), we use the definition of dual-unitary gates Eq. (11), and remove them. After that, unitary gates, which are contracted with removed dual-unitary gates (unitary gates painted in red and orange in Fig. 4) Problem 2 with t ≤ 2(1 − δ)N − l is in P. Note that this is true even if U (2) and U (4) are replaced by U {2,4} . For example, in the case of U {2,4} = U (4) U (2) , dual-unitary gates and unitary gates are contracted and removed as Fig. 4 (c). We note that classical simulatability of 2D DUQC are also interesting as a solvable model of quantum dynamics because analytical research on quantum dynamics in 2D systems is much more difficult than 1D cases. In fact, all local expectation values of 2D DUQCs become Tr(O) in the thermodynamic limit, which means that a local density matrix of a 2D solvable initial state evolved by 2D DUQCs is identical to a thermal equilibrium state at infinite temperature. Therefore, thermalization of solvable initial states can be shown analytically in the thermodynamic limit. Understanding conditions when thermalization happens is one of the most important problems in non-equilibrium physics [11][12][13]43], and it means that 2D DUQCs are rare toy models of 2D non-equilibrium quantum physics.

2D DUQC is universal in late time
Based on the fact that Problem 1 with time t = poly(N ) is BQP-complete , BQP-completeness of Problem 2 with time t = poly(N ) becomes trivial by noticing that U 2D (t) acting on any row of 2N -qubit can be regarded as 1D DUQCs when both U (2) and U (4) are identity operators.

Sampling problem
We now move on to discuss the sampling complexity of 2D DUQCs. We show that a constant depth 2D DUQC can generate a square lattice cluster state. We obtain a square lattice cluster state |CS by four depth 2D DUQCs as follows: Therefore, sampling the output of constant depth 2D DUQCs is unlikely to be classically simulatable.

Correlation functions
Let us discuss classical simulatability of two-point correlation functions. For simplicity, we assume that the initial state is |EPR , but one can easily generalize the following argument to general solvable initial states. In such a case, we expect two-point correlation functions to be anisotropic because solvable initial states are anisotropic. We consider two types of correlation functions, one of which is classically simulatable, and the other does not seem to be classically simulatable. First, we consider the following correlation function, which is classically simulatable: where |EPR t is |EPR evolved at time t, and O i,j is an observable of qubit (i, j). When 2N ≥ 2t − r holds, C 1 (r, t), which can be graphically represented as Fig. 5 (a), can be straightforwardly calculated as with calculation of expectation values shown in Fig. 4. As a result, we have C 1 (r, t) = 0. In other words, qubit (i, j) and qubit (i + r, j) cannot be correlated for an arbitrary 2D DUQC in the time region. Second, we consider the following correlation function, which is expected to be classically intractable: (a) (b) Figure 6: Equality of a honeycomb lattice and a square lattice up to longitudinal edges. As illustrated by (a), a honeycomb lattice can be deformed into a lattice which is made by rectangles. This deformed lattice is equal to a square lattice up to longitudinal edges, as illustrated by (b).
We also assume that 2N ≥ 2t − r holds. C 2 (r, t) can take a nonzero value only in the case of r = t + 1 and odd j, r = t, or r = t − 1 and even j. In those cases, we conjecture that C 2 (r, t) is unlikely to be classically simulatable because of the following reason. C 2 (r, t) can be graphically represented as Fig. 5 (b). After contracting unitary and dual-unitary gates of Fig. 5 (b) by using Eq. (9) and Eq. (11), the remaining gates of it can be represented as Fig. 5 (c). This is similar to correlation functions of 1D DUQCs (see Fig. 3 of Ref. [24]), but important difference is that uncontracted gates form a 2D tensor-network in 2D DUQCs. Because of this, calculating correlation functions in 2D DUQCs seems to be hard for a classical computer. Therefore, classical simulatability of correlation functions seems to depend on the relative position of two local observables. It is still an open problem whether or not calculating C 2 (t, t) with the condition 2N ≥ 2t − r is BQP-hard.

General lattice pattern
So far, we only consider dynamics in a 2D square lattice. It is natural to consider a generalization to other lattices, such as a honeycomb lattice. We note that unitaries at even-time of dynamics defined in Sec. 4.1 can be chosen as identity gates. If a unitary gate U (j,k),(j,k+1) is an identity gate at all time, the edge between qubit (j, k) and qubit (j, k + 1) can be effectively removed. So, if we construct lattices by eliminating edges in kdirection, 2D DUQCs with such a lattice can be treated as with ones with a 2D square lattice. We name such lattices solvable lattices. Such lattices include, for example, a honeycomb lattice. This is illustrated by Fig. 6. One can construct an arbitrary solvable lattice in the same way. Therefore, local expectation values and correlation functions C 1 of 2D DUQCs with solvable lattices are classically simulatable at an early time as same as those with square lattices.

Conclusion and discussion
We have investigated computational complexity of the problems calculating physical properties of 1D and 2D DUQCs. First, we have shown that the complexity of calculating local expectation values of dual-unitary quantum circuits highly depends on their circuit depth. Second, we have shown that classical simulation of sampling from 1D and 2D DUQCs after linear and constant depth, respectively, is hard.
The first result is in contrast to conventional classically simulatable quantum circuits with fixed gate sets such as Clifford circuits and matchgates, whose classical simulatability does not change depending on the circuit depth. The dual-unitary quantum computational model is the first example of the model, where quantum computational power makes a transition between O(N ) time and poly(N ) time. It is reminiscent of dynamical phase transitions of computational complexity which have been recently studied in other contexts [44][45][46]. It would be interesting to investigate whether or not local expectation values of DUQCs at an early time slightly later than that considered in this paper are also classically simulatable. Another future direction would be to study the computational power of DUQCs in linear depth with non-solvable initial states. In this context, it is well known that Clifford circuits and matchgates circuits with certain initial states, so-called magic states, have potential to outperform those without magic states [47,48]. Thus, it is natural to expect that there exist magic states enhancing the computational power of DUQCs at an early time, and we leave it for future work.
In the second result, we have considered classical sampling from a DUQC with a multiplicative error. Here, we note that classical simulation of sampling with an additive error from certain quantum computing models, such as linear optical circuits [49], IQP circuits [50], and the DQC1 model [51], is known to be hard under plausible assumptions. With this in mind, because of the computational universality of DUQCs, sampling with an additive error from DUQCs which simulates IQP circuits is also intractable for classical computers. Another sampling problem, which attracts much attention from both theorists and experimentalists, is random circuit sampling [52,53]. It would be interesting to investigate whether or not classical sampling with an additive error from a random DUQC, where every gate is chosen randomly from the set of dual-unitary gates, is hard.
Moreover, we remark that our argument on the computational universality and sampling complexity of one-and two-dimensional DUQCs can be straightforwardly extended to the case of open boundary conditions (OBCs). However, under the condition, classical simulation of DUQCs at an early time becomes harder since, in general, unitary gates at the boundary of causal-cone cannot be cancelled. We discuss this point in some more detail in Appendix D, and we leave a full characterization of their computational power for future work.
Besides, because analytic research on quantum dynamics in 2D systems is much more difficult than 1D cases, generalization of DUQC to two spatial dimensions are also interesting as a solvable model of quantum dynamics. It would be important to generalize DUQCs to higher than two spatial dimensions and construct more general solvable initial states, for example, using higherdimensional tensor-network states such as projected entangled pair states (PEPSs) [54] . We expect that a high-dimensional DUQC will deepen our understanding of a non-equilibrium phenomenon in a high-dimensional quantum system, such as a 2D self-dual kicked Ising model.

A A calculation procedure for local expectation values of a chain of EPR pairs in 1D DUQCs
In this appendix, we show that local expectation values for 2N ≥ l + 2(t − 1): is equal to 1 2 l Tr(O). The procedure is similar to that in Refs. [21,22,24]. To begin with, Eq. (65) is equal to By adapting Eq. (11) to the leftmost and rightmost unitaries, Eq. (66) becomes By repeating the process, Eq. (67) becomes (68) Finally, by using Eq. (9) repeatedly, Eq. (68) becomes .
Taking the renormalization coefficient into account, one obtains Ψ N t |O|Ψ N t = 1 2 l Tr(O).

B Local expectation values of general solvable initial states.
In this appendix, we show that local expectation values at time t with general solvable states are approximated by 1 2 l Tr(O). Local expectation values without normalization can be written as where N and M are the total number of tensors A and ones which are outside of the causal-cone, respectively. By inserting identity operators I = χ α=1 |α α|, one obtains * * * * * * * * * * ! . (71) The second factor of Eq. (71) is equal to a matrix element of E M , where E is the transfer matrix defined in Eq. (20). By the assumption that |I is the unique eigenvector of the matrix E with the latgest eigenvalue 1, the Jordan canonical form of E can be written as where D is the number of Jordan blocks, P i is the diagonal part, N i is the nilpotent part, and λ i is less than 1 and ordered in descending order. Let ε be i (λ i P i + N i ).
where B is a matrix, the following holds: By the binomial theorem, ε M is expanded as where d i is the dimension of the i-th Jordan block, that is, the dimension of the space on which P i acts nontrivially, and we used the fact that N d i i is the zero matrix. Then, ε M is upper bounded as follows: for j = 0, 1, · · · , d i , where the first inequality follows from the fact that for a matrix B, B is less than its Frobenius norm B F = i,j=1 B 2 ij [55], and the second inequality follows from the fact that the maximum value of matrix elements of ε M is max i,j M C j λ M −j i due to Eq. (76) . From Eqs. (75) and (79), the following holds: where we used the inequality d i < χ 2 and the assumption χ = O(1). Second, we evaluate the first factor of Eq. (71), namely, * * * ! . (81) Then, Eq. (81) can be written as and this is upper bounded as follows: where in the first inequality, we used Eq. ). Therefore, local expectation values with t ≤ (1 − δ)N − l/2, for 0 < δ < 1, are classically simulatable.

C Quantum circuit representation of a 2D self-dual kicked Ising model
In this appendix, we show that a 2D self-dual kicked Ising model is represented as a 2D DUQC. Let us consider a 2D self-dual kicked Ising model, associated with a 2N × 2N square lattice: where δ(t) is the Dirac delta function, |J| and |b| are equal to π 4 , h is an arbitrary real number, and we adopted PBCs, that is, Z j,2N +1 = Z j,1 and Z 2N +1,k = Z 1,k . The Floquet operator associated to Eq. (93) can be written as where T denotes a time ordered product, and we define U K , U I1 , U I2 , U I3 , and U I4 as follows: 2N k=1 (JZ2j,kZ2j+1,k+hZ2j,k) , 2N k=1 (JZ2j−1,kZ2j,k+hZ2j−1,k) , Then, the integer powers of the Floquet operator have forms: where we define Using the fact that U KI1 and U KI3 are written as dual-unitary gates [22], it follows that the quantum circuit (U I2 U I4 U KI3 U I2 U I4 U KI1 ) t is one of 2D DUQCs. We note that this is even true if interaction strength J in Eqs. (99) and (101) are replaced by arbitrary real numbers. This is because unitary gates of 2D DUQCs in k-direction can be chosen arbitrarily.

D Local observables of DUQCs with open boundary conditions
In this appendix, we discuss classical simulatability of DUQCs under OBCs. First, we show that local expectation values of 1D DUQCs with OBCs at an early time become dependent on dualunitary gates but still classically simulatable. We define 1D DUQCs on 2N qubits with OBCs as the following: where without loss of generality we assume that a local observable is supported on the left half of chain. From the argument in Appendix C and by using Eqs. (11) and (22) It is dependent only on dual-unitary gates on the boundary of the causal-cone, and therefore it is classically simulatable. However, when we generalize the above argument to two-dimensional cases, local expectation values do not seem to be classically simulatable in linear depth because uncontracted unitary gates on the boundary of the causal-cone form 2D tensor-networks. This situation is similar to that of correlation functions for 2D DUQCs discussed in Sec. 4.4 of the main text. Besides, it is reminiscent of matchgate circuits, where classical simulatability depends on their connectivity [56]. As written in Sec. 5, it would be interesting future work to characterize the computational power of DUQCs with various connectivity.