Bell nonlocality with a single shot

In order to reject the local hidden variables hypothesis, the usefulness of a Bell inequality can be quantified by how small a p-value it will give for a physical experiment. Here we show that the expected p-value is minimized when we maximize the difference between the local and Tsirelson bounds of the Bell inequality, when it is formulated as a nonlocal game. We develop an algorithm for transforming an arbitrary Bell inequality into such an optimal nonlocal game, and show its results for the CGLMP and $I_{nn22}$ inequalities. We present explicit examples of Bell inequalities such that the gap between their local and Tsirelson bounds is arbitrarily close to one, and show that this makes it possible to reject local hidden variables with arbitrarily small p-value in a single shot, without needing to collect statistics. We also develop a new algorithm for calculating local bounds which is significantly faster than the methods currently available, which may be of independent interest.

If the maximal probability of winning the game with local hidden variables, $\omega_\ell(G)$, is close to zero and the probability of winning it with the optimal quantum strategy, $\omega_q(G)$, is close to one, then this nonlocal game makes it possible to reject local hidden variables in a single round.
In the recent loophole-free Bell tests [9][10][11], the approach taken was instead to calculate the probability that LHVs could reproduce the observed data, and to declare them rejected if this p-value was below some threshold.
To obtain a stronger, or easier, rejection of LHVs, it is then interesting to search for the form of the Bell inequality that minimizes this p-value. The best way to address this question is to reformulate the Bell inequality as a nonlocal game [12], as the expected p-value decreases monotonically with the size of the gap of the nonlocal game, which is the difference between its Tsirelson bound and its local bound. We develop an algorithm to translate a Bell inequality into an equivalent nonlocal game with the largest possible gap, showing that finding such an optimal game reduces to solving a linear programming problem. We present analytical solutions for unique games and the CGLMP inequalities, and numerical solutions for the $I_{nn22}$ inequalities.
A crucial feature of nonlocal games is that they are played in rounds, and in each round the players either win or lose. This makes it possible to calculate the p-value of obtaining any number of victories in any number of rounds, and raises the natural question of what is the minimal number of rounds necessary to reject LHVs with a given p-value. Perhaps surprisingly, we show that it is possible to do so with a single measurement, for any chosen p-value, as had been speculated before in Refs. [13,14]. To see that, note that the p-value of obtaining a single victory in a single round of the nonlocal game with LHVs is simply the local bound of the game. If we then find a nonlocal game with local bound smaller than the desired p-value, obtaining a single victory in a single round is enough to reject LHVs. If the Tsirelson bound of the game is close to one, it is very likely that this will happen when playing the game with quantum mechanics.
We present two ways of constructing nonlocal games with the desired properties. The first is by using the parallel repetition technique, well-known in computer science, where we play n instances of a nonlocal game in parallel, in a single round, instead of in n consecutive rounds as it is usually done. As shown by Rao [15], this will turn any nonlocal game with Tsirelson bound strictly larger than the local bound into a nonlocal game with local bound arbitrarily close to zero and Tsirelson bound arbitrarily close to one. The second construction uses the Khot-Vishnoi game; as shown by Kempe et al. [16] there exists a choice of parameter for which its local bound is arbitrarily close to zero and its Tsirelson bound is arbitrarily close to one. In both cases a quantum state with unreasonably large dimension is required to obtain a single-shot rejection of LHVs.
This raises the question of what is the minimal quantum state dimension required to obtain a given gap. We show that the size of the gap is closely related to the so-called largest violation of a Bell inequality [17,18], and this relationship allows us to derive upper bounds on the size of the gap as a function of the dimension. The gaps of the constructions from the previous paragraph are much smaller than the upper bounds we found, which raises the possibility of constructing a nonlocal game that makes possible a much easier single-shot rejection of LHVs.
The paper is organised as follows: Section 1 introduces Bell inequalities and nonlocal games. Section 2 discusses how to obtain the p-value for a given nonlocal game, and shows that nonlocal games with a large gap have small p-value. Section 3 presents the algorithm for transforming a Bell inequality into an equivalent nonlocal game with the largest possible gap, and presents its results for the CGLMP and $I_{nn22}$ inequalities. Section 4 presents the nonlocal games that allow rejection of LHVs with a single shot. Section 5 presents the bounds on the size of the gap. Section 6 presents the algorithm for calculating local bounds.

Bell inequalities and nonlocal games
There are two main approaches to the study of Bell nonlocality. In the physics literature it is common to use Bell inequalities. In a scenario where two non-communicating parties, Alice and Bob, produce outcomes a and b when given settings x and y, a Bell inequality is the expression
$$\sum_{abxy} M^{ab}_{xy}\, p(ab|xy) \le L,$$
where $p(ab|xy)$ are conditional probabilities, $M^{ab}_{xy}$ are real coefficients and L is the local bound, which is the maximal value of the lhs when the probabilities admit a LHV model. When instead the probabilities are obtained from quantum mechanics, the supremum of the lhs is called the Tsirelson bound [19]. When the outcomes a and b can take only two possible values, Bell inequalities are often written instead in terms of the correlators $\langle A_x B_y \rangle := p(a=b|xy) - p(a\neq b|xy)$. The prototypical example is the CHSH inequality [2],
$$\langle A_0 B_0 \rangle + \langle A_0 B_1 \rangle + \langle A_1 B_0 \rangle - \langle A_1 B_1 \rangle \le 2,$$
which has a Tsirelson bound of $2\sqrt{2}$. For a more in-depth introduction to Bell inequalities see Ref. [20].
On the other hand, in the computer science literature it is common to use nonlocal games. Curiously, they were studied for a long time in relation to the complexity class MIP [21] before the connection to nonlocality was noticed [12]. Again for the bipartite scenario, a nonlocal game is a cooperative game in which a referee sends questions x, y sampled from a probability distribution $\mu(x, y)$ to two parties, Alice and Bob, who then provide answers a, b. The referee then accepts their answers with probability $V(a, b, x, y)$, and the parties win the game if the referee has accepted their answers. The local bound of a nonlocal game G, denoted $\omega_\ell(G)$, is the maximal probability of winning the game with LHVs, and the Tsirelson bound, denoted $\omega_q(G)$, is the supremum of the probability of winning the game with quantum mechanics. The prototypical example is the CHSH game, introduced by Tsirelson [22]. In it all questions and answers are bits, and
$$\mu(x,y) = \frac{1}{4}, \qquad V(a,b,x,y) = [a \oplus b = xy],$$
where $[\cdot]$ are Iverson brackets, i.e., $[\Pi] = 1$ if the proposition $\Pi$ is true and 0 if the proposition $\Pi$ is false. Its local bound is $\omega_\ell(G_{\mathrm{CHSH}}) = 3/4$ and its Tsirelson bound is $\omega_q(G_{\mathrm{CHSH}}) = (2 + \sqrt{2})/4$.
There is a very important sense in which Bell inequalities and nonlocal games are completely equivalent: as remarked in Refs. [23,24], and as we explore in Section 3, all Bell inequalities can be transformed into a nonlocal game without affecting their ability to detect nonlocality. As we show in Appendix A, this is not true if we restrict the predicate $V(a, b, x, y)$ to be deterministic, as is often done in the literature. As we show in Appendix B, however, if we also allow lifting, i.e., embedding the Bell inequality in a scenario with more inputs, then it is possible to transform all two-outcome Bell inequalities into nonlocal games with deterministic predicate. This does not hold for three or more outcomes. This transformation does have a cost, however: as we show in Section 3.1, the optimal nonlocal games corresponding to the CGLMP inequalities must have probabilistic predicate, even though they can always be turned into an equivalent nonlocal game with deterministic predicate and smaller gap.
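As a quick sanity check of the local bound quoted for the CHSH game, the following Python sketch enumerates all deterministic local strategies, which suffices because any LHV strategy is a mixture of deterministic ones. It assumes the standard formulation of the game (uniform questions, winning condition a XOR b = x AND y); the function name `chsh_local_bound` is ours and purely illustrative.

```python
from fractions import Fraction
from itertools import product

def chsh_local_bound():
    """Maximal winning probability of the CHSH game over deterministic local strategies."""
    best = Fraction(0)
    # A deterministic strategy fixes one answer per question:
    # (a0, a1) for Alice and (b0, b1) for Bob.
    for a0, a1, b0, b1 in product((0, 1), repeat=4):
        a, b = (a0, a1), (b0, b1)
        wins = sum((a[x] ^ b[y]) == (x & y) for x in (0, 1) for y in (0, 1))
        best = max(best, Fraction(wins, 4))  # uniform questions, mu(x, y) = 1/4
    return best

print(chsh_local_bound())
```

The same brute-force idea works for any small game, but the number of deterministic strategies grows exponentially with the number of questions, so it is only practical for tiny scenarios.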
In a wider sense, though, there are relevant differences between these concepts. In fact, there are several advantages to using the nonlocal game formulation:
1. The local and Tsirelson bounds of a nonlocal game are physically meaningful, and more generally we refer always to the probability of winning the game, as opposed to obtaining some value in the left hand side of the Bell inequality. This is very convenient for statistical analysis, and makes comparison between different nonlocal games meaningful.
2. It immediately suggests a simple and powerful statistic to analyse the results of multiple rounds of playing: the number of victories. The Bell inequality formulation suggests, on the other hand, that we should estimate each individual term and sum these estimates. This not only makes it impractical to do experiments with Bell inequalities that have a large number of terms (which is the rule, not the exception), but this statistic is vulnerable to the memory loophole: as shown in Ref. [13], it becomes possible for the LHVs to use knowledge of the past settings to slightly increase the value of the estimate. As we show in Section 2, the number of victories statistic is not vulnerable to the memory loophole.
3. It is much more pedagogical. Nonlocal games are easy to understand, and highlight essential features of Bell inequalities: that the questions x, y must be random, that particular answers a, b correspond to particular questions in particular rounds, and that the details of the experimental apparatus are irrelevant.
It's also worth mentioning an advantage of the Bell inequality formulation: they have a direct geometrical interpretation as separating hyperplanes in the set of correlations, and are very convenient to use when searching for the facets of the local polytope, which correspond to the so-called tight Bell inequalities [25].

p-values
When reporting the result of a Bell experiment, several authors write that they observed a violation of some number of standard deviations above the local bound, where this standard deviation refers to the variance of the experimental statistics [6][7][8]. For the purpose of rejecting LHV models this is not relevant, as discussed in detail in Ref. [26]. What is relevant is the p-value of the observed data according to the null hypothesis, in this case that the world is described by LHVs, as was reported in the recent loophole-free Bell tests [9][10][11].
As shown in Ref. [27], the probability of obtaining v or more victories out of n rounds with LHVs is given simply by the tail of the binomial distribution,$^1$
$$\sum_{k=v}^{n} \binom{n}{k}\, \omega_\ell(G)^k \,[1-\omega_\ell(G)]^{n-k},$$
even when taking into account the memory loophole. This is a proof that, when using the number of victories in the nonlocal game as the statistic, the memory loophole allows LHVs to play the rounds no better than simply playing them independently. The key idea behind this proof is Gill's observation that this statistic is a supermartingale [28], as was argued informally in Ref. [13] and further developed in Refs. [26,29,30]. The probability of obtaining v victories out of n rounds when playing G with the optimal quantum strategy is then $\binom{n}{v}\, \omega_q(G)^v\, [1-\omega_q(G)]^{n-v}$, and therefore the expected p-value is
$$\sum_{v=0}^{n} \binom{n}{v}\, \omega_q(G)^v\, [1-\omega_q(G)]^{n-v} \sum_{k=v}^{n} \binom{n}{k}\, \omega_\ell(G)^k \,[1-\omega_\ell(G)]^{n-k}. \tag{5}$$
As we show in Appendix C, this quantity admits an upper bound in terms of n and $\chi_G := \omega_q(G) - \omega_\ell(G)$, the gap of the nonlocal game G. Also interesting to consider is the p-value of the expected number of victories $\lceil n\omega_q(G) \rceil$, where $\lceil \cdot \rceil$ is the ceiling function, for which Appendix C also gives an upper bound. Note that in both cases the upper bound goes down monotonically with increasing gap for any n, and moreover it gets arbitrarily close to zero as the gap becomes arbitrarily close to one.
$^1$ Note that although the authors restricted the predicate $V(a, b, x, y)$ to be deterministic, their proof holds without change for the general case. The authors treated the general case by demanding the predicate to be deterministic but allowing it to attribute a score to the players, instead of just a win or a loss. They could only prove a looser bound in this case, showing that this is a bad choice.
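The two quantities discussed in this paragraph are straightforward to evaluate numerically. The sketch below is a minimal illustration, assuming that the LHV p-value is the binomial tail with success probability equal to the local bound and that the quantum strategy wins each round independently with probability equal to the Tsirelson bound; the function names `p_value` and `expected_p_value` are ours.

```python
from math import comb

def p_value(n, v, omega_l):
    """Probability of v or more victories in n rounds when each round is won
    with probability omega_l (binomial tail)."""
    return sum(comb(n, k) * omega_l**k * (1 - omega_l)**(n - k)
               for k in range(v, n + 1))

def expected_p_value(n, omega_l, omega_q):
    """Average of p_value over the number of victories obtained by a strategy
    that wins each round independently with probability omega_q."""
    return sum(comb(n, v) * omega_q**v * (1 - omega_q)**(n - v)
               * p_value(n, v, omega_l)
               for v in range(n + 1))

# Example: the CHSH game, with local bound 3/4 and Tsirelson bound (2 + sqrt(2))/4.
chsh_expected = expected_p_value(100, 0.75, (2 + 2**0.5) / 4)
```

For a single round the expected p-value reduces to omega_q * omega_l + (1 - omega_q), which is the quantity relevant for the single-shot discussion later in the paper.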
We emphasize that the gap $\chi_G$ is the right quantity to maximize if we are interested in a small p-value. In the literature it is common to maximize instead the ratio $\omega_q(G)/\omega_\ell(G)$ (further explored in Section 5), but having a large ratio does not imply having a small p-value. For example, in the Khot-Vishnoi game (explained in Section 4.2) one can obtain bounds on $\omega_\ell(G^{\mathrm{KV}}_d)$ and $\omega_q(G^{\mathrm{KV}}_d)$, valid for $d \ge 2^3$ and a power of two, for which it is easy to see that the ratio $\omega_q(G^{\mathrm{KV}}_d)/\omega_\ell(G^{\mathrm{KV}}_d)$ grows without bound with d, but both upper bounds on the p-value that we present go to one.

Optimal nonlocal game for a Bell inequality
In this section we derive the optimal nonlocal game corresponding to a Bell inequality, in the sense of maximizing the gap $\chi_G$. To start, let us define things more formally. A behaviour P is a tensor consisting of conditional probabilities for some Bell scenario, $P^{ab}_{xy} := p(ab|xy)$. A Bell functional is defined by a tensor M, and the value of the Bell functional on a behaviour P is given by
$$\langle M, P \rangle := \sum_{abxy} M^{ab}_{xy} P^{ab}_{xy}.$$
The local bound L(M) is defined as
$$L(M) := \max_{P \in \mathcal{L}} \langle M, P \rangle,$$
where $\mathcal{L}$ is the set of local behaviours, and the Tsirelson bound Q(M) is defined as
$$Q(M) := \sup_{P \in \mathcal{Q}} \langle M, P \rangle,$$
where $\mathcal{Q}$ is the set of quantum behaviours. A Bell inequality is the expression $\langle M, P \rangle \le L(M)$. A nonlocal game is defined analogously for a tensor G, with the additional restriction that
$$G^{ab}_{xy} = \mu(x,y)\, V(a,b,x,y),$$
where $\mu(x, y)$ is the probability that the referee sends questions x, y to the players, and $V(a, b, x, y)$ is the probability that the referee accepts answers a, b on questions x, y. If the predicate $V(a, b, x, y)$ is deterministic we call G a deterministic game.
The following theorem derives the conditions for a Bell functional to be a nonlocal game:

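Theorem 1. A Bell functional M is a nonlocal game if and only if it respects positivity
$$M^{ab}_{xy} \ge 0 \quad \forall\, a,b,x,y \tag{13}$$
and normalisation:
$$\sum_{xy} \max_{ab} M^{ab}_{xy} \le 1. \tag{14}$$
Proof. To show that the conditions are sufficient, assume that they are satisfied. Then we can define
$$\mu(x,y) := \frac{\max_{ab} M^{ab}_{xy}}{\sum_{x'y'} \max_{a'b'} M^{a'b'}_{x'y'}} \tag{15}$$
and
$$V(a,b,x,y) := \frac{M^{ab}_{xy}}{\mu(x,y)},$$
where 0/0 = 0, so that $\mu(x, y)$ will be in fact a probability distribution, as $\mu(x, y) \ge 0$ and $\sum_{xy} \mu(x, y) = 1$, and $0 \le V(a,b,x,y) \le 1$ by (13) and (14), so we have sufficiency. To show that the conditions are necessary, assume that M is of the nonlocal-game form $M^{ab}_{xy} = \mu(x,y) V(a,b,x,y)$. Then
$$\sum_{xy} \max_{ab} M^{ab}_{xy} = \sum_{xy} \mu(x,y) \max_{ab} V(a,b,x,y) \le \sum_{xy} \mu(x,y) = 1,$$
so normalisation is necessary. Since positivity is obvious, we are done.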
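The sufficiency part of the proof is constructive, and can be sketched in a few lines of Python. The sketch assumes, as our reading of the proof, that the question distribution is obtained by normalising the per-setting maxima of the tensor and that the predicate is the tensor divided by that distribution, with the convention 0/0 = 0. The tensor layout `M[x][y][a][b]` and the name `decompose_game` are our own choices.

```python
from fractions import Fraction

def decompose_game(M):
    """Split M[x][y][a][b] into a question distribution mu[x][y]
    and an acceptance probability V[x][y][a][b]."""
    entries = [e for Mx in M for block in Mx for row in block for e in row]
    if any(e < 0 for e in entries):
        raise ValueError("positivity is violated")
    # Largest coefficient for each pair of questions (x, y).
    peaks = [[max(max(row) for row in block) for block in Mx] for Mx in M]
    total = sum(map(sum, peaks))
    if not 0 < total <= 1:
        raise ValueError("normalisation is violated (or M is identically zero)")
    mu = [[p / total for p in row] for row in peaks]
    # Convention 0/0 = 0: a question pair with mu = 0 gets predicate 0.
    V = [[[[e / mu[x][y] if mu[x][y] else 0 for e in row] for row in block]
          for y, block in enumerate(Mx)] for x, Mx in enumerate(M)]
    return mu, V

# CHSH game tensor: 1/4 if a XOR b == x AND y, and 0 otherwise.
chsh = [[[[Fraction(int((a ^ b) == (x & y)), 4) for b in (0, 1)] for a in (0, 1)]
         for y in (0, 1)] for x in (0, 1)]
mu, V = decompose_game(chsh)
```

For the CHSH tensor this recovers the uniform distribution over questions and the deterministic predicate, as expected.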
A nonlocal game G is equivalent to a Bell functional M if it can be obtained from M via transformations that have the same effect in any non-signalling behaviour. This is important because such transformations will not change which hyperplane the equation $\langle M, P \rangle = K$ defines on the set of non-signalling behaviours, and in particular if a Bell inequality $\langle M, P \rangle \le L(M)$ is a facet of the local polytope so will be the expression $\langle G, P \rangle \le \omega_\ell(G)$ for the corresponding nonlocal game.
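As shown in Ref. [31], these transformations are
1. Adding a constant $c_{xy}$ to each outcome of each setting x, y, i.e.,
$$M^{ab}_{xy} \mapsto M^{ab}_{xy} + c_{xy}. \tag{20}$$
This takes K to $K + \sum_{xy} c_{xy}$.
2. Multiplying M by a positive constant d, i.e.,
$$M^{ab}_{xy} \mapsto d\, M^{ab}_{xy}. \tag{21}$$
This takes K to dK.
3. Adding some no-signalling constraint to M. This leaves K invariant.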
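The next theorem shows the optimal nonlocal game that can be obtained from a Bell functional by considering only the first two transformations:
Theorem 2. The nonlocal game with the largest gap $\chi_G$ that can be obtained from a Bell functional M via translation (20) and scaling (21) is given by
$$G^{ab}_{xy} = \frac{M^{ab}_{xy} + \alpha_{xy}}{\beta + \alpha},$$
where
$$\beta := \sum_{xy} \max_{ab} M^{ab}_{xy} \tag{23}$$
and
$$\alpha_{xy} := -\min_{ab} M^{ab}_{xy}, \qquad \alpha := \sum_{xy} \alpha_{xy}. \tag{24}$$
Its gap is given by
$$\chi_G = \frac{Q(M) - L(M)}{\beta + \alpha}.$$
Proof. First note that the result of the most general translation and scaling of a Bell functional M can be written as
$$G^{ab}_{xy} = \frac{M^{ab}_{xy} + \alpha_{xy} + \alpha'_{xy}}{\beta + \beta' + \alpha + \alpha'}$$
for $\alpha_{xy}$, $\beta$ defined as in the statement of the theorem, and some arbitrary constants $\alpha'_{xy}$ and $\beta'$, where again $\alpha' := \sum_{xy} \alpha'_{xy}$.
Moreover, $G^{ab}_{xy}$ is a valid nonlocal game, respecting positivity (13) and normalisation (14), if and only if $\alpha'_{xy} \ge 0$ and $\beta' \ge 0$. To see that, note that
$$\min_{ab} G^{ab}_{xy} = \frac{\alpha'_{xy}}{\beta + \beta' + \alpha + \alpha'},$$
so for $\beta' \ge 0$ we have $G^{ab}_{xy} \ge 0$ iff $\alpha'_{xy} \ge 0$, and that
$$\sum_{xy} \max_{ab} G^{ab}_{xy} = \frac{\beta + \alpha + \alpha'}{\beta + \beta' + \alpha + \alpha'},$$
so $\sum_{xy} \max_{ab} G^{ab}_{xy} \le 1$ iff $\beta' \ge 0$. Both the local and Tsirelson bounds transform as $K \mapsto \frac{1}{\beta+\beta'+\alpha+\alpha'}(K + \alpha + \alpha')$, so the gap of G is given by
$$\chi_G = \frac{Q(M) - L(M)}{\beta + \beta' + \alpha + \alpha'},$$
and it is maximized by setting $\beta' = \alpha' = 0$.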
Note that for this optimal game the signalling bound, the highest probability of success attainable with an arbitrary behaviour, is always equal to 1. This is not always the case for the non-signalling bound, the highest probability of success attainable with a non-signalling behaviour.
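The translation and scaling of Theorem 2 can be illustrated on the CHSH functional written in terms of probabilities. The sketch below assumes our reading of the construction: each setting is shifted by minus its smallest coefficient, and the result is divided by the sum over settings of the difference between the largest and smallest coefficients. The tensor layout `M[x][y][a][b]` and the function name are our own.

```python
from fractions import Fraction

def optimal_translation_scaling(M):
    """Return (G, alpha, beta) for an integer or Fraction tensor M[x][y][a][b]."""
    alpha, beta, shifted = 0, 0, []
    for Mx in M:
        shifted_row = []
        for block in Mx:
            flat = [e for row in block for e in row]
            a_xy = -min(flat)          # smallest shift making this setting non-negative
            alpha += a_xy
            beta += max(flat)
            shifted_row.append([[e + a_xy for e in row] for row in block])
        shifted.append(shifted_row)
    # Division fails if M is constant for every setting (beta + alpha = 0).
    scale = Fraction(1, 1) / (beta + alpha)
    G = [[[[e * scale for e in row] for row in block] for block in Mx]
         for Mx in shifted]
    return G, alpha, beta

# CHSH functional in probability form: coefficient +1 if a XOR b == x AND y, else -1.
chsh_functional = [[[[(-1) ** (a ^ b ^ (x & y)) for b in (0, 1)] for a in (0, 1)]
                    for y in (0, 1)] for x in (0, 1)]
G, alpha, beta = optimal_translation_scaling(chsh_functional)
local_bound_game = Fraction(2 + alpha, beta + alpha)  # image of the local bound 2
```

For this input the construction returns the CHSH game tensor, and the local bound 2 of the inequality is mapped to 3/4, in agreement with the values quoted in Section 1.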
We now turn our attention to the remaining transformation, adding no-signalling constraints. Since for any M the optimal β and α will be given by equations (23) and (24), our goal is to minimize the sum β + α given by these equations over the no-signalling constraints 2 , in order to maximize the gap χ G .
The no-signalling constraints are equations of the form
$$\sum_b p(0b|00) - \sum_b p(0b|01) = 0, \tag{30}$$
meaning that the marginal probability that Alice obtains result 0 does not depend on whether Bob's input is 0 or 1. Since these are satisfied by all non-signalling behaviours (by definition), it means that we can add the corresponding coefficients to M, scaled by any constant, without changing its effect on non-signalling behaviours. For example, adding equation (30) times a constant $\gamma$ means transforming M as
$$M^{0b}_{00} \mapsto M^{0b}_{00} + \gamma, \qquad M^{0b}_{01} \mapsto M^{0b}_{01} - \gamma \qquad \forall\, b.$$
Let then $\{S_i\}_{i=1}^{N}$ be the set of no-signalling constraints for the corresponding Bell scenario. In the scenario with $s_A, s_B$ inputs and $k_A, k_B$ outputs for Alice and Bob there are
$$N = s_A (k_A - 1)(s_B - 1) + s_B (k_B - 1)(s_A - 1)$$
independent ones, where independent means linearly independent and inequivalent under translation (20). They can be chosen as a basis for the signalling vector space defined in Ref. [31]. Defining then
$$M(\gamma) := M + \sum_{i=1}^{N} \gamma_i S_i,$$
the optimal nonlocal game is found by solving
$$\min_{\gamma \in \mathbb{R}^N} \sum_{xy} \Big[ \max_{ab} M(\gamma)^{ab}_{xy} - \min_{ab} M(\gamma)^{ab}_{xy} \Big]. \tag{33}$$
This is a linear programming problem, which can be solved efficiently by numerical methods. We provide an implementation of this algorithm in Python using CVXPY [33,34] as an ancillary file.
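To make the linear-programming structure explicit, one standard way of writing the minimisation of $\beta + \alpha$ over the no-signalling constraints is the epigraph form below. The auxiliary variables $u_{xy}$ and $l_{xy}$ are our notation, introduced only for this sketch; the ancillary implementation may be organised differently.

```latex
\begin{aligned}
\min_{\gamma,\,u,\,l} \quad & \sum_{xy} \left( u_{xy} - l_{xy} \right) \\
\text{s.t.} \quad & l_{xy} \;\le\; M^{ab}_{xy} + \sum_{i=1}^{N} \gamma_i \,(S_i)^{ab}_{xy} \;\le\; u_{xy}
\qquad \forall\, a, b, x, y .
\end{aligned}
```

At the optimum each $u_{xy}$ equals the largest and each $l_{xy}$ the smallest coefficient of the transformed functional for the setting x, y, so the objective equals the value of $\beta + \alpha$ for the transformed functional, and both the objective and the constraints are linear in the variables.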
We managed to find an analytic solution for two special cases: for the CGLMP inequalities and for when the nonlocal game obtained from M via Theorem 2 is a unique game, that is, when the predicate is of the form $V(a, b, x, y) = [a = \sigma_{xy}(b)]$ for some permutations $\sigma_{xy}$. Note that XOR games are a particular case of unique games. The solution for the CGLMP inequalities is presented in Section 3.1, and for unique games it is to do nothing: when M corresponds to a unique game it is already optimal. In both cases the solution is unique: any change in the amount of no-signalling constraints decreases the gap. We prove this in Appendix E.
In both these cases the Bell functional of the optimal solution was orthogonal to the signalling vector space. In general, though, this is not the case, as we found for the $I_{nn22}$ inequalities for $n \ge 3$ (explored below). Moreover, the solution of the linear programming problem $g := \beta + \alpha$ does not in general uniquely determine the local and Tsirelson bounds of the optimal nonlocal game, as they are given by
$$\omega_\ell(G) = \frac{L(M) + \alpha}{g}, \qquad \omega_q(G) = \frac{Q(M) + \alpha}{g},$$
and it is sometimes possible to change $\alpha$ while keeping g constant. We shall see this in the example of the $I_{4422}$ inequality below.
This raises the question of how to choose α in general. We choose to maximize α, as we want the probability of winning the game with quantum mechanics to be as high as possible. We thus need to solve another linear program, maximizing α with the additional constraint that β + α = g.
This will finally give us optimal, unique, and physically meaningful local and Tsirelson bounds for a given Bell functional. We propose that the bounds so obtained should be taken as the local and Tsirelson bounds for the Bell functional.
To illustrate the method and examine some properties of the resulting nonlocal games, we applied it to the CGLMP and $I_{nn22}$ inequalities.

CGLMP inequalities
The CGLMP inequalities, introduced in Ref. [35], are a family of inequalities for two parties, with two inputs per party labelled as 0 and 1, and d outputs per party, labelled from 0 to d − 1, defined for all integers d ≥ 2. They are tight for all d. Using the form of the inequalities presented in Ref. [36], it is easy to see that the transformations in Theorem 2 take them to the nonlocal game with probabilistic predicate G CGLMP d where the probability distribution of the inputs x, y is the uniform one, µ(x, y) = 1/4, and the predicate is where [·] are Iverson brackets, and 1 − k is the probability that the referee accepts the answers if the corresponding condition is met.
As proven in Appendix E, this predicate is the optimal and unique solution of the linear programming problem (33), meaning that it gives the optimal gap for all d. Moreover, since the solution is unique, it is not possible to turn G CGLMP d into a nonlocal game with deterministic predicate without decreasing the gap. If one accepts the reduction of the gap, it is possible to do so for every d. For example, it can be turned into an equivalent game with the deterministic predicate 3 and the same probability distribution over the inputs, µ(x, y) = 1/4. The gap gets multiplied by (d − 1)/d, however. This game is equivalent to the simplified CGLMP inequality found in Ref. [37]. Going back to the optimal game, its local bound is for all d. Its Tsirelson bound is not known exactly, but quantum states and measurements are known up to d = 8 that match the upper bounds given by the second level of the NPA hierarchy within numerical precision [38,39]. They are summarized in the following table:

I nn22 inequalities
The $I_{nn22}$ inequalities are a family of bipartite Bell inequalities with n inputs and 2 outputs per party. They were introduced in Ref. [41] and shown to be tight for all n in Ref. [42]. Their Tsirelson bounds are in general not known, and for the $I_{3322}$ inequality reaching the bound might even require infinite-dimensional quantum systems [43].
We ran our algorithm to generate an optimal nonlocal game G nn22 up to n = 8. The Tsirelson bounds were upperbounded by the level 2 of the NPA hierarchy. The results are summarized in the following table: Note that for these cases the local bound is equal to within numerical precision. One interesting feature of the solutions is that for n ≥ 3 it always required the nonlocal game to have a nonzero projection in the signalling vector space. That is, if one simply takes the unique form from Ref. [31], that has zero projection, and calculates the optimal game according to Theorem 2, one gets a gap that is smaller than the optimal one. For example 4 , for n = 3 the gap so obtained is 0.0471.
Another interesting feature is that for $n \ge 4$ solving the linear program for the optimal gap did not uniquely determine the local and Tsirelson bounds. For example, for n = 4 it was possible to reduce the local bound down to 0.7778 without changing the gap. One consequence of this fact is that the nonlocal game with optimal gap does not always have non-signalling bound equal to 1; in this case it was 0.9444.
The nonlocal game returned by the linear program did not have a particularly simple form, but for the case n = 3 we managed to use the no-signalling constraints to take it to a nice form without reducing the gap. The tensor is given by where an element G ab xy is written as the element (a, b) of the 4 × 4 submatrix at coordinate (x, y).

Diviánszky-Bene-Vértesi inequality
It is also interesting to consider the Bell inequality introduced by Diviánszky, Bene, and Vértesi in Ref. [44], which gives the best known lower bound on the real Grothendieck constant of order three. Since XOR games are a particular case of unique games, Appendix E implies that the optimal nonlocal game corresponding to it is simply the one given by Theorem 2. It has local bound
$$\omega_\ell(G_{\mathrm{DBV}}) = \frac{718334945}{1112439876} \approx 0.6457. \tag{41}$$
Together with its Tsirelson bound, this shows that although the ratio $L(M_{\mathrm{DBV}})/Q(M_{\mathrm{DBV}})$ is close to $K_R(3)$, the maximum possible for a maximally entangled state in dimension 2, the gap is even smaller than the one of the CHSH game.

Violating a Bell inequality with a single shot
If you have played a single round of a nonlocal game G and won, the p-value of that victory is simply the local bound of the game, $\omega_\ell(G)$. The probability of obtaining this victory when playing with the optimal quantum strategy is its Tsirelson bound $\omega_q(G)$, so if the gap $\chi_G$ is close to one we will obtain with high probability a result that has small p-value under LHVs. More formally, for a single round the expected p-value (5) is
$$\omega_q(G)\,\omega_\ell(G) + 1 - \omega_q(G) = 1 - \omega_q(G)\,[1 - \omega_\ell(G)] \le 1 - \chi_G^2,$$
where the inequality holds because both $\omega_q(G)$ and $1 - \omega_\ell(G)$ are at least $\chi_G$. Since we also have that $\omega_\ell(G) \le 1 - \chi_G$, it is enough to construct a family of nonlocal games with gap $\chi_G$ arbitrarily close to one in order to get a single-shot rejection of LHVs for any chosen p-value. We describe here two ways of obtaining such a gap: via parallel repetition and via the Khot-Vishnoi game.

Parallel repetition
An obvious thing to try is to play n instances of a nonlocal game in parallel, as a single round, instead of playing them in n consecutive rounds as is usually done. We would expect that, in analogy with the consecutive case, the probability of winning with LHVs a fraction of games $\delta$ larger than the fraction expected from the local bound decays exponentially with n, while the probability of winning such a fraction quantumly goes exponentially to one. It is not that simple, though, because in this scenario the LHV model has access to all inputs of each party simultaneously, and this does make it more powerful in general. As shown in Ref. [13], the probability of winning two parallel CHSH games with LHVs is 10/16, strictly higher than the $(3/4)^2 = 9/16$ one gets by playing them independently. The problem can get even more extreme: for the Fortnow-Feige-Lovász game the probability of winning two parallel instances is the same as the probability of winning a single instance, 2/3 [45,46], although for three instances the probability does decrease to 14/27.
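The failure of independence for parallel CHSH games can be checked by brute force for small n. The sketch below is not the algorithm of Section 6; it simply enumerates Alice's deterministic strategies and, for each of them, picks Bob's best answer to each of his questions separately, which is enough to find the optimum. It assumes that the players must win every instance, and the function name is ours.

```python
from fractions import Fraction
from itertools import product

def parallel_chsh_local_bound(n):
    """Local bound of n parallel CHSH games when every instance must be won."""
    bits = list(product((0, 1), repeat=n))  # questions and answers are n-bit strings

    def wins_all(a, b, x, y):
        return all((ai ^ bi) == (xi & yi) for ai, bi, xi, yi in zip(a, b, x, y))

    best = 0
    # Enumerate Alice's deterministic strategies: one n-bit answer per n-bit question.
    for alice in product(bits, repeat=len(bits)):
        total = 0
        for y in bits:
            # Bob's best answer to question y does not depend on his other questions.
            total += max(sum(wins_all(alice[i], b, x, y) for i, x in enumerate(bits))
                         for b in bits)
        best = max(best, total)
    return Fraction(best, len(bits) ** 2)  # uniform distribution over question pairs

single = parallel_chsh_local_bound(1)
double = parallel_chsh_local_bound(2)
```

The enumeration has 4 strategies for Alice when n = 1 and 256 when n = 2; for n = 3 it already has $8^8$ strategies, which is why a dedicated algorithm is needed beyond the smallest cases.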
Nevertheless, the idea still works, because the probability of winning such a fraction of games still goes down exponentially with n, as proven by Rao's concentration bound [15]. To state it, let us define the parallel game more formally. Consider a nonlocal game G with m inputs and d outputs per party, in which the players win with probability$^5$ $V(a, b, x, y)$ if they give outputs a, b for inputs x, y. Its parallel version is then the game $G^n_\delta$ where they play n copies of G in parallel, and they win at $G^n_\delta$ if they win $n(\omega_\ell(G)+\delta)$ or more instances of G. The concentration bound$^6$ (44) on $\omega_\ell(G^n_\delta)$ is then expressed in terms of
$$t := \frac{4\delta^2}{4\delta^2 + 75^2 \log_2 d + 75^2 \log_2 [1/(\omega_\ell(G) + 2\delta/3)]}. \tag{45}$$
While computing the Tsirelson bound of $G^n_\delta$ is also difficult, we can obtain a good enough lower bound by playing each instance of G independently with the optimal quantum strategy. The probability of winning $n(\omega_\ell(G) + \delta)$ or more instances of G for $\delta < \omega_q(G) - \omega_\ell(G)$ is then lower-bounded by the Chernoff bound
$$\omega_q(G^n_\delta) \ge 1 - \exp\big(-n D(\omega_\ell(G) + \delta \,\|\, \omega_q(G))\big), \tag{46}$$
which does go exponentially to one, as we want.
One might consider the possibility of simplifying the discussion by considering nonlocal games for which the Tsirelson bound is one, known as quantum pseudo-telepathy games [47], of which a good example is the magic square game [48]. As noticed in Ref. [16], they give us an easy way to construct a nonlocal game with gap arbitrarily close to one: we simply demand the players to win all parallel instances, as in the ideal case this happens with probability one using quantum mechanics. The p-value of such an event is given by Raz's parallel repetition theorem [49,50], which also gives a tighter bound than Rao's concentration bound. It is, however, completely unrealistic to demand a real experiment to win all parallel instances, as it leaves no room for experimental error. Using instead the concentration bound we get a result that is robust against experimental imperfections, and as an added bonus it applies to any nonlocal game, not only pseudo-telepathy ones.
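The Chernoff bound (46) is easy to evaluate and to compare against the exact binomial tail. The sketch below assumes that D denotes the binary relative entropy measured in nats, which is the convention under which the bound with a natural exponential holds; the function names are ours.

```python
from math import ceil, comb, exp, log

def binary_relative_entropy(p, q):
    """D(p||q) for Bernoulli distributions, in nats; requires 0 < p, q < 1."""
    return p * log(p / q) + (1 - p) * log((1 - p) / (1 - q))

def chernoff_lower_bound(n, omega_l, omega_q, delta):
    """Lower bound on the probability of winning at least n*(omega_l + delta)
    instances when each is won independently with probability omega_q."""
    return 1 - exp(-n * binary_relative_entropy(omega_l + delta, omega_q))

def exact_tail(n, omega_l, omega_q, delta):
    """Exact probability of the same event, for comparison."""
    threshold = ceil(n * (omega_l + delta))
    return sum(comb(n, k) * omega_q**k * (1 - omega_q)**(n - k)
               for k in range(threshold, n + 1))

# Example with the CHSH values and an illustrative delta = 0.05.
omega_l, omega_q = 0.75, (2 + 2**0.5) / 4
bound = chernoff_lower_bound(200, omega_l, omega_q, 0.05)
exact = exact_tail(200, omega_l, omega_q, 0.05)
```

The bound is only meaningful for $\delta$ smaller than the gap, since otherwise the threshold exceeds the expected fraction of victories.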
Ironically enough, the nonlocal game for which the concentration bound gave the smallest upper bound we found was in fact a pseudo-telepathy game, consisting of two parallel repetitions of the magic square game. In the case of a single repetition, the magic square game has 3 inputs and 4 outputs per player, local bound 8/9, and requires two singlets to be won with probability 1. For the case of two repetitions we showed that the local bound is
$$\omega_\ell(G_{\mathrm{MS2}}) = \frac{66}{81}$$
using the algorithm from Section 6. Setting $\delta = 1 - \frac{66}{81} - \frac{1}{100}$, to allow the players to lose $\frac{1}{100}$ of the games, we find that for a p-value of $10^{-5}$ it is sufficient to play
$$n_{\mathrm{MS2}} = 32\,654\,296 \tag{48}$$
parallel copies of $G_{\mathrm{MS2}}$. For the CHSH game $G_{\mathrm{CHSH}}$, setting $\delta = \frac{2+\sqrt{2}}{4} - \frac{3}{4} - \frac{1}{100}$, and again for a p-value of $10^{-5}$, we find that
$$n_{\mathrm{CHSH}} = 67\,683\,296 \tag{49}$$
parallel copies are enough. The probability of winning this many instances quantumly is extremely close to one. It might seem that it is easier to achieve a single-shot violation with $G_{\mathrm{MS2}}$ than with $G_{\mathrm{CHSH}}$, but looking only at the number of repetitions is misleading, as we need 4 singlets to play each instance of $G_{\mathrm{MS2}}$, but only 1 singlet for each instance of $G_{\mathrm{CHSH}}$. A more meaningful measure of the experimental effort is the dimension of the quantum system required to achieve the single-shot violation, which is
$$d_{\mathrm{MS2}} = 2^{130\,617\,184} \tag{50}$$
and $d_{\mathrm{CHSH}} = 2^{67\,683\,296}$, so $G_{\mathrm{CHSH}}$ is actually better. We also considered parallel repetitions of $G^{\mathrm{CGLMP}}_d$ and $G_{I_{nn22}}$, described in Sections 3.1 and 3.2, but they always required larger dimensions. We do not expect these numbers to be close to the true dimension required for a single-shot violation, because the concentration bound is extremely loose. For instance, for small n it gives us a bound very close to 2.
To have a better idea of what the minimal required dimension is, we investigated the actual local bounds for $G^n_{\mathrm{CHSH},\delta}$ and $G^n_{\mathrm{MS},\delta}$ for the same $\delta$ as before (which in any case requires the players to win all parallel instances for up to 6 instances of the CHSH game and 8 instances of the magic square game). It is known that $\omega_\ell(G^1_{\mathrm{CHSH},\delta}) = 3/4$ and $\omega_\ell(G^2_{\mathrm{CHSH},\delta}) = 10/16$, and with the algorithm from Section 6 we calculated that $\omega_\ell(G^3_{\mathrm{CHSH},\delta}) = 31/64$. Moreover, using a classical version of the see-saw algorithm, for which we provide an implementation in C as an ancillary file, the best lower bounds we could find for n = 4, 5, 6 are $\omega_\ell(G^4_{\mathrm{CHSH},\delta}) \ge 100/256$, $\omega_\ell(G^5_{\mathrm{CHSH},\delta}) \ge 310/1024$, and $\omega_\ell(G^6_{\mathrm{CHSH},\delta}) \ge 1000/4096$. These lower bounds are achievable by using trivial combinations of the optimal strategies for $G^2_{\mathrm{CHSH},\delta}$ and $G^3_{\mathrm{CHSH},\delta}$. A similar phenomenon happened for the magic square game. It is known that $\omega_\ell(G^1_{\mathrm{MS},\delta}) = 8/9$, we could show that $\omega_\ell(G^2_{\mathrm{MS},\delta}) = 66/81$, and for n = 3, 4 the best lower bounds we found were $\omega_\ell(G^3_{\mathrm{MS},\delta}) \ge 528/729$ and $\omega_\ell(G^4_{\mathrm{MS},\delta}) \ge 4356/6561$, again achievable by trivially combining the optimal strategies for lower n. If indeed no new strategies appear, it would be true that $\omega_\ell(G^n_{\mathrm{MS}}) \le (\sqrt{66}/9)^n$ (where here we are requiring the players to win all instances), and a mere 113 parallel repetitions of $G_{\mathrm{MS}}$ would be enough for a single-shot violation with p-value $10^{-5}$.
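The number of repetitions quoted for the conjectured magic square bound follows from a one-line calculation. The sketch below assumes the conjectured per-copy bound stated in the text and simply finds the smallest n for which its n-th power drops below the target p-value; the function name is ours.

```python
from math import ceil, log, sqrt

def repetitions_needed(per_copy_bound, p):
    """Smallest n with per_copy_bound**n <= p, for 0 < per_copy_bound < 1 and 0 < p < 1."""
    return ceil(log(p) / log(per_copy_bound))

n_magic_square = repetitions_needed(sqrt(66) / 9, 1e-5)
print(n_magic_square)
```

Because the magic square game needs two singlets per instance, this conjectured number of repetitions would correspond to a far smaller quantum system than the one implied by the concentration bound.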
The main results of this section can be summarised as follows:
Result 1. For any nonlocal game G with a quantum violation, it is possible to obtain a single-shot violation for any desired p-value p > 0 with any quantum probability of success q < 1, by constructing the parallel game $G^n_\delta$ with n, $\delta$ chosen so that the concentration bound (44) implies $\omega_\ell(G^n_\delta) \le p$ and the Chernoff bound (46) implies $\omega_q(G^n_\delta) \ge q$. Moreover, this single-shot violation is robust against experimental imperfections.

The Khot-Vishnoi game
Parallel repetition is not the only way of obtaining a nonlocal game with a gap arbitrarily close to one. As already outlined in Ref. [16], it is also possible to do that with the Khot-Vishnoi game. This nonlocal game was introduced in Ref. [16], based on a construction by Khot and Vishnoi [51]. We present here its formulation from Ref. [23].
The game $G^{\mathrm{KV}}_d$ is defined by an integer $d \ge 2$, restricted to be a power of 2, and a noise parameter $\eta \in [0, 1/2]$. It is a bipartite game, with $2^d/d$ inputs and d outputs per party. As shown in Ref. [23], its local and Tsirelson bounds respect the inequalities (52). In Refs. [23,52], for example, the parameter $\eta$ is chosen to be close to $\frac{1}{2} - \frac{1}{2\log(n)}$, in order to get a large ratio $\omega_q(G^{\mathrm{KV}}_d)/\omega_\ell(G^{\mathrm{KV}}_d)$. This choice results in the bounds shown in Section 2, which have a very small gap $\omega_q(G^{\mathrm{KV}}_d) - \omega_\ell(G^{\mathrm{KV}}_d)$. Optimizing instead for a large gap, we make a different choice of $\eta$, valid for $d \ge 2^6$, which is the same minimal d for which the Khot-Vishnoi game has a quantum advantage.
With this choice of $\eta$ the resulting bounds imply that the gap gets arbitrarily close to one. It then follows that a certain dimension $d_{\mathrm{KV}}$ is enough to achieve a p-value of $10^{-5}$. For this d we have $\omega_q(G^{\mathrm{KV}}_d) \ge 0.999$, so the single-shot violation is actually possible. Note that $d_{\mathrm{KV}}$ is much smaller than $d_{\mathrm{CHSH}}$, the smallest Hilbert space dimension for which we could prove a single-shot violation with parallel repetition, but as we argued before we expect this to be only an artefact of the looseness of the concentration bound. Furthermore, parallel repetition is a much easier experiment to perform than playing $G^{\mathrm{KV}}_d$, as the latter requires one to do entangled measurements on a gigantic quantum system, whereas parallel repetition requires only independent measurements.

Bounds on the achievable gap
It is also interesting to consider the maximal gap $\chi_G$ that can be achieved by quantum states of a given dimension. For this purpose it is convenient to introduce the quantity which gets arbitrarily large as the gap gets arbitrarily close to one. $\Xi(G)$ is closely related to the so-called largest violation of a Bell inequality, introduced in Refs. [17,18], which is defined for nonlocal games as It is easy to see that and equality holds when $\omega_q(G) = 1$. This implies that upper bounds on $\mathrm{LV}(G)$ are also upper bounds on $\Xi(G)$, and that a nonlocal game with a large $\Xi(G)$ will have all the benefits associated with $\mathrm{LV}(G)$, such as high resistance to noise [17], in addition to having small expected p-value. We consider then the largest $\Xi$ that can be achieved by a quantum state of local dimension $d$, taking the supremum over all possible nonlocal games, in which case we write $\Xi(d)$. In Appendix D we obtain upper bounds on $\Xi(d)$ by extending the LHV models for noisy quantum states from Ref. [53] and the bounds they imply on $\mathrm{LV}(d)$, generalising the technique introduced in Ref. [54]. We obtain that If we restrict the measurements in the quantum strategies to be projective, we obtain the tighter bound where $H(d) := \sum_{i=1}^{d} \frac{1}{i}$ and $\gamma \approx 0.577$ is the Euler-Mascheroni constant.
When $G$ is an XOR game, the results of Tsirelson imply that is the real Grothendieck constant of order $d$ [55,56]. When in addition the quantum state is restricted to be $|\phi_d\rangle$, the maximally entangled state of local dimension $d$, it holds that . Using the fact that $\omega(G) \le \frac{1}{\Xi(G)}$, these bounds imply that and Note that for $d = 2$ the assumption that $G$ is an XOR game can be replaced with the assumption that $G$ only has two outcomes per party, or equivalently that the measurements are restricted to be projective. It is known that $1.4359 \le K_R(3) \le 1.4644$ [44,57], which implies that $\Xi_{\mathrm{XOR},|\phi_2\rangle}(2) \le 1.1885$. Note furthermore that these bounds cannot be tight, because $\omega(G) = \frac{1}{\Xi(G)}$ only when $\omega_q(G) = 1$, but for games with two outcomes per party $\omega_q(G) = 1$ implies $\omega(G) = 1$ [12].
The results of Sections 4.1 and 4.2 give us lower bounds for $\Xi(d)$. The parallel repetition construction gives us for any nonlocal game Ξ(G n δ ) ≥ exp(−nD(ω (G) + δ||ω q (G)) which for the CHSH game results in The bounds for the Khot-Vishnoi game imply that which is asymptotically smaller. Both results are very far from the existing upper bound, but we expect it to be actually achievable. As we discussed in Section 4.1, we expect a better concentration bound to show that Ξ(d) close to d deterministic strategies and take the maximum. Since the number of strategies is exponential in both $m_A$ and $m_B$, this becomes impractical very quickly. A significantly faster algorithm can be obtained if we notice that for any given strategy of Bob (which we choose as the party with the smaller number of possible strategies) it is trivial to determine the corresponding optimal strategy of Alice. To be more precise, let $P$ be the behaviour generated by the deterministic probability distributions $D_A(a|x)$ and  , x), and take the maximum over the optimal value for each of Bob's strategies. Note that this algorithm will be especially efficient when the nonlocal game is asymmetric, i.e., when

We provide an implementation of this algorithm in C as an ancillary file.

We dedicate this work to the memory of Boris Tsirelson.
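The faster local-bound search described above, which enumerates only Bob's deterministic strategies and picks Alice's best output separately for each of her inputs, can be sketched in a few lines. This is a minimal Python illustration and not the authors' C implementation. The names `local_bound` and `chsh_parallel`, and the layout `M[a][b][x][y]` (coefficients already including the input probabilities), are assumptions made for this example. For brevity it always enumerates Bob and does not swap the parties when Alice has fewer strategies.

```python
from itertools import product


def local_bound(M):
    """Local bound of a Bell functional stored as a nested list M[a][b][x][y]."""
    n_a, n_b = len(M), len(M[0])              # numbers of outputs
    m_a, m_b = len(M[0][0]), len(M[0][0][0])  # numbers of inputs
    best = float("-inf")
    # Enumerate only Bob's deterministic strategies: bob[y] is his output for input y.
    for bob in product(range(n_b), repeat=m_b):
        value = 0.0
        for x in range(m_a):
            # Given Bob's strategy, Alice's best output is chosen independently for each x.
            value += max(
                sum(M[a][bob[y]][x][y] for y in range(m_b)) for a in range(n_a)
            )
        best = max(best, value)
    return best


def chsh_parallel(n):
    """n parallel CHSH games with uniform inputs, where all instances must be won.

    Inputs and outputs are n-bit integers, so the bitwise operators check
    the condition a_i XOR b_i == x_i AND y_i for every instance i at once.
    """
    size = 2 ** n
    prob = 1.0 / size ** 2
    return [[[[prob if (a ^ b) == (x & y) else 0.0
               for y in range(size)] for x in range(size)]
             for b in range(size)] for a in range(size)]


print(local_bound(chsh_parallel(1)))  # 0.75
print(local_bound(chsh_parallel(2)))
```

The cost is the number of Bob's strategies times $m_A n_A m_B$ operations, rather than the product of the numbers of strategies of both parties.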

A Nonlocal games with probabilistic predicate
Here we show that there exists a Bell functional that cannot be transformed into a nonlocal game with deterministic predicate by providing an explicit example. To start with, note that if a Bell functional $M$ is such that for each setting $x, y$ the tensor $M^{ab}_{xy}$ takes only two different values, then it is easy to transform it into a deterministic nonlocal game via translation (20) and scaling (21): just take the form from Theorem 2.
On the other hand, if there are more than two different values in each setting, translation and scaling won't help, as they cannot change the number of different coefficients in a given setting. Therefore, the question of whether a Bell functional can be transformed into a deterministic game reduces to whether we can use the no-signalling constraints to make each setting have only two different values. Intuitively it's clear that this shouldn't be possible: in a scenario with $d$ outcomes per party each setting will have $d^2$ coefficients, but only $2d - 2$ no-signalling constraints will act non-trivially on it.
Consider then a Bell functional such that the coefficients of one of the settings are given by The most general way to transform it with the no-signalling constraints takes it to To make this setting take only two values, there are three possibilities:
1. All coefficients except (0, 0) are equal to each other.
We shall examine all these possibilities in turn, and show that all imply that this setting takes at least three different values.
1. To make all these coefficients equal, we need in particular to make those in the last row equal, which implies that b = 2 and b = 5. But this implies that there are three different coefficients in the first row, so this does not work.
2. To make the coefficient (0, 1) equal to zero, we need to set b = 0. This implies that the coefficients (1, 0) and (1, 1) are equal to 1 + a and 2 + a. Since it's not possible to make them equal, at least one of them must be equal to zero. To zero the first one, set a = −1. This implies that the coefficient (1, 1) is equal to 1, and that the coefficients (0, 2) and (1, 2) are b and 5 + b . Since we already have two different coefficients, these latter two must be equal to either 0 or 1. But this is not possible, since their difference is always 5. The cases (0, 2), (1, 0), and (2, 0) follow from a similar argument, so we omit them.
3. To make the coefficient (1, 1) equal to zero, we need to set b = −2−a. This implies that the first two columns of R are To make all these nonzero coefficients equal, we need in particular to set a = −4, but this takes the coefficient (0, 1) to 2 and the coefficient (1, 0) to −3, making this submatrix take three different values already, so this does not work. Therefore, at least one of the nonzero coefficients must be equal to zero. This leads us to examine the possibilities of setting a = 2, a = −1, a = −6, and a = −2 + a, which all lead to at least three different values in this submatrix. The cases (1, 2), (2, 1), or (2, 2) follow from a similar argument, so we omit them.

B Lifting nonlocal games with probabilistic predicate
Here we show that if we also consider equivalence under liftings, i.e., addition of inputs, it is possible to transform all Bell functionals with two outcomes into nonlocal games with deterministic predicate.
If we set $\alpha_{xy} := -M^{01}_{xy} + M^{11}_{xy}$ and $\beta_{xy} := -M^{10}_{xy} + M^{11}_{xy}$, direct calculation shows that for every fixed pair of inputs $(x, y)$ the coefficients $N^{ab}_{xy}$ can only assume two different values. We can now perform the construction presented in the proof of Theorem 2 to obtain a deterministic game $G$ which is equivalent to $M$ in this extra-input scenario.
Note that the example shown in Appendix A implies that even under liftings it is not possible to transform Bell functionals with three or more outputs per party into a nonlocal game with deterministic predicate.

C Upperbounding the expected p-value
In this appendix we leave the argument G implicit in order to simplify notation. Let then as in the main text.
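The bound $p(n) \le (1 - \chi^2)^n$ of equation (92), proved in this appendix, can be checked numerically. The Python sketch below rests on an assumption about the definitions recalled above: that $p(v, n)$ is the probability that a strategy with winning probability $\omega$ wins at least $v$ out of $n$ independent rounds, and that $p(n)$ is its average when $v$ is binomially distributed with parameter $\omega_q$. The function names are ours and only illustrative.

```python
from math import comb, sqrt


def tail(v, n, w):
    """ASSUMED p(v, n): probability of at least v wins in n rounds with win probability w."""
    return sum(comb(n, k) * w ** k * (1 - w) ** (n - k) for k in range(v, n + 1))


def expected_p_value(n, w_local, w_q):
    """ASSUMED p(n): average of p(v, n) over v distributed as Binomial(n, w_q)."""
    return sum(
        comb(n, v) * w_q ** v * (1 - w_q) ** (n - v) * tail(v, n, w_local)
        for v in range(n + 1)
    )


def gap_bound(n, w_local, w_q):
    """Upper bound (1 - chi^2)^n with chi = w_q - w_local, as in equation (92)."""
    return (1 - (w_q - w_local) ** 2) ** n


# CHSH game: local bound 3/4, Tsirelson bound (2 + sqrt(2))/4.
w_local, w_q = 0.75, (2 + sqrt(2)) / 4
for n in (1, 10, 100):
    print(n, expected_p_value(n, w_local, w_q), gap_bound(n, w_local, w_q))
```

For the CHSH values the bound holds but is far from tight, which is expected since it depends on the gap alone.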
Proof. The idea of the proof is to use the generating function of the binomial sequence in order to upperbound the tail $p(v, n)$, and from that upperbound $p(n)$. We then relax this upper bound to make it a function of $\omega_q - \omega$ only. To do that, first notice that and that for any $z \in [0, 1]$ it holds that Using the binomial expansion of $(\omega + z(1 - \omega))^n$ to simplify the rhs, we get We have then To minimize the rhs, we set which for $\omega \le \omega_q$ is in the interval $[0, 1]$, as required. This gives us In order to obtain an upper bound that is a function of the gap $\chi$ alone, we rewrite the rhs as a function of the coordinates obtaining For any fixed $\chi$, this function is maximized by $\phi = 1$, so we substitute that and obtain $p(n) \le (1 - \chi^2)^n$ (92)

Theorem 5. For $\omega \le \omega_q$ it holds that where $\chi := \omega_q - \omega$.
Proof. Let $\delta \ge 0$. From the Chernoff bound and the fact that $n(\omega + \delta) \le \lceil n(\omega + \delta) \rceil$ we have that $p(\lceil n(\omega + \delta) \rceil, n) \le e^{-nD(\omega + \delta\|\omega)}$, where $D(a\|b) := a \log\frac{a}{b} + (1 - a)\log\frac{1 - a}{1 - b}$ is the relative entropy. We can then lowerbound the relative entropy by minimizing each term over $\omega$ individually, resulting in and therefore for $\delta = \chi$ we have

D Bounds on the largest violation
where $L$ is the set of local behaviours, and where $Q_d$ is the set of quantum behaviours with local dimension $d$. We are interested in the largest violation that can be achieved by a $d \times d$ quantum state in any nonlocal game or Bell functional, so we define where the supremum is taken over all Bell functionals. We now provide upper bounds for $\mathrm{LV}_G(d)$ and $\mathrm{LV}_M(d)$ which improve the existing ones [24,54].
we only need to exhibit $p, q$ satisfying the relevant constraints such that $\langle R, p - q \rangle = \max(R) - \min(R)$.
For the case where the original Bell functional encodes a unique game, the elements of $R$ are given by for some permutation $\sigma$ (where $\max(R)$ is the probability that the referee asks questions $x_0, y_0$). We can set then p(a, b) : and q(a, b) : It is easy to see that so the constraints are satisfied, and that equation (129) holds, so we have optimality.
For the case of the CGLMP inequalities, shown in equation (35), $R$ is given by for some constants $C_k$. Let then $k_{\max}, k_{\min}$ be such that $\max(R) = C_{k_{\max}}$ and $\min(R) = C_{k_{\min}}$. An optimal solution of the dual problem will be p(a, b) : and q(a, b) : It is easy to check that they satisfy the constraints and that equation (129) holds, so as before we have optimality.
To show uniqueness, we need to show that $\max(R') - \min(R') = \max(R) - \min(R)$ implies that $\gamma^A_i = \gamma^B_i = 0$ for all $i$. For the case of unique games, $R'$ is given by and the inequalities $\min(R') \le R'_{ab} \le \max(R')$ imply that for all $b$ we have and Combining these inequalities to eliminate the variables $\gamma^B_b$, we end up with $\gamma^A_{\sigma(b)} \le \gamma^A_{\sigma(b+1 \bmod d)} + \max(R') - \min(R') - \max(R)$ (141) for all $b$, so equation (137) implies the chain of inequalities which implies that $\gamma^A_i = \gamma^A_0 = 0$ for all $i$, as claimed. To prove that $\gamma^B_i = \gamma^B_0 = 0$ we just need to run the same argument for $b = \sigma^{-1}(a)$.
For the case of the CGLMP inequalities, $R'$ is given by Defining $k_{\max}$ and $k_{\min}$ as before, the inequalities $\min(R') \le R'_{ab} \le \max(R')$ imply that for all $b$ we have and Combining these inequalities to eliminate $\gamma^B_b$ as before and assuming that equation (137) for all $b$. Since these indices are just permutations of $b$ the same argument as before applies and we are done.