Practical quantum-enhanced receivers for classical communication

Communication is an integral part of human life. Today, optical pulses are the preferred information carriers for long-distance communication. The exponential growth in data leads to a “capacity crunch” in the underlying physical systems. One of the possible methods to deter the exponential growth of physical resources for communication is to use quantum, rather than classical measurement at the receiver. Quantum measurement improves the energy efficiency of optical communication protocols by enabling discrimination of optical coherent states with the discrimination error rate below the shot-noise limit. In this review article, the authors focus on quantum receivers that can be practically implemented at the current state of technology, first and foremost displacement-based receivers. The authors present the experimentalist view on the progress in quantum-enhanced receivers and discuss their potential.


I. INTRODUCTION
The communication capacity crunch is upon us, 1,2 owing to the exponential expansion of the Internet. With monthly Internet traffic of 200 exabytes at the time of writing, the underlying communications systems will no longer be able to support the service reliability and Internet traffic congestion will only worsen as the exponential trend continues. Claude Shannon analyzed and described the limits of a communication channel. 3 He found a universal relation between information capacity, available channel resources, and noise. The connection between information and physics turns out to be even more fundamental. This connection is now well-established as a result of the progress in information theory and computer science, on one hand, 4,5 and quantum physics on the other. 6 With physical measurement at the heart of communication, the fundamental communication channel limits are related to fundamental properties of measurements.
In general, a quantum-enabled channel supports (1) classical encoding and measurement, (2) classical encoding and quantum measurement, (3) quantum encoding and classical measurement, and (4) quantum encoding and measurement. The information capacity of quantum-enabled channels is bounded from above by Holevo's theorem. 8 From a quantum standpoint, electromagnetic waves are described by expanding them into a series of orthogonal modes and assigning each mode a discrete number of excitations, i.e., photons. For the sake of simplicity, we assume communication via a single spatial mode, which is most commonly the case. Then, the number of orthogonal modes is directly related to the frequency bandwidth B of the channel. The average number of photons per state n is directly proportional to the average energy: E = ħωn. The average power is W = EB = nħωB. After substituting the maximal achievable entropy per optical mode into the Holevo theorem, one finds the capacity of a lossless and noiseless quantum-enabled channel: 7

$$C_Q = B \log_2\left(1 + \frac{W}{\hbar\omega B}\right) + \frac{W}{\hbar\omega} \log_2\left(1 + \frac{\hbar\omega B}{W}\right). \quad (1)$$

Therefore, the number of modes and the average energy of an optical state fully describe the physical resource use when electromagnetic waves are used as information carriers. This important result is referred to as Gordon capacity (or Holevo bound). To aid comparison, the channel capacity is often divided by the channel bandwidth. Then, the normalized channel capacity C_Q/B conveniently characterizes the spectral efficiency. Formally, the spectral efficiency is measured in bits, but often units of bits/s/Hz are used to emphasize the physical meaning of C/B as a measure of data rate in bits per second over a channel with a bandwidth of 1 Hz. C_Q/B is also used for classification of communication protocols 34,35

$$\frac{C_Q}{B} = \log_2\left(1 + \frac{W}{\hbar\omega B}\right) + \frac{W}{\hbar\omega B} \log_2\left(1 + \frac{\hbar\omega B}{W}\right). \quad (2)$$

In a classical limit, W ≫ Bħω, so the second term nearly vanishes, and the capacity becomes C_Q ≈ C_Shannon = B log_2(1 + W/(ħωB)).
This result is identical to a classical channel capacity given by the Shannon limit, where W/(ħωB) is a signal to noise ratio.
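The two limits of Eq. (2) can be checked numerically. The following is a minimal sketch, assuming the normalized form C_Q/B = log_2(1 + n) + n log_2(1 + 1/n) with n = W/(ħωB); the function names are hypothetical and chosen only for illustration.

```python
import math

def gordon_spectral_efficiency(n: float) -> float:
    """C_Q/B of Eq. (2); n = W/(hbar*omega*B) is the mean photon number per mode, n > 0."""
    return math.log2(1.0 + n) + n * math.log2(1.0 + 1.0 / n)

def shannon_spectral_efficiency(n: float) -> float:
    """Classical-limit expression C_Shannon/B = log2(1 + n)."""
    return math.log2(1.0 + n)

if __name__ == "__main__":
    # Sweep from the photon-starved regime to the classical regime.
    for n in (0.01, 1.0, 100.0, 1e6):
        c_q = gordon_spectral_efficiency(n)
        c_s = shannon_spectral_efficiency(n)
        print(f"n = {n:g}: C_Q/B = {c_q:.4f}, C_Shannon/B = {c_s:.4f}, ratio = {c_s / c_q:.3f}")
```

In this sketch the ratio C_Shannon/C_Q approaches unity only slowly as n grows, because the second term of Eq. (2) saturates at log_2 e rather than vanishing outright; it becomes negligible relative to the first, logarithmically growing term.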
In a photon-starved regime, W ≪ Bħω, the capacity is mainly defined by the second term in (2). This result can be interpreted as follows. For low input power, one can use several orthogonal modes and each time send the entire available energy (a single photon) in one mode. The more modes are available, the more bits of information can be encoded per photon. For example, if one photon can be sent per time interval T, then for an available bandwidth B, one can divide this interval into M = BT slots. The information can be encoded by sending a photon in a particular time slot. The number of encoded bits is log_2 BT. Therefore,

$$C_{OME} = \frac{\log_2 BT}{T} = \frac{W}{\hbar\omega} \log_2 \frac{\hbar\omega B}{W}, \quad (3)$$

where OME stands for orthogonal mode encoding. Spectral efficiencies C_Q/B, C_Shannon/B, and C_OME/B are shown in Fig. 1 as a function of energy efficiency, defined as the average number of photons used to transmit 1 bit of information.
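The slot-counting argument can be illustrated with a few lines of arithmetic. This is a sketch under the stated assumption of exactly one photon per interval T divided into M = BT slots, so that the mean photon number per mode is 1/M; the helper names are hypothetical.

```python
import math

def ome_spectral_efficiency(num_slots: int) -> float:
    """C_OME/B in bits per mode: log2(M) bits are carried by M = B*T modes."""
    return math.log2(num_slots) / num_slots

def ome_photons_per_bit(num_slots: int) -> float:
    """Energy cost: one photon carries log2(M) bits."""
    return 1.0 / math.log2(num_slots)

if __name__ == "__main__":
    for m in (2, 16, 256, 4096):
        print(f"M = {m}: C_OME/B = {ome_spectral_efficiency(m):.5f}, "
              f"photons per bit = {ome_photons_per_bit(m):.4f}")
```

The output makes the trade-off explicit: the energy per bit falls only logarithmically with M, while the spectral efficiency falls almost linearly.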
Note that this simple result is based on the assumption of a noiseless and lossless channel. In this ideal case, only encoding using Fock states in conjunction with ideal photon-number resolving (PNR) measurement can attain the Gordon capacity, Table I. Typically, optical channels exhibit a significant loss. The upper bound (2) can be corrected by changing assumptions. In particular, adding a model for losses leads to a different capacity bound. 36-38 Any practical optical communication system requires physical states that are resilient to optical loss, at least to some extent. To this end, classical states, especially coherent states of light, are particularly useful. In this review, we focus on channels with classical encoding and quantum measurement.
To design a practical digital communication system, an encoding method to map digital information on transmitted physical states is needed. The set of states {|Ψ_j〉} is called an alphabet; it can be of an arbitrary length M. We assume equiprobable states and a noiseless channel, Table I. How well can the alphabet symbols be distinguished? To answer this question quantitatively, we use the symbol error rate (SER), the probability P that a transmitted symbol is received incorrectly. Helstrom determined that the lower bound on this error is related to the overlap of the alphabet states. 10 One uses the square root measure (SRM) method 8,23,39,40 to find the Helstrom bound (HB). This method relies on a Gram matrix defined as

$$G_{mj} = \langle \psi_m | \psi_j \rangle. \quad (4)$$

Note that the dot product in (4) cannot be zero for coherent states. Indeed, in the Fock basis, one writes

$$|\psi_j\rangle = e^{-|\alpha|^2/2} \left( |\mathrm{vac}\rangle_j + \alpha |1\rangle_j + \frac{1}{\sqrt{2}} \alpha^2 |2\rangle_j + \cdots \right), \quad (5)$$

where α is a coherent state parameter. Recall that 〈vac_m|vac_j〉 = 1 even if modes m and j are orthogonal. This property of coherent states is important for other applications such as quantum fingerprinting. 41 … (6) In general, the Helstrom bound cannot be found analytically. For some encodings, G^{1/2} has an analytical form. We will give examples of Helstrom bounds for typical encodings in Sec. III.
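As an illustration of the Gram-matrix approach, the sketch below evaluates the SRM error probability for an M-ary PSK alphabet of coherent states. It assumes the standard SRM expression for equiprobable, symmetric alphabets, P = 1 − [(G^{1/2})_{jj}]^2, and the coherent-state overlap 〈α_0|α_j〉 = exp[n(e^{2πij/M} − 1)] with n = |α|^2; both are textbook results rather than formulas quoted from this review. Because the Gram matrix of a PSK alphabet is circulant, its eigenvalues are the discrete Fourier transform of its first row. All function names are hypothetical.

```python
import cmath
import math

def psk_gram_first_row(M: int, n: float) -> list:
    """First row of the Gram matrix: overlaps <alpha_0|alpha_j> for M-PSK with mean photon number n."""
    return [cmath.exp(n * (cmath.exp(2j * math.pi * j / M) - 1.0)) for j in range(M)]

def psk_srm_error(M: int, n: float) -> float:
    """SRM error probability, 1 - ((1/M) * sum_k sqrt(lambda_k))**2."""
    row = psk_gram_first_row(M, n)
    eigenvalues = []
    for k in range(M):
        # Eigenvalues of a circulant matrix are the DFT of its first row.
        lam = sum(row[j] * cmath.exp(-2j * math.pi * j * k / M) for j in range(M))
        eigenvalues.append(max(lam.real, 0.0))  # clamp tiny negative rounding errors
    diag_of_sqrt = sum(math.sqrt(lam) for lam in eigenvalues) / M
    return 1.0 - diag_of_sqrt ** 2

if __name__ == "__main__":
    for M in (2, 4, 8):
        print(f"M = {M}, n = 1: SRM error = {psk_srm_error(M, 1.0):.5f}")
```

For M = 2 this reduces to the familiar binary expression (1 − sqrt(1 − e^{−4n}))/2, which serves as a sanity check of the construction.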
Because Holevo theorem bounds channel capacity and HB puts a limit on error rate, the two bounds cannot be directly compared. However, HB gives the resource use, i.e., the required power and bandwidth to reach a certain error probability. Thus, to benchmark an encoding, the error probability is fixed. Then the normalized data rate R/B and the required power are compared to the normalized channel capacity C(W)/B. Obviously, for sufficiently low SER P HB ′ and a lossy communication channel, R HB /B < C(W´)/B, where W´ is the power required to achieve the SER P HB ′ . It is very important to note here that HB merely establishes the lowest possible error probability, but does not guarantee a measurement method capable of achieving the HB. Hence, we expect that experimental spectral efficiency R E /B ≤ R HB /B < C(P)/B.

B. Classical channels
Unless OME is used, classically, the information capacity is given by the Shannon theorem 3

$$C_{Shannon} = B \log_2\left(1 + \frac{W}{N}\right).$$

This classical model does not specify the origin of channel noise N. Naively, this noise is a property of a communication channel and can be arbitrarily small, which would result in infinite channel capacity. In reality, noise is a fundamental property of any measurement. Because communication cannot occur without a physical measurement at the receiver, it is the measurement noise that would limit an otherwise noiseless channel. Although noise can be introduced ad hoc into a classical model of measurement, it is much more convenient to derive the minimal measurement noise using a quantum mechanical description of an otherwise classical measurement. 7,42,43 A typical classical receiver measures the optical signal via heterodyne and/or homodyne measurements. In both cases, the signal undergoes interference on a beam splitter with a local oscillator (LO). The LO is a coherent state with the same optical frequency as the signal carrier in the case of the homodyne and a different frequency in the case of the heterodyne. After interference, the signal is detected on one or more detectors. In all cases, there will be a current at the detector, and hence there will be shot noise. Assuming a detection efficiency of unity, the information capacity of coherent homodyne and heterodyne receivers is 7

$$C_{heterodyne} = B \log_2\left(1 + \frac{W}{\hbar\omega B}\right), \quad C_{homodyne} = \frac{B}{2} \log_2\left(1 + \frac{4W}{\hbar\omega B}\right). \quad (7)$$

We see that the measurement-induced noise is proportional to the channel's bandwidth, and the dependence of capacity on power and bandwidth in (7) … We see that C_Q > C_heterodyne, C_homodyne, C_OME. Therefore, the channel capacity of the quantum-enabled channel exceeds that of the classical channel. In the derivations above, one does not specify a modulation scheme to obtain channel capacity. Finding the upper bound for SER requires selecting a modulation scheme.
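The ordering of these capacities can be verified numerically. The sketch below assumes the normalized forms C_heterodyne/B = log_2(1 + n) and C_homodyne/B = (1/2) log_2(1 + 4n) from Eq. (7), together with the Gordon expression of Eq. (2), where n = W/(ħωB); the function names are hypothetical.

```python
import math

def gordon_se(n: float) -> float:
    """Quantum-enabled channel, Eq. (2), bits per mode."""
    return math.log2(1.0 + n) + n * math.log2(1.0 + 1.0 / n)

def heterodyne_se(n: float) -> float:
    """Heterodyne receiver: both quadratures, shot-noise-limited SNR of n."""
    return math.log2(1.0 + n)

def homodyne_se(n: float) -> float:
    """Homodyne receiver: one quadrature, shot-noise-limited SNR of 4n."""
    return 0.5 * math.log2(1.0 + 4.0 * n)

if __name__ == "__main__":
    for n in (0.01, 0.5, 2.0, 100.0):
        print(f"n = {n:g}: quantum {gordon_se(n):.4f}, "
              f"heterodyne {heterodyne_se(n):.4f}, homodyne {homodyne_se(n):.4f}")
```

Under these expressions the homodyne receiver is the better classical choice below n = 2 photons per mode and the heterodyne receiver above it, while the quantum-enabled capacity stays above both.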
The uncertainty due to shot noise on the detector 12,47 leads to state discrimination errors. The lowest classically attainable symbol error rate is often referred to as shot noise limit (SNL), quantum noise limit (QNL), or standard quantum limit (SQL). We will give examples of SNL derivations for particular modulation protocols in Sec. III.
To benchmark an encoding, a SER P is fixed (at a sufficiently low value). Then the normalized data rate R/B and the required power can be compared to the normalized channel capacity C(W)/B. Thus, the highest attainable data rates for classical and quantum-enabled channels as well as Holevo bound and Shannon limit can be presented on the same graph. As we will see below, R SNL (W´)/B < R HB (W´)/B < C Q (W´)/B, where W´ is a fixed power. Note that because we explicitly assume the measurement method to compute SNL, the classical lowest possible error probability P SNL can in principle be achieved using ideal components, as opposed to P HB , because the ideal quantum measurement method might be unknown.

III. CONVENTIONAL AND NOVEL COMMUNICATION PROTOCOLS
In digital communications, the ratio between data rate and bandwidth R/B gives the spectral efficiency of the communication protocols. Two main families of modulation schemes are generally distinguished: power-limited R/B > 1 and bandwidth-limited R/B < 1, Fig. 2.
The power-limited family includes such encodings as pulse amplitude modulation (PAM), quadrature amplitude modulation (QAM), phase-shift keying (PSK), and others. In these modulation schemes, the bit rate R for a fixed signal pulse duration grows as log_2 M as the number of states in the alphabet M increases. Communication bandwidth B remains constant, which means that the spectral efficiency R/B improves with M. However, power-limited modulation schemes using longer communication alphabets M require more power than those with shorter alphabets for reliable communication because it is generally harder to discriminate a larger number of non-orthogonal states. The maximal possible R/B is set … We will discuss different modulation methods, review their theoretical limits for detection error rates, and compare their performance with the fundamental channel capacity. We will focus on encoding schemes that have been actively considered for quantum-enabled communication experiments.

A. Binary protocols
Binary protocols are well studied, and they are a rare case where analytical expressions for error rate limits can be found, see Table II. It is not surprising that the first quantum receiver outperforming SNL was proposed for the binary modulation. 17 In addition, the first projection measurement that achieves the HB (or the optimal projection) was found for binary encodings. 10,18 Here we discuss binary encodings based on amplitude and phase modulations.
The binary PSK (BPSK) uses two coherent states with opposite phases for encoding and encodes exactly one bit per symbol: s_0 = |−α〉, s_1 = |α〉. The corresponding constellation diagram is shown in Fig. 3. Fuzzy circles represent coherent states on a phase diagram, the distance from the state to the origin is proportional to the square root of the average number of photons in the state, and the average phase is measured as the angle between the positive direction of axis I and the vector from the origin to the center of the coherent state. Because both BPSK symbols s_0 and s_1 are states of the same optical mode, this encoding is non-orthogonal even if one can neglect the vacuum component, cf. (5). Faint states s can significantly overlap. The optimal classical discrimination of the BPSK signals can be achieved via a homodyne measurement. The only relevant measurement value for BPSK is the projection of the measured state on the in-phase quadrature (I axis in Fig. 3). The probability density function to receive a projection x when state s_i was sent is … where n = 〈n〉 = |α|^2 is the average number of photons in a state s_i. A decision that the input state is s_0 is made if the measured x < 0; otherwise, if x > 0, the decision is s_1. Therefore, to find the probability of a discrimination error, we need to compute the probability of measuring x > 0 when s_0 was sent (or the probability of measuring x < 0 when s_1 was sent),

$$P_{SNL}^{BPSK} = \frac{1}{2} \mathrm{erfc}\left(\sqrt{2n}\right).$$

The HB can be readily found as

$$P_{HB}^{BPSK} = \frac{1}{2} \left(1 - \sqrt{1 - e^{-4n}}\right).$$

As we expect, P_HB < P_SNL.
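The gap between the two BPSK bounds is easy to tabulate. The sketch below assumes the standard textbook expressions P_SNL = (1/2) erfc(sqrt(2n)) for ideal homodyne detection and P_HB = (1/2)(1 − sqrt(1 − e^{−4n})) for the Helstrom bound, with n the mean photon number per symbol; the function names are hypothetical.

```python
import math

def bpsk_snl(n: float) -> float:
    """Homodyne (shot-noise) limit for BPSK with mean photon number n."""
    return 0.5 * math.erfc(math.sqrt(2.0 * n))

def bpsk_helstrom(n: float) -> float:
    """Helstrom bound for two coherent states |alpha> and |-alpha>, n = |alpha|^2."""
    return 0.5 * (1.0 - math.sqrt(1.0 - math.exp(-4.0 * n)))

if __name__ == "__main__":
    for n in (0.1, 0.5, 1.0, 2.0):
        snl, hb = bpsk_snl(n), bpsk_helstrom(n)
        print(f"n = {n:g}: P_SNL = {snl:.3e}, P_HB = {hb:.3e}, ratio = {snl / hb:.2f}")
```

In this sketch the ratio P_SNL/P_HB grows with n, since the Helstrom bound decays as e^{−4n} while the homodyne limit decays only as e^{−2n}.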
The BPSK constellation is similar to a binary on-off keying (OOK) (i.e., s_0 = |0〉, s_1 = |α〉) when the origin is shifted to the center of the left state in Fig. 3. Thus, we immediately get 20 … Note that the classical measurement of OOK states distinguishes coherent states from vacuum states, and this measurement does not require a heterodyne; P_SNL^OOK is based on direct optical power measurement (direct detection). OOK requires four times higher peak energy and two times higher average signal energy than BPSK to match its quantum discrimination error rate bound, HB. This inefficiency can be explained by calculating the geometrical distance between signal vectors, d_01^PSK = 2α and d_01^OOK = α. Binary protocols can carry only one bit of information with each signal pulse. It may be beneficial to encode more than one bit of information per signal pulse, i.e., by using larger encoding alphabets.
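The factor-of-four statement can be checked directly from the state overlaps. The sketch below assumes the binary Helstrom expression P = (1/2)(1 − sqrt(1 − |〈s_0|s_1〉|^2)) with overlaps e^{−4n} for BPSK and e^{−n_peak} for OOK; the function names are hypothetical.

```python
import math

def bpsk_helstrom(n: float) -> float:
    """Helstrom bound for |alpha> vs |-alpha>; squared overlap exp(-4n)."""
    return 0.5 * (1.0 - math.sqrt(1.0 - math.exp(-4.0 * n)))

def ook_helstrom(n_peak: float) -> float:
    """Helstrom bound for vacuum vs |alpha>; squared overlap exp(-n_peak)."""
    return 0.5 * (1.0 - math.sqrt(1.0 - math.exp(-n_peak)))

def matched_ook_energy(n_bpsk: float) -> tuple:
    """Peak and average OOK photon numbers that reproduce the BPSK Helstrom bound."""
    n_peak = 4.0 * n_bpsk          # distance between the states must be equal
    return n_peak, n_peak / 2.0    # the vacuum symbol is sent half of the time

if __name__ == "__main__":
    peak, average = matched_ook_energy(0.5)
    print(peak, average, ook_helstrom(peak), bpsk_helstrom(0.5))
```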

B. M-ary PSK
A natural extension of BPSK is when more than two states are encoded in the phase of a coherent state. From symmetry considerations, the states are separated by equal phases Δϕ = 2π/M, where M is the number of states in the alphabet. As an example, the constellation diagram of the 4-ary PSK is presented in Fig. 4. This modulation method encodes more than one bit per state, which may be beneficial for two reasons. First, when detectors are slow, a single measurement yields several (log_2 M) bits, so that the rate of information exchange improves. Second, the number of bits transmitted per optical mode in a unit time is higher; thus, spectral efficiency is higher. SNL and HB can be found analytically in integral form (Refs. 23 and 34; see Table II). It is convenient to plot energy and bandwidth requirements of M-ary PSK protocols on one graph, where points with different M are connected to guide the eye, Fig. 2. Even though P_SNL > P_HB for all M, error probability bounds for classical and quantum detection grow fast with M for a constant energy per bit n/log_2 M. 34,48 Unfortunately, the potential advantage of the quantum measurement P_SNL/P_HB also decreases with M. Therefore, quantum receivers are most effective for PSK protocols with relatively low M (see SNL PSK and HB PSK in Fig. 2).

C. M-ary orthogonal encodings
… are separated in frequency, in which case information will be encoded in spectral modes, and the required bandwidth will still be M times broader than that for the flat-top pulse of duration T. Linear expansion of bandwidth use is unavoidable for all modulation schemes using orthogonal modes. Other degrees of freedom, such as polarization or spatial modes, can be used when available.
Direct detection is classically the best detection strategy. Specifically, in PPM, modes are separated in time, so the arrival time of the pulse to the detector is sufficient for the physical separation of modes. For other encodings, mode separation may involve spectral filtering, spatial mode sorters, and so on. The classical error limit for ideal signal-shot-noise limited (background-free) detector operation, Table II, is proportional to e^{−n}, i.e., the probability to detect vacuum states in all modes. There is no dependence of P_SNL on M for large M. Therefore, for a given power, error per bit reduces with M as log_2(M) (DD orthogonal in Fig. 2). This feature is used for photon-starved communications although the energy-bandwidth trade-off becomes inefficient for large M.
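A short numerical sketch of this scaling follows. It assumes an ideal, background-free detector and a receiver that guesses uniformly among the M modes when no photon is detected, which gives P = ((M − 1)/M) e^{−n}; this explicit prefactor is an assumption consistent with the e^{−n} proportionality stated above, not a formula quoted from Table II. The function names are hypothetical.

```python
import math

def dd_orthogonal_ser(M: int, n: float) -> float:
    """Symbol error rate of ideal direct detection for M orthogonal modes.

    An error needs zero detected photons (probability exp(-n)) followed by
    a wrong uniform guess (probability (M - 1) / M).
    """
    return (M - 1) / M * math.exp(-n)

def photons_per_bit(M: int, n: float) -> float:
    """Energy cost per bit when each symbol carries log2(M) bits."""
    return n / math.log2(M)

if __name__ == "__main__":
    n = 4.0  # mean photon number per symbol
    for M in (2, 16, 256, 4096):
        print(f"M = {M}: SER = {dd_orthogonal_ser(M, n):.4e}, "
              f"photons/bit = {photons_per_bit(M, n):.3f}")
```

The symbol error rate saturates at e^{−n} as M grows, while the energy per bit keeps falling as 1/log_2 M, which is the behavior described above.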
Even though heterodyne detection is not optimal due to larger shot noise, it is often used in optical communications for orthogonal frequency shift keying. Heterodyne noise increases with the bandwidth. On the other hand, noiseless physical separation of the closely-spaced frequency modes may be practically unfeasible. Interestingly, when heterodyne detection is employed, nearly all gain in bits per unit energy for large M is canceled by increasing noise (see SNL OFSK in Fig. 2).
As we discussed above, from the quantum viewpoint, faint coherent states are always nonorthogonal. The Helstrom bound is therefore above zero. Its value can be readily found, Table II (HB orthogonal in Fig. 2), and it can be shown that P_HB < P_SNL. 9 Therefore, orthogonal encoding receivers can also benefit from a quantum measurement.

D. M-ary coherent frequency shift keying
The M-ary coherent frequency shift keying (CFSK) encodes information in both the frequency and phase of coherent state pulses, |α_m〉 = |α(ω_m, θ_m)〉. The adjacent symbols m and m + 1 are separated by Δω in frequency space, and their initial phases differ by Δθ, so that |α_m〉 = |α(ω_0 + (m − 1)Δω, (m − 1)Δθ)〉. This alphabet is illustrated in the constellation diagram, Fig. 6. In this diagram, coherent states rotate with time around the origin with rates given by their detuning. The keying can be described by two parameters: Δθ and ΔωT. This parameter space contains the PSK modulation scheme (ΔωT = 0, Δθ = 2π/M) and the orthogonal frequency shift keying (OFSK, ΔωT = 2π). The goal here is to reduce the bandwidth of the communication protocol while maintaining low error probabilities. Therefore, one is interested in small frequency separation: ΔωT < 2π. In this parameter space, states are nonorthogonal. Therefore, neither P_HB nor P_SNL can be expressed analytically. Numerical methods 48,49 are used instead. Both Δθ and ΔωT can be adjusted to meet certain optimization goals. For instance, when optimizing for energy efficiency, the minimal Helstrom bound is achieved with one set of parameters, the lowest shot noise limit requires another parameter set, and the minimal error rate is achieved in a quantum receiver with yet another one. Interestingly, as the numerical analysis of P_HB shows, this keying balances energy requirements and bandwidth requirements at the same time, for 4 ≤ M ≤ 32, see Fig. 2. As a consequence, its rate graph crosses the R/B = 1 value. Therefore, this keying is neither power limited nor bandwidth limited.
For a properly optimized CFSK P SNL CFSK < P SNL PSK , which is expected, because the bandwidth of CFSK is wider than that of PSK. However, it may be difficult to build an efficient classical CFSK receiver in practice. Interestingly, it turns out that a time-resolving quantum receiver, discussed later, uses the same hardware for many encodings including PSK and CFSK. The only difference is the feedback algorithm encoded in firmware. Therefore, the quantum measurement can be used to provide bandwidth and power efficiency simultaneously in a practical way.

IV. DISPLACEMENT-BASED QUANTUM STATE DISCRIMINATION
Quantum theory establishes a lower discrimination error bound than that accessible through classical measurement. However, the design of a practical measurement method does not directly follow from theory. In 1973, Kennedy proposed the first near-optimum receiver approaching Helstrom bound for binary coherent states. 17 In less than a year, Dolinar proposed an improved receiver for binary coherent states. 18 In both receivers, the input state is displaced from its original state through interference with a local oscillator, which can be practically accomplished with a heavily unbalanced (typically, 99:1) beam splitter. These two seminal papers have triggered theoretical and experimental research of quantum receivers.
Most theoretical and nearly all experimental reports to date take advantage of coherent state displacement in one way or another even though coherent state displacement is not the optimal quantum measurement for some encodings. As it has been shown recently, an optimal projective measurement may require ancillary quantum states or quantum nodes, such as a single atom. We cover this exciting work in Sec. V.
In this section, we discuss the experiments with coherent state displacement-based quantum receivers. To aid the reader, we present a simple classification of these receivers in  Table III. The improvement from quantum measurement is typically measured as a ratio of the observed error rate to the classical SNL limit for a noiseless receiver with the same system detection efficiency as the quantum receiver, i.e., adjusted SNL. This measurement quantifies the so-called "quantum advantage" over a classical measurement under similar conditions. However, using this characterization method does not account for any inefficiency of the quantum measurement experiment. Some inefficiencies may be due to imperfect off-the-shelf components that were used, while other inefficiencies may be intrinsic to the chosen quantum measurement method. Thus, one could argue that a more relevant comparison of quantum versus classical receivers is to use the absolute SNL-the limit of the ideal classical receiver with unity efficiency. The error rates below the absolute SNL cannot be achieved by a classical receiver in principle. Although all quantum receivers surpass the adjusted SNL, not all of them achieve SER below the absolute SNL. We also compare input state energy required to achieve SER of 10% for demonstrated quantum receivers versus the SNL-limited receivers where applicable. This comparison shows the possible reduction of energy requirements by switching to quantum receivers.

A. Kennedy receiver
Helstrom determined the fundamental SER bound for the optimum receiver in 1968, 10 where the projection measurement on a quantum superposition state, often called a "Schrödinger cat state," was proposed to reach the quantum limit for the binary coherent state encoding. The experimental implementation of the proposed optimal measurement is very difficult because it relies on a superposition basis and entanglement measurements. 52 This method requires very high-fidelity entanglement and a near-unit detection efficiency. 53 In 1973, Kennedy proposed the first receiver using a simple displacement operation on the input coherent state followed by photon detection. 17 While the overall performance of the receiver falls short of the HB, the receiver achieves exponentially optimum performance and outperforms the shot noise limit. 17 The receiver scheme proposed for BPSK states |+α〉 and |−α〉 is shown in Fig. 8(a). The input signal is displaced using a local coherent state and measured using a photon detector. The displacement occurs by interfering the input signal with the local state on a beam splitter. As shown in Fig. 8(a), the local state is set to |+α〉. The destructive interference occurs for the input signal |−α〉, which is displaced to vacuum |0〉, so no photon can be detected. The constructive interference occurs for |+α〉, such that the output is displaced to |+2α〉. A brighter output makes the probability to detect a photon higher. Therefore, in the ideal noiseless case and with perfect displacement, no photons will be detected when the input state was |−α〉, but there is a probability (proportional to exp(−4|α|^2)) that no photons will be detected if the input state was |+α〉. This non-zero probability causes a discrimination error.
In spite of the apparent simplicity of the method, experimental implementations 54,55 fell short of outperforming the absolute SNL due to low system efficiency, non-ideal displacement, and dark noise at the detector. Modified Kennedy receivers use an optimized displacement and a more sophisticated discrimination algorithm. Those receivers unconditionally surpass the SNL in experiments, 20,22 discussed below.
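The ideal Kennedy receiver can be compared with both bounds in a few lines. The sketch below assumes equiprobable BPSK states, a perfect displacement, and an ideal on-off detector, so that the only error is the no-click event for the state displaced to |2α〉 and P_Kennedy = (1/2) e^{−4n}; the SNL and HB expressions are the standard textbook ones. The function names are hypothetical.

```python
import math

def kennedy_error(n: float) -> float:
    """Ideal Kennedy receiver: an error occurs only if |2 alpha> yields no click."""
    return 0.5 * math.exp(-4.0 * n)

def bpsk_snl(n: float) -> float:
    """Ideal homodyne limit for BPSK."""
    return 0.5 * math.erfc(math.sqrt(2.0 * n))

def bpsk_helstrom(n: float) -> float:
    """Helstrom bound for BPSK."""
    return 0.5 * (1.0 - math.sqrt(1.0 - math.exp(-4.0 * n)))

if __name__ == "__main__":
    for n in (0.1, 0.3, 0.5, 1.0, 3.0):
        print(f"n = {n:g}: Kennedy {kennedy_error(n):.3e}, "
              f"SNL {bpsk_snl(n):.3e}, HB {bpsk_helstrom(n):.3e}")
```

Under these idealized formulas the Kennedy error is about twice the Helstrom bound at large n, which is the sense in which it is exponentially optimal, and it drops below the homodyne limit only above a crossover that sits between 0.3 and 0.5 photons in this sketch.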

B. Dolinar receiver
Following the proposal of the first quantum receiver using non-Gaussian measurements to beat the shot-noise limit, Dolinar proposed a receiver 18 that can reach the Helstrom bound for discrimination of binary coherent states. This receiver theoretically approaches the quantum limit in binary state discrimination by using the real-time quantum feedback with the so-called optimal displacement and photon counting measurements, i.e., without the need for a "cat-state" measurement. 10 In contrast to the Kennedy receiver, the displacement amplitude β is changing constantly. The phase is adjusted every time a photon is detected, i.e., it is determined from the number of photons n_t detected in the time interval [0, t). 18,56 For an on-off keying, the optimal displacement amplitude is given by 57

$$\beta(n_t) = \frac{\alpha}{2} \left( \frac{e^{i\pi(n_t + 1)}}{\sqrt{1 - e^{-|\alpha|^2 t/T}}} - 1 \right). \quad (12)$$
The discrimination decision is based on the total number of photons n T counted during the entire measurement [0, T], so that |α〉 (| − α〉) is chosen when n T is even (odd) as shown in Fig. 8(b). Formally, Eq. (12) diverges at the beginning of the pulse t = 0, which cannot be practically implemented because of the finite energy of the LO and the saturation of a single-photon detector.
Yet, this issue can be practically alleviated in a laboratory environment. A binary Dolinar-like receiver with finite displacement amplitudes was successfully implemented experimentally in Ref. 57. In their work, the authors demonstrated that for input signals with a low average number of photons (n < 1) the OOK receiver not only surpasses the adjusted SNL, but also approaches the adjusted HB; for comparison, both SNL and HB were adjusted to the system efficiency.
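The time dependence of the feedback amplitude can be sketched as follows. The code assumes that the OOK displacement has the form β(n_t) = (α/2)[e^{iπ(n_t+1)}/sqrt(1 − e^{−|α|^2 t/T}) − 1], i.e., with a square root in the denominator; this reading of Eq. (12) is an assumption and should be checked against Ref. 57. The function name and the default T = 1 are hypothetical.

```python
import math

def dolinar_displacement(alpha: complex, n_t: int, t: float, T: float = 1.0) -> complex:
    """Feedback displacement after n_t detected photons at time t, 0 < t <= T.

    Assumed form of Eq. (12); exp(i*pi*(n_t + 1)) is simply -1 for even n_t
    and +1 for odd n_t, so the sign flips with every detected photon.
    """
    if not 0.0 < t <= T:
        raise ValueError("t must lie in (0, T]; the amplitude diverges at t = 0")
    sign = -1.0 if n_t % 2 == 0 else 1.0
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for small x.
    denominator = math.sqrt(-math.expm1(-abs(alpha) ** 2 * t / T))
    return 0.5 * alpha * (sign / denominator - 1.0)

if __name__ == "__main__":
    alpha = 1.0
    for t in (1e-4, 1e-2, 0.5, 1.0):
        print(f"t/T = {t:g}: beta(0 clicks) = {dolinar_displacement(alpha, 0, t):.3f}, "
              f"beta(1 click) = {dolinar_displacement(alpha, 1, t):.3f}")
```

The printed values show the divergence near t = 0 discussed above; a practical implementation would have to cap the amplitude at what the LO and the detector can tolerate.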
Dolinar's idea of adaptive feedback enabled multiple new quantum receiver configurations. Particularly, sub-SNL receivers for M-ary encodings were invented and experimentally demonstrated.

C. Novel quantum receivers and experiments
1. The optimized displacement receiver-A few attempts were made to modify Kennedy receivers to achieve a lower SER. One such enhancement is the optimized displacement receiver (ODR). The Kennedy receiver displaces the input state by interfering it with a local state of equal amplitude. In their theoretical paper, Takeoka and Sasaki proposed to adjust the displacement of the input signal using the local state. 58 Their ODR uses a local state with an amplitude β greater than the input signal amplitude α, Fig. 9(a). Because the local state has a different amplitude, the input signal will not be displaced to vacuum. There are no other changes to the Kennedy design, cf. Fig. 8(a). Since larger displacement results in a higher probability of photon detection when the input signal state is displaced to |α + β〉, the probability of detecting no photons, e^{−|α + β|^2}, is reduced from that of the Kennedy receiver. However, because |−α〉 is no longer displaced to vacuum, there is a possibility to collect photons, which leads to errors. The trade-off between these "false" detections due to the non-ideal vacuum |β − α〉 and the reduced probability to get no clicks for the |β + α〉 state results in an optimization problem. The optimal displacement amplitude β minimizes the combined error probability. The experimental implementations of ODR have shown discrimination error rates below the SNL adjusted for the experimental conditions 54,59 and unconditionally, 20,22 i.e., in comparison to the absolute SNL. The most significant improvement in discrimination accuracy is shown for faint coherent states with |α|^2 ≈ 1. The amplitude of the optimized displacement approaches the amplitude of the input state, |β| → |α|, as |α| → ∞. A similar optimization of displacement can reduce the discrimination error rate of adaptive feedback receivers for binary and M-ary alphabets as well.
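The optimization described above can be sketched with a simple grid search. The code assumes real amplitudes, equiprobable BPSK states, an ideal on-off detector, and the decision rule "no click means |−α〉", which gives P(β) = (1/2)[1 − e^{−(β−α)^2} + e^{−(β+α)^2}]; this error model and the function names are illustrative assumptions, not the exact treatment of Ref. 58.

```python
import math

def odr_error(alpha: float, beta: float) -> float:
    """Error probability of a displacement receiver with an ideal on-off detector."""
    false_click = 1.0 - math.exp(-(beta - alpha) ** 2)   # |-alpha> -> |beta - alpha> clicks
    missed_click = math.exp(-(beta + alpha) ** 2)         # |+alpha> -> |beta + alpha> stays dark
    return 0.5 * (false_click + missed_click)

def optimal_displacement(alpha: float, span: float = 2.0, steps: int = 20000) -> float:
    """Grid search for the displacement beta in [alpha, alpha + span] with minimal error."""
    candidates = [alpha + span * i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda beta: odr_error(alpha, beta))

if __name__ == "__main__":
    for n in (0.2, 1.0, 4.0):
        alpha = math.sqrt(n)
        beta = optimal_displacement(alpha)
        print(f"n = {n:g}: beta/alpha = {beta / alpha:.3f}, "
              f"ODR error = {odr_error(alpha, beta):.3e}, "
              f"Kennedy error = {odr_error(alpha, alpha):.3e}")
```

Setting β = α recovers the Kennedy receiver, so the search can only match or improve on it; in this model the optimal β/α ratio tends to unity as the input gets brighter.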
The discrimination error rate of the ODR receivers can be further reduced with photon-number resolving (PNR) measurements [Fig. 9 …

2. Conditional pulse nulling receiver-Conditional pulse nulling (CPN) receivers are explicitly designed for pulse position modulation (PPM), which is widely used in photon-starved free space communications due to its high energy efficiency. Dolinar proposed the CPN receiver in 1982. 50 He theoretically showed that CPN performs near the optimum. 50 Almost three decades later, the CPN receiver was experimentally demonstrated for a 4-ary PPM with the discrimination error below the adjusted SNL. 51 The experimental scheme of the CPN receiver is shown in Fig. 10(a). The input signal is displaced to vacuum using the local state pulse. The decision strategy for 4-ary PPM is shown in Fig. 10(b). The receiver starts by nulling the pulse in position 1 (Fig. 5). Photon detection (failure) leads to the nulling of pulses in the subsequent steps. If no photons were detected in position 1 (success), then the received state is |α_1〉. The same strategy is repeated for subsequent positions. The green boxes represent the received state after a discrimination. Even in ideal experimental conditions, errors arise from the Poisson nature of the coherent states, cf. Kennedy receiver: the displacement of the input signal with a wrong local state does not necessarily lead to photon detection.

3. Multi-stage receivers-The optimal receiver for binary coherent states proposed by Dolinar 18 requires feedback to adjust the LO as more information about the input state becomes available. A possible modification of the Dolinar receiver that makes it more experimentally feasible breaks the input into segments or stages either spatially [Fig. 11(a)] or temporally [Fig. 11(b)]. Then, the measurement result from each segment can be used to choose the best displacement state for the next segment. The number of stages is predefined. Switching rules can be represented as a decision-making tree that is typically precomputed. It can be shown 63,64 that with the proper choice of the displacement intensity at each stage n (|β_n|^2 > |α_i|^2, cf. Dolinar receiver) and in the limit of an infinite number of stages such a multi-stage receiver can optimally discriminate binary states. Thus, choosing the same intensity of the LO for all stages does not enable the HB-limited discrimination even when the intensity is optimized.
For example, the BPSK input state, |α_i〉, is split into m copies with equal intensity, Fig. 11(a). Thus, the energy of the input to each stage is reduced by a factor of m. Each attenuated copy of the state is sent to a displacement setup. An optical delay is inserted in each stage so that the measurement on the (n + 1)th stage does not start before the measurement on the nth stage is completed. For the first stage, an arbitrary state of the LO is chosen. If the LO matches the input, the input state is displaced to vacuum and no photons will be detected; otherwise, a photon can be detected. To achieve close to optimal performance, the value of |β_n|^2 should be corrected at each stage, but the phase of |β_j〉 only changes with photon detection. The potential drawback of this scheme is that the number of optical elements and single-photon detectors grows with the number of stages. An excessive loss of the optical signal occurs due to imperfect optical components. In addition, the alignment of the multi-stage setup may be complicated.
A signal pulse can also be divided into equal temporal intervals rather than spatial copies. In this case, just one LO with feedback and one detector are needed. As before, the feedback is used to update the LO after each measurement segment of equal duration T/m. Figure 11(b) shows the experimental scheme of the multi-stage receiver with temporal stages. During each measurement segment, the strategy tests the hypothesis that the most probable input signal is α_i. At the end of the signal pulse, at time T, the final Bayesian probabilities are computed, and the hypothesis with the highest probability is used to make the discrimination decision. The main drawback of temporal segmenting is the need for faster detectors and electronic components. The dead time of single-photon detectors is yet another obstacle.
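The stage-by-stage Bayesian strategy can be sketched in Python as follows. This is a minimal illustration under assumptions that are not taken from the cited experiments: an M-PSK alphabet with mean photon number n_mean, m equal stages, an LO that exactly nulls the currently most probable state (no intensity optimization), and an ideal on/off detector. Because the model is small, the error probability is computed exactly by enumerating all click patterns. All function names are illustrative.

```python
import math


def psk_alphabet(num_states, n_mean):
    """Complex amplitudes of an M-PSK alphabet with mean photon number n_mean."""
    amp = math.sqrt(n_mean)
    return [amp * complex(math.cos(2 * math.pi * k / num_states),
                          math.sin(2 * math.pi * k / num_states))
            for k in range(num_states)]


def stage_update(prior, hyp, click, alphas, stages):
    """Bayes update after one stage in which the LO nulls alphas[hyp]."""
    post = []
    for k, p in enumerate(prior):
        mu = abs(alphas[k] - alphas[hyp]) ** 2 / stages  # mean photons at the detector
        p_no_click = math.exp(-mu)
        post.append(p * ((1.0 - p_no_click) if click else p_no_click))
    norm = sum(post)
    return [x / norm for x in post]


def adaptive_error_probability(num_states, n_mean, stages):
    """Exact error probability of the nulling strategy with equal priors."""
    alphas = psk_alphabet(num_states, n_mean)

    def p_correct(true, prior, stage):
        hyp = max(range(num_states), key=lambda k: prior[k])
        if stage == stages:
            return 1.0 if hyp == true else 0.0
        mu = abs(alphas[true] - alphas[hyp]) ** 2 / stages
        p_no_click = math.exp(-mu)
        total = 0.0
        for click, prob in ((False, p_no_click), (True, 1.0 - p_no_click)):
            if prob > 0.0:  # skip outcomes that cannot occur for the true state
                post = stage_update(prior, hyp, click, alphas, stages)
                total += prob * p_correct(true, post, stage + 1)
        return total

    uniform = [1.0 / num_states] * num_states
    return 1.0 - sum(p_correct(k, uniform, 0)
                     for k in range(num_states)) / num_states
```

For BPSK (num_states = 2) this exact-nulling model gives exp(-4 n_mean)/2 for any number of stages, the same as a single-stage nulling receiver, which is in line with the remark that a fixed LO intensity does not reach the HB.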
The idea of adjusting the feedback after each photon detection can be generalized to M-ary communication protocols, although the optimal feedback algorithm is not known. An M-ary discrimination strategy that uses m measurement stages, where the LO can be adjusted after each stage, was proposed in Ref. 24. Temporal adaptive receivers can also be generalized to longer alphabets [Fig. 11(b)]. The temporal adaptive receiver design was used in the experimental demonstration of the 4-PSK quantum receiver that unconditionally surpassed the SNL. 67 A similar design was used for the first demonstration of the 4-PSK receiver at a telecom wavelength. 68 A more sophisticated version of this receiver counts the number of photons in each measurement. This approach enables more precise Bayesian calculations and especially helps with sub-SNL measurements of mesoscopic input states. The information about the number of detected photons is particularly helpful against experimental imperfections such as dark counts and non-ideal visibility, so a lower discrimination error probability can be achieved. In Ref. 69, SPAD-based quasi-PNR detection was used. The authors extended the sub-SNL performance of their receiver to inputs with more than 20 photons per pulse on average. 69 They achieved a record SER (below 10^−6). A similar quasi-PNR enhancement with a SPAD detector was used to optimize other multi-stage receivers. 66,70-72 Adjusting the intensity of the LO is yet another path to sensitivity improvement. In Ref. 73, the theoretical model of displacement for M-ary receivers is optimized by optimizing |β|^2 at each step, and an unconditional error rate below the SNL is experimentally demonstrated.

Time-resolving receivers-Another class of receivers consists of one displacement module and one single-photon detector and uses single-photon detection times for discrimination. Unlike multi-stage receivers, it provides instantaneous feedback to switch the LO state right after each photon detection. By design, the receiver gets to test an unrestricted number of hypotheses and allocates the optimal time to verify each hypothesis.
Owing to the nature of coherent states, with a sufficiently fast detector, the probability to detect more than one photon in the field is negligible. Therefore, PNR detection is not required.
The first receiver of this class was introduced by Bondurant. 19 The Type-I Bondurant receiver probes hypotheses in a simple sequential order and uses the hypothesis at time T as the discrimination decision [Fig. 12(a)], while the Type-II receiver uses the same sequential order but compares photon interarrival times to make the final discrimination decision. Bondurant receivers have a near-optimal performance for 4-PSK state discrimination, where a Type-II receiver outperforms the Type-I receiver at low input energies. The probing is executed by switching the local state from one hypothesis to the next, α_1 → α_2 → … → α_M, until all hypotheses are tested or no more clicks are detected. In a practical setting, a detection event can be induced by a dark count or non-ideal displacement. After any photon detection, the Bondurant receiver discards the current hypothesis and never tests it again, which leads to extra errors. A cyclic strategy can correct some of these errors. A cyclic receiver is similar to the Bondurant Type-I receiver, except that after testing the last state of the alphabet, α_M, it switches back to the first state, α_1, and continues the measurement until the end of the pulse at time T. 26 The cyclic receiver was demonstrated experimentally. 74 The measured SER is unconditionally better than the SNL for 4-PSK, 8-PSK, 4-CFSK, and 8-CFSK encodings.
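The difference between the sequential (Type-I-like) and cyclic probing orders can be explored with a small Monte Carlo sketch in Python. The model is an assumption made for illustration: a rectangular pulse of unit duration, an M-PSK alphabet, perfect displacement, a detector with no dead time, and a constant dark-count rate dark_rate. For the non-cyclic variant, the sketch assumes that the receiver simply stays on the last hypothesis once every state has been tested; the text does not specify this detail. Function names are illustrative.

```python
import math
import random


def psk_states(num_states, n_mean):
    """Complex amplitudes of an M-PSK alphabet with mean photon number n_mean."""
    amp = math.sqrt(n_mean)
    return [amp * complex(math.cos(2 * math.pi * k / num_states),
                          math.sin(2 * math.pi * k / num_states))
            for k in range(num_states)]


def sequential_receiver(true, alphas, dark_rate, cyclic, rng):
    """Simulate one pulse of unit duration and return the final hypothesis."""
    num_states = len(alphas)
    hyp, t = 0, 0.0
    while True:
        # Click rate: residual field after displacement plus dark counts.
        rate = abs(alphas[true] - alphas[hyp]) ** 2 + dark_rate
        if rate <= 0.0:
            return hyp  # field nulled and no dark counts: no further clicks
        t += rng.expovariate(rate)  # waiting time to the next click
        if t >= 1.0:
            return hyp  # pulse ended before the next click
        if hyp < num_states - 1:
            hyp += 1  # a click rejects the current hypothesis
        elif cyclic:
            hyp = 0  # cyclic receiver wraps around to the first state
        # Non-cyclic variant: stay on the last hypothesis (assumption).


def sequential_error_rate(num_states, n_mean, dark_rate, cyclic, trials, seed=0):
    """Monte Carlo symbol error rate with symbols cycled deterministically."""
    rng = random.Random(seed)
    alphas = psk_states(num_states, n_mean)
    errors = 0
    for i in range(trials):
        true = i % num_states
        if sequential_receiver(true, alphas, dark_rate, cyclic, rng) != true:
            errors += 1
    return errors / trials
```

With dark_rate = 0 the two variants coincide in this model, because the correct hypothesis never produces a click; they differ only when dark counts (or, in a richer model, imperfect displacement) cause the correct hypothesis to be rejected.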
A much better result can be obtained if the time-resolving quantum receiver uses both instantaneous feedback and Bayesian inference. 48 A Bayesian classifier uses the knowledge of the previous local state and the photon arrival time to predict the most probable input state after each photon detection. This strategy converges to the right hypothesis with a minimal number of photon detections, and it can be applied to any encoding. 75 The strategy works best if the encoding is developed to take advantage of the instantaneous feedback. 48,49 This holistic approach, in which the receiver and the encoding are developed side by side, has resulted in record low error rates in the discrimination of large alphabets with faint signals (|α|^2 ≲ 1 photon per bit). This receiver is shown to perform unconditionally below the SNL for alphabets with M ≤ 16, the largest number of states in an alphabet reported to date. 49
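A Bayesian update driven by photon arrival times can be sketched in Python as follows. The model is an illustrative assumption, not the implementation of Refs. 48, 49, or 75: a rectangular pulse of unit duration, an M-PSK alphabet, perfect displacement, and an ideal detector without dark counts or dead time. With the LO nulling hypothesis h, a candidate state k produces clicks at the rate |α_k − α_h|^2, so a click after a waiting time dt has likelihood rate·exp(−rate·dt), and a click-free interval has likelihood exp(−rate·dt). Function names are illustrative.

```python
import math
import random


def psk_points(num_states, n_mean):
    """Complex amplitudes of an M-PSK alphabet with mean photon number n_mean."""
    amp = math.sqrt(n_mean)
    return [amp * complex(math.cos(2 * math.pi * k / num_states),
                          math.sin(2 * math.pi * k / num_states))
            for k in range(num_states)]


def posterior_update(prior, hyp, dt, click, alphas):
    """Bayes update after an interval dt during which the LO nulls alphas[hyp]."""
    post = []
    for k, p in enumerate(prior):
        rate = abs(alphas[k] - alphas[hyp]) ** 2
        likelihood = math.exp(-rate * dt) * (rate if click else 1.0)
        post.append(p * likelihood)
    norm = sum(post)
    return [x / norm for x in post]


def bayesian_receiver(true, alphas, rng):
    """Simulate one pulse; the LO always nulls the most probable state."""
    num_states = len(alphas)
    post = [1.0 / num_states] * num_states
    t = 0.0
    while True:
        hyp = max(range(num_states), key=lambda k: post[k])
        rate = abs(alphas[true] - alphas[hyp]) ** 2
        wait = rng.expovariate(rate) if rate > 0.0 else math.inf
        if t + wait >= 1.0:
            # No click until the end of the pulse: final update and decision.
            post = posterior_update(post, hyp, 1.0 - t, False, alphas)
            return max(range(num_states), key=lambda k: post[k])
        post = posterior_update(post, hyp, wait, True, alphas)
        t += wait


def bayesian_error_rate(num_states, n_mean, trials, seed=0):
    """Monte Carlo symbol error rate with symbols cycled deterministically."""
    rng = random.Random(seed)
    alphas = psk_points(num_states, n_mean)
    errors = 0
    for i in range(trials):
        true = i % num_states
        if bayesian_receiver(true, alphas, rng) != true:
            errors += 1
    return errors / trials
```

Note that an early click on a 4-PSK hypothesis makes the opposite state the most probable one, while a late click favors the neighboring states; this is the kind of arrival-time information that a fixed probing order does not use.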

D. Summary of displacement receivers
In summary, a direct comparison of different displacement receivers is not always possible. For binary protocols, the optimal measurement is theoretically possible; measurement schemes that are asymptotically optimal have a clear advantage. For longer alphabet lengths, displacement measurements are not optimal. Theoretically, time-resolving protocols and the protocols that adjust LO intensity throughout the measurement are the most advantageous.
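For the binary case, the relation between the quantum optimum and simple displacement receivers can be checked numerically. The Python sketch below uses standard textbook expressions for BPSK coherent states ±α with mean photon number n = |α|^2 and equal priors, under ideal conditions (no loss, no dark counts, perfect visibility): the Helstrom bound, the ideal homodyne limit, and an on/off displacement receiver with displacement amplitude β, where β = α corresponds to exact nulling. The grid search over β and all function names are illustrative choices.

```python
import math


def helstrom_bound(n_mean):
    """Minimum error probability for BPSK coherent states."""
    return 0.5 * (1.0 - math.sqrt(1.0 - math.exp(-4.0 * n_mean)))


def homodyne_limit(n_mean):
    """Error probability of ideal homodyne detection of BPSK."""
    return 0.5 * math.erfc(math.sqrt(2.0 * n_mean))


def displacement_error(n_mean, beta):
    """On/off displacement receiver: no click decides the state closer to beta."""
    alpha = math.sqrt(n_mean)
    p_no_click_near = math.exp(-(beta - alpha) ** 2)  # state +alpha after displacement
    p_no_click_far = math.exp(-(beta + alpha) ** 2)   # state -alpha after displacement
    return 0.5 * (1.0 - p_no_click_near + p_no_click_far)


def nulling_error(n_mean):
    """Exact nulling (beta = alpha), i.e., a Kennedy-like receiver."""
    return displacement_error(n_mean, math.sqrt(n_mean))


def optimized_displacement_error(n_mean, steps=2000, span=2.0):
    """Grid search over beta >= alpha; the grid starts at exact nulling."""
    alpha = math.sqrt(n_mean)
    return min(displacement_error(n_mean, alpha + span * i / steps)
               for i in range(steps + 1))
```

Within this idealized model, the optimized displacement is never worse than exact nulling and never better than the Helstrom bound, which is consistent with the statement that adjusting the LO intensity is advantageous.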
In experiment, practical considerations may play the decisive role. In general, based on experimental evidence (Table III), the protocols that take advantage of photon number resolution perform particularly well for brighter input states. Time-resolving protocols perform better with dimmer input states (≈ 1 photon/bit and lower). This is because detectors have dead time and the feedback components have latency; therefore, fewer feedback cycles may be practically advantageous. Other considerations include the following:
• Transmission loss and detection efficiency. Both properties reduce system efficiency and reduce the unconditional advantage over the absolute SNL.
• Alignment of the displacement. Imperfect alignment reduces both the conditional and unconditional advantage of the quantum measurement, but can be partially mitigated by including the inefficiency in the feedback model.
• Background and dark counts. These similarly reduce both the conditional and unconditional advantage of the quantum measurement, and can be partially mitigated by adjusting the feedback model.
We see that spatial multiplexing can remedy time delays, but it may introduce higher losses and alignment issues. The choice of the optimal modulation protocol and alphabet length may also depend on experimental and/or practical conditions. In making the choice, it is important to consider both the conditional and unconditional performance of a receiver (Table III): the conditional performance shows the degree of the advantage provided specifically by a non-classical measurement, whereas the unconditional performance reveals the system efficiency penalty.

V. NEW TRENDS
In Sec. IV, we discussed theoretical and experimental achievements in coherent state discrimination with displacement-based quantum receivers. The field of quantum measurement is very active, and many new ideas for using quantum measurement in optical networks have emerged. Here, we briefly discuss new research directions that, in our view, have significant practical potential.

A. Noisy communication channels
Realistic communication channels may distort and contaminate communication signals. Given that the theory of quantum receivers assumes noiseless channels, it is important to understand whether the quantum measurement advantage extends to channels with noise. An important realistic channel model is the non-Gaussian channel with bosonic phase noise. In Ref. 76 [Fig. 13(a)], the authors demonstrated an SER below the homodyne limit adjusted for the system efficiency of 72% in the presence of phase noise. A similar strategy for a channel with thermal noise is considered in Ref. 77. The authors theoretically demonstrate that a PNR-enabled Kennedy-like receiver with the optimized displacement (see Refs. 58 and 60) can surpass the SNL when the average number of thermal photons is smaller than 0.2. Practical implementations of many quantum receivers require interferometric stability of the communication channel or a pilot signal providing the reference phase. In long-distance communication, it may be challenging to interferometrically stabilize the communication channel. In Ref. 78, the authors experimentally demonstrate a phase-tracking protocol for quantum receivers that corrects for time-varying phase noise and keeps the SER below the SNL.

B. Discrimination of optical states other than coherent states
So far, we have considered coherent states as communication carriers. This is because coherent states of light are widely used for communication. Other types of states can be discriminated using quantum methods as well. The optimal discrimination of optical states with non-Poissonian photon number statistics 81-84 has recently attracted a lot of interest. In these new experiments, ancillary coherent states are used for displacement in a receiver. Clearly, the perfect displacement of a non-Poissonian state to a vacuum state with a coherent state is impossible. Still, the probability to detect at least one photon can be significantly increased for one type of input and significantly reduced for the other.
In Ref. 79, the authors investigate a binary communication channel that uses squeezed vacuum states as information carriers. The information is encoded by displacing the squeezed vacuum state by D(±α), 16 resulting in two displaced squeezed states (DSS), |±DSS〉, with opposite phases [cf. BPSK, see Fig. 13(b)]. Squeezing one of the quadratures of the carrier gives a smaller overlap between the DSS states than between coherent states with the same average number of photons. Thus, the discrimination error probability for the squeezed states may, in theory, fall below the Helstrom bound for BPSK with coherent states in the absence of loss. When the channel has some phase noise, but no significant loss, even a homodyne-based "classical" receiver can approach the quantum optimum.
In Ref. 85, the fundamental quantum limit for discrimination error probability between a coherent and a thermal optical state is computed. Additionally, error probability bounds for direct detection, coherent homodyne detection, and the Kennedy-like receiver are given. The generalization of the Kennedy receiver for discrimination of coherent and thermal states with a low average photon number is shown to closely approach the quantum limit.
The displacement-based discrimination strategies used by quantum receivers were recently adopted for the discrimination of single-rail qubits, i.e., superpositions of the vacuum state with a single photon. In Refs. 80 and 86, the authors theoretically and experimentally investigate a receiver for orthogonal single-rail qubits, |±〉 = (|0〉 ± |1〉)/√2 [see Fig. 13(c)]. The authors have shown that their setup can discriminate the superposition states using weak coherent states for displacement. Both input states have vacuum and single-photon components. After coherent state displacement, the resulting states have distinct photon-number statistics (Fig. 14). This difference can be assessed with a single-photon detector. A feedback discrimination strategy generalized for single-rail qubits yields an SER below that of perfect homodyne detection. These results can facilitate the implementation of quantum information processing protocols using single-rail qubits.
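The effect of a coherent displacement on the two single-rail qubits can be made concrete with a short Python sketch that evaluates the photon-number distributions of D(β)|±〉. It assumes an ideal, lossless displacement with a real amplitude β and the convention D(β) = exp(βa† − β*a); the matrix elements of D(β) between Fock states are the standard displaced-number-state expressions. The function name and the truncation n_max are illustrative and are not taken from Refs. 80 and 86.

```python
import math


def displaced_qubit_distribution(sign, beta, n_max=40):
    """Photon-number distribution of D(beta)(|0> + sign*|1>)/sqrt(2), real beta."""
    x = beta * beta
    probs = []
    for n in range(n_max + 1):
        norm = math.exp(-x / 2.0) / math.sqrt(math.factorial(n))
        amp0 = norm * beta ** n  # <n|D(beta)|0>
        if n == 0:
            amp1 = -beta * math.exp(-x / 2.0)  # <0|D(beta)|1>
        else:
            amp1 = norm * beta ** (n - 1) * (n - x)  # <n|D(beta)|1>
        probs.append((amp0 + sign * amp1) ** 2 / 2.0)
    return probs


# Probability of detecting no photons for each input at beta = 1.
p0_plus = displaced_qubit_distribution(+1, 1.0)[0]
p0_minus = displaced_qubit_distribution(-1, 1.0)[0]
```

With this sign convention and β = 1, the displaced |+〉 state has no vacuum component, so an ideal detector always clicks, whereas the displaced |−〉 state gives no click with probability 2/e ≈ 0.74. Which of the two states loses its vacuum component depends on the phase convention of the displacement.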

C. Quantum unambiguous state discrimination
Displacement-based quantum receivers can be employed for so-called unambiguous state discrimination (USD). 71,87,88 Unlike a typical receiver whose goal is to provide the best guess for all input states, unambiguous state discrimination receivers aim to error-free

D. Optimal quantum measurements
We saw that displacement receivers are optimal for some encodings. For other encodings, displacement receivers cannot reach the HB. There is an alternative to displacement measurements, however. For instance, an optimal projective measurement with the help of quantum states, such as cat states, 89 has been proposed and experimentally implemented. To our knowledge, this work is the only experimental effort to date that enables a quantum receiver that is not based on coherent state displacement. Yet another idea is to take advantage of an ancillary quantum system, such as a single atom. 90-92 In these proposals, the input light field is mapped onto a discrete set of atomic states, followed by a projection measurement. Near-optimal discrimination of BPSK, M-PSK, and M-ASK (amplitude shift keying) encodings has been discussed. An efficient interaction of the light field with an ancilla atom is required, which may be challenging to implement experimentally with today's technology. Another theoretical proposal shows how to design the optimal receiver for an arbitrary alphabet length and an arbitrary modulation scheme with the help of a universal quantum computer. The input signal is split into m copies, each of which is transferred to the quantum computer. The quantum computer performs m unitary operations on the ancilla quantum register. The final state of the ancilla register is measured to arrive at the discrimination result. 93 This idea uses two properties of coherent states: first, splitting a coherent state produces coherent states with the same properties, except for amplitudes; second, a coherent state with a sufficiently small amplitude is well approximated by a single-rail qubit (cf. Refs. 80 and 86). The problem of discriminating coherent states is thus reduced to discriminating multicopy single-rail qubit states by a sequential coherent-processing receiver. 94

E. Artificial intelligence in communication
One of the interesting future directions for quantum receivers is the possible use of artificial intelligence for real-time feedback and discrimination. Recently, artificial neural networks were successfully applied to reduce the error probability of a classical communication system, achieving the classical optimal limit. 95 Replacing or pairing Bayesian inference with artificial neural networks could optimize the feedback strategy and reduce the error rates of quantum receivers in practical settings.

OF THE FUTURE? (IN LIEU OF CONCLUSION)
As is evident by now, discrimination error rates below the shot-noise limit have been achieved for coherent states in many laboratories and for different encoding methods. Properties of displacement-based quantum receivers using non-Gaussian measurement were extensively studied. The field, however, is still in its early stage. Indeed, just one experimental report achieved SERs unconditionally below the classical limit at a telecom wavelength, 68 while other proof-of-principle experiments either use visible light or cannot unconditionally surpass the SNL. 55 Conventional communication systems, on the other hand, are very successful, mature, and competitive. Let us discuss the possible future of quantum technologies for classical communication. Figure 15 shows the channel resource use required for nearly fault-free communication (P_e = 10^−5) using traditional modulation methods with ideal classical detection. This theoretical plot does not consider channel noise in a practical communication link, which would make energy requirements significantly greater. The sources of such noise include in-line optical amplifiers, cross-talk between wavelength-multiplexed channels, nonlinear effects in fiber, and dark noise of detectors. On the other hand, this plot does not take error correction into account, which can somewhat relax the energy requirements. Yet, we believe that this curve is a good estimate of the threshold of classical technologies. Some classical systems that are currently near this threshold use single-photon detectors 96-99 because of their low dark noise. 27 We see that quantum measurement could potentially reduce channel energy requirements from this threshold by more than one order of magnitude while not requiring more bandwidth.
In certain cases, for instance, for photon-starved communication links, reducing channel energy requirements may be the goal, which is achievable by switching to a quantum measurement at the receiver. However, reducing channel energy requirements does not automatically reduce the total energy consumption of the entire communication link. In fact, the total energy requirements of the state-of-the-art communication link using quantum receivers can be higher than that using classical receivers. Below we discuss if reducing the total energy consumption of communication systems using quantum measurement is fundamentally possible. We also list major technological obstacles that prevent such an energy reduction.
In order to tame the power needs of telecom links, all components of a communication system should be taken into account. Power requirements of some electronic components scale proportionally to the optical power used, and those components dominate the power budget of fast (>10 GB/s) optical communication systems. 100 Quantum measurement reduces the energy of light required to transmit one bit; thus, the power required for those electronic components, excluding the receiver, is reduced proportionally. Displacement quantum receivers require a significantly stronger LO than ideal classical homodyne or heterodyne measurement does. On the other hand, consider a long-distance fiber link where the optical loss is significant. The energy savings at the transmitter scale proportionally to the loss and eventually outweigh the additional optical power needs at the receiver. Yet another issue with quantum receivers is that certain single-photon detectors, such as SPADs, are less energy-efficient than classical detectors. A new generation of single-photon detectors, particularly superconducting nanowire detectors, can use significantly lower currents to reliably register photons than amplified classical detectors, ultimately dissipating approximately 5 aJ per photon detection. 101,102 Therefore, on balance, long-distance communication systems with quantum receivers can fundamentally be more energy efficient than classical systems. Significant energy savings could also come from a conceptual rethinking of the network topology. Currently, a series of optical amplification stations mitigate light loss in fiber. Amplification stations are used because they require less wall power to operate than a transceiver. If transceiver power requirements could be dropped below those of an amplifier, the topology of the network would change significantly.
Given that a large fraction of the optical noise in current networks is due to amplification and optical power-dependent effects (Raman cross-talk, cross- and self-phase modulation, etc.), a quantum-measurement-based communication system can be made nearly noiseless by reducing optical power. Such a nearly noiseless communication system can naturally support the coexistence of classical and quantum communication channels (such as quantum key distribution and entanglement distribution channels). This optimistic outlook faces serious technological challenges. Currently, even the best single-photon detectors at telecom wavelengths can count fewer than 100 × 10^6 photons per second. In addition, adaptive algorithms employed in receivers may require extra time to execute. Thus, per-channel data rates may be lower than those of conventional receivers. Wavelength division multiplexing can alleviate this issue, but it will require denser channel "packing" than is currently used. Such packing would require better frequency stabilization of telecom light sources, multiplexers/demultiplexers with better resolution, etc. Some single-photon detectors, such as superconducting nanowire detectors, require a low ambient temperature to operate. Because these detectors generate very little heat when operating, hundreds of such detectors could share the same cooling module. 102 Also, the efficiency of state-of-the-art cooling systems is far from theoretically optimal, leaving a lot of room for improvement. Lastly, although most proof-of-principle experiments currently use one laser source for both the signal and the local oscillator, local laser sources with long coherence times and phase control should be used to unveil the potential energy savings. To this end, new phase correction protocols are being actively considered.
One such protocol 78 demonstrates phase estimation based on the output of quantum state discrimination, potentially requiring no exchange of phase information between the transmitter and receiver.
In conclusion, in light of the exponential growth of Internet traffic and the capacity crunch, 1,2 research on applied, practical quantum measurement for communications is of urgent importance. We are cautiously optimistic that quantum technology will be used, either on a global scale or at least for some niche applications, in the near future. We hope that our review has helped the curious reader to get acquainted with this exciting field.

FIG. 2.
Resource use per bit for different communication protocols. Bandwidth and the theoretical minimum energy per bit requirements are shown for classical, quantum state discrimination of some communication protocols assuming a symbol error rate P = 10 −5 . The protocols with the same modulation method, but different alphabet lengths M are connected with colored lines. Power-limited protocols are above R/B = 1 line and bandwidth-limited protocols are below R/B = 1 line.

FIG. 7.
Classification of displacement-based quantum receivers. References to experimental demonstrations are in bold.

FIG. 8.
Schematic diagram of first quantum receivers for binary state discrimination. The displacement operation, D, uses a local oscillator state and a beam-splitter. (a) Kennedy-like receiver (non-adaptive) and (b) Dolinar-like receiver (with feedback).

FIG. 11.
Schematic diagram of adaptive displacement receivers: (a) spatial adaptive displacement receiver and (b) temporal adaptive displacement receiver.

FIG. 15.
Potential improvement in resource use of quantum-enabled communication over classical technology. The classical resource use is comprised of shot-noise limits (at P_e = 10^−5) for M-ary PSK (values above R/W = 1) and M-ary PPM (values below R/W = 1), red curve. The potential, but optimistic, quantum bound is Gordon capacity, black curve.

Assumptions for the channel capacity and SER bounds derivations.
Metric | Channel assumptions | Measurement assumptions | Encoding assumptions
Gordon capacity (Holevo) | Lossless, noiseless | Photon number resolving |