Understanding SerDes, Part 2

Source: Internet
Author: User
Tags domain transfer serdes
Understanding SerDes, Part 2 (2012-11-11 21:17:12), reprinted
Tags: dfe serdes it
2.3 Receiver-side equalizer (Rx equalizer)

2.3.1 Linear equalizer

The receive-side equalizer has the same goal as the transmit-side equalizer. For low-speed (<5 Gbps) SerDes, a linear equalizer is typically implemented in the continuous-time domain, for example as a peaking amplifier, whose gain for high-frequency components is greater than its gain for low-frequency components. Figure 2.8 shows the frequency-domain characteristic of a linear equalizer. Vendors typically provide several preset equalization levels, such as high/med/low, which can be selected dynamically to suit different channel characteristics.
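The high-frequency boost described above can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical one-zero, two-pole transfer function H(s) = (1 + s/wz) / ((1 + s/wp1)(1 + s/wp2)). The function name and the zero and pole frequencies are illustrative assumptions and are not taken from any particular device.

```python
import math

def peaking_gain_db(f_hz, f_zero_hz=0.5e9, f_pole1_hz=2.5e9, f_pole2_hz=5e9):
    """Gain in dB of a hypothetical one-zero, two-pole peaking amplifier."""
    s = 1j * 2 * math.pi * f_hz
    wz = 2 * math.pi * f_zero_hz
    wp1 = 2 * math.pi * f_pole1_hz
    wp2 = 2 * math.pi * f_pole2_hz
    h = (1 + s / wz) / ((1 + s / wp1) * (1 + s / wp2))
    return 20 * math.log10(abs(h))

low = peaking_gain_db(10e6)    # far below the zero: close to 0 dB
high = peaking_gain_db(2.5e9)  # Nyquist frequency of a 5 Gbps NRZ signal
print(f"gain at 10 MHz: {low:.2f} dB, gain at 2.5 GHz: {high:.2f} dB")
```

With these assumed values the amplifier leaves low frequencies almost untouched and boosts the Nyquist frequency by roughly 10 dB. Switching between high/med/low settings would correspond to moving the zero and poles.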



Figure 2.8 Frequency response of a peaking amplifier based Rx equalizer

2.3.2 DFE (decision feedback equalizer)

For high-speed (>5 Gbps) SerDes, a linear equalizer alone is no longer sufficient, because signal jitter (such as ISI-related deterministic jitter) can approach or exceed one unit interval (UI). A linear equalizer amplifies noise along with the signal, so it does not improve SNR or BER. High-speed SerDes therefore use a nonlinear equalizer called a DFE (decision feedback equalizer). The DFE predicts the sampling threshold for the current bit from the data decided in several previous UIs (the history bits). Because its feedback is built from already-decided bits rather than from the noisy analog input, the DFE removes interference without amplifying noise, which effectively improves the SNR.

Figure 2.9 illustrates a typical 5-tap DFE. A comparator (slicer) decides whether each received serial bit is 0 or 1. The decided data stream then passes through a filter that predicts the inter-symbol interference (ISI), and the predicted ISI is subtracted from the original input signal to give a clean signal. To keep the DFE circuit operating within its linear range, a VGA first automatically adjusts the amplitude of the serial signal entering the DFE.

To understand how the DFE works, first look at the impulse response of a 10 Gbps backplane. This is a measurement-based model with typical characteristics, taken from MATLAB.


In Figure 2.10, one horizontal grid division represents one UI. As can be seen, a pulse one UI wide (0.1 ns = 1/10 GHz) leaks, after passing through the backplane, into several adjacent UIs before and after it, thereby interfering with the data in those UIs. Interference after the sampling point is called post-cursor interference; interference before the sampling point is called pre-cursor interference. The first DFE coefficient H1 (0.175 in this example) corrects the first post-cursor, and the second coefficient H2 (0.075 in this example) corrects the second post-cursor. The more taps the DFE has, the more post-cursors it can correct.



Suppose the bit stream 11011 is sent over this backplane. Because of post-cursor and pre-cursor leakage, the '0' may not be recognized without equalization; see Figure 2.11. With a 2-tap DFE, the amplitude at the '0' bit has H2 subtracted for the first '1' bit and H1 subtracted for the second '1' bit, giving 0.35 - 0.075 - 0.175 = 0.1, which is low enough to be recognized as 0.
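The arithmetic of the 11011 example can be written out as a small function. This is only a sketch of the subtraction step: the function name and argument layout are assumptions, and the slicer threshold from Figure 2.11 is not reproduced here.

```python
def dfe_correct(sample, history_bits, taps):
    """Subtract the post-cursor ISI predicted from already-decided bits.

    history_bits[0] is the most recent decision and pairs with taps[0] (H1),
    history_bits[1] is the decision before it and pairs with taps[1] (H2).
    """
    isi = sum(h * b for h, b in zip(taps, history_bits))
    return sample - isi

# Values from the 11011 example: the '0' bit arrives at 0.35,
# both preceding bits are 1, H1 = 0.175 and H2 = 0.075.
corrected = dfe_correct(0.35, [1, 1], [0.175, 0.075])
print(round(corrected, 3))
```

A history bit of 0 contributes nothing in this unipolar form, so the same function also covers patterns where only one of the two preceding bits is 1.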

As can be seen, the DFE computes the post-cursor interference caused by the history bits and subtracts it from the current bit, yielding a clean signal. Since the DFE can only correct post-cursor ISI, it is typically preceded by a linear equalizer (LE). As long as the DFE coefficients are close to those of the channel, the desired result is achieved.

However, the channel is a time-varying medium: slow changes in temperature and voltage, process variation, and other factors alter its characteristics. The DFE coefficients therefore need an adaptive algorithm that automatically acquires and tracks changes in the channel. DFE coefficient adaptation is a heavily academic topic, and each vendor keeps its algorithm confidential. For NRZ coding, the typical algorithms are sign-error driven. The sign-error is the sign of the difference between the amplitude of the equalized signal and its expected value; the algorithm takes the mean square of this error as its optimization target and successively optimizes h1/h2/h3 and so on.

Since the sign-error and the sampling position are coupled, the DFE coefficients can also be optimized against two criteria together: the sign-error and the eye width. Therefore, a SerDes with a DFE usually includes an embedded eye-diagram test circuit, as shown in Figure 2.9. The eye-diagram test circuit shifts the signal amplitude vertically and the sampling position horizontally, measures the BER at each offset, and produces an "eye diagram" of bit error rate versus offset position; see Figure 2.12.
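One publicly documented member of the sign-error-driven family is sign-sign LMS. The sketch below uses it to adapt a 2-tap DFE on a toy channel. It is not any vendor's algorithm. The symbols are bipolar (+1/-1), the main-cursor amplitude of 0.5 is an assumed value that the adaptation is given in advance, and the step size, symbol count, and all names are illustrative choices.

```python
import random

def sign(x):
    return 1 if x >= 0 else -1

def adapt_dfe(num_symbols=5000, mu=0.001, seed=0):
    """Sign-sign LMS adaptation of a 2-tap DFE on a toy channel."""
    cursor = 0.5            # assumed main-cursor amplitude
    post = [0.175, 0.075]   # post-cursors H1, H2 from the example above
    taps = [0.0, 0.0]       # DFE coefficients being adapted
    rng = random.Random(seed)
    sent = [1, 1]           # last two transmitted symbols, newest first
    decided = [1, 1]        # last two slicer decisions, newest first
    for _ in range(num_symbols):
        a = rng.choice((-1, 1))
        rx = cursor * a + post[0] * sent[0] + post[1] * sent[1]
        eq = rx - taps[0] * decided[0] - taps[1] * decided[1]
        d = sign(eq)                 # slicer decision
        err = eq - cursor * d        # deviation from the expected amplitude
        for i in range(2):
            # Only the signs of the error and of the history bit are used.
            taps[i] += mu * sign(err) * decided[i]
        sent = [a, sent[0]]
        decided = [d, decided[0]]
    return taps

h1, h2 = adapt_dfe()
print(f"h1 = {h1:.3f}, h2 = {h2:.3f}")
```

In this toy setup the taps settle within a few step sizes of 0.175 and 0.075 and then dither around them. A smaller step size reduces the dither but slows tracking of channel changes.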


Figure 2.12 SerDes Embedded Eye-diagram Test Function

2.4 Clock and data recovery (CDR)

The goal of the CDR is to find the best sampling instant, which requires plenty of transitions in the data. A CDR has a specification called maximum run length tolerance (Max Run Length, or Consecutive Identical Digits, CID): the longest run of consecutive 0s or 1s it can tolerate. If the data has no transitions for a long time, the CDR cannot be trained accurately; its sampling instant drifts, and it may sample one more or one fewer 1 or 0 than the real data. When transitions resume, sampling errors may also occur. For example, some CDRs are implemented with a PLL, and the PLL output frequency drifts if the data stops toggling for a long time. In practice, data transmitted over a SerDes is either scrambled or encoded so that the maximum run length stays within a certain range.

• 8B/10B encoding guarantees a maximum run length of 5 UI.

• 64b/66b encoding guarantees that the maximum run length does not exceed 66 UI.

• SONET/SDH scrambling ensures that the maximum run length does not exceed 80 UI (BER < 10^-12).
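The run-length figures above can be checked on any captured bit pattern with a few lines of code. This is a minimal sketch with an assumed function name. The sample pattern is the 8B/10B comma code group K28.5 in one of its two running disparities, which contains the longest run that 8B/10B allows.

```python
def max_run_length(bits):
    """Length of the longest run of consecutive identical digits."""
    longest = current = 0
    previous = None
    for bit in bits:
        current = current + 1 if bit == previous else 1
        longest = max(longest, current)
        previous = bit
    return longest

print(max_run_length("0011111010"))  # K28.5, contains five consecutive 1s
```

The function accepts any sequence of comparable items, so a string of '0'/'1' characters or a list of integers both work.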

Most SerDes protocols use continuous mode over point-to-point connections, where traffic on the line is continuous and uninterrupted. Point-to-multipoint connections, such as PON, often use burst mode. Burst mode clearly places stringent requirements on the SerDes lock time.

Continuous-mode protocols such as SONET/SDH must tolerate longer runs of consecutive 0s, and they place stringent requirements on the jitter transfer performance of the CDR (because of loop timing).

If Rx and Tx operate in asynchronous mode, or in spread spectrum clocking (SSC) applications, the CDR needs a wide phase tracking range to track the Rx/Tx frequency difference.

CDR implementations come in a wide variety of architectures, depending on the application. FPGA SerDes often use digital PLL (DPLL) based CDRs and phase interpolator based CDRs. Both use a digital filter in the loop, which saves area compared with a charge pump plus analog filter structure.

Figure 2.13 shows a phase interpolator based CDR. The phase of the input serial data is compared with M evenly spaced clock phases over multiple UIs to obtain phase error signals spanning multiple UIs. The phase error signal has a very high rate and a wide word width, so it is decimated and smoothed by the decimator before being sent to the digital filter. The performance of the digital filter affects the bandwidth, stability, and response speed of the loop. The error signal smoothed by the digital filter is sent to the phase interpolator (phase rotator) to correct the clock phase. Once the loop is locked, the phase error is theoretically zero, and the clock offset by 90 degrees serves as the recovered clock that samples the serial input.





Figure 2.14 shows a DPLL-based CDR, which is divided into two loops. The data phase tracking loop is similar to the CDR in Figure 2.13. The phase of the input serial data is compared with M evenly spaced clock phases (possibly over multiple UIs) to obtain the phase error signal, which is sent to the digital filter. The performance of the digital filter affects the bandwidth, stability, and response speed of the loop. The error signal smoothed by the digital filter is sent to the VCO to correct the clock phase. Once the loop is locked, the phase error is theoretically zero, and the clock offset by 90 degrees serves as the recovered clock that samples the serial input.

The DPLL-based CDR has an additional frequency tracking loop. Its purpose is to reduce the lock time of the CDR and to relax the design constraints on the loop filter. The CDR switches to the data phase tracking loop only after the frequency tracking loop has locked, and when the phase tracking loop loses lock it automatically switches back to the frequency tracking loop. Because N times the reference clock frequency is nearly equal to the line rate, the steady-state VCO control voltage is nearly the same for both loops. With the frequency tracking loop, the capture time of the phase tracking loop is therefore reduced.
When the phase tracking loop is locked, the frequency tracking loop does not affect it. Therefore, the SerDes receiver does not place demanding requirements on reference clock jitter.

In a phase interpolator based CDR, the reference clock can come from either a PLL shared among channels or a separate PLL for each channel. In this structure, reference clock jitter directly affects the jitter of the recovered clock and the received bit error rate.

• Phase detector (PD)

The phase detector compares the phase of the data with that of the clock. The phase error is expressed by UP or DN signals, and the UP/DN duration is proportional to the phase error. Figure 2.15 shows an example of a bang-bang phase detector; the example uses only four recovered clock phases.


• Decimation and filters

The decimator lets the filter operate at a lower frequency. The decimation length and the smoothing method both affect loop performance. The digital filter consists of a proportional branch and an integral branch, which track the phase error and the frequency error respectively. In addition, the processing delay of the digital filter must not be too large; otherwise the loop cannot track rapid changes in phase and frequency, which results in bit errors.
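The roles of the proportional and integral branches can be seen in a toy behavioral model. The sketch below is an assumption-laden illustration, not a model of any real CDR: phases are expressed in UI, the detector is an idealized bang-bang detector that sees the exact phase error every UI, decimation and processing delay are not modeled, and the gains `kp` and `ki` are arbitrary.

```python
def simulate_cdr(freq_offset=1e-3, kp=1e-2, ki=1e-4, steps=5000):
    """Track a data phase ramp with a bang-bang detector and a PI filter.

    Returns (average integral-branch value, largest |phase error|),
    both measured over the final 1000 steps.
    """
    data_phase = clock_phase = integral = 0.0
    integral_log, error_log = [], []
    for _ in range(steps):
        data_phase += freq_offset            # Tx/Rx frequency difference per UI
        error = data_phase - clock_phase
        vote = 1 if error > 0 else -1        # bang-bang: only the sign is used
        integral += ki * vote                # integral branch tracks frequency
        clock_phase += kp * vote + integral  # proportional branch tracks phase
        integral_log.append(integral)
        error_log.append(abs(error))
    tail = 1000
    return sum(integral_log[-tail:]) / tail, max(error_log[-tail:])

freq_estimate, worst_error = simulate_cdr()
print(f"integral branch settles near {freq_estimate:.5f} UI/UI")
print(f"worst phase error after lock: {worst_error:.4f} UI")
```

After lock, the integral branch holds a value close to the frequency offset, while the proportional branch makes the clock phase dither around the data phase by roughly one proportional step.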

The CDR structure is not limited to the two above; there are many other variants, but all are essentially phase-locked loops. Analysis of loop tracking performance, stability, and bandwidth/gain is a heavily academic topic based on small-signal linear models, and many books and references explain the quantitative performance of loops. Some characteristics of the CDR loop are summarized below:

• Loop bandwidth

1. Phase jitter at frequencies below the loop bandwidth passes through the CDR onto the recovered clock. In other words, jitter below the loop bandwidth is tracked by the CDR and does not cause bit errors. Whether higher-frequency jitter causes bit errors depends on its amplitude.

2. The larger the loop bandwidth, the shorter the lock time and the greater the jitter of the recovered clock. Conversely, a smaller loop bandwidth gives a longer lock time and less jitter on the recovered clock. For a CDR, a larger loop bandwidth is desirable because it gives greater jitter tolerance; but in loop-timing applications such as SONET/SDH, the jitter of the recovered clock is limited, so the bandwidth cannot be too large.

3. The switching frequency of a switching power supply is generally below the loop bandwidth, so the jitter it causes can be tracked by the CDR. However, switching supply noise coupled into the VCO (or the digital-to-multi-phase converter) cannot be tracked by the loop, and low-cost ring VCOs are particularly sensitive to power supply noise. In addition, harmonics of the switching supply may exceed the loop bandwidth.
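The dependence of tracking on jitter frequency can be made concrete with a first-order loop model. This is a simplifying assumption: real CDR loops are higher order and nonlinear. Here the jitter transfer is H(f) = 1 / (1 + j f / f_bw), and the fraction of input jitter that the loop fails to track is |1 - H(f)|. The 4 MHz bandwidth and the two jitter frequencies are illustrative values.

```python
def untracked_fraction(f_jitter_hz, loop_bw_hz):
    """|1 - H(f)| for a first-order loop H(f) = 1 / (1 + j f / f_bw)."""
    ratio = 1j * f_jitter_hz / loop_bw_hz
    return abs(ratio / (1 + ratio))

loop_bw = 4e6  # hypothetical 4 MHz loop bandwidth
slow = untracked_fraction(100e3, loop_bw)  # jitter well inside the bandwidth
fast = untracked_fraction(100e6, loop_bw)  # jitter far outside the bandwidth
print(f"untracked at 100 kHz: {slow:.3f}, at 100 MHz: {fast:.3f}")
```

In this model only a few percent of the 100 kHz jitter reaches the sampler as phase error, while essentially all of the 100 MHz jitter does, so its amplitude alone decides whether bit errors occur.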

Some protocols, such as SDH/SONET, specify a CDR gain (jitter transfer) mask. Compliance with these protocols requires calculating the jitter budgets for inputs and outputs.

2.5 Common phase-locked loop (PLL)
SerDes requires an internal clock running at the data baud rate, or at half the baud rate when the internal clock works in DDR mode. The reference clock supplied to the SerDes is much lower in frequency than the data baud rate, so a PLL is used to generate the internal high-frequency clock. The SerDes PLL of an FPGA typically offers 8x, 10x, 16x, 20x, and 40x modes to support commonly used SerDes interface protocols. For example, PCI Express running at 5 Gbps needs a 125 MHz off-chip reference clock in 40x mode and a 250 MHz off-chip reference clock in 20x mode.
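The relationship between line rate, PLL mode, and reference clock in the example above is a simple division. The helper below is a sketch with an assumed name; whether a given device actually supports a particular mode at a particular rate has to be checked in its data sheet.

```python
def reference_clock_hz(line_rate_bps, multiplier):
    """Reference clock frequency for a line rate and a PLL mode such as 20 (20x)."""
    return line_rate_bps / multiplier

for mode in (8, 10, 16, 20, 40):
    mhz = reference_clock_hz(5e9, mode) / 1e6
    print(f"{mode}x mode at 5 Gbps: {mhz:g} MHz reference clock")
```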

Figure 2.17 shows a third-order PLL circuit. The phase detector compares the phase of the input signal with the phase of the VCO feedback signal; the charge pump converts the phase error into a voltage or current signal; the loop filter smooths it into a control voltage that corrects the phase of the VCO, so that the phase error eventually tends to zero.



Figure 2.17 A 3-order Type II PLL

PLL operation is divided into an acquisition (lock-in) process and a tracking process. During acquisition, the loop is modeled by a nonlinear differential equation, which can be used to evaluate capture time and capture bandwidth. Once locked, within the small-signal range the PLL model is a linear equation with constant coefficients, so PLL bandwidth, gain, stability, and other properties can be studied in the Laplace transform domain. Figure 2.18 shows the small-signal mathematical model.



The order of a PLL is named after the number of poles of its transfer function (the roots of the denominator). The VCO integrates phase (KVCO/s), so a loop without a filter is called a first-order loop, and a loop with a first-order filter is called a second-order loop. First- and second-order loops are unconditionally stable systems. However, higher-order loops have more poles and zeros, which allows bandwidth, gain, stability, capture range, and capture time to be adjusted independently.

The frequency-domain transfer characteristic of the PLL is determined mainly by the loop filter F(s)|s=jω. A typical PLL frequency-domain transfer curve is shown in Figure 2.19. It has two important features: loop bandwidth and jitter peaking. Excessive peaking amplifies jitter. A large damping factor limits peaking but increases the lock time of the loop; it also affects the roll-off rate and the natural frequency.
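The link between damping factor and jitter peaking can be checked numerically. The sketch below assumes the textbook second-order type-II closed-loop form H(s) = (2ζωn s + ωn²) / (s² + 2ζωn s + ωn²), which is a simplification of the third-order loop in Figure 2.17. The function names and the frequency sweep are illustrative choices.

```python
import math

def closed_loop_gain(w, wn, zeta):
    """|H(jw)| of an assumed second-order type-II PLL."""
    s = 1j * w
    return abs((2 * zeta * wn * s + wn**2) / (s**2 + 2 * zeta * wn * s + wn**2))

def peaking_db(zeta, wn=1.0):
    """Largest closed-loop gain in dB over a log sweep from 0.01*wn to 100*wn."""
    freqs = [wn * 10 ** (k / 100) for k in range(-200, 201)]
    return 20 * math.log10(max(closed_loop_gain(w, wn, zeta) for w in freqs))

for zeta in (0.5, 1.0, 2.0):
    print(f"damping factor {zeta}: peaking {peaking_db(zeta):.2f} dB")
```

In this model the peaking shrinks steadily as the damping factor grows, but it never disappears entirely.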

• Fixed phase difference when the loop is locked:

θe = Δω / KDC

where KDC is the DC open-loop gain of the loop and Δω is the difference between the VCO center frequency and the controlled frequency. For the charge pump plus passive filter structure, the PLL phase error is zero.

• When the loop is locked, only a fixed phase difference remains, and the frequencies of the two signals at the phase detector inputs are equal:

fr/m = fo/n

• For input noise, the loop acts as a low-pass filter that suppresses noise or interference above the loop cutoff frequency. For a SerDes PLL, a smaller bandwidth is desirable to suppress interference and noise on the reference clock.

For VCO noise, the loop acts as a high-pass filter: only VCO noise below the loop cutoff frequency is suppressed. Excessive high-frequency VCO noise worsens clock jitter. Low-speed SerDes (<5 Gbps) use a ring VCO for cost reasons, which is noisy and sensitive to power supply noise. High-speed SerDes use an LC VCO, which has lower noise.
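The low-pass and high-pass behavior can be compared side by side. The sketch assumes a second-order type-II loop, for which reference noise reaches the output through H(s) = (2ζωn s + ωn²) / (s² + 2ζωn s + ωn²) and VCO noise through 1 - H(s) = s² / (s² + 2ζωn s + ωn²). Frequencies are normalized to the natural frequency, and all names are hypothetical.

```python
def noise_gains(w, wn=1.0, zeta=1.0):
    """Return (reference-noise gain, VCO-noise gain) at angular frequency w."""
    s = 1j * w
    den = s**2 + 2 * zeta * wn * s + wn**2
    ref_gain = abs((2 * zeta * wn * s + wn**2) / den)  # low-pass path
    vco_gain = abs(s**2 / den)                         # high-pass path
    return ref_gain, vco_gain

ref_lo, vco_lo = noise_gains(0.01)   # far below the loop bandwidth
ref_hi, vco_hi = noise_gains(100.0)  # far above the loop bandwidth
print(f"below bandwidth: reference {ref_lo:.4f}, VCO {vco_lo:.4f}")
print(f"above bandwidth: reference {ref_hi:.4f}, VCO {vco_hi:.4f}")
```

The two gains trade places across the loop bandwidth, which is why the bandwidth choice is a compromise between cleaning up the reference clock and suppressing VCO noise.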


Not finished .....
