You need to test, we're here to help.

17 January 2018

Some More PCIe 3.0 Test Examples (Part II)

Figure 1: This shows how a PeRT 3 state-machine log can be invaluable in diagnosing timeouts in requests for presets
Continuing on from our last post, let's look at some more examples of common PCIe 3.0 test scenarios and how a well-equipped PCIe 3.0 testbench would approach them. Recall, if you will, that such a testbench would comprise a real-time digital oscilloscope of suitable bandwidth (such as Teledyne LeCroy's SDA830Zi-B oscilloscope), a protocol-enabled receiver tester (such as Teledyne LeCroy's PeRT 3 Phoenix System), and software that enables simultaneous, correlated views of the protocol and physical layers (such as Teledyne LeCroy's ProtoSync software).

Timeout at Phase 3 Preset Request

As your DUT proceeds through the various stages of the link-equalization process, the PeRT 3 dutifully maintains a state-machine log (Figure 1). Let's say your system is in Phase 2 of the recovery.equalization process. In Phase 2, the add-in card requests a preset for the system board's TxEQ (in our test setup, the PeRT 3 is masquerading as the "system board"), and then another, cycling through all the presets until it arrives at one it's pleased with. Then, the link transitions to Phase 3, where the roles reverse and the system board requests a preset from the add-in card. Only this time, the request never gets a response, and the process times out.

In the PeRT 3 options, you can simply set a trigger for EQ Timeout. When such an event takes place, the PeRT 3 sends a trigger to the oscilloscope, which is fed the upstream and downstream signals via splitters in the connections between the PeRT 3 and the DUT. Once the oscilloscope is triggered, you can capture both signals and see exactly what is happening. With the ProtoSync software, you can look at both the protocol layer and the electrical waveform to pinpoint the source of the misbehavior.
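The failure described in this section, a preset request that never gets a response, can be sketched as a scan over request/response events. This is only an illustration of the idea: the record layout, the field names, the function name, and the timeout value are all hypothetical, and none of it reflects the actual PeRT 3 log format or its trigger implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LogEntry:
    """One event in a hypothetical equalization log (not the PeRT 3 format)."""
    time_ms: float  # timestamp of the event, in milliseconds
    phase: int      # equalization phase in which the event occurred
    kind: str       # "request" or "response"
    preset: int     # preset number being requested or acknowledged


def find_unanswered_request(log: List[LogEntry],
                            timeout_ms: float) -> Optional[LogEntry]:
    """Return the first request with no matching response within timeout_ms."""
    for i, entry in enumerate(log):
        if entry.kind != "request":
            continue
        answered = any(
            later.kind == "response"
            and later.phase == entry.phase
            and later.preset == entry.preset
            and later.time_ms - entry.time_ms <= timeout_ms
            for later in log[i + 1:]
        )
        if not answered:
            return entry
    return None


# Made-up trace: two answered Phase 2 requests, then a Phase 3 request
# that never receives a response.
example_log = [
    LogEntry(0.0, 2, "request", 7),
    LogEntry(1.5, 2, "response", 7),
    LogEntry(3.0, 2, "request", 8),
    LogEntry(4.2, 2, "response", 8),
    LogEntry(10.0, 3, "request", 4),
]

# The timeout here is a placeholder; take the real limit from the PCIe spec.
stalled = find_unanswered_request(example_log, timeout_ms=24.0)
print(stalled)
```

In this made-up trace, the scan returns the Phase 3 request, which is the moment at which you would want the oscilloscope to be triggered.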

DUT Firmware Bug in EQ Settings

Figure 2: In this scenario, the PeRT 3 facilitates tracking down of a firmware bug in the DUT
In another possible scenario, we've acquired the PCIe waveforms on the oscilloscope (Figure 2). We can observe the protocol trace and what is compiled in the PeRT 3's state-machine log. At one point, the "system" (which is really the PeRT 3) requests Preset 4, but nothing happens.

Upon examining the electrical waveform (Figure 2, left), we can see the preset request (pink trace, center right), after which the signal seems to go suddenly to electrical idle. If we zoom in on the waveform (top right), we see that the signal has been attenuated to an amplitude of 20 mV. The other presets didn't have this issue; only Preset 4 did. The problem was traced to a firmware bug, probably a typo, that forced the amplitude to shrink when Preset 4 was requested.

Bit Error-Rate Testing

One of the key tests in the PCIe 3.0 compliance-test suite is a bit error-rate (BER) test. At the end of the recovery.equalization process, the DUT should have settled on the one TxEQ setting that it wants the system board to use. In our test process, however, we can run through the entire range of possible TxEQ settings. In some cases, the DUT chooses a preferred TxEQ setting, yet the BER test fails to reach the required BER of 10^-12 even with that chosen setting in use.
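To get a feel for what a BER target of one error in 10^12 bits means in test time, here is a minimal sketch using the standard zero-error confidence formula, N = -ln(1 - CL) / BER. The 95% confidence level and the function name are assumptions made for illustration and are not taken from the compliance procedure. The 8 GT/s figure is the PCIe 3.0 per-lane line rate.

```python
import math


def bits_for_confidence(target_ber: float, confidence: float = 0.95) -> float:
    """Bits that must pass with zero errors to claim BER <= target_ber.

    Uses N = -ln(1 - CL) / BER, which assumes independent bit errors.
    The 95% default confidence level is an assumption for this sketch.
    """
    return -math.log(1.0 - confidence) / target_ber


LINE_RATE_BPS = 8e9  # PCIe 3.0 runs at 8 GT/s per lane

bits = bits_for_confidence(1e-12)
seconds = bits / LINE_RATE_BPS
print(f"{bits:.2e} bits, about {seconds / 60:.1f} minutes per lane")
```

Under these assumptions, roughly 3 x 10^12 error-free bits are needed, which takes a little over six minutes on one lane for each TxEQ setting tested.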

Figure 3: A bit error-rate test on an add-in card DUT yields an unsatisfactory result
Figure 3 shows a plot of a large number of BER measurements made while sweeping the PeRT 3 through a range of de-emphasis and preshoot settings for the TxEQ filter. The X axis shows de-emphasis amplitude, the Y axis shows bit error rate, and the color-coded plot lines represent different amounts of preshoot.

Unfortunately, no matter which settings are used, this DUT cannot attain a BER better than 10^-8. So there's no chance, even with a perfectly tuned algorithm, that this DUT will ever yield a BER of 10^-12. This would cause real problems if the DUT were used in a real system, and it would never pass PCIe 3.0 link-equalization compliance testing. PCIe assumes a low BER; a higher one can cause the link to fail and retrain at a sub-optimal data rate.
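A quick back-of-the-envelope calculation shows how far apart those two BER figures are in practice. This sketch assumes a single lane at the PCIe 3.0 line rate of 8 GT/s and errors spread evenly over time; the function names are illustrative only.

```python
LINE_RATE_BPS = 8e9  # PCIe 3.0 runs at 8 GT/s per lane


def errors_per_second(ber: float, line_rate_bps: float = LINE_RATE_BPS) -> float:
    """Average bit errors per second on one lane at the given BER."""
    return ber * line_rate_bps


def seconds_between_errors(ber: float, line_rate_bps: float = LINE_RATE_BPS) -> float:
    """Average time between bit errors on one lane at the given BER."""
    return 1.0 / errors_per_second(ber, line_rate_bps)


measured_floor = errors_per_second(1e-8)      # best BER this DUT achieved
target_gap = seconds_between_errors(1e-12)    # BER that the test requires

print(f"BER 1e-8:  about {measured_floor:.0f} errors per second")
print(f"BER 1e-12: about one error every {target_gap:.0f} seconds")
```

At the required BER, a lane averages one bit error every 125 seconds or so. At this DUT's best measured BER, it averages about 80 errors every second.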

Previous posts in this series:

The Hows and Whys of PCIe 3.0 Dynamic Link Equalization
PCIe 3.0 Dynamic Link EQ: De-Emphasis, Preshoot, Cursors, and Presets
An Under-The-Hood View of PCIe 3.0 Link Training (Part I)
An Under-The-Hood View of PCIe 3.0 Link Training (Part II)
A Tour of a PCIe 3.0 Test Setup
Some PCIe 3.0 Test Examples
