DDR Signals and FPGA
What is a DDR signal/bus
In electronics, many synchronous buses use a clock line and one or more data lines for reliable data transfer. The clock line dictates when the data lines should be sampled on the receiving end, while the data lines carry signals synchronized to that clock.
If a signal is sampled on only one edge of the clock, either rising or falling, we have an SDR (Single Data Rate) signal. In contrast, if both the rising and the falling edge of the clock are used to transfer data, we have a DDR (Double Data Rate) signal. A DDR signal can therefore carry twice as much data as an SDR signal clocked at the same frequency.
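As a rough worked example, the relation can be written out with symbols introduced here purely for illustration: f is the clock frequency, w the number of data lines and B the raw throughput. The numbers are the 125 MHz clock and 4 data lines of the RGMII receive path covered later in this article.

```latex
B_{\mathrm{SDR}} = f \cdot w, \qquad B_{\mathrm{DDR}} = 2 \cdot f \cdot w

% RGMII receive path: f = 125 MHz, w = 4 data lines
B_{\mathrm{DDR}} = 2 \cdot 125\,\mathrm{MHz} \cdot 4 = 1000\,\mathrm{Mbit/s}
```

An SDR bus with the same 4 data lines would need a 250 MHz clock to reach the same throughput.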
Today, DDR buses are most widely used for RAM.
DDR vs SDR with double the frequency
A natural question is why not simply use SDR at double the clock frequency instead of the more complicated DDR signal. The bandwidth and latency of the two signals would be identical, but DDR still offers benefits in PCB trace routing, power consumption and EMI. Generating a clock at half the frequency consumes less power and relaxes the constraints on PCB layout design. The technique is not without disadvantages, though, as the input circuitry diverges from the standard flip-flop model.
Interfacing DDR bus to FPGA
Physical Interfacing
When working out the PCB layout constraints, it's important to check FPGA bank voltage compatibility, as a single FPGA bank can use only one I/O voltage level. Another, more common, mistake is not connecting the input clock RX_CLK to an SRCC- or MRCC-capable FPGA pin. For performance and signal integrity reasons, FPGA manufacturers can route clocks over dedicated clock networks that are separate from the general-purpose routing used by ordinary signals. Those clock networks are not reachable from every FPGA pin, so the data sheet of the selected chip lists the pins that can accept clock inputs. On AMD Xilinx FPGAs those pins are marked SRCC (Single-Region Clock Capable) and MRCC (Multi-Region Clock Capable). A good practice is, if possible, to build the firmware with the planned pin assignment, as the place-and-route (PnR) software will issue an error or warning when the clock is not on a clock-capable pin.
RTL Interfacing
FPGA chips are not guaranteed to include circuitry or flip-flop types that capture data on both clock edges. This means a DDR signal has to be converted to SDR signals at the chip's input. Generic Verilog code for that task would look like this:
module iddr (
    input  wire clk, // input clock
    input  wire d,   // DDR data input
    output wire q1,  // SDR data output, sampled on the rising edge
    output wire q2   // SDR data output, sampled on the falling edge
);
    reg d_tmp_1 = 1'b0;
    reg d_tmp_2 = 1'b0;
    reg q_tmp_1 = 1'b0;
    reg q_tmp_2 = 1'b0;

    assign q1 = q_tmp_1;
    assign q2 = q_tmp_2;

    // sample on the rising edge
    always @(posedge clk) begin
        d_tmp_1 <= d;
    end

    // sample on the falling edge
    always @(negedge clk) begin
        d_tmp_2 <= d;
    end

    // sync both samples to the rising edge
    always @(posedge clk) begin
        q_tmp_1 <= d_tmp_1;
        q_tmp_2 <= d_tmp_2;
    end
endmodule
The iddr module samples the data on the rising edge with one flip-flop and on the falling edge with another, then registers both samples on the rising edge so that the q1 and q2 outputs are synchronized to the clk signal. While this module is a generic solution for a 1-bit signal, FPGA manufacturers provide input DDR primitives with better performance. One such primitive is IDDR, available on Xilinx 7 Series FPGAs. An example instantiation looks like this:
IDDR #(
    .DDR_CLK_EDGE("SAME_EDGE_PIPELINED"),
    .SRTYPE("ASYNC")
)
iddr_inst (
    .Q1(q1),
    .Q2(q2),
    .C(clk),
    .CE(1'b1),
    .D(d[n]),
    .R(1'b0),
    .S(1'b0)
);
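To check the generic iddr module from the earlier listing in simulation, a small self-checking testbench can help. The following is a minimal sketch, not part of the original design: the module name iddr_tb, the bit patterns and the timing values are all assumptions chosen for illustration. It drives d so that it is stable around each clock edge and compares q1/q2 against the bits that were sent.

```verilog
`timescale 1ns / 1ps

// Hypothetical testbench for the generic iddr module (names and values are illustrative)
module iddr_tb;
    localparam N = 4;                 // number of bit pairs to send

    reg  clk = 1'b0;
    reg  d   = 1'b0;
    wire q1, q2;

    reg [N-1:0] rise_bits = 4'b1010;  // bits presented around rising edges
    reg [N-1:0] fall_bits = 4'b0110;  // bits presented around falling edges

    integer i, j;
    integer errors = 0;

    // 125 MHz clock: rising edges at 4, 12, 20 ns, falling edges at 8, 16, 24 ns
    always #4 clk = ~clk;

    iddr dut (.clk(clk), .d(d), .q1(q1), .q2(q2));

    // Driver: change d 2 ns before every clock edge (center-aligned data)
    initial begin
        #2;
        for (i = 0; i < N; i = i + 1) begin
            d = rise_bits[i]; #4;
            d = fall_bits[i]; #4;
        end
    end

    // Checker: pair 0 appears on q1/q2 after the rising edge at 12 ns,
    // every following pair one clock period (8 ns) later
    initial begin
        #13;
        for (j = 0; j < N; j = j + 1) begin
            if (q1 !== rise_bits[j] || q2 !== fall_bits[j]) begin
                $display("FAIL pair %0d: q1=%b q2=%b", j, q1, q2);
                errors = errors + 1;
            end
            #8;
        end
        if (errors == 0) $display("PASS");
        $finish;
    end
endmodule
```

The checker waits one full clock period after the first rising sample because the module registers both samples a second time before presenting them on q1 and q2.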
RGMII example
An RGMII interface, as discussed on our blog, has 4 data lines, a control line and a clock for input. The output is implemented in the same fashion with the data flow reversed. In this part, we'll discuss the input, that is, how to get the FPGA to read RGMII data. The strategy is to take the 4 data lines and 1 control line at 125 MHz DDR and create 8 data lines and 2 control lines at 125 MHz SDR. This can be achieved with the generic iddr module shown earlier. Applying the module looks like this:
//...
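iddr iddr_rx_d0 (
    .clk(rgmii_rx_clk),
    .d(rgmii_rx[0]),
    .q1(gmii_rx[0]), // rising edge carries the low nibble (bits 3:0)
    .q2(gmii_rx[4])  // falling edge carries the high nibble (bits 7:4)
);
//... apply for all data lines ...
iddr iddr_rx_ctl (
    .clk(rgmii_rx_clk),
    .d(rgmii_rx_ctl),
    .q1(rx_ctl_1),   // rising edge: RX_DV
    .q2(rx_ctl_2)    // falling edge: RX_DV xor RX_ER
);
assign gmii_rx_dv = rx_ctl_1;
assign gmii_rx_er = rx_ctl_1 ^ rx_ctl_2;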
// the gmii rx clock can be derived by forwarding the rgmii rx clock
// it's recommended to use a clock buffer primitive such as BUFR
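The "apply for all data lines" step can also be written as a generate loop. The following is a minimal sketch under stated assumptions: the wrapper name rgmii_rx_sketch and its port list are hypothetical, it reuses the generic iddr module, and it assumes the RGMII convention that the rising edge carries bits 3:0 and the falling edge bits 7:4 of each byte.

```verilog
// Hypothetical wrapper (name and ports are illustrative, not from the original design)
module rgmii_rx_sketch (
    input  wire       rgmii_rx_clk, // 125 MHz receive clock
    input  wire [3:0] rgmii_rx,     // DDR data lines
    input  wire       rgmii_rx_ctl, // DDR control line
    output wire [7:0] gmii_rx,      // SDR data byte
    output wire       gmii_rx_dv,   // data valid
    output wire       gmii_rx_er    // receive error
);
    wire rx_ctl_1;
    wire rx_ctl_2;

    // One iddr instance per data line
    genvar i;
    generate
        for (i = 0; i < 4; i = i + 1) begin : gen_rx_data
            iddr iddr_rx_d (
                .clk(rgmii_rx_clk),
                .d  (rgmii_rx[i]),
                .q1 (gmii_rx[i]),     // rising edge: low nibble
                .q2 (gmii_rx[i + 4])  // falling edge: high nibble
            );
        end
    endgenerate

    // Control line: RX_DV on the rising edge, RX_DV xor RX_ER on the falling edge
    iddr iddr_rx_ctl (
        .clk(rgmii_rx_clk),
        .d  (rgmii_rx_ctl),
        .q1 (rx_ctl_1),
        .q2 (rx_ctl_2)
    );

    assign gmii_rx_dv = rx_ctl_1;
    assign gmii_rx_er = rx_ctl_1 ^ rx_ctl_2;
endmodule
```

Clock buffering is left out of this sketch on purpose, since the right buffer depends on the device and on where the rest of the receive logic is placed.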
While the presented code snippet is theoretically enough for reading data from RGMII, in practice the data/control lines have to be shifted relative to the clock line for more reliable sampling. This can be achieved using IDELAY primitives, which provide fine-grained delay steps. Delaying the data/control lines moves the signal transitions away from the sampling clock edges, so that the setup and hold requirements are met and only valid data is sampled.
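As an illustration of such a delay on a 7 Series device, the fragment below places a fixed IDELAYE2 on one data line. It is a sketch under assumptions, not a tuned design: the module name, the port names and the 200 MHz reference clock are hypothetical, and the IDELAY_VALUE of 0 is a placeholder that would have to be chosen from timing analysis or measurement. With a 200 MHz reference clock each of the 32 taps adds roughly 78 ps. The primitives come from the Xilinx library, so simulating the fragment requires the vendor simulation models.

```verilog
// Hypothetical fragment: fixed input delay on one RGMII data line (7 Series)
module rgmii_rx_delay_sketch (
    input  wire ref_clk_200,     // assumed 200 MHz reference clock for IDELAYCTRL
    input  wire rst,             // reset for the delay calibration block
    input  wire rgmii_rx_d0,     // data line coming from the input pin
    output wire rgmii_rx_d0_dly, // delayed data, to be fed into the IDDR
    output wire idelay_rdy       // high once the delay taps are calibrated
);
    // Calibrates the delay taps of the IDELAYE2 elements in its clock region
    IDELAYCTRL idelayctrl_inst (
        .RDY   (idelay_rdy),
        .REFCLK(ref_clk_200),
        .RST   (rst)
    );

    IDELAYE2 #(
        .CINVCTRL_SEL         ("FALSE"),
        .DELAY_SRC            ("IDATAIN"), // delay the signal coming from the pin
        .HIGH_PERFORMANCE_MODE("FALSE"),
        .IDELAY_TYPE          ("FIXED"),   // tap count fixed at build time
        .IDELAY_VALUE         (0),         // placeholder tap count, range 0..31
        .PIPE_SEL             ("FALSE"),
        .REFCLK_FREQUENCY     (200.0),
        .SIGNAL_PATTERN       ("DATA")
    ) idelay_rx_d0 (
        .IDATAIN    (rgmii_rx_d0),
        .DATAOUT    (rgmii_rx_d0_dly),
        .DATAIN     (1'b0),
        .C          (1'b0),
        .CE         (1'b0),
        .INC        (1'b0),
        .LD         (1'b0),
        .LDPIPEEN   (1'b0),
        .CINVCTRL   (1'b0),
        .CNTVALUEIN (5'd0),
        .CNTVALUEOUT(),
        .REGRST     (1'b0)
    );
endmodule
```

The same pattern would be repeated for the remaining data lines and the control line, with the delayed outputs connected to the D inputs of the IDDR primitives.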
About the author
Asmir Abdulahović
FPGA Team lead
Embedded hardware and software enthusiast.