June 11, 2022

FPGA - entities and gadgets

In several places you can find tables like the one below, listing the resources that exist within a particular FPGA. The purpose of this page is to explore exactly what these things are.
This will most certainly not be a fully detailed description, but rather a sampler to give a feel for what these gadgets in an FPGA are.
XC7Z010  - Zynq-7000  2011  2200 clb, 17600 6-lut, 1500 slicem,  60 bram,  80 dsp, 2 cmt, 100 uio
XC7Z020  - Zynq-7000  2011  6650 clb, 53200 6-lut, 4350 slicem, 140 bram, 220 dsp, 4 cmt, 200 uio
Just to clear the air on this -- almost nobody talks about how many "gates" a given FPGA has. People are interested in how many CLB and other resources a device has.

One of the first things you might notice is that the count of 6-lut is 8 times the count of clb. This might lead you to suspect that each clb contains 8 of these 6-lut. It turns out that this is exactly right. Also, be aware that a SLICEM is part of a clb (but not part of all of them).
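Here is a quick sanity check of that ratio, as a small Python sketch. The numbers are copied from the table above; the dictionary layout and the function names are just my own choices for illustration.

```python
# Resource counts copied from the table above.
devices = {
    "XC7Z010": {"clb": 2200, "lut6": 17600, "slicem": 1500},
    "XC7Z020": {"clb": 6650, "lut6": 53200, "slicem": 4350},
}

def lut_per_clb(name):
    """Ratio of 6-LUT count to CLB count for one device."""
    d = devices[name]
    return d["lut6"] / d["clb"]

def slicem_per_clb(name):
    """Average number of SLICEM per CLB (less than 1 means some CLB have none)."""
    d = devices[name]
    return d["slicem"] / d["clb"]

for name in devices:
    print(name, lut_per_clb(name), round(slicem_per_clb(name), 3))
```

Both devices come out at 8 LUT per CLB, and the SLICEM ratio is below 1, which fits the remark that not every clb has a SLICEM.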

All of the recent Xilinx FPGA architectures use 6-LUT. Some ancient devices used 4-LUT blocks.

In addition, different Zynq chips use different classes of FPGA fabric. The Zynq devices I have (listed above) use the Artix-7 fabric, but the next Zynq chips up the line (XC7Z030 and above) use the Kintex-7 fabric. Kintex yields faster switching times, but at a higher cost.

Interestingly, the "deep dive" article suggests that you can use Vivado to "zoom in" on a generated design and see exactly how a CLB (or anything else presumably) was configured for your design.

A CLB consists of two slices, either a pair of SLICEL or a SLICEM and a SLICEL. "L" is for logic and "M" is for memory. Each slice contains four LUT and eight flip-flops, along with carry logic and three types of multiplexers. A SLICEM is special in that its LUT can also be configured as memory or as shift registers.
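The slice makeup above lets you work forward from the CLB count to the other totals. This is only an arithmetic sketch using the per-slice figures from the paragraph above; the constant and function names are mine.

```python
# Per-CLB and per-slice figures taken from the description above.
SLICES_PER_CLB = 2
LUTS_PER_SLICE = 4
FFS_PER_SLICE = 8

def clb_totals(clb_count):
    """Derive slice, LUT and flip-flop totals from a CLB count."""
    slices = clb_count * SLICES_PER_CLB
    return {
        "slices": slices,
        "luts": slices * LUTS_PER_SLICE,
        "flip_flops": slices * FFS_PER_SLICE,
    }

print("XC7Z010", clb_totals(2200))
print("XC7Z020", clb_totals(6650))
```

The LUT totals that fall out of this match the 6-lut column in the table at the top of the page.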

The LUT has 2 outputs, allowing it to be split into a pair of smaller LUT side by side with flexible numbers of inputs. Alternatively, two of the LUT inputs can be configured to act as the select lines of a 4:1 MUX, selecting among the remaining 4 inputs. Or you can combine two LUT to get an 8:1 MUX, or four LUT to get a 16:1 MUX.
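To get a feel for this, it helps to remember that a 6-LUT is nothing more than a 64 entry truth table. The Python sketch below models that idea and fills the table so it behaves as a 4:1 MUX, then glues two of them together into an 8:1 MUX. The bit ordering, the input assignment and all of the names are assumptions made for illustration; they are not taken from any Xilinx primitive, and the final 2:1 selection in `mux8` simply stands in for whatever dedicated multiplexer the slice provides.

```python
def make_lut6(func):
    """Build a 64-bit truth table: bit `addr` holds func() for input pattern `addr`."""
    init = 0
    for addr in range(64):
        bits = [(addr >> k) & 1 for k in range(6)]  # bits[0] is input 0
        if func(*bits):
            init |= 1 << addr
    return init

def lut6(init, *inputs):
    """Look up the output of a 6-input LUT for the given six input bits."""
    addr = sum(bit << k for k, bit in enumerate(inputs))
    return (init >> addr) & 1

def mux4(d0, d1, d2, d3, s0, s1):
    """Reference 4:1 MUX: two select bits choose one of four data bits."""
    return (d0, d1, d2, d3)[s0 | (s1 << 1)]

# Truth table that makes one 6-LUT act as a 4:1 MUX.
MUX4_INIT = make_lut6(mux4)

def mux8(data, sel):
    """8:1 MUX from two LUT-based 4:1 MUXes plus a final 2:1 selection."""
    lo = lut6(MUX4_INIT, *data[0:4], sel[0], sel[1])
    hi = lut6(MUX4_INIT, *data[4:8], sel[0], sel[1])
    return hi if sel[2] else lo

# Select input 5 (binary 101) from a one-hot data pattern.
data = [0, 0, 0, 0, 0, 1, 0, 0]
print(mux8(data, (1, 0, 1)))
```

The same trick repeated once more (four LUT and another level of selection) gives the 16:1 case.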

Each slice has logic to implement carry lookahead via a combination of XOR gates and dedicated multiplexers. The carry chain in one slice covers 4 bits, so a column of slices can be cascaded for fast addition and multiplication of objects bigger than 4 bits.
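The XOR plus multiplexer arrangement can be modeled in a few lines. In this Python sketch I assume that a "propagate" bit (a XOR b) is produced ahead of the chain, that an XOR gate forms each sum bit, and that a multiplexer either passes the incoming carry along or injects one of the operand bits. That is my reading of how such a chain works, written as a behavioral model, not a description of the actual silicon.

```python
def carry_chain_add(a, b, width, carry_in=0):
    """Add two unsigned integers bit by bit using an XOR/MUX carry chain model.

    Returns (sum truncated to `width` bits, carry out).
    """
    carry = carry_in
    total = 0
    for i in range(width):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        propagate = ai ^ bi                   # assumed to be computed ahead of the chain
        sum_bit = propagate ^ carry           # XOR gate forms the sum bit
        carry = carry if propagate else ai    # MUX: pass the carry along, or generate from ai
        total |= sum_bit << i
    return total, carry

# An 8 bit add spans two 4 bit stages; 200 + 100 overflows 8 bits.
print(carry_chain_add(200, 100, 8))
```

When the two operand bits are equal the outgoing carry is just that bit, and when they differ the incoming carry ripples through, which is why a single multiplexer per bit is enough.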

A BRAM block (in a 7-series FPGA like these) is a 36k-bit dual port ram. The naive view would be that this is 4.5 K-bytes. This may or may not be true given how the block is configured, but we ignore that for now. In the smaller device I list (the "010") there are 60 of these, so we have 270 K-bytes of ram all told in the bram blocks. In the larger "020" there are 140 of them, so 630 K-bytes of dual ported ram.
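The arithmetic behind those totals, as a small Python sketch. It takes the naive view described above (every one of the 36k bits counts as data), and the names are my own.

```python
BRAM_BITS = 36 * 1024  # one block, naive view: all bits counted as data

def bram_kbytes(block_count):
    """Total BRAM capacity in K-bytes for a given number of 36k-bit blocks."""
    return block_count * BRAM_BITS / 8 / 1024

print(bram_kbytes(1))    # one block
print(bram_kbytes(60))   # the "010"
print(bram_kbytes(140))  # the "020"
```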

A DSP block (and we have a lot of these) has the following:

DSP blocks can be cascaded to generate wider paths.

The IO blocks should not be overlooked (notice that the "UG" for these is over twice as big as any of the others). These are the interface to the outside world. They are arranged in banks (HR and HP) with different properties. They can be configured to work at different voltage levels. They can drive differential signals in pairs. They may contain memory elements (latches). An important limitation of some of the fancier FPGA families is the speed at which they can drive signals. Some devices have gigabit transceivers, and some have 100 gigabit Ethernet MAC interfaces.


Feedback? Questions? Drop me a line!

Tom's Computer Info / tom@mmto.org