In this section we return to the Viterbi decoder from Chapter 11. After an initial synthesis run that shows how logic synthesis works with a real example, we step back and study some of the issues and problems of using HDLs for logic synthesis.
Some logic synthesizers can include I/O cells automatically, but the designer may have to use directives to designate special pads (clock buffers, for example). It may also be necessary to use commands to set I/O cell features such as selection of pull-up resistor, slew rate, and so on. Unfortunately there are no standards in this area. Worse, there is currently no accepted way to set these parameters from an HDL. Designers may also use either generic technology-independent I/O models or instantiate I/O cells directly from an I/O cell library. Thus, for example, in the Compass tools the statement
The next example illustrates the use of generic I/O cells from a standard-component library. These components are technology independent (so they may equally well be used with a 0.6 m m or 0.35 m m technology).
The module PadTri can be used for simulation and as the basis for synthesizing an I/O cell. However, the synthesizer also has to be told to synthesize an I/O cell connected to a bonding pad and the outside world and not just an internal three-state buffer. There is currently no standard mechanism for doing this, and every tool and every ASIC company handles it differently.
In Chapter 8 we used the halfgate example to demonstrate an FPGA design flow—including I/O. If the synthesis tool is not capable of synthesizing I/O cells, then we may have to instantiate them by hand; the following code is a hand-instantiated version of lines 19 – 22 in module allPads :
I/O cell models allow us to simulate the behavior of the synthesized logic inside an ASIC “all the way to the pads.” To simulate “outside the pads” at a system level, we should use these same I/O cell models. This is important in ASIC design. For example, the designers forgot to put pull-up resistors on the outputs of some of the SparcStation ASICs. This was one of the very few errors in a complex project, but an error that could have been caught if a system-level simulation had included complete I/O cell models for the ASICs.
Most simulators cannot synthesize this model because there are two wait statements in one always statement (line 6 ). We could change the code to use flip-flops from the synthesizer standard-component library by using the following code:
Unfortunately we would have to change all the flip-flop models from 'dff' to 'asDff' and the code would become dependent on a particular synthesis tool. Instead, to maintain independence from vendors, we shall use the following D flip-flop model for synthesis and simulation:
The following code models the top-level Viterbi decoder and instantiates (with instance name v_1 ) a copy of the Verilog module viterbi from Chapter 11. The model uses generic input, output, power, and clock I/O cells from the standard-component library supplied with the synthesis software. The synthesizer will take these generic I/O cells and map them to I/O cells from a technology-specific library. We do not need three-state I/O cells or bidirectional I/O cells for the Viterbi ASIC.
At this point we are ready to begin synthesis. In order to demonstrate how synthesis works, I am cheating here. The code that was presented in Chapter 11 has already been simulated and synthesized (requiring several iterations to produce error-free code). What I am doing is a little like the Galloping Gourmet’s television presentation: “And then we put the soufflé in the oven . . . and look at the soufflé that I prepared earlier.” The synthesis results for the Viterbi decoder are shown in Table 12.6 . Normally the worst thing we can do is prepare a large amount of code, put it in the synthesis oven, close the door, push the “synthesize and optimize” button, and wait. Unfortunately, it is easy to do. In our case it works (at least we may think so at this point) because this is a small ASIC by today’s standards—only a few thousand gates. I made the bus widths small and chose this example so that the code was of a reasonable size. Modern ASICs may be over one million gates, hundreds of times more complicated than our Viterbi decoder example.
The derived schematic for the synthesized core logic is shown in Figure 12.6 . There are eight boxes in Figure 12.6 that represent the eight modules in the Verilog code. The schematics for each of these eight blocks are too complex to be useful. With practice it is possible to “see” the synthesized logic from reports such as Table 12.6 . First we check the following cells at the top level:
FIGURE 12.6 The core logic of the Viterbi decoder ASIC. Bus names are abbreviated in this figure for clarity. For example the label m_out0-3 denotes the four buses: m_out0, m_out1, m_out2, and m_out3.
- pc5c01 is an I/O cell that drives the clock node into the logic core. ASIC designers also call an I/O cell a pad cell , and often refer to the pad cells (the bonding pads and associated logic) as just “the pads .” From the library data book we find this is a “core-driven, noninverting clock buffer capable of driving 125 pF.” This is a large logic cell and does not have a bonding pad, but is placed in a pad site (a slot in the ring of pads around the perimeter of the die) as if it were an I/O cell with a bonding pad.
- pc5d01r is a 5V CMOS input-only I/O cell with a bus repeater. Twenty-four of these I/O cells are used for the 24 inputs ( in0 to in7 ). Two more are used for Res and Clk . The I/O cell for Clk receives the clock signal from the bonding pad and drives the clock buffer cell ( pc5c01 ). The pc5c01 cell then buffers and drives the clock back into the core. The power-hungry clock buffer is placed in the pad ring near the VDD and VSS pads.
- pc5o06 is a CMOS output-only I/O cell with 6X drive strength (6 mA AC drive and 4 mA DC drive). There are four output pads: three pads for the signal outputs, outp[2:0 ], and one pad for the output signal, error .
- pv0f is a power pad that connects all VSS power buses on the chip.
- pvdf is a power pad that connects all VDD power buses on the chip.
- viterbi_p is the core logic. This cell takes its name from the top-level Verilog module ( viterbi ). The software has appended a "_p" suffix (the default) to prevent input files being accidentally overwritten.
The software does not tell us any of this directly. We learn what is going on by looking at the names and number of the synthesized cells, reading the synthesis tool documentation, and from experience. We shall learn more about I/O pads and the layout of power supply buses in Chapter 16.
Next we examine the cells used in the logic core. Most synthesis tools can produce reports, such as that shown in Table 12.7 , which lists all the synthesized cells. The most important types of cells to check are the sequential elements: flip-flops and latches (I have omitted all but the sequential logic cells in Table 12.7 ). One of the most common mistakes in synthesis is to accidentally leave variables unassigned in all situations in the HDL. Unassigned variables require memory and will generate unnecessary sequential logic. In the Viterbi decoder it is easy to identify the sequential logic cells that should be present in the synthesized logic because we used the module dff explicitly whenever we required a flip-flop. By scanning the code in Chapter 11 and counting the references to the dff model, we can see that the only flip-flops that should be inferred are the following:
- 24 (3 ¥ 8) D flip-flops in instance subset_decode
- 132 (11 ¥ 12) D flip-flops in instance path_memory that contains 11 instances of path (12 D flip-flops in each instance of path )
- 12 D flip-flops in instance pathin
- 20 (5 ¥ 4) D flip-flops in instance metric
The total is 24 + 132 + 12 + 20 = 188 D flip-flops, which is the same as the number of dfctnb cell instances in Table 12.7 .
Synthesizer output 3
Table 12.6 gives the total width of the standard cells in the logic core after logic optimization as 18,048 m m. Since the standard-cell height for this library is 72 l (21.6 m m), we can make a first estimate of the total logic cell area as
In the physical layout we shall need additional space for routing. The ratio of routing to logic cell area is called the routing factor . The routing factor depends primarily on whether we use two levels or three levels of metal. With two levels of metal the routing factor is typically between 1 and 2. With three levels of metal, where we may use over-the-cell routing, the routing factor is usually zero to 1. We thus expect a logic core area of 600–1000 mils 2 for the Viterbi decoder using this cell library.
From Table 12.6 we see the I/O cells in this library are 100.8 m m wide or approximately 4 mil (the width of a single pad site). From the I/O cell data book we find the I/O cell height is 650 m m (actually 648.825 m m) or approximately 26 mil. Each I/O cell thus occupies 104 mil 2 . Our 33 pad sites will thus require approximately 3400 mil 2 which is larger than the estimated core logic area.