
Figure 2.32
shows symbols for some other datapath elements. The combinational datapath
cells, NAND, NOR, and so on, and sequential datapath cells (flip-flops and
latches) have standard-cell equivalents and function identically. I use
a bold outline (1 point)
for datapath cells instead of the regular (0.5 point)
line I use for scalar symbols. We call a set of identical cells a **vector**
of datapath elements in the same way that a bold symbol, **A** , represents
a vector and A represents a scalar.

A **subtracter** is similar
to an adder, except in a **full subtracter** we have a borrow-in signal,
BIN; a borrow-out signal, BOUT; and a difference signal, DIFF:

DIFF = A ⊕ NOT(B) ⊕ NOT(BIN) = SUM(A, NOT(B), NOT(BIN)) (2.65)

NOT(BOUT) = A · NOT(B) + A · NOT(BIN) + NOT(B) · NOT(BIN) = MAJ(A, NOT(B), NOT(BIN)) (2.66)

These equations are the same as those for the FA (Eqs. 2.38 and 2.39) except that the B input is inverted and the sense of the carry chain is inverted. To build a subtracter that calculates (A − B) we invert the entire B input bus and connect the BIN[0] input to VDD (not to VSS as we did for CIN[0] in an adder). As an example, to subtract B = '0011' from A = '1001' we calculate '1001' + '1100' + '1' = '0110'. As with an adder, the true overflow is XOR(BOUT[MSB], BOUT[MSB − 1]).
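The worked example above can be checked with a minimal bit-level sketch. It assumes the subtracter is built from ordinary full-adder cells with B inverted and the first carry-chain input tied high, so the carry chain holds the complement of the borrow. The names `full_adder`, `ripple_subtract`, `to_bits`, and `to_text` are hypothetical helpers, not library cells.

```python
def full_adder(a, b, cin):
    """One-bit full adder: SUM is a 3-input XOR, COUT is the majority."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout


def ripple_subtract(a_bits, b_bits):
    """Compute A - B on LSB-first bit lists (two or more bits wide)."""
    carry = 1  # first carry-chain input tied to VDD
    diff, carries = [], []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, 1 - b, carry)  # 1 - b inverts the B bit
        diff.append(s)
        carries.append(carry)
    overflow = carries[-1] ^ carries[-2]  # XOR of the two most significant carries
    return diff, overflow


def to_bits(text):
    """'1001' -> [1, 0, 0, 1] with the LSB first."""
    return [int(c) for c in reversed(text)]


def to_text(bits):
    """Inverse of to_bits: LSB-first list back to an MSB-first string."""
    return "".join(str(b) for b in reversed(bits))


diff, overflow = ripple_subtract(to_bits("1001"), to_bits("0011"))
print(to_text(diff))  # 0110
```

Read as unsigned numbers this is 9 − 3 = 6. Read as two's complement numbers it is (−7) − 3, which does not fit in 4 bits, and the sketch reports `overflow` as 1 for this case.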

We can build a ripple-borrow
subtracter (a type of borrow-propagate subtracter), a borrow-save subtracter,
and a borrow-select subtracter in the same way we built these adder architectures.
An **adder/subtracter** has a control signal that gates the A input with
an exclusive-OR cell (forming a programmable inversion) to switch between
an adder and a subtracter. Some adder/subtracters gate both inputs to allow
us to compute (−A − B).
We must be careful to connect the input to the LSB of the carry chain (CIN[0]
or BIN[0]) when changing between addition (connect to VSS) and subtraction
(connect to VDD).
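As a word-level sketch of the programmable inversion, the hypothetical function `add_sub` below XORs one operand with the control signal and feeds the same control signal into the LSB of the carry chain. It is an illustration under stated assumptions, not a cell from any library. In particular, it gates the B operand so that the result is A − B; gating A instead would give B − A.

```python
def add_sub(a, b, sub, width=8):
    """Return (a + b) or (a - b) modulo 2**width, selected by sub (0 or 1)."""
    mask = (1 << width) - 1
    b_gated = (b ^ (mask if sub else 0)) & mask  # one XOR cell per bit of B
    total = (a & mask) + b_gated + sub           # carry-in: VSS to add, VDD to subtract
    return total & mask


print(format(add_sub(0b1001, 0b0011, 0, width=4), "04b"))  # 1100
print(format(add_sub(0b1001, 0b0011, 1, width=4), "04b"))  # 0110
```

Driving the carry-in from the same control signal is one way to make sure the LSB of the carry chain always matches the selected function.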

A **barrel shifter** rotates
or shifts an input bus by a specified amount. For example, if we have an
eight-input barrel shifter with input '1111 0000' and we specify a
shift of '0001 0000' (3, coded by bit position), the right-shifted 8-bit
output is '0001 1110'. A barrel shifter may rotate left or right (or
switch between the two under a separate control). A barrel shifter may also
have an output width that is smaller than the input. To use a simple example,
we may have an 8-bit input and a 4-bit output. This situation is equivalent
to having a barrel shifter with two 4-bit inputs and a 4-bit output. Barrel
shifters are used extensively in floating-point arithmetic to align (we
call this **normalize** and **denormalize** ) floating-point numbers
(with sign, exponent, and mantissa).
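A behavioral sketch of a right-shifting barrel shifter follows. For simplicity it takes the shift amount as a plain integer between 0 and width − 1, rather than the bit-position code used in the example above. The function name and the `rotate` flag are assumptions made for illustration.

```python
def barrel_shift_right(value, amount, width=8, rotate=False):
    """Shift (or rotate) a width-bit value right by amount bit positions."""
    if not 0 <= amount < width:
        raise ValueError("amount must be in the range 0 .. width - 1")
    mask = (1 << width) - 1
    value &= mask
    shifted = value >> amount
    if rotate:
        # Bits that fall off the right-hand end re-enter at the left.
        shifted |= (value << (width - amount)) & mask
    return shifted


print(format(barrel_shift_right(0b11110000, 3), "08b"))               # 00011110
print(format(barrel_shift_right(0b00000101, 1, rotate=True), "08b"))  # 10000010
```

A hardware barrel shifter produces the same result in a single pass through a multiplexer array; the arithmetic here only models the input-to-output mapping.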

A **leading-one detector**
is used with a normalizing (left-shift) barrel shifter to align mantissas
in floating-point numbers. The input is an
n-bit bus, A; the output is an n-bit bus, S,
with a single '1' in the bit position corresponding to the
most significant '1' in the input. Thus, for example, if the input is A = '0000 0101'
the leading-one detector output is S = '0000 0100',
indicating the leading one in A is in bit position 2 (bit 7 is the MSB,
bit zero is the LSB). If we feed the output, S, of the leading-one detector
to the shift select input of a normalizing (left-shift) barrel shifter,
the shifter will normalize the input A. In our example, with an input of
A = '0000 0101',
and a left-shift of S = '0000 0100',
the barrel shifter will shift A left by five bits and the output of the
shifter is Z = '1010 0000'.
Now that Z is aligned (with the MSB equal to '1') we can multiply Z with
another normalized number.
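The normalization example can be traced with a small sketch. The functions `leading_one_detect` and `normalize` are hypothetical names. The sketch assumes bit 0 is the LSB and that the shifter moves the marked bit into the MSB position.

```python
def leading_one_detect(a, width=8):
    """Return a word with a single '1' at the most significant '1' of a."""
    for pos in range(width - 1, -1, -1):
        if a & (1 << pos):
            return 1 << pos
    return 0  # no '1' found: the input is all zeros


def normalize(a, width=8):
    """Left-shift a so that its leading one lands in the MSB position."""
    s = leading_one_detect(a, width)
    if s == 0:
        return 0, 0
    pos = s.bit_length() - 1   # position marked by the detector
    shift = (width - 1) - pos  # distance from that position to the MSB
    return (a << shift) & ((1 << width) - 1), shift


a = 0b00000101
print(format(leading_one_detect(a), "08b"))  # 00000100
z, shift = normalize(a)
print(format(z, "08b"), shift)               # 10100000 5
```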

The output of a **priority
encoder** is the binary-encoded position of the leading one in an input.
For example, with an input A = '0000 0101'
the leading 1 is in bit position 2 (MSB is bit position 7) so the output
of a 4-bit priority encoder would be Z = '0010'
(2). In some cell libraries the encoding is reversed so that the MSB has
an output code of zero; in this case Z = '0101'
(5). This second, reversed, encoding scheme is useful in floating-point
arithmetic. If A is a mantissa and we normalize A to '1010 0000' we
have to subtract 5 from the exponent; this **exponent correction** is
equal to the output of the priority encoder.
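To illustrate why the reversed encoding is convenient, the sketch below uses a hypothetical `priority_encode` function that supports both encodings and assumes the input fits in `width` bits. With the reversed code, the encoder output is also the left-shift needed to normalize the mantissa.

```python
def priority_encode(a, width=8, reversed_code=False):
    """Return the coded position of the leading one, or None if a is zero."""
    if a == 0:
        return None
    pos = a.bit_length() - 1  # position counted from the LSB (bit 0)
    return (width - 1 - pos) if reversed_code else pos


a = 0b00000101
code = priority_encode(a, reversed_code=True)
normalized = (a << code) & 0xFF  # shift by the reversed code
print(code, format(normalized, "08b"))  # 5 10100000
```

The same value, 5, is then subtracted from the exponent, so one encoder output drives both the shifter and the exponent correction.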

An **accumulator** is an
adder/subtracter and a register. Sometimes these are combined with a multiplier
to form a **multiplier-accumulator** (**MAC**). An **incrementer**
adds 1 to the input bus, Z = A + 1,
so we can use this function, together with a register, to negate a two's
complement number, for example. The implementation is Z[i] = XOR(A[i], CIN[i])
and COUT[i] = AND(A[i], CIN[i]). The carry-in
control input, CIN[0], thus acts as an enable: If it is set to '0' the output
is the same as the input.
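The incrementer equations translate directly into a bit-level sketch. The function `incrementer` and the LSB-first list representation are assumptions made for illustration. The example negates a two's complement number by inverting it and then adding 1.

```python
def incrementer(a_bits, cin0=1):
    """Apply Z[i] = XOR(A[i], CIN[i]) and COUT[i] = AND(A[i], CIN[i]), LSB first."""
    carry = cin0
    z = []
    for a in a_bits:
        z.append(a ^ carry)
        carry = a & carry
    return z, carry


def to_bits(text):
    """'0011' -> [1, 1, 0, 0] with the LSB first."""
    return [int(c) for c in reversed(text)]


def to_text(bits):
    """Inverse of to_bits: LSB-first list back to an MSB-first string."""
    return "".join(str(b) for b in reversed(bits))


inverted = [1 - b for b in to_bits("0011")]  # invert +3
negated, _ = incrementer(inverted)           # then add 1
print(to_text(negated))                      # 1101, which is -3

unchanged, _ = incrementer(to_bits("0011"), cin0=0)
print(to_text(unchanged))                    # 0011: CIN[0] = 0 disables the increment
```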

The implementation of arithmetic cells is often a little more complicated than we have explained. CMOS logic is naturally inverting, so that it is faster to implement an incrementer as

Z[ i (even)] = XOR(A[ i ], CIN[ i ]) and COUT[ i (even)] = NAND(A[ i ], CIN[ i ]).

This inverts COUT, so that in the following stage we must invert it again. If we push an inverting bubble to the input CIN we find that:

Z[ i (odd)] = XNOR(A[ i ], CIN[ i ]) and COUT[ i (odd)] = NOR(NOT(A[ i ]), CIN[ i ]).

In many datapath implementations all odd-bit cells operate on inverted carry signals, and thus the odd-bit and even-bit datapath elements are different. In fact, all the adder and subtracter datapath elements we have described may use this technique. Normally this is completely hidden from the designer in the datapath assembly and any output control signals are inverted, if necessary, by inserting buffers.
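A short sketch can confirm that alternating even and odd cells gives the same result as the plain incrementer. It assumes bit 0 is an even cell that receives a true-polarity carry, and that every odd cell receives the inverted carry produced by the NAND in the even cell below it. The function names are hypothetical.

```python
def incrementer_alternating(a_bits, cin0=1):
    """Incrementer whose carry chain alternates polarity from bit to bit."""
    carry = cin0  # true polarity into bit 0
    z = []
    for i, a in enumerate(a_bits):  # LSB first
        if i % 2 == 0:
            z.append(a ^ carry)            # even cell: XOR
            carry = 1 - (a & carry)        # even cell: NAND, carry leaves inverted
        else:
            z.append(1 - (a ^ carry))      # odd cell: XNOR on the inverted carry
            carry = 1 - ((1 - a) | carry)  # odd cell: NOR(NOT(A), CIN), carry leaves true
    if len(a_bits) % 2 == 1:
        carry = 1 - carry  # an odd-length chain ends on an inverted carry
    return z, carry


def incrementer_reference(a_bits, cin0=1):
    """Plain XOR/AND incrementer used for comparison."""
    carry = cin0
    z = []
    for a in a_bits:
        z.append(a ^ carry)
        carry = a & carry
    return z, carry


bits = [1, 1, 0, 1, 0]  # LSB first
print(incrementer_alternating(bits) == incrementer_reference(bits))  # True
```

The final inversion for an odd-length chain stands in for the buffer that the datapath assembler would insert on the output control signal.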

A **decrementer** subtracts
1 from the input bus; the logical implementation is Z[i] = XOR(A[i], CIN[i])
and COUT[i] = AND(NOT(A[i]), CIN[i]). The implementation
may invert the odd carry signals, with CIN[0] again acting as an enable.

An **incrementer/decrementer**
has a second control input that gates the input, inverting the input to
the carry chain. This has the effect of selecting either the increment or
decrement function.
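The decrementer and the combined incrementer/decrementer differ from the incrementer only in what feeds the carry chain. The sketch below models that with a hypothetical control input `dec` that XORs each A bit before it enters the carry AND gate. The name and this particular gating are assumptions made for illustration.

```python
def inc_dec(a_bits, dec, cin0=1):
    """Increment (dec=0) or decrement (dec=1) an LSB-first bit list."""
    carry = cin0
    z = []
    for a in a_bits:
        z.append(a ^ carry)          # Z[i] = XOR(A[i], CIN[i]) in both modes
        carry = (a ^ dec) & carry    # dec=1 gives AND(NOT(A[i]), CIN[i])
    return z, carry


def to_int(bits):
    """Convert an LSB-first bit list to an integer."""
    return sum(b << i for i, b in enumerate(bits))


a_bits = [0, 0, 1, 0]  # the value 4, LSB first
print(to_int(inc_dec(a_bits, dec=0)[0]))  # 5
print(to_int(inc_dec(a_bits, dec=1)[0]))  # 3
```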

When using the **all-zeros detectors**
and **all-ones detectors**, remember that, for a 4-bit number, for example,
zero in ones' complement arithmetic is '1111' or '0000', and that zero in
signed magnitude arithmetic is '1000' or '0000'.
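As an illustration of how the two detectors combine into a zero test in each number system, here is a minimal sketch. The function names are hypothetical, and the signed magnitude test assumes the sign is held in the MSB.

```python
def all_zeros(x, width):
    """True if the low width bits of x are all '0'."""
    return (x & ((1 << width) - 1)) == 0


def all_ones(x, width):
    """True if the low width bits of x are all '1'."""
    mask = (1 << width) - 1
    return (x & mask) == mask


def is_zero_ones_complement(x, width=4):
    """Ones' complement has two zeros: all '0's and all '1's."""
    return all_zeros(x, width) or all_ones(x, width)


def is_zero_signed_magnitude(x, width=4):
    """Signed magnitude is zero when the magnitude bits (all but the MSB) are zero."""
    return all_zeros(x, width - 1)


print(is_zero_ones_complement(0b1111), is_zero_ones_complement(0b0000))    # True True
print(is_zero_signed_magnitude(0b1000), is_zero_signed_magnitude(0b0000))  # True True
print(is_zero_signed_magnitude(0b1111))                                    # False
```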

A **register file** (or
scratchpad memory) is a bank of flip-flops arranged across the bus; sometimes
these have the option of multiple ports (multiport register files) for read
and write. Normally these register files are the densest logic and hardest
to fit in a datapath. For large register files it may be more appropriate
to use a multiport memory. We can add control logic to a register file to
create a **first-in first-out register** (**FIFO**), or **last-in
first-out register** (**LIFO**).
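One way to picture the added control logic is a pair of pointers and a word counter wrapped around the register bank. The class below is a behavioral sketch under that assumption. The class name, its methods, and the full and empty handling are hypothetical choices, not a description of any particular datapath library.

```python
class RegisterFileFifo:
    """A register bank plus read pointer, write pointer, and word count."""

    def __init__(self, depth):
        self.regs = [0] * depth  # the register file itself
        self.depth = depth
        self.rd = 0              # next register to read
        self.wr = 0              # next register to write
        self.count = 0           # number of words currently stored

    def push(self, word):
        if self.count == self.depth:
            raise OverflowError("FIFO full")
        self.regs[self.wr] = word
        self.wr = (self.wr + 1) % self.depth  # pointer wraps around the bank
        self.count += 1

    def pop(self):
        if self.count == 0:
            raise IndexError("FIFO empty")
        word = self.regs[self.rd]
        self.rd = (self.rd + 1) % self.depth
        self.count -= 1
        return word


fifo = RegisterFileFifo(depth=4)
for word in (0xA, 0xB, 0xC):
    fifo.push(word)
print(fifo.pop(), fifo.pop())  # 10 11
```

A LIFO could use the same register bank with a single stack pointer in place of the two pointers.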

In Section 2.5 we saw
that the standard-cell version and gate-array macro version of the sequential
cells (latches and flip-flops) each contain their own clock buffers. The
reason for this is that (without intelligent placement software) we do not
know where a standard cell or a gate-array macro will be placed on a chip.
We also have no idea of the condition of the clock signal coming into a
sequential cell. The ability to place the clock buffers outside the sequential
cells in a datapath gives us more flexibility and saves space. For example,
we can place the clock buffers for all the clocked elements at the top of
the datapath (together with the buffers for the control signals) and **river
route** (in river routing the interconnect lines all flow in the same
direction on the same layer) the connections to the clock lines. This saves
space and allows us to guarantee the clock skew and timing. It may mean,
however, that there is a fixed overhead associated with a datapath. For
example, it might make no sense to build a 4-bit datapath if the clock and
control buffers take up twice the space of the datapath logic. Some tools
allow us to design logic using a **portable netlist** . After we complete
the design we can decide whether to implement the portable netlist in a
datapath, standard cells, or even a gate array, based on area, speed, or
power considerations.