# Performance and Power Consumption Analysis of Full Adders Designed in 32nm Technology

Fábio G. R. G. da Silva, Paulo F. Butzen, Cristina Meinhardt

Center for Computational Science – C3

Universidade Federal do Rio Grande – FURG, Rio Grande, Brazil

fabiorgs2010@gmail.com, {paulobutzen, cristinameinhardt}@furg.br

*Abstract*— This paper evaluates several full adder architectures. The main goal is to compare their electrical characteristics and identify which architectures are most appropriate for meeting timing and power design constraints in 32 nm circuit designs. The investigated full adders comprise 6 traditional architectures and 24 circuits built by combining 3 smaller blocks. A 32 nm predictive technology [7] is used to obtain the timing and power results.

Keywords— Full Adders, Circuit Design, Nano-Technology.

### I. INTRODUCTION

Integrated circuits are increasingly present in our lives, being used in smartphones, notebooks, sophisticated automotive electrical systems, and other products. This wide dissemination has been made possible by technology scaling. Nanometer technologies introduce new challenges for circuit designers, such as variability, aging, and static power. However, power and timing requirements remain the main design constraints in each of these applications. From this perspective, it is necessary to review the traditional design solutions and verify circuit behaviour in new technologies.

In computer systems, the arithmetic and logic unit (ALU) is responsible for making logical decisions and performing arithmetic operations on data and addresses. The main operation executed by the ALU is addition. This operation is highly important for two reasons: it is executed very frequently, and it supports other operations such as subtraction, multiplication and division.

This paper presents an analysis of a large number of full adder architectures designed in nanometer technologies. The evaluated full adders include the classical architectures (CMOS, CPL, Hybrid, TFA, TGA and 14T) and 24 solutions built from a 3-block composing strategy [1-6]. These full adder architectures are characterized in a predictive 32 nm technology [7]. The timing and power characteristics are extracted and compared.

The rest of the paper is organized as follows. Section II presents the classical full adder architectures evaluated in this work, and Section III details the 3-block composing strategy. The methodology presented in Section IV explores circuit design and simulation characteristics. The obtained results and analysis are discussed in Section V. Finally, Section VI presents the final considerations.

### II. FULL ADDER ARCHITECTURES

The full adder is a three-input, two-output block. The inputs A and B are the two bits to be added. The "carry in" bit ( $C_{in}$ ) is the third input and is derived from the calculation of the previous bits. The two outputs are the sum (S) and the "carry out" bit ( $C_{out}$ ). These outputs are defined in Equations (1) and (2).

$$Sum = (A \oplus B) \oplus Cin \tag{1}$$

$$Cout = (A * B) + [Cin * (A \oplus B)] \tag{2}$$
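As a quick sanity check, Equations (1) and (2) can be compared against ordinary integer addition over all eight input combinations. The following is a minimal Python sketch of that check; the function name `full_adder` is a hypothetical name chosen for illustration and does not come from the evaluated circuits.

```python
from itertools import product


def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Return (sum, cout) for one-bit inputs, following Equations (1) and (2)."""
    s = (a ^ b) ^ cin                  # Equation (1)
    cout = (a & b) | (cin & (a ^ b))   # Equation (2)
    return s, cout


# Exhaustive check: the pair (cout, sum) must encode a + b + cin in binary.
for a, b, cin in product((0, 1), repeat=3):
    s, cout = full_adder(a, b, cin)
    assert 2 * cout + s == a + b + cin
```

The loop covers the complete truth table, so the behavioural model agrees with binary addition for every input.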

A large number of full adder versions can be found in the literature. In this work, standard full adder implementations are used as the basis for comparison with the 3-block composing strategy. The classical full adders are CMOS, CPL, Hybrid, TFA, TGA and 14T. Figure 1 shows the transistor arrangement of each of these classical circuits. The main characteristics of each traditional adder cell are described below.

#### A. CMOS

This architecture is the standard adder. It is based on the CMOS logic family, which has complementary pull-up and pull-down transistor networks. Its main characteristic is high drive capability [3].

#### B. CPL (Complementary Pass-Transistor Logic)

This adder is another well-known architecture. It exploits the concept of pass transistors. It also provides a strong output signal and good driving capability due to the output inverters [5].

#### C. Hybrid

This adder architecture is known as a mixture of the two previously described adders. It was proposed to optimize performance and reduce power consumption, mainly at low operating voltages [6].

#### D. TGA (Transmission Gate Full Adder)

This adder architecture exploits the transmission gate structure, which consists of an NMOS and a PMOS transistor in parallel. The transmission gate can be considered a particular type of pass transistor. Compared with the CPL solution, the transmission gate does not cause severe signal degradation, so output inverters are not needed to restore the signal [4].

#### E. TFA (Transmission Function Full Adder) or AB1

This adder architecture is based on transmission function theory. It uses pull-up and pull-down networks to achieve good drive capability, and uses transmission gates for the rest of the logic. Efficient implementations of NOR and XOR gates are explored in this solution [4]. This adder is a particular case of the 3-block composing strategy: it is equivalent to the circuit AB1, which is presented in the following sections.

#### F. 14T

This adder architecture was proposed as a low-power solution. It is based on a low-power XOR combined with transmission gates [6].



Fig. 1 Classical adder architectures

### III. 3-BLOCK COMPOSING STRATEGY

The 3-block composing strategy applies Boolean manipulation to Equations (1) and (2). The basic idea is to create the intermediate signals H and H' and use them to generate the S and $C_{out}$ signals. The signal H is the XOR of A and B, as presented in Equation (3).

$$H = A \oplus B \tag{3}$$

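Applying this manipulation to Equations (1) and (2), the signals S and $C_{out}$ can be rewritten as presented in Equations (4) and (5), respectively. Equation (5) holds because H' = 1 implies A = B, in which case the carry out equals A, while H = 1 implies A ≠ B, in which case the carry out equals the carry in.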

$$Sum = Cin \oplus H = (H * Cin') + (H' * Cin) \tag{4}$$

$$Cout = (A * H') + (Cin * H) \tag{5}$$
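The equivalence between the rewritten form and the original definition can be checked exhaustively. The sketch below is a behavioural Python model only, not a description of any transistor-level structure; the name `full_adder_3block` is hypothetical.

```python
from itertools import product


def full_adder_3block(a: int, b: int, cin: int) -> tuple[int, int]:
    """Return (sum, cout) computed through the intermediate signal H."""
    h = a ^ b                           # Equation (3)
    h_n = 1 - h                         # complement H'
    cin_n = 1 - cin                     # complement Cin'
    s = (h & cin_n) | (h_n & cin)       # Equation (4)
    cout = (a & h_n) | (cin & h)        # Equation (5)
    return s, cout


# Equations (3)-(5) must reproduce binary addition for every input.
for a, b, cin in product((0, 1), repeat=3):
    s, cout = full_adder_3block(a, b, cin)
    assert 2 * cout + s == a + b + cin
```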

From the previous analysis, the full adder cell can be designed as a composition of three blocks. Figure 2 illustrates the block diagram of full adders designed with this methodology. The signal H and its complement H' are the key variables in both output equations, so optimizing the generation of H and H' could greatly enhance the performance of the full adder cell.



Fig. 2 Block diagram to generate full adder circuits

Since each block can be implemented in different ways, evaluating the different combinations helps to find the best solution for a specific purpose. The structures used for each block in this work are discussed below.

- Block 1 contains three different structures, all of them implementing the XOR function that computes the H value expressed in Equation (3). Figure 3 presents the three structures explored for Block 1.
- Block 2 explores four different structures, all of them also implementing an XOR function, which provides the S value from Equation (4). Figure 4 shows these structures.
- Block 3 contains two circuits that calculate the $C_{out}$ signal expressed in Equation (5). Figure 5 shows the circuits implemented for Block 3.

This work explores all possible combinations of the structures for these three blocks, resulting in twenty-four different full adder circuits (3 × 4 × 2 = 24). These circuits are named 'AA1' to 'CD2', where the first letter indicates the circuit option for Block 1, the second letter the circuit adopted for Block 2, and the final number the circuit used to calculate the $C_{out}$ value.
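The naming scheme can be reproduced by enumerating the Cartesian product of the block options. This is a minimal sketch that writes the names in upper case, as they appear in Table I; the constants `BLOCK1`, `BLOCK2` and `BLOCK3` are illustrative names, not identifiers from the original netlists.

```python
from itertools import product

BLOCK1 = "ABC"    # three XOR structures that generate H
BLOCK2 = "ABCD"   # four XOR structures that generate S
BLOCK3 = "12"     # two circuits that generate Cout

# One name per combination: Block 1 letter, Block 2 letter, Block 3 number.
names = [b1 + b2 + b3 for b1, b2, b3 in product(BLOCK1, BLOCK2, BLOCK3)]

print(len(names))            # 24
print(names[0], names[-1])   # AA1 CD2
```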



Fig. 4 Structures of block 2




### IV. METHODOLOGY

This work evaluates the 24 adder cells obtained with the 3-block composing strategy described in the previous section and compares the results with 6 classical adder architectures.

All circuits were described in SPICE and electrically simulated with NGSPICE [8]. A 32 nm predictive technology is used to obtain the timing and power results [7]. The technology model adopted is the one for high-performance (HP) applications.

The experiment consists of two steps. The first is logical validation. The second consists of extracting delay and power consumption results for all described adder cells. Area is reported as the number of transistors; further studies will consider the layout to estimate area.

To evaluate power consumption, this work considers the power drawn from the supply network during the simulation time. The power is calculated by Equation (6), which uses the integral of the supply current measured by the electrical simulator.

$$P = \frac{1}{t} \int_{0}^{t} I_{Vdd}(\tau) \, V_{dd} \, d\tau \tag{6}$$
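Equation (6) can be approximated numerically from sampled simulator output. The sketch below uses the trapezoidal rule on a list of time and supply-current samples. The function name `average_power` and the sample values are hypothetical. The sketch also assumes the current is reported as positive when drawn from the supply, which may differ from a given simulator's sign convention.

```python
def average_power(times, currents, vdd):
    """Average supply power over the sampled interval (trapezoidal rule).

    times    -- sample instants in seconds, strictly increasing
    currents -- supply current samples in amperes (assumed positive when drawn)
    vdd      -- constant supply voltage in volts
    """
    if len(times) != len(currents) or len(times) < 2:
        raise ValueError("need at least two matching samples")
    energy = 0.0
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        energy += vdd * 0.5 * (currents[k] + currents[k - 1]) * dt
    return energy / (times[-1] - times[0])


# Hypothetical waveform: a constant 1 uA draw at 0.9 V gives about 0.9 uW.
p = average_power([0.0, 1e-9, 2e-9], [1e-6, 1e-6, 1e-6], 0.9)
```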

Delay is measured for all high and low propagation times of the two outputs of the adder cells, C<sub>out</sub> and S. Minimum, average and maximum propagation times are computed and used to compare the performance results.
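The text does not state the measurement threshold, so the following sketch assumes the common convention of measuring propagation delay between the 50% crossings of the input and output waveforms. The helper names `crossing_time` and `summarize` and the sample delays are hypothetical and only illustrate how the minimum, average and maximum values could be derived.

```python
from statistics import mean


def crossing_time(times, volts, level):
    """First instant at which the sampled waveform crosses `level`.

    Uses linear interpolation between neighbouring samples.
    """
    for k in range(1, len(times)):
        v0, v1 = volts[k - 1], volts[k]
        if v0 != v1 and (v0 - level) * (v1 - level) <= 0:
            frac = (level - v0) / (v1 - v0)
            return times[k - 1] + frac * (times[k] - times[k - 1])
    raise ValueError("waveform never crosses the level")


def summarize(delays):
    """Return (minimum, average, maximum) of a list of propagation delays."""
    return min(delays), mean(delays), max(delays)


# Hypothetical rising edge sampled at three points, with Vdd = 0.9 V.
t_half = crossing_time([0.0, 1.0, 2.0], [0.0, 0.9, 0.9], 0.45)
stats = summarize([5.0, 20.0, 65.0])
```

A propagation delay would then be the output crossing time minus the input crossing time for each input transition that toggles the output.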

Figure 6 outlines how the simulations were performed for all full adders. The supply voltage is 0.9 V for the adopted 32 nm technology. All circuits receive input signals with an in\_slope of 0.01 ns, and 1 fF output capacitances are used as loads for the S and $C_{out}$ outputs.

Considering the challenges in nanometer technologies, regularity and variability are addressed by adopting the same transistor width for all transistors in the circuits. In this project, PMOS and NMOS transistors both have a width of 100 nm. Due to this sizing constraint, two adder circuits, BC1 and BC2, present electrical problems in the logical validation stage. These two circuits need appropriate transistor sizing to work properly. For this reason, the BC1 and BC2 adder cells are omitted from the following analyses.



### V. RESULTS

The simulation results for each full adder are presented in Table I. The first five rows show the classical adder solutions ordered by Power-Delay Product (PDP); the sixth classical adder, TFA, appears in the AB1 / TFA row. The PDP is computed from the average delay (*Davg*) and the power consumption, as shown in Equation (7). Next, all 3-block composing versions are listed in the order of their composing blocks. The full adders with the smallest area are the 14T and BB1, both with 14 transistors. The largest one is the CPL, with 32 transistors.

$$PDP = Davg * P \tag{7}$$
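With the delay in picoseconds and the power in microwatts, the product in Equation (7) comes out directly in attojoules, since 1 ps × 1 µW = 10⁻¹⁸ J. The short sketch below applies this to two rows of Table I; the function name `pdp_aj` is a hypothetical name.

```python
def pdp_aj(d_avg_ps: float, power_uw: float) -> float:
    """Power-delay product in attojoules (1 ps * 1 uW = 1e-18 J = 1 aJ)."""
    return d_avg_ps * power_uw


# Values taken from Table I: (average delay in ps, power in uW).
pdp_14t = pdp_aj(20.7, 0.05)    # about 1.035 aJ
pdp_cmos = pdp_aj(39.5, 0.58)   # about 22.91 aJ
```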

In terms of timing performance, the fastest cells in minimum delay are CA1, BA1, AA1, CC1, and AC1. They are around four times faster than the traditional CMOS solution. In terms of average delay, the adders CB1 and CD1 present the best results, being around 2.5 times faster than the CMOS adder. These results indicate that the third solution for Block 1 and the first solution for Block 3 are excellent options for building fast circuits.
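The speed-up figures quoted above can be reproduced from the delay columns of Table I. The sketch below is only an arithmetic check; the dictionary and function names are illustrative.

```python
# Delays in ps, copied from Table I.
MIN_DELAY_PS = {"CMOS": 19.90, "CA1": 4.81, "BA1": 4.83,
                "AA1": 4.85, "CC1": 4.88, "AC1": 4.93}
AVG_DELAY_PS = {"CMOS": 39.5, "CB1": 15.1, "CD1": 15.4}


def speedup(table, cell, reference="CMOS"):
    """How many times faster `cell` is than `reference` (delay ratio)."""
    return table[reference] / table[cell]


min_speedups = {c: speedup(MIN_DELAY_PS, c) for c in MIN_DELAY_PS if c != "CMOS"}
avg_speedups = {c: speedup(AVG_DELAY_PS, c) for c in AVG_DELAY_PS if c != "CMOS"}

for cell, ratio in {**min_speedups, **avg_speedups}.items():
    print(f"{cell}: {ratio:.2f}x")
```

The minimum-delay ratios all fall slightly above 4, and the average-delay ratios fall near 2.6, in line with the approximate figures in the text.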

The lowest power consumption is achieved by the CD1, CB1, 14T, AD1, and AB1/TFA adder cells. As expected, the classical 14T solution, designed to be low power, is in this list. Other solutions from the composing strategy achieve similar results. The PDP follows the power results.
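The claim that the PDP follows the power results can be checked by ranking the rows of Table I on both metrics. The following sketch copies the average delay and power columns from the table and compares the five best cells under each ordering; the helper name `lowest` is hypothetical.

```python
# (cell, average delay in ps, power in uW), copied from Table I.
ROWS = [
    ("14T", 20.7, 0.05), ("TGA", 17.3, 0.29), ("Hybrid", 24.5, 0.44),
    ("CMOS", 39.5, 0.58), ("CPL", 35.3, 1.28), ("AA1", 20.3, 0.32),
    ("AB1/TFA", 16.9, 0.06), ("AC1", 20.6, 0.29), ("AD1", 16.7, 0.05),
    ("BA1", 44.1, 1.35), ("BB1", 25.5, 0.41), ("BD1", 27.3, 0.41),
    ("CA1", 33.4, 0.80), ("CB1", 15.1, 0.04), ("CC1", 37.6, 0.83),
    ("CD1", 15.4, 0.02), ("AA2", 23.7, 0.53), ("AB2", 19.6, 0.28),
    ("AC2", 23.4, 0.50), ("AD2", 19.7, 0.27), ("BA2", 40.8, 1.64),
    ("BB2", 25.1, 0.63), ("BD2", 25.1, 0.63), ("CA2", 37.4, 1.01),
    ("CB2", 18.2, 0.25), ("CC2", 41.5, 1.03), ("CD2", 18.9, 0.24),
]


def lowest(rows, key, n=5):
    """Names of the n rows with the smallest value of `key`."""
    return [name for name, _, _ in sorted(rows, key=key)[:n]]


by_power = lowest(ROWS, key=lambda r: r[2])
by_pdp = lowest(ROWS, key=lambda r: r[1] * r[2])

# The same five cells lead both rankings, although their order differs.
print(sorted(by_power) == sorted(by_pdp))
```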

### VI. CONCLUSIONS

A comparison between a large number of full adder cells has been performed in this work. Classical adder cells have been compared with 3-block composing strategy solutions designed in a nanometer technology. Considering the need for regularity in nanometer devices, all transistors have the same size. As expected, the results show that the traditional approaches, mainly the well-known CMOS and CPL, present worse performance than the composing solutions. To provide a more complete analysis, different transistor sizes should be explored to verify whether this behaviour holds.

### REFERENCES

- [1] V. Moalemi and A. Afzali-Kusha, *Subthreshold 1-Bit Full Adder Cells in sub-100nm Technologies*, IEEE Computer Society Annual Symposium on VLSI (ISVLSI'07) (2007).
- [2] A. Silva, C. Meinhardt and P.F. Butzen, *Full Adders Architectures Evaluation for 32nm Technology*, XXVII SIM South Symposium on Microelectronics.
- [3] A. Shams, T. Darwish and M. Bayoumi, *Performance analysis of low Power 1-bit CMOS full adder cells*, IEEE Trans. on VLSI (pp. 20-29) (2002).
- [4] V. Foroutan, K. Navi and M. Haghparast, *A New Low Power Dynamic Full Adder Cell Based on Majority Function*, World Applied Sciences Journal 4 (1) (pp. 133-141) (2008).
- [5] M. Alioto and G. Palumbo, *Analysis and Comparison on full adder block in submicron technology*, IEEE Trans. on VLSI (pp. 806-823) (2002).
- [6] C. H. Chang, J. Gu and M. Zhang, *A review of 0.18um full adder performances for tree structured arithmetic circuits*, IEEE Trans. on VLSI (pp. 686-695) (2005).
- [7] W. Zhao and Y. Cao, *New generation of Predictive Technology Model for sub-45nm early design exploration*, IEEE Trans. (pp. 2816-2823) (2006).
- [8] NGSpice. Available at: http://ngspice.sourceforge.net/

| Adder Cell | Number of Transistors | Min Delay (ps) | Avg Delay (ps) | Max Delay (ps) | Power Consumption (µW) | PDP (aJ) |
|------------|-----------------------|----------------|----------------|----------------|------------------------|----------|
| 14T       | 14          | 5.29      | 20.7 | 67.2 | 0.05        | 1.035    |
| TGA       | 20          | 5.10      | 17.3 | 39.5 | 0.29        | 5.017    |
| Hybrid    | 26          | 7.87      | 24.5 | 43.2 | 0.44        | 10.78    |
| CMOS      | 28          | 19.90     | 39.5 | 65.5 | 0.58        | 22.91    |
| CPL       | 32          | 16.80     | 35.3 | 58.4 | 1.28        | 45.184   |
| AA1       | 18          | 4.85      | 20.3 | 38.3 | 0.32        | 6.496    |
| AB1 / TFA | 16          | 5.27      | 16.9 | 28.0 | 0.06        | 1.014    |
| AC1       | 18          | 4.93      | 20.6 | 41.2 | 0.29        | 5.974    |
| AD1       | 17          | 5.06      | 16.7 | 28.1 | 0.05        | 0.835    |
| BA1       | 16          | 4.83      | 44.1 | 169  | 1.35        | 59.5     |
| BB1       | 14          | 5.34      | 25.5 | 111  | 0.41        | 10.455   |
| BD1       | 15          | 5.06      | 27.3 | 109  | 0.41        | 11.19    |
| CA1       | 18          | 4.81      | 33.4 | 147  | 0.80        | 26.72    |
| CB1       | 16          | 5.42      | 15.1 | 30.1 | 0.04        | 0.604    |
| CC1       | 18          | 4.88      | 37.6 | 186  | 0.83        | 31.2     |
| CD1       | 17          | 5.05      | 15.4 | 27.3 | 0.02        | 0.308    |
| AA2       | 24          | 11.8      | 23.7 | 42.5 | 0.53        | 12.56    |
| AB2       | 22          | 5.32      | 19.6 | 42.5 | 0.28        | 5.488    |
| AC2       | 24          | 9.88      | 23.4 | 42.5 | 0.50        | 11.7     |
| AD2       | 23          | 5.67      | 19.7 | 42.5 | 0.27        | 5.31     |
| BA2       | 22          | 11.8      | 40.8 | 158  | 1.64        | 66.9     |
| BB2       | 20          | 5.39      | 25.1 | 102  | 0.63        | 15.8     |
| BD2       | 21          | 5.77      | 25.1 | 98.4 | 0.63        | 15.8     |
| CA2       | 24          | 11.8      | 37.4 | 146  | 1.01        | 37.77    |
| CB2       | 22          | 5.96      | 18.2 | 42.5 | 0.25        | 4.55     |
| CC2       | 24          | 9.86      | 41.5 | 186  | 1.03        | 42.7     |
| CD2       | 23          | 6.40      | 18.9 | 42.5 | 0.24        | 4.53     |

TABLE I - SIMULATION RESULTS