STRUCTURAL AND POWER ANALYSIS OF RIPPLE CARRY ADDER IN QCA

S. Senthilnathan and S. Kumaravel
SENSE, Vellore Institute of Technology, Vellore, India
E-Mail: senthilsrinivasan18@gmail.com

ABSTRACT

Adders and multipliers are used frequently in the design of computing subsystems, including arithmetic and logical units (ALUs). Quantum-dot cellular automata (QCA), a promising and emerging nanotechnology, can be used to realize such subsystems with high performance, ultra-high density and low power. For reliable realization of QCA-based designs, structural and power analysis is essential. Most of the QCA-based ripple carry adders (RCAs) reported in the literature do not consider structural and power consumption issues. In this paper, an existing design of a QCA-based ripple carry adder is studied extensively through structural and power analysis. The QCA-based RCA is implemented and its functional output is verified using the QCA Designer tool. The power dissipation is estimated using the QCA Pro simulator, which is an accurate power estimation tool.

Keywords: quantum-dot cellular automata, arithmetic and logical units, ripple carry adder.

1. INTRODUCTION

Physical scalability limits, leakage power consumption and short-channel effects are serious challenges facing conventional CMOS technology [1]. These deficiencies have prompted extensive research on nano-scale technologies. Quantum-dot cellular automata (QCA) is a promising and emerging nanotechnology. The fundamental element in QCA is a square four-dot cell that contains two free identical charges. Due to the Coulombic interaction, these electrons occupy diagonally opposite dots. Unlike CMOS technology, QCA encodes binary information in the relative configuration of the charges instead of in current. Hence, QCA circuits aim toward high-performance, ultra-dense and low-power designs [2, 3]. Series of QCA cells are used to build QCA gates and circuits [3-5]. For a QCA signal to propagate properly, energy restoration is required. This demand is fulfilled by applying four distinct clocking phases. Moreover, the clock signal synchronizes the QCA cells [6, 7]. The locations of the cells fixed during the implementation phase determine the accuracy of QCA operations: for a QCA circuit to perform properly, its cells have to be aligned and positioned correctly. Since there is no electrical current in QCA computations, the power dissipation is considerably lower than in conventional CMOS circuits. Nevertheless, it is necessary to characterize all aspects of a new technology. In recent years, many investigations have been launched to design various digital circuits based on this technology, and several studies have been performed in the area of QCA power [8-12]. One of the most accurate power dissipation models was proposed by Timler and Lent [8], and an upper bound on power dissipation for QCA circuits was estimated by Srivastava et al. [10]. There are three main obstacles to exploiting the full potential of QCA circuits, as thoroughly discussed in [13].
The first and most important problem is the realization of QCA circuits capable of operating at room temperature; nanomagnet-based QCA circuits can be realized at this temperature, but with larger dimensions [14]. The second issue is the means by which the input and output cells can be fixed and measured. The third issue is circuit tolerance to possible fabrication faults. These obstacles motivate further studies.

The main aim of this work is to analyze the structure and power dissipation of an efficient QCA-based ripple carry adder circuit. In the first step, the functional outputs of the existing RCA are verified. In the next step, a detailed structural and power dissipation analysis is performed.

The rest of this paper is organized as follows: a review of QCA elementary concepts, structures and power dissipation models is presented in Section 2. A design for the ripple carry adder and its simulation are addressed in Section 3. In Section 4, the structural and power analysis of the ripple carry adder is performed and simulated. Finally, Section 5 concludes the paper.

2. PRELIMINARIES

2.1. QCA overview

QCA circuits are made up of identical square-shaped QCA cells. Each four-dot QCA cell is constructed of quantum dots positioned at the vertices of a square. The dots contain two electrons which can quantum-mechanically tunnel between them. Equation (1) gives the polarization of a QCA cell.

\[ P = \frac{(P_1 + P_3) - (P_2 + P_4)}{(P_1 + P_2 + P_3 + P_4)} \]  

(1)

Here \( P_i \) is the charge of the \( i^{th} \) quantum dot (\( P_i = 1 \) if an electron is present, otherwise 0). The two stable polarizations, \( P = +1 \) and \( P = -1 \), represent binary ‘1’ and binary ‘0’, respectively, as shown in Figure-1 [3].
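As a minimal sketch of Equation (1), the polarization can be evaluated directly from the four dot occupancies. The function name `polarization` and the dot ordering (dots 1 and 3 on one diagonal, dots 2 and 4 on the other) are assumptions made for illustration only.

```python
def polarization(p1, p2, p3, p4):
    """Cell polarization from the four dot occupancies, following Equation (1).

    Dots 1 and 3 are assumed to lie on one diagonal, dots 2 and 4 on the other.
    """
    total = p1 + p2 + p3 + p4
    if total == 0:
        raise ValueError("cell holds no charge; polarization is undefined")
    return ((p1 + p3) - (p2 + p4)) / total


if __name__ == "__main__":
    print(polarization(1, 0, 1, 0))  # electrons on dots 1 and 3
    print(polarization(0, 1, 0, 1))  # electrons on dots 2 and 4
```

With both electrons on one diagonal the function returns +1 or -1, and with the charge spread evenly over the four dots it returns 0, which corresponds to an unpolarized cell.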

The 90-degree wire, a row of cascaded QCA cells which propagates the signal from input to output, is shown in Figure-2(a). Figure-2(b) depicts the 45-degree QCA wire, in which the encoded binary signal polarization alternates in consecutive cells [5]. This diagonal wire is used for the coplanar wire crossing shown in Figure-3(a). In this technique, although all the cells are located in the same plane, the design is sensitive to cell misalignment defects. Moreover, the distance between the horizontal cells (90-degree wire and 45-degree wire) is limited in order to prevent transmission errors, which may occur because the kink energy at the crossing point might be less than the nominal level. Besides the coplanar wire crossing, another crossing method uses multilayer crossovers. The multilayer method has three layers, named the lower layer (main layer), the middle layer (interconnect layer) and the upper layer (bridge layer); the QCA signal passes through the upper layer (bridge layer), as depicted in Figure-3(b).

2.2. Elementary structures

QCA gates and circuits are constructed from two fundamental gates, the Inverter Gate and the Majority Gate (MG) [3, 4]. Some novel architectures for the QCA cell paradigm and some designs for a 5-input majority gate have been suggested [15], and several QCA designs and circuits utilizing them have been offered which provide significant improvements in complexity (number of cells needed) as well as delay. Because these gates play an important role in designing new QCA gates, several efficient QCA designs have been offered for them [16]. Figure-4 shows two previously proposed QCA designs for an Inverter Gate. The Inverter Gate shown in Figure-4(a) has more cells than the one in Figure-4(b); however, it is more widely employed. The inverter in Figure-4(a) splits the input into two paths and merges them through a 45-degree arrangement, which produces the opposite polarization. The 3-input QCA majority gates which act as an OR gate and an AND gate are depicted in Figure-5(b) and (c), respectively. Due to Coulombic repulsion between the electrons of the input cells, the central cells of both MGs force the output to the stable polarization (majority polarization). The majority gate functions according to the following equation:

$$M(A, B, C) = AB + BC + AC$$

(2)

Clearly, for the output to become “1”, at least two inputs with logic “1” are required. A majority gate can be configured such that it functions as an AND gate or an OR gate. To this end, one of the input cells is fixed to $P = +1$ or $P = -1$ to realize an OR function or an AND function, respectively. Figure-5(c) shows the MGs which function as 2-input OR and AND gates.
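The majority function and its AND/OR configurations can be checked with a short truth-table sketch. The names `majority`, `and_gate` and `or_gate` are illustrative; the fixed input is modeled as logic 0 for $P = -1$ and logic 1 for $P = +1$.

```python
from itertools import product


def majority(a, b, c):
    """Three-input majority vote: M(A, B, C) = AB + BC + AC."""
    return (a & b) | (b & c) | (a & c)


def and_gate(a, b):
    """Majority gate with the third input fixed to logic 0 (P = -1)."""
    return majority(a, b, 0)


def or_gate(a, b):
    """Majority gate with the third input fixed to logic 1 (P = +1)."""
    return majority(a, b, 1)


if __name__ == "__main__":
    for a, b in product((0, 1), repeat=2):
        print(a, b, "AND:", and_gate(a, b), "OR:", or_gate(a, b))
```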

2.3. QCA clocking

Although synchronization in both CMOS and QCA circuitry is accomplished by clocking, the clock mechanism in QCA is substantially different. In order to properly direct the input to the desired output, to achieve signal energy restoration and to solve the thermodynamic issue in large QCA arrays [7], four distinct clock signals, phase shifted by 90 degrees, are required. Figure-6 depicts the schematic of a QCA wire in the presence of the clock signals. As shown, each clock signal is composed of four phases: Switch, Hold, Release and Relax. Corresponding to each clock phase, a cell zone is constructed. These clocking features together make QCA circuits considerably more defect tolerant. The cell polarization process starts in the Switch phase (the rising edge) and continues until the cell becomes completely polarized. When the clock reaches a high level (the Hold phase), the cell stores its polarization. The cell polarization decreases when the clock goes through the Release phase (the falling edge). Finally, at the low level of the clock (the Relax phase), the cell becomes unpolarized. The number of zones in the critical path of a QCA circuit determines its overall delay [17].
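The four-phase scheme can be illustrated with a small sketch that reports the phase of each clock zone over time and converts a zone count into a delay in clock cycles. The indexing convention (zone 0 enters Switch at quarter-cycle 0, and each following zone lags by one quarter cycle) and the names `zone_phase` and `delay_in_clocks` are assumptions for illustration.

```python
PHASES = ("Switch", "Hold", "Release", "Relax")


def zone_phase(zone, quarter):
    """Phase of clock zone `zone` during quarter-cycle number `quarter`.

    Assumes zone 0 is in Switch at quarter 0 and that each successive zone
    lags the previous one by a quarter cycle (a 90-degree phase shift).
    """
    return PHASES[(quarter - zone) % 4]


def delay_in_clocks(zones_on_critical_path):
    """Delay in clock cycles, with four clock zones per full cycle."""
    return zones_on_critical_path / 4


if __name__ == "__main__":
    for quarter in range(4):
        print(quarter, [zone_phase(zone, quarter) for zone in range(4)])
    print(delay_in_clocks(4))
```

In this convention a zone is in Switch exactly when the zone before it is in Hold, which is what lets a held value drive the next zone along the wire.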

2.4. QCA power dissipation

Power dissipation is considered to be a limiting factor in modern integrated circuits [18]. Although the power dissipation in QCA is small, irreversible dissipation from bit erasures becomes a major concern [19]. A quasi-adiabatic model for power dissipation in QCA circuits was first presented by Timler and Lent [8]. The relation between the energy and polarization states of any QCA cell can be established by a Hamiltonian matrix that uses the two polarization states of a QCA cell as a two-state basis. For an array of QCA cells, the Hamiltonian can be approximated by the intercellular Hartree approximation (ICHA), which treats the Coulombic interaction between cells by a mean-field approach [17, 18], as follows:

$$
H = \left[ \begin{array}{cc}
\sum_{j=0}^{n} d_{ij} P_i & \sum_{j=0}^{n} E_{kj} \gamma P_i P_j \\
\sum_{j=0}^{n} E_{ij} \gamma & \sum_{j=0}^{n} \frac{E_{kj} \gamma}{2} P_j
\end{array} \right]
$$

(3)

where $P_i$ represents the polarization of the $i^{th}$ cell, $\gamma$ is the tunnelling energy between the two polarization states, $d_{ij}$ represents a geometrical factor which specifies the interaction between the $j^{th}$ and $i^{th}$ cells, and $E_{kj}$ is the kink energy between two neighboring cells. The total instantaneous power of a QCA circuit is given by [19]:

$$
P_t = \frac{dE}{dt} = \frac{\hbar}{2} \left[ \frac{d\vec{\Gamma}}{dt} \cdot \vec{\lambda} + \vec{\Gamma} \cdot \frac{d\vec{\lambda}}{dt} \right] = P_1 + P_2
$$

(4)

where $\vec{\lambda}$ and $\vec{\Gamma}$ represent the coherence vector and the three-dimensional energy vector, respectively, and $\hbar$ is the reduced Planck constant. The first term of the above equation accounts for the difference between the input power ($P_{in}$) and the output power ($P_{out}$), i.e., the power gain achieved by the circuit. The second term gives the power dissipated ($P_{diss}$) by the circuit.

A non-adiabatic power dissipation model based on the quasi-adiabatic assumption was proposed by Bhanja and Srivastava [12]. A tool named QCA Pro has been developed by Srivastava et al. [12] for the estimation of polarization error and non-adiabatic switching power loss in QCA circuits. The power dissipation in QCA clocking wires is fairly small [20]; hence, the overall power dissipation in QCA is dominated by that in the QCA logic devices. QCA wires can be considered as shift registers in which there is no information loss.

3. RIPPLE CARRY ADDER (RCA)

A ripple carry adder is formed by cascading several 1-bit full adders. A 4-bit ripple carry adder that adds the numbers A3A2A1A0 and B3B2B1B0 is depicted in Figure-7. The input carry to the rightmost one-bit adder is denoted by C0. The output carries and sums are denoted by Ci, i = 1, . . . , 4 and Si, i = 0, . . . , 3, respectively. This can be extended to an n-bit adder in the same fashion. It is clear that an efficient QCA design of the RCA depends on the design of the one-bit full adder. One approach to full-adder design in QCA is based on directly replacing the XOR, AND and OR gates in the realization by majority gates and inverters.

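Lemma 2 $M(a, b, M(a, b, c)') = M(a, b, c')$, where $x'$ denotes the complement of $x$.

Proof: Since $M(a, b, c)' = a'b' + b'c' + a'c'$,

$M(a, b, M(a, b, c)') = ab + a(a'b' + b'c' + a'c') + b(a'b' + b'c' + a'c')$

$= ab + ab'c' + a'bc'$

$= a(b + b'c') + b(a + a'c')$

$= ab + ac' + bc'$

$= M(a, b, c')$

As a consequence of Lemma 2 and the symmetry of the majority function, we have

$M(a, M(a, b, c)', c) = M(a, b', c)$

and

$M(M(a, b, c)', b, c) = M(a', b, c)$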

Lemma 3 Let $f_1$, $f_2$ and $f_3$ be three Boolean functions such that $f_1$ and $f_2$ satisfy $f_1f_2 = f_1$ and $f_1 + f_2 = f_2$. Then

$M(f_1, f_2, f_3) = f_1 + f_2f_3$.

Proof: $M(f_1, f_2, f_3) = f_1f_2 + f_1f_3 + f_2f_3 = f_1 + (f_1 + f_2)f_3 = f_1 + f_2f_3$
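Both lemmas can be sanity-checked by exhaustive enumeration. The sketch below assumes Lemma 2 reads $M(a, b, M(a, b, c)') = M(a, b, c')$ (with its two symmetric variants) and Lemma 3 reads $M(f_1, f_2, f_3) = f_1 + f_2f_3$ under the stated conditions on $f_1$ and $f_2$; the helper names are illustrative.

```python
from itertools import product


def M(a, b, c):
    """Three-input majority function."""
    return (a & b) | (b & c) | (a & c)


def inv(x):
    """Complement of a single bit."""
    return 1 - x


def check_lemma2():
    """Check M(a, b, M') = M(a, b, c') and its two symmetric variants."""
    for a, b, c in product((0, 1), repeat=3):
        m_bar = inv(M(a, b, c))
        if M(a, b, m_bar) != M(a, b, inv(c)):
            return False
        if M(a, m_bar, c) != M(a, inv(b), c):
            return False
        if M(m_bar, b, c) != M(inv(a), b, c):
            return False
    return True


def check_lemma3():
    """Check M(f1, f2, f3) = f1 + f2*f3 whenever f1*f2 = f1 and f1 + f2 = f2."""
    for f1, f2, f3 in product((0, 1), repeat=3):
        if (f1 & f2) == f1 and (f1 | f2) == f2:
            if M(f1, f2, f3) != (f1 | (f2 & f3)):
                return False
    return True


if __name__ == "__main__":
    print("Lemma 2 holds:", check_lemma2())
    print("Lemma 3 holds:", check_lemma3())
```

Because Boolean functions are compared pointwise, checking every combination of the values of $f_1$, $f_2$ and $f_3$ that satisfies the hypothesis is sufficient for Lemma 3.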

The following lemma establishes that, in a one-bit full adder, carry generation requires one majority gate and sum generation requires just two further majority gates plus one inverter. Let $a_i$, $b_i$ and $c_i$ be the inputs to a full adder and let $s_i$ and $c_{i+1}$ be its outputs.

Lemma 4 A 1-bit full adder can be realized using 3 majority gates and 1 inverter.

Results from [21],

$c_{i+1} = M(a_i, b_i, c_i)$
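The sum expression is not reproduced in the text above. As a hedged illustration, one decomposition that matches the stated count of three majority gates and one inverter is $s_i = M(c_{i+1}', c_i, M(a_i, b_i, c_{i+1}'))$, where the single inverter produces $c_{i+1}'$ and its output is reused. This form is an assumption for this sketch rather than a quotation from [21]; the code checks it against the full-adder truth table.

```python
from itertools import product


def M(a, b, c):
    """Three-input majority function."""
    return (a & b) | (b & c) | (a & c)


def full_adder(a, b, c_in):
    """One-bit full adder built from three majority gates and one inverter.

    The sum decomposition used here is an assumed form, chosen to match the
    gate count in Lemma 4.
    """
    c_out = M(a, b, c_in)          # majority gate 1: carry out
    c_out_bar = 1 - c_out          # the single inverter
    inner = M(a, b, c_out_bar)     # majority gate 2
    s = M(c_out_bar, c_in, inner)  # majority gate 3: sum
    return s, c_out


if __name__ == "__main__":
    for a, b, c in product((0, 1), repeat=3):
        print(a, b, c, full_adder(a, b, c))
```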

A one-bit full adder that incorporates appropriate clocking is shown in Figure-8. The D-latch convention presented in [22] enables us to obtain the total circuit delay. One D-latch (namely, D0) is used to indicate that one-quarter of a clock is required to apply the inputs to the majority logic. A delay of one clock zone (one-fourth of a clock) is assumed whenever a majority gate is immediately followed by an inverter or vice versa (D1 is introduced at the output of the inverter that follows the majority gate [22]). Proceeding this way, we have a total circuit delay of 1 clock (4 clock zones) for generating $S_i$ as well as $C_{i+1}$ for a 1-bit adder.

Figure-8. Full adder realization using three majority gates and one inverter; the D-latches enable delay determination [21].

The result presented in Lemma 4 improves upon a result in [4] that requires 2 inverters. We can use the result on a 1-bit full adder to derive the following n-bit RCA.
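The cascade of one-bit adders into an n-bit RCA can be sketched at the logic level as follows. The full adder uses an assumed majority-gate decomposition with three majority gates and one inverter; the helper names `to_bits`, `from_bits` and `ripple_carry_add` are illustrative, and bit lists are ordered least significant bit first.

```python
def M(a, b, c):
    """Three-input majority function."""
    return (a & b) | (b & c) | (a & c)


def full_adder(a, b, c_in):
    """One-bit full adder from three majority gates and one inverter (assumed form)."""
    c_out = M(a, b, c_in)
    c_out_bar = 1 - c_out
    s = M(c_out_bar, c_in, M(a, b, c_out_bar))
    return s, c_out


def to_bits(value, width):
    """Least-significant-bit-first list of `width` bits."""
    return [(value >> k) & 1 for k in range(width)]


def from_bits(bits):
    """Integer value of a least-significant-bit-first bit list."""
    return sum(bit << k for k, bit in enumerate(bits))


def ripple_carry_add(a_bits, b_bits, c0=0):
    """Add two equal-length bit lists; the carry ripples from bit 0 upward."""
    carry = c0
    sums = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sums.append(s)
    return sums, carry


if __name__ == "__main__":
    sums, carry = ripple_carry_add(to_bits(11, 4), to_bits(6, 4))
    print(from_bits(sums) + (carry << 4))
```

The loop makes the linear delay visible: bit i cannot produce its sum until the carry from bit i-1 is available.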

4. DESIGN OF RIPPLE CARRY ADDER (RCA) WITH STRUCTURAL AND POWER ANALYSIS

The one-bit adder design is composed of 124 QCA cells, as shown in Figure-9.

Figure-9. Design of one bit adder.

It is worth pointing out that all the input cells are implemented in a single layer, with no need for additional layers to access them. In addition, the input and output cells are not surrounded by other cells. To verify the functionality of the ripple carry adder, the coherence vector engine of QCA Designer version 2.0.3 [17] is used with the options summarized in Table-1. It is clear from Figure-10 that the expected highly polarized output waveform is achieved.

Figure-10. Output wave form of one bit adder.

4.1 Structural analysis

An n-bit RCA requires at most 3n majority gates and n inverters. A 4-bit RCA therefore requires 12 majority gates and 4 inverters. Note that the path from the input to the “last” output contains seven clock zones, as shown in Figure-11, so the total delay is 7/4 clocks. While the RCA is simple, its delay increases linearly with the size of the adder. The 4-bit ripple carry adder design is composed of 745 QCA cells, as shown in Figure-12.
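The gate counts and the delay can be tabulated with a small sketch. The counts 3n and n come from the text; the zone count n + 3 is an extrapolation assumed here from the two stated data points (four zones for the 1-bit adder and seven zones for the 4-bit RCA), under the assumption that each additional bit adds one clock zone to the carry path. The function name `rca_resources` is illustrative.

```python
def rca_resources(n):
    """Upper-bound gate counts and estimated delay of an n-bit RCA.

    The clock-zone formula n + 3 is an assumed extrapolation, not a figure
    taken from the source.
    """
    if n < 1:
        raise ValueError("adder width must be at least 1 bit")
    zones = n + 3
    return {
        "majority_gates": 3 * n,
        "inverters": n,
        "clock_zones": zones,
        "delay_clocks": zones / 4,  # four clock zones per clock cycle
    }


if __name__ == "__main__":
    for width in (1, 4, 8):
        print(width, rca_resources(width))
```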

Figure-11. 4-bit RCA critical path composed of 7 D-latches [21].

Figure-12. Design of four bit ripple carry adder.
VOL. 13, NO. 8, APRIL 2018

ARPN Journal of Engineering and Applied Sciences
©2006-2018 Asian Research Publishing Network (ARPN). All rights reserved.

Table-1. Coherence vector parameters.

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>Cell size</td>
<td>18*18 nm²</td>
</tr>
<tr>
<td>Dot diameter</td>
<td>5 nm</td>
</tr>
<tr>
<td>Center-to-Center Distance</td>
<td>20 nm</td>
</tr>
<tr>
<td>Temperature</td>
<td>2.000000 K</td>
</tr>
<tr>
<td>Relaxation time</td>
<td>4.1356675e-14 s</td>
</tr>
<tr>
<td>Time step</td>
<td>1.000000e-016s</td>
</tr>
<tr>
<td>Total simulation time</td>
<td>7.000000e-011s</td>
</tr>
<tr>
<td>Clock high</td>
<td>9.800000e-022 J</td>
</tr>
<tr>
<td>Clock low</td>
<td>3.800000e-023 J</td>
</tr>
<tr>
<td>Clock shift</td>
<td>0.000000e-000</td>
</tr>
<tr>
<td>Clock amplitude factor</td>
<td>2.000000</td>
</tr>
<tr>
<td>Radius of effect</td>
<td>80.000000 nm</td>
</tr>
<tr>
<td>Relative permittivity</td>
<td>12.900000</td>
</tr>
<tr>
<td>Layer separation</td>
<td>11.500000 nm</td>
</tr>
</tbody>
</table>

The area occupation, the single-layer accessibility of the input and output cells, and the number of cells consumed by the design are reported in Table-2.

Table-2. Structural analysis result.

| Parameter | Value |
| --- | --- |
| Single layer accessibility to the input cells | Yes |
| Single layer accessibility to the output cell | Yes |
| Consumed cell count | 745 |
| Area occupation (µm²) | 0.95 |

4.2 Power analysis

As noted before, QCA Pro [8], an accepted power estimation tool, is used to estimate the power consumed by the design. The design is examined under three different tunneling energy levels (0.5$E_k$, 1$E_k$ and 1.5$E_k$) at a temperature of 2 K. The corresponding power dissipation maps are shown in Figures 13(a), (b) and (c). Cells with high power dissipation appear as thermal hotspots with darker colors.

It is evident from Figures 13(a), (b) and (c) that the middle cells (or voter cells) dissipate more power than the other cells. The position of the input cells surrounding the voter cell, and their effect on it, can therefore be considered one of the most significant factors in increasing the power dissipation.

The expectation value of the QCA cell energy at each clock cycle is calculated as

\[ E = \langle H \rangle = \frac{\hbar}{2} \vec{\Gamma} \cdot \vec{\lambda} \]

Here \( \vec{\Gamma} \) is the energy environment vector of the cell, including the effects of its neighbors, \( \hbar \) is the reduced Planck constant and \( \vec{\lambda} \) represents the coherence vector.

In Equation (4), the term $P_1$ includes two main components: first, the power gain obtained from the difference between the input and output signal powers ($P_{in} - P_{out}$) and, second, the clocking power transferred to the cell ($P_{clock}$). The term $P_2$ represents the dissipated power ($P_{diss}$) [10].

Based on [8], the energy dissipated during one clock cycle \( T_s = [-T, T] \) can be expressed in terms of the Hamiltonian and coherence vectors as

\[ E_{diss} = \frac{\hbar}{2} \int_{-T}^{T} \vec{\Gamma} \cdot \frac{d\vec{\lambda}}{dt} \, dt = \frac{\hbar}{2} \left( \left[ \vec{\Gamma} \cdot \vec{\lambda} \right]_{-T}^{T} - \int_{-T}^{T} \frac{d\vec{\Gamma}}{dt} \cdot \vec{\lambda} \, dt \right) \]

Here \( k_B \) represents the Boltzmann constant and \( T \) is the temperature. In an array of similar QCA cells, the total dissipated power can be calculated by adding the dissipated power of all cells, since the model for each QCA cell is identical. Applying these concepts yields a power dissipation model for QCA circuits that separates the total power into two main components, called “leakage” and “switching”. Power losses during clock changes lead to leakage power, and power loss during the switching period leads to switching power. Based on this model, the power is estimated for 0.5$E_k$, 1$E_k$ and 1.5$E_k$ at a temperature of 2 K under non-adiabatic switching.

Figure-13(a). The power dissipation map with 0.5$E_k$.

The power dissipation maps of the design show that the total energy dissipation increases as the kink energy ($E_k$) level increases. As the table below shows, the average leakage energy rises with $E_k$ while the average switching energy falls, and the total rises monotonically.

<table>
<thead>
<tr>
<th>Kink energy ($E_k$)</th>
<th>Avg. leakage energy dissipation (meV)</th>
<th>Avg. switching energy dissipation (meV)</th>
<th>Total energy consumption (meV)</th>
</tr>
</thead>
<tbody>
<tr>
<td>0.5$E_k$</td>
<td>0.20470</td>
<td>0.65304</td>
<td>0.85774</td>
</tr>
<tr>
<td>1$E_k$</td>
<td>0.57828</td>
<td>0.53217</td>
<td>1.11045</td>
</tr>
<tr>
<td>1.5$E_k$</td>
<td>0.98648</td>
<td>0.43518</td>
<td>1.42166</td>
</tr>
</tbody>
</table>
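The totals in the table can be cross-checked by adding the leakage and switching columns. The sketch below simply re-enters the reported values; the names `REPORTED` and `total_energy` are illustrative.

```python
import math

# Reported values in meV: (avg. leakage, avg. switching, total), keyed by
# the multiple of E_k used in the simulation.
REPORTED = {
    0.5: (0.20470, 0.65304, 0.85774),
    1.0: (0.57828, 0.53217, 1.11045),
    1.5: (0.98648, 0.43518, 1.42166),
}


def total_energy(leakage_mev, switching_mev):
    """Total energy as the sum of the leakage and switching components."""
    return leakage_mev + switching_mev


if __name__ == "__main__":
    for level, (leak, switch, total) in sorted(REPORTED.items()):
        computed = total_energy(leak, switch)
        consistent = math.isclose(computed, total, abs_tol=1e-9)
        print(f"{level} E_k: computed {computed:.5f} meV, consistent: {consistent}")
```

The same data show the trend described in the text: the leakage component grows with the energy level while the switching component shrinks, and their sum grows.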

5. CONCLUSIONS

In this paper, an existing design of a QCA-based ripple carry adder is studied extensively through structural and power analysis. The energy consumption is obtained for different kink energy ($E_k$) levels by adding the average leakage energy dissipation and the average switching energy dissipation, which are estimated using the QCA Pro tool; the functional outputs are verified using the QCA Designer tool. The entire QCA design occupies an area of 0.95 $\mu$m$^2$ with 745 QCA cells, and a single-bit adder consists of 124 cells with an area of 0.10 $\mu$m$^2$. The total energy consumption is 0.85774 meV, 1.11045 meV and 1.42166 meV for 0.5$E_k$, 1$E_k$ and 1.5$E_k$, respectively. The structural and power characteristics of the QCA-based RCA are thus established. These results help in optimizing performance and in reducing area and power, especially when designing power-efficient, high-speed digital systems.

REFERENCES
automata (QCA) wires and logic devices, Nanotechnol. IEEE Trans. 3(3): 368-376.


