# Hierarchical Design of Robust and Low Data Dependent FinFET Based SRAM Array

Mohsen Imani, Shruti Patil, Tajana Simunic Rosing Department of Computer Science and Engineering University of California, San Diego {moimani,patil,tajana}@ucsd.edu

Abstract— This paper proposes a new FinFET-based SRAM cell and a cache architecture that efficiently exploits our SRAM cell for low-power and robust memory design. Our cache architecture uses an invert-coding scheme to encode the input data of a word line by taking into account the data composition. Based on the new data distribution, we propose two new asymmetric SRAM cells (AABG and ADWL) utilizing adaptive back-gate feedback that significantly improve cache power consumption and reliability, and provide higher performance in state-of-the-art SRAM caches. The results show that the AABG cell is a good candidate for robust and low-power caches, while the ADWL-based SRAM cache is a low-power and high-performance cache. The simulations are performed on SPEC CPU 2006 benchmarks with GEM5 and HSPICE in 20nm independent-gate FinFET technology. The results show that the proposed AABG (ADWL)-based cache improves static and dynamic power by at least 13% and 35% (17% and 12%) respectively, compared to other state-of-the-art cells, while guaranteeing 2.7X (1.98X) lower NBTI degradation with less than 1.5% area overhead.

#### Keywords—SRAM, FinFET, NBTI, Process Variations

#### I. INTRODUCTION

SRAM caches dominate the chip area in today's systems-on-chip. However, scaling SRAM designs to the minimum feature size of a nanoscale technology degrades yield and stability due to several issues that affect the cell, such as process variations, short-channel effects and Negative Bias Temperature Instability (NBTI) degradation. To address some of these challenges, FinFET devices with double gates and highly controllable channels have been introduced. FinFET structures have sharper current-voltage characteristics, which reduce short-channel effects. FinFETs structured with independent gates increase design flexibility, since the two gates can be biased separately [1, 2]. Innovative back-gate design and dynamic back-gate biasing not only improve the power/performance characteristics of circuits but also increase their reliability and stability [3-6].

In SRAM cells, each cell operation mode, i.e. read, write and hold, requires specific back-gate biasing to achieve high performance. For example, for high stability, the write operation needs strong access transistors; in contrast, the read operation needs weak access transistors. The same requirement exists for the pull-up and pull-down transistors in the main body for read and write operations [7]. Several recent papers have optimized SRAM cells for read, write or hold characteristics while minimizing the overhead in other modes, without considering feasibility and reliability issues [8, 9]. Other efforts have biased PFET back-gates to control the threshold voltage and NBTI degradation [10]. However, biasing FinFET back-gates with varying voltage levels in different modes increases the complexity and overhead of the design, and may impose a high penalty on its reliability.

To the best of our knowledge, this is the first FinFET-based SRAM design that improves cell characteristics while considering their variability and reliability. Our main contributions are outlined below.

- In this paper, we exploit the asymmetry of cell-level characteristics with respect to the stored data. Within cache word lines, we skew the stored data towards a high proportion of '0' using invert coding. Given the new data distribution, we propose two asymmetric FinFET-based SRAM cells that are optimized for storing zeros, AABG and ADWL. They utilize adaptive back-gate feedback, which improves write performance, power consumption and cell variability while considering read stability.
- Cell-level results show that both proposed cells improve static power, write margin and variability compared to other state-of-the-art cells. In addition, we show that data encoding dramatically reduces NBTI degradation. Considering process variations, after three years of experiencing NBTI, the AABG and ADWL cells have 2.76X and 1.98X lower read failure with respect to other state-of-the-art cells.
- Our cache-level simulations with SPEC CPU2006 benchmarks show that the proposed AABG (ADWL) cache has 13% and 35% (17% and 12%) better static and dynamic power respectively, with respect to other state-of-the-art caches, with about 1.5% peripheral area overhead. Finally, we show that the AABG SRAM array is good for designing low-power and robust caches, while the ADWL cache is appropriate for low-power and high-performance usage.

# II. BACKGROUND & RELATED WORK

Independent-gate FinFETs add a degree of freedom when designing the back-gate connections of transistors in SRAM cells. Prior efforts have used the FinFET back-gate to improve the write operation at the expense of read characteristics, or vice versa. In [11], the back-gate is used to control the read performance and write ability of SRAM cells. A Double Gate (DG) FinFET SRAM cell is investigated in [12]. The back-gates of the access transistors in the DG cell are grounded to make them weak [13], improving read stability at the expense of write performance. In the Yin-Yang feedback cell [14], the storage node is connected directly to the back-gate of the access transistor (Adaptive Back Gate feedback, ABG), which improves the write ability and write performance. The work in [15] uses double word lines, a Write Word Line (WWL) and a Read Word Line (RWL), for the front and back gates of the access transistors respectively. During a write operation both lines are activated, while in read mode only RWL is activated. This method improves the read stability of the cell. Yet another modification of the Yin-Yang cell is introduced in [8] (BKL cell), which connects the back-gates of the PFET transistors to the write word line, improving the write margin of the cell. In read and hold modes, a bias of 0V or 0.48V is assigned to the PFET back-gates. This improves the read characteristics, but also increases the hold power. Due to PFET back-gate biasing in read and hold modes, the cell has high NBTI degradation and reliability issues, which are not considered. Along similar lines, in the design in [9], both PFET back-gates are biased to '1' in hold mode to decrease static power. Back-gate voltages are optimized during the read operation to achieve maximal read SNM. However, the design requires large area, energy and latency overhead to bias the cells during the various operating modes. Further, due to supply voltage variation and voltage scaling features in modern caches, the effective voltages over the cell change frequently, thereby altering the optimum back-gate voltage dynamically. Moreover, both cells ([8] and [9]) are controlled with a high external voltage, which increases intermediate node activity and reduces cache stability. In summary, prior cell designs only achieve improvements in one or two cell characteristics, while degrading other parameters. In contrast, we design cache structures with reduced dependency on the data stored in the cells, allowing us to reconfigure back-gate connections to improve power, performance, reliability and variability simultaneously.

The goal of this paper is to leverage adaptive back-gate feedback in the 6T SRAM structure to improve cell characteristics in all three modes: hold, read and write. The optimal connections for the FinFET back-gates in a 6T SRAM cell are illustrated in Table 1 [9]. Data stored in an SRAM cell (i.e. 0 or 1) is non-deterministic. Therefore, it is not possible to improve all characteristics of an SRAM cell by modifying the back-gate connections alone. Instead, Table 1 shows that each configuration (back-gate modification) is appropriate for one or two specific modes and cannot satisfy the requirements of all modes simultaneously. Designers typically consider two design goals for back-gate biasing: (i) the mode(s) to be improved: write, read or hold. The requirements of these modes conflict in several cases, for example, as seen in the PFET and NFET back-gate voltages in read and hold modes (Table 1); (ii) the stored data for which the cell performs better: '0' or '1'. As shown in Table 1, the best back-gate connection tends to depend on the cell value. Since the input data is considered to be random, there is no simple way to achieve the optimal cell modification. To address this issue, we first decrease the dependency on the stored bit in the cache architecture and then design the cell to improve read, write and hold characteristics simultaneously for the particular optimized data value (in our case, zeros).

TABLE 1. OPTIMAL SIGNALS FOR BACK GATE OF FINFETS IN 6T SRAM [9]

|             | Pull up<br>PFET |     | Pull down<br>NFET |     | Access<br>Transistors |     |
|-------------|-----------------|-----|-------------------|-----|-----------------------|-----|
| Stored Data | 0               | Vdd | 0                 | Vdd | 0                     | Vdd |
| Read        | 1               | 0   | 1                 | 0   | 0                     | 1   |
| Write       | 0               | 1   | 0                 | 1   | 1                     | 1   |
| Hold        | 1               | X*  | X                 | 0   | 0                     | X   |

\*X: Don't care

#### III. PROPOSED SRAM

# A. Data Encoding

As we discussed in the previous section, the optimal design of the FinFET SRAM cell depends on the data saved in it. In order to decrease the data dependency in storage nodes, we use data encoding to increase the proportion of zeroes stored in the cache. This allows us to optimize FinFET cell structure for storing zeros, and enables us to improve overall SRAM array characteristics.

In the cache structure, we count the number of 1's in each word line and invert lines with more than 50% 1's. While systems have evolved to store 64-bit data, the majority of the values stored in caches have narrow widths. This has been exploited for power and performance optimizations in the caches and register files of microprocessors [16, 17]. Our simulations verify this observation on ten SPEC CPU 2006 benchmarks, showing that more than 93% of data values need $\leq$ 32 bits, while the upper 32 bits are $\geq$ 96% zeros. Therefore, our proposed method implements the invert-coding scheme only for the lower 32 bits. This significantly reduces the power and area overhead of the peripheral circuitry. In a system where data is not expected to be narrow-width, the invert-coding scheme must be implemented on the entire 64-bit length.
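The narrow-width observation above can be measured on a memory trace with a few lines of code. The following is a minimal sketch, assuming the trace is available as a list of unsigned 64-bit integers; the function name `narrow_width_stats` and the sample values are hypothetical and are not taken from the paper's tool flow.

```python
def narrow_width_stats(words, width=64, low_bits=32):
    """Return (fraction of words that fit in low_bits,
    fraction of zero bits in the upper width - low_bits bits)."""
    if not words:
        raise ValueError("words must be non-empty")
    upper_bits = width - low_bits
    mask = (1 << width) - 1
    narrow = 0       # words whose upper bits are all zero
    upper_ones = 0   # total number of '1' bits in the upper halves
    for w in words:
        upper = (w & mask) >> low_bits
        if upper == 0:
            narrow += 1
        upper_ones += bin(upper).count("1")
    narrow_fraction = narrow / len(words)
    upper_zero_fraction = 1 - upper_ones / (upper_bits * len(words))
    return narrow_fraction, upper_zero_fraction


# Hypothetical four-word trace, for illustration only.
sample = [0x000000000000002A, 0x00000000FFFFFFFF, 0x0000000100000000, 0x0]
narrow, upper_zero = narrow_width_stats(sample)
```

The sketch treats values as unsigned, so a sign-extended negative number counts as wide; whether the paper's measurement handles such values differently is not stated.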



The structure of the SRAM array with the proposed cells is shown in Figure 1. Invert-coding and decoding blocks are used to process data before writing to the cache. The block operates in parallel with the address (row/column) decoding, and therefore does not add latency to read and write operations. In the cache structure, an extra flag bit (*Invert-Bit*) is added to each word line to indicate whether the original or inverted data is written to the cache.

The invert-coding block, shown in Figure 2, consists of two stages. First the comparator block counts the number of ones in each word and compares it to a threshold value (N/2 where N is length of each word). If the number of ones is higher than threshold, the *invert-bit* becomes '1' and the second stage XORs invert the input data before putting them on the cache I/O. Details of invert-coding and comparator circuits are described in [18]. Similarly, in read mode the decoder circuit uses XOR blocks to transfer cache data to output bus based on flag bit (*Invert-Bit*). The area overhead of peripheral circuits is negligible compared to cache area (~ 0.1% for a 32KB cache). However, the area overhead of extra flag bit in each word is about 1.5% for 64-bit line cache.
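The two-stage encoder and the XOR-based decoder described above can be modeled behaviorally as follows. This is a minimal sketch, not the circuit of Figure 2: it assumes a 64-bit word whose lower 32 bits are coded (so the threshold N/2 is taken as 16), and the names `encode`, `decode`, `LOW_BITS` and `LOW_MASK` are illustrative.

```python
LOW_BITS = 32                    # assumed width of the coded part of the word
LOW_MASK = (1 << LOW_BITS) - 1


def encode(word):
    """Return (stored_word, invert_bit) for a non-negative 64-bit word."""
    low = word & LOW_MASK
    # Stage 1: count the ones and compare against the threshold N/2.
    ones = bin(low).count("1")
    invert_bit = 1 if ones > LOW_BITS // 2 else 0
    # Stage 2: XOR with all-ones inverts the coded bits when the flag is set.
    if invert_bit:
        low ^= LOW_MASK
    return (word & ~LOW_MASK) | low, invert_bit


def decode(stored, invert_bit):
    """Recover the original word from the stored word and its flag bit."""
    low = stored & LOW_MASK
    if invert_bit:
        low ^= LOW_MASK
    return (stored & ~LOW_MASK) | low


stored, flag = encode(0x00000001_0001FFFF)   # 17 ones in the lower half
restored = decode(stored, flag)
```

After encoding, the coded half never holds more than 16 ones. One flag bit per 64-bit word amounts to 1/64, or roughly 1.6% extra storage, which is in line with the approximately 1.5% overhead quoted above.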

## B. Proposed SRAM cells

With data encoding, storage nodes are expected to store '0' for the majority of the time. This provides an opportunity to design back-gate connections for a low-power and stable cell. Table 1 shows that the back-gate connection of the conventional 6T SRAM cell (in the main body) is well suited to the read operation. Given the expected data characteristics, the cell (main body NFETs and PFETs) can now be modified such that write and hold characteristics are improved. Since the back-gates of the access transistors greatly affect the read stability of the cell, we can compensate for the degradation by modifying the access transistors. Figure 3 shows the two proposed low-power cells based on 6T SRAM: Asymmetric Double Word Line (ADWL) and Asymmetric Adaptive Back Gate feedback (AABG). AABG is designed to be robust, low-variation and low-power, while ADWL targets low power and high performance with less emphasis on variability.



Figure 3. The proposed 6T-SRAM cells (a) Asymmetric Adaptive Back Gate feedback cell (AABG) (b) Asymmetric Double Word Line cell (ADWL)

The cell modifications with respect to the original 6T cell are: (i) Connecting the back-gate of M3 to the Vdd node: As the QB node is '1' the majority of the time, connecting the back-gate of M3 to Vdd makes M3 weak. As Table 1 shows, this connection improves the write ability by weakening the main body, without a negative effect on hold characteristics. In addition, this asymmetric structure increases the effective average time that the gates of the pull-up PFETs are biased with '1', since both gates of M4 and the back-gate of M3 are biased at '1' when Q stores zero. This connection decreases the cell power consumption in write, read and hold modes and also significantly decreases the NBTI degradation of the cell, improving the lifetime of the cache.

(ii) Connecting the back-gate of M2 to the GND node: The write operation can be performed from either side of the cell. When storing '0' on Q, the right-side inverter is in weak mode due to the bias of '1' at the back-gates of the PFET transistors. Based on Table 1, connecting the back-gate of M2 to GND improves the write characteristics without affecting the hold mode. Moreover, this connection weakens the left back-to-back inverter and creates fair competition between the two inverters (decreasing the static power and asymmetry of the cell).

(iii) Back gate connection on access transistors: Our proposed cells use two approaches for back-gate connection in access transistors. The AABG connects back-gates of access transistors to storage nodes, and ADWL uses an extra read word line for back-gate of access transistors.

*iii. A. Access transistor connections in AABG cell [14]* This connection improves the read ability of the cell (see Table 1). When the cell stores '0', the corresponding access transistor goes into weak mode, decreasing the drive current through the cell. This can increase the stability of the SRAM, since a high drive current when reading can change the stored value. This connection reduces the area and improves stability, balancing the performance in both read and write operations. Its two disadvantages are high static power and low write performance. Hold power is high because the access transistor on the node that stores '1' never turns off and always leaks through the bitline (BL). Moreover, it decreases the write current of the cell: the write operation is performed from both sides of the cell, so a weak access transistor on one side can degrade write performance.

*iii. B. Double word lines in ADWL access connection [15]*

As mentioned above, strong access transistors during the read operation cause read failures. A high read current may act as a write operation, flipping the cell. The adaptive back-gate connection improves read stability, but with overhead in hold power and write performance. In the ADWL cell (Figure 3 b), we use a dual word line strategy. For the write operation, both WWL and RWL are active, giving strong access transistors and a high write current with proportionally high write performance. In read mode, we only activate the RWL signal to decrease the read path current and the read failure rate of the system. This approach needs an extra read word line (RWL) and control circuit. In summary, this connection is good for very low leakage and stable systems that require high write performance.
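The word line control of the ADWL cell can be summarized as a small lookup. The sketch below is illustrative only: the function name `adwl_word_lines` is hypothetical, logic levels are written as 0/1, and the hold entry follows the statement in the results section that both gates of the ADWL access transistors are biased at zero during hold.

```python
def adwl_word_lines(mode):
    """Return (WWL, RWL) logic levels for the selected ADWL row.

    WWL drives the front gates and RWL the back gates of the
    access transistors, following the dual word line scheme of [15].
    """
    signals = {
        "write": (1, 1),  # both gates on: strong access transistors
        "read": (0, 1),   # only RWL on: weak access transistors
        "hold": (0, 0),   # both gates off: access transistors cut off
    }
    if mode not in signals:
        raise ValueError(f"unknown mode: {mode!r}")
    return signals[mode]
```

For example, `adwl_word_lines("read")` yields `(0, 1)`, so only the back gates of the access transistors are driven during a read.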

# IV. EXPERIMENTAL RESULTS

We next compare the characteristics of both AABG and ADWL cells with DG [12], ABG [14] and BKL cells [8] that connect the transistor's back-gate to fixed voltages or nodes. First, we compare cell level characteristics of cells at different supply voltages. Second, we evaluate NBTI degradation and the effects of process variations on cell failures. Finally, we show the power improvement of proposed cells at the cache level using SPEC CPU 2006 benchmarks [19]. The GEM5 simulator [20] is used to extract cache-level memory traces from the benchmarks. All circuit-level simulations are performed with HSPICE in 20nm Independent-Gate FinFET technology [21] (nominal Vdd=900mV).

# A. Cell characteristics

Leakage power is the main source of power consumption in SRAM cells. In the proposed AABG and ADWL cells, the new configuration of the FinFET back-gates effectively decreases the static power when the cell stores '0' on the Q node. In both proposed cells, when storing '0', M2 is in weak mode and the Q voltage is close to zero. This voltage is applied to the front gate of M3 and turns it ON (but not strongly). Thus, the back-to-back inverter loops are weaker than in the DG and ABG cells and consume less leakage power. On the other hand, when the cell stores '1', the back-to-back inverters are in strong mode and consume similar static power to the other cells. Table 2 compares the average static power of all cells at different supply voltages. The power consumption of the BKL cell is very large since the back-gates of both PFETs are connected to zero voltage (WL signal), and both inverters have high leakage. The power difference between AABG and ADWL is due to their leakage current through the access transistors. During hold mode, both front and back gates of the access transistors in the ADWL cell are biased at zero voltage, whereas the access transistors in the AABG cell leak from the storage nodes. As Table 2 shows, at lower supply voltages the differences in power consumption become minor since the access transistors cannot be strongly active. The results show that the AABG and ADWL cells consume less power than the other cells even though the hold power is averaged over stored '1's and '0's. This power benefit of the proposed cells is expected to increase if the data has a higher proportion of zeros.

TABLE 2. STATIC POWER OF CELLS WITH DIFFERENT SUPPLY VOLTAGES

| Vdd (V) | DG [12] | ABG [14] | BKL [8] | AABG  | ADWL  |
|---------|---------|----------|---------|-------|-------|
| 0.9     | 8.17n   | 8.19n    | 49.64u  | 7.58n | 7.43n |
| 0.8     | 6.01n   | 6.02n    | 30.92u  | 5.58n | 5.47n |
| 0.7     | 4.41n   | 4.42n    | 16.37u  | 4.09n | 4.02n |
| 0.6     | 3.13n   | 3.13n    | 6.92u   | 2.91n | 2.85n |
| 0.5     | 2.10n   | 2.10n    | 2.06u   | 1.95n | 1.91n |

Write performance: Two effective methods to increase write performance are weakening the main body and strengthening the access transistors in write mode. When the main body is weak, the access transistors have stronger control over the cell and can immediately flip the storage node values. Using ABG feedback on the access transistors weakens them, increasing the write access time (ABG and AABG cells). In contrast, the DG and ADWL cells have better write performance since they have stronger access transistors during the write operation. Figure 4 compares the access time of the different cells at a 900mV supply voltage. In the BKL cell, applying the write word line to the back-gates of the pull-up PFETs makes them weak. The same scenario exists in the proposed cells, where both inverters of the main body are weakened during a write. Thus, the average access time improvement of the AABG cell is comparable to that of the BKL cell. The main body of the AABG and ADWL cells is weak when storing '0'; hence writing a '1' to the two cells requires 1.8X and 2.6X shorter time respectively, compared to writing a '0'.



Figure 4 compares the write margin of the cells at different supply voltages. In an SRAM cell, a high write margin indicates good write ability, which depends on the strength of the access transistors and the weakness of the main body. As Figure 4 shows, the ADWL cell has a higher write margin than the other cells, since in write mode it has both features: strong access transistors and a weak main body. In contrast, a cell with weak access transistors and a strong main body (the ABG cell) has a poor write margin.

Read and hold stability: High current through the access transistor can change the value of the cell and cause read failures. Thus, increasing the read current for performance purposes must take read stability into account. The ABG connection of the Yin-Yang cell is used to decrease the read current through the discharge node (the node which stores zero). Both the BKL and AABG cells use this connection to improve read stability. In the ADWL cell, we use dual word lines for read and write operations. In read mode only RWL is active, and the weak access transistors limit the current through them. Although all these cells have comparable read currents, we explain in the next section that their read failure rates differ due to process variations and NBTI degradation. For an SRAM cell, the Static Noise Margin (SNM) is a parameter that indicates the stability of the cell. Figure 5 shows the read and hold SNMs of all SRAM cells at different supply voltages. The proposed cells have similar stability to the other cells at all supply voltages, especially when Vdd is higher than 800mV. The ABG and DG cells have high SNM since they have a strong main body in read and hold modes, respectively. When process variations are considered, stability decreases because there is no adaptive feedback in the cell structure. In other words, BKL, AABG, ADWL and several other designs accept some degradation in cell stability to improve the power, performance and variability of the system.



Figure 5. (a) Read and (b) Hold SNM of SRAM cells vs. supply voltage

# B. Process variations & cell failure rates

In highly scaled technologies, variability changes device characteristics and may cause cell failures. As new designs use minimum-size transistors, device variations have a large impact on cell variability. In this paper, we model process variations as a Gaussian distribution with $3\sigma$ equal to 10% on the size and threshold voltage of the transistors [22]. The simulations apply 1000 iterations of Monte Carlo simulation to the SRAM cell during the read operation. Figure 6 shows the variation of the cells in read and hold modes under process variations. The low variability of the proposed cells is due to the adaptive feedback on the M3 and M2 transistors. These connections prevent the effect of device variations from being transferred to cell characteristics. Therefore, the cells with the ABG connection on the access transistors have better control over the read path current and lower variability compared to the DG and ADWL cells.
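The sampling step of such a Monte Carlo run, and the sigma/mean metric reported in Figure 6, can be sketched as follows. This is an illustration under stated assumptions, not the paper's HSPICE flow: the nominal threshold voltage `NOMINAL_VTH` is a hypothetical value, the function names are invented for this sketch, and in the actual experiments each sample would parameterize a circuit simulation whose output metric (rather than the sampled parameter itself) would be passed to `sigma_over_mean`.

```python
import random
import statistics


def sample_parameter(nominal, rng, three_sigma_fraction=0.10):
    """Draw one Gaussian sample whose 3-sigma spread is 10% of nominal."""
    sigma = nominal * three_sigma_fraction / 3
    return rng.gauss(nominal, sigma)


def sigma_over_mean(samples):
    """Normalized variability: sample standard deviation over mean."""
    return statistics.stdev(samples) / statistics.mean(samples)


rng = random.Random(0)    # fixed seed so the run is repeatable
NOMINAL_VTH = 0.30        # hypothetical nominal threshold voltage in volts
vth_samples = [sample_parameter(NOMINAL_VTH, rng) for _ in range(1000)]
ratio = sigma_over_mean(vth_samples)   # close to 0.10 / 3 for a large sample
```

With $3\sigma$ = 10%, the sigma/mean of the sampled parameter itself approaches 1/30; a cell whose metric shows a smaller ratio than its inputs is attenuating the variation.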



Figure 6. Sigma/mean variations of SRAM cells at 900mV supply voltage

In sub-45nm technologies the NBTI effect increases the absolute threshold voltage of transistor over time and creates cell failures [23]. The change in the threshold voltage of the PFETs over time due to NBTI is given by [24]:

$$\Delta V_{th-static} = A \left( (1+\delta)t_{ox} + \sqrt{Ct} \right)^{2n} \tag{1}$$

where *C* is a constant exponentially related to activation energy, *n* is type of the trap generator molecules (PFET is ~1.6), *t* is time,  $t_{ox}$  is oxide thickness and  $\delta$  is the fitting parameter. *A* is related linearly to hole concentration and exponentially to temperature and oxide electric field. The AC variation of the threshold voltage is modeled as:

$$\Delta V_{th-ac} = \alpha \Delta V_{th-static} \tag{2}$$

The value of $\alpha$ is 0.796 for a 50% duty cycle. A higher percentage of ones on the gates of the PFET transistors means a smaller $\alpha$ and proportionally lower threshold variation over time. Data encoding in the proposed cache increases the percentage of 1's on the pull-up PFET transistors, since the left back-to-back inverter (M4) is mostly biased with '1' and the back-gate of M3 is always biased to '1'. This can significantly reduce NBTI degradation.
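As a purely illustrative calculation with Equation (2), assume a hypothetical static shift of 50 mV (this value is not taken from the paper). At a 50% duty cycle the AC shift would be:

$$\Delta V_{th-ac} = \alpha \Delta V_{th-static} = 0.796 \times 50\,\text{mV} = 39.8\,\text{mV}$$

Any duty cycle that places '1' on the PFET gates for longer lowers $\alpha$ below 0.796 and shrinks this shift further, which is the effect the data encoding aims for.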

For workloads that consist mostly of zeros, the new mapping and cell structure yield large improvements in the performance and power of the cache. We used the average percentage of '0' in the SPEC 2006 benchmarks to evaluate and model the NBTI degradation of the cells (section *IV.B*). The variations in the characteristics of the proposed cells are minimal with respect to other cells due to the modifications of the back-gates of the PFET transistors and the data mapping. The BKL cell has a high read failure rate since it biases the PFET back-gates with '0' voltage during hold and read modes. This voltage connection considerably changes the threshold voltage of the PFETs over time. Figure 7 shows the effect of NBTI degradation on the read failure of the different cells after 1000 iterations of Monte Carlo simulation. After three years of NBTI degradation, the AABG and ADWL cells have 2.76X and 1.98X lower read failure than the other cells. In addition, the absolute changes in the hold and read SNMs of the different cells are listed in Table 3. The proposed cells have low SNM variation due to the NBTI effect, demonstrating their robustness. Our future work is to use techniques that uniformly distribute data in the cache, which can efficiently reduce the worst-case failure of the cache.



Figure 7. Read failures of SRAM cells before and after three years NBTI effect with 1000 Monte Carlo iterations

TABLE 3. THE READ AND HOLD SNMS VARIATIONS AFTER THREE YEARS

| SNM change | DG [12]  | ABG [14] | BKL [8] | AABG   | ADWL   |
|------------|----------|----------|---------|--------|--------|
| Hold SNM   | 5.34mV   | 5.35mV   | 36.61mV | 2.38mV | 2.56mV |
| Read SNM   | 12.103mV | 11.04mV  | 7.36mV  | 5.76mV | 7.49mV |

### C. Cache Level Results

In this section, we design a 32KB L1 cache utilizing the different cell structures. The components of the cache (Figure 1), such as sense amplifiers, cells, decoders and buffers, are designed at the transistor level with HSPICE. We used ten SPEC CPU2006 benchmarks to compare the proposed cells to other cell designs. The memory trace of the L1 cache for the different benchmarks is extracted using the GEM5 simulator. Processor configurations are listed in Table 4.

 TABLE 4. BASELINE PROCESSOR CONFIGURATIONS

| Frequency | 2GHz              |
|-----------|-------------------|
| L1 Cache  | 32KB, 2-way, LRU  |
| L2 Cache  | 512KB, 8-way, LRU |
| Memory    | 4G                |

We first use the invert-coding scheme to increase the percentage of zeros. Figure 8 shows the data distribution of the different SPEC CPU2006 benchmarks after applying the invert-coding scheme to the lower 32 bits of data. The static and dynamic power results for the different caches are shown in Figure 9 and Figure 10, respectively. In both figures, the results are normalized to the ABG static and dynamic power. Figure 9 shows that the AABG and ADWL cells have 13% and 17% lower static power than the other cells. The power figures include the power overhead of the added peripheral circuits and flag bit in the proposed caches. A statistical comparison of the memory traces shows that the number of writes in the proposed cells decreases significantly, since most writes store a '0' into a cache that already contains mostly '0' data. The comparison of dynamic power in Figure 10 shows that AABG and ADWL have at least 35% and 12% lower dynamic power than the other cells. The slightly higher dynamic power of the DG cell [12] is due to its high write current and performance, while the access transistor connection in the ABG and AABG cells limits the write/dynamic power.



Figure 8. Data distribution on SPEC 2006 benchmarks after data mapping on lower 32 bits



Figure 9. Normalized static power consumption of different caches with SPEC 2006 benchmarks



Figure 10. Normalized dynamic power consumption of different caches with SPEC 2006 benchmarks

## V. CONCLUSION

We propose a new SRAM architecture that efficiently exploits FinFET transistors for low-power cache design. First, we devised a cache architecture that encodes stored data to have a high proportion of zeros. To fully utilize this asymmetric data dependency, we designed two new cells that improve power consumption and cell robustness while providing the same performance as a traditional SRAM cache. The results showed that the proposed design improves dynamic and static power compared to other state-of-the-art cells while guaranteeing at least ~2X lower failure with less than 1.5% area overhead.

# ACKNOWLEDGEMENT

This work was sponsored in part by NSF grant #1218666.

#### REFERENCES

- [1] R. Joshi, K. Kim, and R. Kanj, "FinFET SRAM design," in Nanoelectronic Circuit Design, ed: Springer, 2011, pp. 55-95.
- [2] A. Datta, A. Goel, R. T. Cakici, et al., "Modeling and circuit synthesis for independently controlled double gate FinFET devices," Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on, vol. 26, pp. 1957-1966, 2007.
- [3] T.-J. King, "FinFETs for nanoscale CMOS digital integrated circuits," in Computer-Aided Design, ICCAD'05. IEEE/ACM International Conference on, 2005, pp. 207-210.
- [4] P. Mishra, A. Muttreja, and N. K. Jha, "Finfet circuit design," in Nanoelectronic Circuit Design, ed: Springer, 2011, pp. 23-54.
- [5] A. Muttreja, N. Agarwal, and N. K. Jha, "CMOS logic design with independent-gate FinFETs," in Computer Design, ICCD'07. 25th International Conference on, 2007, pp. 560-567.
- [6] M. Jafari, M. Imani, M. Ansari, et al., "Design of an ultra-low power 32-bit adder operating at subthreshold voltages in 45-nm FinFET," in Design & Technology of Integrated Systems in Nanoscale Era (DTIS), 2013 8th International Conference on, 2013, pp. 167-168.
- [7] M. Jafari, M. Imani, and M. Fathipour, "Analysis of Power Gating in Different Hierarchical Levels of 2MB Cache, Considering Variation," International Journal of Electronics, vol. 102, 2014.
- [8] A. Carlson, Z. Guo, S. Balasubramanian, et al., "SRAM read/write margin enhancements using FinFETs," Very Large Scale Integration (VLSI) Systems, IEEE Transactions on, vol. 18, pp. 887-900, 2010.
- [9] B. Ebrahimi, A. Afzali-Kusha, and H. Mahmoodi, "Robust FinFET SRAM design based on dynamic back-gate voltage adjustment," Microelectronics Reliability, 2014.
- [10] N. Yadav, S. Jain, M. Pattanaik, et al., "NBTI aware IG-FinFET based SRAM design using adaptable trip-point sensing technique," in Nanoscale Architectures (NANOARCH), 2014 IEEE/ACM International Symposium on, 2014, pp. 122-128.
- [11] M.-L. Fan, Y.-S. Wu, V.-H. Hu, et al., "Investigation of cell stability and write ability of FinFET subthreshold SRAM using analytical SNM model," Electron Devices, IEEE Transactions on, vol. 57, pp. 1375-1381, 2010.
- [12] T. Cakici, K. Kim, and K. Roy, "FinFET based SRAM design for low standby power applications," in Quality Electronic Design, ISQED'07. 8th International Symposium on, 2007, pp. 127-132.
- [13] M. Yamaoka, K. i. Osada, R. Tsuchiya, et al., "Low power SRAM menu for SOC application using Yin-Yang-feedback memory cell technology," in VLSI Circuits, Digest of Technical Papers. 2004 Symposium on, 2004, pp. 288-291.
- [14] Z. Guo, S. Balasubramanian, R. Zlatanovici, et al., "FinFET-based SRAM design," in Proceedings of the international symposium on Low power electronics and design, 2005, pp. 2-7.
- [15] O. Thomas, M. Reyboz, and M. Belleville, "Sub-1V, robust and compact 6T SRAM cell in double gate MOS technology," in Circuits and Systems, ISCAS'07. IEEE International Symposium on, 2007, pp. 2778-2781.
- [16] S. Wang, T. Jin, C. Zheng, et al., "Low power aging-aware register file design by duty cycle balancing," in Proceedings of the Conference on Design, Automation and Test in Europe, 2012, pp. 546-549.
- [17] A. Aggarwal and M. Franklin, "Energy efficient asymmetrically ported register files," in Computer Design, Proceedings. 21st International Conference on, 2003, pp. 2-7.
- [18] M. R. Stan and W. P. Burleson, "Bus-invert coding for low-power I/O," Very Large Scale Integration (VLSI) Systems, IEEE Transactions on, vol. 3, pp. 49-58, 1995.
- [19] "SPEC - http://www.spec.org/cpu2006."
- [20] N. Binkert, B. Beckmann, G. Black, et al., "The gem5 simulator," ACM SIGARCH Computer Architecture News, vol. 39, pp. 1-7, 2011.
- [21] M. Y. Zarei, R. Asadpour, S. Mohammadi, et al., "Modeling symmetrical independent gate FinFET using predictive technology model," in Proceedings of the 23rd ACM international conference on Great lakes symposium on VLSI, 2013, pp. 299-304.
- [22] H. Kawasaki, K. Okano, A. Kaneko, et al., "Embedded bulk FinFET SRAM cell technology with planar FET peripheral circuit for hp32 nm node and beyond," in VLSI Technology, Digest of Technical Papers. 2006 Symposium on, 2006, pp. 70-71.
- [23] V. Huard, C. Parthasarathy, C. Guerin, et al., "NBTI degradation: From transistor to SRAM arrays," in Reliability Physics Symposium, IRPS 2008. IEEE International, 2008, pp. 289-300.
- [24] W. Wang, S. Yang, S. Bhardwaj, et al., "The impact of NBTI effect on combinational circuit: modeling, simulation, and analysis," Very Large Scale Integration (VLSI) Systems, IEEE Transactions on, vol. 18, pp. 173-183, 2010.