
# High-speed and low-power repeater for VLSI interconnects

To cite this article: A. Karthikeyan and P. S. Mallick 2017 J. Semicond. 38 105006



A. Karthikeyan and P. S. Mallick<sup>†</sup>

School of Electrical Engineering, VIT University, Vellore-632014, India

**Abstract:** This paper proposes a repeater for boosting the speed of interconnects with low power dissipation. We have designed and implemented the repeater at the 45 and 32 nm technology nodes. Delay and power dissipation are analyzed for various voltage levels at these technology nodes using Spice simulations. A significant reduction in delay and power dissipation is observed compared to a conventional repeater. The results show that the proposed high-speed low-power repeater has a reduced delay for higher load capacitance. The proposed repeater is also compared with the LPTG CMOS repeater, and the results show that the proposed repeater has reduced delay. The proposed repeater is suitable for high-speed global interconnects and has the capacity to drive large loads.

Key words: repeater; interconnects; delay; power dissipation; charge pump

DOI: 10.1088/1674-4926/38/10/105006

EEACC: 2570

### 1. Introduction

Transistors and interconnects are the basic building blocks of integrated circuits. Scaling of transistors at deep submicron technologies improves their performance. The interconnects, on the other hand, must still deliver adequate speed and power performance while satisfying the ITRS requirements. Repeater insertion at proper intervals reduces the delay of long wires<sup>[1]</sup>. The delay of an interconnect is directly proportional to the square of its length. Inserting repeaters at intermediate points divides the wire into smaller segments, which makes the delay linear in the length of the interconnect.
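The quadratic-to-linear behaviour can be illustrated with a small first-order sketch. It assumes the textbook 50% delay of a distributed RC line, $0.38\,rcL^2$, and models each repeater as a fixed delay `t_repeater`; the interaction between the repeater's drive resistance and the wire is ignored. The function names and all numeric values (`R_PER_MM`, `C_PER_MM`, `T_REPEATER`) are hypothetical and are not taken from this paper.

```python
# Illustrative first-order model only; all numbers below are hypothetical.

def wire_delay(length_mm, r_per_mm, c_per_mm):
    """50% delay of an unbuffered distributed RC wire: 0.38 * r * c * L^2."""
    return 0.38 * r_per_mm * c_per_mm * length_mm ** 2


def repeated_wire_delay(length_mm, segment_mm, r_per_mm, c_per_mm, t_repeater):
    """Delay of the same wire cut into equal segments, each followed by a
    repeater that is modelled as a fixed delay (a simplifying assumption)."""
    n_segments = length_mm / segment_mm
    per_segment = wire_delay(segment_mm, r_per_mm, c_per_mm) + t_repeater
    return n_segments * per_segment


R_PER_MM = 100.0      # ohm per mm (hypothetical)
C_PER_MM = 0.2e-12    # farad per mm (hypothetical)
T_REPEATER = 20e-12   # seconds per repeater (hypothetical)

for length in (2.0, 4.0, 8.0):
    plain = wire_delay(length, R_PER_MM, C_PER_MM)
    buffered = repeated_wire_delay(length, 1.0, R_PER_MM, C_PER_MM, T_REPEATER)
    print(f"L = {length} mm: unbuffered {plain * 1e12:.1f} ps, "
          f"with repeaters {buffered * 1e12:.1f} ps")
```

With a fixed segment length, doubling the wire length quadruples the unbuffered delay but only doubles the repeated-wire delay in this model.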

Scaling down the supply voltage for low-power operation also increases the signal delay. Technology node scaling leads to more power dissipation and increases the signal delay<sup>[2]</sup>. Therefore, a repeater is required that boosts the signal at low operating voltages without compromising speed and power. Banerjee *et al.*<sup>[3]</sup> have developed a method to estimate the repeater size for power optimization, but with an increase in delay. Various algorithms for buffer insertion were developed to reduce the delay in interconnects<sup>[4, 5]</sup>. Repeaters are also used to reduce the crosstalk noise, at the cost of increased power dissipation<sup>[6, 7]</sup>. The power delay product is used as a better criterion to obtain the optimum number of repeaters to reduce the overall delay, power and crosstalk<sup>[7]</sup>. Repeater insertion can approximately cancel the negative effects of wire sizing on delay<sup>[8]</sup>. Optimization of repeaters for delay minimization at scaled voltages leads to area and power savings<sup>[9, 10]</sup>. A temperature analysis of two different repeaters for global copper interconnects has also been reported<sup>[11]</sup>.

The performance of CNT interconnects is better than that of copper interconnects<sup>[12, 13]</sup>. Recently, many researchers have introduced SWCNT- and MWCNT-based interconnects, which reduce the number of repeaters required<sup>[14]</sup>. A multiple equivalent single conductor model was introduced for CNT interconnects<sup>[15]</sup>. The same group has shown that semiconducting carbon nanotubes are more suitable for crosstalk reduction in global interconnects<sup>[16]</sup>. Mixed CNT-based interconnects may be the best solution owing to their superior properties<sup>[17]</sup>. Different optimization techniques in VLSI interconnects<sup>[18]</sup> can help in this regard, because most works concentrate on optimization of repeaters for reduced delay, power dissipation and crosstalk. A high-speed and power-efficient repeater can further reduce the number of repeaters.

Some other works concentrated on alternate repeater insertion methods<sup>[2, 19]</sup>. Sharma *et al.* proposed a new low-power transmission gate (LPTG) CMOS repeater which reduces the power dissipation at the cost of a delay penalty<sup>[2]</sup>. A Schmitt trigger has been used as a buffer for delay and power reduction in VLSI interconnects<sup>[19]</sup>. The alpha power law model is used to obtain accurate modeling of CMOS buffers. A new resistance was introduced for the buffer that can be used for calculation of crosstalk<sup>[20]</sup>. A new approach, connecting the repeaters in parallel, outperforms serial repeaters<sup>[21]</sup>. Boosting the signals in interconnects is more effective in reducing the delay. Boosting can be done at the driver stage or at the repeater stage of an interconnect. High-boosting predrivers have better energy efficiency than a conventional repeater when operating in the subthreshold region<sup>[22]</sup>. Variation-tolerant boosting techniques were introduced to boost the switching speed of the interconnect at critical paths<sup>[23]</sup>, and a variation-tolerant capacitive boosting technique was introduced for subthreshold circuits<sup>[24]</sup>. Lee *et al.*<sup>[25]</sup> have introduced a novel boosting structure using double gate-all-around (DGAA) transistors by controlling the two gates independently. Boosters are bidirectional and provide low impedance, which improves the signal integrity. Three times fewer boosters than repeaters are required for the same interconnect length, thus saving area and power<sup>[26]</sup>. A boosting technique for better power efficiency in on-chip interconnects has also been reported<sup>[27]</sup>. Most of the boosting techniques concentrate on either delay or power dissipation.

This paper analyzes and compares both the delay and power dissipation of the proposed repeater. The proposed repeater was also analyzed for different technology nodes. The organization of this paper is as follows. Section 2 discusses the operation of the conventional repeater, and Section 3 explains the schematic design of the proposed repeater. Section 4 presents the results and discussion, and Section 5 concludes the paper.

© 2017 Chinese Institute of Electronics

<sup>†</sup> Corresponding author. Email: psmallick@vit.ac.in

Received February 2017, revised manuscript received 17 March 2017

Fig. 1. Conventional repeater.

### 2. Conventional repeater

A conventional repeater consists of two CMOS inverters connected in cascade<sup>[1]</sup>. A CMOS inverter is a series combination of PMOS and NMOS devices. It consists of a driver and a load. The propagation delay in a conventional repeater is calculated as half of the sum of rising delay and falling delay<sup>[1]</sup>. Fig. 1 shows the conventional repeater.

Here the width of the PMOS transistor is taken as double the width of the NMOS transistor to obtain an equally sized inverter. The dynamic power dissipation of a conventional repeater is directly proportional to the square of the supply voltage, the load capacitance and the switching frequency. The delay of a conventional buffer depends on the supply voltage and the load capacitance. Scaling down the device size increases the delay and reduces the power dissipation. Increasing the width of the transistors reduces the delay but increases the power dissipation. There is thus a tradeoff between delay and power dissipation.
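The dependence of switching power on supply voltage, load and frequency can be sketched with the textbook first-order expression $P_{\rm dyn} = \alpha C_{\rm L} V_{\rm DD}^2 f$, in which the supply voltage enters squared. The function name, the activity factor `activity` and the numeric values below are illustrative assumptions, not figures from this paper.

```python
# First-order switching-power estimate; all numeric values are hypothetical.

def dynamic_power(c_load, v_dd, freq, activity=1.0):
    """Return activity * C_L * V_DD^2 * f in watts."""
    return activity * c_load * v_dd ** 2 * freq


p_nominal = dynamic_power(c_load=10e-15, v_dd=1.0, freq=1e9)
p_scaled = dynamic_power(c_load=10e-15, v_dd=0.5, freq=1e9)

print(f"nominal supply: {p_nominal * 1e6:.2f} uW")
print(f"halved supply:  {p_scaled * 1e6:.2f} uW")
```

In this model, halving the supply voltage cuts the switching power to one quarter, which is why voltage scaling is attractive for power even though it increases delay.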

### 3. Schematic design of the proposed high-speed and low-power repeater

Fig. 2 shows the proposed high-speed low-power repeater, which includes a CMOS transmission gate. The transmission gate is constructed by connecting the input signal of the CMOS inverter to the gate terminals of NMOS N1 and PMOS P2, and the output of the CMOS inverter to the gate terminals of PMOS P1 and NMOS N2. The output of the transmission gate is connected to the output node with the load capacitance $C_{\rm L}$. The output node is in turn connected to a CMOS inverter and then to a charge pump $C_{\rm pump}$. The charge pump is connected between the output of this CMOS inverter and node n1. The charge pump is charged for a logic 1 and discharges through node n1 for a logic 0. Three PMOS transistors are connected in series and three NMOS transistors in parallel to reduce the propagation delay and power dissipation of the repeater.

Fig. 2. High-speed, low-power repeater.

While passing logic 0 at the input, the output node also becomes logic 0. The inverted output of the CMOS inverter at the driver is connected to the gates of NMOS transistors N3, N4 and N5, and turns on all three transistors to perform a strong pull-down, which reduces the falling delay. The output node, which is connected to a CMOS inverter at the load side, turns on the PMOS transistor of that inverter and charges the capacitor $C_{\text{pump}}$ to the high state. While passing logic 1, the output node also becomes logic 1, which turns on the NMOS transistor of the CMOS inverter at the load side. This pulls down $C_{\text{pump}}$, and the charge stored on $C_{\text{pump}}$ starts discharging through node n1, making the voltage at node n1 greater than $V_{\text{DD}}$. The PMOS transistors P3, P4 and P5 are also turned on, and the boosted voltage, greater than $V_{\text{DD}}$, applied to the source of P3 pulls the output node to the high state. Due to this the rising delay of the output node is reduced.

The rising delay is nevertheless higher than that of a conventional repeater because of the series connection of the PMOS transistors. The falling delay is gradually reduced by the NMOS transistors connected in parallel, so the widths of the NMOS transistors are kept fixed. The CMOS inverters connected to the driver and load sides are equally sized inverters. The widths of the PMOS transistors P3, P4 and P5 connected in series can be increased to reduce the rising delay. Increasing the width of P4 results in lower power dissipation than increasing the width of P3 or P5, because P4 is connected between P3 and P5, whereas P3 and P5 are connected directly to the nodes. The widths of the NMOS and PMOS transistors are kept at a 1 : 2 ratio, the same as in a conventional repeater. The width of PMOS transistor P4 is increased to 8 times the width of the NMOS transistors to improve the signal strength. The chance of a short circuit through P4 is also lower than when varying the widths of P3 and P5, since P4 is not connected directly to any node or to other NMOS transistors. Varying the width of P4 therefore plays a major role in reducing the delay or the power dissipation. In a conventional repeater, by contrast, increasing or decreasing the transistor width increases either the power dissipation or the delay. Fig. 3 shows the delay of the conventional repeater and the proposed repeater.

The dotted lines show the delay of the conventional repeater and the solid lines show the delay of the proposed repeater. The rising delay of the proposed repeater is slightly higher and its falling delay is lower, which leads to a reduced average propagation delay for the proposed repeater. Transmission gates are used here at the input side. Transmission gates are also used for high-frequency applications<sup>[28]</sup>.

Fig. 3. Delay of proposed repeater versus conventional repeater.

### 4. Results and discussions

This section gives the results of our proposed approach. The proposed repeater was validated by Spice simulations. In the simulations, the predictive technology model (PTM)<sup>[29]</sup> at 45 and 32 nm was employed.

Fig. 4. (Color online) Rising delay and falling delay at 45 nm technology node due to voltage scaling.

#### 4.1. Delay analysis

##### 4.1.1. Delay due to voltage scaling and technology node scaling

The rising delay and falling delay of the conventional repeater and the proposed repeater at the 45 and 32 nm technology nodes are shown in Figs. 4 and 5, respectively.

The conventional CMOS repeater has a lower rising delay than the proposed repeater. The falling delay of the proposed repeater is much lower than that of the conventional repeater. Scaling the technology node leads to an increase in delay. At the 32 nm technology node, both the rising delay and the falling delay of both repeaters are increased. The reduction in falling delay is larger for the proposed repeater at 32 nm. Because equally sized CMOS inverters are used in conventional repeaters, their rising and falling delays are almost the same.

For both cases, the delay increases due to scaling of voltage as well as the technology node. The overall propagation delay depends on falling delay and rising delay in the case of an inverter. The equation for propagation delay is as follows:

$$t_{\rm p} = \frac{t_{\rm pHL} + t_{\rm pLH}}{2}. \tag{1}$$

Fig. 5. (Color online) Rising delay and falling delay at 32 nm technology node due to voltage scaling.

Fig. 6 shows the propagation delay due to voltage scaling at the 45 and 32 nm technology nodes.

Fig. 6. (Color online) Propagation delay due to voltage scaling.
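As an illustration of how Eq. (1) might be evaluated from simulated waveforms, the sketch below measures the 50% crossing times of an input and an output waveform and averages the falling and rising delays. The helper names and the sample waveforms are hypothetical; they are not the measurement setup used for the results reported here.

```python
# Hypothetical post-processing sketch; waveforms are made-up sample data.

def crossing_time(times, values, level, rising):
    """First time the waveform crosses `level` in the requested direction,
    using linear interpolation between adjacent samples."""
    for i in range(1, len(times)):
        v0, v1 = values[i - 1], values[i]
        crosses_up = rising and (v0 < level <= v1)
        crosses_down = (not rising) and (v0 > level >= v1)
        if crosses_up or crosses_down:
            frac = (level - v0) / (v1 - v0)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    raise ValueError("waveform never crosses the requested level")


def propagation_delay(t_phl, t_plh):
    """Eq. (1): average of the high-to-low and low-to-high delays."""
    return (t_phl + t_plh) / 2.0


# Times in ps, voltages normalised to a 1 V swing (sample data only).
half = 0.5
t_plh = (crossing_time([0, 40, 60], [0, 0, 1], half, rising=True)
         - crossing_time([0, 10], [0, 1], half, rising=True))
t_phl = (crossing_time([0, 20, 40], [1, 1, 0], half, rising=False)
         - crossing_time([0, 10], [1, 0], half, rising=False))

print(t_plh, t_phl, propagation_delay(t_phl, t_plh))
```

Because a repeater is non-inverting, the rising delay is measured between rising edges of input and output, and the falling delay between falling edges.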

At both technology nodes and across the scaled voltages, the propagation delay of the proposed repeater is lower than that of the conventional repeater. The proposed repeater is therefore more suitable for global interconnects.

##### 4.1.2. Delay due to load scaling and technology node scaling

The delay of an inverter is directly proportional to the load and inversely proportional to the supply voltage. Figs. 7 and 8 show the rising delay and falling delay at the 45 and 32 nm technology nodes for varying load capacitance. As the load capacitance increases, the rising delay of the conventional repeater becomes lower, and its falling delay higher, than that of the proposed repeater.

At the 32 nm technology node, the increase in rising delay and falling delay of the proposed repeater is much smaller than that of the conventional repeater. Here, the rising and falling delays of the proposed repeater vary linearly with the technology node. Fig. 9 shows the propagation delay due to the load capacitance.

The propagation delay is lower at both technology nodes for the proposed repeater than for the conventional repeater. The proposed repeater is therefore more suitable for driving high loads at high speed.

Fig. 7. (Color online) Rising delay and falling delay at 45 nm technology node due to load scaling.

Fig. 8. (Color online) Rising delay and falling delay at 32 nm technology node due to load scaling.

Table 1. Comparison with other published repeaters.

| Parameter              | Ref. [2] (Conventional) | Ref. [2] (LPTG CMOS) | Proposed work |
|------------------------|-------------------------|----------------------|---------------|
| Technology             | 32 nm bulk              | 32 nm bulk           | 32 nm bulk    |
| Delay (ps)             | 350–400                 | 1000–1100            | 192.2         |
| Power dissipation (nW) | 35–40                   | 4–6                  | 18.8          |

#### 4.2. Power dissipation analysis

A common technique to reduce power dissipation is to reduce the supply voltage. Scaling the technology node increases the power dissipation. Here the power dissipation is analyzed by scaling the voltage at the 45 and 32 nm technology nodes. The proposed repeater has lower power dissipation than the conventional CMOS repeater, as shown in Fig. 10.

#### 4.3. Comparison of delay and power dissipation of the proposed repeater

The proposed high-speed and low-power repeater was also validated and compared with the conventional repeater<sup>[2]</sup> and the LPTG CMOS buffer<sup>[2]</sup> using Berkeley predictive technology model (BPTM) BSIM4 bulk CMOS files at 32 nm technology at a nominal $V_{\text{DD}}$ of 0.8 V.

Fig. 9. (Color online) Propagation delay due to load scaling.

Fig. 10. (Color online) Power dissipation at both technology nodes.

Table 1 shows the comparison with other published repeaters.

The results show that the delay of the proposed repeater is lower than that of both the conventional repeater<sup>[2]</sup> and the LPTG CMOS buffer<sup>[2]</sup>; its power dissipation is lower than that of the conventional repeater but higher than that of the LPTG CMOS buffer. The power delay product of the proposed high-speed and low-power repeater is better than that of the conventional repeater and the LPTG CMOS buffer.
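The power delay product comparison can be checked directly from the entries of Table 1. The sketch below multiplies the delay and power values, treating the published ranges as independent lower and upper bounds (an assumption made here for illustration); the dictionary layout and function name are hypothetical.

```python
# Delay and power values copied from Table 1; ranges stored as (low, high).
TABLE_1 = {
    "Conventional [2]": {"delay_ps": (350.0, 400.0), "power_nw": (35.0, 40.0)},
    "LPTG CMOS [2]": {"delay_ps": (1000.0, 1100.0), "power_nw": (4.0, 6.0)},
    "Proposed": {"delay_ps": (192.2, 192.2), "power_nw": (18.8, 18.8)},
}


def pdp_range_aj(delay_ps, power_nw):
    """Return (min, max) power delay product in attojoules.

    1 ps * 1 nW = 1e-21 J = 1e-3 aJ. Both quantities are positive, so the
    bounds of the product are the products of the bounds.
    """
    low = delay_ps[0] * power_nw[0] * 1e-3
    high = delay_ps[1] * power_nw[1] * 1e-3
    return low, high


for name, row in TABLE_1.items():
    low, high = pdp_range_aj(row["delay_ps"], row["power_nw"])
    print(f"{name}: {low:.2f} to {high:.2f} aJ")
```

Under this reading of the table, the proposed repeater comes out at about 3.6 aJ, against roughly 12.3 to 16 aJ for the conventional repeater and 4.0 to 6.6 aJ for the LPTG CMOS buffer.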

### 5. Conclusion

This paper has compared the performance of a conventional CMOS repeater with that of a new repeater in terms of power dissipation and delay. The delay is reduced by increasing the width of the transistors and by boosting the signal: boosting slightly reduces the rising delay, and the parallel connection of NMOS transistors reduces the falling delay. The power dissipation is reduced by a proper choice of transistor width. The results show that the delay and power dissipation are 41% and 29% lower, respectively, than those of the conventional repeater. The proposed repeater can be used for global interconnects and is more suitable for driving larger loads. The proposed repeater is also more suitable for critical path circuits.

### References

- [1] Rabaey J, Chandrakasan A, Nikolic B. Digital integrated circuits: a design perspective. Prentice Hall of India, 2003
- [2] Sharma V K, Pattanaik M. VLSI scaling methods and low power CMOS buffer circuits. J Semicond, 2013, 34(9): 095001
- [3] Banerjee K, Mehrotra A. A power-optimal repeater insertion methodology for global interconnects in nanometer designs. IEEE Trans Electron Devices, 2002, 49(11): 2001
- [4] Alpert J C, Devgan A, Quay T S. Buffer insertion for noise and delay optimization. IEEE Trans Comput-Aided Des Integr Circuits Syst, 1999, 18(2): 1633
- [5] Wang X, Liu W, Yu M. A distinctive O (mn) time algorithm for optimal buffer insertions. IEEE International Symposium on Quality Electronic Design, 2015, 16: 293
- [6] Kaushik B K, Sarkar S, Agarwal R P, et al. Crosstalk analysis and repeater insertion in crosstalk aware coupled VLSI interconnects. Microelectron Int, 2006, 23(3): 55
- [7] Kaushik B K, Agarwal R P, Sarkar S. Repeater insertion in crosstalk-aware inductively and capacitively coupled interconnects. Int J Circuit Theory Appl, 2011, 39(6): 629
- [8] Hasani F, Masoumi N. Interconnect sizing and spacing with consideration of buffer insertion for simultaneous crosstalk-delay optimization. International Conference on Design & Technology of Integrated Systems in Nanoscale Era, 2008, 3: 1
- [9] Chandel R, Sarkar S, Agarwal R P. Repeater insertion in global interconnects in VLSI circuits. Microelectron Int, 2005, 22(1): 43
- [10] Chandel R, Sarkar S, Agarwal R P. An analysis of interconnect delay minimization by low-voltage repeater insertion. Microelectron J, 2007, 38(4): 649
- [11] Alizadeh A, Sarvari R. On temperature dependency of delay for local, intermediate, and repeater inserted global copper interconnects. IEEE Trans Very Large Scale Integr (VLSI) Syst, 2015, 23(12): 3143
- [12] Libo Q, Zhangming Z, Ruixue D, et al. Circuit modeling and performance analysis of SWCNT bundle 3D interconnects. J Semicond, 2013, 34(9): 095014

- [13] Zhao W S, Wang G, Sun L. Repeater insertion for carbon nanotube interconnects. IET Micro Nano Lett, 2014, 9(5): 337
- [14] Liang F, Wang G, Ding W. Estimation of time delay and repeater insertion in multiwall carbon nanotube interconnects. IEEE Trans Electron Devices, 2011, 58(8): 2712
- [15] Sathyakam P U, Mallick P S. Transient analysis of mixed carbon nanotube bundle interconnects. Electron Lett, 2011, 47(20): 1134
- [16] Sathyakam P U, Karthikeyan A, Mallick P S. Role of semiconducting carbon nanotubes in crosstalk reduction of CNT interconnects. IEEE Trans Nanotechnol, 2013, 12(5): 662
- [17] Sathyakam P U, Mallick P S. Towards realization of mixed carbon nanotube bundles as VLSI interconnects: a review. Nano Comm Netw, 2012, 3(3): 175
- [18] Karthikeyan A, Mallick P S. Optimization techniques for CNT based VLSI interconnects — a review. J Circuits, Syst, Comput, 2017, 26(3): 173002
- [19] Saini S, Kumar A M, Veeramachaneni S. An alternative approach to buffer insertion for delay and power reduction in VLSI interconnects. IEEE International Conference on VLSI Design 2010, 23: 411
- [20] Mehri M, Kouhani M H M, Masoumi N, et al. New approach to VLSI buffer modeling considering overshooting effect. IEEE Trans Very Large Scale Integr (VLSI) Syst, 2013, 21(8): 1568
- [21] Awwad F R, Nekili M, Ramachandran V, et al. On modeling of parallel repeater-insertion methodologies for SoC interconnects. IEEE Trans Circuits Syst, 2008, 55(1): 322
- [22] Ho Y, Chen H K, Su C. Energy-effective sub-threshold interconnect design using high-boosting predrivers. IEEE J Emerg Sel Top Circuits Syst, 2012, 2(2): 307
- [23] Shim K N, Hu J. Boostable repeater design for variation resilience in VLSI interconnects. IEEE Trans Very Large Scale Integr (VLSI) Syst, 2013, 21(9): 1619
- [24] Kil J, Gu J, Kim C. A high-speed variation-tolerant interconnect technique for sub-threshold circuits using capacitive boosting. IEEE Trans Very Large Scale Integr (VLSI) Syst, 2008, 16(4): 456
- [25] Lee J, Ryu M, Kim Y. On-chip interconnect boosting technique by using of 10-nm double gate-all-around (DGAA) transistor. IEICE Electron Express, 2015, 12(12): 1
- [26] Nalamalpu A, Srinivasan S, Burleson P W. Boosters for driving long on chip interconnects- design issues, interconnect synthesis, and comparison with repeaters. IEEE Trans Comput-Aided Des Integr Circuits Syst, 2002, 21(1): 50
- [27] Nigussie E, Tuuna S, Plosila J. Boosting performance of self-timed delay-insensitive bit parallel on-chip interconnects. IET Circuits Dev Syst, 2011, 5(6): 505
- [28] Gholami M. Phase frequency detector using transmission gates for high speed applications. Int J Eng Trans A, 2016, 29(7): 916
- [29] Predictive technology model (PTM). www.ptm.asu.edu