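# National Science Council, Executive Yuan: Research Project Final Report

# X-Architecture Clock Routing Synthesis Associated with Design for Manufacturability to the Application of Multi-Voltage Island Environment (I)

## Research Results Report (Condensed Version)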

| Item | Details |
|---|---|
| Project type | Individual |
| Project number | NSC 98-2221-E-343-006- |
| Project period | August 1, 2009 to July 31, 2010 |
| Institution | Department of Computer Science and Information Engineering, Nanhua University |

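Principal investigator: Chia-Chun Tsai (蔡加春)

- Project participants: undergraduate part-time assistants: 王伯堯 and others (five in total); doctoral student part-time assistant: Chung-Chieh Kuo (郭仲傑)
- Report attachments: report on attending an international conference, and published papers

Handling: This project involves patents or other intellectual property rights; open to public inquiry after 1 year

### September 27, 2010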

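## National Science Council, Executive Yuan: Final Report of Subsidized Research Project

Research on Applying Integrated Design-for-Manufacturability Methods to X-Architecture Clock Routing Synthesis

Project type: ☑ Individual project □ Integrated project. Project number: NSC 98-2221-E-343-006-. Project period: August 1, 2009 to July 31, 2010

Principal investigator: Professor Chia-Chun Tsai (蔡加春)

Co-principal investigator:

Project participants: Chung-Chieh Kuo (郭仲傑, doctoral student), Lin-Jeng Gu (古琳正, master's student), and temporary undergraduate students

Institution: Department of Computer Science and Information Engineering, Nanhua University

September 26, 2010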

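### X-Architecture Clock Routing Synthesis Associated with Design for Manufacturability to the Application of Multi-Voltage Island Environment (I)

Project type: Individual project. NSC project number: NSC 98-2221-E-343-006. Project period: August 1, 2009 to July 31, 2010. Principal investigator: Professor Chia-Chun Tsai, Department of Computer Science and Information Engineering, Nanhua University

**Abstract (translated from the Chinese):** VLSI design technology has entered the nanometer era, and power consumption has become a key factor in evaluating the performance of a chip system. Voltage-island layout environments are now very common and are among the most effective ways to save power. This work constructs X-clock trees in multi-voltage-island layout environments in integrated circuit design. By exploring the combinations of voltage-island connections and the combinations of level-shifter insertions, it finds the combination of sub-X-clock trees with minimum power consumption. The method applies to the divided construction and integration of X-clock trees in layouts whose voltage islands have different block placements and different supply voltages. Experimental results on 10 benchmarks show that X-clock trees with two and three voltage islands reduce average power consumption by 11.1% and 19.6%, respectively, compared with a single voltage island.

**Keywords (translated from the Chinese):** X-architecture clock tree, multi-voltage island, level shifter, power consumption.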

**Abstract:** As VLSI technology advances into the nanometer era, power consumption becomes a critical issue in evaluating the performance of a chip system. Voltage-island design methodology, which uses multiple supply voltages, is one of the efficient ways to reduce overall power consumption. This work proposes an algorithm to complete an X-clock tree that connects several voltage-islands. We first construct the X-clock tree for each voltage-island and then combine these X-clock trees based on a well-defined connection with inserted level-shifters to reduce power consumption. Experimental results show that two- and three-voltage-island-based X-clock trees save 11.1% and 19.6% in power consumption, respectively, compared with single-voltage-island trees.

Keywords: X-clock Tree, Multi-voltage Island, Level Shifter, Power Consumption.

#### I. INTRODUCTION

Due to advances in VLSI fabrication technology, a large number of devices can be implemented on a single chip. Hence, power consumption has become an important issue in nanometer chip design. Nowadays, a system-on-chip (SoC) usually has multiple operating frequencies because the required performance of its functional blocks is not the same [1]. The voltage-island design methodologies [2-4] assign different supply voltages to individual functional blocks in a system to reduce power consumption. For instance, a performance-critical block (e.g., a processor) requires the highest supply voltage, while other blocks (e.g., control logic and peripheral units) can operate at lower voltages. When a signal crosses voltage islands, a level-shifter (*LS*) has to be inserted into any interconnection that transmits the signal from a low-voltage island to a high-voltage one, because a circuit may suffer from excessive leakage energy when low-voltage gates directly drive high-voltage ones [1, 4].

Many works have addressed the floorplanning and placement issues [8-9] of multi-voltage designs. However, constructing a clock tree that connects all the clock sinks of the voltage-islands is a new challenge that has received little attention. Generally, there are two straightforward approaches. The first is to complete the whole clock tree routing, then insert level-shifters as a post-refinement step and compensate the clock skew to zero. Figure 1(a) shows a clock tree that connects three voltage-islands (Islands 1, 2, and 3 operate at 1.0V, 1.1V, and 1.2V, respectively). Level-shifters are therefore required on the interconnections from the low-voltage island (Island 1) to the two higher-voltage islands (Islands 2 and 3). Notably, the clock skew has been refined to zero.

The other approach is to route the subclock tree for each voltage-island, then combine all the subclock trees into a complete one with level-shifter insertion. Figure 1(b) shows a clock tree routing that combines three subclocks ($CLK_1$, $CLK_2$, and $CLK_3$) with two level-shifters that respectively connect Island 1 (1.0V) to Island 2 (1.1V) and to Island 3 (1.2V). Compared with the first approach shown in Fig. 1(a), the second saves four level-shifters.

In this work, we apply the second approach to complete the clock routing on multi-voltage islands. As shown in Fig. 1(b), the system clock source adopts the 1.0V supply of Island 1, and two level-shifters are required to drive Islands 2 and 3. That is, the two subclocks $CLK_2$ and $CLK_3$ are combined first, and then $CLK_1$ is integrated with them through two level-shifters. Alternatively, we may select the 1.1V supply of Island 2 for the system clock source, such that the subclocks $CLK_1$ and $CLK_3$ are combined first and a level-shifter is inserted when $CLK_2$ is integrated with them. This shows that different combinations of subclocks lead to different whole-chip clock routings with distinct inserted level-shifters, clock delay, and power consumption. However, delay and power consumption always trade off against each other. The problem of clock routing on multi-voltage islands is defined as follows.

Given a set of clock sinks on a set of voltage-islands, the objective is to construct a multi-voltage-island-based zero-skew clock tree with minimum power consumption.



Fig. 1. Clock routings on three voltage-islands constructed from (a) approach A with six LSs and (b) approach B with two LSs.

To solve this problem, we first apply the PMXF algorithm [5] to construct the subclock tree of each voltage-island with the X-architecture routing scheme, because the X-architecture performs better than the Manhattan architecture in delay, wirelength, and power consumption. Then, we combine all the subclocks according to the different combinations of voltage-islands and obtain an integrated X-clock tree by inserting the required level-shifters to achieve minimum power consumption.

#### II. WIRE AND LEVEL-SHIFTER MODELS FOR DELAY AND POWER CALCULATION

The fitted Elmore delay (FED) model [6] is widely used for wire delay calculation in clock tree synthesis. A wire *wire<sub>i</sub>* with width $w_i$ and length $l_i$ under the FED model is shown in Fig. 2(a), where $r$, $c_a$, and $c_f$ are the sheet resistance, unit area capacitance, and fringing capacitance, respectively. The delay of *wire<sub>i</sub>* with a loading capacitance $C_{L,i}$ at *sink<sub>i</sub>* is formulated as follows.

$$Delay(i) = (rl_i / w_i) \left[ 0.5(Dc_a w_i + Ec_f) l_i + FC_{L,i} \right] \tag{1}$$

where the coefficients D, E, and F are obtained by using the curve fitting techniques [6].
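A minimal Python sketch of Eq. (1) follows. The function name `fed_wire_delay`, its argument names, and the example wire are illustrative assumptions rather than part of the original implementation. The default constants are the Table I values, taken with the units printed there; with resistance in Ω and capacitance in fF, the product comes out in femtoseconds.

```python
import math

LN2 = math.log(2.0)

# Table I values (130nm process), units as printed in the table.
R_SHEET = 0.623        # r
C_AREA = 0.00598       # c_a
C_FRINGE = 0.043       # c_f
D = 1.12673 * LN2      # fitted coefficients of the FED model
E = 1.10463 * LN2
F = 1.04836 * LN2


def fed_wire_delay(length, width, c_load,
                   r=R_SHEET, c_a=C_AREA, c_f=C_FRINGE, d=D, e=E, f=F):
    """Eq. (1): delay of a wire of the given length and width driving c_load."""
    wire_resistance = r * length / width
    wire_capacitance = (d * c_a * width + e * c_f) * length
    return wire_resistance * (0.5 * wire_capacitance + f * c_load)


# Hypothetical wire: 1000 um long, 1 um wide, 10 fF sink load.
example_delay = fed_wire_delay(1000.0, 1.0, 10.0)
print(example_delay)
```

Setting all fitted coefficients to 1 reduces the expression to the classical Elmore form, which is a convenient sanity check when changing parameters.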



Fig. 2. Equivalent circuits of (a) a wire and (b) a level-shifter.

The level-shifter (*LS*) is a key component in multiple voltage-island design. An *LS* is inserted into the interface from a low-voltage island to a high-voltage island. As reported in [1], a level-shifter acts like a conventional buffer: it consumes power and affects delay. Figure 2(b) shows that the equivalent circuit of a level-shifter contains the intrinsic delay $T_{LS}$, input capacitance $c_{LS}$, and output resistance $r_{LS}$. When a level-shifter drives the wire *wire<sub>i</sub>* with a loading capacitance $C_{L,i}$ shown in Fig. 2(a), the delay is formulated as follows.

$$Delay(i) = T_{LS} + (r_{LS} + rl_i / w_i) \left[ 0.5(Dc_a w_i + Ec_f) l_i + FC_{L,i} \right] \tag{2}$$
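Eq. (2) can be sketched in the same way. The function below is a hypothetical helper, not the report's code, and it expects all arguments in consistent units. In the example, the Table I value $T_{LS}$ = 54.4 ps is converted to femtoseconds because Ω times fF gives fs; that conversion is an assumption about how the printed units combine.

```python
import math


def ls_wire_delay(t_ls, r_ls, length, width, c_load, r, c_a, c_f, d, e, f):
    """Eq. (2): level-shifter (intrinsic delay t_ls, output resistance r_ls)
    driving a wire of the given length and width that ends in c_load."""
    wire_capacitance = (d * c_a * width + e * c_f) * length
    total_resistance = r_ls + r * length / width
    return t_ls + total_resistance * (0.5 * wire_capacitance + f * c_load)


LN2 = math.log(2.0)
# Wire and fitting parameters from Table I, units as printed there.
WIRE = dict(r=0.623, c_a=0.00598, c_f=0.043,
            d=1.12673 * LN2, e=1.10463 * LN2, f=1.04836 * LN2)

# Hypothetical wire: 1000 um long, 1 um wide, 10 fF load.
with_ls = ls_wire_delay(54.4e3, 250.0, 1000.0, 1.0, 10.0, **WIRE)  # T_LS in fs
without_ls = ls_wire_delay(0.0, 0.0, 1000.0, 1.0, 10.0, **WIRE)    # reduces to Eq. (1)
print(with_ls, without_ls)
```

With `t_ls` and `r_ls` set to zero the expression collapses to Eq. (1), so the difference between the two printed values is the delay added by the level-shifter for this particular wire.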

Power consumption is one measure of chip performance. For a voltage-island-based clock tree, the calculation of total power consumption $P_{total}$ should include the equivalent wire capacitances of interconnections, the input capacitances of inserted level-shifters, and the loading capacitances of the multi-voltage islands. Thus, $P_{total}$ is formulated as follows.

$$P_{total} = \sum_{\forall e_i} C_{load,i} F_{clk} V_{dd}^2 \tag{3}$$

where  $C_{load,i}$ ,  $F_{clk}$ , and  $V_{dd}$  are the capacitance of  $sink_i$  (or *node<sub>i</sub>*), clock frequency, and supplying voltage, respectively.
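As a rough illustration of Eq. (3), the sketch below sums $C F_{clk} V_{dd}^2$ over a list of capacitive loads. Pairing each capacitance with the supply voltage of the island it belongs to is an assumption made here for the multi-voltage case; the function name and the two 1 pF loads are hypothetical.

```python
def clock_tree_power(loads, f_clk):
    """Eq. (3): sum of C * F_clk * Vdd^2 over all capacitive loads.

    loads: iterable of (capacitance_in_farads, vdd_in_volts) pairs, where
           each capacitance is charged at its own island's supply voltage
           (an assumption of this sketch).
    f_clk: clock frequency in Hz.
    """
    return sum(c * f_clk * vdd ** 2 for c, vdd in loads)


F_CLK = 100e6  # 100 MHz, as in Table I

# Hypothetical loads: 1 pF in a 1.0 V island and 1 pF in a 1.2 V island.
loads = [(1e-12, 1.0), (1e-12, 1.2)]
total_power = clock_tree_power(loads, F_CLK)
print(total_power)  # watts
```

Because the voltage enters squared, the same capacitance costs 1.44 times more power at 1.2 V than at 1.0 V, which is why keeping as much of the tree as possible in the low-voltage domain matters.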

#### III. MULTI-VOLTAGE ISLAND X-CLOCK TREE FOR POWER MINIMIZATION

Depending on different functional requirements, each island can operate at a specified supplying voltage, such that the total power consumption of a chip can be reduced. This work proposes a multi-voltage-island-based X-clock tree construction (MuVIX) algorithm to minimize the power consumption. Figure 3 shows the proposed MuVIX algorithm.

| Algorithm: MuVIX (Multi-voltage-island-based X-clock tree construction) |
|---|
| <b>Input:</b> A set of voltage-islands $VI$ and a set of supply voltages $SV$ |
| <b>Output:</b> A multi-voltage-island-based X-clock tree with minimum power consumption. |
| 1 $SV_{sys} \leftarrow$ Determine the supply voltage for the system clock source. |
| 2 $PMXF(VI)$; /\* construct an X-clock tree for each voltage-island $\in VI$ \*/ |
| 3 Let each constructed X-clock tree be a leaf-node. |
| 4 $CS(VI) \leftarrow$ Obtain the connection sequences of $VI$. |
| 5 for each $vi \in CS(VI)$ |
| 6 { $CS(LS) \leftarrow$ Obtain the connection sequences for level-shifter insertion. |
| 7 do |
| 8 { Make combination for each $ls \in CS(LS)$ } |
| 9 while (<i>power</i> is improved) |
| 10 } |

Fig. 3. The proposed MuVIX algorithm.

In the algorithm, for a given set of voltage-islands, denoted as $VI$, and a set of supply voltages, denoted as $SV$, the supply voltage for the system clock source, denoted as $SV_{sys}$, is determined first. Then, the PMXF algorithm [5] constructs the X-clock tree for each voltage-island in $VI$ and marks each constructed X-clock tree as a leaf-node. To connect the leaf-nodes with minimum power consumption, all the connection sequences of voltage-islands, denoted as $CS(VI)$, are enumerated. For a connection sequence $vi \in CS(VI)$ and the chosen $SV_{sys}$, level-shifters must be inserted at the interfaces from low-voltage to high-voltage islands. The set of connection sequences with level-shifter insertion is denoted as $CS(LS)$. After that, we combine all the leaf-nodes of the island-based X-clock trees and calculate the power consumption for each connection sequence $ls \in CS(LS)$. Finally, we obtain a multi-voltage-island-based X-clock tree with the connection sequence that gives minimum power consumption.
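The control flow described above can be sketched as a small search skeleton. This is a simplification under stated assumptions: the per-island X-clock trees are taken as already built (PMXF is not reproduced), the search is exhaustive instead of stopping when power no longer improves, and `ls_options` and `evaluate_power` are hypothetical callbacks standing in for level-shifter insertion and power evaluation. The toy callbacks in the example have no physical meaning and only exercise the loop.

```python
from itertools import permutations


def muvix_search(islands, ls_options, evaluate_power):
    """Skeleton of the MuVIX search loop.

    islands: dict mapping island name -> supply voltage (one leaf-node each).
    ls_options: hypothetical callback, (sequence, sv_sys) -> iterable of
                level-shifter insertion choices for that sequence.
    evaluate_power: hypothetical callback, (sequence, ls_choice) -> power.
    Returns (sv_sys, (best_power, best_sequence, best_ls_choice)).
    """
    sv_sys = min(islands.values())                 # step 1
    best = None
    for seq in permutations(islands):              # steps 4-5
        for ls_choice in ls_options(seq, sv_sys):  # steps 6-9
            power = evaluate_power(seq, ls_choice)
            if best is None or power < best[0]:
                best = (power, seq, ls_choice)
    return sv_sys, best


ISLANDS = {"Island1": 1.0, "Island2": 1.1, "Island3": 1.2}


def toy_ls_options(seq, sv_sys):
    # Placeholder: a single choice that shifts every island above sv_sys.
    return [tuple(name for name in seq if ISLANDS[name] > sv_sys)]


def toy_power(seq, ls_choice):
    # Placeholder cost: later positions in the sequence are weighted more.
    return sum((i + 1) * ISLANDS[name] ** 2 for i, name in enumerate(seq))


sv_sys, best = muvix_search(ISLANDS, toy_ls_options, toy_power)
print(sv_sys, best)
```

Passing the cost model in as a callback keeps the enumeration separate from the delay and power formulas, so either part can be replaced independently.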

#### A. Determine Supplying Voltage for System Clock Source

Before constructing the system clock tree that connects all the island-based subclock trees, we define the supply voltage for the system clock source $SV_{sys}$ as follows.

$$SV_{sys} = \min_{\forall vi_k \in VI} SV_k \tag{4}$$

Each voltage-island $vi_k \in VI$ can operate at several supply voltages $SV_k = \{sv_1, sv_2, ...\}$. In this work, we set the lowest supply voltage of all the islands as $SV_{sys}$ in the expectation of minimum power consumption, although some level-shifters are then required at the interfaces from low-voltage to high-voltage islands.
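Read as "the lowest supply voltage over all islands", Eq. (4) is a nested minimum. The snippet below is a small sketch of that reading; the function name and the example voltage sets are assumptions chosen for illustration.

```python
def system_clock_voltage(island_voltages):
    """Eq. (4): the lowest supply voltage over every island's set SV_k.

    island_voltages: dict mapping island name -> collection of the supply
                     voltages that island can operate at.
    """
    return min(min(sv_k) for sv_k in island_voltages.values())


# Hypothetical voltage sets for three islands.
SV = {"Island1": {1.0}, "Island2": {1.1, 1.2}, "Island3": {1.2}}
sv_sys = system_clock_voltage(SV)
print(sv_sys)  # 1.0
```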

#### B. Construct X-clock Tree for a Voltage-Island

To construct the X-clock tree for each voltage-island, we apply the PMXF algorithm [5]. Given a chip with several voltage-islands, as shown in Fig. 4(a), PMXF connects all the clock sinks in one voltage-island with the X-architecture routing scheme to complete a subclock tree. Figure 4(b) shows the sub-X-clock tree for Island 2. The system clock enters the clock source of Island 2, denoted as $CLK_2$, to drive all the clock sinks synchronously. Here, we let the clock source $CLK_2$ be a leaf-node, denoted as *Leaf-node*<sub>2</sub>, and denote its supply voltages as $SV_2 = \{sv_1, sv_2, ...\}$. Similarly, *Leaf-node*<sub>3</sub> and $SV_3$ respectively represent the clock source and supply voltages of Island 3.



Fig. 4. (a) Given three voltage-islands and (b) PMXF constructs the X-clock tree of Island 2 and (c) labels it as a leaf node.

#### C. Connection Sequences of Voltage-Islands

To construct a multi-voltage-island-based X-clock tree, we should know how to connect these islands with different supply voltages to achieve minimum power consumption. Because the X-clock tree is built as a binary tree, there are $k!$ connection sequences for $k$ leaf-nodes. The three voltage-islands shown in Fig. 4(a) are labelled *Leaf-node*<sub>1</sub> for Island 1, *Leaf-node*<sub>2</sub> for Island 2, and *Leaf-node*<sub>3</sub> for Island 3, with supply voltages 1.0V, 1.1V, and 1.2V, respectively. Hence, there are six connection sequences (i.e., 3!), denoted as $CS(VI) = \{vi_1, vi_2, vi_3, vi_4, vi_5, vi_6\}$. For $vi_1 \in CS(VI)$ shown in Fig. 5(a), *Leaf-node*<sub>1</sub> and *Leaf-node*<sub>2</sub> are connected first, and then they are connected with *Leaf-node*<sub>3</sub> to complete the voltage-island-based X-clock tree. Figure 5(b) shows another connection sequence, $vi_6 \in CS(VI)$.
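The $k!$ connection sequences can be enumerated directly with the standard library. In this sketch each tuple is read as "connect the first two leaf-nodes, then attach the third". The numbering follows Python's lexicographic order, which is an assumption and need not match the report's $vi_1$ to $vi_6$ labels beyond the first sequence.

```python
from itertools import permutations
from math import factorial

leaf_nodes = ["Leaf-node1", "Leaf-node2", "Leaf-node3"]
connection_sequences = list(permutations(leaf_nodes))

for index, seq in enumerate(connection_sequences, start=1):
    # Connect seq[0] and seq[1] first, then attach seq[2].
    print(f"sequence {index}: ({seq[0]} + {seq[1]}) + {seq[2]}")

print(len(connection_sequences) == factorial(len(leaf_nodes)))
```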



Fig. 5. (a) The first connection sequence  $vi_1$  and (b) the sixth one  $vi_6$ .

After determining the supply voltage for the system clock $SV_{sys}$ and the supply voltage for each island, denoted as $V_{dd}$, such as $SV_{sys}=1.0V$, $V_{dd1}=1.0V$, $V_{dd2}=1.1V$, and $V_{dd3}=1.2V$, we can integrate a multi-voltage-island-based X-clock tree by combining the three leaf-nodes and inserting the required level-shifters. For $vi_1 \in CS(VI)$ shown in Fig. 6(a), $SV_{sys}$ is 1.0V because Islands 1-3 operate at 1.0V, 1.1V, and 1.2V, respectively. The inserted level-shifters $LS_1$ and $LS_2$ deliver the system clock source at 1.0V to *Leaf-node*<sub>3</sub> and *Leaf-node*<sub>2</sub> at 1.2V and 1.1V, respectively. This is the connection sequence for $vi_1$ with level-shifter insertion. On the other hand, Figs. 6(b) and 6(c) show two connection sequences for $vi_6$ with different level-shifter insertions. In Fig. 6(c), $LS_1$ delivers the system clock source at 1.0V to *Leaf-node*<sub>2</sub> at 1.1V to 1.2V. Hence, each connection sequence of voltage-islands has at least one connection sequence with level-shifter insertion, and each of these yields a different clock delay and power consumption.
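For the simplest insertion pattern, where every leaf-node is driven directly from the $SV_{sys}$ domain as described for Fig. 6(a), the leaf-nodes that need a level-shifter are those whose supply voltage exceeds $SV_{sys}$. The sketch below covers only that case; the helper name is hypothetical, and cascaded arrangements such as Fig. 6(c) are not modelled.

```python
C_LS_FF = 23.5  # level-shifter input capacitance from Table I, in fF


def level_shifters_needed(leaf_voltages, sv_sys):
    """Leaf-nodes whose supply voltage is above sv_sys, assuming each one
    is driven directly from the sv_sys domain (no cascaded shifters)."""
    return [leaf for leaf, vdd in leaf_voltages.items() if vdd > sv_sys]


leaf_voltages = {"Leaf-node1": 1.0, "Leaf-node2": 1.1, "Leaf-node3": 1.2}
sv_sys = min(leaf_voltages.values())

shifted = level_shifters_needed(leaf_voltages, sv_sys)
extra_capacitance_ff = len(shifted) * C_LS_FF

print(shifted)               # ['Leaf-node2', 'Leaf-node3']
print(extra_capacitance_ff)  # 47.0
```

The added input capacitance is what links the number of level-shifters to the total power, since each one contributes a load term of its own.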



Fig. 6. The connection sequences with inserted level-shifters for (a) $vi_1$, and (b) and (c) $vi_6$.

#### D. Delay Calculation with Inserted Level-Shifters

When the clock signal is delivered from a voltage-island operating at $SV_{sys}$ to another island whose supply voltage is higher than $SV_{sys}$, a level-shifter has to be inserted. Figure 7(a) shows Islands 1 and 2 operating at 1.0V and 1.1V, respectively. When the two islands are connected, a level-shifter is inserted to deliver the clock signal from the system clock source to *Leaf-node*<sub>2</sub> of Island 2. Figure 7(b) shows the equivalent model of Fig. 7(a) under the FED model, from which the clock delays from the system clock source to *Leaf-node*<sub>1</sub> and *Leaf-node*<sub>2</sub> are calculated with (1) and (2), respectively.





#### E. Time Complexity Analysis

Given a set of $n$ clock sinks in a set of $m$ voltage-islands, the proposed MuVIX algorithm shown in Fig. 3 completes the design of a multi-voltage-island-based X-clock tree. PMXF [5] constructs the X-clock tree for each voltage-island in $O(n \log n)$. For each connection sequence, it takes $O(m \log n)$ to combine $m$ leaf-nodes with inserted level-shifters. Because we always choose the lowest supply voltage as the supply voltage for the system clock source, the number of connection sequences searched for minimum power consumption is less than $m!$. Moreover, $m \le n$. Hence, the time complexity of the MuVIX algorithm is $O(n \log n)$.

#### IV. EXPERIMENTAL RESULTS

The proposed MuVIX algorithm has been implemented in C++ and run on a Windows machine with a 2.5GHz Intel processor and 2GB of memory. The fabrication parameters of the FED delay model [7] and the level-shifter (*LS*) under a 130nm process are listed in Table I for delay and power calculation. For comparative study, the adopted benchmarks are IBM r1-r5 [10], MCNC Primary1-2 [11], and ISCAS89 s1423, s5378, and s15850 [12].

| Parameter | Value | Parameter | Value | Parameter | Value |
|---|---|---|---|---|---|
| $r$ (Ω/µm) | 0.623 | $D$ | 1.12673 ln 2 | $r_{LS}$ (Ω) | 250 |
| $c_a$ (fF/µm) | 0.00598 | $E$ | 1.10463 ln 2 | $c_{LS}$ (fF) | 23.5 |
| $c_f$ (fF/µm) | 0.043 | $F$ | 1.04836 ln 2 | $T_{LS}$ (ps) | 54.4 |
| | | | | $F_{clk}$ (Hz) | 100M |

TABLE I TECHNOLOGY PARAMETERS OF FED DELAY MODEL AND LEVEL-SHIFTER UNDER 130NM PROCESS.

In the experiments, the X-clock tree of a given benchmark with a single voltage-island is first constructed by using PMXF. To design a multi-voltage-island-based X-clock tree, the benchmark is partitioned into several voltage-islands. Here, we partition each benchmark into two and into three voltage-islands, as shown in Fig. 8. The width (*w*) and height (*h*) of the benchmarks are listed in the second column of Table II. After partitioning, we use PMXF to connect the clock sinks in each voltage-island and construct the island-based sub-X-clock trees. Then, we determine the connection sequence of the islands and their supply voltages that achieves minimum power consumption. Finally, all the island-based sub-X-clock trees are merged into a single tree, and level-shifters are inserted wherever the clock signal is delivered from a low-voltage island to a high-voltage island.



Fig. 8. The partition of (a) two and (b) three voltage-islands.

Table II lists the power, delay, and wirelength of single- and two-voltage-island-based X-clock trees. Here, "ratio" is defined as the two-voltage-island result divided by the single-voltage-island result. As listed in Table II, the two-voltage-island-based X-clock tree achieves average reductions of 11.1% and 7.4% in power consumption and delay, respectively, at the cost of 2.9% more wirelength.

| Benchmark | $w \times h$ (µm) | Power (W), single-island | Power (W), two-island | Power (ratio) | Delay (µs), single-island | Delay (µs), two-island | Delay (ratio) | Wirelength (µm), single-island | Wirelength (µm), two-island | Wirelength (ratio) |
|---|---|---|---|---|---|---|---|---|---|---|
| r1 | $7000 \times 69984$ | 0.078951 | 0.074122 | (0.939) | 0.309829 | 0.332149 | (1.072) | 1419028 | 1448706 | (1.020) |
| r2 | $93134 \times 94016$ | 0.194193 | 0.171951 | (0.885) | 1.122692 | 0.873511 | (0.778) | 2911773 | 2858025 | (0.981) |
| r3 | $98500 \times 97000$ | 0.259581 | 0.239897 | (0.924) | 1.799442 | 1.710492 | (0.950) | 3658510 | 3732397 | (1.020) |
| r4 | $126988 \times 126970$ | 0.599519 | 0.585885 | (0.977) | 4.792344 | 3.899586 | (0.813) | 7230327 | 7597801 | (1.050) |
| r5 | $145224 \times 142920$ | 0.993228 | 0.931658 | (0.938) | 8.564433 | 9.635292 | (1.125) | 10837358 | 11177321 | (1.031) |
| Primary1 | $6000 \times 6000$ | 0.175772 | 0.128923 | (0.733) | 0.058590 | 0.051669 | (0.881) | 146514 | 137207 | (0.936) |
| Primary2 | $10500 \times 10500$ | 0.416336 | 0.294377 | (0.707) | 0.236469 | 0.177393 | (0.750) | 321887 | 333225 | (1.035) |
| s1423 | $11000 \times 14000$ | 0.006842 | 0.005257 | (0.768) | 0.007418 | 0.007006 | (0.944) | 113406 | 117452 | (1.035) |
| s5378 | $13000 \times 13000$ | 0.017297 | 0.014881 | (0.860) | 0.017287 | 0.015403 | (0.891) | 194411 | 234776 | (1.207) |
| s15850 | $15000 \times 16000$ | 0.064583 | 0.047912 | (0.741) | 0.049490 | 0.052300 | (1.056) | 477166 | 478488 | (1.002) |
| Average | - | - | - | (0.889) | - | - | (0.926) | - | - | (1.029) |

TABLE II COMPARISON OF SINGLE- AND TWO-VOLTAGE ISLAND-BASED X-CLOCK TREES IN POWER, DELAY, AND WIRELENGTH.
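The "ratio" columns of Table II can be reproduced row by row. The short sketch below does so for the power values of three benchmarks copied from the table; the helper name and the choice of rows are illustrative.

```python
# (benchmark, single-island power in W, two-island power in W), from Table II.
rows = [
    ("r1", 0.078951, 0.074122),
    ("r2", 0.194193, 0.171951),
    ("Primary1", 0.175772, 0.128923),
]


def ratio(two_island, single_island):
    """Two-island result divided by the single-island result."""
    return two_island / single_island


for name, single, two in rows:
    value = ratio(two, single)
    # A ratio below 1 means the two-island tree uses less power.
    print(f"{name}: ratio = {value:.3f}, change = {(value - 1) * 100:+.1f}%")
```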

Moreover, we construct the three-voltage-island-based X-clock trees and compare the results with the single-voltage ones, as listed in Table III. When a benchmark is partitioned into three voltage-islands, there are six (i.e., 3!) connection sequences for these islands. Therefore, the best and worst results in power consumption, delay, and wirelength are reported. Compared with single-voltage-island-based X-clock trees, the worst case of the three-voltage-island-based X-clock trees achieves average improvements of 14.3% and 19% in power consumption and delay, respectively, but requires 4.5% more wirelength. The best case saves 19.6% in power consumption on average, but requires 2.7% more delay and 2.6% more wirelength.

Comparing the best and worst results, the power consumption is reduced by a further 5.3 percentage points (19.6% - 14.3%), but the delay increases by 21.7 percentage points (19% + 2.7%). This confirms the trade-off between power and delay.
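The arithmetic behind these figures can be checked from the average ratios in the last row of Table III. In this sketch both gaps are differences in percentage points relative to the single-island baseline; the helper name is an assumption.

```python
def percent_change(ratio):
    """Signed percent change relative to the single-island baseline."""
    return round((ratio - 1.0) * 100.0, 1)


# Average ratios from the last row of Table III.
power_worst, power_best = 0.857, 0.804
delay_worst, delay_best = 0.810, 1.027

# Extra power saving of the best case over the worst case.
power_gap = round(percent_change(power_worst) - percent_change(power_best), 1)
# Delay penalty of the best case relative to the worst case.
delay_gap = round(percent_change(delay_best) - percent_change(delay_worst), 1)

print(percent_change(power_worst), percent_change(power_best))
print(percent_change(delay_worst), percent_change(delay_best))
print(power_gap, delay_gap)
```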

| Benchmark | Power (W), single-island | Power (W), three-island worst | (ratio) | Power (W), three-island best | (ratio) | Delay (µs), single-island | Delay (µs), three-island worst | (ratio) | Delay (µs), three-island best | (ratio) | Wirelength (µm), single-island | Wirelength (µm), three-island worst | (ratio) | Wirelength (µm), three-island best | (ratio) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| r1 | 0.078951 | 0.066076 | (0.836) | 0.060118 | (0.761) | 0.309829 | 0.280666 | (0.905) | 0.357672 | (1.154) | 1419028 | 1560785 | (1.099) | 1530900 | (1.078) |
| r2 | 0.194193 | 0.187952 | (0.967) | 0.177705 | (0.915) | 1.122692 | 0.813500 | (0.724) | 1.062546 | (0.946) | 2911773 | 2973264 | (1.021) | 2947478 | (1.012) |
| r3 | 0.259581 | 0.252201 | (0.971) | 0.241556 | (0.930) | 1.799442 | 1.299613 | (0.722) | 1.574203 | (0.874) | 3658510 | 3747365 | (1.024) | 3704503 | (1.012) |
| r4 | 0.599519 | 0.497435 | (0.829) | 0.471943 | (0.787) | 4.792344 | 3.875012 | (0.808) | 4.292754 | (0.895) | 7230327 | 7542628 | (1.043) | 7429623 | (1.027) |
| r5 | 0.993228 | 0.974362 | (0.981) | 0.935441 | (0.941) | 8.564433 | 7.478526 | (0.873) | 8.328679 | (0.972) | 10837358 | 11251663 | (1.038) | 11087150 | (1.023) |
| Primary1 | 0.175772 | 0.131632 | (0.748) | 0.124627 | (0.709) | 0.058590 | 0.048759 | (0.832) | 0.065411 | (1.116) | 146514 | 133745 | (0.912) | 132401 | (0.903) |
| Primary2 | 0.416336 | 0.327933 | (0.787) | 0.308303 | (0.740) | 0.236469 | 0.175433 | (0.741) | 0.220812 | (0.933) | 321887 | 336035 | (1.043) | 329151 | (1.022) |
| s1423 | 0.006842 | 0.005860 | (0.856) | 0.005341 | (0.780) | 0.007418 | 0.006769 | (0.912) | 0.009779 | (1.318) | 113406 | 131571 | (1.160) | 128554 | (1.133) |
| s5378 | 0.017297 | 0.013909 | (0.804) | 0.012722 | (0.735) | 0.017287 | 0.012638 | (0.731) | 0.017695 | (1.023) | 194411 | 202862 | (1.043) | 195655 | (1.006) |
| s15850 | 0.064583 | 0.051700 | (0.800) | 0.047998 | (0.743) | 0.049490 | 0.042637 | (0.861) | 0.051666 | (1.043) | 477166 | 509537 | (1.067) | 498418 | (1.044) |
| Average | - | - | (0.857) | - | (0.804) | - | - | (0.810) | - | (1.027) | - | - | (1.045) | - | (1.026) |

TABLE III COMPARISON OF SINGLE- AND THREE-VOLTAGE ISLAND-BASED X-CLOCK TREES IN POWER, DELAY, AND WIRELENGTH.

Figure 9 presents the three-voltage-island-based X-clock tree of Primary1.



Fig. 9. (a) The benchmark Primary1 is partitioned into three voltage-islands and (b) the integrated X-clock tree is constructed by using the proposed MuVIX algorithm.

#### V. CONCLUSION

Constructing a clock tree over multiple voltage-islands in a chip can efficiently reduce power consumption. Experimental results on benchmarks show that two- and three-voltage-island-based X-clock trees consume less power than single-voltage-island ones. Future work is to partition a chip into voltage-islands of different shapes and to integrate them into a well-defined X-clock tree with controlled delay and power. Moreover, DFM issues such as the insertion of jumpers and redundant vias to avoid antenna and via effects can be considered during the integration of voltage-island-based X-clock trees.

#### References

- [1] W. K. Mak and Jr-Wei Chen, "Voltage Island Generation under Performance Requirement for SoC Designs," in *Proc. Design Automation Conference in Asia and South Pacific*, Jan. 2007, pp. 798-803.
- [2] M. C. Lu, M. C. Wu, H. M. Chen, and H. R. Jiang, "Performance Constraints Aware Voltage Islands Generation in SoC Floorplan Design," in *Proc. IEEE International SOC Conference*, Sept. 2006, pp. 211-214.
- [3] J. Hu, Y. Shin, N. Dhanwada, and R. Marculescu, "Architecting Voltage Islands in Core-based System-on-a-Chip Designs," in *Proc. IEEE International Symposium on Low Power Electronics and Design*, 2004, pp. 180-185.
- [4] W. P. Lee, H. Y. Liu, and Y. W. Chang, "An ILP Algorithm for Post-Floorplanning Voltage-Island Generation Considering Power-Network Planning," in *Proc. IEEE/ACM International Conference on Computer-Aided Design*, Nov. 2007, pp. 650-655.
- [5] C. C. Tsai, C. C. Kuo, J. O. Wu, T. Y. Lee, and R. S. Hsiao, "X-clock routing based on pattern matching," in *Proc. IEEE International SOC Conference*, Sept. 2008, pp. 357-360.
- [6] A. I. Abou-Seido, B. Nowak, and C. Chu, "Fitted Elmore Delay: A Simple and Accurate Interconnect Delay Model," *IEEE Transactions on Very Large Scale Integration Systems*, vol. 12, no. 7, pp. 691-696, July 2004.
- [7] T. C. Chen, S. R. Pan, and Y. W. Chang, "Timing Modeling and Optimization under The Transmission Line Model," *IEEE Transactions on Very Large Scale Integration Systems*, vol. 12, no. 1, pp. 28-41, Jan. 2004.
- [8] W.-P. Lee, H.-Y. Liu, and Y.-W. Chang, "Voltage-island partitioning and floorplanning under timing constraints," *IEEE Trans. on CAD*, vol. 28, no. 5, pp. 690-702, May 2009.
- [9] B. Yu, S. Dong, and S. Goto, "Multi-voltage and level-shifter assignment driven floorplanning," in *Proc. IEEE Int. Conf. on ASIC*, pp. 1264-1267, Oct. 2009.

- [10] R. S. Tsay, "Exact Zero Skew," in Proc. IEEE International Conference on Computer-Aided Design, 1991, pp. 336-339.
- [11] M. A. B. Jackson, A. Srinivasan, and E. S. Kuh, "Clock Routing for High Performance ICs," in Proc. ACM/IEEE Design Automation Conference, June 1990, pp. 573-579.
- [12] J. G. Xi and W. W.-M. Dai, "Useful-Skew Clock Routing With Gate Sizing for Low Power Design," in Proc. ACM/IEEE Design Automation Conference, June 1996, pp. 383-388.

### Research Results and Related Publications

- 1. <u>Chia-Chun Tsai</u>, Chung-Chieh Kuo, and Trong-Yen Lee, "Pattern-matching-based X-Architecture Zero-skew Clock Tree Construction with X-Flip Technique and Via Delay Consideration," accepted by *Integration, the VLSI Journal*, 2010 (SCI)
- 2. <u>Chia-Chun Tsai</u>, Chung-Chieh Kuo, and Trong-Yen Lee, "Jumper Insertion for Antenna Avoidance in X-clock Routing," accepted by *Far East Journal of Electronics and Communications*, 2010. (EI)
- 3. <u>Chia-Chun Tsai</u>, Jan-Ou Wu, and Trong-Yen Lee, "Maximal Delay Reduction for RLC-Based Multi-source Multi-sink Bus with Repeater Insertion," *Circuits, Systems & Signal Processing*, Vol. 28, No. 6, pp. 805-817, Aug. 2009. (SCI, EI)
- 4. <u>Chia-Chun Tsai</u>, Chin-Yen Lin, Yuh-Shyan Hwang, and Trong-Yen Lee, "The Design of a Li-Ion Battery Charger Based on Multimode LDO Technology," *Journal of Circuits, Systems, and Computers*, Vol. 18, No. 5, pp. 947-963, Aug. 2009. (EI)
- 5. <u>Chia-Chun Tsai</u>, Kai-Wei Hong, and Trong-Yen Lee, "A Bisection-Based Power Reduction Design for CMOS Flash Analog-to-Digital Converters," *Journal of Circuits, Systems, and Computers*, Vol. 18, No. 5, pp. 933-945, Aug. 2009. (EI)
- 6. <u>Chia-Chun Tsai</u>, Chung-Chieh Kuo, Lin-Jeng Gu, and Trong-Yen Lee, "Double-Via Insertion for Improving the Reliability of X-Architecture Clock Tree," The 21st VLSI Design/CAD Symposium, August 3-6, 2010, Kaohsiung, Taiwan. (Best Paper Nominee)
- 7. <u>Chia-Chun Tsai</u>, Chung-Chieh Kuo, Lin-Jeng Gu, and Trong-Yen Lee, "Double-via Insertion Enhanced X-Architecture Clock Routing for Reliability," IEEE International Symposium on Circuits and Systems, pp. 3413-3416, May 30-June 2, 2010, Paris, France.
- 8. <u>Chia-Chun Tsai</u>, Chung-Chieh Kuo, Lin-Jeng Gu, and Trong-Yen Lee, "Antenna Violation Avoidance/Fixing for X-Clock Routing," International Symposium on Quality Electronic Design, pp. 508-514, Mar. 22-24, 2010, San Jose, CA, USA.
- 9. <u>Chia-Chun Tsai</u>, Chung-Chieh Kuo, Trong-Yen Lee, Lin-Jeng Gu, and Jan-Ou Wu, "Buffer Insertion and Sizing for X-Architecture Clock Routing," The 20th VLSI Design/CAD Symposium, August 2009, Hualien, Taiwan.
- 10. <u>Chia-Chun Tsai</u>, Chung-Chieh Kuo, and Trong-Yen Lee, "Antenna detection and fixing with jumper insertion for X-clock routing," in Proceedings of International Ph.D. Student Workshop (IPS), Aug. 2009, Hualien, Taiwan.
- 11. <u>Chia-Chun Tsai</u>, Chung-Chieh Kuo, Trong-Yen Lee, and Jan-Ou Wu, "X-architecture Clock Tree Construction Associated with Buffer Insertion and Sizing," The 1st Asia Symposium on Quality Electronic Design, pp. 298-303, July 15-16, 2009, Kuala Lumpur, Malaysia.

## Report on Attending an International Conference

Prof. Chia-Chun Tsai, Department of Computer Science and Information Engineering, Nanhua University

Supported by National Science Council project NSC 98-2221-E-343-006, 2009/8/1 to 2010/7/31

X-Architecture Clock Routing Synthesis Associated with Design for Manufacturability to the Application of Multi-Voltage Island Environment (I)

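## ● 2010 International Symposium on Circuits and Systems (ISCAS 2010)

The International Symposium on Circuits and Systems (ISCAS) is a high-quality, multidisciplinary technical conference held in a different country each year. It gives people in this field from industry, academia, and circuit design and system technology applications an opportunity to exchange experience. ISCAS 2010 was sponsored by the IEEE Circuits and Systems Society and organized by the IEEE Paris branch in France, "The Institut Supérieur d'Electronique de Paris". It was held from May 30 to June 2, 2010, in the convention hall (the conference headquarters) of Disney's Hotel New York on the outskirts of Paris, France. The conference theme was "Nano-Bio Circuit Fabrics and Systems", covering the design and fabrication of extremely scaled CMOS and non-CMOS devices and of circuits that mix standard CMOS with evolving nano-structure elements, as well as implementation cost, switching speed, energy efficiency, and reliability.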





Photo: the convention hall of Disney's Hotel New York

ISCAS 2010 consisted mainly of 9 tutorials on May 30; oral and poster sessions and special sessions (1: Bio inspired Systems and Electronics Interfacing with BioMedia, and 2: Nanomedecine/Analysis Systems) from May 31 to June 2; exhibits by technical book publishers and a few vendors; and the following three keynotes.

| Date   | Title                                                              | Speaker                   | Affiliation     |
|--------|--------------------------------------------------------------------|---------------------------|-----------------|
| May 31 | Nanosystems: devices, circuits, architectures and applications     | Prof. Giovanni De Micheli | EPFL Lausanne   |
| June 1 | From Smart Pacemaker to Remote Monitoring of Cardiac Function      | Dr. Alain Ripart          | Sorin Group CRM |
| June 2 | Energy-saving approaches for warehouse-scale computing             | Dr. Wolf-Dietrich Weber   | Google          |

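Several groups from Taiwan, along with a number of individual attendees, brought many professors and graduate students and a few industry representatives to present papers or observe; in total they numbered more than 100 people. I traveled with the National Cheng Kung University group of eighteen people who made the round trip together: nine professors, eight graduate students, and the tour leader, Mr. 陳永全 of 奇美旅行社 (a travel agency). The group included Professors 謝錫堃, 吳宗憲, 魏嘉玲, 陳中和, and 陳培殷. I presented one lecture paper, Paper ID: 1996, at the conference: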

<u>Chia-Chun Tsai</u>, Chung-Chieh Kuo, Lin-Jeng Gu, and Trong-Yen Lee, "Double-via Insertion Enhanced X-Architecture Clock Routing for Reliability," *IEEE International Symposium on Circuits and Systems*, pp. 3413-3416, May 30-June 2, 2010, Paris, France.



Photo: the session chair, a professor at the University of Maryland, USA

Activities at the venue included the welcome reception, coffee breaks, poster interaction, and the farewell.



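Photos: welcome reception; coffee break; with a doctoral student from the Hong Kong University of Science and Technology; with faculty and students from Fudan University in mainland China; taking part in the poster session; farewell.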

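While attending the conference, I also visited some of the rich European cultural and historic sites in and around Paris. These included a tour of Auvers-sur-Oise, the village of the Impressionists, with a visit to **Van Gogh's house** and the scenes of his famous paintings "The Church at Auvers", "Wheatfield with Crows", and "Daubigny's Garden"; **Monet's house** with its garden and famous paintings; Les Invalides, the site of Napoleon's tomb, where the French government holds a solemn military ceremony every May 5; the Arc de Triomphe, among the busiest traffic hubs in the world and the meeting point of 12 avenues, which Napoleon ordered built in 1806 to commemorate the empire's victories but which was not completed until 1836; the Champs-Élysées, whose lawns and linden trees stretch all the way to the bank of the Seine; **Place de la Concorde** (built under Louis XV; during the French Revolution the queen and others were sent to the guillotine here), one of the most beautiful and most historic squares in Europe, where a 3,200-year-old obelisk stands; the Eiffel Tower, built in 1889 to mark the World's Fair and the 100th anniversary of the French Revolution, 320 meters tall and still the best-known landmark of Paris; the Louvre (Musée du Louvre), originally a fortress built in the 12th century to secure a strategic position on the Seine (listed by UNESCO as a World Heritage site) and now a treasury of world-famous paintings, sculptures, and antiquities, whose three greatest treasures are the Mona Lisa, the Venus de Milo, and the Winged Victory; and the Loire Valley (the Loire flows into the Atlantic and has 143 castles), including the largest and most magnificent, the **Château de Chambord**, known as the king of European castles, and the Château de Chenonceau.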



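Photos: Van Gogh's house; Monet's house; the Louvre; Place de la Concorde; the Arc de Triomphe; the Eiffel Tower; the Château de Chambord.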

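At this conference I exchanged ideas with international scholars and industry participants from around the world, learned about their research directions and results, and brought back the conference materials and the proceedings disc. I thank the National Science Council project for covering the airfare and part of the living expenses, and Nanhua University for its support and for covering the registration fee. Although the trip ran slightly over budget, the rich intangible gains outweigh the tangible cost.

No data on the promotion of R&D results.

# Summary Table of Research Results for the ROC Year 98 (2009) Research Project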

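Principal investigator: Chia-Chun Tsai (蔡加春). Project number: 98-2221-E-343-006-

Project title: X-Architecture Clock Routing Synthesis Associated with Design for Manufacturability to the Application of Multi-Voltage Island Environment (I)

| Scope | Category | Item | Achieved (accepted or published) | Expected total (including achieved) | Actual contribution of this project | Unit | Remarks (qualitative notes, e.g. results shared by several projects, or results featured as a journal cover story) |
|---|---|---|---|---|---|---|---|
| Domestic | Publications | Journal papers | 0 | 0 | 100% | papers | |
| Domestic | Publications | Research/technical reports | 0 | 0 | 100% | papers | |
| Domestic | Publications | Conference papers | 1 | 1 | 100% | papers | |
| Domestic | Publications | Books | 0 | 0 | 100% | | |
| Domestic | Patents | Pending | 0 | 0 | 100% | cases | |
| Domestic | Patents | Granted | 0 | 0 | 100% | cases | |
| Domestic | Technology transfer | Cases | 0 | 0 | 100% | cases | |
| Domestic | Technology transfer | Royalties | 0 | 0 | 100% | NT$ thousand | |
| Domestic | Project personnel (ROC nationals) | Master's students | 0 | 0 | 100% | person-times | |
| Domestic | Project personnel (ROC nationals) | Doctoral students | 1 | 1 | 100% | person-times | |
| Domestic | Project personnel (ROC nationals) | Postdoctoral researchers | 0 | 0 | 100% | person-times | |
| Domestic | Project personnel (ROC nationals) | Full-time assistants | 0 | 0 | 100% | person-times | |
| International | Publications | Journal papers | 1 | 1 | 100% | papers | |
| International | Publications | Research/technical reports | 0 | 0 | 100% | papers | |
| International | Publications | Conference papers | 1 | 1 | 100% | papers | |
| International | Publications | Books | 0 | 0 | 100% | chapters/books | |
| International | Patents | Pending | 0 | 0 | 100% | cases | |
| International | Patents | Granted | 0 | 0 | 100% | cases | |
| International | Technology transfer | Cases | 0 | 0 | 100% | cases | |
| International | Technology transfer | Royalties | 0 | 0 | 100% | NT$ thousand | |
| International | Project personnel (foreign nationals) | Master's students | 0 | 0 | 100% | person-times | |
| International | Project personnel (foreign nationals) | Doctoral students | 0 | 0 | 100% | person-times | |
| International | Project personnel (foreign nationals) | Postdoctoral researchers | 0 | 0 | 100% | person-times | |
| International | Project personnel (foreign nationals) | Full-time assistants | 0 | 0 | 100% | person-times | |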

| Item | Description |
|---|---|
| Other results (results that cannot be expressed quantitatively, such as organizing academic activities, awards received, important international collaborations, international impact of the research results, and other concrete benefits to the development of industrial technology; to be described in text) | Assisted the National Science Council in reviewing research project proposals.<br>Attended the EDA Workshop and helped evaluate the effectiveness of course promotion.<br>Took part in the 21st VLSI Design/CAD Symposium and presented a paper that was nominated for the best paper award. |

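| Additional items for Department of Science Education projects | Quantity | Name or brief description |
|---|---|---|
| Assessment tools (qualitative and quantitative) | 0 | |
| Courses/modules | 0 | |
| Computer and network systems or tools | 0 | |
| Teaching materials | 0 | |
| Activities/competitions held | 0 | |
| Conferences/workshops | 0 | |
| Newsletters, websites | 0 | |
| Number of participants (audience) in the promotion of project results | 0 | |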

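# Self-Evaluation Form for the NSC-Funded Research Project Report

Please give an overall evaluation covering how well the research content matches the original plan, how far the expected goals were achieved, the academic or application value of the research results (briefly describe their significance, value, impact, or potential for further development), whether they are suitable for publication in an academic journal or for a patent application, and the main findings or other relevant value.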

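| 1. | Overall evaluation of how well the research content matches the original plan and how far the expected goals were achieved |
|----|---------------------------------------------|
|    | ■ Goals achieved |
|    | □ Goals not achieved (please explain, within 100 characters) |
|    | □ Experiment failed |
|    | □ Experiment interrupted for some reason |
|    | □ Other reasons |
|    | Explanation: |
| 2. | Publication of the research results in academic journals, patent applications, and so on: |
|    | Papers: □ Published □ Unpublished manuscript ■ In preparation □ None |
|    | Patents: □ Granted □ Pending ■ None |
|    | Technology transfer: □ Completed □ Under negotiation ■ None |
|    | Other: (within 100 characters) |
|    | A draft paper has been submitted to ISCAS 2011. |
| 3. | Evaluate the academic or application value of the research results in terms of academic achievement, technical innovation, social impact, and so on (briefly describe their significance, value, impact, or potential for further development) (within 500 characters) |
|    | VLSI design technology has entered the nanometer era, and power consumption has become a key factor in evaluating the performance of a chip system. Voltage-island layouts are now widespread and are one of the most effective ways to save power. This work builds an X-clock tree in a multi-voltage-island layout environment for integrated circuit design. It examines the combinations of connections between voltage islands together with the combinations of level-shifter insertions to find, for each sub X-clock tree, the construction with minimum power consumption, and applies this to the divided and integrated construction of X-clock trees in multi-voltage-island layouts with different block placements and different supply voltages. Experimental results on 10 benchmarks show that X-clock trees with two and three voltage islands reduce average power consumption by 11.1% and 19.6%, respectively, compared with a single voltage island. |
|    | No related papers have yet been published on this topic; the proposed method and preliminary results can serve as a reference and a starting point for academia and industry. |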