A performance-driven placement of standard cells. Scientific article, specialty: Mathematics

CC BY


an exponential dependence, whose leading coefficient also depends linearly on the number of vertices in the frame.

Analysis of the dependences t(Rj) shows that for K = 1, i.e., when all vertices in the frame are checked for isomorphism by exhaustive search, the dependence is clearly exponential. However, as K decreases from 1 to 0.5, the dependence t(R2) saturates already at R2 = 1.5 and practically does not grow as R2 increases further.


UDC 658.512

A.Y.Tetelbaum, C.-L.Wey, T.A.Bickart A performance-driven placement of standard cells

Introduction

The objective of a standard-cell placement algorithm is to arrange a given set of standard cells of common height and variable width on the chip surface [1]. The cells are connected by a given set of interconnections called nets. The placement has to satisfy one or more objective functions, which usually model the routability of a design. As IC technology scales down to half a micron, interconnect delays play an increasingly important role in the performance of chips. Gate delays are becoming of the same order as path delays. Most designers now realize that timing delays on interconnects create problems for high-performance VLSI chips and that many existing CAD tools cannot handle these timing problems. Some existing performance-driven standard-cell placement procedures attempt to satisfy the timing constraints by minimizing the total net length. However, since the performance of a design tends to be path-oriented in nature, while physical design is net-oriented [2], minimizing the total net length may not improve performance. The routing space on the chip may end up being decreased at the cost of greater than necessary or acceptable net delays on some critical paths. In addition, delays caused by the same net length may vary significantly for paths with different driver strengths and load characteristics.

Existing performance-driven placement algorithms [2-10] may be naturally divided into two major categories, namely, the net-based approach [2,3] and the path-based approach [4,8]. In the path-based approach, path delays are studied extensively during physical placement. It formulates the placement problem as an optimization problem with both timing requirements and physical placement requirements as constraints. The conversion of path constraints to net constraints in the path-based approach yields a convenient way to perform timing optimization during physical design. However, it is overly constraining because the important relationship between nets and paths is lost [2]. In the net-based approach, timing requirements are translated into net information, and the placement algorithm tries to generate a placement making use of such net information. It carries out a pre-timing analysis to generate a set of upper bounds on net length, and the upper bounds are then used to guide the placement process. Using upper bounds simplifies the placement problem by translating timing requirements into physical placement constraints. Since the choice of upper bounds is not unique, it is important to be able to choose a set of upper bounds which will provide the placement algorithm with maximum flexibility [9,10]. It is also important to have an effective procedure to modify a chosen set when the placement algorithm fails to obtain a physical placement with net wire lengths satisfying the bounds.

Previous work has concentrated on improving the efficiency of placement algorithms for performance improvement. Little emphasis, however, has been devoted to combining logic synthesis and placement for performance improvement. None of the existing algorithms can guarantee a reduction of the circuit delay, even though reducing the number of logic levels between input nodes and output nodes in a logic synthesis procedure might decrease circuit delay. Thus, the logic level information generated by logic synthesis is ignored during the placement procedure. In order to enhance the linkage between logic synthesis and physical synthesis, this study develops a performance-driven approach which uses the logic level information to guide the placement. The approach is comprised of three major steps [11]: Coarse Placement, Placement Compaction, and Placement Optimization.

Theoretical foundation for performance-driven placement

The objective of the performance-driven standard-cell placement problem is to minimize chip area under a delay constraint. Theoretically speaking, given a delay constraint $t^*$, the final placement must satisfy $t_c \le t^*$ for all critical paths with delay $t_c$. In practice, however, for row-based standard-cell placement, the number of rows required to achieve the minimum chip area is unknown. Moreover, the logic cell paths are given by the logic synthesis procedure, and they may be used to define the relationship between the delay constraint and the cell placement.

A non-redundant mapped logic network resulting from the logic synthesis procedure consists of logical paths connecting the primary inputs to the primary outputs through some cells in the network. More specifically, a path p contains an ordered set of cells, $C_0, C_1, \ldots, C_k$, and their connections, where $C_0$ and $C_k$ are the primary input and primary output nodes, respectively. A connection $e = \langle C_i, C_{i+1} \rangle$ means that the cell $C_{i+1}$ is a fan-out node (cell) of $C_i$. For simplicity, $C_0$ and $C_k$ are called the primary nodes (cells), and all other $C_i$ are non-primary cells. Thus, the path p is comprised of (k-1) non-primary cells and k connections.

The cell delay model employed here is defined as the sum of the intrinsic delay $t_{int}$ and the capacitive load delay $t_{load}$, i.e., $t_{cell} = t_{int} + t_{load}$, where the delay $t_{int}$ is given by a cell library, while the delay $t_{load}$ can be calculated from a mapped logic network. For simplicity, we first assume that all cells are identical, i.e., they have the same width in the standard cell library. Based on this assumption and the cell delay model, a critical path contains the largest number of cells and connections. A path delay is the sum of the total cell delay and the total connection line delay. More specifically, if a path is comprised of k connections and (k-1) cells, its path delay is $t = k d L + (k-1) t_{cell}$, where L is the average connection length on the path, in terms of the feature size, and d is the unit delay. Similarly, suppose that a critical path $p_c$ is laid out optimally, i.e., each connection on the path takes the least length $L_{min}$, and that it contains n connections and (n-1) non-primary cells. Its path delay is then $t_c = t^* = n d L_{min} + (n-1) t_{cell}$. In practice, the value of $L_{min}$ can be either obtained from the physical design rules or estimated by the Rent's rule prediction.

Consider an arbitrary path which has $N_p$ connections and $(N_p-1)$ non-primary cells. Let $L_e$ be the average connection length on that path. The delay constraint $t \le t^*$ then reads $N_p d L_e + (N_p-1) t_{cell} \le n d L_{min} + (n-1) t_{cell}$.

Therefore, $L_e \le [n d L_{min} + (n-N_p) t_{cell}] / (N_p d)$. This implies that a placement satisfies the delay constraint if its true critical path has $N_p$ connections and the connection length is less than $L_e^*$, called the maximum admissible connection length, which is defined below.
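The bound above can be checked numerically. The following is a minimal Python sketch, not part of the original paper: the function names and the numeric values (n = 5, L_min = 30 length units, d = 0.01 ns per unit, t_cell = 3 ns) are illustrative assumptions.

```python
def path_delay(num_connections: int, avg_length: float, d: float, t_cell: float) -> float:
    """Delay of a path with k connections and (k - 1) non-primary cells."""
    return num_connections * d * avg_length + (num_connections - 1) * t_cell


def max_admissible_length(n: int, n_p: int, l_min: float, d: float, t_cell: float) -> float:
    """Upper bound on the average connection length of a path with n_p connections."""
    return (n * d * l_min + (n - n_p) * t_cell) / (n_p * d)


# Illustrative values only (assumed, not taken from the paper's experiments).
n, l_min, d, t_cell = 5, 30.0, 0.01, 3.0
t_star = path_delay(n, l_min, d, t_cell)  # delay of the optimally laid out critical path

for n_p in range(2, n + 1):
    bound = max_admissible_length(n, n_p, l_min, d, t_cell)
    # A path laid out exactly at the bound meets the delay constraint with equality.
    assert abs(path_delay(n_p, bound, d, t_cell) - t_star) < 1e-9
    print(n_p, round(bound, 2))
```

Under these assumed values, shorter paths tolerate much longer connections: the fewer cells a path contains, the more of the delay budget is left for wiring.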

A mapped logic network can be modeled by a graph G=(V,E), where the vertices and edges represent the cells and connections, respectively. A weighted graph is employed to guide the placement procedure. Let $l_e$ and $W_e$ be the length and weight of a connection e. The objective function of the placement problem is $\sum_e W_e l_e$. Based on the cell level information, the weight $W_e$ is determined by the criticality of the connection e. The criticalities of cells and connections are calculated by the following algorithm. Let R(i) be the logical level of the cell Ci, and let Q(i) and Qe(i,j) be the criticalities of the cell Ci and the connection e=<Ci,Cj>, respectively, where m is the number of cell levels. For simplicity, all primary input nodes are merged into a single node S, while all primary output nodes are merged into a single node T during the weight calculation.

initialize Q(S) = Q(T) = m;
assign R(i) by the logic level of the cell Ci;
for each cell level k from n to 2 do
    for each cell Cj at level k do
        for each cell Ci, e = <Ci,Cj> ∧ R(i) > R(j) do
            Qe(i,j) = (n+1) - (R(i) - R(j));
            if (Qe(i,j) > Q(j)) Q(j) = Qe(i,j);
for each cell Ci at level 1 do
    Qe(i,S) = Q(i);
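The idea of connection criticality can be illustrated with a short Python sketch. It is not the authors' level-based procedure: it assumes a different but related definition, namely that the criticality of a connection is the number of connections on the longest S-to-T path passing through it, and computes that with a longest-path pass over a directed acyclic graph. The function name and the node labels are hypothetical.

```python
from collections import defaultdict


def edge_criticality(edges, source="S", sink="T"):
    """Map each edge (u, v) of a DAG to the connection count of the longest S-T path through it."""
    succ, pred, indeg = defaultdict(list), defaultdict(list), defaultdict(int)
    nodes = set()
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)
        indeg[v] += 1
        nodes.update((u, v))

    # Kahn's algorithm yields a topological order of the cells.
    order, ready = [], [x for x in nodes if indeg[x] == 0]
    while ready:
        u = ready.pop()
        order.append(u)
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)

    # Longest distance, counted in connections, from the source and to the sink.
    neg_inf = float("-inf")
    from_src = {x: neg_inf for x in nodes}
    to_sink = {x: neg_inf for x in nodes}
    from_src[source] = 0
    to_sink[sink] = 0
    for u in order:
        for p in pred[u]:
            from_src[u] = max(from_src[u], from_src[p] + 1)
    for u in reversed(order):
        for s in succ[u]:
            to_sink[u] = max(to_sink[u], to_sink[s] + 1)

    return {(u, v): from_src[u] + 1 + to_sink[v] for u, v in edges}


# Hypothetical network: a three-connection path S-a-b-T and a two-connection path S-c-T.
crit = edge_criticality([("S", "a"), ("a", "b"), ("b", "T"), ("S", "c"), ("c", "T")])
for edge in sorted(crit):
    print(edge, crit[edge])
```

In this example the connections on S-a-b-T receive criticality 3 and those on S-c-T receive criticality 2, the smallest possible value for a path with one non-primary cell.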

In fact, the criticality of a connection on a critical path is equal to the number of cell levels. Since all cells are assumed to be identical, the criticality also means the number of connections on the critical path. Thus, the maximum admissible connection length $L_e^*$ can be expressed in terms of the connection criticality $Q_{eij} = Q_e(i,j)$, i.e.,

$$L_e^* = \frac{n L_{min}}{Q_{eij}} + \frac{(n - Q_{eij})\, t_{cell}}{Q_{eij}\, d} \qquad (1)$$

In this implementation, we assume that the product of connection weight and connection length is a constant, i.e., $W_e l_e = \text{constant}$. Thus, the edge with higher weight, i.e., higher criticality, will be assigned a shorter connection length. Since the minimum connection weight is 1 and the corresponding connection criticality is 2, we obtain the connection length $l_e = L_e^*$ and, by (1), the constant is equal to $[n L_{min} + (n-2) t_{cell}/d]/2$. This implies that, for any connection e, its connection length is $l_e = [n L_{min} + (n - Q_{eij}) t_{cell}/d]/Q_{eij}$ and its connection weight is

$$W_e = \frac{[n L_{min} + (n-2)\, t_{cell}/d]\, Q_{eij}}{2\,[n L_{min} + (n - Q_{eij})\, t_{cell}/d]} \qquad (2)$$
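The step from the constant-product assumption to the weight formula (2) can be written out explicitly. The derivation below is a sketch that assumes the minimum connection weight equals 1 at criticality 2, which is the value consistent with (2); the symbol C for the constant is introduced here for convenience.

```latex
\begin{align*}
W_e\, l_e &= C, \qquad
l_e = L_e^* = \frac{n L_{min} + (n - Q_{eij})\, t_{cell}/d}{Q_{eij}}, \\
C &= 1 \cdot L_e^*\Big|_{Q_{eij}=2} = \frac{n L_{min} + (n-2)\, t_{cell}/d}{2}, \\
W_e &= \frac{C}{l_e}
     = \frac{[\,n L_{min} + (n-2)\, t_{cell}/d\,]\, Q_{eij}}
            {2\,[\,n L_{min} + (n - Q_{eij})\, t_{cell}/d\,]}.
\end{align*}
```

Setting $Q_{eij} = 2$ in the last line gives $W_e = 1$, so the normalization is self-consistent.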

If we assume that $t_{cell} = 3$ ns, $d = 0.01$ ns/$\lambda$, and $L_{min} = 30\lambda$, then for a critical path with n = 5 connections, by (2), we can calculate the connection weights by the formula $W_e = 21 Q_{eij}/[6 + 12(5 - Q_{eij})]$. Based on the weighted graph, this research develops a cell-level-based performance-driven placement procedure. The procedure is comprised of three major steps: Coarse placement, Placement compaction, and Placement optimization.
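The connection weights can be tabulated with a few lines of Python. This is a minimal sketch under stated assumptions: the function name is hypothetical, and the values n = 5, L_min = 30 length units, t_cell = 3 ns and d = 0.01 ns per unit length are assumed because they reproduce the simplified formula 21Q/[6+12(5-Q)].

```python
def connection_weight(q: int, n: int, l_min: float, t_cell: float, d: float) -> float:
    """Weight of a connection with criticality q, following formula (2)."""
    ratio = t_cell / d  # cell delay expressed as an equivalent wire length
    constant = (n * l_min + (n - 2) * ratio) / 2.0  # weight 1 at criticality 2
    length = (n * l_min + (n - q) * ratio) / q  # maximum admissible connection length
    return constant / length


# Assumed example values; they are chosen to match the simplified formula.
for q in range(2, 6):
    w = connection_weight(q, n=5, l_min=30.0, t_cell=3.0, d=0.01)
    simplified = 21 * q / (6 + 12 * (5 - q))
    assert abs(w - simplified) < 1e-9
    print(q, round(w, 3))
```

With these values the weight grows from 1 at criticality 2 to 17.5 at criticality 5, so connections on the longest paths dominate the objective.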

Main placement procedures and discussion

Coarse Placement. The objective of this step is to provide an initial placement. A straightforward solution is to rearrange the cells in the weighted graph so that the communication between the cells on the non-critical paths, referred to as non-critical cells, is reduced. This decreases the possibility that non-critical paths become critical after placement. More specifically, the non-critical cells are first partitioned into two clusters, where each cluster only communicates with the critical cells. Ideally, there is no communication between the two clusters. In practice, however, some communication may be allowed between cells with lower criticalities. Note that the size of the partitioned clusters is not of concern. The clusters may be recursively partitioned if cell communication costs can be further reduced.

Placement Compaction. Once the initial placement is obtained, the cell-level-based placement is compacted to minimize chip area under the delay constraint. It is believed that the non-critical paths will never become critical if the cell-level-based structure is preserved. In order to determine the number of rows and the dimension of each row, this implementation adopts a concept of "boundary glass" restriction, i.e., each cell is not allowed to be placed outside a predefined boundary.

Placement Optimization. After compaction, the placement can be further improved by existing methods such as iterative methods, pin assignment methods, and local move methods.

In the first two stages of the proposed placement algorithm, a weighted graph is used to guide the generation of the initial placement and to perform the compaction operation. Thus, "path connections", instead of "nets", are considered in these two stages, and the chip area is minimized under the delay constraint. On the other hand, the "connections" are converted to "nets" during the placement optimization stage. With the application of optimization techniques, the final placement is further improved in both delay and chip area.

In the coarse placement stage, there exist no false critical paths if the communication cost is managed properly and the cell-level structure is preserved. In practice, however, a high communication cost may imply that there exist some clusters of cells which are difficult to partition. Even though those cells can be placed and routed later, long paths are probably required. As a result, some non-critical paths may become critical. We found that the communication cost may be reduced by re-synthesizing those portions.

REFERENCES

1. Kling R.-M. and Banerjee P., Optimization by simulated evolution with application to standard cell placement, Proc. of DAC, pp. 20-25, 1990.

2. Jackson M.A.B. and Kuh E.S., Performance-driven placement of cell based IC's, Proc. of DAC, pp. 370-375, 1989.

3. Donath W.E. et al., Timing driven placement using complete path delays, Proc. of DAC, pp. 84-89, 1990.

4. Hauge P.S., Nair R., Yoffa E.J., Circuit placement for predictable performance, Proc. of IEEE/ACM ICCAD, pp. 88-91, 1987.

5. Dunlop A.E. et al., Chip layout optimization using critical path weighting, Proc. of DAC-84, pp. 133-136, 1984.

6. Marek-Sadowska M., Lin S.P., Timing driven placement, Proc. of ICCAD, pp. 94-97, 1989.

7. Ogawa Y., Ishii T. et al., Efficient placement algorithms optimizing delay for high-speed ECL masterslice LSI's, Proc. of DAC-86, pp. 404-410, 1986.

8. Dai W.M., Chen H.H. et al., BEAR: a new building-block layout system, Proc. of ICCAD, pp. 34-38, 1987.

9. Gao T., Vaidya P.M., Liu C.L., A new performance-driven placement algorithm, Proc. of ICCAD, pp. 44-47, 1991.

10. Gao T., Vaidya P.M., Liu C.L., A performance-driven macro-cell placement algorithm, Proc. of ICCAD, pp. 44-47, 1991.

11. Tetelbaum A.Y., Wey C.L., Bickart T.A., A cell-level-based performance-driven standard-cell placement, Technical report, Department of Electrical Engineering, Michigan State University, June, 1993.

UDC 658.512

M.Gams, B.Hribovsek

AN INTELLIGENT OS-AGENT INTERFACE

1. INTRODUCTION

This paper describes the design and testing of an intelligent-agent man-machine interface based on a mixture of a syntax-based approach, a memory-based approach [4,7] and an approach based on intelligent agents [1,5,6]. Communication between a human user and the VAX/VMS operating system was chosen as a test domain for testing software agent technology [2].

The paper is organised as follows: related approaches are discussed in Section 2, the intelligent operating interface IOI in Section 3, and tests in Section 4, followed by the concluding discussion in Section 5.

2. RELATED APPROACHES

According to [4], traditional approaches to translation often use explicit rules of knowledge and are guided by rigid control rules, e.g., the syntax of programming languages. However, for domains such as translation between two natural languages, it is almost impossible to obtain a complete set of rules for a given problem. The memory-based approach builds on memory as the foundation of intelligence. It is assumed that large numbers of specific events are stored in memory. New situations are first handled by recalling and comparing them with previous events, and then performing similar or modified reactions.

According to [5], intelligent agents represent personal assistants collaborating with the user in the same environment. Agents and humans both initiate communications, monitor events and perform tasks. The assisting agent learns and modifies according to the user's interests, habits and preferences. There is a rich set of emerging views and new terms such as knowbots, knobots, softbots, userbots, taskbots, personal agents etc.

The approach by Maes is based on self-programmable agents that learn by

♦observing the user

♦receiving positive and negative feedback from the user

♦receiving explicit instructions from the user

♦experience from the environment.

The above directions also guided the design of the IOI system. In addition, self-programmable agents often rely on memory-based approaches, as does IOI.

3. AN INTELLIGENT OS INTERFACE IOI

We have designed and implemented an Intelligent Operating Interface (IOI) for VAX/VMS [3] based on the above and two additional purposes:

♦to implement a rule-based and a memory-based approach and make a comparison
