Bus Deadlock Avoidance Using Reconfigurable Arbiter

International Journal of Applied Engineering Research ISSN 0973-4562 Volume 11, Number 3 (2016) pp 1989-1997 © Research India Publications. http://www.ripublication.com

Challa Harshavardhan
PG Scholar, 2nd M.Tech (VLSI & Embedded Systems), Department of Electronics and Communication Engineering, QIS Institute of Technology, Ongole, Andhra Pradesh, India.

Vaddempudi Koteswara Rao
M.Tech., [Ph.D], MIEEE, FIETE, PMACM, MACCS, MVSI, Associate Professor, Department of Electronics and Communication Engineering, QIS Institute of Technology, Ongole, Andhra Pradesh, India.

Abstract

On-chip communication has become complicated owing to the large number of masters and slaves on a chip. Communication protocols such as OCP (Open Core Protocol) and AXI (Advanced eXtensible Interface) have been developed to support the many transactions that may take place on the bus connecting the masters and slaves. Besides ordered transactions, out-of-order transactions are common on the bus. To improve communication efficiency, the responses to out-of-order transactions may need to be received in an order different from the request order. In doing so, a bus deadlock is possible if the bus system enters an unsafe state when the arriving requests form a trivial cycle. Bus deadlock avoidance techniques such as DALS (Deadlock Avoidance by Least Stalling) avoid deadlock by not forwarding a request that may cause deadlock until the previous requests are fulfilled. If the last request has high priority, its response may be unduly delayed. This paper presents the design of a bus system with a reconfigurable arbiter that prioritizes the requests in the desired manner at the time a deadlock is likely to occur and executes them one by one in the prioritized order, so that prioritized transactions are not delayed.

Keywords: Advanced eXtensible Interface (AXI), Deadlock Avoidance by Least Stalling (DALS), Open Core Protocol (OCP), out-of-order transaction, tagged transactions, bus deadlock, on-chip bus, reconfigurable arbiter.

Introduction

As the complexity of systems-on-chip (SoCs) increases, the communication traffic among the modules of the SoC increases substantially. The development of communication architectures has become a major concern in SoC design. Several interfacing modules are also needed for effective communication between the various modules and the bus. ARM developed the AMBA (Advanced Microcontroller Bus Architecture) [2] protocol for use with SoCs, especially for microcontrollers. It consists of two parts, the AHB (Advanced High-performance Bus) and the APB (Advanced Peripheral Bus), with a bridge between them to match the speeds of the high-speed AHB and the low-speed APB. The AHB connects the high-speed peripherals and the APB connects the low-speed peripherals. In these architectures, only one master and one slave can communicate at a time. The development of the Advanced eXtensible Interface (AXI) [1] and the Open Core Protocol (OCP) [3] facilitated parallel communication among a larger number of masters and slaves. The authors of [18] developed a memory controller to interface low-speed memory with the AMBA-AXI bus with full-duplex operation so that two masters can communicate simultaneously.

AXI and OCP support various types of transactions such as burst transactions, outstanding (pipelined) transactions, and out-of-order transactions [2], [3]. The way in which out-of-order transactions are handled by the master and the slave determines the performance of the processor. A dynamic RAM controller is used to handle these out-of-order transactions [3], [4], [5] in an order different from the one in which they arose, and also to improve the performance of the bus. When more than one master tries to access the slaves in a circular hold-and-wait state, a bus deadlock may occur and needs to be resolved. The authors of [1] devised a scheme called DALS (Deadlock Avoidance by Least Stalling) based on the BSG (Bus Status Graph). In this scheme a request is stalled only when forwarding it would form a nontrivial cycle in the BSG, thus avoiding the deadlock. In the DALS scheme, if all the forwarding requests arrive so that they may form a trivial cycle in the BSG, the arbiter stalls the last one, so that the deadlock is avoided by preventing the formation of the trivial cycle.

In the DALS scheme, no priority is assigned to any of the requests. In the proposed method, a reconfigurable arbiter is incorporated so that the forwarding requests that may form a trivial cycle in the BSG can be assigned priorities. The highest-priority request is granted access first, and the others follow in the fixed order, so that the execution of high-priority tasks is not delayed. Four methods of fixing the priority of the requests are presented in this paper.

Some Definitions and Classification

a. Bus transactions and models: A bus transaction is one in which a master sends a request to access a slave through a bus arbiter. The bus arbiter determines whether the master can access the slave and forwards the request to the slave. A classification of transactions is shown in Fig. 1.

b. Single and burst transactions: The basic bus transactions fall into two categories, single and burst transactions. Single transactions are those that request only one response. Burst transactions are those that request multiple responses. Burst transactions may be pipelined or non-pipelined.

c. Non-pipelined transaction: The basic non-pipelined transaction is shown in Fig. 2. It has two phases, the request phase and the response phase. In the request phase, the master sends a request (Req A) to the bus. If the slave is determined to be accessible by the master, the arbiter forwards this request (Req B) to the slave. The slave, when free, accepts this request (Req B) and sends an acknowledgement (Ack B) to the bus. The bus forwards this acknowledgement (Ack A) to the master, which completes the request phase. After some latency, the slave forwards the response (Resp C) to the bus, which then forwards it to the master (Resp D). The master acknowledges this response (Ack D) to the bus, which forwards the acknowledgement (Ack C) to the slave. This completes the response phase. In AXI and OCP, burst transactions are admitted, and these burst transactions require multiple data transfers. AXI allows only single-request burst transactions, whereas OCP allows both single-request and multiple-request burst transactions. In a single-request burst transaction, the master specifies the request information, such as the access address, the burst type and the burst length, only for the initial transfer. In a multiple-request burst transaction, the request information is specified for each transfer.
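The two-phase handshake described above can be written out as an ordered event list. The following Python sketch is only an illustration of the message sequence; the function name and the tuple layout are assumptions made for this example and are not part of the paper's VHDL design.

```python
def non_pipelined_transaction():
    """Yield the handshake events of one non-pipelined transaction in order.

    Each event is (phase, signal, sender, receiver). The signal names follow
    the labels used in the text (Req A, Req B, ...); the tuple layout is a
    hypothetical choice for this sketch.
    """
    # Request phase: the request travels master -> bus -> slave,
    # and the acknowledgement travels back slave -> bus -> master.
    yield ("request", "Req A", "master", "bus")
    yield ("request", "Req B", "bus", "slave")
    yield ("request", "Ack B", "slave", "bus")
    yield ("request", "Ack A", "bus", "master")
    # Response phase: the response travels slave -> bus -> master,
    # and its acknowledgement travels back master -> bus -> slave.
    yield ("response", "Resp C", "slave", "bus")
    yield ("response", "Resp D", "bus", "master")
    yield ("response", "Ack D", "master", "bus")
    yield ("response", "Ack C", "bus", "slave")


events = list(non_pipelined_transaction())
for phase, signal, sender, receiver in events:
    print(f"{phase:8s} {signal:6s} {sender} -> {receiver}")
```

Every message crosses the bus, so each phase consists of four events: two for the forward path and two for the acknowledgement path.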

Figure 1: Classification of Bus transactions.

Figure 2: Basic non-pipelined transaction


Figure 3: Pipelined transaction

d. Pipelined transaction: A pipelined transaction is one that is issued before the preceding one is completed. To carry out pipelined transactions, the masters and slaves require buffers at their interfaces with the bus, as shown in Fig. 3. Pipelined transactions are further categorized as tagged and untagged. Tagged transactions are those that are tagged with IDs. Ordered transactions are those that are executed in the same order in which the requests are made. Out-of-order transactions are those that are executed in an order different from the order in which the requests are made.

e. Pipeline depth: When pipelined transactions are issued successively, the uncompleted transactions are buffered either in the bus or in the slave. The pipeline depth is defined as the number of transactions that a master can issue before the previous transactions are completed. The pipeline depth is limited by the memory space available in the buffers.

f. Tagged transactions: A tagged transaction is one that is given an ID when it is issued. Ordered transactions are those for which the order of responses is the same as the issued order. Out-of-order transactions are those for which the order of responses differs from the issued order. Figure 4 illustrates the concept of IDs and tagged transactions. Generally, out-of-order transactions arise when a master sends requests to more than one slave.

g. Bus status graph (BSG): This graph is formed with slaves and IDs of transactions. A BSG formed with one tag ID and two slaves is shown in Fig. 5.
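The definitions of pipeline depth and tagged transactions above can be illustrated with a small Python sketch. The class `Master`, its methods and the slave names are hypothetical and serve only to show how a bounded buffer limits the number of outstanding transactions and how an out-of-order response is recognized. It is a simplified model under these assumptions, not the paper's implementation.

```python
class Master:
    """Hypothetical master model with a bounded buffer of outstanding transactions."""

    def __init__(self, pipeline_depth):
        self.pipeline_depth = pipeline_depth
        self.outstanding = []  # (tag_id, slave) pairs in issue order

    def issue(self, tag_id, slave):
        """Issue a tagged request; return False if the buffer is full."""
        if len(self.outstanding) >= self.pipeline_depth:
            return False  # pipeline depth reached, the master must stall
        self.outstanding.append((tag_id, slave))
        return True

    def respond(self, tag_id, slave):
        """Accept a response; return True if it arrived out of issue order."""
        if (tag_id, slave) not in self.outstanding:
            raise ValueError("response for a transaction that was never issued")
        out_of_order = self.outstanding[0] != (tag_id, slave)
        self.outstanding.remove((tag_id, slave))
        return out_of_order


m = Master(pipeline_depth=2)
issued = [m.issue(0, "S1"), m.issue(1, "S2"), m.issue(2, "S1")]
first = m.respond(1, "S2")   # the later request is answered first
second = m.respond(0, "S1")  # the oldest request is answered last
print(issued, first, second)
```

With a depth of two, the third request is refused until a response frees a buffer slot, and the response from S2 overtakes the earlier request sent to S1.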

Figure 4: Tagged transactions.

Figure 5: Bus Status Graph.


Figure 6: Condition for bus deadlock formed with four IDs and four slaves.

A slave or a tag ID is represented by a vertex in the BSG. An arrow indicating a request with tag ID d that is the first one accepted by slave Sj, among all the transactions accepted but not yet completed, is called the prime edge. An arrow indicating a request with the same tag ID d accepted by another slave Si is called a non-prime edge in the BSG.
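A minimal Python sketch of this bookkeeping follows. It keeps, for each tag ID, the slaves holding accepted but uncompleted transactions, oldest first, so the first entry corresponds to the prime edge and the rest to non-prime edges. The check `is_unsafe` assumes that a slave holding a non-prime edge waits for the slave holding the prime edge of the same ID, and it reports a circular wait of the kind illustrated in Figure 6. The class and method names are hypothetical, and this is not the data structure used in [1] or in the paper's VHDL design.

```python
from collections import defaultdict


class BusStatusGraph:
    """Hypothetical BSG model: tag ID -> slaves with pending transactions."""

    def __init__(self):
        self.pending = defaultdict(list)  # oldest accepted transaction first

    def accept(self, tag_id, slave):
        self.pending[tag_id].append(slave)

    def complete(self, tag_id, slave):
        self.pending[tag_id].remove(slave)

    def prime_edge(self, tag_id):
        """Slave holding the oldest pending transaction of this ID, or None."""
        slaves = self.pending.get(tag_id, [])
        return slaves[0] if slaves else None

    def non_prime_edges(self, tag_id):
        return self.pending.get(tag_id, [])[1:]

    def waits_for(self):
        """Edges (Si, Sj, d): Si holds a non-prime edge of d, Sj the prime edge."""
        edges = set()
        for tag_id, slaves in self.pending.items():
            for slave in slaves[1:]:
                if slave != slaves[0]:
                    edges.add((slave, slaves[0], tag_id))
        return edges

    def is_unsafe(self):
        """True if the slaves are in a circular waiting relation."""
        graph = defaultdict(set)
        for si, sj, _ in self.waits_for():
            graph[si].add(sj)
        visiting, done = set(), set()

        def dfs(node):
            # Depth-first search; reaching a node on the current path is a cycle.
            visiting.add(node)
            for nxt in graph.get(node, ()):
                if nxt in visiting or (nxt not in done and dfs(nxt)):
                    return True
            visiting.discard(node)
            done.add(node)
            return False

        return any(n not in done and dfs(n) for n in list(graph))


g = BusStatusGraph()
for d, s in [(1, "S1"), (2, "S2"), (3, "S3"), (4, "S4"),
             (1, "S2"), (2, "S3"), (3, "S4")]:
    g.accept(d, s)
before = g.is_unsafe()   # three waits, no cycle yet
g.accept(4, "S1")        # S1 now waits for S4, closing the cycle
after = g.is_unsafe()
print(before, after)
```

In this example four IDs and four slaves are used, as in Figure 6. The eighth accepted request closes the waiting cycle S1, S4, S3, S2, S1, which is the kind of request an avoidance scheme would stall or reorder.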

Bus Deadlock

Certain terminology related to the bus deadlock is described below. A slave Si is said to be waiting for another slave Sj if a transaction with tag ID d requesting access to Si has been accepted by Si, and an earlier transaction with the same tag ID has been accepted by Sj but Sj has not yet returned the response. This waiting relation can be written as an indicator W(i, j, d): W(i, j, d) = 1 if Si is waiting for Sj with tag ID d, otherwise W(i, j, d) = 0. If two slaves Sj and Si are issued transactions successively with the same tag ID d, and Si has to wait to return its response because Sj has not yet returned its response, Si is said to be blocked by Sj.

A bus system is said to be in an unsafe state if k different slaves Si1, Si2, Si3, ..., Sik and k different IDs d1, d2, d3, ..., dk are in a circular waiting relation, as shown in Figure 6. If a set of slaves is blocked in a circular way, the bus system is said to be in a deadlock. Figure 6 shows four slaves and four IDs forming a deadlock.

Existing Method for Deadlock Avoidance

The authors of [1] devised a scheme called DALS (Deadlock Avoidance by Least Stalling) for deadlock avoidance. In this method, the arbiter is designed so that it does not forward a request that would form a trivial cycle with the requests already being processed. DALS does not give a solution for the case in which all the requests that may form a trivial cycle arrive at the same time. To solve this problem, a reconfigurable arbiter is proposed that assigns priorities to the masters as programmed, so that when all the requests that form a trivial cycle arrive at the same time, the masters are granted access on the basis of the assigned priority and the deadlock is avoided.

Proposed Method for Deadlock Avoidance

In the proposed method, a reconfigurable arbiter is incorporated with two configuration pins to provide different types of priority:
1. Fixed priority
2. Round robin
3. First come first serve
4. Random selection

This method is implemented with four masters and four slaves interfaced to the bus, as shown in Fig. 7. The bus control circuitry is inside the bus and consists of:
1. Request buffer
2. Recorders
3. Decoder
4. Control circuit
5. Reconfigurable arbiter
6. Multiplexers
7. Request busy checker
8. Tagged transaction access controller
9. Write data buffer
10. Read data and write status buffer

Requests are made by the masters along with tag IDs. These requests are first stored in the request buffer. From the request buffer, all the transactions with the same tag ID are recorded in the same recorder. The number of recorders is equal to the number of IDs. In the present implementation, four IDs are used, so four recorders are used, one for each ID. The transaction address is decoded by the decoder. The outputs of the decoder, the recorders and the request buffer are given as inputs to the control circuit, which generates the control signal to the reconfigurable arbiter. Based on the programming of the configuration pins, the reconfigurable arbiter generates the select signal to the multiplexer, which determines the slave to which the request is to be forwarded. The request busy checker checks whether the previous request has been completed. Only if the previous request has been completed does it enable the control circuit to initiate the arbiter. The write data are stored in the write data buffer. The tagged transaction access controller determines the master to which the write data or read data permission is to be granted. Accordingly, it generates select signals for controlling the multiplexers. The read data and the status of the write data are stored in the "read data and write status buffer". In this way the transaction is completed. The arbiter is configured by programming the configuration pins as shown in Table I.

Figure 7: Architecture of bus with reconfigurable arbiter.

Table I: Priorities assigned to masters based on configuration of reconfigurable arbiter

Technique           Select Arbitration   Order of priority
Fixed Priority      00                   Master 2, 1, 3, 4
Round robin         01                   Master 1, 2, 3, 4
FCFS                10                   In the incoming order of requests
Random Selection    11                   As decided by random sequence generator
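The four arbitration modes of Table I can be sketched behaviourally in Python. This is an illustrative model under stated assumptions, not the paper's VHDL arbiter. The class name, the `grant_order` method, the rotation rule used for round robin and the seeded `random.Random` that stands in for the hardware random sequence generator are all hypothetical. Only the two-bit select codes and the fixed order 2, 1, 3, 4 are taken from Table I.

```python
import random
from enum import Enum

NUM_MASTERS = 4
FIXED_ORDER = (2, 1, 3, 4)  # fixed-priority order from Table I


class Mode(Enum):
    FIXED = 0b00
    ROUND_ROBIN = 0b01
    FCFS = 0b10
    RANDOM = 0b11


class ReconfigurableArbiter:
    """Hypothetical behavioural model of the two-pin reconfigurable arbiter."""

    def __init__(self, select, seed=0):
        self.mode = Mode(select)
        self.next_rr = 1                # master favoured next in round robin
        self.rng = random.Random(seed)  # stand-in for the sequence generator

    def grant_order(self, requests):
        """requests: list of (master, arrival_time); returns masters in grant order."""
        masters = [m for m, _ in requests]
        if self.mode is Mode.FIXED:
            return sorted(masters, key=FIXED_ORDER.index)
        if self.mode is Mode.ROUND_ROBIN:
            order = sorted(masters, key=lambda m: (m - self.next_rr) % NUM_MASTERS)
            # Assumed rotation: the next batch starts one master further on.
            self.next_rr = order[0] % NUM_MASTERS + 1
            return order
        if self.mode is Mode.FCFS:
            return [m for m, _ in sorted(requests, key=lambda r: r[1])]
        order = masters[:]
        self.rng.shuffle(order)
        return order


simultaneous = [(1, 0), (2, 0), (3, 0), (4, 0)]
fixed = ReconfigurableArbiter(0b00).grant_order(simultaneous)
rr = ReconfigurableArbiter(0b01)
rr_first, rr_second = rr.grant_order(simultaneous), rr.grant_order(simultaneous)
fcfs = ReconfigurableArbiter(0b10).grant_order([(1, 3), (2, 1), (3, 2), (4, 0)])
rand = ReconfigurableArbiter(0b11).grant_order(simultaneous)
print(fixed, rr_first, rr_second, fcfs, rand)
```

Granting the pending requests one at a time in the returned order means that no two of them are forwarded together, which is how a prioritized order can stand in for stalling the last request.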

Simulation Results

The proposed method is implemented in VHDL using Xilinx 13.2 targeted at an Artix-7 FPGA. The simulation results are given in Figs. 8 to 12.

a. Fig. 8 shows the bus deadlock formation when the reconfigurable arbiter is disabled and all four masters send requests at the same time, each with the same two IDs. The deadlock is indicated by the unsafe state signal going high. When the deadlock is formed, none of the slaves is granted permission to send data. This can be observed from all four slave signals going low.

b. Fig. 9 shows the deadlock avoidance when the four masters send requests simultaneously and permission is granted in fixed priority (00), as given in Table I. The unsafe state signal goes low and the slaves are granted permission in the fixed order, as can be observed from the test bench waveform of Fig. 9. Slave 1 is granted permission first, followed by slave 2, slave 3 and slave 4, in that order.

c. Fig. 10 shows the deadlock avoidance when the four masters send requests simultaneously and permission is granted by the round robin technique (01), as given in Table I.

d. Fig. 11 shows the deadlock avoidance when the four masters send requests at different times and permission is granted on a first come first serve basis (10), as given in Table I.

e. Fig. 12 shows the deadlock avoidance when the four masters send requests simultaneously and permission is granted by random selection (11), determined by a sequence generator, as given in Table I.

Figure 8: Simulation results with bus under dead lock condition when all the 4 masters send requests at the same time, the reconfigurable arbiter is disabled and the unsafe state high.


Figure 9: Simulation results with reconfigurable arbiter programmed for fixed priority (00) showing dead lock avoidance and unsafe state low.

Figure 10: Simulation results with reconfigurable arbiter programmed for round robin technique (01) showing dead lock avoidance and unsafe state low.


Figure 11: Simulation results with reconfigurable arbiter programmed for first come first serve basis (10) showing deadlock avoidance and unsafe state low.

Figure 12: Simulation results with reconfigurable arbiter programmed for random Selection (11) showing dead lock avoidance and unsafe state low.

Conclusion

The bus deadlock avoidance scheme with a reconfigurable arbiter is implemented in VHDL using Xilinx 13.2 software targeted at an Artix-7 FPGA. The simulation results show that when all the requests that may form a trivial cycle arrive at the same time, the reconfigurable arbiter proves useful because it can assign priorities to the transactions in the desired manner. This kind of prioritized arbiter is useful in applications with some high-priority devices.

Acknowledgement

The authors express their gratitude to the management of QIS Institute of Technology, Ongole, Andhra Pradesh, India, for providing laboratory facilities for the development of this project.

References

[1] C.-Y. Chang and K.-J. Lee, "On deadlock problem of on-chip buses supporting out-of-order transactions," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 22, no. 3, pp. 484–496, Mar. 2014.
[2] Advanced Microcontroller Bus Architecture Specification. (1997) [Online]. Available: http://www.arm.com
[3] Open Core Protocol Specification. (2006) [Online]. Available: http://www.ocpip.org/home
[4] A. T. Tran and B. M. Baas, "RoShaQ: High-performance on-chip router with shared queues," in Proc. IEEE 29th Int. Conf. Comput. Design, Oct. 2011, pp. 232–238.
[5] J. Shao and B. T. Davis, "A burst scheduling access reordering mechanism," in Proc. IEEE 13th Int. Symp. High Perform. Comput. Archit., Feb. 2007, pp. 285–294.
[6] J. Pang, L. Yang, L. Shi, T. Zhang, D. Wang, and C. Hou, "A priority expression-based burst scheduling of memory reordering access," in Proc. Int. Conf. Embedded Comput. Syst., Archit., Model., Simul., Jul. 2008, pp. 203–209.
[7] X. Xiao and J. J. Lee, "A true O(1) parallel deadlock detection algorithm for single-unit resource systems and its hardware implementation," IEEE Trans. Parallel Distrib. Syst., vol. 21, no. 1, pp. 4–19, Jan. 2010.
[8] A. Silberschatz, P. B. Galvin, and G. Gagne, Operating System Concepts, 7th ed. New York, USA: Wiley, 1993.
[9] T. S. Cummins, "Method and apparatus for detecting a bus deadlock in an electronic system," U.S. Patent 6 292 910, Sep. 18, 2001.
[10] Technical Reference Manual of PrimeCell AXI Configurable Interconnect (PL300), ARM, Cambridge, U.K., 2010.
[11] K. Lahiri, A. Raghunathan, and G. Lakshminarayana, "The LOTTERYBUS on-chip communication architecture," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 14, no. 6, pp. 596–608, Jun. 2006.
[12] K. Sekar, K. Lahiri, A. Raghunathan, and S. Dey, "Dynamically configurable bus topologies for high-performance on-chip communication," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 16, no. 10, pp. 1413–1426, Oct. 2008.
[13] F.-M. Xiao, D.-S. Li, G.-M. Du, Y.-K. Song, D.-L. Zhang, and M.-L. Gao, "Design of AXI bus based MPSoC on FPGA," in Proc. 3rd Int. Conf. Anti-Counterfeit., Security, Identificat. Commun., Aug. 2009, pp. 560–564.
[14] H.-W. Wang, C.-S. Lai, C.-F. Wu, S.-A. Hwang, and Y.-H. Lin, "On-chip interconnection design and SoC integration with OCP," in Proc. IEEE Int. Symp. VLSI Design, Autom., Test, Apr. 2008, pp. 25–28.
[15] N. Y.-C. Chang, Y.-Z. Liao, and T.-S. Chang, "Analysis of shared-link AXI," IET Comput. Digit. Tech., vol. 3, no. 4, pp. 373–383, Jul. 2009.
[16] O. Ogawa, S. Bayon de Noyer, P. Chauvet, K. Shinohara, Y. Watanabe, H. Niizuma, T. Sasaki, and Y. Takai, "A practical approach for bus architecture optimization at transaction level," in Proc. Design, Autom., Test Europe Conf. Exhibit., 2003, pp. 176–181.
[17] Synopsys Design Compiler. (2010) [Online]. Available: http://www.synopsys.com/Tools/Implementation/RTLSynthesis/Pages/default.aspx
[18] Santhoshi Yadav, Pulicharla, Vaddempudi Koteswara Rao, "Amba-Axi Compliant Memory Controller," IJRRECS, vol. 1, no. 4, pp. 316–321, Aug. 2013, ISSN 2321-5461.

Authors Biography

1. Sri Challa Harshavardhan received his B.Tech. degree from the Department of Electronics and Communication Engineering, VNR College of Engineering (affiliated to JNTU, Kakinada), Ponnur, Guntur Dt., AP, India. He is presently pursuing his M.Tech. at QIS Institute of Technology, Ongole, AP, India. His current research interests are VLSI and Embedded Systems.

2. Sri Vaddempudi Koteswara Rao received his B.Sc. degree from Andhra University, Waltair, AP, India. He received his AMIETE degree in Electronics and Communication Engineering from IETE, India. He received his M.Tech. in Embedded Systems and VLSI Design from JNTU, Hyderabad. He is a member of IEEE, USA, a Fellow of IETE, India, a Professional Member of ACM, USA, a Life Member of the Advanced Computation and Communication Society, India, and a Member of the VLSI Society of India. He worked for the AP State Govt., India, as Deputy Executive Information Engineer for several years. He is presently working as Associate Professor in the ECE department of QIS Institute of Technology, Ongole, AP, India. He is also a research scholar in VLSI at Vignan University, Vadlamudi, Guntur Dt., AP, India.
