
A Simple Lock Manager for Distributed Heterogeneous Systems

Arash Khosraviani, Omid Kashefi and Mohsen Sharifi
School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran
[email protected], {kashefi, msharifi}@iust.ac.ir

Abstract: Distributed systems require an efficient way to manage access to shared data, and a lock management system is a software-based solution to this problem. Most current locks are special-purpose and do not account for the heterogeneity of these systems. In this paper, we use the concept of the semaphore as the basic structure for managing critical regions in a distributed heterogeneous system. We propose a distributed lock mechanism that is implemented with the aims of simplicity and the ability to run on heterogeneous systems. A comparison of this lock mechanism with other comparable mechanisms indicates that it has the necessary and sufficient locking facilities and supports heterogeneous distribution. We also report the average operation times of our implemented lock.

I. INTRODUCTION

By exploiting parallelism, an application can be divided into parts with different priorities that work separately and communicate when needed. Cooperation between these parts is implemented by means of inter-process communication (IPC) mechanisms. Sharing a resource such as shared memory or a shared file system is an IPC mechanism that simplifies communication between different threads or processes. However, it is hard to handle difficulties such as concurrent accesses to memory [1]. When two or more processes may access shared memory or a shared file, we must manage them so that no two processes can change the shared data at the same time. In other words, what we need is mutual exclusion [1].

A software solution for mutual exclusion is a lock manager. The lock manager ensures that changes to data take place only when exactly one process is accessing it. The most basic way to achieve mutual exclusion in uniprocessor environments is to disable interrupts and to use the TSL instruction [1]. Dekker [2] was the first to offer a software-based solution to the mutual exclusion problem. Dijkstra [2] then introduced semaphores to achieve mutual exclusion. Afterwards, Hoare [3] and Hansen [4] presented a high-level tool, named the monitor, which uses semaphores as its basis.

Beyond the single-system solutions mentioned above, many lock managers handle mutual exclusion in distributed systems. At the beginning of the 1990s, large companies raised various issues about managing locks in distributed systems. HP [5] initiated the work. Then, in 1997, IBM [6] produced its lock manager. In the same year, Oracle [7] included a lock manager in its servers.

In this paper, we propose a simple lock manager for use in distributed systems. The proposed lock manager is based on the concept of the semaphore to manage critical regions. We provide a simple centralized lock manager for distributed systems and explain its structure and features. The main objectives of our lock are simplicity and the ability to run on heterogeneous systems. Therefore, the API is simple and compatible with the Linux and Windows operating systems. In addition, we have tried to increase performance on systems with multi-core processors through careful use of threads. We compare our mechanism with other well-known lock managers and show that the proposed lock manager has the necessary and sufficient features expected from a simple lock manager.

The rest of the paper is organized as follows. Section 2 contains a short description of lock managers and their architectures. Section 3 presents the proposed lock manager, its structures, and the way it is implemented. Section 4 evaluates this implementation and Section 5 concludes the paper.

II. BACKGROUND AND RELATED WORK

In this section, we present some background on lock managers and related work, and briefly introduce single-system and distributed lock managers.

A. Single System Locks

In multitasking operating systems, where two or more processes read or write shared data, the output depends on that shared data. Any unwanted change to shared data may cause a malfunction. A critical region is the part of a program that accesses shared data. Therefore, we should develop the program so that two threads can never be in a critical region at the same time. What we need is mutual exclusion in the critical region.

Using a lock management system is a software-based solution to problems involving critical sections. Whenever a thread needs to access the critical region, it asks the lock manager for permission. If there is no thread in that region, the requesting thread gains the privilege and the region is locked. Otherwise, if there is an active access to the shared data, the lock manager puts the second thread in a waiting queue. Once the critical region becomes free, the first thread in the queue is activated and can continue its job.

In 1968, Dijkstra [2] proposed a simple and elegant solution for achieving mutual exclusion by introducing a kind of variable named the semaphore. Despite the development of many kinds of synchronization mechanisms, semaphores are still available in current programming languages and operating systems. Semaphores are useful for building high-level lock managers for programming languages that do not have a built-in lock manager.

A semaphore is a variable that can be zero or a positive value. The value of a semaphore is changed using the UP and DOWN instructions, each executed atomically. When a semaphore is zero, any further execution of DOWN causes the executing thread to sleep in a waiting queue. A positive value enables a thread to run DOWN and enter the critical region.

Many researchers have presented extensions of the semaphore. They tried to provide more efficient and powerful semaphores while keeping the simplicity of the original concept. Keedy et al. [8] implemented semaphores with sets of bit strings that indicate a resource set for resource management. Adding priority was an extension that enabled the use of priority semaphores with support for preemption and shared access by certain process classes [9]. Hodgson et al. [10] introduced the interval semaphore, which makes a semaphore work between two integers, reducing the number of usages.

B. Lock Management in Distributed Systems

As explained in the introduction, programs in a distributed system need to communicate with each other and share resources such as files, data, and even code. The goal of a distributed lock manager is to provide facilities that let distributed software synchronize different processes in minimum time and use each other's information.
If this does not happen, the distributed system loses its meaning and is just a set of standalone computers.

Based on the placement of software components, distributed system architectures fall into three categories: centralized, decentralized and hybrid [11]. Distributed lock manager architectures are likewise categorized into three groups, based on the placement of their database and the strategy used to assign the lock server.

The centralized lock manager is the simplest known architecture. A central server is responsible for all lock-related work. Other nodes send their lock and unlock requests to the main server and wait for the answer. A centralized lock manager operates well in small systems with low competition for resources. However, when the system is large, its major drawback appears and the main server becomes the bottleneck of the system. In addition, when most requests are local, this architecture performs inefficiently because it wastes time communicating with the server. Having global information about processes and the locks they hold is the main advantage of this architecture and allows better administration [12].

In the decentralized architecture, multiple computers cooperate with each other to keep the whole system and its resources stable. There are different types of decentralized lock manager. One type saves the data related to resources in a distributed database. In other words, instead of a single system, multiple servers together respond to requests [13]. Another type keeps waiting processes in a distributed queue. If a process needs ownership of a lock, it looks in its local cache first. If the ownership right is available there, the process uses the lock immediately. Otherwise, the local lock manager broadcasts its request over the network and the request is placed in the global waiting queue. Whenever the lock becomes free, the first process in the queue takes ownership [14]. The decentralized architecture works well in distributed systems with low contention. Unlike the centralized lock manager, when a process mostly accesses local resources, this type of architecture greatly reduces the system overhead [12].

The final category is the hybrid architecture, which uses elements of both the centralized and decentralized architectures. This lock manager starts like a centralized manager, but its strategy changes based on the behavior of the software. After a while, if the lock manager finds a computer that accesses a specific lock frequently, it transfers ownership of the lock to that computer to reduce the access time. In other words, the lock manager's strategy changes to a decentralized architecture. Despite the benefits of this strategy, it offers less control than centralized lock management.

There are many distributed lock managers for special-purpose systems. Most distributed file systems use their own synchronization mechanisms. IBM's GPFS [15] uses a centralized lock manager, which coordinates locks between local lock managers by handing out lock tokens.
Parallax [16], which uses virtualization for a storage system, has a simple centralized lock. This lock is needed only when performing administrative operations such as creating new virtual disk images. There is also a distributed lock manager for the PVFS parallel file system [17], which provides noncontiguous locking for handling atomicity within shared files. Another lock service, used by Google's Bigtable [18], is Chubby [13]. ZooKeeper [19] is a service similar to Chubby for the Apache Hadoop project. Chubby and ZooKeeper are file-based lock services with an emphasis on availability and reliability. To achieve this goal, instead of a single server, Chubby has server cells, each comprising five computers.

Some other locks are designed for special networks such as InfiniBand. InfiniBand [20] supports Remote Direct Memory Access (RDMA). Compare-and-Swap is an RDMA operation equivalent to TSL and is used as a locking instruction [21]. DQL [14] also uses RDMA operations, but the distinction is that DQL manages the waiting queue in a decentralized manner. These types of locks operate very fast and are used in high-performance systems.

MP-locks [12] is a set of three locks, each based on one of the architectures discussed. MP-cent has a centralized architecture and MP-decent is decentralized. MP-react introduces a new structure that belongs to the combinational (hybrid) architecture. MP-react changes its strategy based on the reaction of the whole system. Finally, Sadjadi [22] presented a method for controlling concurrent access to shared resources using a distributed lock manager.
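
Since the lock manager proposed in the next section builds on semaphores, the DOWN and UP semantics of Section II.A can be made concrete with a small sketch. The Python code below is illustrative only and is not taken from the implementation described in this paper; the class name `CountingSemaphore` and the `blocking` flag are assumptions. Sleeping threads are woken in whatever order `threading.Condition` chooses, which is not guaranteed to be the arrival order.

```python
import threading


class CountingSemaphore:
    """Counting semaphore with atomic DOWN and UP (illustrative sketch)."""

    def __init__(self, value=1):
        if value < 0:
            raise ValueError("a semaphore value is zero or positive")
        self._value = value
        self._cond = threading.Condition()  # guards _value and holds sleepers

    def down(self, blocking=True):
        """Decrement and return True; while the value is zero, sleep
        (blocking) or return False at once (non-blocking)."""
        with self._cond:
            while self._value == 0:
                if not blocking:
                    return False
                self._cond.wait()
            self._value -= 1
            return True

    def up(self):
        """Increment the value and wake one sleeping thread, if any."""
        with self._cond:
            self._value += 1
            self._cond.notify()

    @property
    def value(self):
        with self._cond:
            return self._value
```

A thread would call `down()` before entering its critical region and `up()` on leaving it; with an initial value of 1 the semaphore behaves as a mutual-exclusion lock.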

III. PROPOSED LOCK MANAGEMENT MECHANISM

In this section, we propose a simple lock manager in support of distributed heterogeneous systems. The first issue in the design of the lock mechanism is its architecture. Among the available options, we have chosen the centralized architecture. The main reasons are simplicity and performance. Compared to decentralized and hybrid architectures, the centralized architecture requires fewer message exchanges, which increases efficiency [12]. In addition, the code can be changed to scale it up for use on multiple servers. By choosing this architecture, therefore, the emphasis is on simplicity. Considering the simplicity and heterogeneity objectives, we decided to use common networking technologies (i.e., Ethernet and Gigabit Ethernet); we did not consider special network technologies such as InfiniBand [20].

To initiate the lock manager, clients and server must identify each other. This identification consists of detecting each other's network addresses. In the designed lock, we assume that server and client should identify and communicate with each other without requiring any special information. Hence, the client determines its own address and broadcasts it on the network. Whenever the server receives this message, it sends its address as a response to the client's address. In this way, the client learns the server's address, and clients and server are able to communicate with each other. This describes the case in which the main server is available and waiting for new client connections. Conversely, clients can also wait for a server to establish a connection. In this case, the server broadcasts its own address and the clients can then communicate with the server.

After initial recognition, clients can create different locks. So that all clients can identify a lock, each lock is created with a unique ID. The main server distinguishes different locks by an integer number. This integer is the lock ID and is provided to all processes that use the lock. Therefore, different parts of a distributed program can use a single lock.

The proposed lock manager provides two types of locking for the critical section: blocking and non-blocking. In the blocking type, when a client process requests a lock that is not available, the client process sleeps until the lock becomes available.

In non-blocking mode, when a client sends its request and the requested lock is free, it receives a success message. Otherwise, the server sends a failure message, indicating that the lock is not available. After receiving a failure, depending on the programmer's strategy, the program can continue or try again.

We consider both shared and exclusive mode locking in our proposed lock mechanism. If a lock is held in shared mode, other processes can acquire the lock in shared mode, but exclusive requests are not granted until the lock is released. If the lock is held in exclusive mode, all other requests are placed in a queue to be served in order, and thus starvation does not occur.

To avoid deadlock, the lock manager must also check the aliveness of the lock owner. Normally, if a computer obtains a lock and dies before explicitly releasing it, the lock remains allocated to that client. To avoid this problem, the central lock manager must ensure that the lock owner is alive and working. Therefore, the lock manager continually communicates with the lock holders to verify their availability. If a client does not respond in due course, the corresponding lock is marked as released and the manager continues to work.

C. Implementation of the Proposed Lock Manager

We have implemented our proposed lock manager in C++. We selected this language because of libraries such as the STL [23]. The STL provides many of the basic algorithms and data structures of computer science for the C++ language. However, after starting the implementation, the choice leaned toward C, so all function bodies are written in structured C.

On the server side, we have to maintain all information about the locks, the active processes and the waiting ones. We implement the main database as the data structure shown in Figure 1, which keeps the lock information as a linked list; each element in the list contains data such as the lock ID, the current lock status, a pointer to a local semaphore, the requesting processes, and the active processes. As introduced in Section 3, the lock ID is a key that distinguishes the various locks. The current lock status can be free, busy in writing mode, or busy in reading mode. For every lock there is also a semaphore that prevents simultaneous access to the database and unwanted changes to the current lock status. When the lock is busy, a queue holds information about other requests. Maintaining the order of requests is important, and thus we use a queue structure to meet this requirement. All data received from a client, such as the locking mode, lock ID, and client address, are stored in this queue. This queue is a two-way (double-ended) queue, which provides easy access to the first and last elements and improves performance.

Figure 1. Overview of the Data Structures Used in the Lock Management Mechanism
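
The lock database and the shared/exclusive rules described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the C/C++ code of the implementation: the names `LockEntry`, `LockTable`, `request` and `release` are hypothetical, a dictionary stands in for the linked list, and the per-lock semaphore that protects the entry is omitted because the sketch is single-threaded. Following the text, a shared request is granted whenever the lock is free or held in reading mode.

```python
from collections import deque
from dataclasses import dataclass, field

FREE, READING, WRITING = "free", "reading", "writing"
SHARED, EXCLUSIVE = "shared", "exclusive"


@dataclass
class LockEntry:
    lock_id: int
    status: str = FREE
    active: deque = field(default_factory=deque)   # clients holding the lock
    waiting: deque = field(default_factory=deque)  # (client, mode), arrival order


class LockTable:
    def __init__(self):
        self.locks = {}  # lock_id -> LockEntry (assumption: dict, not a list)

    def create(self, lock_id):
        return self.locks.setdefault(lock_id, LockEntry(lock_id))

    @staticmethod
    def _compatible(entry, mode):
        # Free locks accept anything; reading locks accept more readers.
        return entry.status == FREE or (entry.status == READING and mode == SHARED)

    @staticmethod
    def _grant(entry, client, mode):
        entry.status = READING if mode == SHARED else WRITING
        entry.active.append(client)

    def request(self, lock_id, client, mode, blocking=True):
        """Return 'granted', 'queued' (blocking) or 'failed' (non-blocking)."""
        entry = self.locks[lock_id]
        if self._compatible(entry, mode):
            self._grant(entry, client, mode)
            return "granted"
        if blocking:
            entry.waiting.append((client, mode))
            return "queued"
        return "failed"

    def release(self, lock_id, client):
        """Drop one holder, then serve the queue head while it is compatible.
        Returns the clients that were granted the lock as a result."""
        entry = self.locks[lock_id]
        entry.active.remove(client)
        if not entry.active:
            entry.status = FREE
        granted = []
        while entry.waiting and self._compatible(entry, entry.waiting[0][1]):
            nxt, mode = entry.waiting.popleft()
            self._grant(entry, nxt, mode)
            granted.append(nxt)
        return granted
```

In this sketch a blocking client that receives `'queued'` would sleep until the server notifies it, while a non-blocking client that receives `'failed'` decides for itself whether to retry.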

Locking in shared mode causes the manager to store data about the active processes. This data is required for checking the aliveness of clients. The data structure used for this purpose is a queue like the one used for storing waiting processes.

On the server, the lock manager uses multithreading to manage lock requests, with each thread responsible for one task. Typically, three threads run on the server, and a new thread is created for every client request. To enhance the portability of the code, we have implemented the threads using the second version of PThread [24].

The first thread is responsible for answering broadcast requests that ask for the lock manager's address. Each request contains the client's address, so the corresponding response, containing the server's address, is sent to that client address. All message broadcasting is based on UDP. After answering a request, the thread sleeps until another request arrives. Two different ports are used for sending and receiving messages, because the lock manager can itself be a client.

The second thread checks the aliveness of lock owners. In each cycle, it examines all locks. To do so, the thread retrieves the information about the systems that are using the locks, which is stored in the active-clients queue. Next, the server tries to communicate with each lock-holding client; if this fails on two consecutive attempts, the information about the allocation of the lock to that client is removed from the lock database, so the lock is treated as free and can be reassigned to other clients. This communication is a TCP three-way handshake. After these steps, the thread sleeps for about 200 milliseconds and then repeats them.
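
The aliveness check performed by the second thread can be sketched as below. This Python code is a hedged illustration rather than the actual implementation: the function names, the `holders` and `failures` dictionaries, and the 0.5 s probe timeout are assumptions, and the sketch assumes that the two consecutive failures are counted across successive cycles. The probe is a plain TCP connection attempt, which succeeds once the three-way handshake completes.

```python
import socket
import time


def tcp_probe(address, timeout=0.5):
    """Return True if a TCP connection (three-way handshake) succeeds."""
    try:
        with socket.create_connection(address, timeout=timeout):
            return True
    except OSError:
        return False


def check_holders(holders, failures, probe=tcp_probe, max_failures=2):
    """Run one checking cycle over all locks.

    holders:  dict lock_id -> list of client addresses holding that lock
    failures: dict (lock_id, address) -> consecutive failed probes so far
    Returns the (lock_id, address) allocations dropped in this cycle.
    """
    dropped = []
    for lock_id, clients in holders.items():
        for address in list(clients):            # copy: we may remove entries
            key = (lock_id, address)
            if probe(address):
                failures.pop(key, None)          # a success resets the count
                continue
            failures[key] = failures.get(key, 0) + 1
            if failures[key] >= max_failures:
                clients.remove(address)          # lock is treated as free again
                del failures[key]
                dropped.append(key)
    return dropped


def run_checker(holders, stop_event, period=0.2):
    """Repeat the cycle, sleeping about 200 ms between passes."""
    failures = {}
    while not stop_event.is_set():
        check_holders(holders, failures)
        time.sleep(period)
```

Passing the probe as a parameter keeps the cycle logic testable without a network; `run_checker` would be started in its own thread with a `threading.Event` as `stop_event`.
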
In addition to these threads, there is another important thread, which is responsible for the main tasks: creating a lock ID, dropping a lock ID, granting a lock and unlocking. If the request is to create or release a lock namespace, this thread handles the request itself. As a result, other requests wait in line while a lock with a specific ID is being created or released. If locking or unlocking a semaphore is requested, another thread is created to handle the job according to the request type. This thread is configured as detached, so that it cannot be canceled in the middle of its task.
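
The division of work in the main thread can be illustrated with a short dispatcher. This Python sketch is an assumption-laden illustration, not the paper's code: the request layout (a dictionary with a `type` key), the type names and the `handlers` mapping are hypothetical. Python threads have no detach operation, so `daemon=True` is used only as a loose stand-in for the detached PThread described above.

```python
import threading

INLINE = {"create", "drop"}      # handled by the main thread itself
THREADED = {"lock", "unlock"}    # each handled by a newly created thread


def dispatch(request, handlers):
    """Route one request; return the worker thread, or None if handled inline.

    request:  dict with at least a 'type' key
    handlers: dict mapping a request type to a callable taking the request
    """
    kind = request["type"]
    if kind in INLINE:
        handlers[kind](request)  # later requests wait until this returns
        return None
    if kind in THREADED:
        worker = threading.Thread(
            target=handlers[kind], args=(request,), daemon=True
        )
        worker.start()
        return worker
    raise ValueError(f"unknown request type: {kind!r}")
```

Handling create and drop inline serializes changes to the set of locks, while lock and unlock requests, which may have to wait, do not stall the main thread.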

We implemented the proposed lock manager to support heterogeneous systems. Because socket programming in both Unix-like operating systems and Windows is based on the BSD sockets interface [25], maintaining two versions of the code is not difficult. The differences consist of replacing the Linux socket header files and adding code to call Winsock.dll on Windows. For broadcasting a message on the network, on Linux we can obtain the broadcast address of each network interface and send the request to these addresses. On Windows, this part of the program differs: the request is sent to the IP address 255.255.255.255, which makes the message reach the internal network. The last issue is signal handling. Initially, we implemented the client aliveness check using Linux signal handling features, but a useful equivalent of this code on Windows requires complex structures. Therefore, we use a single thread for this job. We have implemented the threads using the PThread library, which is available on Linux by default and simply needs PThreadV2.dll [24] to run on Windows operating systems.
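
The address exchange between client and server can be sketched as follows. The message tags `DLM_DISCOVER` and `DLM_OFFER`, the text payload layout and all function names are hypothetical; the paper does not specify a wire format. Only the standard BSD socket calls are used, with the limited broadcast address 255.255.255.255 as the default destination, as in the Windows variant described above.

```python
import socket

DISCOVER = b"DLM_DISCOVER"   # hypothetical message tags
OFFER = b"DLM_OFFER"


def encode(tag, host, port):
    """Build a payload such as b'DLM_DISCOVER 192.168.1.20 5001'."""
    return b" ".join([tag, host.encode("ascii"), str(port).encode("ascii")])


def decode(payload):
    """Split a payload back into (tag, host, port)."""
    tag, host, port = payload.split(b" ")
    return tag, host.decode("ascii"), int(port)


def answer(payload, server_host, server_port):
    """Server side: return (reply, destination) for a discovery request,
    or None if the payload is not a discovery request."""
    tag, host, port = decode(payload)
    if tag != DISCOVER:
        return None
    return encode(OFFER, server_host, server_port), (host, port)


def broadcast_discover(my_host, my_port, discovery_port,
                       broadcast_ip="255.255.255.255"):
    """Client side: broadcast our own address so the server can reply to it."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(encode(DISCOVER, my_host, my_port),
                    (broadcast_ip, discovery_port))
```

On Linux, `broadcast_discover` could be called once per interface with that interface's broadcast address passed as `broadcast_ip`.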

IV. EVALUATION

According to the related work in Section 2 and our lock design in Section 3, every lock manager discussed has a set of common characteristics and one or more outstanding features that distinguish it from the others. Table 1 compares the most common lock managers with our proposed lock manager. Cells marked "--" indicate that no data about that feature is available, either directly or indirectly. As shown in the last column of Table 1, all locks are multimode and are capable of locking in shared and exclusive modes. DQL [14] is a lock that uses high-speed InfiniBand networks and has a decentralized architecture. Its major feature is that the waiting queue is built as a distributed structure. MP-Locks [12] supports three different architectures. The Parallax lock manager [16] is a mechanism used for virtual disks. Chubby [13] provides high reliability and is able to recover after stopping.


TABLE I. COMPARISON OF DIFFERENT LOCK MECHANISMS

Lock                | Blocking or Non-blocking | Architecture                             | Recovery | Heterogeneous | Special Network | Multimode
Parallax            | --                       | Centralized                              | Manually | --            | --              | Yes
PVFS Lock           | Blocking/Non-blocking    | Centralized                              | --       | --            | --              | Yes
Chubby (ZooKeeper)  | Blocking/Non-blocking    | Centralized                              | Yes      | --            | --              | Yes
MP-Locks            | --                       | Centralized/Decentralized/Combinational  | --       | --            | --              | Yes
DQL                 | Non-blocking             | Decentralized                            | --       | --            | InfiniBand      | Yes
Our Proposed DLM    | Blocking/Non-blocking    | Centralized                              | --       | Yes           | --              | Yes

Our implemented lock is capable of running on heterogeneous systems. In addition, it is a simple, general-purpose lock that can be used in many different systems as the basis of the lock management component.

Besides listing features, we evaluated the average time of the lock, unlock, create and release operations of our implemented lock. We repeated each test 50 times and recorded the mean time. The test systems were two PCs, one with an Intel Core2Duo 3.0 GHz processor, 1 GB of RAM, Windows XP SP3 and Linux kernel version 2.6.49, and the other with an Intel Core2Duo 2.0 GHz processor, 2 GB of memory, Windows Vista SP1 and Linux kernel version 2.6.27. The connection was 100 Mbps Ethernet. The results are shown in Table 2, which compares the operation times of the Linux and Windows versions of our proposed lock manager.

The unlocking operation consists of establishing a TCP connection and sending the request. The most time-consuming part of this operation is the three-way TCP handshake. Among all operations, locking takes the longest, because of the time needed for thread creation and context switching. The comparison also shows that Linux performs better on creating and releasing locks, whereas it needs more time for the locking operation. The reasons for this difference are unclear and require further research.
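
The measurement procedure (50 repetitions per operation, mean reported in milliseconds) can be expressed as a small harness. This Python sketch is an illustration only: the function name `mean_operation_ms` and the injectable `clock` parameter are assumptions, and the measurements reported here were not produced with this code.

```python
import time
from statistics import mean


def mean_operation_ms(operation, repetitions=50, clock=time.perf_counter):
    """Run operation() repeatedly and return the mean time in milliseconds."""
    samples = []
    for _ in range(repetitions):
        start = clock()
        operation()
        samples.append((clock() - start) * 1000.0)
    return mean(samples)
```

A caller would pass a zero-argument function that performs one lock, unlock, create or release request against the server.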

TABLE II. OPERATION TIMES OF OUR PROPOSED LOCK MANAGER

Operation     | Time (Millisecond) Linux | Time (Millisecond) Windows
Lock          | 2.6                      | 2.1
Unlock        | 1                        | 1
Create lock   | 1.4                      | 1.6
Release lock  | 1.4                      | 1.6

V. CONCLUSION AND FUTURE WORKS

In this paper, we have presented a distributed lock mechanism with a centralized architecture. The implemented lock supports both shared and exclusive locks, and both blocking and non-blocking locking. Clients and server can identify each other automatically, and the server periodically checks the aliveness of clients. We have also used threads to increase performance. In addition, the lock has a simple API and can run on heterogeneous systems. Based on the feature comparison, our lock has the necessary and sufficient facilities. After investigating the various operation times, we determined that establishing the TCP connection is the most time-consuming part of each operation.

Much development of the implemented lock remains as future work. Automatically detecting errors and adding the ability to recover after the server stops, by writing logs to the hard disk, is one direction. We can also focus on architectural changes toward decentralized and combinational architectures. Changing the main lock data structure to a hash table could increase efficiency. For the current lock, the main goal is to make it a special-purpose lock. Thus, the most important recommendation for the future is to use the lock manager in the form of a distributed memory or distributed file manager. Finally, adding the ability to detect and prevent deadlocks is another item of future work.

REFERENCES

[1] A. S. Tanenbaum and A. S. Woodhull, Operating systems: design and implementation: Pearson, 2009.
[2] E. W. Dijkstra, "Cooperating sequential processes," Programming Languages, 1968.
[3] C. A. R. Hoare, "Monitors: an operating system structuring concept," Commun. ACM, vol. 17, pp. 549-557, 1974.
[4] P. Hansen, "The programming language concurrent pascal," in Language Hierarchies and Interfaces. vol. 46: Springer Berlin, 1976, pp. 82-110.
[5] N. P. Kronenberg, et al., "VAXcluster: a closely-coupled distributed system," ACM Trans. Comput. Syst., vol. 4, pp. 130-146, 1986.
[6] N. S. Bowen, et al., "A locking facility for parallel systems," IBM Systems Journal, vol. 36, pp. 202-220, 1997.
[7] R. Moran, "Oracle8 Parallel Server Concepts & Administration, Release 8.0."
[8] J. L. Keedy, et al., "On implementing semaphores with sets," The Computer Journal, vol. 22, p. 146, 1979.
[9] B. Freisleben and J. L. Keedy, "Priority semaphores," The Computer Journal, vol. 32, p. 24, 1989.
[10] S. Hodgson, et al., "Extended semaphore operations," Concurrency: Practice and Experience, vol. 12, pp. 1495-1509, 2000.
[11] A. Tanenbaum and M. Van Steen, Distributed systems: Citeseer, 2002.
[12] C. C. Kuo, et al., "MP-LOCKs: Replacing h/w synchronization primitives with message passing," 1999, p. 284.
[13] M. Burrows, "The Chubby lock service for loosely-coupled distributed systems," presented at the 7th Symposium on Operating Systems Design and Implementation, Seattle, Washington, 2006.
[14] D. Ananth, "Distributed Queue-Based Locking Using Advanced Network Features," in International Conference on Parallel Processing 2005, 2005, pp. 408-415.
[15] F. Schmuck and R. Haskin, "GPFS: A shared-disk file system for large computing clusters," in The Proceedings of The First USENIX Conference on File and Storage Technologies, 2002, pp. 231-244.
[16] D. T. Meyer, et al., "Parallax: virtual disks for virtual machines," SIGOPS Oper. Syst. Rev., vol. 42, pp. 41-54, 2008.
[17] A. Ching, et al., "Noncontiguous locking techniques for parallel file systems," presented at The Proceedings of The 2007 ACM/IEEE Conference on Supercomputing, Reno, Nevada, 2007.
[18] F. Chang, et al., "Bigtable: a distributed storage system for structured data," presented at the 7th USENIX Symposium on Operating Systems Design and Implementation, Seattle, WA, 2006.
[19] P. Hunt, et al., "ZooKeeper: wait-free coordination for internet-scale systems," presented at The Proceedings of The 2010 USENIX Conference on USENIX Annual Technical Conference, Boston, MA, 2010.
[20] InfiniBand Trade Association, "InfiniBand TM Architecture Specification, Release 1.2," 2009.
[21] S. Narravula, et al., "High performance distributed lock management services using network-based remote atomic operations," 2007.
[22] S. Sadjadi, "Controlling access of concurrent users of computer resources in a distributed system using an improved semaphore counting approach," US Patent 7,743,146, 2010.
[23] (2011). Standard Template Library Programmer's Guide. Available: http://www.sgi.com/tech/stl/
[24] (2011). POSIX Threads (pthreads) for Win32. Available: http://sourceware.org/pthreads-win32/
[25] (2011). Windows Sockets: Background. Available: http://msdn.microsoft.com/en-us/library/z4eykh88%28v=VS.90%29.aspx
