May 15, 2009 - the operation by sending the read/write request to an iSCSIâtarget. ... for reference, invoked directly on the commando line without any Java-.
jSCSI 2.0: Multithreaded Low-Level Distributed Block Access Sebastian Graf, Patrice Brend’amour and Marcel Waldvogel May 15, 2009 Abstract In 2007 we introduced jSCSI 1.0 [1] to the public. The use case was to access block-patterns directly from Java without any third party JNI [2] invoked software. In the last 2 years we explored the capabilities to assimilate multithreading in jSCSI. The goal was to leverage the outstanding features of the new Java multithreading extension introduced with Java5/Java6 and incorporate them into our proven blocklevel accessing framework. Today, we present the next incarnation of jSCSI 1.0, jSCSI 2.0 which yields significant performance improvements by utilizing Java’s advanced multithreading capabilities. We show that our Java based implementation of a low-level architecture is not only a proposing alternative in terms of performance but also in the ease-of-use compared to common JNI-invoked system calls. Therefore, we argue that jSCSI 2.0 is not only a platform independent implementation of the iSCSI protocol [3], but also a fast and robust proof for implementing low-level applications in Java.
1
History and Background of jSCSI
iSCSI [3] offers an easy way to access distributed storage devices in a block based manner. As there are numerous use-cases for iSCSI there exist numerous iSCSI initiators and iSCSI targets and their number is still rapidly growing. Nevertheless, the usage of iSCSI is restricted on the operating system level. This results in various implementations of the iSCSI protocol that all differ in their interpretation of the initial RFC. For example, the concurrent access between an initiator and a target (Multipath) is sometimes implemented utilizing the intended Multi–Connection feature, other implementations utilize multiple sessions between both, the target and the initiator, respectively. Figure 1 depicts these two different approaches. These differences’ in the implementation of the iSCSI RFC complicate the development of Java programs that utilize the low-level access provided by iSCSI, since the bridge between a Java program and an iSCSI implementation, namely JNI system calls, become as platform dependant as the utilized iSCSI framework itself. Session
Session
TCP Connection
TCP Connection
iSCSI Target
iSCSI Target
TCP Connection
Session
iSCSI Initiator
iSCSI Initiator Session TCP Connection
TCP Connection
Session
iSCSI Target
TCP Connection
(a) Multi Connection per Session
(b) Multi Session
Figure 1: Different implementation to implement the Multipath feature
1 Konstanzer Online-Publikations-System (KOPS) URN: http://nbn-resolving.de/urn:nbn:de:bsz:352-opus-84511 URL: http://kops.ub.uni-konstanz.de/volltexte/2009/8451
iSCSI Target
Our aim was to get rid of these inconsistencies by implementing the iSCSI-standard in Java, providing a platform independent framework which behaves equally on every platform. In consequence, we built jSCSI 1.0 [4], an iSCSI initiator written in Java without the necessity to utilize any third-party native libraries. jSCSI 1.0 was later released to the open-source community1 and the feedback was overwhelming. jSCSI 1.0 was used as a key component in the ”Dispersed Storage” project of Cleversafe 2 which won the Technology Innovation Award in 2008, issued by the Wall Street Journal [5]. The success of jSCSI 1.0 motivated us to push the idea of jSCSI to the next level which led to our actual version of jSCSI 2.0. jSCSI 2.0 comes is a lightweight Java-library that enables a user to access every iSCSI target in an easy and intuitive manner while offering the stability and robustness inherent to the majority of Java applications. However, the major improvement in jSCSI 2.0 is its outstanding performance, outperforming a native iSCSIinitiator invoked over JNI, by utilizing the new multithreading API of Java5/6. Following, we will that Java is able to fulfill low-level requirements previously implemented using platform-dependant C-implementations.
2
Architecture of jSCSI 2.0
jSCSI 1.0 was designed in a highly object-oriented way. Based on the iSCSI RFC [3], the relevant aspects where designed as objects in the architecture. Unfortunately, the need of Runnable and Thread –objects restricted the use of a complete object-oriented manner since the software was compliance with Java 1.4. To ensure an performant way the program was based on two daemon threads. The first one, representing the session layer, was generating the iSCSI-Tasks as described in the RFC [3] while the second one concluded the operation by sending the read/write request to an iSCSI–target. The architecture was implemented as a prototype which resulted several trade–offs regarding the robustness despite of existing JUnit testcases [6]. Application 1
Thread Application 1
Application 2
...
Application n
Application 2
Application n
Thread Initiator
Initiator Thread
...
Thread Session
Connection
Session
Session Connection
Connection
Task Thread
Thread SenderWorker
SenderWorker
SenderWorker
(a) jSCSI 1.0 Architecture
Session
Task Thread
Task Thread
Connection
Connection
Connection
SenderWorker
SenderWorker
SenderWorker
(b) jSCSI 2.0 Architecture
Figure 2: Architecture of jSCSI 1.0 and jSCSI 2.0 In the Work in Progress 2007, the outlook promised to get rid of these benefits of the prototype implementation. With jSCSI 2.0 as a final release, we refactored the complete architecture especially the thread handling as shown in Figure 2 by making use of the new java.util.concurrent–package. Instead of using daemon threads, the write/read tasks are now encapsulated in Callable-objects which providing an efficient blocking mechanism on the one hand and a much better exception-handling on the other hand. Due to the possibility to have one task per thread, the framework is able to provide both, Multipath through MultiConnection per Session and Multipath through MultiSession. With the usage of an ExecutorService, the blocking operations are performed in a optimised way regarding the usage of the daemon threads in jSCSI 1.0. Last but not least, the code is now not only programmed in a highly object-oriented 1 http://sourceforge.net/projects/jscsi/ 2 http://www.cleversafe.com/
2
way but also in an intuitive way. Since every task described in the RFC [3] is now represented by exactly one Callable, the code is easy to read and to understand.
3
Benchmarks
Since the last release as a Work in Progress, the question came up, if an object-oriented programmed framework like jSCSI 2.0 can mess with the JNI–invocation [2] of a native C-implementation (in this case the initiator of the Open-ISCSI project [7]).
350
● ● ●
● ● ● ●
●
JSCSI 2.0 − One Session JSCSI 2.0 − 5 Sessions OpenISCSI − RawDevice via JNI OpenISCSI − RawDevice on bash
●
●
400
300 250
500
●
JSCSI − One Session JSCSI − 5 Sessions OpenISCSI − RawDevice via JNI OpenISCSI − RawDevice without JNI
●
● ● ● ● ● ●
●
●
● ●
Time[ms]
● ● ● ●
200
●
●
300
200
● ● ● ●
150
Time[ms]
●
●
● ●
100
●
●
●
●
●
100
●
● ●
50
●
● ● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
0
0
●
320
1280
2240
3200
4480
5760
7040
8320
9600
10880
12160
13440
14720
16000
320
1280
2240
3200
4480
5760
Amount of Data in KiB
7040
8320
9600
10880
12160
13440
14720
16000
Amount of Data in KiB
(a) Read of Blocks
(b) Write of Blocks
Figure 3: Read / Write of jSCSI / Open-iSCSI Figure 3 shows the jSCSI as well as Open-ISCSI-calls. The Open-iSCSI framework is one time invoked over JNI and the second time, just for reference, invoked directly on the commando line without any Javarelated layer. jSCSI is once invoked with one thread and once invoked with five threads. The operation is performed 100 times with the Perfidix [8] framework. The points in the middle denote the mean of the runs while the lines above and below stand for 95% confidence-intervals. The test embed is defined as two quad-core Athlon64 Servers with RAM-Disks connected over Gigabit-Ethernet to reduce possible bottlenecks. This reduce results in a better scaling of the needed computing ressources. Figure 3 shows, that jSCSI 2.0 scales well with one thread but better with 5 threads than an JNI-invoked Open-iSCSI initiator instance. This is founded in the task management as Callables which are minimising the blocking operations on the one hand using all available sessions and connection on the other hand: As soon as the connection/session is free again (which results in a finished Callable), the Executor is starting the new operation already queued in the framework. A detailed scaling of the number of used threads in read/write operations is shown in Figure 4. It is clearly visible that even with more 5 parallel threads, the scaling of the framework is still good. If more than 50 threads are invoked in jSCSI 2.0, the curve will grow again since the management of the threads is in need of resources as well. As we developed jSCSI 2.0 as a small component in a larger software project, we think that 5 threads are more than sufficient to decrease the blocking time as well as to increase the performance. With jSCSI 2.0, we use the possibility to overtake the usage of native reference implementations within Java and provide a use case for migrating from platform dependant software to Java as an independence layer as well. In the results displayed in the Figures 3 and 4, we see that from the point of view of performance, it makes sense to implement low-level applications in Java itself instead of jumping to low-level (and therefore often platform-dependant) architectures with JNI.
3
●
JSCSI − Read Split − 16000KB
JSCSI − Write Split − 16000KB
●
350
Time[ms]
250
Time[ms]
400
450
300
●
500
●
300
●
●
250
● ●
● ● ●
●
●
●
● ●
●
●
●
●
●
●
● ●
●
●
●
● ●
1
3
5
7
9
11
13
●
●
15
17
19
21
23
25
●
● ●
27
●
●
29
●
●
31
●
●
33
●
●
35
●
●
37
●
●
●
39
●
● ●
41
●
43
●
●
45
●
●
47
●
●
200
200
●
●
49
1
3
5
7
●
●
9
●
●
11
●
●
●
13
Threads
●
15
●
●
17
●
●
19
●
●
21
●
●
23
●
●
25
● ●
27
●
●
29
●
●
31
●
●
33
●
●
35
●
●
37
●
●
●
39
●
41
●
●
43
●
●
45
●
●
47
●
●
●
49
Threads
(a) Scaling MultiSession Read
(b) Scaling MultiSession Write
Figure 4: Scaling of multiple sessions within one task
4
Conclusion
In this proposal we want to give an outlook what is possible with modern thread-accessing tools provided by Java 5/6. Based on our low-level software jSCSI, we have proven that a migration to Java makes sense even if the use-case of the software is dedicated to the storage of bytes themselves. We show that with the help of actual classes, Java can even outperform low-level domestic C-libraries invoked over JNI [2] because of the minimalisation of blocking operations and the exploitation of the available ressources. jSCSI 2.0 gives us now the possibility as a full-grown software to develop use-cases for block-accessing via Java like the improvement of distributed lucene–indexing [9], accessing filesystems over web–applications and prototyping balancers, schedulers and caching algorithms for block accessing research.
References [1] Marc Kramis, Volker Wildi, Bastian Lemke, Sebastian Graf, Halldor Janetzko, and Marcel Waldvogel. jSCSI - A Java iSCSI Initiator, 2007. [2] Java native interface: Programmer’s guide and specification. At http://java.sun.com/docs/books/jni/. [3] J. Satran, K. Meth, C. Sapuntzakis, M. Chadalapaka, and E. Zeidner. Internet Small Computer Systems Interface (iSCSI). RFC 3720 (Proposed Standard), April 2004. Updated by RFC 3980. [4] Volker Wildi. Java iSCSI Initiator. Master’s thesis, University of Konstanz, Distributed Systems Group, Konstanz, Germany, Feburary 2007. [5] Michael Totty. The 2008 technology innovation awards. SB122227003788371453.html, September 2008. [6] E. Gamma and K. Beck. JUnit. At http://www. junit. org. [7] Open-iSCSI. At http://www.open-iscsi.org/. [8] Perfidix. At http://www.perfidix.org/. [9] Lucene. At http://lucene.apache.org.
4
http://online.wsj.com/article/-