A P2P Platform for Sharing Radiological Images and Diagnoses
Ignacio Blanquer, Vicente Hernández, and Ferran Mas
Universidad Politécnica de Valencia, DSIC, Departamento de Sistemas Informáticos y Computación. Camino de Vera S/N. 46022 Valencia, Spain
{iblanque, vhernand, fmas}@dsic.upv.es
Abstract. Research in the epidemiology of infrequent diseases requires a lengthy and complex process of collecting a significant number of studies and patient data in order to obtain reasonable results. This information is usually already available in different centres, but the lack of tools that ease the sharing of information slows down the collection process and delays the research results. This article presents a peer-to-peer application that enables a community of radiologists to share DICOM studies, series or single images and their associated diagnoses. In this way radiologists can contrast their own studies with similar ones and make use of the experience and conclusions of other physicians. The technical work has consisted of the development of a system over a P2P platform that allows distributed image searches using keywords or image content similarity based on mutual information. The system is a collaborative environment that enables radiologists to share images and diagnoses. It consolidates a distributed virtual database of images and diagnoses, providing a helpful tool for radiologists' work.
1 Introduction
P2P networks have become increasingly popular over recent years. The development of applications to support virtual communities sharing electronic goods has grown steadily since Napster [1] in 1999. P2P networks have also been used for sharing computational power, as in the case of SETI@home [2]. P2P introduced a new philosophy of computation [3,4] that enables the creation of virtual communities with similar interests in which the peers' resources, mainly processor power and storage capacity, are shared either for the mutual benefit of the peers (as in file-sharing applications) or to achieve a challenging common objective (as in the United Devices application supporting research in the fight against the smallpox virus [5]).

This paper presents a P2P environment for building a virtual community of radiologists who share medical images, diagnoses and computing resources. It provides radiologists with a large distributed dataset of images and their associated diagnoses, helping them to improve their work and their opportunities for learning. The work presented here does not aim to provide a solution for exhaustive access to clinical radiology data. It is not designed to be connected automatically to a RIS, a PACS or any other information system. It has been designed to assist the sharing of images for research and training. To our knowledge, there is no other activity in the literature with this aim.

There are, however, other projects that aim at structurally sharing image databases. Projects such as BIRN (Biomedical Informatics Research Network [6]) aim at federating a large number of neuroimaging databases from (initially) three virtual communities of researchers. Other similar approaches using Grid technologies are being developed within the MAMMOGRID [7] and GPCALMA [8] projects, which federate mammography databases at European level and adapt a general-purpose Grid middleware to the specific data and services required.

The authors wish to acknowledge the financial support received from the Spanish Ministry of Science and Technology to develop the project "Investigación y Desarrollo de Servicios GRID: Aplicación a Modelos Cliente-Servidor, Colaborativos y de Alta Productividad", with reference TIC2003-01318. This work has been partially supported by the Structural Funds of the European Regional Development Fund (ERDF).
2 Rationale
Providing a clinical diagnosis from images is a complicated task that requires several years of clinical specialization. As in any learning process, it is important to have the broadest possible set of examples that illustrate the theoretical concepts. Access to a large number of commented and verified diagnosis examples would be a valuable support for radiologists and their studies. In addition, some pathologies are infrequent in a radiologist's work and are rarely met by specialists, which increases the complexity of the diagnosis. Specialists in other centres may have faced similar cases, and their experience can be of great value. It is therefore important for radiologists to have access to the broadest collection of cases that can help them in their diagnostic decisions.

Sharing DICOM [9] studies can improve research and learning, and it is equally important to share the diagnoses of the images. However, shared diagnoses can be mistaken, defective or vague, so there is a need to validate the accuracy and quality of the medical information by enabling other experts to evaluate or comment on the diagnoses. The feedback on the diagnoses may also be very valuable. Finally, public medical image databases can be linked to the P2P platform, complementing the information from several sites and adding to their value.
3 Objectives
The main goal of this work is to offer a collaborative application in which radiologists can exchange important working information, such as diagnoses or image data sets, so that they can access and make use of the data and the diagnostic experience acquired by other specialists. Another key aspect is to take advantage of the distributed resources available in the platform, so that data, i.e. images and diagnoses, are made available to the users.

As medical information is shared among users, some security aspects must be considered. First, the user controls what is shared, since he or she chooses which data (including diagnoses and images) to share. Second, as DICOM images contain many metadata, partly related to the image and partly related to the study, the DICOM image tags that could compromise patient privacy have to be anonymized. This anonymization is performed automatically by the application before any image is shared.
4 The State of the Art
4.1 The Choice of a Peer-to-Peer Model
The analysis of the problem concluded that the conventional client-server model presents several problems for the aims of the present work. Neither a traditional nor a web-based client-server system would be able to deal with the potentially large bandwidth and storage requirements that massive use of the application might impose. The first problem is that the data shared in the community could grow so much that storing all the information on a single system becomes impossible. Moreover, dealing with all the client queries and requests for information (images, diagnoses or diagnosis comments) would require both a large network bandwidth (conventional DICOM studies can require tens of megabytes) and considerable computational power in the server. Furthermore, when dealing with the content-based search, which involves the application of a registration algorithm, the need for computing resources increases sharply, surely collapsing the server as the number of clients grows.

The use of a collaborative environment based on a peer-to-peer approach makes it possible to overcome the aforementioned problems of storage limitations, I/O bandwidth and computational power. As each peer or client holds some of the shared data, the storage problem is distributed among the peers. The query process is also parallelised, reducing the response time, as each peer only has to browse its local data. Such a quick response time would not be possible using a centralized client/server system.

4.2 Suitability of the P2P model
The suitability of the P2P model for achieving a better degree of scalability is straightforward in concept. However, it is worth analysing how much better the P2P approach is and which bottlenecks must be avoided.
Let us suppose two scenarios:
– Client/server model. One server stores all the images and provides a sustained download bandwidth (BW) of 1 Mbit/s.
– P2P model. Each peer holds a part of the images, which are partially replicated. Perfect data balancing is not realistic, but assuming that only 20% of the images requested simultaneously at a given time are badly distributed (available only on one specific peer) is a rather pessimistic statement. Assuming a maximum bandwidth (even for the unbalanced peer) of 256 Kbit/s is realistic.

The maximum BW required can be estimated as shown in expressions 1 and 2.

$$BW_{server} = \max\left(BW_{client}\,;\; BW_{client} \times n_{users} \times BusyRatio\right) \tag{1}$$

$$BW_{P2P} = \max\left(BW_{client}\,;\; 20\% \times BW_{client} \times n_{users} \times BusyRatio\right) \tag{2}$$

$$BusyRatio = t_{\text{user study download}} \times n_{\text{simultaneous studies}} \tag{3}$$

where $BW_{client}$ is the BW of each client, $n_{users}$ is the number of clients/peers and $BusyRatio$ is the product of the average study download time (for a single user) and the simultaneous studies ratio, computed with the reciprocal of the number of studies downloaded per second per user. The maximum delay can be estimated with expression 4.

$$t_{delay} = \max\left(\frac{BW_{real} \times t_{\text{user study download}}}{BW_{maximum}}\,;\; t_{\text{user study download}}\right) \tag{4}$$
This applies to both the client/server model (where $BW_{maximum}$ is $BW_{server}$) and the P2P model (where $BW_{maximum}$ is $BW_{P2P}$). For the client side (in both P2P and client/server), the analysis computes the average maximum BW required and the maximum expected delay for 1 to 100 peers, each downloading 2 studies per hour, for a typical study of 10 standard CT slices. Smaller study sizes reduce the difference between the models, and larger average study sizes strongly increase the difference in favour of the P2P model, which is always more efficient. Figure 1 shows the evolution of the factors described above.

4.3 Platforms
JXTA [10] is a set of open protocols, initially developed by Sun Microsystems, that allow any connected device on the network to communicate and collaborate in a P2P manner. Any peer can interact with other peers and resources directly. Sun Microsystems introduced JXTA technology in 2001 with an open-source, royalty-free license model. JXTA offers interoperability across different P2P systems and communities, and platform independence across diverse languages, systems and networks. JXTA provides ubiquitous access from any device, and constitutes the de facto standard in the development of P2P environments.
[Figure 1: two plots against the number of peers, one of the required bandwidth (Kbit/s) and one of the maximum delay (s), each with a C/S curve and a P2P curve, and with reference levels labelled "Max Admissible Server Bw (1 Mbit/s)" and "Max Admissible Peer Bw (256 Kbit/s)".]
Fig. 1. Required bandwidth and expected maximum delay
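The bandwidth estimates behind Fig. 1 can be approximated with a few lines of Java. The sketch below evaluates expressions 1 to 3 for a given number of peers. The study size (10 slices of 512 × 512 pixels at 16 bits per pixel), the use of the 256 Kbit/s peer bandwidth as the client bandwidth, and the reading of the simultaneous studies ratio as the per-user request rate in studies per second are assumptions made for illustration; they are not values stated in the text, and the class name is hypothetical.

```java
/** Sketch of expressions (1)-(3); every numeric input below is an illustrative assumption. */
class BandwidthModel {
    // Assumed study: 10 CT slices of 512 x 512 pixels, 16 bits per pixel, in Kbit.
    static final double STUDY_KBITS = 10 * 512 * 512 * 16 / 1000.0;
    static final double BW_CLIENT_KBPS = 256.0;             // assumed client/peer bandwidth
    static final double STUDIES_PER_SECOND = 2.0 / 3600.0;  // 2 studies per hour per user
    static final double BADLY_DISTRIBUTED = 0.20;           // share available on one peer only

    /** Expression (3): single-user download time times the assumed per-user request rate. */
    static double busyRatio() {
        double downloadSeconds = STUDY_KBITS / BW_CLIENT_KBPS;
        return downloadSeconds * STUDIES_PER_SECOND;
    }

    /** Expression (1): bandwidth required at a central server, in Kbit/s. */
    static double serverBandwidth(int users) {
        return Math.max(BW_CLIENT_KBPS, BW_CLIENT_KBPS * users * busyRatio());
    }

    /** Expression (2): bandwidth required at the most loaded peer, in Kbit/s. */
    static double p2pBandwidth(int users) {
        return Math.max(BW_CLIENT_KBPS,
                BADLY_DISTRIBUTED * BW_CLIENT_KBPS * users * busyRatio());
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 100; n += 11) {
            System.out.printf("%3d peers: C/S %.0f Kbit/s, P2P %.0f Kbit/s%n",
                    n, serverBandwidth(n), p2pBandwidth(n));
        }
    }
}
```

With these inputs, once the floor of the client bandwidth is exceeded, the server requirement grows by about 23 Kbit/s per additional peer and the P2P requirement by about 4.7 Kbit/s per additional peer.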
5 Architecture
The platform has been developed in Java, using JXTA to implement the peer-to-peer part. JAL (JXTA Abstraction Layer) [11] was used to simplify the development of the application. JAL abstracts the functionality needed in a P2P environment, easing the implementation of P2P applications. The use of Java improves the portability of the application to multiple platforms, as required in medical environments. Although the architecture is P2P, a server is required to provide registration and authentication. The implemented platform therefore has a hybrid architecture between the client-server model and the peer-to-peer model, in which the peer-to-peer model predominates. The task of the server consists merely of accepting or declining peers in the group. Figure 2 shows the structure of the P2P system.
[Figure 2: diagram labelled with one Server (users, diagnosis comments) and three Peers (images, diagnoses each).]
Fig. 2. Peer scheme
The next sections describe the two components of the architecture in more detail.
5.1 The Server
The server is in charge of user management. It creates, modifies and deletes user accounts. The accounts are used to start client applications and to grant access to the group. This allows user access to be controlled, making it possible to restrict the use of the system to clinical users only. A global evaluation mark, assigned to each user and each diagnosis, is computed from the marks received from other peers. This is a way to control the quality of the diagnoses and also the credibility (see [12]) of the users. It is very important to know how reliable a source of information can be.

5.2 The Peers
Peers have more functionality than the server. A peer is basically a storage container with services that allow other peers to access the shared data in a controlled way. Peers also include several components that provide more advanced functionality (such as single image and series visualization, registration, fusion and diagnosis creation), increasing the usability of the system. Figure 3 shows a diagram of the most important classes of the peer.

A central class, the Peer Manager, coordinates the different elements of a peer. The Peer Manager is built on top of the EZMinimalPeer JAL object, which implements and describes the Peer interface using JXTA. The Peer Manager class also provides the basic communication procedures for peer interaction. The Msg Listener class manages the reception of messages from other peers. The message is the basic unit of information exchanged among peers in JXTA. Each message is associated with a tag that identifies the message type. Some message types are: Search, Search answer, Ask for download image and Ask for download diagnosis. The Local Searcher object contains references to the shared images of the current user, and selects the shared images in the local peer that match the filter criterion of a search.

Images are sent by the Sender class, which implements a queue of images to be sent. The Sender object periodically checks whether the maximum number of concurrent send requests has been reached and, if it has not, creates a sending thread (a J2KSender object). Images are coded in JPEG 2000 [13] lossless format and sent layer by layer. Before any DICOM image is sent, an anonymization process removes the information identifying the patient from the DICOM file: patient ID, name and physician are removed, and birth and study dates are blurred.
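The anonymization step can be illustrated on a simplified representation of the DICOM header. The sketch below assumes the header has already been parsed into a map from attribute keyword to string value. The class name, the selection of keywords and the blurring rule (keeping only the year of each date) are hypothetical, since the text does not specify how dates are blurred or which library reads the files.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: anonymize a DICOM header held as keyword -> value pairs. */
class HeaderAnonymizer {
    // Attributes removed outright (patient ID, name and physician).
    static final List<String> REMOVED =
            List.of("PatientID", "PatientName", "ReferringPhysicianName");
    // Date attributes that are blurred rather than removed.
    static final List<String> BLURRED = List.of("PatientBirthDate", "StudyDate");

    /** Returns an anonymized copy; the input map is left untouched. */
    static Map<String, String> anonymize(Map<String, String> header) {
        Map<String, String> out = new HashMap<>(header);
        for (String key : REMOVED) {
            out.remove(key);
        }
        for (String key : BLURRED) {
            String date = out.get(key);
            // DICOM dates use the form YYYYMMDD; keep only the year (assumed rule).
            if (date != null && date.length() >= 4) {
                out.put(key, date.substring(0, 4) + "0101");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> header = new HashMap<>();
        header.put("PatientID", "12345");
        header.put("PatientName", "DOE^JANE");
        header.put("PatientBirthDate", "19610523");
        header.put("StudyDate", "20040917");
        header.put("Modality", "CT");
        System.out.println(anonymize(header));
    }
}
```

Working on a copy keeps the locally stored study intact, so only the transmitted version loses the identifying attributes.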
The Downloader object manages the simultaneous reception of different parts, storing each partial download in an Image Part object that combines the received JPEG 2000 layers. When the download is complete, the received JPEG 2000 layers are merged and decoded, yielding the original DICOM file. Downloaded images are automatically shared (i.e. incorporated into the Local Searcher) by the receiving peer.
[Figure 3: class diagram of the peer showing the classes Peer Manager, EZMinimalPeer, Msg Listener, Local Searcher, Diagnosis Search, Sender, J2K Sender, Queue Image Element, Downloader and Image Part.]
Fig. 3. Peer class diagram
Distributed searches among the peers are implemented using JXTA propagate pipes. Propagate pipes allow a peer to send a message to multiple peers. When a peer has an image query, it broadcasts the query message on the propagate pipe shared by all the peers. Peers listening to the propagate pipe receive the query message and browse their local databases for the images required by the broadcast query.

The peers support two kinds of search. The first is by keywords in the metadata, enabling users to search by patient age, sex, study modality (CT, PET, ...) and body part. The other is content-based search, which retrieves images held by other peers that look similar to a reference image. Each peer computes the mutual information function [14,15,16] between the reference image and its other images. This search procedure can be combined with additional filtering by metadata, reducing the range of local studies to be compared. The result of the search is a list of images and their owners, ordered by the degree of similarity obtained from the mutual information function.

As seen in Section 3, not only images but also diagnoses are shared. The peer application provides an interface for creating diagnoses, attaching them to a study and saving them to disk. The class Diagnostic Set stores pairs consisting of a diagnosis file reference and a study unique identifier, keeping the relation between both entities. When a peer has the results of a search, the images and the associated diagnoses can be retrieved and downloaded, including the comments created by other users. New comments and evaluation marks can be added once the information has been studied.
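The similarity measure used in the content-based search can be sketched as follows. The code estimates the mutual information of two equally sized 8-bit grey-level images from their joint histogram. The class name, the flat int[] image representation and the fixed 256 bins are assumptions for illustration, and any registration step that would align the images beforehand is omitted.

```java
/** Sketch: mutual information of two 8-bit images estimated from their joint histogram. */
class MutualInformation {
    static final int BINS = 256;

    /** Both arrays hold grey levels in 0..255 and must have the same length. */
    static double compute(int[] a, int[] b) {
        if (a.length != b.length || a.length == 0) {
            throw new IllegalArgumentException("images must be non-empty and equally sized");
        }
        double[][] joint = new double[BINS][BINS]; // joint probability p(x, y)
        double[] pa = new double[BINS];            // marginal p(x) of image a
        double[] pb = new double[BINS];            // marginal p(y) of image b
        double w = 1.0 / a.length;
        for (int i = 0; i < a.length; i++) {
            joint[a[i]][b[i]] += w;
            pa[a[i]] += w;
            pb[b[i]] += w;
        }
        // MI = sum over (x, y) of p(x, y) * log(p(x, y) / (p(x) * p(y))).
        double mi = 0.0;
        for (int x = 0; x < BINS; x++) {
            for (int y = 0; y < BINS; y++) {
                double p = joint[x][y];
                if (p > 0.0) {
                    mi += p * Math.log(p / (pa[x] * pb[y]));
                }
            }
        }
        return mi / Math.log(2.0); // convert from nats to bits
    }

    public static void main(String[] args) {
        int[] reference = {0, 0, 255, 255};
        int[] identical = {0, 0, 255, 255};
        int[] unrelated = {0, 255, 0, 255};
        System.out.println(compute(reference, identical)); // identical images share 1 bit here
        System.out.println(compute(reference, unrelated)); // independent patterns share 0 bits
    }
}
```

A peer could apply such a function to each local image that passes the metadata filter and return the candidates sorted by decreasing score.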
5.3 The server and the peers
This section describes in more detail how the server and the peers interact. The server is set up before any peer and starts as a rendezvous peer (a JXTA concept designating a peer that helps other peers discover resources). When the server starts, it creates a propagate pipe that can be used by the other peers.
During start-up, a peer looks for a rendezvous peer, which in the present architecture is always the server. The server and the peer then use the Diffie-Hellman key agreement protocol to generate a common secret key, which is used to encrypt the peer-server communication. At this point the user can be securely authorized by the server, provided he or she logs in correctly. Once the user is logged in, the server informs the peer about the previously created propagate pipe, which is used by all the peers to broadcast search queries.
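The key agreement step can be illustrated with the standard Java Cryptography Architecture classes. The sketch runs both sides in one process and hashes the shared secret with SHA-256 to obtain key material. The 2048-bit group size, the hashing step and the class and method names are assumptions; the text does not state which library or parameters the platform actually uses.

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.MessageDigest;
import java.security.PublicKey;
import java.util.Arrays;
import javax.crypto.KeyAgreement;
import javax.crypto.interfaces.DHPublicKey;

/** Sketch of a Diffie-Hellman exchange between a server and a peer (both simulated locally). */
class DhHandshakeSketch {
    /** Combines our private key with the other side's public key into 32 bytes of key material. */
    static byte[] deriveKey(KeyPair own, PublicKey other) throws Exception {
        KeyAgreement agreement = KeyAgreement.getInstance("DH");
        agreement.init(own.getPrivate());
        agreement.doPhase(other, true);
        // Hashing the raw secret is an assumed key-derivation step.
        return MessageDigest.getInstance("SHA-256").digest(agreement.generateSecret());
    }

    /** The server picks the group parameters (assumed 2048-bit) and its key pair. */
    static KeyPair serverKeyPair() throws Exception {
        KeyPairGenerator gen = KeyPairGenerator.getInstance("DH");
        gen.initialize(2048);
        return gen.generateKeyPair();
    }

    /** The peer reuses the group parameters carried by the server's public key. */
    static KeyPair peerKeyPair(PublicKey serverPublic) throws Exception {
        KeyPairGenerator gen = KeyPairGenerator.getInstance("DH");
        gen.initialize(((DHPublicKey) serverPublic).getParams());
        return gen.generateKeyPair();
    }

    public static void main(String[] args) throws Exception {
        KeyPair server = serverKeyPair();
        KeyPair peer = peerKeyPair(server.getPublic());
        byte[] serverKey = deriveKey(server, peer.getPublic());
        byte[] peerKey = deriveKey(peer, server.getPublic());
        System.out.println("keys match: " + Arrays.equals(serverKey, peerKey));
    }
}
```

In a deployment each side would run only its own half and exchange the encoded public keys (as returned by `getEncoded()`) over the network; only the public keys travel, never the derived secret.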
5.4 Security
The environment offers centralized account management, so in order to interact with the application, users must first log in and be authorized by the server. As mentioned in Section 3, the system has to fulfil the necessary security requirements. As a first level of security, peer applications allow users to choose which images will be shared, so only authorized data are shared. In addition, diagnoses are not shared directly from the information systems; only diagnoses created with the peer application are visible to other peers. Finally, content security is ensured because only anonymized data are transmitted by the system. JXTA also provides confidentiality, ensuring that the contents of a message are not disclosed to unauthorized individuals.
6 The Current System
A P2P application for the exchange of radiological information has been implemented on top of the architecture described. This system, named DR-Red, comprises two applications, DR-Red peer and DR-Red server, and is currently in a beta testing stage. As explained in Section 5, the system is developed entirely in Java and has been tested on both Windows and Linux operating systems. Both the server and the peers have graphical user interfaces, which are intuitive and enable inexperienced users to run them without extensive training. Along with the functionality described in Section 5, a component for image visualization has been implemented. Although this component is not directly related to the process of sharing images, it is useful for previewing images while they are being downloaded, and it offers additional facilities that are usually needed, such as zooming and panning, window width and window level adjustment in DICOM images, movie visualisation of image series (an animation of the image slices that belong to the same DICOM series), image fusion and image registration. Figure 4 shows a screenshot of two running peer applications.
Fig. 4. Peer application screenshot
7 Conclusions and future work
The system presented is a complete environment for sharing digital medical information within a radiological virtual community. It provides the necessary tools for efficient transfer, encryption and decryption, searching and browsing, basic image processing and sharing of the radiological information involved. It is a P2P environment specially designed for the particular requirements of radiological users (privacy requirements, the particularities of multislice data, registration of images).

Compared with the traditional client-server approach, the P2P environment is clearly more scalable, since the server will not collapse when dealing with a massive increase in the number of users, and the work load is better balanced among the peers. In a client-server model the weak point is the server, although a multiserver version of the platform would prevent this problem. In the presented P2P approach, less information would need to be replicated in the servers. A replica of the catalogue would be infeasible in a large-scale client-server model. In the P2P approach, even if the server stops working, logged-in peers can maintain their connections with other known peers.

The system can be improved in some aspects. In the case of reputation and confidence, the current system provides an evaluation mark for each user and diagnosis, whose value is computed simply as the mean of the evaluation marks of the user's diagnoses. This approach will be replaced by the EigenTrust score [17,18,12] procedure, which introduces the concept of transitive trust and computes the evaluation mark in a distributed way, according to the reputation of the different users.
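The mean-based mark of the current system can be sketched as below. The map layout (diagnosis identifier to the list of marks it received), the integer mark scale and the class and method names are hypothetical; the text only states that the mark is computed as a mean of the evaluation marks of the user's diagnoses.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the mean-based evaluation marks of the current system. */
class EvaluationMarks {
    /** Mean of the marks one diagnosis received from other peers (0 if it has none yet). */
    static double diagnosisMark(List<Integer> marks) {
        return marks.stream().mapToInt(Integer::intValue).average().orElse(0.0);
    }

    /** User mark: mean of the marks of that user's diagnoses (assumed aggregation). */
    static double userMark(Map<String, List<Integer>> marksByDiagnosis) {
        return marksByDiagnosis.values().stream()
                .mapToDouble(EvaluationMarks::diagnosisMark)
                .average()
                .orElse(0.0);
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> marks = Map.of(
                "diag-1", List.of(4, 5),
                "diag-2", List.of(2));
        System.out.println(userMark(marks));
    }
}
```

A plain mean weighs every evaluator equally, which is the limitation that a reputation-weighted scheme such as EigenTrust is meant to address.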
Scalability can be improved further by developing a multiserver version. This modification will be transparent to the peers and will only involve modifying the server to distribute the work among several instances. Fault tolerance will then be improved as well. Finally, diagnosis search will be added to the system. Diagnoses could be searched by content, allowing users to find diagnoses by the words they contain, or by study properties. The user could then download the study related to a diagnosis. Last but not least, the tool will be promoted and made easily available to medical users. Agreements for the creation of initial databases are being studied, in order to foster the start-up of a virtual community that could bring content into the network.
References
1. Napster. http://www.napster.com.
2. Seti@home. http://setiathome.ssl.berkeley.edu.
3. Andy Oram, editor. Peer to Peer: Harnessing the Power of Disruptive Technologies. 2001.
4. Dejan S. Milojicic, Vana Kalogeraki, Rajan Lukose, et al. Peer-to-peer computing. Technical report, Hewlett-Packard, 2002.
5. Smallpox virus research. http://www-1.ibm.com/grid/announce 205.shtml.
6. Biomedical informatics research network (BIRN). http://www.nbirn.net.
7. R. McClatchey, D. Manset, T. Hauer, F. Estrella, P. Saiz, and D. Rogulin. The mammogrid project grids architecture. Proceedings of the 10th Int. Conf. on Computing for High Energy Physics, March 2003.
8. U. Bottigli, P. Cerello, S. Cheran, et al. Gpcalma: A tool for mammography with a grid-connected distributed database. Medical Physics: Seventh Mexican Symposium on Medical Physics, March 2003.
9. National Electrical Manufacturers Association. Digital Imaging and Communications in Medicine (DICOM). 1300 N. 17th Street, Rosslyn, Virginia 22209 USA.
10. Project JXTA. http://www.jxta.org.
11. JXTA abstraction layer. http://ezel.jxta.org/jal.html.
12. Karl Aberer and Zoran Despotovic. Managing trust in a peer-2-peer information system. Technical report, Swiss Federal Institute of Technology, 2001.
13. David S. Taubman and Michael W. Marcellin. JPEG2000: Image compression fundamentals, standards and practice. Kluwer Academic, 2002.
14. Josien P. W. Pluim, J. B. Antoine Maintz, and Max A. Viergever. Mutual information based registration of medical images: a survey. IEEE Transactions on Medical Imaging, XX, 2003.
15. Sébastien Gilles. Description and experimentation of image matching using mutual information. Technical report, Dpt. of Eng. Sci., Oxford University, 1996.
16. Frederik Maes, Dirk Vandermeulen, and Paul Suetens. Comparative evaluation of multiresolution optimization strategies for multimodality image registration by maximization of mutual information. Medical Image Analysis, 3(4):373–386, 1999.
17. Kamvar S., Schlosser M., and García-Molina H. The eigentrust algorithm for reputation management in p2p networks. In WWW, 2003.
18. Bawa M., Cooper B.F., Crespo A., et al. Peer-to-peer research at Stanford. Technical report, Computer Science Department, Stanford University.