Networked Electromagnetic Data Distribution System

FRONTIER RESEARCH ON EARTH EVOLUTION, VOL. 2

Takao Koyama, Seiji Tsuboi, Yasushi Ishihara and Hiromitsu Mizutani
Research Program for Data and Sample Analyses, Institute for Research on Earth Evolution (IFREE)

1. Introduction

Since 1997, the Japanese Ocean Hemisphere Network Project (OHP), which aims to elucidate the Earth’s interior by filling a large gap in geophysical observatory coverage in the Pacific region, has been constructing an integrated geophysical observation network of seismic, geodetic (GPS), and electromagnetic stations [Fukao, 1997]. The geomagnetic stations were set up on land for long-term observation of the geomagnetic field by installing both proton scalar magnetometers and fluxgate vector magnetometers. High-quality geomagnetic field data from nine observatories have been obtained and are open to the public. These data can greatly improve geomagnetic field models and can be used to study geomagnetism of core origin and for EM sounding to estimate the electrical conductivity structure beneath the Pacific region [Shimizu and Utada, 1999].

For local research projects like the OHP, however, there has been no simple way to distribute digital geomagnetic field data to the research community at large. Some data are distributed by e-mail, by mailing CD-ROMs, through WEB services, and so on, but data users must contact each data center individually and then convert the data into a single unified format before analysis. In practice, this procedure requires a lot of work and has hampered use of the available data. To solve this problem, an automatic data distribution system is needed.

The OHP originally developed the NINJA system (New Interface for Networked Java Applications) to distribute broadband seismic waveform data through the internet using Java RMI (Remote Method Invocation) technology [Takeuchi et al., 2002]. Following the NINJA system, we developed a new electromagnetic field data distribution system, named EM NINJA, in cooperation with the Earthquake Research Institute of the University of Tokyo and Fujitsu Limited.

2. System architecture

Our goal is to construct a system that is easy to use not only for data users but also for data centers or data providers, using a minimal and flexible structure. For this purpose, the system is made of three main parts: one on the side of data users, or a “user interface”; one on the side of data centers; and one that interconnects those two parts (Fig. 1). The point is that the numbers of the three parts need not be the same; for example, the data provided by several data centers can be distributed to any data user through a single interconnecting part, unlike usual WEB data services, which distribute only the data stored in their own data center. The WEB data server of a usual data service is therefore divided into a WEB server and a data server, which can be located apart from each other provided that both are connected through the internet. This means that each data center does not need to construct a WEB data server but only has to install the “data server” part of our system, which connects to a single “WEB server” part through the internet.

Another point is that each pair of parts communicates over the internet only through an http port, because an http port is usually left open even in strict network environments protected by firewalls. Network maintenance then becomes very simple for a network administrator, who can focus only on http, and this helps keep the network system secure. Our system is also written in Java so that it works on any platform and operating system. Hereafter we explain the three parts of our system, the “WEB server”, the “data server”, and the “user interface”, in turn.

2.1 WEB server

The WEB server, which is an intermediate part between data users and data providers, consists of two subparts: a “Java servlet/dispatcher” subpart and a “directory server” subpart. The Java servlet/dispatcher subpart is the main part of the WEB server and connects to data users and data providers through the internet. It receives data search requests and data creation requests from a data user, dispatches each request to the proper data provider (or center), and finally delivers the requested data created at the data center to the data user.

The directory server subpart is a branch part that provides primary station information to data users through the Java servlet/dispatcher subpart; for example, the stations from which each data center obtained its data. It greatly speeds up finding which data center holds the requested data, just like searching a telephone directory for a telephone number. The directory service is provided by LDAP (Lightweight Directory Access Protocol) inside the directory server subpart. The Java servlet/dispatcher and directory server subparts communicate with each other by http.

One WEB server has already been installed by IFREE/JAMSTEC. Its URL is http://www.jamstec.go.jp/pacific21/. Any data user can search and download data by contacting it, and any data provider can upload their own data through the WEB server provided that the data server part is installed, as described in the next section.
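The role of the directory server described for the WEB server, namely telling the dispatcher which data center holds data for a requested station, can be illustrated with a minimal Java sketch. It is only an illustration under stated assumptions: the actual system performs this lookup with LDAP, whereas the sketch uses an in-memory map as a stand-in, and the class name `StationDirectory`, the center names, and the station codes are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the directory server (the real system uses LDAP).
class StationDirectory {
    // Primary station information: data center name -> station codes it holds.
    private final Map<String, List<String>> stationsByCenter = new LinkedHashMap<>();

    // Called when a data center announces the stations it provides.
    void register(String center, List<String> stations) {
        stationsByCenter.put(center, new ArrayList<>(stations));
    }

    // Returns every data center that holds data for the given station code,
    // so that a dispatcher knows where to forward a request.
    List<String> centersFor(String station) {
        List<String> centers = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : stationsByCenter.entrySet()) {
            if (entry.getValue().contains(station)) {
                centers.add(entry.getKey());
            }
        }
        return centers;
    }

    public static void main(String[] args) {
        StationDirectory directory = new StationDirectory();
        directory.register("center-1", Arrays.asList("STA1", "STA2")); // hypothetical names
        directory.register("center-2", Arrays.asList("STA3"));
        System.out.println(directory.centersFor("STA3"));
    }
}
```

Keeping the station information in one place means a request only has to be forwarded to the centers that can answer it, rather than broadcast to every data server.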

2.2 Data server

The data server has to be installed in every data center that stores original data from geomagnetic stations and provides them to data users through the WEB server in response to requests for the data and their format. As mentioned above, the data server is separated from the WEB server, which makes construction of the data distribution system very flexible. A problem, however, is that commands, for example those specifying how the data should be created in response to data users’ requests, must be sent through the internet, whereas usual data service systems with the WEB server and the data server in the same place can issue them directly. To solve this, our system uses Java RMI technology to send and receive objects or programs through the internet. This means that, basically, only the WEB server has to be changed if any small modification is made to a command, which eliminates the corresponding tasks at all the data centers. Every data server therefore has a Java RMI server to communicate with the WEB server and to execute requested jobs. Communication by Java RMI normally uses an RMI port, but such a port is usually closed by firewalls in strict network environments. To avoid this, an http port is used instead of an RMI port by means of RMI over http tunneling.

The jobs executed by the data server include searching data files in the data storage, converting data into a requested format, and creating the data archive, in response to data users’ requests. For effective data file search, data catalogue files holding information on every data file are prepared in advance. Just as one browses a catalogue before shopping, the data server checks whether the requested data exist in the data storage and, if so, where, and then shows the result to data users through the WEB server before they request creation of data archives. After receiving a request to create the data, the data server retrieves the data files from the file storage, converts the data format and creates the data archive. Within the whole procedure of our data distribution system, this series of processes is the heaviest task and takes the longest execution time. To reduce the processing time, multithread processing is used to divide a requested job into several small pieces and execute them concurrently. Finally, the created archive files are delivered to data users through the WEB server.

Besides maintaining the data storage, each data center is only required to carry out a few manual operations periodically: changing the primary station information files in the WEB server when stations are added or deleted, and creating data catalogue files in the data server when new data files are added. In this way, every data center can distribute their own data to the public through a WEB server.
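The multithread processing mentioned for the data server, in which a requested job is divided into small pieces that run concurrently, can be sketched in Java as follows. This is a hedged illustration rather than the EM NINJA implementation: splitting the job by station and month, the class name `ConcurrentConversion`, and the placeholder method `convertPiece` (which only builds a file name instead of converting real data) are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ConcurrentConversion {
    // Hypothetical placeholder for converting one piece of a request;
    // a real data server would read the data file and rewrite it in the output format.
    static String convertPiece(String station, String month, String format) {
        return station + "_" + month + "." + format;
    }

    // Splits a request into one task per (station, month) pair and runs the tasks
    // concurrently on a fixed-size thread pool. Results keep submission order.
    static List<String> convertAll(List<String> stations, List<String> months,
                                   String format, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String station : stations) {
                for (String month : months) {
                    Callable<String> task = () -> convertPiece(station, month, format);
                    pending.add(pool.submit(task));
                }
            }
            List<String> files = new ArrayList<>();
            for (Future<String> future : pending) {
                files.add(future.get()); // wait for each piece to finish
            }
            return files;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("conversion interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("conversion failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the pieces are independent, the total waiting time is bounded by the slowest piece per thread rather than by the sum of all pieces, which is why this step benefits most from concurrency.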

2.3 User interface

Data users are only required to have a standard WEB browser to view the html documents of the user interface served by the WEB server. The data search and request parameters chosen on the user interface are sent to the Java servlet in the WEB server, and the subsequent procedures are carried out in the WEB and data servers. With this system, data users can download data from multiple stations through the WWW at once. In the WEB browser windows of the user interface, a simple search can be made by data period and station name. Advanced search options are also available based on global geomagnetic activity, for example the daily sum of the eight Kp indices, the five or ten quietest days, and the five most disturbed days [Menvielle and Berthelier, 1991]. Furthermore, an option to limit the missing rate of the data is provided so that users can avoid downloading data that include unavailable periods (Fig. 2). The output data are one-minute values, and the supported output formats are the WDC 1-min, IAGA2000, IAGA2002 [e.g., WDC Kyoto, 2002], INTERMAGNET IMFV1.22, and INTERMAGNET CD-ROM formats [INTERMAGNET, 2004], in addition to the original OHP format. After requesting the data, data users are only required to enter their names and e-mail addresses to agree to the data licenses. Finally, data users can download a TAR archive file of the data.

3. System performance

The performance of the EM NINJA system was tested for data requests and downloads in comparison with a WEB data service managed by the World Data Center for Geomagnetism, Kyoto (WDC Kyoto, http://swdcwww.kugi.kyoto-u.ac.jp/mdplt/index.html), which provides geomagnetic data from a global permanent observatory network and is commonly used in the geoelectromagnetic community. The tests simply measured the waiting time from the start of a data request to the completion of data archive creation. The measured time therefore basically covers data format conversion and data archive creation in the server, which are the main parts of the data processing. The results are shown in Figure 3 for various output data formats and amounts of requested data. The vertical axis indicates the waiting time per month of data; that is, the actual waiting time for three months of data is three times the value in the figure. In these tests, the EM NINJA system was as quick as or quicker than the data service of WDC Kyoto. These results of course depend not only on the performance of the system but also on the performance of the machines used as servers, but they can serve as rough indicators of the convenience of a system because they reflect the actual waiting time experienced by data users. The EM NINJA system was therefore found to be as useful as the data service of WDC Kyoto.

4. Conclusions

We developed a new networked electromagnetic data distribution system, named EM NINJA, following the original NINJA system. The system was designed to be easy to use for both data users and data providers, with minimal tasks required to distribute data through a single (or a few) integrated WWW server(s), and was written in Java so that it works on any platform and operating system. It consists of three main parts: a WEB server, a data server, and a user interface. The WEB server is the intermediate part between data users and data providers, that is, the center of our network system. Communication with other parts and branches uses only an http port, which makes management of internet security with a firewall easy and robust. The data server can execute jobs quickly by multithread processing. The user interface is easy and flexible for data users, offering both simple and advanced search options. In tests of system performance, EM NINJA was found to be quick and useful, performing as quickly as or more quickly than the conventional data service of WDC Kyoto.

Acknowledgements. The EM NINJA system was developed in cooperation with Alexei Gorbatov of Geoscience Australia; Masahiro Ichiki of IFREE, JAMSTEC; Hisayoshi Shimizu and Hisashi Utada of the Earthquake Research Institute, the University of Tokyo; and Toshiyuki Nakashima and Takuya Arai of Fujitsu Limited.

References

Fukao, Y., The Ocean Hemisphere Network Project, Proc. OHP International Symposium, 9, 1997.
INTERMAGNET, INTERMAGNET technical reference manual, ver. 4.2, 92 pp., 2004.
Menvielle, M. and A. Berthelier, The K-derived planetary indices: description and availability, Reviews of Geophysics, 29, 415-432, 1991.
Shimizu, H. and H. Utada, Ocean Hemisphere Geomagnetic Network: its instrumental design and perspective for long-term geomagnetic observations in the Pacific, Earth Planets and Space, 51, 917-932, 1999.
Takeuchi, N., S. Watada, S. Tsuboi, Y. Fukao, M. Kobayashi, Y. Matsuzaki and T. Nakashima, Application of distributed object technology to seismic waveform distribution, Seismological Research Letters, 73, 166-172, 2002.
WDC Kyoto (World Data Center for Geomagnetism, Kyoto), Data catalogue, no. 26, 122 pp., 2002.


Figure 1. Schematic image of EM NINJA system. This system mainly consists of three parts, which are a data user interface, a WEB server and a data server.

Figure 2. Advanced search window. Data users can download the EM data of several stations simultaneously. In addition to the standard search options of data period and station name, advanced search options are available based on global geomagnetic activity (Kp index, geomagnetically quietest and most disturbed days) and on the data missing rate.


Figure 3. Comparison of the waiting time from data request to data download between the WEB server of WDC Kyoto and EM NINJA for various output data formats and amounts of data, as of January 28th 2005. The vertical axis indicates the waiting time per month of data. The output data formats supported by WDC Kyoto are the WDC 1-min, IAGA2000 and IAGA2002 formats.
