SMA Technical Memo #138

The SMA On-Line Data Archive and Storage System: Software Development and Hardware Prospects

Jun-Hui Zhao, Patricia Mailhot & Takahiro Tsutsumi

January 26, 2000

ABSTRACT

As the MIT/SAO correlator comes on line, the maximum data production rate from the Submillimeter Array (SMA) on Mauna Kea in Hawaii could reach 2.75 Mbytes per sample. For a typical integration time of 10 sec, the SMA would produce roughly 20 Gbytes of data per day. The raw SMA visibility data must be accessible to scientists at both the CfA in Cambridge and the ASIAA in Taipei, and eventually to the international astronomical community. Archiving the data and managing the database will therefore become a challenging issue for the SMA project, but a good solution can be achieved by utilizing existing resources. In this memo, we present a state-of-the-art design for the SMA Astronomical Data Archive and Storage System and describe in detail the development of the archiving software. We also review the hardware devices required to implement the massive data storage system and evaluate them on the basis of current technology and market performance.
Contents

1. Introduction
2. Software Design and Development
   2.1. The Server Architecture
        2.1.1. Data Transfer and Storage Server
        2.1.2. Sybase SQL Server
        2.1.3. Replication Server
   2.2. SMADB -- The SMA Astronomical Database
   2.3. Java Database Connectivity
        2.3.1. jConnect and HTTP Server
        2.3.2. SmaJIsql Applet -- The SMA Database GUI
   2.4. System Setup and Backup Plan
3. Hardware Prospects
   3.1. Data Rate
   3.2. Requirements
   3.3. Choice of Hardware
        3.3.1. Removable Media
   3.4. Recommendation
4. Remarks
   4.1. Secondary Replication Sites
   4.2. Other SMA On-line Databases
   4.3. Dedicated Network Link
   4.4. Temporary Solution for Data Storage
5. Summary
Acknowledgement
References

List of Figures

1. The SMA On-Line Archive Server Architecture
2. Replication Sites of the SMA On-Line Data Archive
3. SMADB -- SMA Astronomical Data Model in Sybase
4. SmaJIsql Applet -- The SMA Astronomical Database GUI

List of Tables

1. The Table Model of the SMA Astronomical Database
2. Hardware Properties for the Mass Storage System
1. Introduction
Data transfer and archiving are a major step in the processing of astronomical data. Because of the high cost of mass storage hardware and the speed limitations of network communication, half a decade ago relatively little effort was expended on archiving astronomical data with the emphasis on making the raw data permanently on-line and immediately accessible. Owing to recent developments in technology, the cost of designing and building an on-line archive system has dropped significantly. It has become possible, and affordable for a project like the SMA, to develop and maintain an on-line archive system. A solid design provides users with easy and quick access and also provides high efficiency and good performance in data management.

The storage and archive techniques used in modern computer systems are composed of several hierarchical levels. Four levels are needed to make an archive successful, and they can be divided into two major categories, i.e. hardware and software. For the hardware: 1. the storage medium (such as juke box, disk array, CD-ROM, tape reel, ...); 2. the reading or backup devices (disk drive, tape drive, ...). For the software: 3. the data structure on the medium (file system or standard format); 4. control and management procedures. A general discussion of these levels can be found in Pasian and Pirenne (1995).

In this memo, we present a state-of-the-art design for the SMA Astronomical Data Archive System by giving the details of each level. The rest of the memo is organized as follows: in Section 2, we describe the software design and development for the SMA on-line archive system; in Section 3, we review the prospects of the hardware necessary to implement the entire system; Section 4 gives a few remarks on this system; and Section 5 is a summary.
2. Software Design and Development
The design of this on-line archive system is based on the existing resources available to the SMA, including software, hardware and manpower, so that the cost of data processing can be minimized. We also require that the final system (at least the software part) be portable and hardware independent. Our goal is to create a system that is easy to maintain and upgrade.

A key commercial database management system (DBMS) in use at the SMA is Sybase. Recently, this system has been upgraded to the Sybase SQL Server System 11 at both the Cambridge site and the Mauna Kea site. This relational DBMS (or RDBMS) provides a good development environment for enterprise client/server applications. The database software discussed in this memo utilizes this RDBMS.
2.1. The Server Architecture
Fig. 1 shows an overview of the SMA on-line archive server architecture. The visibility data produced by the SMA correlator (Crates), along with the ancillary data, are transferred via RPC to the data computer Smadata (Sun Ultra 60/Solaris). Smadata is the central host on which almost all of the post-correlator data processing, in terms of real-time correction, flagging, formatting and archiving, occurs. In addition, this data computer also hosts the Sybase SQL Server, the Replication Server and the HTTP server.
2.1.1. Data Transfer and Storage Server
The RPC server smadata_svc provides several data services to process the data received from the real-time computers (Crates and Hal). As soon as the server receives a wake-up signal (which signifies the start of an observing run) from a real-time computer, several FITS-IDI (Diamond et al. 1997; Flatters 1998) table files are created with a postfix indicating the type of data table (such as FILENAME.UV, FILENAME.AG, FILENAME.AN, FILENAME.FQ, FILENAME.SU, ...). The UV data transferred from the Crates are assembled by the server process smadata_svc along with the random parameters for each integration and each baseline. These data are appended continuously to FILENAME.UV as the observing and real-time correlation process goes on. The ancillary data are appended to the associated tables whenever updating information occurs, either as the source changes or as the correlator configuration changes. Immediately after each observing track, these FITS tables are packed into a single FITS-IDI file, which is then moved to the on-line storage system. Meanwhile, in the process FITStoDB (Tsutsumi & Zhao 1999), the FITS header information, along with other ancillary data such as the setup parameters for the correlator configuration and the information on the location of the FITS-IDI file, is stored in the SMA Astronomical Database (SMADB, see Section 2.2) in Sybase.
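To illustrate the bookkeeping just described, the fragment below is a purely schematic sketch (written in Java for consistency with the other examples in this memo; the actual smadata_svc is an RPC service and the real tables are binary FITS extensions rather than plain byte streams): one table file per data type is created when the wake-up signal arrives, and records are appended as each integration is correlated. All names are illustrative.

    import java.io.FileOutputStream;
    import java.io.IOException;

    // Illustrative sketch only: real FITS-IDI tables are binary FITS extensions.
    public class TrackFiles {
        // Table types created at the start of an observing track (see text).
        static final String[] SUFFIXES = {"UV", "AG", "AN", "FQ", "SU"};

        private final String base;   // e.g. the observing program code

        TrackFiles(String base) { this.base = base; }

        // Called once when the wake-up signal is received from a real-time computer.
        void createTables() throws IOException {
            for (String s : SUFFIXES) {
                new FileOutputStream(base + "." + s).close();   // create empty table file
            }
        }

        // Called for every integration/baseline: append the record to FILENAME.UV.
        void appendUV(byte[] record) throws IOException {
            try (FileOutputStream uv = new FileOutputStream(base + ".UV", true)) {
                uv.write(record);
            }
        }
    }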
[Figure 1 (diagram): The SMA On-Line Archive Server Architecture. Data flow from the SMA correlator Crates and the real-time computer Hal (Power-PC, RPC clients) to the RPC server smadata_svc on Smadata (Sun Ultra 60/Solaris), which stores the FITS-IDI files in the on-line data storage device (TBD) and, via the FITStoDB process, stores the FITS header information in the DBMS (SQL Server, Replication Server, and the databases smadb, docdb, dersdb). The HTTP server delivers the Java applets and jConnect JDBC drivers to client Web browsers, and the FITS-IDI files are served by FTP or HTTP.]
Fig. 1: The SMA Archive Server Architecture. The host computer is Smadata, a Sun Ultra 60 running Solaris. The RDBMS is Sybase, and the JDBC utilizes jConnect from Sybase. This configuration is for the primary site, currently located on the summit of Mauna Kea; the system will be moved to the SMA headquarters in Hilo. There already exists a dedicated network link (45 Mb/s) between Mauna Kea and Hilo. This system can be shared by SMADB and other SMA databases, such as DERSDB, DOCDB, etc.
The details regarding the data transfer and the data structure of the FITS-IDI standard have been discussed in SMA Technical Memo 126 (Zhao 1998). Two points that we would like to emphasize: (1) The storage format for the data taken from a single observing run must follow an international standard that is platform and medium independent. FITS-IDI is the best standard for packing up the raw visibility data in real time. A FITS binary table is a collection of row data fields organized into columns; this structure is similar to the table structure used in a relational database, so the information contained in a FITS-IDI file can easily be organized into our RDBMS. (2) The reason why we do not store all the interferometer data, including the visibilities and other ancillary data, in Sybase is that the SMA data sets will be very large, in particular if the integration time is shorter than 10 sec. The procedures and I/O within the Sybase servers would not be efficient for handling large data files and would make the archiving process complicated and slow; this would also cause difficulty and problems in database management. The data in the FITS header and some of the associated tables are stored in SMADB. The individual FITS-IDI data files are stored on an on-line file system that requires special hardware devices, and the information about and location of each FITS-IDI file also goes into the RDBMS. These data are sufficient for users (astronomers) to review the basic information for each SMA observing run and to determine where to get the FITS-IDI data file that they want. Thus, with minimal I/O activity, the archive data model for our system becomes simple and easy to understand. Also, in this system the individual FITS-IDI files can be put on a portable medium for manual delivery without interrupting or slowing down the SQL Server. This storage design meets two basic requirements: (1) flexibility in the data retrieval process and (2) high efficiency in data management.
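To make the FITStoDB step concrete, the following is a minimal, hedged sketch (in Java, via JDBC) of how a few header values might be inserted into the RUN_LOG and DFILE tables described in Section 2.2. The actual FITStoDB implementation is described in Tsutsumi & Zhao (1999); the connection parameters, account, values and column subset below are placeholders for illustration only.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;

    // Illustrative sketch only: the real FITStoDB process parses the FITS-IDI
    // header; here the values are hard-wired placeholders.
    public class FitsToDbSketch {
        public static void main(String[] args) throws Exception {
            Class.forName("com.sybase.jdbc.SybDriver");   // jConnect driver (version-dependent)
            try (Connection con = DriverManager.getConnection(
                    "jdbc:sybase:Tds:smadata:4100/smadb", "archiver", "xxxxx")) {

                // General information for the observing run (Table 1, RUN_LOG).
                PreparedStatement run = con.prepareStatement(
                    "insert into RUN_LOG (obscode, job_id, observer, telescope) " +
                    "values (?, ?, ?, ?)");
                run.setString(1, "S2000A001");            // placeholder program code
                run.setInt(2, 1);
                run.setString(3, "J. Observer");
                run.setString(4, "SMA");
                run.executeUpdate();

                // Location and size of the archived FITS-IDI file (Table 10, DFILE).
                PreparedStatement df = con.prepareStatement(
                    "insert into DFILE (obscode, filename, location, size_Mb, status) " +
                    "values (?, ?, ?, ?, ?)");
                df.setString(1, "S2000A001");
                df.setString(2, "S2000A001.FITS");
                df.setString(3, "H:/archive/2000/");      // H = Hilo (see Section 2.3.2)
                df.setFloat(4, 7900f);
                df.setInt(5, 1);                          // 1 = open to the public
                df.executeUpdate();
            }
        }
    }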
2.1.2. Sybase SQL Server
The RDBMS is a commercial package provided by Sybase, Inc. Its relatively low cost (compared with other commercial packages such as Oracle) is suitable for the size of a project like the SMA. Supporting standard ANSI SQL (Structured Query Language), Sybase also has an excellent reputation as a SQL server, and the software functions it supports are sufficient to meet our requirements for archiving documentation and for the data management carried out at the SMA. Sybase SQL Server is available on many hardware platforms under a variety of operating systems. The SQL Server client platforms can be any of the following systems: MS-DOS or Windows PCs, OS/2 PCs, NeXT PCs, Unix workstations, Unix terminal servers, or Apple Macintoshes. This flexibility in hardware and operating system makes our on-line archive system portable and easy to upgrade in hardware. One of the major components of the Sybase SQL Server architecture is the Server Nucleus, the collection of software installed on a host platform that acts as a server to the client platforms. It includes the SQL Server Kernel, the Process Manager, the Query Optimizer, the system databases, and the application data.
2.1.3. Replication Server
Data replication is now one of the most common ways to distribute data to remote sites. If a large number of remote users and remote applications need to perform ad hoc queries and transaction processing, or if a large volume of data has to be transferred, the network can bog down. In the case of the SMA in particular, we have a primary database server at the Mauna Kea site, and most of the users will access the data remotely. Based on their location, the remote users can be divided into two groups, namely Cambridge (Massachusetts) based or Nankang (Taipei) based. For a large volume of SMA visibility data, users and applications might have to wait unacceptable lengths of time to receive a complete data set, and such transfers would also generate a large amount of network traffic. Instead, we can replicate the data to local servers, and users can access the data locally.

Sybase Replication Server provides an excellent solution for replicating a database at a remote secondary site. As illustrated in Fig. 2, Replication Server transmits data from the primary database server in Hilo (where the data originate) to the secondary sites (the Cambridge site and/or the Nankang site) on the network. Rather than requiring each user to access the remote database server over a long-distance network, the Replication Server handles the data transmission and ensures that each local server has an up-to-date copy of the data. We note that in this on-line archive system the RDBMS (Sybase) handles only the header information stored in SMADB (the SMA Astronomical Database, see Section 2.2), while the visibility data in the mass storage system are handled separately. Since the volume of the header information is small, it can easily be replicated through the network. During the testing period, the visibility data might be small enough to allow us to duplicate all the data, including the data in SMADB and the FITS-IDI files, at the remote sites via the existing network link.
[Figure 2 (diagram): Replication Sites of the SMA On-Line Data Archive. The primary site in Hilo (fed from Mauna Kea over the dedicated 45 Mb/s (30 T1) link) hosts smadata_svc, FITStoDB, the SQL Server with smadb, a Replication Server and a data storage device. The secondary sites in Cambridge and Nankang each host a SQL Server with smadb, a Replication Server and a data storage device serving their local clients. SMADB contents are replicated over the network; FITS-IDI files travel either over the network or by express air delivery of portable media.]
Fig. 2: A concept for the replication sites of the SMA Astronomical Data Archive. Hilo is considered the primary site; the secondary sites are in Cambridge and Nankang. The data in SMADB can be replicated through the normal network link; however, the large FITS-IDI files need either a dedicated link or a portable medium combined with an express air delivery service. It is not necessary for the ASIAA to follow this design; they can develop a database system based on their own resources, and the FITS-IDI files can be duplicated from the primary site.
As the full correlator operation comes on line, duplication through the network will run into problems owing to the limited bandwidth of the network transfer; either a dedicated network link or a portable medium combined with a commercial express delivery service will be required. The approach proposed here is a remedy for the problems associated with the distribution of, and direct access to, the SMA data. Using Replication Server is efficient, because Replication Server replicates only the original data that are added, modified, or deleted. It is also fast, because Replication Server copies all of the data to the remote server, so the remote users can access it over their Local Area Network (LAN). Replication Server provides an additional important advantage: if the local data server or local network is down and transactions need to be replicated, Replication Server performs all of the necessary synchronization once the connection to the local network becomes available again. This is an excellent function for disaster recovery. Replication Server will also effectively reduce the manpower required for the maintenance of the SMA Astronomical Database. However, the Replication Server has not yet been set up in our system. We strongly propose to implement the Sybase Replication Server in our database server system in the next software upgrade; the cost would be minor compared with the returns.
2.2. SMADB -- The SMA Astronomical Database
The data model shown in Fig. 3 is based on the data structure of the FITS-IDI file and on the information obtained via the real-time data collection process. Ten relational Sybase tables are needed to model SMADB. Using this data model, the physical schema is designed (see Tables 1-10). Table 1 (RUN_LOG) contains the general information for each observing run. The information about the correlator that generates the visibility data is included in Table 2 (CORR). The mandatory keywords of each FITS-IDI file are stored in Table 3 (FITS_KEY). The general information on the FITS tables in each FITS-IDI file is stored in Table 4 (TBL_NM). The parameters for the frequency setup, the source coordinates and the velocities are stored in Table 5 (FREQ), Table 6 (SOUR), and Table 7 (VELO). The information regarding the array geometry is saved in Table 8 (ARR_GEO). The information on the visibility table and on each FITS-IDI file can be found in Table 9 (VIS) and Table 10 (DFILE), respectively. The tables are laid out as follows: the 1st column is the variable name, the 2nd column is the data type, and the comments on each variable are given in the 3rd column.
Tables 1-10: The SMA Astronomical Data Model

1. RUN_LOG
   obscode    char(20)          /* observing program code */
   job_id     smallint          /* job ID for current script file */
   observer   char(20)          /* observer's name */
   telescope  char(16)          /* telescope name */
   start      double precision  /* job start time (MJD) */
   stop       double precision  /* job stop time (MJD) */
   date       char(10)          /* date of observation */
   clustered index: obscode
   index: job_id, observer, start, date
   byte size: 84

2. CORR
   obscode        char(20)          /* observing program code */
   corr_name      char(12)          /* correlator name */
   corr_vers      char(12)          /* correlator software version */
   corr_mode      char(12)          /* correlator mode */
   no_pol         tinyint           /* # of polarizations */
   ave_time       float             /* data averaging time */
   transition     char(48)          /* transition name */
   restfreq       double precision  /* line rest frequency for each band */
   sideband       smallint          /* sideband flag */
   smooth         char(12)          /* smooth function */
   band_idx       smallint          /* band index */
   block_ct_freq  float             /* block center frequency */
   no_chan_chck1  int               /* number of channels of 1st chunk */
   no_chan_chck2  int               /* number of channels of 2nd chunk */
   no_chan_chck3  int               /* number of channels of 3rd chunk */
   no_chan_chck4  int               /* number of channels of 4th chunk */
   clustered index: obscode
   index: corr_name, corr_vers, corr_mode, band_idx
   byte size: 153

3. FITS_KEY
   obscode    char(20)          /* observing program code */
   no_stkd    smallint          /* number of Stokes parameters */
   stk_1      smallint          /* first Stokes parameter in data */
   no_band    smallint          /* number of 'bands' in data */
   no_chan    int               /* number of spectral channels in data */
   ref_freq   double precision  /* reference frequency */
   chan_bw    float             /* frequency channel bandwidth */
   ref_pixel  double precision  /* coordinate for reference frequency */
   rdate      int               /* reference date (mjd) */
   clustered index: obscode
   index: rdate
   byte size: 54

4. TBL_NM
   obscode   char(20)  /* observing program code */
   filename  char(32)  /* FITS-IDI file name */
   tb_names  char(48)  /* list of FITS tables in the FITS-IDI file */
   tb_numb   int       /* number of FITS tables */
   clustered index: obscode
   index: tb_names
   byte size: 104

5. FREQ
   obscode     char(20)          /* observing program code */
   freqid      tinyint           /* frequency setup number */
   band_idx    smallint          /* band index */
   bandfreq    double precision  /* frequency offset + ref frequency */
   ch_width    float             /* individual channel width */
   total_BW    float             /* total bandwidth of a band */
   sideband    smallint          /* sideband flag */
   transition  char(48)          /* transition name */
   clustered index: obscode
   index: freqid, bandfreq, band_idx, transition
   byte size: 89

6. SOUR
   obscode        char(20)          /* observing program code */
   sou_id         tinyint           /* source ID number */
   sou_name       char(16)          /* source name */
   qual           smallint          /* source qualifier */
   calcode        char(2)           /* calibrator code */
   freqid         tinyint           /* frequency ID */
   time_onsource  char(6)           /* on-source time */
   epoch          double precision  /* epoch for the source coordinates */
   cra            char(12)          /* RA at the epoch as a character string */
   cdec           char(12)          /* Dec at the epoch as a character string */
   ra             double precision  /* RA at the epoch in deg. */
   dec            double precision  /* Dec at the epoch in deg. */
   pmra           double precision  /* proper motion in RA */
   pmdec          double precision  /* proper motion in Dec */
   parallax       float             /* parallax of source */
   clustered index: obscode
   index: sou_id, sou_name, freqid, qual, calc, ra, dec
   byte size: 116

7. VELO
   obscode   char(20)          /* observing program code */
   sou_id    tinyint           /* source ID number */
   freqid    tinyint           /* frequency ID */
   band_idx  smallint          /* band index */
   sysvel    double precision  /* systematic velocity for each band */
   veltyp    char(8)           /* velocity type */
   veldef    char(8)           /* velocity definition */
   restfreq  double precision  /* line rest frequency for each band */
   clustered index: obscode
   index: sou_id, band_idx, freqid, sysvel, veltyp, veldef, restfreq
   byte size: 56

8. ARR_GEO
   obscode  char(20)  /* observing program code */
   arrname  char(12)  /* array name */
   frame    char(12)  /* coordinate frame */
   arrayx   float     /* x coordinate of array center */
   arrayy   float     /* y coordinate of array center */
   arrayz   float     /* z coordinate of array center */
   freq     float     /* reference frequency for the array */
   timesys  char(12)  /* time system */
   rdate    int       /* reference date (mjd) */
   gstia0   float     /* GST at 0h on reference date */
   ut1utc   float     /* UT1-UTC */
   iatutc   float     /* IAT-UTC */
   polarx   float     /* x coordinate of north pole */
   polary   float     /* y coordinate of north pole */
   clustered index: obscode
   index: arrname, arrayx, arrayy, arrayz, rdate
   byte size: 96

9. VIS
   obscode          char(20)          /* observing program code */
   filename         char(32)          /* file name */
   rdate            int               /* reference date (mjd) */
   begin_time       double precision  /* begin time */
   end_time         double precision  /* end time */
   number_baseline  tinyint           /* number of baselines */
   array_id         tinyint           /* sub-array ID */
   total_inttime    float             /* total integration time */
   #vis             int               /* total number of visibilities */
   clustered index: obscode
   index: rdate, begin_time, array_id
   byte size: 82

10. DFILE
   obscode    char(20)  /* observing program code */
   filename   char(32)  /* FITS-IDI file name */
   location   char(32)  /* location of the FITS file on the on-line system */
   size_Mb    float     /* file size in Mbytes */
   #source    tinyint   /* number of sources */
   arch_date  char(10)  /* archiving date */
   status     int       /* flag identifying whether the data are open to the public or restricted (1: yes; -1: no) */
   clustered index: obscode
   composite index: full_filename (filename, location)
   byte size: 103
The parameter "obscode" is unique for each observing run and has been designated as the clustered index in this relational database. The clustered index of a table (there can be only one per table) determines the physical ordering of the table's rows; SMADB is therefore organized in the order of the obscode assigned to each observing run. For each of the tables we also list the nonclustered indexes. Such an index provides access to the table's data in an alternate order. Although access via this type of index is not as fast as via the clustered index, nonclustered indexes allow the user to look at the table's data in more than one way; for example, we can look at the data ordered by the source's coordinates (either ra or dec). One composite index, full_filename (filename, location), is used in this database; it is composed of two column variables in DFILE. Composite indexes are helpful when two or more columns are best searched as a unit. The tables can be linked to each other through the common parameters (primary keys) as indicated in Fig. 3.

We also list the byte size at the end of each table. A total of about 1 Kbyte is estimated for each set of table data. For a typical observing track the observing time is about 8 hrs; in other words, the SMA will produce three sets of table data per day that need to be stored in SMADB. Assuming 300 observing days each year, only about 1 Mbyte of table data will be produced for the database per year if no text files of scientific proposals are stored in Sybase. Optionally, we could also store the text files of the scientific proposals in Sybase; the size of the database would then increase by roughly an order of magnitude.
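For concreteness, a sketch of how Table 1 (RUN_LOG) and its indexes could be declared is shown below, with the DDL issued through JDBC for consistency with the other examples in this memo. The data types and index columns follow the listing above; the index names and the exact Transact-SQL of the original creation scripts are not given in this memo, so treat them as assumptions.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    // Approximate DDL for Table 1 (RUN_LOG); see the schema listing above.
    public class CreateRunLog {
        public static void main(String[] args) throws Exception {
            Class.forName("com.sybase.jdbc.SybDriver");
            try (Connection con = DriverManager.getConnection(
                     "jdbc:sybase:Tds:smadata:4100/smadb", "sa", "xxxxx");
                 Statement st = con.createStatement()) {

                st.executeUpdate(
                    "create table RUN_LOG (" +
                    " obscode   char(20) not null," +           // observing program code
                    " job_id    smallint null," +                // job ID for current script file
                    " observer  char(20) null," +
                    " telescope char(16) null," +
                    " start     double precision null," +        // job start time (MJD)
                    " stop      double precision null," +        // job stop time (MJD)
                    " date      char(10) null)");                // date of observation

                // One clustered index per table: rows are physically ordered by obscode.
                st.executeUpdate(
                    "create clustered index run_log_ci on RUN_LOG (obscode)");

                // A nonclustered index gives the alternate access path discussed above.
                st.executeUpdate(
                    "create index run_log_nc on RUN_LOG (job_id, observer, start, date)");
            }
        }
    }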
[Figure 3 (diagram): SMADB -- the SMA Astronomical Data Model in Sybase. The ten tables (RUN_LOG, CORR, FITS_KEY, TBL_NM, FREQ, SOUR, VELO, ARR_GEO, VIS, DFILE) are linked through common parameters such as obscode, filename, freqid, sou_id, band_idx, rdate, restfreq and transition.]
Fig. 3: The data table model for SMADB. Ten relational tables are used in this database. The tables can be linked to each other with the parameters labelled along the connection lines.
2.3. Java Database Connectivity
JDBC (Java Database Connectivity) is a Java API (Application Programming Interface) for executing SQL statements. It consists of a set of classes and interfaces written in the Java programming language. JDBC provides a standard API for tool/database developers and makes it possible to write database applications using a pure Java API.
2.3.1. jConnect and HTTP Server
A JDBC driver, jConnect, purchased from Sybase, has been installed on the server host computer Smadata. The basic configuration of the SMA On-Line Archive System is illustrated in Fig. 1. JDBC provides a standard Java API that allows us to develop a specific Java applet GUI (Graphical User Interface) to communicate with SMADB via the SQL Server. The data computer also hosts an HTTP server, which provides a port from which outside clients can download the Java applets and thereby establish a connection with the database server. As soon as the client/server connection is established, data transactions can proceed via the network. In principle, a client can be anywhere in the world: as long as the client is allowed to access the computer network (such as the Internet) and the client computer is equipped with a Java-capable Web browser, the client should be able to access the SMA On-Line Database.
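A minimal sketch of the client-side connection established through jConnect is given below. The driver class name and the jdbc:sybase:Tds URL form are those used by jConnect (the exact class name depends on the jConnect version); the host name and account are placeholders.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    // Minimal jConnect client sketch: connect to the SQL Server and run one query.
    public class SmadbClientSketch {
        public static void main(String[] args) throws Exception {
            Class.forName("com.sybase.jdbc.SybDriver");   // register the jConnect driver

            String url = "jdbc:sybase:Tds:smadata:4100/smadb";   // placeholder host name
            try (Connection con = DriverManager.getConnection(url, "guest", "xxxxx");
                 Statement st = con.createStatement();
                 ResultSet rs = st.executeQuery(
                     "select obscode, observer, telescope from RUN_LOG order by obscode")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1) + "  "
                                     + rs.getString(2) + "  " + rs.getString(3));
                }
            }
        }
    }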
2.3.2. SmaJIsql Applet -- The SMA Database GUI
SmaJIsql is a Graphical User Interface (GUI) specially designed for SMADB using a Java applet. It is pure Java code utilizing JDBC components. The design of this GUI has the flexibility to access the data in SMADB in a way that meets the needs of general users in astronomy. The current version is SmaJIsql 1.0.

[Figure 4 (screenshot): the SmaJIsql applet, showing the connection fields (SQL Server host name 131.142.12.246, port 4100, username, password, database smadb), the search fields (Object Name, Object RA, Object Dec, Time on source, Band Width, Array Configuration, Radius, Equinox, Correlator Mode, Observing Band, Observing Date, Observer Name, Program ID, Title of Proposal, Dataset Name), the Query selector (get_SourInfo), the Search and ftp buttons, and the Results panel.]

Fig. 4: The SMA Astronomical Database GUI built as a Java applet using JDBC components. It can be downloaded with a Java-capable Web browser and run on a client computer. Data transactions between users and SMADB can be carried out via this interface.

The GUI of SmaJIsql 1.0, as illustrated in Fig. 4, consists of five major parts:

1. The data fields for the input parameters: This part is composed of four rows. The first row contains the parameters for establishing a connection to SMADB, including the IP address of the host computer, the SQL Server port number, the username and password for database access, and the database name. In the current version (SmaJIsql 1.0) the password for SMADB is encoded and the input field for this parameter is locked; for security, we will remove the first row in the next version upgrade. The next three rows hold the search parameters. The object name, its coordinates, the observing time on source and the observing bandwidth are listed in the second row. The third row holds search parameters such as the array configuration, the search radius on the sky, the equinox of the coordinates, the correlator mode and the observing band; each of them is attached to a multiple-choice list. These choice lists, which are created from the Choice class in Java, are components that let a single item be picked from a pull-down menu. The fourth row includes the following search parameters: observing date, observer name, program ID, title of proposal, and data set name. In SmaJIsql 1.0, not all of the search parameters have been applied to the query processes.

2. Query options: Three query options are provided in SmaJIsql 1.0, each corresponding to a stored procedure embedded in SMADB. Stored procedures are programs written using ANSI SQL commands combined with standard programming constructs; they are secure and efficient. The first query option, get_SourInfo, searches for source information by selecting Object Name and/or Object RA (input format hh:mm:ss.pp), Dec (input format dd:am:as.p) and the search radius. The returned results are the SMA program ID code (Obs_code), object name (Source), object right ascension (Ra), object declination (Dec), calibration code (calc; T: target, C: calibrator), total integration time (Otime), observing date in mm/dd/yyyy (Obs_date), and principal observer name (P.I. name). The second query option, get_FileInfo, searches for data file information by selecting Program ID, Principal observer name (Observer Name), or Observing date (in format mm/dd/yyyy). The returned results are the SMA program ID code (Obs_code), principal observer name (P.I. name), observing date (Obs_date), number of visibilities (#vis), size of the data file in Mbytes (Size), number of sources contained in each file (#source), location of the file (the first character code C: Cambridge, H: Hilo, M: Mauna Kea), and the data file name. The third query option, get_FreqInfo, searches for frequency setup information by selecting Object Name, Observing Band or Program ID. The returned variables are the SMA program ID code (Obs_code), object name (Source), observing frequency (Obs_freq in Hz), channel width (Ch_width in Hz), bandwidth (BW in Hz), transition name (Transition), and systematic velocity (Sys_vel in km/s). More query options can be added in order to meet users' requirements; a hedged sketch of how the applet invokes one of these stored procedures is given after this list.

3. Search: The Search bar is an action key. Clicking it starts a search based on the input parameters and the selected query.

4. FTP or HTTP: Clicking the ftp bar links the user to the FTP and HTTP page from which the FITS-IDI files can be found; the data can then be transferred via either ftp or http.

5. Results: This panel lists the results returned from each search, based on the selected query option.
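As promised above, here is a hedged sketch of how the applet could invoke the get_SourInfo stored procedure through the standard JDBC call escape. The parameter list of get_SourInfo is not spelled out in this memo, so the argument order, formats and units shown are assumptions based on the description of the first query option; the credentials are placeholders.

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;

    // Sketch of a get_SourInfo call; the parameter order and units are assumed.
    public class QuerySketch {
        public static void main(String[] args) throws Exception {
            Class.forName("com.sybase.jdbc.SybDriver");
            try (Connection con = DriverManager.getConnection(
                     "jdbc:sybase:Tds:131.142.12.246:4100/smadb", "guest", "xxxxx")) {

                // Cone search around a position: name, RA, Dec, search radius.
                CallableStatement cs = con.prepareCall("{call get_SourInfo (?, ?, ?, ?)}");
                cs.setString(1, "Any");             // Object Name ("Any" = no constraint)
                cs.setString(2, "05:35:14.5");      // Object RA, hh:mm:ss.pp
                cs.setString(3, "-05:22:30.0");     // Object Dec, +dd:am:as.p
                cs.setFloat(4, 60.0f);              // search radius (assumed to be in arcsec)

                try (ResultSet rs = cs.executeQuery()) {
                    while (rs.next()) {
                        // Columns returned (see text): Obs_code, Source, Ra, Dec, calc,
                        // Otime, Obs_date, P.I. name.
                        System.out.println(rs.getString(1) + "  " + rs.getString(2));
                    }
                }
            }
        }
    }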
2.4. System Setup and Backup Plan
The SQL Server for the SMA data resides on a Sun Ultra 60 running Solaris 2.7, and the SMA is running Sybase 11.5.1 at both the Cambridge and Hawaii sites. Although Sybase SQL Server can use either a file system or raw partitions, we opt for raw partitions for the security of the data, as recommended by Sybase. The Unix partitions are owned by 'sybase' (a user account on the Unix system), and the server daemon, along with all DBCC (database consistency checker) and backup scripts, is executed by 'sybase'. The master database (which contains all the system information) is mirrored; in case the original device for the master database is damaged, the SQL Server can start from the mirrored device. The backup and disaster recovery plan has two levels:

Level One: User database recovery. Any database owner can retrieve a copy of their database on-line by executing an SQL command within Sybase.

Level Two: System recovery. The master database can be retrieved on-line, and the key system tables and data are backed up.

The incorporation of the Sybase Replication Server is scheduled for implementation next year. The dual purpose of this server is disaster recovery and data reliability; it will also allow us to eliminate the need for mirroring databases with high levels of I/O.
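For concreteness, the Level One step corresponds to the standard Sybase dump/load cycle. The fragment below is only a hedged illustration of the commands involved (in practice they are issued by the 'sybase' account from the backup scripts or from isql rather than from application code), and the dump device path is purely illustrative.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    // Hedged sketch of a user-database backup; path and credentials are placeholders.
    public class BackupSketch {
        public static void main(String[] args) throws Exception {
            Class.forName("com.sybase.jdbc.SybDriver");
            try (Connection con = DriverManager.getConnection(
                     "jdbc:sybase:Tds:smadata:4100/master", "sa", "xxxxx");
                 Statement st = con.createStatement()) {

                // Dump the user database to a file handled by the Backup Server.
                st.execute("dump database smadb to '/export/sybdumps/smadb.dmp'");

                // Recovery reverses the operation:
                //   load database smadb from '/export/sybdumps/smadb.dmp'
                //   online database smadb
            }
        }
    }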
3. Hardware Prospects
As the SMA becomes fully operational (with all eight antennas and the full MIT/SAO correlator), the storage of the data becomes a major issue for the on-line data archive system described in the previous sections. We will inevitably need a high-capacity mass storage system. Fortunately, the technology for storing large volumes of data is advancing rapidly, and numerous choices are available today. However, hardware can be more difficult to replace than software, so we should choose the mass storage devices carefully. In this section we discuss the criteria for choosing the hardware and compare the devices that current mass storage technology offers. At the end, recommendations are given to aid future decision making on the purchase of the hardware.
3.1. Data Rate
In SMA Technical Memo #99 (Masson 1996), the data rate and the backup storage for the SMA were discussed. Here we update that discussion based on the current specification of the SMA. With the eight-element SMA in full operation, 2688 chips with 128 lags/chip (64 spectral channels), or 172,032 channels, are available. The maximum data rate will therefore be 172,032 x 8 bytes (complex visibilities) x 2 (DSB) = 2.75 Mbytes per sample. For a typical full-track synthesis observation of 8 hours, assuming a typical sampling interval of 10 seconds, 7.9 Gbytes of data per track, or about 16-24 Gbytes per day, will be produced. For high-resolution array modes, including the SMA + JCMT + CSO combined array mode, higher sampling rates are expected if phase correction techniques are to be applied. If we accumulate data at the above rate and assume 256 observing days per year (an operational efficiency of 70%), about 4-6 Tbytes of data will be produced in one year. The data can be put temporarily on hard disks, but it is apparent that we need massive storage space for longer-term data storage. Note that the above discussion assumes that all the data are stored without any on-line calibration (e.g. on-line phase correction), which could reduce the data volume drastically. Also, it is likely that many projects (for example continuum observations) will produce much less data per unit time. On average, we expect the data production rate to be about 1 Tbyte per year.
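Written out, the arithmetic behind these numbers is:

\[ 2688~\mathrm{chips} \times 64~\mathrm{channels/chip} = 172\,032~\mathrm{channels}, \]
\[ 172\,032 \times 8~\mathrm{bytes} \times 2~(\mathrm{DSB}) \simeq 2.75~\mathrm{Mbytes~per~sample}, \]
\[ \frac{8~\mathrm{hr} \times 3600~\mathrm{s/hr}}{10~\mathrm{s~per~sample}} \times 2.75~\mathrm{Mbytes} \simeq 7.9~\mathrm{Gbytes~per~track}, \]
\[ 16\mbox{--}24~\mathrm{Gbytes/day} \times 256~\mathrm{days} \simeq 4\mbox{--}6~\mathrm{Tbytes~per~year}. \]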
3.2. Requirements
The currently proposed design of the replication sites, as discussed in the previous sections, requires the installation of data storage devices in Hilo, Cambridge, and Nankang. At the primary site in Hilo, we should consider both temporary and long-term data storage, holding at least a few months' worth of data at the maximum data rate. Cambridge and Nankang should each have a mass storage device holding about 1 to 2 years of full-operation data. In the future, if all the SMA scientific data need to be made available to the wider astronomical community, the Cambridge and Nankang sites are likely to become science data centers and may need to increase their storage capacity.
3.3. Choice of Hardware
There are two types of mass storage devices: magnetic disks and removable media. For a storage system based on hard disks, one can use a disk array such as RAID (Redundant Array of Inexpensive Disks) to increase data redundancy for reliability. One of the advantages of such a system is that all the data can be available on-line all the time. However, the additional cost of a secondary backup on tapes may need to be considered, and the initial setup to create data redundancy may be expensive. The other alternative is the use of removable media. For most of the high-density removable media, devices that use robotic mechanisms are available for building mass storage systems; these are sometimes called auto-changers or juke boxes, or, in the case of tape media, (tape) libraries. With such devices one can build a semi-on-line data archiving system with a large volume capacity.
3.3.1. Removable Media
There are many factors in choosing the right hardware for our mass storage needs. Some of the key issues are: volume, easy access, easy maintenance, reliability, cost of the media and of the system as a whole, I/O speed, durability, and the lifetime of the technology. In Table 2, the high-density media most commonly used for mass storage systems are evaluated in terms of these criteria.

Tapes: Since tapes use sequential access, the data transfer rate is generally slow. The 8-mm Exabyte and DAT tapes have been a standard storage medium for astronomical data, and a tape auto-exchanger, or library, can store many hundreds or thousands of Gbytes of data. While these media are still very good for transporting and backing up data, they are not suitable for long-term storage because of the limited lifetime of the media. A recent development in tape storage technology is DLT (Digital Linear Tape). DLT is a cartridge tape on which data are recorded in longitudinal multi-tracks, versus the slanted stripes of helical-scan technology, which significantly increases the data access rate. DLT is durable (media lifetime of about 30 years) and has a very large capacity (10-80 Gbytes [compressed] per volume) at a low cost. DLT libraries that can hold up to a few tens of Tbytes are currently available from many vendors. Super DLT is the newest technology, which may become available on the market in the future, with about ten times the capacity of DLT on a single cartridge. There are other variants of high-capacity tapes, but the longevity of such technologies may be in question.

MO: Magneto-optical (MO) disks are also a popular choice among the high-density media, with a reasonable storage capacity (5.2 Gbytes on a single MO disk). Juke boxes storing up to 1 Tbyte are available.

WORM: Write-Once-Read-Many (WORM) optical disks and the corresponding juke boxes have been used for astronomical data archives such as that of the HST. The capacity can go up to 12 Gbytes per medium, but as CD/DVD becomes more popular this technology is now obsolete, and the media and systems are more expensive than the alternatives.

CD-R: The write-once recordable CD (CD-R) is inexpensive, and CD juke boxes have been available for quite some time. But a single CD-R holds only 0.65 Gbytes, while our data at the maximum rate can produce a single FITS file of a few Gbytes for a single observing run.

DVD-R: The write-once recordable digital versatile disk (DVD-R), which has been considered as a replacement for the CD, is now starting to appear on the market. A current single-sided DVD-R holds 4.7 Gbytes; the storage capacity will double when double-sided disks become available in the future. DVD-R drives in juke box configurations (capacity about 4 Tbytes) are just appearing on the market. These juke boxes (drives) are often backward compatible, as the devices can handle a mixture of CDs and DVDs. The ESO is considering DVD-R juke boxes for the mass storage system of the VLT (Pirenne 1999).
3.4. Recommendation
Based on our investigation, a DLT library or a DVD-R juke box system would be our preferred choice among the mass storage devices. While the DVD-R technology is relatively new and still developing, the prospects and longevity of the DVD technology appear promising. As the price of the media goes down, the price per Gbyte of a DVD juke box becomes comparable to that of a DLT library of similar capacity. As of December 1999, a juke box that can store over 700 disks is available; this device can hold 3 Tbytes of data with single-sided DVD-R disks. For the minimum requirements at the summit, additional hard disks holding up to 60 Gbytes of data temporarily and a mass storage system holding 1 Tbyte are necessary, with a backup rate of 20 Gbytes/day. A single unit of the DVD-R juke box would be sufficient for the storage system. The data are copied to media and can be delivered to the Cambridge and Nankang sites. At these sites, each should have
[Table 2: Hardware Properties for Mass Storage Media -- a comparison of 8-mm Exabyte and 4-mm DAT/DDS tape, DLT, and the other high-density media discussed above in terms of capacity per volume, media lifetime, cost, I/O speed, reliability, and ease of access and maintenance.]