SEMAT, National Current Research Information System for Iran
Mohamad Javad Khoshroo1, Omid Fatemi2
1. Islamic Azad University (West Tehran Branch)
2. University of Tehran and Iranian Research Institute for Information Science and Technology
Summary
The Supreme Council of Science, Research and Technology (SCSRT) of Iran, which is headed by the president and consists of the country's main decision makers in the research area, including several ministers, ordered the development of a national current research information system, code-named SEMAT, in 2006. The main purposes of the system are to support top decision makers in managing the country's research process and to help researchers access scientific information. A group of representatives from SCSRT forms the steering committee of SEMAT. For the first phase of the project, the committee has defined three sub-systems. This paper presents the architecture of SEMAT and its main services.
Introduction
Knowledge has grown exponentially in the last decade. It is reported that 80 percent of all science and technology findings were produced in the 20th century. It is also predicted that these findings double every five years and become obsolete in less than four years. Therefore, the need for reliable, fast, efficient and convenient access to scientific information must be addressed carefully. Moreover, the main asset of every country today is the science and technology it has acquired, and the need for this information makes research information systems necessary. This requirement has been the main force behind systems such as Frida in Norway and NARCIS in the Netherlands. The goals and objectives of such systems have much in common. Grete Christina Lingjoerde introduces Frida as: “Frida is an integrated research environment for the documentation and presentation of research activities, research results and scientific competence. Data from Frida is used to generate statistics for research activities at Norwegian universities. Information provided by this system plays a major role in the annual funding of universities by the Norwegian Ministry of Education and Research.” (Grete Christina Lingjoerde, Andora Sjørgren 2008)
For these reasons, the Supreme Council of Science, Research and Technology (SCSRT) of Iran, which is headed by the president and consists of the country's main decision makers in the research area, including several ministers, ordered the development of such a national system in 2006. The main reasons for the system are declared as follows:
To regularize the national scientific system
To extract scientific information at the inter-organizational level
To develop an on-demand research system
To develop international scientific collaborations
To use scientific information and methods for decision making.
SCSRT has declared the following objectives for implementing a national current research information system (SEMAT, in Persian):
1. National knowledge management;
2. To support developing a national science and technology roadmap;
3. To help organize the process of science and technology generation;
4. To support decision makers;
5. To define national research priorities;
6. To support and automate the process of assigning the public budget for research institutes;
7. To evaluate and rank research institutes based on defined criteria;
8. To increase the level of collaboration among researchers throughout the country;
9. To provide fast, easy and reliable access to scientific information for researchers;
10. To improve the quality of research;
11. To prevent unwanted repetition of research work;
12. To promote the standards of generating, organizing, processing, disseminating and retrieving scientific information;
13. To provide an area for industry to announce their needs and for researchers to propose their solutions.
In order to start implementing the system, a group of representatives from SCSRT was formed as the steering committee of SEMAT. This group has managed the SEMAT project for the last three years. They have developed the RFP for the first phase, and a private company has been selected to develop the system. The first phase of SEMAT includes three main subsystems:
a) The sub-system for collecting the information;
b) The sub-system for accessing information based on access levels;
c) The sub-system for generating managerial and informative reports.
In this paper, the conceptual model and the architecture of the system are presented. The next section presents the current situation of research processes, as examined in the inception phase. The architecture of SEMAT is then presented, followed by conclusions.
1 Inception phase
To start the inception phase, the research institutes are first defined and introduced. Then the pilot institutes for the SEMAT project are selected, and the current research processes in the pilots are presented.
1.1 Research Institutes
The main research body in Iran is formed by universities and research institutes, either governmental or non-governmental. In the governmental part, the two main ministries in charge of research are the Ministry of Science, Research and Technology (MSRT) and the Ministry of Health and Medical Education (MOH). The largest non-governmental university is Islamic Azad University (IAU). The universities and research institutes are independent bodies and manage their own research processes. Policies and standards are set by MOH for the health sector, by MSRT for other universities, and by the management body of IAU for all IAU branches.
1.2 Selected Pilots
Since scientific information must be exchanged between the research institutes and SEMAT during the deployment of the SEMAT project, the steering committee has decided to select pilot institutes for the deployment phase of the project.
1.3 Current status of research processes in institutes
We have selected the Federal Enterprise Architecture Framework (FEAF) enterprise architecture model, shown in Figure 1, to evaluate and rank institutes and to perform the inception phase. The model is built on common business practices and designs that cross organizational boundaries, and is suitable for our purpose. The FEAF provides an enduring standard for developing and documenting architecture descriptions of high-priority areas. (Rob C. Thomas 2001)
Figure 1: Structure of the FEAF Framework (Rob C. Thomas 2001)
To define the most suitable model for SEMAT, the current processes of research institutes have been identified and modeled. Four layers of information have then been presented for every pilot:
Business layer
Data layer
Application layer
Technology layer
2 SEMAT Architecture
Having completed the inception phase, we analyzed the results and updated the main requirements as follows:
1. Integrating all national research meta-data;
2. Collecting research record sets;
3. Registering scientific and technological information;
4. Providing a collaboration environment;
5. Searching through information.
Given the heterogeneity and complexity of the system, a service-oriented architecture was selected, and all communication throughout the system is performed by sending and receiving messages over an Enterprise Service Bus. We have taken the following steps to implement the system:
1. Research meta-data entities have been identified, and the message format and the lifecycle of their corresponding messages have been defined.
2. Meta-data entities are generated at the earliest possible time in the research institutes.
3. All messages are sent from the institutes to the National SEMAT.
4. SEMAT receives the messages and processes them.
5. SEMAT manages the messages and stores the validated and processed meta-data.
2.1 Research Information Integration
The best way to integrate the data collected from all research organizations is to define a common data model. Our proposed data model is based on our newly introduced research information format, the Iranian Research Information Format (IRIF). To design an acceptable IRIF, three steps have been taken:
I. Having gathered all existing data models of the pilot institutes, we first extracted the data fields common to them.
II. In the second step, the resulting data model was compared with CERIF (Common European Research Information Format).
III. As the final step, the steering committee decided on the differences found in the second step and selected the final data model as the Iranian Research Information Format (IRIF).
2.2 The Proposed Method
As discussed earlier, the ideal model is for the national SEMAT to communicate directly with institutional SEMATs. In this case, as research information is generated in the institutes, the required data (usually meta-data) is sent to the national SEMAT. This method ensures that the registered information is complete, up to date and valid. In our proposed method, all communications are handled by passing messages over an Enterprise Service Bus. The messages are generated in each institute by a message client, which sends the data to the server at the necessary intervals. The block diagram of the architecture is shown in Figure 2.
Based on this architecture, the institutes are categorized into three groups:
1. They do not have any research information system. In this case, SEMAT provides an environment for the institutes to transmit their messages.
2. They have a research information system, but it is not and cannot become compatible with SEMAT. In this case, messages are generated asynchronously by these systems and imported into SEMAT.
3. They have a compatible research information system. In this case, the systems communicate with SEMAT, and messages go back and forth between them.
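As an illustration only, the message flow described above can be sketched in Python as follows. The class names, the lifecycle states and the required `title` field are assumptions made for this sketch; the paper does not specify the actual message format, and a simple in-memory queue stands in for the Enterprise Service Bus.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    """Hypothetical lifecycle states of a meta-data message."""
    CREATED = "created"
    SENT = "sent"
    VALIDATED = "validated"
    REJECTED = "rejected"
    STORED = "stored"


@dataclass
class Message:
    institute: str
    entity_type: str
    payload: dict
    state: State = State.CREATED


class Bus:
    """In-memory stand-in for the Enterprise Service Bus (a FIFO queue)."""

    def __init__(self):
        self.queue = []

    def publish(self, message):
        message.state = State.SENT
        self.queue.append(message)

    def drain(self):
        # Hand over all queued messages and leave the queue empty.
        messages, self.queue = self.queue, []
        return messages


class MessageClient:
    """Runs inside an institute: buffers messages and sends them in batches."""

    def __init__(self, institute, bus):
        self.institute = institute
        self.bus = bus
        self.pending = []

    def record(self, entity_type, payload):
        # Create the message as soon as the research entity is generated.
        message = Message(self.institute, entity_type, payload)
        self.pending.append(message)
        return message

    def flush(self):
        # Called at whatever interval the institute chooses.
        sent = len(self.pending)
        for message in self.pending:
            self.bus.publish(message)
        self.pending = []
        return sent


class NationalServer:
    """Receives messages, validates them and stores the accepted ones."""

    REQUIRED_FIELDS = ("title",)  # assumed minimal rule, not from the paper

    def __init__(self, bus):
        self.bus = bus
        self.repository = []

    def process(self):
        for message in self.bus.drain():
            if all(message.payload.get(f) for f in self.REQUIRED_FIELDS):
                message.state = State.VALIDATED
                self.repository.append(message)
                message.state = State.STORED
            else:
                message.state = State.REJECTED
        return len(self.repository)
```

In this sketch an institute without its own research information system would use the same `MessageClient` through an environment hosted by SEMAT, while an incompatible system would produce its messages offline and publish them to the bus in a later batch.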
Figure 2: SEMAT Architecture
3 SEMAT services
Having designed the architecture, we have defined different services for SEMAT, as shown in Figure 2. These services include:
1) Query and Search service: This service performs queries in SEMAT. Its features include full-text search, thesaurus-aided search and structural queries.
2) Budget management in research institutes: This service is defined to perform national budgeting and public resource allocation for research institutes.
3) Coding services: To establish the registration system for research entities and to facilitate integration, a coding service is designed. This service assigns a unique code to every instance of a research entity, and this code is used as the identifier throughout the system.
4) Validation services: This service evaluates, modifies and validates data before it is stored in SEMAT repositories. Both accuracy and consistency controls are performed in this service.
5) Portal and community services: This service is the single point of connection through which users reach the SEMAT services. It enables users, based on their access rights, to access information and make use of the services. Every user has his/her own space in the system and is able to personalize the environment. General services such as email, forums and RSS feeds are also provided.
4 Conclusions
Given the need to manage all research information in Iran and the lack of a generally agreed standard for this data, the National SEMAT project was initiated by SCSRT. In the first phase of the project, the main effort has gone into collecting all information using a dynamic method that connects the institutes to SEMAT. The system is based on message passing, and each message has its own life cycle. The first implementation has shown the effectiveness of the system in using and managing research data.
Contact Information
S. Omid Fatemi, Ph.D.
Director
Iranian Research Institute for Information Science and Technology
Ministry of Science, Research and Technology, Islamic Republic of Iran
Email: [email protected]
Mohamad Javad Khoshroo
Email: [email protected]
References
Grete Christina Lingjoerde, Andora Sjørgren (2008): Quality Assurance in the Research Documentation System Frida. CRIS 2008, 9th International Conference on Current Research Information Systems.
Rob C. Thomas (2001): A practical guide to Federal Enterprise Architecture; http://www.gao.gov/bestpractices/bpeaguide.pdf