
GIS-based Family Tree System Integration

Di Hu, Guonian Lv1, Yongning Wen, Jing Jia, Yiru Feng, Qian Gong
Key Lab of Virtual Geographic Environment, Ministry of Education, Nanjing Normal University, Nanjing, China
1 [email protected]

Abstract: With the advancing digitization of family trees, information sharing and function integration have become the bottleneck in the development and application of family trees. Existing family tree systems are mutually closed and incompatible; family tree information representation is incomplete and non-standard; the time and space information implied in family tree information is not well excavated and expressed; and family tree information is repeatedly collected by each system. To address these problems, this paper proposes to parse and position family tree information under the Consistently Spatial-temporal Framework by using GIS techniques, in order to regularize the expression and sharing of family tree information. Absorbing the VGI idea, people are encouraged to collect and share family tree information collaboratively. The paper discusses key issues in the integration of family tree systems, including the Consistently Spatial-temporal Framework, information sharing, interoperability and application modes. It designs a prototype GIS-based integration framework for family tree systems and implements it, taking the HXJP-GIS and DZP-PGV systems as an example. The practice indicates that the framework is feasible. It provides new concepts, reference methods and technical support for constructing the Chinese Family Tree Database and Platform to better protect and utilize family trees.

Keywords: family tree, GIS, Consistently Spatial-temporal Framework, system integration, distributed

I. INTRODUCTION
A family tree documents information at different granularities, from the individual and the family to the clan and society, in detail, using the localities of multiplication and migration as the spatial axis and clan lineage relationships as the temporal axis. It is important historical literature that contains a large amount of information about history, geography, culture and other subjects [1]. Chinese family trees are huge in quantity, rich in content and varied in form, and have core value for cultural relics, literature, education and rooting [2][3]. They have become a hotspot for scholars in many fields. In recent years, GIS scholars have also started to pay attention to them [2]–[11]. With the advancement of computer and network techniques, a series of family tree systems, including websites and stand-alone systems, have appeared both at home and abroad. System functions range from simple ones, such as news management and surname and bibliographic searching, to complex ones, such as creating and managing electronic family trees online. A few systems with electronic map or GIS functions have attracted much attention. However, information sharing and system integration have become very important issues in family tree informatization.

978-1-4244-8351-8/11/$26.00 ©2011 IEEE

II. FAMILY TREE SYSTEM
With the advancement of computer and network techniques, family trees have been digitized and informatized, and a series of family tree websites and stand-alone systems have been published. According to the search rankings of Google and Baidu, the main family tree websites at home and abroad and their functions are listed in Table I. In addition, the U.S. National Archives and Records Administration, the National Archives of Canada, the United Kingdom Public Archives, the National Archives of Italy, the National Library of China and the National Library of Taiwan have established family tree databases. Meanwhile, many local libraries in Shanghai, Shandong, Anhui, Zhejiang, Fujian, Guangdong and Sichuan in China have also established family tree catalogue databases. The main functions of these websites are news management, family tree and surname research, family tree catalogues, forums etc. A few websites have begun to offer characteristic functions such as constructing a family tree image database, creating and managing family trees online, querying ancestors according to census records, death records and land ownership, tracing relatives through face recognition, and displaying family tree information by map or GIS. For example, www.zongen.com uses map images to show clans' migration routes; www.dazupu.net.cn connects to Google Maps to display the path of rooting; with GIS techniques, www.hxjiapu.com.cn plots real-time clan migration maps and dynamic maps of personal life; other websites also support searching the family tree catalogue by location or map.

As for stand-alone systems, there are Family Tree Mister, Eternal Family Tree, Saina Multi-media Electronic Family Tree System, Green Family Tree Management System, Sail Family Tree System, Legacy Family Tree, Family Tree Maker and RootsMagic etc. at home and abroad. Their main functions are family tree input, management and printing. Among them, Family Tree Maker supports built-in maps.

Family tree systems are developed by different enterprises and research institutes, each with its own characteristics. Their functions range from simple, e.g. bibliographic retrieval, to complex, e.g. creating and managing family trees online. Maps and GIS are widely used in family tree systems, and the advantages of managing family tree information by GIS are obvious. Each system designs its own data model and format, and most foreign systems support the GEDCOM specification.

Presently, the main problems affecting the development of family tree systems are: (1) Systems are mutually closed and incompatible. (2) Family tree information representation is incomplete and non-standard. (3) Time and space information implied in family tree information is not well excavated and expressed. (4) Family tree information is repeatedly collected by each system. Family tree resources are seriously wasted. It is therefore urgent for all the enterprises and research institutes related to family trees to join together and construct the Chinese Family Tree Database and Platform, to meet the needs of people who come to find their roots and to study family trees.

TABLE I INTRODUCTION OF MAIN FAMILY TREE WEBSITES AT HOME AND ABROAD

Website | Major function
www.chinajiapu.com | news managing, family tree research, rooting, celebrity searching, family tree catalogue, family tree transaction etc.
www.jiapu.com | news managing, family tree managing, rooting, family tree transactions, family tree catalogue, family tree browsing, family tree revising etc.
www.baixun.com/v2007 | news managing, surname research, family tree managing, naming, forum etc.
www.zupulu.com | surname research, family tree managing, memorial hall managing, celebrity introduction, family tree research, family tree and people searching, forum etc.
www.chinataiwan.org/zppd | news managing, family tree catalogue, rooting, surname research, celebrity managing, family tree knowledge, family story etc.
www.hxjiapu.com.cn | news managing, family tree research, family tree search, ancient and modern placename, clan migration, family tree managing, forum etc.
www.zongen.com | news managing, family tree search, family tree managing, digital ancestral hall, family tree knowledge, surname origin, celebrity managing, rooting, forum etc.
www.dazupu.net.cn | news managing, family tree search, family tree research, surname research, family tree managing etc.
http://search.library.sh.cn/jiapu | family tree catalogue, family tree search, family tree picture etc.
http://ouroots.nlc.gov.cn | news managing, family tree search, surname research, family tree culture, local records, consultation etc.
www.ancestry.com | news managing, family tree managing, family tree search, relative tracing, family tree story, family tree printing, clan history, forum etc.
www.familysearch.org | relative tracing, family tree research, family tree search, family tree managing, blog etc.
www.geni.com | family tree managing, relative tracing, family tree sharing, celebrity managing, forum etc.
www.myheritage.com | family tree managing, clan service, family community establishing, rooting, family tree picture etc.

III. FRAMEWORK OF FAMILY TREE SYSTEM INTEGRATION
A. The Principles of System Integration

Chinese family trees are huge in number and distributed around the world. Meanwhile, there are hundreds of family tree websites and stand-alone systems at home and abroad, each with its own data model and data format, so it is difficult to integrate these systems. Based on years of study of Family Tree GIS, the authors put forward four main principles as follows.

1) According to the Consistently Spatial-temporal Framework: Spatial-temporal information in family trees is expressed in diverse ways with different reference datums. Most family tree systems do not pay much attention to a clan's lineage relations and spatial relations; they just use text to store and express spatial-temporal information. This results in semantic loss of spatial-temporal information. Expressing family tree spatial-temporal information in a unified, standard and explicit way under the Consistently Spatial-temporal Framework is the foundation of sharing, analysing and applying family tree information.

2) Taking GIS software as basic platform: A spatial database is appropriate for storing and managing family tree information, which has typical spatial-temporal features. GIS, with its unique spatial analysis and powerful visual expression abilities, is useful for constructing a clan's spatial-temporal spectrum and understanding the mechanism of a clan's inheritance and development. Taking GIS software as the basic platform for managing family tree information helps to combine the historical literature method with the geographical correlation analysis method, so that some difficult problems of the humanities and social sciences may be approached from multiple angles.

3) Normalization and standardization of family tree information: There is no unified definition and description of family tree information because of its rich content and varied ways of expression. Each family tree system designs its own data model and data format, and only a few foreign systems support the GEDCOM specification. It is therefore difficult to share family tree information among systems. A standard format for family tree information should be drawn up, covering family tree metadata and full-text data.

4) Distributed and centralized architecture: The objectives of family tree system integration are information sharing and system interoperation. On the one hand, distributed family trees and systems, especially proprietary family tree resources and research results, require the integration platform to be distributed. On the other hand, multi-level and multi-domain applications of family trees require the integration platform to manage all Chinese family trees in the same way and to provide abundant and convenient application entrances. Distributed architecture is efficient and centralized architecture is easy to manage. Therefore, combining the two architectures can maximize both the integration of existing family tree systems and the value of the integration platform. Furthermore, family tree system integration should also keep the original text and the source of the information.
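As an illustration of Principle 1, the following minimal Python sketch maps a date written as a reign era plus a year count onto a Gregorian year, so that dates from different sources become comparable. It is only a sketch under stated assumptions: the names `ERA_START_YEAR`, `UnifiedDate` and `from_reign_year` are hypothetical and not part of any system described in this paper, the era table lists only three Qing-dynasty eras, and the conversion works at year resolution, so a date late in a lunar year may actually fall in the following Gregorian year.

```python
from dataclasses import dataclass

# Gregorian year that corresponds to the first year of each reign era.
# Illustrative subset only; a usable table would cover every era.
ERA_START_YEAR = {
    "Kangxi": 1662,
    "Qianlong": 1736,
    "Guangxu": 1875,
}


@dataclass(frozen=True)
class UnifiedDate:
    gregorian_year: int  # common datum used for comparison and sorting
    original_text: str   # original wording is kept alongside the datum


def from_reign_year(era: str, year_in_era: int) -> UnifiedDate:
    """Convert 'era + year count' to a Gregorian year (year resolution only)."""
    if era not in ERA_START_YEAR:
        raise ValueError(f"unknown era: {era}")
    if year_in_era < 1:
        raise ValueError("reign years are counted from 1")
    year = ERA_START_YEAR[era] + year_in_era - 1
    return UnifiedDate(year, f"{era} {year_in_era}")


def from_gregorian(year: int) -> UnifiedDate:
    """Wrap a date that is already expressed in the Gregorian calendar."""
    return UnifiedDate(year, str(year))


if __name__ == "__main__":
    dates = [from_reign_year("Qianlong", 10), from_gregorian(1700),
             from_reign_year("Kangxi", 61)]
    for d in sorted(dates, key=lambda d: d.gregorian_year):
        print(d.gregorian_year, "<-", d.original_text)
```

Keeping `original_text` next to the computed year follows the requirement above that integration should preserve the original text and source of the information.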



B. The Framework of Family Tree System Integration
At present, hundreds of family tree websites already exist and new ones constantly appear. According to their content, they can be classified as comprehensive family tree websites, surname websites, personal websites, blogs, and websites that provide a family tree column or publish news related to family trees. There are also dozens of stand-alone family tree systems, which serve as tools for websites to collect data rapidly. All these systems have accumulated a large number of users and a large amount of family tree data. Absorbing the VGI idea, collaborative collection and sharing of family tree information is encouraged. Family tree system integration aims to construct the Chinese Family Tree Database and Platform, called the ZHJP integration platform, which integrates multi-source, heterogeneous family tree data and software systems. The framework of the ZHJP integration platform is shown in Fig. 1.

Fig. 1 The framework of ZHJP integration platform (diagram nodes: ZHJP Integration Platform; comprehensive family tree website; surname website; regional surname website; branching surname website; personal surname website; personal website and personal blog; stand-alone family tree system; other website)

The ZHJP integration platform has a basic three-tier architecture: surname website; regional or branching surname website; and personal surname website. Comprehensive family tree websites and surname websites are at the same level. A surname website covers a single surname, and only one surname website exists in the platform for each surname. A comprehensive family tree website covers multiple surnames. The platform is easy to use: everyone can access websites at different levels through a unified entry or URL. After a successful application, any clan or individual can quickly build a surname website or personal website. Enterprises and research institutes can also conveniently set up a comprehensive family tree website by using the function modules or web services provided by the platform. The platform uses family tree web crawlers to search and parse news released on websites outside it. It then provides the latest and most comprehensive family tree news for users, and a news web service for websites. In this way the platform and the internet are deeply combined to promote the socialization of family tree information sharing. A website can be integrated into the platform once it implements the platform's family tree information standard format and web service interface specification. Stand-alone family tree systems interchange and share information with the platform through the family tree information standard format, so data collected by stand-alone systems can easily be put into the platform. Following hierarchical design principles, the platform is implemented in three layers: data layer, service layer and application layer.

IV. THE KEY ISSUES OF FAMILY TREE SYSTEM INTEGRATION
A. The Consistently Spatial-temporal Framework
A lot of spatial-temporal information is implied in the preface, personal biographies and other content of a family tree, and it is important for studying a clan's lineage, migration and spatial distribution. It is usually expressed as text, pictures, maps etc. Time information is usually expressed as text, in both the Chinese traditional calendar and the Gregorian calendar. The two calendars differ in time datum, and the order of time and text also differs, so it is hard for a computer to compare different time information. Problems such as a clan's lineage, population age and life regularity cannot be well solved. Space information is mainly expressed by placenames and simple maps, which lack an accurate description of spatial location and range. Placenames change frequently: a place may have many names, and many places may share the same name. Placename positioning has become the key problem in rooting and in studying the spatial distribution and migration of clans. Computers cannot directly compare different time information mainly because a unified time datum is lacking, and placename positioning likewise depends on a unified space datum. Therefore, setting up a unified spatial-temporal datum is the foundation of family tree expression, sharing and analysis.

B. Family Tree Information Sharing
Family tree information sharing is the foundation and premise of family tree system integration. To address this problem, there are two main specifications: the Specification of Family Tree Description Metadata and GEDCOM (GEnealogical Data COMmunication) [12], [13]. But many problems have still not been well solved. Family tree information sharing should therefore consider the diversified applications of family trees in different fields. Enterprises and research institutes related to family trees should join together to formulate a standard format for family tree exchange, covering family tree metadata and full-text data.

1) Family tree metadata based on spatial-temporal information: Metadata is data about data, mainly about the content, quality, condition and other characteristics of data. In different fields, metadata has its specific definition and application. For family tree metadata, the Shanghai Library, China has drafted the Specification of Family Tree Description Metadata. It defines and describes the definition of family tree and the structure and properties of family tree metadata, but it does not give a clear definition of family tree metadata. There are still many problems in family tree metadata research: (1) There is still no unified definition of family tree metadata. Each definition just emphasizes the content of one or two fields. (2) Family tree metadata is not comprehensive. It mainly concentrates on external attributes of family trees, while the content of family trees is not well expressed. (3) Existing definitions of family tree metadata do not capture the essence of metadata. The core content of family trees, such as a clan's lineage, source, spatial distribution and migration, is not well embodied. (4) Methods and techniques for expressing family tree metadata are lacking. Therefore, it is necessary to develop a uniform family tree metadata standard that is suitable for family tree research and application. The metadata standard should both fully reflect the content of family trees and highlight the spatial-temporal information, and should also have good scalability, openness and stability.

2) Expression and exchange of family tree full-text data: Family tree metadata helps users to quickly understand and find full-text data, which is the foundation of comprehensively sharing family tree information. For family tree full-text data, GEDCOM and similar formats basically satisfy the demands of expressing and sharing European and American family tree information. However, they are deficient in describing relationships among people, events and spatial-temporal information. More seriously, some special content and elements of Chinese family trees are difficult or even impossible to express, so it is hard to meet the demands of sharing and exchanging Chinese family tree information. Chinese family tree information can be divided into three parts: family tree bibliography, entry content and lineage chart. The first part describes the bibliography of the family tree, including genealogical name, genealogical place, surname, compiler, version, abstract, collection unit and so on; the second describes the full content of the Chinese family tree except for the lineage chart; and the third describes the details of each person and family in the lineage chart, together with relations and events. The expression of time information should take both Chinese traditional time and AD time into consideration. In fact, the space information in a family tree is a placename at a specific time.
Because placenames change frequently, the expression of space information should record the temporal characteristics of a place and describe its spatial location with longitude and latitude. The lineage chart is the most important information, and the relationship between persons is its core. It can be described by individuals, families, clans and the relationships among them. An event is related to persons and has time and place information. Therefore, family tree full-text data can be expressed and shared easily by taking person, time, space, event and relationship as the core elements.

C. Interoperability of Family Tree System
Interoperability of family tree systems refers to the ability of diverse family tree systems to share and process information and to work together. It reflects the usability of system interfaces and is the key to family tree system integration. SOA is an architecture model for combining, deploying and using the different functional cells (services) of an application. Services interact through well-defined interfaces and contracts. A web service is a modular application that is self-contained and self-describing, and it is the main method of implementing SOA. It is published, searched and called via the web, and is characterized by good encapsulation, loose coupling, use of standard protocols and high integration capability.

Each family tree system uses a data model of its own design, but the systems have the same or similar functions. The integration platform should therefore draw up a specification to unify the interface of web services. According to this specification, functions of existing family tree systems can be packaged in DLLs, and web services can be implemented on top of the DLLs. A family tree system can then easily access another system's functions by calling a DLL or web service. The family tree information standard format is used for information exchange and processing among systems. The interoperability mode of family tree systems is shown in Fig. 2. The web service interface specification is the foundation of interoperability of family tree systems. GIS web services should conform to OGC WMS, WFS etc., and other web services should use the family tree information standard format to transfer data.

Fig. 2 Interoperability mode of family tree systems (diagram: System A and System B each expose a DLL and a Web Service, and exchange data through the Family Tree Information Standard Interchange Format)
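To make the exchange role of the standard format in Fig. 2 more concrete, the sketch below models the five core elements named in Section IV.B (person, time, space, event and relationship) and serializes them to XML and back. It is a hedged illustration, not the standard format itself: the paper does not publish a schema, so all class names, XML element names and attribute names here are assumptions chosen for readability. A place carries a validity interval as well as longitude and latitude, reflecting the point that a placename is only meaningful at a specific time.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Place:
    name: str        # placename as written in the source
    valid_from: int  # first Gregorian year in which the name applies
    valid_to: int    # last Gregorian year in which the name applies
    lon: float
    lat: float


@dataclass
class Event:
    kind: str        # e.g. "birth", "migration"
    year: int        # Gregorian year (unified time datum)
    place: Place


@dataclass
class Person:
    pid: str
    name: str
    events: List[Event] = field(default_factory=list)


@dataclass
class Relationship:
    kind: str        # e.g. "father-of"
    source: str      # pid of the first person
    target: str      # pid of the second person


def to_xml(persons: List[Person], relationships: List[Relationship]) -> str:
    """Serialize the model into a (hypothetical) interchange document."""
    root = ET.Element("FamilyTree")
    for p in persons:
        pe = ET.SubElement(root, "Person", id=p.pid, name=p.name)
        for e in p.events:
            ee = ET.SubElement(pe, "Event", kind=e.kind, year=str(e.year))
            ET.SubElement(ee, "Place", name=e.place.name,
                          validFrom=str(e.place.valid_from),
                          validTo=str(e.place.valid_to),
                          lon=str(e.place.lon), lat=str(e.place.lat))
    for r in relationships:
        ET.SubElement(root, "Relationship", kind=r.kind,
                      source=r.source, target=r.target)
    return ET.tostring(root, encoding="unicode")


def from_xml(text: str) -> Tuple[List[Person], List[Relationship]]:
    """Rebuild the model from an interchange document."""
    root = ET.fromstring(text)
    persons = []
    for pe in root.findall("Person"):
        events = []
        for ee in pe.findall("Event"):
            pl = ee.find("Place")
            place = Place(pl.get("name"), int(pl.get("validFrom")),
                          int(pl.get("validTo")),
                          float(pl.get("lon")), float(pl.get("lat")))
            events.append(Event(ee.get("kind"), int(ee.get("year")), place))
        persons.append(Person(pe.get("id"), pe.get("name"), events))
    relationships = [Relationship(r.get("kind"), r.get("source"), r.get("target"))
                     for r in root.findall("Relationship")]
    return persons, relationships
```

In this reading, System A would call `to_xml` before sending and System B would call `from_xml` on receipt, so neither side needs to know the other's internal data model.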

D. Application Mode of Family Tree System Integration
1) Multi-level and diverse applications: The number of users of family tree systems is enormous and the content they care about is varied. There are also great differences in their age, education, computer skills etc. Users can be divided into different kinds: general users, managers and developers in terms of how they use the system, and family tree compilers and researchers in terms of the content that concerns them. Different users bring multi-level and diverse applications, so it is important to analyse thoroughly what the various users really need. Usually, general users go to the websites to read subjects they are interested in. They need a customizable and configurable user interface to conveniently find and read the most valuable content. System managers pay more attention to the friendliness and flexibility of system functions, so the configurability of modules is important. Meanwhile, the integration platform should also provide script-language APIs, especially for its distinctive functions, so that system managers can easily embed these functions into their existing systems. For developers, DLLs and web services that package the functions of the integration platform are well suited to building their own applications through secondary development. Functions that exist in most systems, such as news managing, family tree creating and managing, and bibliographic searching, should follow the same function interface specification, so that users can select one of several implementations of a function. Functions that exist in only a few systems, such as time encoding, ancient and modern placename encoding, and spatial-temporal spectra mapping, should be easy to use.
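The idea that common functions follow one interface specification, with users selecting among several implementations, might be sketched in Python as follows. This is an assumption-laden illustration: the interface `BibliographicSearch`, the two implementing classes, the registry functions and the sample catalogue titles are all hypothetical and do not describe any of the systems surveyed above.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class BibliographicSearch(ABC):
    """Shared function interface: every implementation answers the same call."""

    @abstractmethod
    def search(self, surname: str) -> List[str]:
        """Return the catalogue titles recorded for a surname, sorted."""


class TitleIndexSearch(BibliographicSearch):
    """Hypothetical system that keeps a surname -> titles index."""

    def __init__(self, index: Dict[str, List[str]]) -> None:
        self._index = index

    def search(self, surname: str) -> List[str]:
        return sorted(self._index.get(surname, []))


class RecordScanSearch(BibliographicSearch):
    """Hypothetical system that keeps flat (surname, title) records."""

    def __init__(self, records: List[Tuple[str, str]]) -> None:
        self._records = records

    def search(self, surname: str) -> List[str]:
        return sorted(title for s, title in self._records if s == surname)


# Registry from which a user or manager picks one implementation by name.
IMPLEMENTATIONS: Dict[str, BibliographicSearch] = {}


def register(name: str, impl: BibliographicSearch) -> None:
    IMPLEMENTATIONS[name] = impl


def select(name: str) -> BibliographicSearch:
    return IMPLEMENTATIONS[name]


register("system_a", TitleIndexSearch({"Hu": ["Hu clan genealogy, vol. 1"]}))
register("system_b", RecordScanSearch([("Hu", "Hu clan genealogy, vol. 2"),
                                       ("Wen", "Wen family register")]))
```

Because both classes satisfy the same interface, calling code such as `select("system_b").search("Hu")` does not change when a different implementation is chosen.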



2) Distributed and centralized architecture: Family tree systems with characteristic functions are developed by different enterprises and research institutes, which demands that the integration platform be able to integrate distributed systems. On the one hand, family tree resources are huge in number and distributed around the world. On the other hand, they are logically concentrated by surname. Therefore, the integration platform should be distributed in integration and centralized in administration. According to the design in Section III, the integration framework is divided into a data layer, a service layer and an application layer. Each layer is independent, so it can be either distributed or centralized. The integration platform includes a center server and distributed servers. It has a registration center, which manages the metadata of the data, services and applications on all servers, as well as the interaction among servers. Data, services and applications can be deployed on both the center server and the distributed servers. Fig. 3 shows the distributed and centralized architecture. Every surname can have only one center database and one center website for all family trees of that surname, and these must be deployed on the center server. However, there is no limit on the number of sub-databases and sub-websites for a surname. People can upload a stand-alone system to the download center of the integration platform, or release it on their own.
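The deployment rules stated above (one center database and one center website per surname, both on the center server, with unlimited sub-databases and sub-websites elsewhere) can be expressed as checks in a registration center. The following Python sketch is illustrative only; `Entry`, `RegistrationCenter` and the server names are hypothetical, and a real registration center would also track services, applications and server interaction.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Entry:
    surname: str
    kind: str        # "database" or "website"
    server: str      # name of the server hosting the resource
    is_center: bool  # True for the surname's single center resource


class RegistrationCenter:
    """Keeps metadata about where each surname's resources are deployed."""

    def __init__(self, center_server: str) -> None:
        self.center_server = center_server
        self._center: Dict[Tuple[str, str], Entry] = {}
        self._subs: List[Entry] = []

    def register(self, entry: Entry) -> None:
        if not entry.is_center:
            # Sub-databases and sub-websites are not limited in number.
            self._subs.append(entry)
            return
        if entry.server != self.center_server:
            raise ValueError("a center resource must be deployed on the center server")
        key = (entry.surname, entry.kind)
        if key in self._center:
            raise ValueError(f"{entry.surname} already has a center {entry.kind}")
        self._center[key] = entry

    def locate(self, surname: str, kind: str) -> List[str]:
        """Servers holding this surname's resources, center server first."""
        servers = []
        center = self._center.get((surname, kind))
        if center is not None:
            servers.append(center.server)
        servers += [e.server for e in self._subs
                    if e.surname == surname and e.kind == kind]
        return servers
```

A client that wants all data for a surname would ask `locate` for the server list and then query each server in turn.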

Fig. 3 Distributed and centralized architecture (diagram: a Center Server and several Distributed Servers, each hosting Data, Service and Application layers; a Registration Center; an external X System)

V. SYSTEM INTEGRATION OF HXJP-GIS AND DZP-PGV
The HXJP-GIS system captures, stores, analyses, manages and presents family tree information. It is supported by GIS technology and visually expresses a clan's lineage relations and spatial relations. It follows a consistent spatial-temporal datum, presents sufficient information and uses GIS technology to process the time and space information implied in family trees [14]. The DZP-PGV system creates and manages electronic family trees; it features a multilingual user interface and follows the GEDCOM specification. It fully protects privacy and has a powerful search function [15], [16]. The two systems share the goal of protecting and using family trees, but pursue it with different methods and technologies. Integrating the two systems allows their distinct data resources and system functions to complement each other.

The main web services provided by HXJP-GIS are as follows: GetFamilyTreeMetadata, GetClan, GetFamily, GetIndividual, TimeEncoding, GJDMEncoding etc. Their functions are getting family tree metadata, getting clan, family and individual information, encoding time and placename information etc. DZP-PGV employs PHPGedView, which is open source and is based on the GEDCOM data model. The main web services provided by DZP-PGV are as follows: Authenticate, ServiceInfo, doSearch, getPersonByID, getGedcomRecord, updateRecord, getAncestry, getDescendants etc. [17]. Their functions are authentication, getting basic information about a family tree, searching family trees, and getting individual, ancestry and GEDCOM record information etc. The prototype of the ZHJP integration platform is built by combining and improving function modules of the two systems and developing some new function modules. DLLs, function modules and standard web services are called directly by the platform. The process of integrating HXJP-GIS and DZP-PGV is shown in Fig. 4.

Fig. 4 The process of integration of HXJP-GIS and DZP-PGV (diagram: HXJP-GIS and DZP-PGV each provide DLLs, Web Services and Function Modules; these feed Standard Web Services and Function Modules Combining and Improving, which, together with New Function Modules Developing, make up the ZHJP Integration Platform)

A standard web service implements the interface specification agreed by the two systems and the family tree information standard format, and uses XML to pass parameters. It splits and combines the web services provided by existing family tree systems. The process of packaging a standard web service is shown in Fig. 5.

Fig. 5 The process of packaging standard web service (diagram: within a Standard Web Service, Input Parameters Processing precedes the call to the underlying Web Service, followed by Result Data Processing, which comprises Data Format Converting and Spatial-temporal Encoding)
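The packaging steps in Fig. 5 can be read as a small pipeline: parse the XML parameters, call the system's own service, then convert the result into the common format and encode its time information. The Python sketch below follows that reading under explicit assumptions. The two backend functions are local stand-ins, not the real HXJP-GIS or DZP-PGV services; their return shapes, the XML element names and the era table are all hypothetical. The second stand-in returns a record in GEDCOM line syntax (level, tag, value) because DZP-PGV is described as GEDCOM-based.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict

ERA_START_YEAR = {"Qianlong": 1736}  # illustrative subset of reign eras


def parse_params(request_xml: str) -> Dict[str, str]:
    """Input parameters processing: flat XML children become a dict."""
    return {el.tag: (el.text or "") for el in ET.fromstring(request_xml)}


def backend_a(person_id: str) -> Dict[str, str]:
    """Stand-in for a service that returns fields with a reign-year date."""
    return {"id": person_id, "name": "Ming Li", "birth": "Qianlong 10"}


def backend_b(person_id: str) -> str:
    """Stand-in for a service that returns a GEDCOM-style record."""
    return f"0 @{person_id}@ INDI\n1 NAME Ming /Li/\n1 BIRT\n2 DATE 1745"


def convert_a(raw: Dict[str, str]) -> Dict[str, str]:
    """Format conversion plus time encoding (reign year -> Gregorian year)."""
    era, count = raw["birth"].split()
    year = ERA_START_YEAR[era] + int(count) - 1
    return {"id": raw["id"], "name": raw["name"], "birthYear": str(year)}


def convert_b(raw: str) -> Dict[str, str]:
    """Format conversion from GEDCOM-style lines to the common fields."""
    fields: Dict[str, str] = {}
    context = ""
    for line in raw.splitlines():
        level, tag, *rest = line.split(" ", 2)
        value = rest[0] if rest else ""
        if level == "0":
            fields["id"] = tag.strip("@")
        elif tag == "NAME":
            fields["name"] = value.replace("/", "").strip()
        elif tag == "BIRT":
            context = tag
        elif tag == "DATE" and context == "BIRT":
            fields["birthYear"] = value
    return fields


def standard_get_individual(request_xml: str,
                            call: Callable[[str], object],
                            convert: Callable[[object], Dict[str, str]]) -> str:
    """Wrap one backend service behind the common XML interface."""
    params = parse_params(request_xml)
    record = convert(call(params["personId"]))
    root = ET.Element("Individual")
    for key in ("id", "name", "birthYear"):
        ET.SubElement(root, key).text = record[key]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    request = "<Request><personId>I1</personId></Request>"
    print(standard_get_individual(request, backend_a, convert_a))
    print(standard_get_individual(request, backend_b, convert_b))
```

With this arrangement the caller sees the same XML response whichever system answers, which is the property the standard web service is meant to provide.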

Screenshots of the prototype of the ZHJP integration platform are shown as follows [18]. Fig. 6 shows the home page of the ZHJP integration platform. Fig. 7 shows the map of family tree distribution. Figs. 8 and 9 show the pedigree charts of HXJP-GIS and DZP-PGV.

Fig. 6 Home page of ZHJP integration platform

Fig. 7 Map of family tree distribution

Fig. 8 Pedigree chart of HXJP-GIS

Fig. 9 Pedigree chart of DZP-PGV

VI. CONCLUSION
Based on an analysis of existing family tree websites and stand-alone systems, this paper points out that information sharing and function integration have become the bottleneck in the development and application of family trees. To address the problems of collecting, expressing and sharing family tree information and of interoperability among family tree systems, it indicates that the Consistently Spatial-temporal Framework and GIS software are the foundation of family tree information sharing and function integration. In theory, it proposes some basic principles and discusses key issues of family tree system integration. In technology, it designs a GIS-based framework for family tree system integration and implements a prototype of the integration platform, taking the HXJP-GIS and DZP-PGV systems as an example. The practice indicates that the framework is feasible. It provides new concepts, reference methods and technical support for constructing the Chinese Family Tree Database and Platform to better protect and utilize family trees.

ACKNOWLEDGMENT
This paper is supported by the National Natural Science Foundation of China (40901186 and 40730527) and the National High Technology Research and Development Program of China (2009AA12Z228). The authors would like to thank Feng Yu for her support and Jingwei Sheng and Qi Luo for their initial review, and also thank the Family Tree GIS group for their collaborative work.

REFERENCES
[1] G.N. Lv, M. Chen, Y.N. Wen, D. Hu, Y. Yang, "Research on constructing the Family Tree GIS," in: Proceedings of the 1st Spatially Integrated Humanities and Social Science Forum, Hongkong, China, 2009, pp. 114–127.
[2] H.M. Wang, "The value and abuse of genealogy," Shanghai Education, vol. Z2, pp. 63–63, 2006.
[3] J.X. Ge, "The value and limitation of genealogy as historical article," History Teaching and Research, vol. 6, pp. 3–6, 1996.
[4] Y.D. Yuan, J.R. Qiu, M.H. Zhang, Chinese family name-300 popular family names: population genetics and distribution, East China Normal University Press, 2007.
[5] J.H. Xu, "The value and local feathers of genealogy," Fujian Tribune (A Literature, History & Philosophy Bimonthly), vol. 9, pp. 56–59, 2005.
[6] L.J. Gao, "The role of the Chinese genealogy played in the studies of cultural history," Journal of GuanZhou University (Social Science Edition), vol. 1, pp. 18–20, 2002.
[7] C.R. Liu, "Changes of family population and social economy in Ming and Qing dynasties," Institute of History of Taiwan Academia Sinica, 1992.
[8] D.J. Timothy, J.K. Guelke, Geography and genealogy, Ashgate Publishing, 2008.
[9] M. Kashuba, Walking with your ancestors: A genealogist's guide to using maps and geography, Cincinnati, 2005.
[10] S.P. Chen, C. Huang, "Perspectives on the cultural heritage conservation and development," Geographical Research, vol. 24, pp. 489–498, 2005.
[11] D. Hu, G.N. Lv, Y.N. Wen, Y.R. Feng, M. Chen, H.T. Zhang, "GIS-based family tree information sharing and service," in: Proceedings of the 18th International Conference on Geoinformatics, Beijing, China, 2010.
[12] Q.F. Zhou, Y. Gu, J.H. Chen, X.Y. Lou, L. Zhao, Specification of Family Tree Description Metadata, Shanghai Library, 2004.
[13] Family History Department, the Church of Jesus Christ of Latter-day Saints, The GEDCOM standard release 5.5, The Church of Jesus Christ of Latter-day Saints, 1996.
[14] http://www.hxjiapu.com.cn. [Accessed on 28th Feb, 2011].
[15] http://www.dazupu.com. [Accessed on 28th Feb, 2011].
[16] http://www.dazupu.net. [Accessed on 28th Feb, 2011].
[17] http://wiki.phpgedview.net. [Accessed on 28th Feb, 2011].
[18] http://www.zhonghuajiapu.com. [Accessed on 28th Feb, 2011].