Pre-Referee Version Candidate Paper for the Journal of Information Science: http://jis.sagepub.co.uk

Information School, University of Sheffield - 50th Anniversary

Investigating the Information Seeking Behaviour of Genealogists and Family Historians

Journal of Information Science 39 (1) pp. 1-13 © The Author(s) 2013 Reprints and Permissions: sagepub.co.uk/journalsPermissions.nav DOI: 10.1177/016555150000000 jis.sagepub.com

Paul Darby and Paul Clough Information School, University of Sheffield, United Kingdom

Abstract People are increasingly investigating their family history (or genealogy) as part of their everyday information seeking activities. This paper provides insight into this behaviour and presents a new conceptual model that captures the stages of activity carried out during people’s lifelong family history research. The model offers a multi-phase view of the research process, intended to illustrate: (a) the different research phases themselves; (b) the inter-relationship between phases; (c) distinct phase-specific behaviours; and (d) phase-specific resource preferences. Data collected from amateur family historians by interview and questionnaire has helped to validate the model and provide insights into the information resources used. The findings complement existing knowledge about family history research and will benefit: family historians as they seek to navigate within the research process; providers of genealogical resources as they seek to better support users; and academics as they study information seeking behaviours in various contexts.

Keywords Librarianship and information science; Information seeking behaviour; Family history and genealogy

1. Introduction

Family historians seek to identify their ancestors by locating and searching historical documentary resources, the goal being to construct a network of forebears, normally recorded and presented on a diagram known as a family tree. This task may be facilitated by documentary transcripts, name indices, or databases, but usually also involves hours of trawling through pages of source material. The terms genealogy and family history research are used interchangeably, but differ in scope: genealogy is the systematic tracing of an individual’s ancestors and their key information (dates and places of birth, marriage and death etc.); family history research seeks to go further by unearthing supplementary information about ancestors’ home, educational, working, social and political lives. Throughout this paper genealogy and family history research are referred to collectively as family history research and abbreviated to FHR, whilst genealogists and family historians are termed family history researchers, or FHRs. The legal and medical sectors have long utilised professionally acquired genealogical information, but there has recently been an explosion of interest in ancestral discovery amongst amateur and hobbyist researchers, facilitated greatly by the Internet [1]. FHRs utilise resources ranging from widely available broad-coverage collections to more specialised though limited sources, some requiring significant effort to access (e.g. travel, additional education, special permissions). Typically FHRs visit local libraries, archives, national centres, and niche collections, both in person and online, behaviour characterised by Yakel as ‘everyday life information seeking ... a unique example of intensive and extensive use of libraries and archives over time’ [2].
The task is pleasantly open-ended, limited only by practicalities (record survival or accessibility; willingness to expend effort) or informational difficulties (ambiguities or dead-ends), and, though the research activity itself is fairly solitary, the sharing of methodological information is common.

Corresponding author: Paul Clough, Information School, The University of Sheffield, Regent Court, 211 Portobello Street, Sheffield, S1 4DP


The research described in this paper is intended to add to our understanding of the information behaviour of amateur FHRs, an important group for a number of reasons, not least that they make up an increasingly large proportion of users of heritage and archive resources, between 50 and 90 percent according to Tucker [3]. Improved understanding of their activities will allow institutions to: more effectively tailor information provision (finding aids; digitisation policies; database and website designs); more accurately measure user satisfaction; and so develop information systems to fully exploit collection potential with regard to the needs of FHRs. Amongst repository staff, a better appreciation of FHRs’ aims, preconceptions and misconceptions will improve the user experience for such researchers. The novelty of the interaction between FHRs and information resources provides an example of a little-investigated, potentially unique, mode of information behaviour, and there is evidence to suggest that established behavioural models fall short when describing FHR activity [4, 5]. In this paper we seek to illustrate the different types of information behaviour usually exhibited by FHRs throughout the full extent of their quest by proposing a conceptual model to describe the distinct research stages or phases typically encountered, and by providing definite examples of preferred resources during key research phases. Implicit is the belief that FHRs’ information behaviour is complex and changes over time as: research experience is gained; researchers’ personal circumstances change; research focus shifts; and research interests develop [2, 5, 6]. A multiphase view of the FHR process is offered which illustrates: (a) the different research phases; (b) the inter-relationship between phases; (c) distinct phase-specific behaviours; (d) phase-specific resource preferences. 
This research is novel in that it focuses on amateur FHRs and depicts their unusual, possibly unique, behaviour profile to a high degree of specificity and detail, by means of a conceptual model. Much previous work has considered academic- or employment-initiated FHR which generates quite different behaviours [5]. Furthermore, since ostensibly similar records and record-keeping systems differ internationally and most previous FHR-related studies originate in North America, their applicability to UK-based FHR is limited, particularly in relation to specific resources. This study, carried out in UK repositories, deliberately focussed on UK-based FHRs and their use of mainly UK-originated resources. Though the model is generalisable beyond the UK, some resource findings are locality-specific. Finally, the research takes a user- rather than a collection- or institution-based focus and seeks researchers’ own opinions regarding their information seeking activities and preferred information resources. The paper is structured as follows: section 2 summarises previous FHR studies and why further research is warranted; section 3 outlines the research methodology; section 4 presents results, including a full explanation of our final model and empirical findings on resource preference for specific FHR phases; section 5 considers findings in the broader research context; and section 6 provides a summary and suggestions for future work.

2. Past work

When considering the interaction between users and custodians of archival information, only recently has the focus shifted from a collection-based perspective to a more user-oriented one. Widely regarded as the first academic study to investigate the behaviour of FHRs as a specific group, Duff & Johnson [6] focussed on primarily professional researchers who, it was supposed, had a wider perspective on the research process. They found that FHR surpassed simply gathering dates and suggested three collection phases: tracing individual ancestors; collecting relevant personal information; and gathering contextual information about their lives. They found that names were paramount to FHRs, followed by geographic locations (sometimes refined after consulting maps), and usually qualified by dates. Subjects or events were also used but less frequently. These search terms, though, did not mesh with the provenance-based finding aids found in archives and so research questions needed to be re-framed, a difficult task, particularly for novices. By necessity, FHRs overcame these difficulties by developing trusted and repeatable search strategies, thereby building collection-specific expertise, in part by sharing knowledge with fellow-researchers in informal social networks. Yakel [2] investigated specifically amateur FHRs, finding their research to be an on-going process, as meeting one information need usually generated another. Simply gathering genealogical facts was gradually overtaken by the desire to create a more inclusive narrative in the search for connection and meaning. FHR’s relevance to self-discovery and collective memory was explored by Yakel & Torres [7] and conceptualised as ‘a community of records’, the interrelationship of fact, meaning and truth. This shift of researchers’ focus from cognitive to more affective aspects illustrates a changing information need as individual FHR progresses.
Learning how to do FHR was a significant part of this evolutionary process with fellow researchers favoured over professional intermediaries as educators. Butterworth noted that, unlike traditional information seeking and retrieval (IS&R) research subjects, amateur FHRs were highly motivated within social and cultural contexts [5]. The process of creating self-generated research questions and solutions for a valued audience (self, family or simply posterity) offered regular incremental rewards sufficient to offset any negative aspects of IS&R [8] and was an end in itself, often developing into local history research as contextual information assumed more importance. The same author stated that the complex motivations behind FHR are particularly under-researched, encapsulated by few if any existing models, and characterised FHR as strongly social (groups and sharing), but weakly collaborative (mainly solitary research), with ill-defined activity boundaries making it hard to study within established frameworks [5]. He suggested that the incremental nature of FHR, with continual redefinition of objectives, means that Bates’ berry picking model [10] has something to offer in explaining FHR behaviour. In this model of searching the user’s query is continually shifting and is not satisfied by a single retrieved set of results, but rather by a selection of relevant items (“berries”) found along the way. In investigating design methods for digital library systems Butterworth himself proposed a behavioural model for ‘personal history researchers’ (FHRs), based on ‘educated guesswork’ and input from a single researcher, which proposed a complex inter-relationship of four stages: question formation; identify archive; search and browse; interpret [9]. Butterworth & Davis Perkins [4] found that: FHRs had well-articulated research objectives, likely because they were highly motivated; their questions were self-generated and refined by previous interactions; and they suffered fewer negative aspects of IS&R than professional researchers with imposed objectives [8]. FHR within a wider context has been investigated by other researchers. For example, Fulton looked at how its pleasurable aspects as a leisure activity influenced research behaviour [11] and defined FHR as an example of Stebbins’ concept of ‘serious leisure’: the pursuit of a substantial and fulfilling amateur activity often necessitating the development of special skills [12].
Fulton also studied the social constructs and norms formed around the sharing of information between amateur FHRs and found it to be an important aspect of FHR, supporting learning as well as research success [13]. A few studies have looked at use of the Internet for FHR. In a US-focused study Frazier considered various aspects of online FHR, including which web resources were most visited [14]. Veale investigated how FHRs interacted with the Internet, particularly researching, publishing and collaborating. Information quality, security and over-commercialisation were concerns voiced by FHRs [15]. Garrett investigated the use of Ancestry¹, a commercial provider of online genealogical data, amongst users of a US archive and found that most participants used the website, many availing themselves of free on-site access. Such online tools were viewed positively but did not replace valued on-site visits to view material and seek advice from staff [16]. Skinner considered amateur FHRs’ satisfaction with various resources, interestingly by seeking opinion from both FHRs themselves and the professionals who assist them [17], an approach also advocated by Butterworth [5]. Skinner found a generally satisfied group of researchers, with certain preferences regarding information formats, who again valued face-to-face interaction with knowledgeable repository staff. It is surprising that FHRs are still under-researched, particularly their selection of resources, and many researchers recommend further study since current models do not easily capture their behaviour [4, 5, 16, 17 etc.]. This paper seeks to complement existing research by developing a conceptual model to highlight: the activities undertaken by FHRs; the order in which these are carried out; and the resources used by individuals at key stages.

3. Methodology

A mixed-methods approach was adopted to explore FHR activity and the resources used to support this activity, which might include, for example, historic census returns, church records, local newspapers and maps. An initial framework, based on prior work, was created to describe the main phases of the process (Section 3.1) and to provide a structure within which to orient subsequent investigations. The work is best described as deductive since the resulting model was based on data collected from practising FHRs with reference to an initial conceptual framework. Predominantly qualitative in nature, the research used data from in-depth unstructured interviews, augmented by quantitative analyses of responses to several specific questions (Section 3.2). The particular characteristics of much amateur FHR activity (very long-term research, carried out sporadically yet intensively) meant that other research instruments, such as long-term longitudinal studies or personal activity records, whilst helpful in some situations, were inappropriate here.

3.1. Initial framework

Based on past work (Section 2) combined with the first author’s experiences as a staff member assisting FHRs in a UK county archive and a local studies library, a framework was constructed to encompass eight proposed and distinct phases of activity that FHRs might encounter (see Section 4.1). The framework loosely reflects Wilson’s model of information behaviour in that it recognises: initial triggers; motivations for starting and continuing the research (‘activating mechanisms’); factors influencing research (better illustrated in later parts of the survey); and various levels of seeking behaviour [18]. It also draws on the information seeking model proposed by Ellis [19]. Further discussion of the various types of conceptual models for IS&R research is provided by Järvelin & Wilson [20], including the analytic or causal versus the more process-oriented. Our final model describes the FHR process rather than possible causative factors. It seeks to illustrate the stages involved in FHR rather than explain underlying drivers. In the framework, phases are arranged in an ordered or temporal fashion supposing that people begin at the first phase and work through subsequent phases sequentially without omission, sometimes going back and resuming the sequence from an earlier phase as research focus changes. The activities fall into two categories: (a) preliminary phases: collecting information to enable FHR proper to start; and (b) main research phases: the primary data gathering. Most FHRs visualise and organise their information using a family tree (or pedigree) diagram, a documentary representation of genealogical relationships. We propose that the research effort required to populate the tree, by tracing as many ancestors as possible, varies from ancestor to ancestor depending on a number of factors, and generates differing forms of information behaviour, described by different phases of the framework. Our overall aim was to produce a model describing the activities undertaken during FHR, by analysing findings from the interview and questionnaire surveys, with reference to the initial framework (Section 3.2). The interview data in particular suggested: (a) additional phases; (b) unnecessary phases; (c) alternative phase sequences; and (d) alternative model structures.
The questionnaire data indicated preferred resources for key research phases. The model (described in Section 4.1) will show the research phases typically encountered by FHRs; particular resources associated with different research phases; and how the phases are encountered (sequentially, in parallel, iteratively etc.).

¹ http://www.ancestry.co.uk/ or http://www.ancestry.com/

3.2. Data collection

3.2.1 Participants

Participants were members of the public who were also practising FHRs. Their informed consent was obtained, as required by the Research Ethics Committee of the University of Sheffield. The self-selecting sample consisted of users of repositories in Derbyshire and South Yorkshire (UK), supplemented by individual members of family history groups. Almost all were amateurs, conducting FHR as a leisure activity; one carried out fee-based research regularly, one infrequently. Twenty-three participants made up the interview sample and 21 the questionnaire sample (of whom 17 had taken part in the interview survey). The new questionnaire participants came from informal contacts. All but one participant (96%) did FHR for themselves and about a quarter (22%) carried out work for a family member. A few (13%) researched for a friend, and two (9%) did so for a paying client, at least occasionally. The mean duration of research was 9.7 years, ranging from a few months to 44 years. The mean number of research visits (any trip relating to FHR) was 33 per year, ranging from, on average, less than one trip to around 200 trips annually. Online sources were used frequently, with 66% of participants using them several times per week and 13% weekly. Only 4% of researchers never used the Internet for FHR.

3.2.2 Interviews

Interviews allowed exploration of participants’ FHR practices and were piloted on knowledgeable staff from the target repositories (county archives and local studies libraries in the region). They were unstructured except for reference to the initial framework (and use of a topic list as an aide-mémoire for the interviewer) and were conducted at the repository being visited or another convenient location, usually the participant’s home. Duration varied: the shortest around thirty minutes; the longest about two hours. To avoid intimidating participants, conversations were not recorded electronically.
Comments were noted down and later transcribed into electronic documents whose content was subsequently divided such that all comments pertaining to a given topic were collected together. This allowed the different responses within each topic to be noted and given a code value. The texts for each topic were then analysed line-by-line and coded accordingly, then individual codes totalled to provide a measure of opinion and to facilitate simple statistical analyses. To improve coding consistency and transparency codes were added into comment texts, allowing subsequent review by the authors and other researchers.
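The tallying step described above can be illustrated with a minimal sketch. It assumes each coded comment is stored as a (participant, topic, code) triple and that a participant is counted at most once per code; the participant IDs, topic name and code labels are hypothetical and are not taken from the study data.

```python
from collections import Counter

# Hypothetical coded comments: (participant_id, topic, code) triples.
# IDs, topic and code labels are illustrative only, not study data.
coded_comments = [
    ("P01", "learning", "ongoing"),
    ("P01", "learning", "trial_and_error"),
    ("P02", "learning", "ongoing"),
    ("P03", "learning", "no_discrete_phase"),
]


def tally_codes(comments, topic, respondents):
    """Return {code: (count, percentage)} for one topic.

    Each participant is counted at most once per code, but may appear
    under several codes, so the percentages need not sum to 100.
    """
    # Deduplicate so repeated remarks by one participant count once per code.
    seen = {(pid, code) for pid, t, code in comments if t == topic}
    counts = Counter(code for _, code in seen)
    return {code: (n, round(100 * n / respondents)) for code, n in counts.items()}


summary = tally_codes(coded_comments, "learning", respondents=3)
print(summary)
```

Dividing by the number of respondents to the topic, rather than by the full sample, mirrors the "response rate" reported alongside each table.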


Figure 1. The proposed model of activities involved in and resources used for family history research

3.2.3 Questionnaires

Following interview analyses additional data about phase-related resource choices were sought by questionnaire. The resources list therein acted as a memory-jogger for participants, the need for which became apparent during interviews when some resources were mentioned only after a chance reminder. The questionnaire sample was predominantly the same group as for the interviews. In most instances questionnaires were disseminated and returned by email. Piloting suggested 15 to 20 minutes to complete the questionnaire, which was made up of three identical lists of 48 resources (e.g. Census online; Post-1858 wills etc.), each list to be considered in relation to a single key tree-building phase, i.e. phases 5-7 in Figure 1. Using a Likert scale: very good; good; neutral; bad; very bad, resource effectiveness was graded for each phase. Resources not used were left blank. For each of the three phases under investigation, votes for each of the five grades, across the 48 resources, were summed to give a count for each grade/resource/phase combination. A single score per resource within phase was achieved by applying a weighting factor to the grades: +2; +1; 0; -1; -2 respectively, then summing the weighted values per resource/phase combination.
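The weighted scoring can be sketched as follows. The weights are those stated in the text; the function name and the vote counts are illustrative assumptions, not data from the questionnaire.

```python
# Weights stated in the text for the five Likert grades.
WEIGHTS = {"very good": 2, "good": 1, "neutral": 0, "bad": -1, "very bad": -2}


def weighted_score(grade_counts):
    """Sum votes x weight over the grades for one resource in one phase.

    Blank (unused) responses are simply absent from grade_counts,
    so they contribute nothing to the score.
    """
    return sum(WEIGHTS[grade] * votes for grade, votes in grade_counts.items())


# Illustrative vote counts for a single resource/phase combination.
example_votes = {"very good": 6, "good": 4, "neutral": 2, "bad": 1, "very bad": 0}
score = weighted_score(example_votes)
print(score)  # 6*2 + 4*1 + 2*0 + 1*(-1) + 0*(-2) = 15
```

Because unused resources are left blank rather than graded neutral, a score of this kind reflects both how favourably a resource is rated and how many participants used it.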


4. Results and analysis

The final model’s research phases, their inter-relationships and attributes are shown in Figure 1. The proposed model is presented first as subsequent discussions will refer to particular phases. The results are presented as follows: the model overview (Section 4.1); details of the model’s constituent phases (Section 4.2); and findings linking particular resources to key research phases (Section 4.3). Note: Unless otherwise stated, individual statistics are independent of one another, i.e. participants can be counted in more than one category.

4.1. Model overview

Table 1. Comments on proposed framework (interviews)

Comments on proposed framework (response rate 22 of 23) | Count | Percentage
General agreement with proposed phases | 20 | 91%
Fundamentally disagree with proposed framework | 0 | 0%
Jumped into middle of phase sequence | 8 | 36%
Did some phases in different order | 6 | 27%
Research seen as a strongly iterative process | 3 | 14%

A sequential ordering is implied by the phase numbers of the model (see Figure 1). It is assumed that FHR is initiated at phase 1: trigger event, and usually followed by phase 2: collect family information, though findings suggest some researchers skip this phase. Phase 2 may be followed by phase 3: learn the process, though this is not the case for an increasing number of FHRs, so phase 2 (or indeed phase 1) could alternatively be followed by phase 4: break in, bypassing phase 3. For phases beyond phase 3 we suggest a parallel aspect to reflect the continuous learning reported by many FHRs. By definition there is a strongly implied sequence from phase 4 through phases 5, 6 and 7: tree-build - easy; medium; hard to phase 8: push back selected lines. At any of these phases it is possible to go back to an earlier phase and resume the sequence, possibly when initiating research into a new ancestral line. It would, though, be unusual to revert as far as phase 1 or 2. Jumping back to phase 3 is now meaningless since learning is only considered as a discrete step the first time through. Thus the model describes the iterative and cyclical nature of the FHR process by supporting alternative pathways through its constituent phases. There was good agreement (91%) with, and recognition of, the phases of the initial framework and no respondents fundamentally disagreed with its components or overall structure. Some agreed it reflected how they had done things themselves but wondered whether this was still relevant to today’s researchers. Others questioned the order in which phases might be encountered and whether some would be skipped initially or omitted entirely.
Some FHRs started their research in later phases (36%) or experienced phases in a different order from that proposed (27%): “At first you don’t follow a proper path - you want to learn a little bit about everybody” (A007); “I started looking immediately for specific individuals using online resources – didn’t do phases 2 to 4” (A023). One researcher “jumped to earlier records because I’m fascinated by the nineteenth century” (A025). Several (14%) commented on the process being iterative. Some respondents thought that the Internet encouraged novice or younger FHRs to skip earlier phases and to engage in initially unstructured, exploratory activity, behaviour characterised by speculative, unfocused searching using, perhaps, general search engines with simple search terms. One researcher suggested “an extra phase to reflect early speculative searching online” (A022), though this was not included in the final model.
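One way to read the pathways described in this section is as a set of permitted transitions between phases. The sketch below is only an interpretation of the prose, not a definition taken from Figure 1: it assumes that phase 2 can be skipped (so phase 1 may lead to phase 3 or 4), that back-jumps never return to phases 1 to 3, and that the parallel learning component is left out. The function name is hypothetical.

```python
# Phase names as given in the text.
PHASES = {
    1: "trigger event",
    2: "collect family information",
    3: "learn the process",
    4: "break in",
    5: "tree-build - easy",
    6: "tree-build - medium",
    7: "tree-build - hard",
    8: "push back selected lines",
}


def next_phases(current):
    """Return the set of phases that may follow `current` under this reading."""
    if current == 1:
        return {2, 3, 4}  # phase 2 (and phase 3) may be skipped
    if current == 2:
        return {3, 4}  # phase 3 may be bypassed
    if current == 3:
        return {4}
    # Main research phases: step forward, or jump back to an earlier
    # main phase (4 onwards) and resume the sequence from there.
    forward = {current + 1} if current < 8 else set()
    back = set(range(4, current))
    return forward | back


for phase in PHASES:
    print(phase, PHASES[phase], "->", sorted(next_phases(phase)))
```

Respondents who reported starting in a later phase are not represented here; accommodating them would mean allowing any phase as an entry point.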

4.2. Phases of the model

The individual phases of the model are now considered in turn.

4.2.1 Phase 1: trigger event

A specific event (or events) usually triggered FHR (see Table 2), catalysing investigation of long-held family questions posed by the researchers themselves or other family members (45%): “The desire to learn about father's Navy career” (A007); “Seeking birth parents of a ‘foundling’ ancestor” (A017). A change in practical circumstances (e.g. more time after retirement or improved access to resources) was a common trigger (41%), as was the desire to learn about mysterious stories (32%): “A wish to get to the truth behind some dreadful family stories” (A003); or “A desire to get to the bottom of some ‘funny business’ surrounding an ancestor who was ‘a bit of a rogue’” (A004). Other reasons included: the discovery of family documents; school projects; capturing the experiences of elderly relatives; media programmes.


Table 2. Phase 1: trigger event (interviews)

Characteristics of phase 1: trigger event (response rate 22 of 23) | Count | Percentage
Specific objective or information gap | 10 | 45%
Practical factors | 9 | 41%
Family anecdotes & mysteries | 7 | 32%
Family members | 7 | 32%
Family documents or artefacts | 5 | 23%

4.2.2 Phase 2: collecting family information

Table 3. Phase 2: collecting family information (interviews)

Characteristics of phase 2: collecting family information (response rate 20 of 23) | Count | Percentage
Verbal information from family members | 16 | 80%
Family documents | 12 | 60%
Family photographs | 7 | 35%
Family artefacts | 4 | 20%
Previous research by other family members | 3 | 15%
Tried but not very successful | 3 | 15%
Unco-operative family members | 3 | 15%

Beginning FHR can be daunting so novice researchers often start by gathering information from within the family, research activity which often provides relatively easily-won rewards. Table 3 summarises characteristics of this stage. Phase 2: collect family information includes collecting family anecdotes (80%), gathering documents like BMD (birth, marriage and death) certificates (60%), or family photographs (35%). Physical artefacts also provided useful information: “A personal diary from the Crimean war” (A027). Previous research by other family members helped some (15%), though information quality was variable. Some FHRs failed to gain useful information during this phase (15%), finding family members unwilling or unable to help: “What do you want to know that for?” (A007); others had no one to ask and had to rely on their own memories.

4.2.3 Phase 3: learn the process

Table 4. Phase 3: learn the process (interviews)

Characteristics of phase 3: learn the process (response rate 22 of 23) | Count | Percentage
Learning is on-going | 10 | 45%
Learning-by-doing or trial-and-error approach | 9 | 41%
No specific learning phase | 6 | 27%
Join group or society and/or attend events | 8 | 36%
Text books & magazines | 7 | 32%
Online resources & ‘How To’ websites | 6 | 27%
Family members or friends | 6 | 27%
Library or archive staff | 6 | 27%

We propose that researchers must somehow learn about the FHR process to satisfy informational objectives effectively and indeed this was not questioned by FHRs. Learning, though, went beyond the single discrete phase 3: learn the process, with almost half of respondents (45%) saying that it was a continuous part of their research activity: “to appreciate the limitations of sources” (A003); “because catalogues and collections are often hard to understand” (A000); “to properly interpret resources” (A007). Some (27%) had not personally experienced a discrete learning phase; others wondered if it was relevant to today’s FHRs, citing as the probable reason the easier information access afforded by the Internet and the growth of trial-and-error searching. Learning-by-doing was a significant method of developing information gathering techniques, with many FHRs (41%) employing a trial-and-error approach to searching. One respondent described “plunging in at the deep end” (A008); another said “I just followed my nose and learned things as I went along - I made a lot of mistakes at first” (A026). Our model includes a parallel learning component from phase 3 onwards. A distinct phase 3 was retained because it is still a valid research phase for some.


4.2.4 Phase 4: breaking in

Table 5. Phase 4: breaking in (interviews)

Characteristics of phase 4: breaking in (response rate 18 of 23) | Count | Percentage
Sufficient information to access census resources immediately | 13 | 72%
Speculative jump into census | 0 | 0%

The nineteenth and early twentieth century census returns are unquestionably a key FHR resource - one participant advised: “Get to the census as soon as you can” (A014) - and the original supposition was that researchers would take purposive steps to gather ancestral information to enable them to utilise this valuable resource. In fact most FHRs (72%) were already aware of ancestors for whom to search within census resources. No FHRs reported jumping into the census speculatively, counter to other indications of trial-and-error searching. Many respondents talked more generally about an initial key discovery which launched their research: “A military service record provided his next of kin and an address and that set me going” (004); “Family bible dates often went back well into the nineteenth century” (A013); “...travelled to specific locality in Scotland and accessed initial information at a local library” (A026). In light of these findings the definition of phase 4 was broadened to accommodate any finds that launch the research process, and was renamed ‘break in’ to better illustrate this important step of commencing the research.

4.2.5 Phases 5, 6 and 7: easy, medium and hard tree-build characteristics

Phases 5, 6 and 7: tree-build - easy; medium; hard encompass the key task of constructing the family tree, the essential reason for doing FHR. All respondents reported experiencing phase 5: tree-build - easy and phase 6: tree-build - medium activity and around half (48%) had encountered phase 7: tree-build - hard. The task is most commonly approached in a breadth-first manner, initially perhaps going back five or six generations, though some FHRs (23%) pursue a depth-first alternative, following single lines back as far as possible, one at a time. The distinction between phases 5, 6 and 7 is subjective and not clear-cut, but still a valuable means by which different behaviours can be explored.
Phase 5 is when the easiest finds are made, usually more recent forebears. Main resources are those most easily accessed, requiring little expert knowledge to use and perhaps offering some form of indexing or transcription: “...speed of finding people by [online] name searches” (A004); “Straightforward to go back several generations” (A010). In phase 6 reasonable effort is required and tasks might include confirmation of earlier findings using ‘more trustworthy’ sources e.g. originals or facsimile copies rather than transcripts: “I started with transcribed resources - easier to find information - then originals” (A017). Coverage is typically not nationwide from a single source, but similar records are available at local level throughout the country, necessitating travel and some knowledge to access: “I started to visit archives to access church records - baptisms on micro-fiche” (A029). Phase 7 describes the highly purposive behaviour necessary to locate ‘difficult’ ancestors, not found in earlier phases, or to resolve ambiguities. The breadth of potential resources is larger than for preceding phases making research more piecemeal. Use of specialist collections is common, as is the need to travel to access sources, e.g. “Gloucester record office to look at a prison register” (A000); “Society of Genealogists' library” (A026).

Table 6. General characteristics, phases 5, 6 and 7: tree-build - easy; medium; hard (interviews)

| Characteristics of phases 5, 6 and 7: tree-build - easy; medium; hard | Phase 5 (N=23) | Phase 6 (N=23) | Phase 7 (N=11) |
|---|---|---|---|
| Use of online resources | 70% | 48% | 18% |
| Use of physical documents (including copies on various media) | 48% | 87% | 91% |
| Archive, local studies library, family history centre | 61% | 78% | 73% |
| Society, group, course, or introductory event | 17% | 26% | 9% |
| Desire for contextual information | 4% | 9% | 18% |
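As a quick arithmetic check on Table 6, the following sketch computes the change in each characteristic between phase 5 and phase 7 and the mean usage of archives, local studies libraries and family history centres across the three phases. The percentages are transcribed from Table 6; the function and variable names are illustrative only, and the mean is a simple unweighted average that ignores the different respondent numbers per phase (an assumption of this sketch, not a method stated in the paper).

```python
# Percentages transcribed from Table 6, in the order (phase 5, phase 6, phase 7).
TABLE_6 = {
    "Use of online resources": (70, 48, 18),
    "Use of physical documents": (48, 87, 91),
    "Archive, local studies library, family history centre": (61, 78, 73),
    "Society, group, course, or introductory event": (17, 26, 9),
    "Desire for contextual information": (4, 9, 18),
}


def change_phase5_to_phase7(row):
    """Percentage-point change between phase 5 and phase 7."""
    return row[2] - row[0]


def mean_across_phases(row):
    """Unweighted mean of the three phase percentages (assumption: equal weight per phase)."""
    return sum(row) / len(row)


changes = {name: change_phase5_to_phase7(row) for name, row in TABLE_6.items()}
archive_mean = mean_across_phases(
    TABLE_6["Archive, local studies library, family history centre"]
)

for name, delta in changes.items():
    print(f"{name}: {delta:+d} percentage points")
print(f"Archive usage, mean of phases 5 to 7: {archive_mean:.0f}%")
```

Under these assumptions the online and physical rows move in opposite directions by roughly the same magnitude, while the archive row stays comparatively flat.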

Table 6 shows some non-resource-specific research characteristics through phases 5 to 7. As ancestor identification becomes more difficult the use of online resources declines (70% to 18%), whilst the use of physical material increases (48% to 91%), probably indicative of the lesser online availability of more unusual records and the initial focus on digitising records with the widest appeal. The high and relatively constant usage of archives, local studies libraries and family history centres (averaging 71%) illustrates the multi-level appeal of such institutions as places where serious researchers can access specific resources, and simultaneously where novices can take their first research steps. The increasing desire for contextual information about ancestors and their lives (4% to 18%) supports findings of other researchers [5, 6, 7]: “I started to want to build up a general picture of people’s lives” (A024).

4.2.6 Phase 8: push back selected lines

Relatively few survey participants (35%) reported reaching phase 8: push back selected lines, the phase during which, having constructed a well-populated family tree, the FHR concentrates subsequent effort on pushing back particular ancestral lines and which necessitates accessing more ‘difficult’ resources: difficult to locate; access; read; understand; or interpret. Few have guaranteed breadth of coverage or central means of access, and many require specific knowledge to interpret effectively. Two distinct sub-phases of behaviour were apparent in phase 8, defined by degree of familiarity with such sources: occasional users of such material (50% of respondents); and regular users (38% of respondents). The final model illustrates these two distinct but associated sub-phases by splitting phase 8 into: 8a: push back selected lines - occasional and 8b: push back selected lines - regular. It seems reasonable to assume that FHRs start at sub-phase 8a, and with experience some, though not all, progress to sub-phase 8b.

Table 7. Phase 8: push back selected lines (interviews)

| Characteristics of phase 8: push back selected lines (response rate 8 of 23) | Count | Percentage |
|---|---|---|
| Occasionally use ‘difficult’ records | 4 | 50% |
| *Copy material for later interpretation* | *2* | *50%* |
| *Link into pre-existing research* | *1* | *25%* |
| *Seek specific tuition to utilise ‘difficult’ records* | *0* | *0%* |
| Regularly use ‘difficult’ records | 3 | 38% |
| *Copy material for later interpretation* | *0* | *0%* |
| *Link into pre-existing research* | *0* | *0%* |
| *Seek specific tuition to utilise ‘difficult’ records* | *1* | *33%* |

Note: italic counts and percentages are within non-italic categories.
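The two levels of percentage in Table 7 use different denominators: main categories are expressed against the 8 phase 8 respondents, and the nested behaviours against the count of their own category. The sketch below reproduces the printed figures from the counts under that reading. The data structure and function names are illustrative, and the half-up rounding rule is an assumption chosen because it matches the published values.

```python
from decimal import Decimal, ROUND_HALF_UP

INTERVIEWEES = 23         # all interview participants
PHASE_8_RESPONDENTS = 8   # interviewees who reported reaching phase 8

# Counts transcribed from Table 7: each main category with its nested behaviours.
TABLE_7 = {
    "Occasionally use 'difficult' records": {
        "count": 4,
        "within": {
            "Copy material for later interpretation": 2,
            "Link into pre-existing research": 1,
            "Seek specific tuition to utilise 'difficult' records": 0,
        },
    },
    "Regularly use 'difficult' records": {
        "count": 3,
        "within": {
            "Copy material for later interpretation": 0,
            "Link into pre-existing research": 0,
            "Seek specific tuition to utilise 'difficult' records": 1,
        },
    },
}


def pct(part, whole):
    """Whole-number percentage, rounded half-up (assumed rounding rule)."""
    ratio = Decimal(100 * part) / Decimal(whole)
    return int(ratio.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def table_7_percentages(table, respondents):
    """Main rows use all respondents as denominator; nested rows use the category count."""
    rows = []
    for category, data in table.items():
        rows.append((category, pct(data["count"], respondents)))
        for behaviour, count in data["within"].items():
            rows.append((f"  {behaviour}", pct(count, data["count"])))
    return rows


percentages = table_7_percentages(TABLE_7, PHASE_8_RESPONDENTS)
reach_rate = pct(PHASE_8_RESPONDENTS, INTERVIEWEES)

for label, value in percentages:
    print(f"{label}: {value}%")
print(f"Reached phase 8: {reach_rate}% of interviewees")
```

With denominators this small, a single respondent shifts a nested percentage by 25 to 33 points, which is worth bearing in mind when comparing the sub-phases.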

Though respondent numbers were small, findings indicated distinguishing behaviour for each sub-phase. In sub-phase 8a FHRs used complex documents in relatively superficial ways, relying heavily on the work of intermediaries: (a) catalogues: one relatively inexperienced researcher “rapidly started to use difficult records after research linked to a major family with a large archive of documents” (A008); (b) transcripts; (c) previous research, though not without awareness of the need for independent verification; (d) interpretation by third parties: “I'm not an expert on reading old documents ... I take a copy ... ask others for help in reading and interpreting it later” (A024). By such means complex content is collected, interpreted and absorbed into researchers’ collections. In sub-phase 8b FHRs’ behaviour was more purposeful and targeted; for example, researchers sought additional skills like palaeography (reading old handwriting) or Latin to access record content: “I joined a group to learn to read old documents” (A014).

4.3. Information resources

Table 8 shows highest-rated resources based on effectiveness for FHR, by tree-building phase, quantified by summing questionnaire votes after applying grade weightings (Section 3.2.3). A high score is taken to indicate preference for a resource. Effectiveness, though, is a complex criterion and measures opinion about various inter-related characteristics: availability; accessibility; personal research success etc., and so warrants further investigation.

During phase 5: tree-build - easy FHRs preferred easy-to-access online resources which provide broad coverage with indexed or transcribed content, in particular: census material (online and physical); online BMD indices; and Ancestry. Even in this first tree-building phase there was significant use of physical records at repositories (45%), much content un-indexed, most commonly: parish register material; trade directories; and newspapers.

In phase 6: tree-build - medium, census, BMD indices, and Ancestry scored highly, as did physical parish material. Maps were proportionately more highly rated than previously, both published Ordnance Survey (old and modern) and manuscript versions (e.g. tithe and enclosure). County archive websites (for orienting and collection information) and visits to TNA were also popular.

A wider resource pool is noticeable in phase 7: tree-build - hard where the resource mix differed markedly from phases 5 and 6. Physical versions of parish register records and BMD indices were considered most effective. Few online resources scored well, probably having been exhausted in previous phases. Wills (ancient and more modern) appear, as do less familiar records, deliberately sought out and pertaining to: nonconformist worship; workhouses and asyla; land and property. More generalised resources e.g. museum or library websites; GENUKI website² (UK-oriented genealogy portal) did moderately well.

Absolute scores decrease from phase 5 through to phase 7 because initially FHRs employed the same resources and broadly agreed as to their effectiveness, whereas in later phases a progressively wider pool of used resources meant ratings were more dispersed. Also, only in situations where a source was commonly accessed using different media was the resource/medium combination investigated as a separate entity; in other cases medium was not taken into account. Thus key resources (ascertained from preliminary work) were more closely investigated with respect to delivery medium. Medium-independent resource comparisons are not provided, but can be achieved by summing scores for a given resource across all media.

Table 8. Resource effectiveness, highest-rated, phases 5, 6 and 7: tree-build - easy; medium; hard (questionnaire)

| Phase | Resource within research phase (response rate 21 out of 21) | Weighted score | Votes |
|---|---|---|---|
| Phase 5 | Census returns online | 29 | 20 |
| Phase 5 | GRO indexes of birth, marriage & death - online (various free & fee-based) | 27 | 19 |
| Phase 5 | Ancestry website | 25 | 18 |
| Phase 5 | Parish register originals (including film; fiche; photocopies) | 24 | 15 |
| Phase 5 | Census returns - physical media (film; fiche; photocopies etc.) | 17 | 14 |
| Phase 5 | Trade directories - all media | 12 | 11 |
| Phase 5 | Newspapers - physical media (including film; fiche; photocopies) | 12 | 11 |
| Phase 5 | Parish register online transcripts (excluding International Genealogical Index) | 12 | 9 |
| Phase 6 | Parish register originals (including film; fiche; photocopies) | 20 | 13 |
| Phase 6 | Ancestry website | 18 | 17 |
| Phase 6 | Census returns online | 18 | 17 |
| Phase 6 | GRO indexes of birth, marriage & death - online (various free & fee-based) | 13 | 16 |
| Phase 6 | County archives & record office websites | 8 | 9 |
| Phase 6 | Census returns - physical media (film; fiche; photocopies etc.) | 8 | 11 |
| Phase 6 | Find My Past³ website | 8 | 7 |
| Phase 6 | Maps Ordnance Survey (old & modern) - all media | 8 | 8 |
| Phase 7 | Parish register originals (including film; fiche; photocopies) | 7 | 12 |
| Phase 7 | GRO indexes (birth, marriage, death) - other media (including film; fiche; photocopies) | 5 | 4 |
| Phase 7 | Post-1858 wills - all media | 5 | 6 |
| Phase 7 | Pre-1858 wills (ecclesiastical) - all media | 4 | 5 |
| Phase 7 | Poor law / workhouse / asylum records | 4 | 6 |
| Phase 7 | Nonconformist chapel registers | 4 | 5 |
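The medium-independent comparison mentioned above, summing scores for a resource across all media, can be sketched as follows for census returns, the one resource listed in Table 8 under two media within the same phase. The weighted scores are transcribed from Table 8; the grouping label "Census returns" and the function name are assumptions of this sketch. Because Table 8 lists only the highest-rated entries, any medium that did not reach the table is missing from these sums.

```python
from collections import defaultdict

# (phase, resource, medium, weighted score) transcribed from Table 8.
# Assumption: the two census rows in each phase are the same resource
# delivered through different media.
TABLE_8_CENSUS_ROWS = [
    (5, "Census returns", "online", 29),
    (5, "Census returns", "physical media", 17),
    (6, "Census returns", "online", 18),
    (6, "Census returns", "physical media", 8),
]


def sum_across_media(rows):
    """Add up weighted scores per (phase, resource), ignoring the medium."""
    totals = defaultdict(int)
    for phase, resource, _medium, score in rows:
        totals[(phase, resource)] += score
    return dict(totals)


totals = sum_across_media(TABLE_8_CENSUS_ROWS)
for (phase, resource), score in sorted(totals.items()):
    print(f"Phase {phase}, {resource} (all listed media): {score}")
```

On this reading census returns as a whole outscore every single resource/medium row within phases 5 and 6, although the same summation would need to be applied to the other resources before ranking them.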

² http://www.genuki.org.uk/
³ http://www.findmypast.co.uk/

5. Discussion

The objective of this research was to produce a conceptual model describing the informational activities of FHRs, particularly amateurs whose complex motivations for research potentially lead to unexpected behaviours. Xie describes search models as ‘illustrations of patterns of searching and the search process’, and adds ‘some ... identify factors that influence the search process’ [21]. This highlights the distinction between process-oriented and causal (or analytic) models as defined by Järvelin & Wilson [20]. Xie also notes that a universal model may not be achievable [21]. By Järvelin & Wilson’s definition [20] our model is process-based since the causative factors which influence FHR behaviour are not considered; rather the activity’s component parts and their inter-relationships are described.

This model is deliberately more focused on a particular context (UK-based amateur FHRs) than are traditional generic models whose wider applicability necessarily reduces their direct relevance to FHR. For example, Wilson’s model [18], though powerful and the basis for others, does not offer contextual detail. Ellis’s model [19], founded on the information behaviour of academics, includes concepts like chaining and monitoring, which do not map well to FHR.

The multi-phase structure of our model owes something to generic models by, amongst others, Ellis [19], Wilson [18] and Kuhlthau [8], and also reflects aspects of recent FHR-specific research, e.g. the gradual transformation of an individual's FHR focus from cognitive to affective with associated behavioural changes. This is characterised by Duff & Johnson as a 3-stage process of collecting names, then relevant details and finally contextual information [6]; by Butterworth as the conversion of FHR into local history research [5]; and by Yakel as the increasing search for wider meaning through FHR [2]. We also describe the cyclical nature of FHR, as findings modify subsequent information objectives. Wilson’s model describes a similar feedback loop [18]; and Bates’ berry picking model [10] recognises the evolution of queries in the light of findings and that information gathering is iterative (berry-by-berry).

There are commonalities across traditional models and with Butterworth’s FHR-focused behavioural model [9], within which the phases of our model may be situated. For instance, most describe an initial stage, possibly including a trigger, during which an information gap is perceived (partly phase 1: trigger event) and the problem is comprehended within existing frameworks, e.g. Ellis’s starting [19]; Wilson’s problem recognition driven, in part, by stress/coping factors [18]; Butterworth’s (high-level) question formation [9]; Kuhlthau’s initiation [8]. Initial preparations are then made (phase 2: collect family information and phase 3: learn the process), as in Wilson’s problem definition [18]. Exploratory activity follows, when high-level browsing provides familiarisation, research leads or even serendipitous finds (phase 3: learn the process; continuous learning; trial-and-error; and phase 4: break in), but also where initial optimism may be displaced by doubt or confusion. This stage corresponds to Kuhlthau’s selection leading into exploration [8]; Ellis’s browsing [19]; Skov & Ingwersen’s exploratory behaviour [22].
A selection or evaluation stage usually comes next, during which decisions about potential usefulness of resources are made and confidence increases (phase 3 and continuous learning have relevance here by supporting the earlier information gathering phases: phase 5 and 6: tree-build - easy; medium), resonant with Ellis’s differentiating [19]; Kuhlthau’s formulation [8]; Butterworth’s identify archives and searching/browsing (within) [9]. With experience, and in the light of findings, objectives become more focused and information is extracted more effectively (all FHR data gathering phases exhibit this evolution, but it is particularly relevant in phase 7: tree-build - hard and phase 8: push back selected lines). This corresponds with Wilson’s problem resolution through active searching, modified by intervening variables [18]; Ellis’s extracting [19]; Kuhlthau’s collection [8]; Skov & Ingwersen’s known item searching and meaning making [22]; Butterworth’s interpretation [9]. Finally there may be a summarising stage (not included in our model) which for FHR might be producing a finalised pedigree chart or multi-media item for dissemination. Kuhlthau’s presentation activity [8] encompasses this stage, which may be accompanied by feelings of relief or perhaps disappointment. The decrease in pre-research methodological learning and the increase in trial-and-error searching has implications for planning services and tools for use by FHRs. This change may, at least in part, be explained by the proliferation of online resources which offer easily-won facts and incremental rewards [5] and thereby counteract potentially demotivating negative feelings [8]. Regarding resource preferences during tree-building (phases 5 to 7: tree-build - easy; medium; hard), it was unsurprising to see census, BMD, and parish material generally rated highly since they are widely recognised as the main sources for FHR back to the early nineteenth century. 
Online resources scored highly during phase 5, but tailed off in phases 6 and 7, an indication of the preference for easy access value-added content when available. Physical resources at repositories were strong throughout; in earlier phases probably partly due to valued contact with expert staff and fellow-researchers; in later stages because on-site visits gave access to a diverse range of potentially useful sources, many unlikely ever to be digitised and made available online. Findings suggested the supplementing of purely genealogical resources (e.g. BMD indices; parish registers) by those offering greater contextual information about an ancestor’s life (e.g. wills; maps; newspapers) which supports previous findings [2, 5, 6, 7]. Online providers like Ancestry and Find My Past were particularly popular in earlier phases, though the fact that local authorities in the survey locality provide free-of-charge access in their repositories to the former may have skewed findings. It is hoped that this model will be used as the basis for the exploration of further avenues of research into FHR, augmenting the interpretations provided by more established generic models, and thus providing a framework within which to situate observed behaviours and to formulate research hypotheses. Further studies might look at: causative factors behind FHR behaviour, particularly (a) the complex motivations involved; (b) how FHRs choose to learn (about research approach, resources etc.), and how this can be supported; (c) how FHRs’ informational objectives change over time to include contextual information as well as the purely factual, and the consequent effect on research behaviour.

6. Conclusions and future work

This paper presents an investigation into the information seeking behaviours of amateur family history researchers (FHRs). Most past work has considered professional or academic researchers, rather than the growing sector of non-professionals for which family history research (FHR) becomes a potentially lifelong serious leisure activity. This study has focused on FHRs from the UK in contrast to previous studies that considered mainly researchers from North America. Contributions of the work include a conceptual model of information seeking that aims to capture the specific activities conducted by individuals during FHR along with identifying resources used throughout the process. The type of information used is clearly seen to differ throughout the process and consists of both physical and digital (including online) resources. The resulting conceptual model offers a multi-phase view of the research process, intended to illustrate (a) the different research phases themselves; (b) the inter-relationship between individual phases; (c) distinct phase-specific behaviours; and (d) phase-specific resource preferences. The final model, based on data gathered from practising FHRs by interview and questionnaire, helps to provide a clearer picture of their research behaviours. Future work should further validate the model and perhaps explore causative factors that may link the various phases.

Acknowledgments Sincere thanks to the family history researchers who took part in this survey and to the staff and representatives of the various institutions or groups from which participants were drawn. Work partially funded by the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement number 270082 (PATHS).

References

[1] Fox, S. Older Americans and the internet. Pew Internet & American Life Project 2004; http://www.pewinternet.org/~/media//Files/Reports/2004/PIP_Seniors_Online_2004.pdf.pdf (accessed April 2012).
[2] Yakel, E. Seeking information, seeking connections, seeking meaning: genealogists and family historians. Information Research 2004; 10: paper 205.
[3] Tucker, S. Doors opening wider: library and archival services to family history. Archivaria 2007; 62: 127-158.
[4] Butterworth, R. & Davis Perkins, V. Using the information seeking and retrieval framework to analyse nonprofessional information use. In: Ruthven, I. (chair), Proceedings of the First International Conference on Information Interaction in Context (IIiX). 18-20 October 2006, Copenhagen, Denmark. 162-168. ACM, New York, NY, USA.
[5] Butterworth, R. Information seeking and retrieval as a leisure activity. In: DL-CUBA 2006. Workshop on Digital Libraries in the Context of Users' Broader Activities: USA. Chapel Hill: JCDL; 2006.
[6] Duff, W.M. & Johnson, C.A. Where is the list with all the names? Information-seeking behavior of genealogists. The American Archivist 2003; 66: 79-95.
[7] Yakel, E. & Torres, D. Genealogists as a ‘Community of Records’. American Archivist 2007; 70: 93-112.
[8] Kuhlthau, C.C. Inside the search process: information seeking from the user's perspective. Journal of the American Society for Information Science 1991; 42: 361-371.
[9] Butterworth, R. Gathering user requirements when you do not know who the users are: a case study of digital library development. Technical Report IDC-TR-2006-002, Interaction Design Centre, School of Computing Science, Middlesex University 2006.
[10] Bates, M.J. The design of browsing and berrypicking techniques for the online search interface. Online Review 1989; 13: 407-424.
[11] Fulton, C. The pleasure principle: the power of positive affect in information seeking. Aslib Proceedings: New Information Perspectives 2009; 61: 245-261.
[12] Stebbins, R.A. Serious leisure: a conceptual statement. Pacific Sociological Review 1982; 25: 251-272.
[13] Fulton, C. Quid Pro Quo: Information sharing in leisure activities. Library Trends 2009; 57: 753-768.
[14] Frazier, R.A. Genealogy research, internet research and genealogy tourism. MA thesis, University of Wisconsin-Stout 2001.
[15] Veale, K. A doctoral study of the use of the internet for genealogy, PhD Thesis, Curtin University of Technology, Australia. http://historia-actual.org/Publicaciones/index.php/haol/article/viewFile/89/83 (2005, accessed April 2012).
[16] Garrett, C. Genealogical research, Ancestry.com, and archives, PhD Thesis, Alabama, USA: Auburn University. http://etd.auburn.edu/etd/bitstream/handle/10415/2014/Christine.Garrett_thesis.pdf?sequence=1 (accessed April 2012).
[17] Skinner, J. Does greater specialization imply greater satisfaction? Amateur genealogists and resource use at the state historical society of Iowa libraries. Libri 2010; 60: 27–37.
[18] Wilson, T.D. Information behaviour: an interdisciplinary perspective. Information Processing & Management 1997; 33: 551-572.
[19] Ellis, D. A behavioural approach to information retrieval system design. Journal of Documentation 1989; 45: 171-212.
[20] Järvelin, K. & Wilson, T.D. On conceptual models for information seeking and retrieval research. Information Research 2003; 9: paper 163: 1-23.
[21] Xie, I. Information searching and search models. In: Bates, M. (ed) Understanding information retrieval systems: Taylor & Francis Group 2012; 31-46.
[22] Skov, M. & Ingwersen, P. Exploring information seeking behaviour in a digital museum context. In: Lalmas, L. & Tombros, A. (chairs), IIiX 2008. Proceedings of the second symposium on information interaction in context (IIiX): Europe. New York, NY, USA: ACM; 2008.