Preserving Qualitative Data: A Data Model to Prepare Computer Assisted Qualitative Data Analysis Software Data for Long-term Preservation

Umar Qasim
University of Alberta
Alberta, Canada

Kendall Roark
Purdue University
Indiana, USA

ABSTRACT
Rapid changes in technology have a great impact on long-term access to digital content. This makes the preservation of digital content a challenging task, because the content inherently depends on a specific hardware/software platform. When technology changes without support for backward compatibility, content becomes obsolete. Technological obsolescence is a known phenomenon, and a number of strategies have been proposed to reduce the impact of software and hardware obsolescence, including normalization, emulation and migration.

File format normalization is one of the preservation strategies that is widely discussed and used in the digital preservation community. In this strategy, digital objects of a specific type are converted into a single selected format that is thought to have a higher chance of remaining accessible in the future. This strategy has been used successfully with simpler file formats such as text, PDF and images, mainly because software libraries are available for normalizing these types of files. One major limitation of this strategy is that a large number of file formats are in use, and not every file type has supporting libraries available for conversion. An alternative is to perform the conversion with the original application, by exporting or saving the desired content in an industry-standard format. Unfortunately, this process depends on commercial vendors providing such support, which they do not always do. Data-driven applications such as Computer Assisted Qualitative Data Analysis Software (CAQDAS) are one example of applications that store data in complex file formats, and currently no libraries are available to normalize them. Some of these applications are proprietary, which further complicates the situation because their vendors do not always support converting files into a standard file format.

Under these circumstances, a deeper understanding of the data models in these complex data files helps in identifying the essential pieces of information needed for future access. In this poster, the authors propose a data model approach for CAQDAS applications that helps to extract the important pieces of information, while any gaps are covered with the necessary documentation. The proposed data model is intended to provide a way to extract and preserve all the information that is part of a CAQDAS application file, so that this information can later be assembled and viewed in any other current or future CAQDAS application. Currently, some of the major CAQDAS applications lack support for interoperability with other CAQDAS platforms. The proposed data model provides an alternative approach to making these CAQDAS applications interoperable.

To get a deeper understanding of the whole process, Roark (2015 forthcoming) conducted one-on-one interviews with researchers, and Qasim and Roark (2015) conducted both a pilot and a formal workshop on documenting and preserving CAQDAS projects at the University of Alberta. During the pilot and the formal workshop, the authors demonstrated how to take a CAQDAS project apart and capture all the important study documentation embedded in the project file. Participant feedback was solicited to improve the transformation process. In addition, current preservation strategies such as normalization, migration and emulation, and the contexts in which each might be used, were discussed for both proprietary and non-proprietary software. Furthermore, current best practices and workflows for quality assurance and documentation (metadata, provenance, codebooks, scripts) were reviewed, as was how to operationalize ethical and contractual commitments around data access and ownership in a data management plan and preservation practices.

In this poster, we would like to share the findings of our work on preserving qualitative research data and analysis documentation. We propose a data model driven approach to CAQDAS file preservation and provide guidance on how to extract the data model from both proprietary and non-proprietary file formats.

iPres 2015 conference proceedings will be made available under a Creative Commons license. With the exception of any logos, emblems, trademarks or other nominated third-party images/text, this work is available for reuse under a Creative Commons Attribution 3.0 unported license. Authorship of this work must be attributed.

Preserving Qualitative Data: A Data Model to Prepare Computer Assisted Qualitative Data Analysis Software Data for Long-term Preservation
Umar Qasim (University of Alberta), Kendall Roark (Purdue University)

ABSTRACT
Rapid change in technology has an impact on the long term access to digital content. This makes preservation a challenging task due to the dependence on specific hardware/software platforms. Tools are not always available to perform normalization on complex file formats such as qualitative data analysis software files. In this work, we are proposing a data model to normalize computer assisted qualitative data analysis software that preserves source data, the value added contextual documentation/analytic activity and the relationships between and within the two.

CHALLENGES
hardware/software dependence
no normalization method for project file
communication, repository & researchers
maintaining context and confidentiality

METHODS
semi-structured interviews
pilot workshops, pre & post test
interdisciplinary literature review
synthesis, collaborative concept mapping

Figure 1. Conceptual Model for Long-term Access & Use

Figure 1 represents a conceptual data model created by Roark in collaboration with Qasim. In addition to interviews and workshop pilot participants, this model is informed by the work of Cliggett (2013), Corti (2011) and Roorda and van den Heuvel (2012). This model highlights the value added processing and analysis (as annotation), the relationships and sources/data and draws on interdisciplinary discussions of open annotation.
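The three elements the conceptual model highlights (sources/data, value added annotation, and the relationships between them) can be illustrated with a small, application-neutral structure. The following is only a minimal sketch under stated assumptions, not the authors' data model: every class name, field name and the JSON layout (`Source`, `Annotation`, `Relation`, `Project`, `dangling_references`) is hypothetical and chosen for illustration. It shows one way extracted project content could be written to a plain format and later reassembled without loss.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Source:
    """A piece of source data, e.g. an interview transcript (hypothetical fields)."""
    source_id: str
    title: str
    path: str  # location of the normalized file inside the preserved package


@dataclass
class Annotation:
    """Value added analysis (a code, memo or comment) attached to a span of a source."""
    annotation_id: str
    source_id: str
    kind: str   # e.g. "code" or "memo"
    label: str
    start: int  # character offsets into the source text
    end: int


@dataclass
class Relation:
    """A typed link between any two sources or annotations."""
    subject_id: str
    predicate: str
    object_id: str


@dataclass
class Project:
    name: str
    sources: List[Source] = field(default_factory=list)
    annotations: List[Annotation] = field(default_factory=list)
    relations: List[Relation] = field(default_factory=list)

    def dangling_references(self) -> List[str]:
        """Return ids that are referenced but not defined anywhere in the project."""
        source_ids = {s.source_id for s in self.sources}
        known = source_ids | {a.annotation_id for a in self.annotations}
        missing = [a.source_id for a in self.annotations if a.source_id not in source_ids]
        for r in self.relations:
            missing.extend(i for i in (r.subject_id, r.object_id) if i not in known)
        return missing

    def to_json(self) -> str:
        """Serialize the whole project to plain, human-readable JSON."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "Project":
        """Rebuild a project from the JSON produced by to_json."""
        raw = json.loads(text)
        return cls(
            name=raw["name"],
            sources=[Source(**s) for s in raw["sources"]],
            annotations=[Annotation(**a) for a in raw["annotations"]],
            relations=[Relation(**r) for r in raw["relations"]],
        )


# Illustrative data only; none of these values come from the study.
project = Project(
    name="pilot-study",
    sources=[Source("s1", "Interview 01", "sources/interview01.txt")],
    annotations=[Annotation("a1", "s1", "code", "access barriers", 120, 245)],
    relations=[Relation("a1", "annotates", "s1")],
)
restored = Project.from_json(project.to_json())
```

Keeping relationships as separate records, rather than nesting annotations inside sources, is one possible design choice: it lets links run between annotations as well as between annotations and sources, and a check such as `dangling_references` can flag gaps that would need to be covered by documentation.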

Figure 2. Data Model for Preservation

NEXT STEPS
Roark will work with Qasim to develop case studies that can help translate potential researcher-identified problems into “data stories” for the repository development team. It may be useful to expand this work by directly involving developers and researchers in “future workshops” informed by the principles of participatory design.

RECOMMENDATIONS
think about re-use outside of CAQDAS
think about implications for preservation workflows
prioritize communication across fields of expertise
need for flexible approach

Figure 2 represents a data model created by Qasim in collaboration with Roark. In addition to workshop pilot participants, this model is informed by the requirements and existing workflows of an institutional repository, data asset management system and local deployment of the Archivematica software utilized to create archival information packages.

WORKS CONSULTED

1. Cliggett, L. (2013). Qualitative data archiving in the digital age: Strategies for data preservation and sharing. The Qualitative Report, 18(How To Art. 1), 1-11. Available at: http://www.nova.edu/ssss/QR/QR18/cliggett1.pdf
2. Corti, L.; Gregory, A. (2011). CAQDAS Comparability. What about CAQDAS Data Exchange? Forum Qualitative Sozialforschung / Forum: Qualitative Social Research, Vol 12, No 1. Available at: URN: http://nbn-resolving.de/urn:nbn:de:0114-fqs1101352
3. Roorda, D. and van den Heuvel, C. (2012). Annotation as a New Paradigm in Research Archiving. Proceedings of the American Society for Information Science and Technology, Volume 49, Issue 1, pages 1-10. arXiv:1412.6069v1
