Preserving Qualitative Data: A Data Model to Prepare Computer Assisted Qualitative Data Analysis Software Data for Long-term Preservation
Umar Qasim, University of Alberta, Alberta, Canada
Kendall Roark, Purdue University, Indiana, USA
ABSTRACT Rapid changes in technology have a great impact on long-term access to digital content. This makes preserving digital content a challenging task because of the content's inherent dependency on a specific hardware/software platform. Changes in technology that come without support for backward compatibility force content to become obsolete. Technological obsolescence is a known phenomenon, and a number of strategies have been proposed to reduce the impact of software and hardware obsolescence, including normalization, emulation and migration.
File format normalization is one of the preservation strategies widely discussed and used in the digital preservation community. In this strategy, digital objects of a specific type are converted into a single selected format that is thought to have a higher chance of remaining accessible in the future. This strategy has been used successfully with simpler file formats such as text, PDF and images, mainly because software libraries are available for normalizing these types of files. One major limitation of this strategy is that a large number of file formats are in use and not every file type has supporting libraries available for conversion. An alternative is to perform the conversion with the original application by exporting or saving the desired content in an industry-standard format. Unfortunately, this process depends on commercial vendors providing such support, which they do not always do. Data-driven applications such as Computer Assisted Qualitative Data Analysis Software (CAQDAS) are one example of applications that store data in complex file formats, and currently no libraries are available to normalize them. Some of these applications are proprietary, which further complicates the situation because their vendors do not always support converting files into a standard file format.
Under these circumstances, a deeper understanding of the data models in these complex data files helps in identifying the essential pieces of information needed for future access. In this poster, the authors propose a data model approach for CAQDAS applications that helps extract the important pieces of information, with any gaps covered by the necessary documentation. The proposed data model is intended to provide a way to extract and preserve all the information that is part of a CAQDAS application file, so that this information can later be assembled and viewed in any other current or future CAQDAS application. Currently, some of the major CAQDAS applications lack support for interoperability with other CAQDAS platforms. The proposed data model provides an alternative approach to making these CAQDAS applications interoperable.
To gain a deeper understanding of the whole process, Roark (2015, forthcoming) conducted one-on-one interviews with researchers, and Qasim and Roark (2015) conducted both a pilot and a formal workshop on documenting and preserving CAQDAS projects at the University of Alberta. During the pilot and the formal workshop, the authors demonstrated how to take a CAQDAS project apart and capture all the important study documentation embedded in the project file. Participant feedback was solicited to improve the transformation process. In addition, current preservation strategies such as normalization, migration and emulation, and the contexts in which each might be used, were discussed for both proprietary and nonproprietary software. Furthermore, current best practices and workflows for quality assurance and documentation (metadata, provenance, codebooks, scripts) were reviewed, as well as how to operationalize ethical and contractual commitments around data access and ownership in a data management plan and preservation practices.
iPres 2015 conference proceedings will be made available under a Creative Commons license. With the exception of any logos, emblems, trademarks or other nominated third-party images/text, this work is available for reuse under a Creative Commons Attribution 3.0 unported license. Authorship of this work must be attributed. View a copy of this licence.
In this poster, we share the findings of our work on preserving qualitative research data and analysis documentation. We propose a data-model-driven approach to CAQDAS file preservation and provide guidance on how to extract the data model from both proprietary and nonproprietary file formats.
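To make the idea of a data-model-driven approach more concrete, the following is a minimal sketch of how the contents of a CAQDAS project might be captured in an application-neutral structure and written out as JSON. It is not the authors' data model: every class and field name here (`Source`, `Code`, `CodedSegment`, `Memo`, `Project`, `excerpts_for_code`) is a hypothetical illustration, chosen only to show source data, analytic activity and the links between them being kept together.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Source:
    """A piece of source data, e.g. an interview transcript (hypothetical shape)."""
    source_id: str
    title: str
    text: str


@dataclass
class Code:
    """An entry in the codebook; parent_id allows a simple code hierarchy."""
    code_id: str
    name: str
    description: str = ""
    parent_id: Optional[str] = None


@dataclass
class CodedSegment:
    """Links a code to a character range [start, end) within a source."""
    source_id: str
    code_id: str
    start: int
    end: int


@dataclass
class Memo:
    """Free-text analytic note attached to any sources or codes by id."""
    memo_id: str
    text: str
    linked_ids: List[str] = field(default_factory=list)


@dataclass
class Project:
    name: str
    sources: List[Source] = field(default_factory=list)
    codes: List[Code] = field(default_factory=list)
    segments: List[CodedSegment] = field(default_factory=list)
    memos: List[Memo] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the whole project to a plain-text, tool-independent form."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    def excerpts_for_code(self, code_id: str) -> List[str]:
        """Reassemble the coded text for one code from the stored ranges."""
        texts = {s.source_id: s.text for s in self.sources}
        return [
            texts[seg.source_id][seg.start:seg.end]
            for seg in self.segments
            if seg.code_id == code_id
        ]


def example_project() -> Project:
    """Build a tiny invented project purely for demonstration."""
    return Project(
        name="Example study",
        sources=[Source("s1", "Interview 01", "I felt isolated after moving.")],
        codes=[Code("c1", "isolation", "Mentions of social isolation")],
        segments=[CodedSegment("s1", "c1", 2, 15)],
        memos=[Memo("m1", "Isolation appears early in the interview.", ["s1", "c1"])],
    )


if __name__ == "__main__":
    print(example_project().to_json())
```

Storing coded segments as ranges that point into the source text, rather than as copied excerpts, is one way to keep the relationship between data and analysis explicit so that another tool could rebuild it later.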
Preserving Qualitative Data: A Data Model to Prepare Computer Assisted Qualitative Data Analysis Software Data for Long-term Preservation
Umar Qasim (University of Alberta), Kendall Roark (Purdue University)
ABSTRACT Rapid change in technology has an impact on the long term access to digital content. This makes preservation a challenging task due to the dependence on specific hardware/software platforms. Tools are not always available to perform normalization on complex file formats such as qualitative data analysis software files. In this work, we are proposing a data model to normalize computer assisted qualitative data analysis software that preserves source data, the value added contextual documentation/analytic activity and the relationships between and within the two.
CHALLENGES hardware/software dependence; no normalization method for project file; communication, repository & researchers; maintaining context and confidentiality
METHODS semi-structured interviews; pilot workshops, pre & post test; interdisciplinary literature review; synthesis, collaborative concept mapping
Figure 1. Conceptual Model for Long-term Access & Use
Figure 1 represents a conceptual data model created by Roark in collaboration with Qasim. In addition to interviews and workshop pilot participants, this model is informed by the work of Cliggett (2013), Corti and Gregory (2011) and Roorda and van den Heuvel (2012). This model highlights the value added processing and analysis (as annotation), the relationships and sources/data and draws on interdisciplinary discussions of open annotation.
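As an illustration of treating analysis as annotation, the sketch below expresses one coded text segment in the style of the W3C Web Annotation Data Model, the successor to the Open Annotation work. The poster does not name a specific serialization, so the choice of this model, the function name `coded_segment_as_annotation` and the example.org URIs are all assumptions made for the example.

```python
import json
from typing import Optional


def coded_segment_as_annotation(
    annotation_id: str,
    source_uri: str,
    start: int,
    end: int,
    code_label: str,
    note: Optional[str] = None,
) -> dict:
    """Return a Web Annotation style dict for one coded range of a source.

    The code label is carried as a tagging body; an optional analyst note
    is carried as a commenting body. The target points at a character
    range of the source via a TextPositionSelector.
    """
    bodies = [{"type": "TextualBody", "value": code_label, "purpose": "tagging"}]
    if note:
        bodies.append({"type": "TextualBody", "value": note, "purpose": "commenting"})
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": annotation_id,
        "type": "Annotation",
        "body": bodies,
        "target": {
            "source": source_uri,
            "selector": {"type": "TextPositionSelector", "start": start, "end": end},
        },
    }


if __name__ == "__main__":
    # Hypothetical identifiers, for demonstration only.
    annotation = coded_segment_as_annotation(
        "http://example.org/anno/1",
        "http://example.org/sources/interview-01.txt",
        2,
        15,
        "isolation",
        note="First mention of isolation.",
    )
    print(json.dumps(annotation, indent=2))
```

Because the annotation only references the source by URI and character range, the source data and the analytic layer can be stored and access-controlled separately while the relationship between them is preserved.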
Figure 2. Data Model for Preservation
NEXT STEPS Roark will work with Qasim to develop case studies that can help translate potential researcher identified problems into “data stories” for the repository development team. It may be useful to expand this work by directly involving developers and researchers in “future workshops” informed by the principles of participatory design.
RECOMMENDATIONS think about re-use outside of CAQDAS; think about implications for preservation workflows; prioritize communication across fields of expertise; need for flexible approach
Figure 2 represents a data model created by Qasim in collaboration with Roark. In addition to workshop pilot participants, this model is informed by the requirements and existing workflows of an institutional repository, data asset management system and local deployment of the Archivematica software utilized to create archival information packages.
WORKS CONSULTED
1. Cliggett, L. (2013). Qualitative data archiving in the digital age: Strategies for data preservation and sharing. The Qualitative Report, 18(How To Art. 1), 1-11. Available at: http://www.nova.edu/ssss/QR/QR18/cliggett1.pdf
2. Corti, L.; Gregory, A. (2011). CAQDAS Comparability. What about CAQDAS Data Exchange? Forum Qualitative Sozialforschung / Forum: Qualitative Social Research, Vol 12, No 1. Available at: URN: http://nbn-resolving.de/urn:nbn:de:0114-fqs1101352
3. Roorda, D. and van den Heuvel, C. (2012). Annotation as a New Paradigm in Research Archiving, Proceedings of the American Society for Information Science and Technology; Volume 49, Issue 1, pages 1-10, 2012, arXiv:1412.6069v1