.
.
Nursing Research March/April 2006 Vol 55, No 2S, S50–S56
Generating New Knowledge From Existing Data The Use of Large Data Sets for Nursing Research Tracy Magee
4
Susan M. Lee
4
b Background: An unprecedented amount of data from a variety of disciplines containing variables of interest to nursing are available to nurse researchers. In response, the use of large data sets is emerging as a legitimate method that can help facilitate the translation of knowledge to practice. b Objective: To explore the spectrum of methodological issues and practical applications encountered by three nurse researchers using secondary data analysis of three existing large data sets as a means to ask new questions and generate new nursing knowledge. b Methods: Three research studies using the analysis of three existing large data sets were described. The following are discussed: developing a theoretical framework, selecting an appropriate data set, operationalizing and measuring variables, preparing data for analysis, and identifying threats to validity and reliability. b Results: Although the use of existing data may shorten the time from question to answer, the research process remains the same. The three research studies were used to illustrate conceptual congruence, threats to internal and external validity, and threats to reliability and generalizability. b Discussion: Data obtained from a variety of disciplines and for a variety of reasons can and should be used to answer nursing practice and research questions. Using existing large data sets offers nurse researchers a unique opportunity to ask and answer questions that can affect how nurses care for patients in a time-effective and costefficient manner. Exploring the spectrum of methodological issues and practical applications involved in this work will help guide nurse researchers through the process. b Key Words: databases & nursing methodology research & nursing theory
T
he recent increase in the capabilities of digital technology has resulted in the ability to electronically capture and store enormous amounts of data that are growing at record speed in all disciplines (Berger & Berger,
S50
Karen K. Giuliano
4
Barbara Munro
2004). Existing large data sets, including administrative and patient-oriented data sets, may contain hundreds of variables of interest to nursing researchers. The opportunities to test nursing theories, generate knowledge for practice, and evaluate patient and nursing outcomes are unprecedented. In response, analysis of large data sets is emerging as a legitimate method of nursing research that may facilitate the translation of knowledge to practice. Secondary data analysis (SDA) is the use of existing data to address new research questions or methods (Black, 1995; Pollack, 1999). Some of the unique conceptual and methodological considerations of using existing data sets and of SDA have been addressed by nurse researchers, such as choosing a database (Moriarty et al., 1999), combining data sets (Orsi et al., 1999), weighing conceptual congruence (Shepard et al., 1999), handling statistical issues regarding complex sampling designs (Kneipp & Yarandi, 2002), and determining overall benefits and limitations (Smaldone & Connor, 2003). The purpose of this article is to describe the methodological issues and practical applications encountered by three nurse researchers during the SDA of three existing large data sets as a means to ask new questions and generate new nursing knowledge.
Methods In the first study, Magee used the National Longitudinal Survey of Youth 1979 (NLSY79) to identify the relationships within a proposed conceptual framework of childY motherYenvironment transactions, based on a transactional model of child development, and to identify the contribution
Tracy Magee, PhD, RN, CPNP, is Postdoctoral Fellow, Department of Maternal Child Nursing, University of Illinois at Chicago. Susan M. Lee, PhD, RN, NP-C, is Clinical Nurse Specialist, Knight Nursing Center for Clinical and Professional Development, Massachusetts General Hospital, Boston. Karen K. Giuliano, PhD, RN, FAAN, is Clinical Research Scientist, Philips Medical Systems, Andover, Massachusetts. Barbara Munro, PhD, RN, FAAN, is Dean and Professor, William F. Connell School of Nursing Boston College, Chestnut Hill, Massachusetts. Nursing Research March/April 2006 Vol 55, No 2S
Copyr ight © Lippincott Williams & Wilkins. Unauthorized reproduction of this article is prohibited.
Nursing Research March/April 2006 Vol 55, No 2S
of each variable to behavior problems of the school-age child (Magee, 2004; Magee & Roy, submitted for publication). The NLSY79 was initiated by the Department of Labor, Bureau of Labor Statistics, to gather information on young people aged 14 to 23 years in 1979 (Mott, 1995). In 1986, the National Institute of Child Health and Human Development began to sponsor supplemental surveys to collect data about the children born to women of the NLSY79. The supplemental child surveys consist of a battery of cognitive, socioemotional, and physiological assessments that have been administered every other year since 1986. In the second study, Giuliano used the Project IMPACT data set to assess the clinical relevance of the early physiologic screening criteria advocated by Early Goal-Directed Therapy for Sepsis, the Surviving Sepsis Campaign guidelines, and the Institute for Healthcare Improvement Sepsis Bundle (Giuliano, 2004; Giuliano, in press). The international Project IMPACT data set was created in conjunction with the Society of Critical Care Medicine (SCCM) and nearly 100 SCCM multidisciplinary critical care experts. The data set represents a large and diverse set of critically ill patients. One of the purposes of the Project IMPACT data set was to give critical care clinicians access to the data for critical care research. The validity of the data set has been established (Cook, Visscher, Hobbs, & Williams, 2002). In the third study, Lee used the Behavioral Risk Factor Surveillance System (BRFSS) to assess which personal factors of the Health Promotion Model (HPM) predicted the likelihood of persons with diabetes participating in diabetes education (Lee, 2005). The BRFSS is an annual telephone survey of persons in the United States and its territories conducted by the National Center for Chronic Disease Prevention and Health Promotion, Centers for Disease Control, and is designed to collect data on behavioral determinants of health. The studies will be referred to respectively as the child behavior study, the sepsis study, and the diabetes education study (Table 1).
Results Using existing data to ask and answer a research question may seem less demanding than asking the question and collecting the data to answer the question. However, to conduct a rigorous secondary analysis, the nurse researcher must give careful consideration to issues that could threaten the validity, reliability, and generalizability of the study. Use of the following strategies will help insure the rigor of the secondary analysis. Theoretical Framework The use of any large data set to generate new knowledge requires a conceptual match between the primary data collection and the use of the existing data to ensure the rigor of the research findings. All research should begin with a hierarchical structure to include a conceptual model, a theory, and an empirical research method (Fawcett, 2004). There must be a conceptual fit between the original data collection and the secondary analysis (Orsi et al., 1999). A conceptual model is a way of organizing and classifying concepts into a useful structure (Kim, 1996) to
New Knowledge From Existing Data In Nursing Research S51
view phenomena. A good conceptual fit should focus only on the phenomena that are relevant. There was a conceptual match between the NLSY79 and the child behavior study; both reflected concepts of transactions among individuals, families, and their environments over time. The Project IMPACT data set contained all the variables of interest for the sepsis study and was therefore a good conceptual match. There was also a good conceptual match between the BRFSS data set and the diabetes education study because all the variables of interest for the study were included as part of the BRFSS. In the conceptualYtheoreticalYempirical hierarchical structure, the theory used to guide a study is a reflection of the conceptual framework. The importance of a theory to guide a secondary analysis cannot be overstated. Large data sets may be used to test propositions of a theory or explore relationships between concepts or variables of a theory (Polit & Hungler, 2002). Therefore, the theory used to guide the study should direct the empirical method and the research question to be asked and, in turn, inform the selection of concepts or variables. For example, the use of the HPM (Pender, Murdaugh, & Parsons, 2002) to guide the diabetes education study influenced the research question: ‘‘To what extent do the personal factors of the HPM, biological factors (age, gender, body mass index [BMI], comorbidities, disabilities, insulin use), sociocultural factors (marital status, education, race, ethnicity, employment, income, insurance), and psychological factors (mental health, perceived health status) predict the likelihood of persons with diabetes participating in diabetes education?’’ The variables deemed important to the study, derived from diabetes education literature and HPM literature, framed the selection of variables from the BRFSS data to those that reflected personal factors as outlined in the HPM. With so many variables from which to choose in large data sets, the impulse to select interesting variables can be almost irresistible. However, ignoring the importance of a conceptual match and a theoretical framework may result in the inclusion of variables that are not pertinent to answering the research question and will dilute the focus of the research and may add potential theoretical and methodological flaws to the study. The conceptual fit, the theoretical framework, and the research question should be used together to limit the selection of variables to only those items or variables that are conceptually meaningful. Whether the researcher selects the data set first or asks a research question first, a conceptual match between the data and the research question will guard against threats to validity and reliability and will increase the ability to generalize the findings. Strategies to Minimize Error Measurement, inherent in all research, is not just discovering the differences or similarities between phenomena but also minimizing error at all phases of the research process (Polit & Hungler, 2002). Minimizing error is of particular importance when using a large data set because of the lack of a priori controls specific to the research being conducted. As discussed above, a conceptual match between the primary data collection and the existing data is an imperative first step to minimize error and increase the
Copyr ight © Lippincott Williams & Wilkins. Unauthorized reproduction of this article is prohibited.
S52 New Knowledge From Existing Data In Nursing Research
Nursing Research March/April 2006 Vol 55, No 2S
q
TABLE 1. Large Data Sets and Nursing Research: A Brief Review of Three Studies
Title
Data set
Sample (n)
Theoretical framework
Analytic strategies
Behavior Problems in Childhood: Testing an Interactive Model
National Longitudinal Survey 1979 Children and Youth 1992Y1998 survey Sponsored by United States Bureau of Labor
721
Transactional model of child development
Multiple regression
The Use of Common Physiologic Monitoring Parameters in the Identification of Critically Ill Patients at Risk for Sepsis
Project IMPACT Ongoing (begun in 1996) International data set (primarily US) Critically ill adult patients
727
Biophysical, human pathophysiology
Logistic regression
Participation in Diabetes Education: A Secondary Analysis of the Behavioral Risk Factor Surveillance Survey 2003
Behavioral Risk Factor 1,924 Surveillance Survey, 2003 Sponsored by National Center for Chronic Disease Prevention and Health Promotion, Centers for Disease Control
HPM
Logistic regression
Research question To what extent do the child variables of age, gender, temperament, motor and cognitive development, chronic illness, and prenatal influences; parent variables of age, education, and parenting ability; and environmental variables of income and social support in children from 1 year to 4 years old explain the variance in child behavior problems in young school age children? Based on measurement in the first 24 hr of the critical care admission, can a combination of the physiologic parameters of heart rate, mean arterial blood pressure, body temperature, respiratory rate, and urine output be used to distinguish between critically ill adult patients with a diagnosis of sepsis and those without a diagnosis of sepsis? To what extent do personal factors of the HPM, biological factors (age, gender, BMI, comorbidities, disabilities, insulin use), sociocultural factors (marital status, education, race, ethnicity, employment, income, insurance), and psychological factors (mental health, perceived health status) predict the likelihood of persons with diabetes participating in diabetes education?
Note. HPM = Health Promotion Model.
validity or level of control present in the research. Without a conceptual match, the ability to have either valid or reliable data is greatly minimized. Poor selection of the variables and data from which the analysis will occur poses a significant threat to both external and internal validity. If the researcher first asks a question and then looks for the data set from which to answer the question, then external validity is threatened because the data set may or may not adequately represent the target population that is most pertinent to the research question. Conversely, if the re-
searcher obtains a data set and then asks the question, internal validity is potentially threatenedVthe data set may or may not have the robustness needed for appropriate control of any confounding variables or influences that may affect the outcome measure. For example, the child behavior study was grounded on the assumption that the simultaneous transactions among child, parent, and environment influence child development (Sameroff, 1975). The NLSY79 was a good conceptual match to answer the research question because it was possible to query child,
Copyr ight © Lippincott Williams & Wilkins. Unauthorized reproduction of this article is prohibited.
Nursing Research March/April 2006 Vol 55, No 2S
parent, and environmental variables simultaneously. It also contained all of the variables important to control for numerous potential confounding influences pertinent to the research question being asked. Had that not been the case, the NLSY79 would not have been a good match for the child behavior study. Using large data sets does not eliminate the need for data management, particularly data cleaning and preparation. In fact, the effort required at this stage of the research process is likely to be substantially more than that required in a prospectively designed study. Preparing existing large data sets for statistical analysis may be a multiple-step process. For instance, in the sepsis study, the data were provided by the Project IMPACT data set in Microsoft Access format and in three separate files. Because of the architecture of the Microsoft Access software, importing data directly into SPSS (Version 10, the software used for the statistical analysis) was not possible. As a result, substantial data preparation was required before conducting any analyses. First, the data from the three files were merged, so that all of the variables of interest were contained in a single file. Next, the data were moved from Access to Microsoft Excel, and then from Microsoft Excel into SPSS. The final data transfer was done in several stages because Microsoft Excel is only able to import a maximum of 65,000 records during a single session. Because of this technical difficulty, as well as the likelihood of introducing error at any one or more of those points during data transfer, it was necessary for the researcher to employ the assistance of a statistical consultant for this part of the data preparation. Once the data are in an appropriate statistical package, further cleaning may be required to make data meaningful. For example, in the sepsis study, some of the core body temperature measurements that were used had been entered into the data set in degrees centigrade, and others had been entered in degrees Fahrenheit. Because of this inconsistency in data entry, several additional calculations had to be performed to create a new temperature variable to record temperature using a consistent metric. Another example from the sepsis study with regard to data management was the need to create two subject comparison groups based on illness acuity before running the analyses. This step was made significantly more complicated by using a large data set. A stratified random sampling strategy was used, which required many more steps than that if a prospectively collected sample had been used. Missing data is an issue in all researches. Large data sets often contain missing data disguised in responses such as ‘‘don’t know’’ or ‘‘not sure.’’ In the diabetes education study and in the child behavior study, these responses were recoded as missing data (Schafer & Graham, 2002; Wang & Fan, 2004). Another potential downfall of existing data is that important variables may not be available at all for analysis due to missing data. In the sepsis study, all the necessary data related to patient weight were virtually missing because that particular variable was not a mandatory data entry field in the data set; as a result, one of the research questions could not be answered.
New Knowledge From Existing Data In Nursing Research S53
Study validity is the measure of the accuracy of the claim and is a direct reflection on the propositions from which the study was developed (Burns & Groves, 2001). This aspect of research is particularly vulnerable when using large data sets because, generally, the researcher has limited or no ability to establish or assess the accuracy of measures independently. For example, in the sepsis study and the child behavior study, the use of existing data sets precluded the researchers’ ability to determine the accuracy of any instrumentation used for data collection or the accuracy of any of the physiologic data collected. In the sepsis study, because the data set provided only intermittent rather than continuous measurements of the physiologic data, it was possible that important trends or patterns for the early identification of patients with sepsis were missed, an issue that could have been better controlled in a prospective study design. External validity, a design characteristic, is influenced by judgments that the researcher makes during sampling procedures. In contrast to the possible limitations regarding internal validity when using large data sets, external validity has the potential to be stronger. Because many large data sets use complex sampling designs, the results obtained using large data set analysis may be more generalizable than many smaller individual prospective studies. However, because the sample sizes can be so large, the researcher must be particularly cognizant of artificial inflation of the significance of the results. When reporting the results, the significance values should be carefully interpreted, appropriate corrections to protect against Type I error (e.g., Bonferroni or Tukey) should be used as needed, and an assessment between statistical and clinical significance should be discussed explicitly. In the child behavior study, only 8% of the variance in child behavior problems was accounted for by the study’s conceptual model; however, the conceptual model was statistically significant in large part because of the large sample size (n = 720). This was true also of the sepsis study, where the model was significant (n = 727), although only 13.1% of the variance was accounted for by the model. In the sepsis study, the number of patients with an admitting diagnosis of sepsis in a data set of approximately 120,000 critically ill patients was 363 (G0.003%). Given the projection by Angus et al. (2001) of 750,000 new cases of severe sepsis annually in the United States, as well as recent data suggesting that sepsis is the fourth most common principal diagnosis in hospitalized patients, this number was expected to be much higher. Perhaps, the incidence and prevalence of sepsis is underreported in some large national databases or may be embedded in other diagnostic categories. However, the accuracy of the diagnosis of sepsis in this data set could threaten the internal and external validity of the study. If the inaccuracy of the diagnosis was caused by underreporting of sepsis in the critically ill patient or by misdiagnosis, then internal validity is threatened (Cook & Campbell, 1979) because there are differences between the kinds of participants in the group. For this study, the information in the data set did not match norms from other sources, creating a possible threat to external validity or ability to generalize the findings of this study.
Copyr ight © Lippincott Williams & Wilkins. Unauthorized reproduction of this article is prohibited.
S54 New Knowledge From Existing Data In Nursing Research History is a particularly relevant threat to internal validity when using an existing large data set, especially data sets that are a combination of successive years of data. For example, the NLSY79 has assessed children every other year since 1986. However, although children born in 1990 and children born in 1998 received the same assessments, and these assessments could be compared statistically, in the 8 years between 1990 and 1998, there were many societal changes that may have had effects on the dependent variable of child behavior by affecting the child’s home and school environments. In existing large data sets using established instruments, reliabilities (retest, alternative form, split halves, or internal consistency) provided by the instrument’s author, the reliabilities provided by the primary research, and the reliabilities provided by the current sample must be considered carefully. In the child behavior study, not all the child assessments were reported as statistically reliable in the primary research, particularly the temperament scales. Because of this, the researcher was only able to use one subscale to operationalize the concept of temperament. However, two subscales measuring aspects of parenting ability and environmental stimulation were reported as reliable but were not internally consistent in the study author’s sample. Therefore, subscales from two assessment years (1994 and 1996) were added together to achieve internal consistency for the child behavior research sample. Another issue related to the validity and reliability of the concept of chronic illness was encountered during the child behavior study. The researcher had to create and test a new scale from individual questions to reflect that concept adequately. For example, the chronic illness scale was created using questions pertaining to health issues related to specialized medical supervision, medication, or special equipment. The questions were reviewed by two experienced advanced practice nurses to assess content validity. Variables were then recoded into dichotomous variables so that the higher scores indicated more severe chronic illness. Testing showed that the new scale was psychometrically sound, with a Cronbach’s " of greater or equal to .70. As the use of existing large data sets becomes more popular in nursing research, consensus is needed regarding the use of weighted versus nonweighted data. Because nonweighted data are easier to analyze and interpret, most researchers will choose these data when appropriate. The use of nonweighted data is permissible when exploring associations among variables, especially in novel or exploratory research (Kneipp & Yarandi, 2002). For example, the aim of the child behavior study was to test an interactive model. Although weighted data were available in the NLSY data set, nonweighted data were used because the research was exploratory. Sample weights do not affect means, and some research can tolerate minor statistical imprecision of confidence intervals for practical significance. The use of nonweighted data allows the use of standard statistical software such as SPSS or SAS, thus facilitating the analysis of large data sets, knowledge discovery, and translation into practice. On the other hand, authors have demonstrated the pitfalls of using standard statistical software when analyz-
Nursing Research March/April 2006 Vol 55, No 2S
ing large data sets, which can result in the underestimation of the true variance (narrow confidence intervals) and erroneously small p values, increasing the likelihood of concluding a difference between groups when no difference actually exists (Kneipp & Yarandi, 2002). Survey sample software, such as SUDAAN (Research Triangle Institute, Research Triangle Park, North Carolina), adjusts for the sampling design and sampling weights and generally provides the greatest degree of accuracy. The advantages of using weighted data include accurate hypothesis test values and better generalizability of the findings to the total population. Using weighted data and survey sample software retains the inherent power in a probability sample, whereas using unweighted data, in effect, can render the large data set a convenience sample. Nurse researchers who desire to analyze weighted data need access to all sample weights, survey sample software, and often, statistical consultation. It is helpful for researchers to consult with the personnel responsible for creating and managing the data set regarding the issue of weighted data and appropriate statistical software. As nursing research using existing large data sets becomes increasingly more common, nurse researchers will look to the literature for guidance as they consider more complex sampling designs. Meanwhile, it is important for researchers to be aware of the statistical implications of complex samples and to include in their manuscripts their reasons for using weighted or unweighted data and to describe the population to which their findings can be generalized (Caplan, Slade, & Gansky, 1999; Kneipp & Yarandi, 2002). For example, in the diabetes education study, the researcher chose not to use weighted data in the analysis and defined the population as the total number of persons with diabetes in the BRFSS. A 10% sample was chosen and compared with the total population on demographic and other key variables to understand best how that sample compared with the total population. Because the sample was found to be highly representative of the diabetes population sample, it was reasonable to conclude that the findings could be generalized to the total population.
Discussion Contribution to Nursing Science Through an SDA of large data sets, nurse researchers have the opportunity to accelerate the pace of nursing knowledge development in two broad areas: nursing theory and nursing practice. Assuming that the primary function of nursing research is to generate or test nursing theories (Fawcett, 1995), the many variables and participants contained in large data sets provide more opportunities than ever before to accomplish that goal. Secondary analysis of large data sets will continue to be an increasingly common method for contributing to nursing research knowledge when the inquiry is conducted through the lens of nursing theory. Use of existing large data sets can accelerate the pace of research by saving time and resources on data collection and allowing those resources to be applied to other aspects of the research process. Use of these data sets allows
Copyr ight © Lippincott Williams & Wilkins. Unauthorized reproduction of this article is prohibited.
Nursing Research March/April 2006 Vol 55, No 2S
researchers to ask complicated questions, use more variables, and study more representative samples than could generally be included using the researcher’s own geography. This is not to imply that the use of existing data is a quick endeavor. Substantial time is devoted to familiarization with the data sets and specialized software, achieving congruence between study variables and theoretical constructs, cleaning and preparing the variables for analysis, recoding and creating new variables, and creating new subscales to better represent the pertinent theoretical constructs. Nevertheless, an existing data set may help expedite the time from inquiry to knowledge discovery and, in turn, offer theory-guided, evidence-based knowledge that is more quickly available for use in practice. Finally, the discipline of nursing will benefit by having its scholars become more knowledgeable regarding the use of large data sets not only for research but also for influencing policy and practice. A notable example of this is research by Aiken, Clarke, Cheung, Sloane, and Silber (2003), who used multiple large data sets to demonstrate a decrease in patient mortality in hospital units where higher proportions of nurses holding bachelor’s degrees were employed. This study has been widely disseminated and serves as an example of how existing knowledge not specific to nursing was used to generate new knowledge that had far-reaching effects across many disciplines.
Recommendations The conceptual and practical issues faced in these three studies lead to the following recommendations to nurse researchers who contemplate using large data sets. Choose the best data set to answer your questions. This may not be the current data set or even a healthcare data set, but it should be the richest set you can find that has the most theoretical congruence to your research. No matter which data set you choose, concessions may be required when variables of interest are absent or unusable or when the level of measurement does not support the assumptions of the statistical analysis that are necessary for your research. Test nursing theory. To advance the knowledge of the discipline, use nursing theory or a theory with application to nursing practice to facilitate the advancement of nursing knowledge. Conduct a small pilot study. Whenever possible, conduct a pilot study with the data or past versions of the data to discover if there are any problems with the variables. Determine if the level of data is adequate for your proposed analysis, if key variables needed to answer the research question are missing or severely skewed, whether or not you can work with the data in the form that they are provided, and if there are data to create any new variables needed to best support your research. Identify threats to validity and reliability. Determine if the measures are accurate, whether accuracy can be tested, the level of data, the nature of the sample included in the data set, and how generalizable the results are likely to be. Select variables that are congruent with the theory being tested and only those variables that are necessary to answer the research questions. You may add some dupli-
New Knowledge From Existing Data In Nursing Research S55
cate variables, in case any of the primary variables are skewed or unusable. Use computations to create new variables using the data provided by other variables in the data set. Resist the urge to add interesting but not theoretically grounded variables. Seek statistical consultation early. Secondary analysis of large data sets provides nurse researchers with unprecedented power in the analyses and may enhance opportunities to publish findings across disciplines. Statisticians can assist with unfamiliar software or analyses.
Conclusion The SDA of existing large data sets is one way to facilitate the research process to accelerate the pace of research from knowledge generation to application in practice. The time and effort it takes to translate current research knowledge into practice is an ongoing issue that should be addressed if patients are to receive the best care possible, and much more effort should be expended in this area by nursing researchers (Giuliano, 2003). The use of large data sets provides an opportunity for nurses to participate in interdisciplinary research and knowledge development while retaining and emphasizing the core dimensions of nursing. In the postmodern era, primarily based on the complexity of humans and their responses to the environment, wide-ranging development and advancement of nursing knowledge are not possible without embracing methodological pluralism. Hence, secondary analysis of large data sets can be an indispensable methodology for generating knowledge that is theory guided, evidence based, and pertinent to translation into nursing practice. q Accepted for publication December 14, 2005. Corresponding author: Tracy Magee, PhD, RN, CPNP, Department of Maternal Child Nursing, University of Illinois at Chicago, Chicago, IL 60637 (email:
[email protected]).
References Aiken, L. H., Clarke, S. P., Cheung, R. B., Sloane, D. M., & Silber, J. H. (2003). Educational levels of hospital nurses and surgical patient mortality. JAMA, 290, 1617Y1623. Angus, D. C., Linde-Zwirble, W. T., Lidicker, J., Clermont, G., Carcillo, J., & Pinsky, M. R. (2001). Epidemiology of severe sepsis in the United States: Analysis of incidence, outcome, and associated costs of care. Critical Care Medicine, 29, 1303Y1310. Berger, A. M., & Berger, C. R. (2004). Data mining as a tool for research and knowledge development in nursing. Computer Informatics in Nursing, 22, 123Y131. Black, C. (1995). Using existing data sets to study aging and the elderly: An introduction. Canadian Journal on Aging, 14, 135Y150. Burns, N., & Groves, S. (2001). The practice of nursing research. Philadelphia, PA: Saunders. Caplan, D. J., Slade, G. D., & Gansky, S. A. (1999). Complex sampling: Implications for data analysis. Journal of Public Health Dentistry, 59, 52Y59. Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design and analysis issues for field settings. Boston, MA: Houghton Mifflin Company. Cook, S. F., Visscher, W. A., Hobbs, C. L., & Williams, R. L. (2002). Project IMPACT: Results from a pilot validity study of
Copyr ight © Lippincott Williams & Wilkins. Unauthorized reproduction of this article is prohibited.
S56 New Knowledge From Existing Data In Nursing Research a new observational database. Critical Care Medicine, 30, 2765Y2770. Fawcett, J. (1995). Analysis and evaluation of conceptual models of nursing (3rd ed.). Philadelphia: F. A. Davis. Fawcett, J. (2004). Contemporary nursing knowledge: Analysis and evaluation of nursing models and theories (2nd ed.). Philadelphia: F. A. Davis. Giuliano, K. K. (2003). Expanding the use of empiricism in nursing: Can we bridge the gap between knowledge and clinical practice? Nursing Philosophy, 4, 44Y52. Giuliano, K. K. (2004). The use of common physiologic monitoring parameters in the identification of critically ill patients at risk for sepsis. Critical Care Medicine, 32(12), A149. Giuliano, K. K. (in press). Monitoring critically ill patients at risk for sepsis: What does the evidence show? AACN Clinical Issues. Kim, H. S. (1996). Challenge of new perspectives: Moving toward a new epistemology for nursing. Paper presented at the Nursing Knowledge Impact Conference, Boston, MA. Kneipp, S. M., & Yarandi, H. N. (2002). Complex sampling designs and statistical issues in secondary analysis. Western Journal of Nursing Research, 24, 552Y566. Lee, S. (2005). Participation in diabetes education: A secondary analysis of the Behavioral Risk Factor Surveillance Survey 2003 (Doctoral Dissertation, Boston College, 2005). Dissertation Abstracts International, 54(01), 534B. (UMI No 9315947). Magee, T. (2004). Behavior problems in childhood: Testing an interactive model (Doctoral Dissertation, Boston College, 2005). Dissertation Abstracts International, 54(01), 534B. (UMI No. 9315947). Magee, T., & Roy, C. (2005). Predicting school age behavior problems: The role of early childhood risk factors. Manuscript submitted for publication.
Nursing Research March/April 2006 Vol 55, No 2S
Moriarty, H. J., Deatrick, J. A., Mahon, M. M., Feetham, S. L., Carroll, R. M., Shepard, M. P., et al. (1999). Issues to consider when choosing and using large national databases for research of families. Western Journal of Nursing Research, 21, 143Y153. Mott, F. (1995). The children of the NLSY 1992: Description and evaluation. Columbus, OH: Center for Human Resource Research, Ohio State University. Orsi, A. J., Grey, M., Mahon, M. M., Moriarty, H. J., Shepard, M. P., & Carroll, R. M. (1999). Conceptual and technical considerations when combining large data sets. Western Journal of Nursing Research, 21, 130Y142. Pender, N. J., Murdaugh, C. L., & Parsons, M. A. (2002). Health promotion in nursing practice (4th ed.). Upper Saddle River, NJ: Haworth Press. Polit, D., & Hungler, B. (2002). Nursing research: Principles and methods (7th ed.). Philadelphia: Lippincott. Pollack, C. D. (1999). Methodological considerations with secondary analysis. Outcomes Management for Nursing Practice, 3, 147Y152. Sameroff, A. (1975). Transactional models in early social relations. Human Development, 18, 65Y79. Schafer, J. L., & Graham, J. W. (2002). Missing data: Our view of the state of the art. Psychological Methods, 7, 147Y177. Shepard, M. P., Carroll, R. M., Mahon, M. M., Moriarty, H. J., Feetham, S. L., Deatrick, J. A., et al. (1999). Conceptual and pragmatic considerations in conducting a secondary analysis: An example from research of families. Western Journal of Nursing Research, 21, 154Y167. Smaldone, A. M., & Connor, J. A. (2003). Ask the expert. The use of large administrative data sets in nursing research. Applied Nursing Research, 16, 205Y207. Wang, L., & Fan, X. (2004). Missing data in disguise and implications for survey data analysis. Field Methods, 16, 332Y351.
Copyr ight © Lippincott Williams & Wilkins. Unauthorized reproduction of this article is prohibited.