Validation of self-assessment protocols in languages different from the original version
Mara Behlau, PhD and Gisele Gasparini, MSc
Speech-language pathologists, Consultants in Human Communication – Corporate SLP - “Centro de Estudos da Voz – CEV”, São Paulo, Brazil
Recently, the World Health Organization (WHO) broadened its concept of health to include aspects of quality of life in its definition of complete physical, mental and social well-being 1. According to the WHO, health and treatment outcome evaluation must include not only indicators of the severity and frequency of disease, but also an estimate of well-being, which can be measured by evaluating the individual’s quality of life. The organization defines quality of life as individuals’ perception of their position in life in the context of the culture and value systems in which they live and in relation to their goals, expectations, standards and concerns 2,3. This is a broad concept that may be affected in many different ways by the individual’s physical health, psychological state, level of independence, social relations and personal beliefs, as well as by characteristics of the environment 1.
Quality of life is evaluated mainly by means of questionnaires, many of which were developed in English and directed at English-speaking populations. For these instruments to be used in other languages, they must be translated and adapted according to international guidelines, and their measurement properties must be demonstrated in the specific cultural context 4,5. The instrument must be culturally adapted, carefully translated and tested, avoiding literal translation that ignores cultural and social contexts 5. Instruments must be tested in order to demonstrate their validity, reliability and responsiveness.
Equally, such instruments must be able to evaluate specific populations, for instance patients with cancer or war refugees, or specific disorders such as dysphonia 6. Among the many health conditions, dysphonia is a difficulty or deviation in vocal production that impedes natural voicing 7; most of the time it does not put the individual’s life at risk, and so treatment is elective. A simple translation of a protocol originally developed in another language falls far below the minimum required for its applicability.
There are different types of recommendations for validating protocols. The Scientific Advisory Committee (SAC) of the Medical Outcomes Trust has established several guidelines for evaluating protocols that are submitted to the Trust for inclusion in its library 8. The evaluation is based on eight instrument attributes, and whether all of the requirements or only some of them must be fulfilled depends on the intent and application of the instrument, such as distinguishing between two or more groups, assessing change over time, or predicting future status. Regarding quality of life and voice disorders, the available protocols do not predict future changes; however, they can be used to separate groups and to evaluate treatment outcomes.
The 1st requirement is the conceptual and measurement model, which applies only when a new instrument is under development. This attribute concerns the rationale for and description of the concept(s) that the measure is intended to assess, and must include an extensive survey of patient data, professional information from charts and expert opinion on the issue.
The 2nd requirement is the reliability of the instrument, which is the degree to which an instrument is free from random error. To demonstrate it, two steps must be completed: internal consistency (Cronbach's alpha coefficient) and test-retest reproducibility. A translated protocol must fulfill this requirement.
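The internal consistency step mentioned above can be illustrated with a minimal sketch of Cronbach's alpha. This is not the analysis the authors ran (they used SPSS); the function name `cronbach_alpha` and the example ratings are hypothetical, and the sketch assumes complete data with one row of item ratings per respondent.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item ratings.

    Hypothetical helper: alpha = k / (k - 1) * (1 - sum(item variances) / variance(totals)),
    using the sample variance throughout.
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")

    # Variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*responses)]
    # Variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Made-up ratings for four respondents on three items (illustration only).
example = [[1, 2, 1], [3, 3, 4], [2, 2, 2], [5, 4, 5]]
alpha = cronbach_alpha(example)
```

Values close to 1 indicate that the items vary together; the threshold regarded as acceptable is a matter of convention and is not specified in the text.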
The 3rd requirement is the validity of the instrument, which is the degree to which the instrument measures what it purports to measure. There are three types of evidence for validity: content-related (expert panel judgments of the clarity, comprehensiveness and redundancy of the items and scales of an instrument); construct-related (evidence of logical relations with other measures, scores or evaluations); and criterion-related (relation to widely accepted valid measures). The first two are usually demonstrated in the health area, but the last is rarely tested because of the lack of widely accepted criterion measures.
The 4th requirement is responsiveness, which reflects the ability of the protocol to detect any minimal important change in the health condition or the specific problem addressed by the instrument. Responsiveness is also referred to as sensitivity to change. To demonstrate it, an intervention must be performed, and the pre- and post-treatment groups should be compared using the same instrument.
The 5th requirement is interpretability, which is the degree to which one can assign qualitative meaning to an instrument's quantitative scores. The interpretability of a measure is usually established when developing a protocol, not when validating it in another language.
The 6th requirement is respondent burden, which concerns the time, energy and other demands placed on those to whom the instrument is administered. A related consideration is the administrative burden, defined as the demands placed on those who administer the instrument.
The 7th requirement concerns alternative forms, which refer to all modes of administration other than that of the original source instrument. These can include self-administered self-report, interviewer-administered self-report, trained observer rating, computer-assisted self-report, computer-assisted interviewer-administered, and performance-based measures. In addition, alternative forms may include reduced versions or several versions using items with exact correlation.
The 8th requirement is cultural and language adaptation, which involves two primary steps: assessment of conceptual and linguistic equivalence, and evaluation of psychometric properties. Conceptual equivalence is the equivalence in relevance and meaning of the same concepts being measured in different cultures and/or languages.
Linguistic equivalence refers to equivalence of question wording and meaning in the formulation of items, response choices, and all aspects of the instrument and its applications. The guidelines recommended for this requirement are: at least two forward translations from the source language, preferably by persons experienced in translation and in health status research, resulting in a pooled forward translation; at least one backward translation to the source language, resulting in another pooled translation; a review of the translated versions by expert panels, with revisions; and finally field tests to provide evidence of comparability.
CEV undertook the major task of validating the 3 most important protocols on the impact of a voice problem on one’s quality of life. All protocols were translated by two bilingual speech-language pathologists who were also English teachers. The backward translation was done by an English teacher who was not a speech-language pathologist, had had no previous contact with the instrument and had not participated in the first stage. The three translators were informed beforehand about the objective and procedure of the validation. A committee of 5 voice specialists revised the final protocols.
SPSS (Statistical Package for Social Sciences), version 10.0, was used to perform all statistical analyses. The level of significance adopted was 5% (0.05). Validity was determined by comparing the protocols’ scores with self-ratings of voice quality using the Kruskal-Wallis test. To determine internal consistency, Cronbach’s alpha coefficient was computed, and the Wilcoxon Matched-Pairs Signed-Ranks Test was performed to determine reproducibility. To determine test-retest reproducibility, voice patients were administered the protocols a second time before any kind of treatment. A typical and effective retest interval is between 2 and 14 days.
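As an illustration of the test-retest comparison described above, the following is a minimal sketch of a Wilcoxon matched-pairs signed-rank test. It is not the SPSS procedure the authors used: the function name and the scores are hypothetical, and the p-value relies on a plain normal approximation with no tie or continuity correction, which is only a rough guide for small samples.

```python
from math import erfc, sqrt


def wilcoxon_signed_rank(first, second):
    """Return (w_plus, w_minus, two_sided_p) for paired scores.

    Hypothetical helper. Zero differences are dropped, tied absolute
    differences share their average rank, and the p-value comes from a
    normal approximation without tie or continuity correction.
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for a, b in zip(first, second) if b != a]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, averaging ranks over ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        while end + 1 < n and abs(diffs[order[end + 1]]) == abs(diffs[order[start]]):
            end += 1
        average_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    mean = n * (n + 1) / 4
    sd = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean) / sd
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return w_plus, w_minus, p_value


# Made-up total scores for five patients at test and at retest.
test_scores = [80, 75, 90, 60, 85]
retest_scores = [82, 74, 93, 56, 90]
w_plus, w_minus, p_value = wilcoxon_signed_rank(test_scores, retest_scores)
```

For reproducibility the hoped-for outcome is a non-significant result: a p-value above the adopted 5% level gives no evidence that scores shifted between the two administrations.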
This period should be short enough that little change will have occurred and long enough that patients will not remember their answers 9. Responsiveness was evaluated by comparing pre- and post-treatment voice quality ratings and protocol scores.
The first instrument selected was the V-RQOL, which received the Brazilian Portuguese abbreviation QVV (Qualidade de Vida em Voz – Figure 1) and was validated by Gasparini, Behlau (2006 10, 2007 11). The selection was based on its functional simplicity and on the fact that it has very clear questions. The original protocol was introduced by Hogikyan, Sethuraman (1999) 9. It is a 10-item instrument for voice disorders, containing a Physical Functioning domain (items 1, 2, 3, 6, 7, 9) and a Social-Emotional domain (items 4, 5, 8, 10). The protocol provides a total score and two domain scores, calculated with a standard algorithm. Scores range from 0 to 100, with 0 indicating a very poor V-RQOL and 100 an excellent one.
In addition to translating terms and words, it was necessary to change the wording of the Likert scale ratings, since they were based both on how severe the problem is and on how frequently it happens. This double-attribute rating confused the Brazilian respondents, and it could not be kept as it was in the original version. Some respondents offered comments such as “it happens occasionally but I consider it a big problem” or “it happens frequently but I don’t see it as a problem”. Therefore, in order to accomplish the cultural and linguistic adaptation, modifications were introduced to reflect the cultural perspective of such evaluation.
To evaluate the cultural and linguistic equivalence of the QVV (V-RQOL), the option “not applicable” was added to each item of the questionnaire, which was then administered to 38 patients. None of the questions proved to be invalid. The instrument was administered to 234 individuals: 114 presenting with vocal complaints (19 men and 95 women, aged 18 to 79 years, mean 41.3 years) and 120 presenting with dermatological complaints (31 men and 89 women, aged 16 to 75 years, mean 43 years). All individuals also self-rated their voice quality on a 5-point Likert scale: poor, fair, good, very good, or excellent. Nineteen patients who underwent voice rehabilitation were administered a post-treatment V-RQOL questionnaire and also gave a post-treatment self-rating of voice quality.
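The domain structure and 0 to 100 scoring of the V-RQOL described above can be sketched as follows. The text does not reproduce the standard algorithm, so this sketch rests on assumptions: each item is rated from 1 (least problem) to 5 (most problem), and a raw sum is rescaled linearly so that the best possible raw sum maps to 100 and the worst to 0. The function names are hypothetical.

```python
# Item numbers per domain, as listed in the text.
PHYSICAL_ITEMS = (1, 2, 3, 6, 7, 9)
SOCIAL_EMOTIONAL_ITEMS = (4, 5, 8, 10)
ALL_ITEMS = tuple(range(1, 11))

# Assumption: items are rated 1 (least problem) to 5 (most problem).
MIN_RATING, MAX_RATING = 1, 5


def scaled_score(ratings):
    """Rescale a raw sum of ratings to 0-100, where 100 is the best outcome."""
    lowest = len(ratings) * MIN_RATING
    highest = len(ratings) * MAX_RATING
    return 100.0 * (highest - sum(ratings)) / (highest - lowest)


def score_vrqol(answers):
    """Return total and domain scores from a dict mapping item number to rating."""
    if set(answers) != set(ALL_ITEMS):
        raise ValueError("answers must cover items 1 to 10 exactly")
    if any(not MIN_RATING <= r <= MAX_RATING for r in answers.values()):
        raise ValueError("ratings must lie between 1 and 5")
    return {
        "total": scaled_score([answers[i] for i in ALL_ITEMS]),
        "physical": scaled_score([answers[i] for i in PHYSICAL_ITEMS]),
        "social_emotional": scaled_score([answers[i] for i in SOCIAL_EMOTIONAL_ITEMS]),
    }


# Made-up respondent: no physical complaints, maximal social-emotional impact.
answers = {i: 1 for i in PHYSICAL_ITEMS}
answers.update({i: 5 for i in SOCIAL_EMOTIONAL_ITEMS})
scores = score_vrqol(answers)
```

Because the two domains contain different numbers of items, the total score is computed from all ten items rather than as the mean of the two domain scores.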
Validity was determined by comparing the dysphonic and non-dysphonic groups with respect to the self-assessment of vocal quality and the different domains of the instrument. At the 5% level, the differences were statistically significant in the vocal complaint group (total score p = 0.008, physical score p = 0.007, socioemotional score p = 0.03) but not in the dermatological complaint group (total score p = 0.091, physical score p = 0.168, socioemotional score p = 0.67). Results showed that internal consistency was demonstrated with high coefficient values (p