Using a Word Processor to Tag and Retrieve Blocks of Text

GERY W. RYAN
RAND Corporation

Sophisticated text management software is currently available for doing thematic or code-based analysis, the principal procedural approach to qualitative data analysis. Such packages allow researchers to tag and retrieve contiguous blocks of data, maintain complex codebooks, manage large corpora of data, and display search results in interesting ways. For simple projects, however, with a few themes and a small number of texts, a complex program that requires a large investment of time to learn may be more technology than one needs. This article provides tips for making better use of the humble word processor, in this case, Microsoft Word. Text-formatting features, the find-and-replace command, and Microsoft Word’s macro programming language can be used to mark themes in texts and retrieve exemplars of themes on demand.

Keywords: qualitative analysis; text analysis; computer assisted qualitative data analysis systems (CAQDAS)
Field Methods, Vol. 16, No. 1, February 2004 109–130 DOI: 10.1177/1525822X03261269 © 2004 Sage Publications
Downloaded from http://fmx.sagepub.com at PORTLAND STATE UNIV on April 17, 2010

Today, sophisticated text management software is widely available for doing thematic or code-based analysis, the principal procedural approach to qualitative data analysis. These software packages allow researchers to tag and retrieve contiguous blocks of data, maintain complex codebooks, manage large corpora of data, and display search results in interesting ways. For simple projects, however, with a few themes and a small number of texts, a complex program that requires a large investment of time to learn may be more technology than one needs. This may be particularly true for short-run market or applied research (e.g., consultancy, evaluation studies) or where the analysis needs to be done by participants (or stakeholders, in the jargon of some fields) who are unfamiliar with the basics of qualitative analysis.

One potential solution is to make better use of the humble word processor. Early in the microcomputer revolution, Gillespie (1986) noted that word processor macros could be constructed to handle repetitive tasks in qualitative data analysis. Since Gillespie made his suggestion, better word-processing programs have been developed and the macro-programming language of top-end word processors, such as Microsoft Word and WordPerfect, has become quite sophisticated. (See Bernard [1991] and Ryan [1993] for other examples of using macros for text management.) Later, I will offer some macros for Microsoft Word that can be used for marking themes in texts and retrieving exemplars of themes on demand. But first, some background.
GENERAL APPROACHES TO CODING IN QUALITATIVE DATA ANALYSIS

Coding serves two purposes in qualitative analysis: (1) Codes act as tags to identify text in a corpus for later retrieval or indexing. Tags are not associated with any fixed units of text. They can mark simple phrases or extend across multiple pages. (2) Codes act as values assigned to fixed units of data (see Bernard 1991, 2002; Seidel and Kelle 1995). In this case, codes are nominal, ordinal, or ratio scale values that are applied to nonoverlapping units of texts (such as paragraphs, pages, or documents), episodes, cases, or persons. Codes as tags are associated with grounded theory (e.g., Glaser and Strauss 1967; Strauss and Corbin 1990; Dey 1993). Codes as values are associated with classic content analysis and content dictionaries (e.g., Berelson 1952; Pool 1959; Krippendorff 1980; Weber 1990). The two types of code are not mutually exclusive, but the use of one gloss, code, for both concepts can be misleading.

Table 1 illustrates the difference between codes as tags and codes as values. The three illness narratives come from undergraduates at a Midwestern university. Signs and symptoms are tagged with italicized text; treatments and behavioral modifications are tagged with underlining, and diagnosis is tagged with small caps. Note that the tags vary in size from a single word (cold) to several lines. The columns to the right of the narratives represent value codes. Each narrative is coded as a separate unit. The variable diagnosis takes on nominal/categorical values such as cold or sinus/upper respiratory/asthma. Signs and symptoms and treatments are dummy variables, with dichotomous values (yes or no). Duration is coded in days, an interval-level variable.

TABLE 1
Example of Tagging and Value Coding

Narratives

ID 108 (Sex: M): SINUS/UPPER RESPIRATORY INFECTION/ASTHMA. Drainage into lungs, down back of throat, lower breathing capacity, used peak flow meter, shortness of breath, cough, fatigue, wanted to sleep more. Annually occurring. Wheezing, used inhaler three times a day, about every four hours. Had symptoms for three days before going to health center. Coughing up phlegm, sinus headache, ears popped, runny nose. Amoxicillin for two weeks. Dizzy, lightheaded. Lungs felt tight, harder to breathe.

ID 116 (Sex: F): The last time I had a COLD my throat was sore. It felt like I had needles in my tonsils. Every time I would swallow it felt like needles were digging in farther and farther. It also felt as though my throat was closing up making it hard to breathe. My nose was stuffed up but it was running like a faucet. There was a lot of pressure in my head like my head was in a vice. I had a horrible headache like someone was smashing my head with a hammer. Every muscle in my body ached. It felt like I couldnt move. I had a 102 degree fever. Sometimes I was so hot I felt like I was on fire. Then the next minute it was like I was in an ice-cube bath! I had difficulty breathing not only because my throat felt like it was closing but also because I felt like someone was sitting on my chest.

ID 118 (Sex: F): The last time I had a COLD was back in November, I think. I was tired, crabby, had a sore throat, runny nose, and a bit of a cough. I remember going to Wal-Mart to look for the new Cold-Eeze throat lozenges that my mother swears by. They have zinc in them and are supposed to reduce the length of your cold. I couldn’t find them at Wal-Mart because they are a pretty hot item. So I think I just suffered this way throughout the cold with no medication because I’m not a big believer in their benefits (unless, of course, my mother swears by it). I did have some peppermint tea that the midwife at work gave me (I work as an office assistant at a birth center). I tried to get more sleep than usual, but I didnt take any time off of work or school. I remember trying not to kiss my boyfriend (thats pretty tough, you know!) so that he wouldn’t get sick, too. My cold lasted probably five days. It was about the fourth time I had been sick that semester, which is quite unusual for me. I usually only get sick only once or twice a year.

Value codes

                                                 Signs and Symptoms                                     Treatment
ID   Sex  Diagnosis                                 Cough  Sore Throat  Vomiting  Chills  Fever  Fatigue  HR  OTC  WM  CAM  Duration
108  M    Sinus/upper respiratory infection/asthma  Y      N            N         N       N      Y        N   Y    Y   N    3 days or 14 days
116  F    Cold                                      ?      Y            N         Y       Y      N        N   N    N   N    ?
118  F    Cold                                      Y      Y            N         N       N      N        Y   N    N   N    5 days

NOTE: Signs and symptoms are in italics. Treatments and behavioral modifications are underlined. Diagnoses are in small caps. HR = home remedies; OTC = over the counter; WM = Western Medical; CAM = complementary and alternative medicine.

Assigning values to a unit of data is inherently an interpretive, qualitative act. Of course, sometimes the interpretation is obvious. When a respondent says in a narrative that he had a cough, runny nose, and headache, it is clear that we would code the variable coughing as a yes. Coding decisions are not always so simple. In one narrative, the respondent says, “Then the next minute it was like I was in an ice-cube bath!” Coding this as “having chills” requires an inference, an interpretation. And if you think this is a high inference, just think of all the subtle (and not so subtle) ways there are to say that a person vomited.

The interpretive act of assigning a value also requires that the investigator recognize ambiguity and bring his or her own experiences and knowledge to bear. For example, in another narrative, it is unclear just how long the illness lasted. The respondent reports that he waited three days before going to the health center, but then he reports that he took amoxicillin for two weeks. From our experience, we might know that two weeks is the time frame for a standard antibiotic regimen and that the duration of the signs and symptoms may have been much shorter.

There are advantages to each coding system. Tagging retains the richness of the data, in that nothing is lost. If you want to mark and retrieve exactly what people said, then tagging is the way to go. Value coding is clearly a data-reduction step, but it allows us to make systematic comparison across units. If you want to know what percentage of narratives mentioned fever and chills, or if you want to know whether having fever and chills is correlated with seeking treatment at a clinic or a doctor’s office, then you simply have to do some sort of value coding. An ideal system would allow you to do both. In the rest of this article, I describe techniques for tagging text. La Pelle (2004 [this issue]) describes techniques for assigning value codes to text.
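The kind of systematic comparison that value coding permits can be sketched in a few lines of Python. This is only an illustration, not part of the article's method: the three records below are my reading of the cough, chills, and fever codes in Table 1, and the function name `percent_yes` is hypothetical. A "?" is treated here as "not a yes," which is itself a coding decision.

```python
# Value codes for three narratives (an assumed transcription of Table 1).
# "?" marks a value the coder could not determine.
narratives = [
    {"id": 108, "cough": "Y", "chills": "N", "fever": "N"},
    {"id": 116, "cough": "?", "chills": "Y", "fever": "Y"},
    {"id": 118, "cough": "Y", "chills": "N", "fever": "N"},
]


def percent_yes(records, *variables):
    """Percentage of records coded "Y" on every one of the named variables."""
    hits = [r for r in records if all(r[v] == "Y" for v in variables)]
    return 100.0 * len(hits) / len(records)


# Share of narratives that mention both fever and chills.
fever_and_chills = percent_yes(narratives, "fever", "chills")
# Share that mention a cough ("?" is not counted as a yes).
cough = percent_yes(narratives, "cough")
```

With tags alone, this count would require rereading every retrieved passage; with value codes it is a one-line tally.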
APPROACHES TO TAGGING AND RETRIEVING

Over the years, researchers have developed many ways to tag and retrieve themes from data. Before computers, we wrote notes in the margins, highlighted texts with colored pencils and markers, and cut and sorted multiple copies of notes and transcripts. (These are still good ways to start a project.) High-tech solutions to the tag-and-retrieve problem in precomputer days involved edge-notched cards and knitting needles. (One famous system was the McBee Cards. See Bolton [1984] for a discussion of how these systems were used in the coding and managing of field notes.)

The simplest tagging system is an index, like the one at the back of a book. An index provides a reference table that links themes (subject headings) with pages in a text. Indexes, however, do not specify where on the page any theme occurs, nor do they tell us what other themes are located nearby.

With point markers (another tagging system), you place codes directly into the text to indicate that the theme occurs “around here.” Notes scribbled in the margins are point markers. The retrieval process consists of extracting chunks of text (e.g., sentences or paragraphs) above and below the marker. Deciding how much text to extract is important: Picking too much produces extraneous information, and picking too little produces truncated hits.

Contiguous tagging solves this problem by linking themes with contiguous blocks of data. In written data, blocks include words, phrases, sentences, paragraphs, or entire pages. For sound and video, blocks mark segments of tape. For visual data, blocks mark segments of an image. Using colored pencils to underline sections of text or circle portions of an image is an example of contiguous tagging.

Below, I outline three approaches to tagging and retrieving texts with Microsoft Word. (Similar results can be achieved with other word processors, such as WordPerfect.) The first approach allows you to tag and retrieve contiguous blocks of text but is best used with small codebooks (i.e., fewer than ten themes). The second solution allows for larger, more complex codebooks but limits you to using point markers. The last approach allows for contiguous coding and large codebooks but requires more programming steps.
CONTIGUOUS TAGGING FOR SMALL CODEBOOKS

Tagging with Text Attributes

Since at least Word 97, Microsoft programmers have included a feature in Word’s search command that lets users locate examples of text characteristics such as bold, italics, and underline. This means you can mark a text related to Theme1 with bold and then retrieve instances of Theme1 by searching the entire document for bolded text. The same works for strikethrough, all the colors of the palette, and highlighting. You can tag the same text with several themes by choosing characteristics that can co-occur, such as bold, underline, double underline, strikethrough, and italics. If the themes are mutually exclusive, you can use colored fonts and highlighting. Use of color, of course, is not for those who are color-blind.

Searching for Text Attributes

To search for text attributes (underline, bold, color) or combinations of these attributes, do the following:

1. Type Ctrl F (find).
2. Click on the More button.
3. Click on the Format button and select Font (a window similar to the one in Figure 1 should appear).
4. Select all the attributes for which you want to search in the font dialogue box.
5. Click on OK.
6. Leave the Find What text box blank (this allows you to search across all text; the attributes you selected will appear under this box).
7. Click on Find Next.

FIGURE 1 Word’s Find Font Characteristic Screen

Word finds the next instance of the desired attribute(s). Note that Word will highlight the block of text. If you close the Find dialogue box and hit Ctrl C, the block of text will be copied into memory. You can then switch to a second document, hit Ctrl V to paste the copied data, and then switch back to the original document. (If you like using the mouse instead of commands such as Ctrl C and Ctrl V, just click on Edit at the top of the document and choose either “copy” or “paste.”)
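The logic of this find, copy, and paste cycle can be modeled outside Word. The Python sketch below is an illustration under stated assumptions, not Word's own mechanism: a document is represented as a list of runs, each a piece of text with a set of attribute names, and `find_next` and `collect_hits` are hypothetical helper names. The list returned by `collect_hits` plays the role of the second document.

```python
from typing import List, Optional, Set, Tuple

# A document modeled as runs: (text, set of formatting attributes).
Run = Tuple[str, Set[str]]


def find_next(runs: List[Run], start: int, wanted: Set[str]) -> Optional[int]:
    """Index of the first run at or after `start` that carries every wanted attribute."""
    for i in range(start, len(runs)):
        if wanted <= runs[i][1]:  # subset test: all wanted attributes are present
            return i
    return None


def collect_hits(runs: List[Run], wanted: Set[str]) -> List[str]:
    """Repeat the search from the top, copying each hit into a list of hits."""
    hits = []
    pos = find_next(runs, 0, wanted)
    while pos is not None:
        hits.append(runs[pos][0])               # "copy and paste" the block
        pos = find_next(runs, pos + 1, wanted)  # move past the hit, search again
    return hits


doc = [
    ("I felt fine. ", set()),
    ("My throat was sore. ", {"italic"}),
    ("I took aspirin ", {"underline"}),
    ("and my fever broke.", {"italic", "bold"}),
]
italic_hits = collect_hits(doc, {"italic"})
```

Passing several attributes at once (for example `{"italic", "bold"}`) mirrors checking more than one box in the font dialogue.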
You can automate this process with Word’s macro capability. Macros allow users to record and play back a series of keystrokes or mouse clicks. Before making a macro, it is good practice to run through the steps a couple of times to make sure you consistently get the desired results.

Creating a Macro

The easiest way to create a macro is to turn on the macro recorder, run through all the steps you want to do, and then turn off the macro recorder (described below). Before creating the macro, you need to do three things:

1. Open up your original document (the one that has been coded).
2. Open up a second, blank document. Save this blank document with the name Hits.doc.
3. Return to the top of the original document.

Next we will build a macro that (1) locates the next chunk of text you have marked in red to indicate a particular theme, (2) copies the red text to memory, (3) pastes it in the Hits.doc document, and (4) returns you to the original document:

1. Start recording a macro (Tools/Macro/Record New Macro).
2. Name the macro Find_Red.
3. Hit OK. (A little Macro toolbar should appear in the upper left of your document, and the cursor should now have a cassette icon attached to it. You can now begin recording.)
4. Hit Ctrl F (find).
5. Click on the More button.
6. Click on the Format button and select Font.
7. Under the font color pull-down menu, select red.
8. Click on OK.
9. Click on Find Next.
10. Close the Find and Replace dialogue box.
11. Hit Ctrl C (copy). (This copies the red text into memory.)
12. Click on Windows in the top toolbar and select Document 2 (Hits.doc).
13. Hit Ctrl V (paste). (The copied red text should appear.)
14. Hit the Enter key a couple of times to add some blank lines.
15. Click on Windows in the top toolbar and select Document 1 (your original data file).
16. Hit the right arrow key once. (This deselects the marked texts and moves the cursor one position to the right so you are ready to search for the next texts.)
17. Click on the square in the Macro toolbox. (Alternatively, hit Tools/Macro/Stop Recording.)
You have now recorded a macro called Find_Red.

Running Macros, Creating Buttons, and Shortcut Keys

To run the macro again, do the following:

1. Select Tools/Macros.
2. Select Find_Red.
3. Hit Run.

The macro should find and paste the next instance of red text into Hits.doc. To save time, you can either place a button on a toolbar for the macro or define a shortcut key for it. This can be done either before you record your keystrokes or afterward. To assign the macro to either a toolbar or a specific keystroke before recording your keystrokes, do the following:

1. Start recording a macro (Tools/Macro/Record New Macro).
2. Click on either the Assign macro to: Toolbar or Keyboard button.
3. Select the toolbar or keystroke you wish to use.
4. Hit OK and follow the procedures described above.

To place the macro onto a toolbar after you have recorded the keystrokes, do the following:

1. Right click on any blank portion of a toolbar.
2. Select Customize.
3. Click on the Command tab.
4. Select Macros.
5. Find the file for Find_Red (it should be called Normal.NewMacros.Find_Red).
6. Drag the file up to a place on the toolbar (you must drop it on an active portion of a toolbar; do not try to drop it on a blank space on the screen).
7. You should now see a button with the name Normal.NewMacros.Find_Red.
8. To change the name of the button, right click on it and type in a new name, then hit Enter when you are done.
Now to find, copy, and paste the next instance of red, all you have to do is to click the button. To assign a macro to a shortcut key after recording your keystrokes, perform the following steps:

1. Select Tools/Customize.
2. Click on the Keyboard button at the bottom of the window.
3. In the Categories window, select Macros.
4. In the Macros window, select the macro you want to play as a shortcut key (in this case, Find_Red).
5. In the window titled “Press new shortcut key,” type in the shortcut key combination. For example, if you want the macro to play each time you use Alt + X, then hold down the Alt key and hit X. You will see Alt + X appear in the Shortcut window.
6. Click Assign.
7. When you are done assigning shortcut keys, hit Close, Close.
To run the macro, hold down the Alt key and hit X.

Paragraphs of Text That Do Not Wrap

When searching for attributes such as italics or bold, Word stops when it encounters a hard return or blank line. For imported text that has hard returns (text that does not wrap when you change the margins), you will need to replace hard returns with soft returns if you want to do your retrieval as a continuous operation. The following sequence of find-and-replace commands accomplishes this task.

Step 1:
• Find: ^p^p (^p is Word’s code for new paragraphs or hard returns)
• Replace: **pp**

Step 2:
• Find: ^p
• Replace: (hit space bar once)

Step 3:
• Find: **pp**
• Replace: ^p^p
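The three find-and-replace steps above can be sketched on a plain string in Python. This is a minimal illustration, assuming that the newline character stands in for Word's ^p and that the placeholder **pp** does not already occur in the text; the function name `unwrap_hard_returns` is hypothetical.

```python
def unwrap_hard_returns(text: str) -> str:
    """Join hard-wrapped lines while keeping blank lines between paragraphs."""
    placeholder = "**pp**"                    # assumed never to occur in the data
    text = text.replace("\n\n", placeholder)  # Step 1: protect blank lines
    text = text.replace("\n", " ")            # Step 2: single returns become spaces
    return text.replace(placeholder, "\n\n")  # Step 3: restore blank lines


sample = "My cold lasted\nprobably five days.\n\nIt was about the\nfourth time."
unwrapped = unwrap_hard_returns(sample)
```

The order matters: if Step 2 ran first, the blank lines that separate paragraphs would be lost along with the unwanted returns.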
This first step converts blank lines to **pp**, the second step converts single hard returns to spaces, and the final step converts the **pp** back to blank lines. I recommend running this procedure on all text before tagging it with any of the Word features described here. This is especially critical if you are working with colleagues and you are sharing texts. You never know when a pesky hard return will creep into a text.

Simple Boolean Searches

Researchers often want to conduct simple Boolean searches such as find all occurrences of themes “X and Y” or “X or Y.” Unfortunately, Boolean logic is currently not an option in Word’s Find command. You can, however, find such combinations by using a simple succession of steps. To find text marked by both Theme X and Theme Y, check multiple text attributes in the Font dialogue box described above. For example, checking bold and underline locates all texts that are underlined and bolded. The simplest way to find text marked by either Theme X or Theme Y is to search first on one theme, then the next. All hits will be placed in the Hits document. (Beware, though: You might encounter duplicates if the same text is marked for both themes.)

Microsoft Word offers ten or so text attributes for marking themes (italics, bold, underline, double underline, strikethrough, shadow, and so on). Themes also can be marked with combinations of attributes (e.g., bold and strikethrough, underline and shadow, etc.). Such combinations, however, tend to be more cumbersome and make it more difficult to search for overlapping themes. If you anticipate building a longer codebook with subthemes, then consider using a system of point markers.
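The “X and Y” and “X or Y” procedures, including the duplicate problem in the “or” case, can be sketched in Python. The representation is an assumption made for illustration: each tagged block is a pair of text and the set of themes marked on it, and `search_and` and `search_or` are hypothetical names. Hits are reported as block positions so that a block marked for both themes is listed only once.

```python
def search_and(blocks, x, y):
    """Positions of blocks marked with both themes (one combined search)."""
    return [i for i, (_, themes) in enumerate(blocks) if x in themes and y in themes]


def search_or(blocks, x, y):
    """Positions of blocks marked with either theme: one pass per theme."""
    hits = []
    for theme in (x, y):  # search first on one theme, then the next
        for i, (_, themes) in enumerate(blocks):
            if theme in themes and i not in hits:  # skip blocks already retrieved
                hits.append(i)
    return hits


blocks = [
    ("had a bit of a cough", {"X"}),
    ("peppermint tea", {"Y"}),
    ("rested and drank tea", {"X", "Y"}),
    ("work as an office assistant", set()),
]
both = search_and(blocks, "X", "Y")
either = search_or(blocks, "X", "Y")
```

In Word itself there is no record of positions already pasted, which is why duplicates in the Hits document have to be removed by hand.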
POINT TAGGING FOR LARGER CODEBOOKS

With point markers, you place codes or mnemonics directly into the text to indicate that the theme occurs “around here.” To use this system, read through your document. Each time you find a place that is related to Theme1, type in the corresponding mnemonic (e.g., [[Theme1]]). If the paragraph also refers to Theme2, then embed the mnemonic for Theme2 as well. For additional examples of point markers, particularly those used for field notes, see Ryan (1993).

You can make light work of the theme-marking chore by building a series of little macros, one for each theme. You can assign each macro to a button and place them all in a toolbar such as the one shown in Figure 2. Alternatively, you can assign each macro to a key, such as Alt J. If you use key combinations, be sure not to use ones such as Ctrl F (or you won’t be able to use that combination to find things in texts) or Alt F (or you won’t be able to open the File menu from the keyboard). You can, of course, assign a macro to Alt F and still open the File menu with the mouse.

FIGURE 2 Toolbar for Inserting Point Markers

For theme mnemonics, be sure to use characters such as double square brackets [[ ]] that don’t occur anywhere in your text except for theme markers. That way, when you look for the [[marriage]] theme, say, you won’t find all the uses of the word marriage in your text, only the uses of the word that mark a section of text as being about marriage.

To retrieve all the paragraphs that refer to Theme1, you can build a macro that searches for the Theme1 mnemonic and copies a fixed chunk of text (such as a paragraph, sentence, or line) to the second document. First, make sure that both the original document and the Hits.doc files are open and that you are in the original document. Then do the following:

1. Start recording a macro (Tools/Macro/Record New Macro).
2. Name the macro Find_Theme1.
3. Hit OK. (A little Macro toolbar should appear in the upper left of your document, and the cursor should now have a cassette icon attached to it. You can now begin recording.)
4. Hit Ctrl F (find).
5. In the Find What: text box, type [[Theme1]] (or whatever mnemonic you are looking for). (Be sure to include the mnemonic indicators, such as square brackets, to avoid finding words in the original text; see above.)
6. Click on Find Next.
7. Close the Find and Replace dialogue box.
8. Hold down the Ctrl key and hit the up arrow key (this will move the cursor to the top of the paragraph).
9. Hold down the Ctrl and the Shift key simultaneously, and hit the down arrow key (this will highlight the entire paragraph).
10. Hit Ctrl C (copy) (this copies the highlighted paragraph into memory).
11. Click on Windows in the top toolbar and select Document 2 (Hits.doc).
12. Hit Ctrl V (paste) (the copied paragraph should appear).
13. Hit the Enter key a couple of times to add some blank lines.
14. Click on Windows in the top toolbar and select Document 1 (your original data file).
15. Hit the right arrow key once (this deselects the marked texts and moves the cursor one position to the right so you are ready to search for the next texts).
16. Click on the Stop square in the Macro toolbox (alternatively, hit Tools/Macro/Stop Recording).

You have now recorded a macro called Find_Theme1. Whenever the macro encounters a mnemonic for Theme1, it copies the entire paragraph to the second document, Hits.doc. If you want to pull larger chunks, say the paragraph above and below the point marker, just increase the number of times you hit the Ctrl Up and Ctrl Down arrow keys in steps 8 and 9. For smaller chunks, you can use just the up and down arrows to highlight blocks one or two lines above and below the point marker.
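The retrieval rule for point markers (find the marker, then pull a fixed chunk around it) can be sketched in Python on plain text. This is an illustration under assumptions, not the recorded macro: paragraphs are taken to be separated by blank lines, and the function name `retrieve_paragraphs` and its `context` parameter are hypothetical. Setting `context` to 1 corresponds to pulling the paragraph above and below the marker.

```python
def retrieve_paragraphs(text, marker, context=0):
    """Return each paragraph containing `marker`, plus `context` paragraphs on each side."""
    paragraphs = text.split("\n\n")  # assumes blank lines separate paragraphs
    hits = []
    for i, paragraph in enumerate(paragraphs):
        if marker in paragraph:
            first = max(0, i - context)
            last = min(len(paragraphs), i + context + 1)
            hits.append("\n\n".join(paragraphs[first:last]))
    return hits


notes = (
    "I was tired and crabby.\n\n"
    "I had some peppermint tea. [[Theme1]]\n\n"
    "I tried to get more sleep than usual."
)
exact = retrieve_paragraphs(notes, "[[Theme1]]")
wider = retrieve_paragraphs(notes, "[[Theme1]]", context=1)
```

The trade-off described earlier is visible here: a larger `context` brings in extraneous text, while a smaller one risks truncating the passage.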
SEARCHING FOR THEME FAMILIES

With point markers, you can use a hierarchical codebook and search for families of themes. For example, suppose you built the following codebook:

Theme1
Theme1.aa
Theme1.ab
Theme1.ca
Theme1.cb
Theme2
Theme2.a
Theme2.ab
Theme3.da

To search for theme families (e.g., Theme1.aa, Theme1.ab, Theme1.ca, etc.), you can use the wildcard option in Word’s Find command. After hitting Ctrl F, click on the box in front of Use Wildcards. (If Use Wildcards does not appear in the dialogue box, hit the More button.) “Options: Wildcards” should now appear under the Find What text box. Then type in your wildcard search string. Here are some examples you might want to use:

[[Theme*       Finds all instances of all themes
[[Theme1*      Finds all general and subthemes associated with Theme1
[[Theme1.*     Finds all instances of subthemes for Theme1
[[Theme?.?b    Finds Theme1.ab, Theme1.cb, and Theme2.ab
If you plan to use the wildcard option, do not use wildcard symbols such as *, ?, –, @, !, , (, ), {, or } in your mnemonic coding conventions.
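The four search strings can be checked against the example codebook with a short Python sketch. It models only the two wildcards used in those examples, with * standing for any run of characters and ? for exactly one character, and it does not attempt to reproduce the rest of Word's wildcard syntax. The function names `wildcard_to_regex` and `find_markers` are hypothetical.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a search string that uses * and ? into a compiled regular expression."""
    escaped = re.escape(pattern)  # treat brackets, dots, etc. as literal characters
    return re.compile(escaped.replace(r"\*", ".*").replace(r"\?", "."))


def find_markers(pattern, markers):
    """Markers whose beginning matches the wildcard search string."""
    regex = wildcard_to_regex(pattern)
    return [m for m in markers if regex.match(m)]


codebook = ["Theme1", "Theme1.aa", "Theme1.ab", "Theme1.ca", "Theme1.cb",
            "Theme2", "Theme2.a", "Theme2.ab", "Theme3.da"]
markers = ["[[" + theme + "]]" for theme in codebook]

subthemes_of_1 = find_markers("[[Theme1.*", markers)
ending_in_b = find_markers("[[Theme?.?b", markers)
```

Because the dot in `[[Theme1.*` is matched literally, the general marker [[Theme1]] is excluded from that search, which is the behavior the third example relies on.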
CONTIGUOUS CODING FOR LARGE CODEBOOKS

For those who want to use a complex codebook but need the functionality of contiguous tagging, a two-step process is required. First, indicate where a text block begins and ends. Then you can locate these blocks and copy them to a second document. Following the marking system suggested by Truex (1993), I have written two macros to accomplish these tasks.

To tag a block of text, first select the text, then start the macro Tag_Themes. A dialogue box like that in Figure 3 will appear and ask you which theme you want to use. If you type in “Treat” and hit the OK button, the macro will embed [[Treat]] at the end.

FIGURE 3 Tag_Theme Dialogue Box

To find all instances of a theme, move the cursor to the top of the document. Start the second macro, Find_Themes. A similar dialogue box will appear and ask you what theme you want to find. The macro then searches through the entire text and copies all hits to a file named Hits.doc. (The Find_Themes macro won’t work unless the Hits.doc file is located in the default directory. See note in Appendix B for more details.)

Since each task requires input from the user, I have written the macros in Microsoft’s Visual Basic. The code appears in Appendices A and B. The code is also downloadable from www.qualquant.net. To reproduce the macro in Appendix A on your own computer, do the following:

1. Hit Alt F8 (or Tools/Macro/Macros).
2. In the Macro Name text box, type in Tag_Themes.
3. Hit Create.

The Microsoft Visual Basic screen will appear showing the current macros you have stored on your computer. At the very bottom of your screen, you should see the macro you just created: Tag_Themes. A horizontal line should separate it from the other macros, and the following programming language should appear below the line:

    Sub Tag_Themes()
    '
    ' Tag_Themes Macro
    ' Macro recorded [date] by [your name here]
    '
    End Sub

You have two options: (1) Retype the code exactly as it appears in Appendix A (you can skip the comments marked by a single quote at the end of each line) or (2) copy the code from the URL above and paste it into the macro-editing window. When you are done, hit Ctrl S (save) and close the Visual Basic screen. Follow the same steps to create the macro in Appendix B (you might want to name it Find_Themes). If you want to assign these macros to a toolbar or a shortcut key, use the steps described above.

This approach works well if you want to tag and retrieve specific text within a paragraph or larger text chunks that extend across paragraphs. You can also search for “X or Y” combinations by first searching on one theme then another. Unfortunately, it is difficult to search for “X and Y” combinations. Such searches require many steps depending on the degree to which themes overlap or are nested entirely within each other and are probably easier to do manually.
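The two-step idea of marking where a block begins and ends and then retrieving everything between the markers can be sketched in Python. This is not the Visual Basic code of the appendices. The passage above shows only a marker of the form [[Treat]] at the end of a block, so the begin marker used here, {{Treat}}, is purely a hypothetical choice for illustration, as are the function names `tag_block` and `find_theme`.

```python
import re


def begin_marker(theme):
    return "{{" + theme + "}}"   # hypothetical begin-marker format


def end_marker(theme):
    return "[[" + theme + "]]"   # end-marker format shown in the text


def tag_block(text, start, end, theme):
    """Wrap the selection text[start:end] in begin and end markers for `theme`."""
    return (text[:start] + begin_marker(theme) + text[start:end]
            + end_marker(theme) + text[end:])


def find_theme(text, theme):
    """Return every block lying between a begin and an end marker for `theme`."""
    pattern = re.escape(begin_marker(theme)) + "(.*?)" + re.escape(end_marker(theme))
    return re.findall(pattern, text, flags=re.DOTALL)  # blocks may span paragraphs


narrative = "I had a cold.\n\nI drank peppermint tea. I slept more.\n\nIt lasted five days."
selection = "I drank peppermint tea. I slept more."
start = narrative.index(selection)
tagged = tag_block(narrative, start, start + len(selection), "Treat")
treat_hits = find_theme(tagged, "Treat")
```

A single pair of generic functions serves any theme name, which is the property that lets this approach scale to large codebooks.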
COMPARING APPROACHES

When asked to recommend text analysis software, I typically respond with a series of questions. First, “What is it that you are trying to do?” Many simple tasks can be done with a word processor. If the job requires developing a complex codebook or requires displaying data and codes in graphical format, then I usually recommend a dedicated text management program.

If, however, the principal objective is to mark themes for later retrieval (which is often the case), I ask a second question: “How many documents will you need to work with?” The techniques described here work with only one file at a time and do not search across files. If you have stored your text in twenty-five files and want to find the same theme in each, you will have to replicate the search task twenty-five times. To work around this problem, you might want to merge all your files into a single document with a delimiter code between texts (such as ### or qqq, something that can never appear in any other context) to indicate where one file ends and another begins. Another solution is to use a program such as DtSearch (http://www.dtsearch.com/) or Windows Grep (http://www.wingrep.com/) that will pull text between beginning and end markers across multiple files. (See Truex [1993] for a review of how to accomplish this procedure with an older DOS version of DtSearch.) People familiar with Unix will recognize Grep as a powerful search command. It is not for the faint of heart, however, and some familiarity with programming is helpful. If neither solution looks feasible, you will probably need to use a dedicated text management program.

Finally, I ask, “How many codes/themes do you anticipate using?” If the number of themes is relatively small (say, fewer than ten), consider marking and retrieving themes based on text attributes.
The approach is quite intuitive (it resembles marking themes on paper with colored pencils), the mechanics are easy (especially for people familiar with word-processing basics, which means just about everyone these days), and search results return precise hits (see Table 1). The downside is that the researcher must remember what themes go with which attributes, and the size of the codebook is limited to the number of text attributes available in Word. For tasks that require larger codebooks, consider using either point markers or beginning and end markers. Point markers are easily embedded in a document. Accompanying searches are simple, require little programming skill, and allow wildcard searches. Once you have made a macro for each theme (or theme family), the search process is semiautomated. The main drawback with point markers is that the search results are often imprecise (see Table 1 for an example). If a theme represents only a small portion of a
text, searches that retrieve the paragraph in which the point marker is embedded will be filled with extraneous text. On the other hand, if a theme extends across multiple paragraphs, hits will be truncated (unless you have marked each paragraph separately).
Using the two macros described above to embed beginning and end markers in the text allows researchers to tag and retrieve text with the same precision they can obtain using text attributes. In addition, the markers make it easy to locate where specific themes occur in a document. Instead of creating a macro for each theme, the two generic macros handle codebooks of any size and complexity. Although the macro programming appears daunting, the code is available on the Web and can be readily copied into the macro editor. This approach, however, does not yet allow the use of wildcard or “X and Y” Boolean searches. (I say “not yet” because the capabilities of modern word processors are upgraded with each new release, and this might well be among the next things that are built in.) Furthermore, the more themes that are coded, the more cluttered the document becomes.
In general, using your word processor for basic tagging and retrieval tasks is quite efficient. There is very little learning curve because you begin with a program that you already know a lot about. You can use your original word-processing documents without having to reformat them, and there is no additional cost for new software.
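The one-file-at-a-time and wildcard limitations noted in this section can also be worked around outside Word, on a plain-text export of a merged document. The following is a minimal sketch in Python rather than in Word's macro language. It assumes that texts are separated by the ### delimiter and that blocks are tagged with [[theme]] and [[>theme]] markers as printed in Appendix A; the function name find_theme and the shell-style wildcard matching are illustrative assumptions, not part of the macros described in this article.

```python
import re
from fnmatch import fnmatchcase

# Assumed marker format: [[theme]] opens a block, [[>theme]] closes it.
# The lookahead lets blocks of different themes overlap or nest.
BLOCK = re.compile(r"(?=\[\[([^\[\]>]+)\]\](.*?)\[\[>\1\]\])", re.DOTALL)


def find_theme(merged_text, pattern, delimiter="###"):
    """Return (text_number, theme, block) for every tagged block whose
    theme matches a shell-style wildcard pattern such as "ill*"."""
    hits = []
    for number, text in enumerate(merged_text.split(delimiter), start=1):
        for match in BLOCK.finditer(text):
            theme, block = match.group(1), match.group(2)
            if fnmatchcase(theme.lower(), pattern.lower()):
                hits.append((number, theme, block.strip()))
    return hits


# Two hypothetical texts merged into one string with the ### delimiter.
merged = (
    "I felt [[illness]]feverish all week [[>illness]] and stayed home."
    "###"
    "We went to [[treatment]]the clinic in town [[>treatment]] on Monday, "
    "after [[illness_child]]my son got sick [[>illness_child]] again."
)

for number, theme, block in find_theme(merged, "ill*"):
    print(f"text {number} [{theme}]: {block}")
```

Because each hit carries the number of the text it came from, the researcher can trace an exemplar back to its source, which a search within a single merged Word document does not report on its own. Blocks that contain markers for other themes are returned with those inner markers intact.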
APPENDIX A
Macro for Marking the Beginning and End of a Text Block
The following macro can be used after you select a block of text in your document. The macro begins by querying the user for the theme associated with the block, then embeds the appropriate beginning and end markers around the selected text.
Warning: Before Running the Tag_Theme Macro
The Tag_Theme macro embeds beginning and end markers directly in your text. They are a hassle to remove. I strongly suggest that you make a copy of your original text file before you begin tagging it with codes. This way, if you decide you don’t want to use the tags, you can always start over with a clean document.
Sub Tag_Theme()
'
' Tag_Theme Macro
' Macro recorded 7/16/2002 by Gery Ryan
'
    Dim Tag$
    Tag$ = InputBox("What theme do you want to use?", "Mark Themes", "")
    Tag$ = CleanString(Tag$)            'cleans nonprinting chars
    Tag$ = LTrim$(RTrim$(Tag$))         'removes spaces at beginning and end
    If Tag$ = "" Then                   'checks for cancel or blank text box
        GoTo Finish
    End If
    Selection.Cut                       'cuts selected text, stores it on the clipboard
    Selection.TypeText Text:=" [["      'opens the beginning marker
    WordBasic.Insert Tag$               'adds theme
    Selection.TypeText Text:="]]"       'closes the beginning marker
    Selection.Paste                     'pastes the selected text back
    Selection.TypeText Text:=" [[>"     'opens the end marker
    WordBasic.Insert Tag$               'adds theme
    Selection.TypeText Text:="]] "      'closes the end marker
Finish:
End Sub
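Because the markers are a hassle to remove by hand, it may help to see the tagging convention and its inverse side by side. The following is a minimal sketch in Python, not Word's macro language. It assumes the marker format exactly as printed in the macro above (a space plus [[theme]] before the block, and a space plus [[>theme]] plus a space after it); the function names tag_block and strip_tags are hypothetical and operate on a plain-text copy of a document.

```python
import re


def tag_block(text, start, end, theme):
    """Wrap text[start:end] in markers following the printed Tag_Theme macro:
    ' [[theme]]' before the block and ' [[>theme]] ' after it."""
    theme = theme.strip()
    if not theme:                      # blank theme: leave the text unchanged
        return text
    return (text[:start] + " [[" + theme + "]]" + text[start:end]
            + " [[>" + theme + "]] " + text[end:])


# Matches an end marker or a begin marker together with the padding
# spaces that the tagging step adds around it.
MARKER = re.compile(r" \[\[>[^\[\]]+\]\] | \[\[[^\[\]>]+\]\]")


def strip_tags(text):
    """Remove every begin and end marker, restoring the untagged text."""
    return MARKER.sub("", text)


original = "She said the clinic was too far to walk."
start = original.index("the clinic")
end = original.index(" to walk")

tagged = tag_block(original, start, end, "access")
print(tagged)
print(strip_tags(tagged) == original)
```

The stripping step is only an exact inverse when every marker was written with this padding; markers typed by hand with different spacing would leave stray spaces behind, which is one more reason to keep an untagged copy of the original file.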
APPENDIX B
Macro for Finding Contiguous Tagged Texts
The following macro can be used to find blocks of text marked with the macro in Appendix A. The macro begins by querying the user for the theme to be searched, then finds each matching block and copies it to a second document (Hits.doc).
Warning: Before Running the Find_Theme Macro
This macro searches for themes in one document and pastes them into a second document called Hits.doc. Two conditions must be met for the macro to function correctly. First, the document to be searched must be saved and have a real filename. Temporary files produced when you open a new file in Word (e.g., Document1, Document2) do not count. Second, the Hits.doc file must be located in the current default directory. To see whether Hits.doc is in the correct place, use the pull-down File-Open menu and see if the file Hits.doc is listed. If not, open a new file and save it as Hits.doc.
Sub Find_Theme()
'
' Find_Theme Macro
' Macro recorded 7/16/2002 by Gery Ryan
'
    Dim Tag$
    Dim BeginTag$
    Dim EndTag$
    Dim Workdoc$
    Dim Hitsdoc$
    Dim Currentdir$
    Dim Count_
    Workdoc$ = WordBasic.[FileName$]()      'identifies current working document
    Currentdir$ = WordBasic.[FileNameInfo$](WordBasic.[FileName$](), 5)
    Hitsdoc$ = "Hits.doc"                   'identifies location of hits document
    Hitsdoc$ = Currentdir$ + Hitsdoc$
    Tag$ = InputBox("What theme do you want to search for?", "Search Themes", "")
    Tag$ = CleanString(Tag$)                'cleans nonprinting chars
    Tag$ = LTrim$(RTrim$(Tag$))             'removes spaces at beginning and end
    BeginTag$ = "[[" + Tag$ + "]]"          'creates beginning marker
    EndTag$ = "[[>" + Tag$ + "]]"           'creates end marker (matches the [[> written by Tag_Theme)
    If Tag$ = "" Then                       'checks for cancel or blank text box
        GoTo Finish
    End If
    WordBasic.FileOpen Name:=Hitsdoc$, Revert:=0    'Result header
    Selection.TypeParagraph
    Selection.TypeParagraph
    WordBasic.Insert "Searching For Theme: " + Tag$
    Selection.TypeParagraph
    Selection.TypeParagraph
    WordBasic.FileOpen Name:=Workdoc$, Revert:=0
    For Count_ = 1 To 1000                  'Beginning of loop (max set for 1,000 hits)
        Selection.EscapeKey
        Selection.Find.ClearFormatting      'Search for beginning marker
        With Selection.Find
            .Text = BeginTag$
            .Replacement.Text = ""
            .Forward = True
            .Wrap = False
            .Format = False
            .MatchCase = False
            .MatchWholeWord = False
            .MatchWildcards = False
            .MatchSoundsLike = False
            .MatchAllWordForms = False
        End With
        Selection.Find.Execute
        If WordBasic.EditFindFound() = 0 Then       'Stop if not found
            WordBasic.FileOpen Name:=Hitsdoc$, Revert:=0
            If Count_ = 1 Then                      'Hit summary
                WordBasic.Insert "End of Search: No Hits Found"
            Else
                WordBasic.Insert "End of Search:" + Str(Count_ - 1) + " Hits Found"
                Selection.TypeParagraph
                Selection.TypeParagraph
                WordBasic.FileOpen Name:=Workdoc$, Revert:=0
            End If
            GoTo Finish
        Else
            Selection.MoveRight Unit:=wdCharacter, Count:=1
            Selection.TypeText Text:="**//**"       'Inserts temp front marker
            Selection.Find.ClearFormatting          'Finds end marker
            With Selection.Find
                .Text = EndTag$
                .Replacement.Text = ""
                .Forward = True
                .Wrap = False
                .Format = False
                .MatchCase = False
                .MatchWholeWord = False
                .MatchWildcards = False
                .MatchSoundsLike = False
                .MatchAllWordForms = False
            End With
            Selection.Find.Execute
            Selection.MoveLeft Unit:=wdCharacter, Count:=1
            Selection.Extend                        'Starts at end of text chunk
            Selection.Find.ClearFormatting
            With Selection.Find
                .Text = "**//**"
                .Replacement.Text = ""
                .Forward = False
                .Wrap = False
                .Format = False
                .MatchCase = False
                .MatchWholeWord = False
                .MatchWildcards = False
                .MatchSoundsLike = False
                .MatchAllWordForms = False
            End With
            Selection.Find.Execute                  'Finds beginning of text chunk
            Selection.MoveRight Unit:=wdWord, Count:=1, Extend:=wdExtend
            Selection.Copy                          'Copies selection to memory
            Selection.MoveLeft Unit:=wdCharacter, Count:=2
            Selection.TypeBackspace                 'Erases temp front marker
            Selection.TypeBackspace
            Selection.TypeBackspace
            Selection.TypeBackspace
            Selection.TypeBackspace
            Selection.TypeBackspace
            Selection.Find.ClearFormatting
            With Selection.Find                     'Moves cursor to end of hit
                .Text = EndTag$
                .Replacement.Text = ""
                .Forward = True
                .Wrap = False
                .Format = False
                .MatchCase = False
                .MatchWholeWord = False
                .MatchWildcards = False
                .MatchSoundsLike = False
                .MatchAllWordForms = False
            End With
            Selection.Find.Execute
            Selection.MoveRight Unit:=wdCharacter, Count:=1     'Sets up for next search
            WordBasic.FileOpen Name:=Hitsdoc$, Revert:=0        'Switches to Hits document
            WordBasic.Insert Str(Count_) + ". "                 'Counts number of hits
            Selection.Paste                                     'Pastes hit
            Selection.TypeParagraph                             'Blank line
            Selection.TypeParagraph                             'Blank line
            WordBasic.FileOpen Name:=Workdoc$, Revert:=0        'Returns to original document
            Selection.EscapeKey
        End If
    Next
Finish:
End Sub
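The control flow of the search macro is easier to follow with the cursor movements and document switching stripped away. The following is a simplified sketch in Python, not Word's macro language, of the same forward scan: find a beginning marker, find the next end marker, keep the text in between, and continue from there. It assumes the [[theme]] and [[>theme]] marker format written by the macro in Appendix A; the names find_hits and hits_report are hypothetical, and the sketch does not reproduce exactly which surrounding characters the Word macro copies.

```python
def find_hits(text, theme, max_hits=1000):
    """Collect the text between each begin marker and the next end marker,
    scanning forward once through the document."""
    begin_tag = "[[" + theme + "]]"
    end_tag = "[[>" + theme + "]]"
    hits = []
    position = 0
    while len(hits) < max_hits:
        start = text.find(begin_tag, position)
        if start == -1:                 # no further begin marker: stop
            break
        start += len(begin_tag)
        stop = text.find(end_tag, start)
        if stop == -1:                  # begin marker without an end marker: stop
            break
        hits.append(text[start:stop].strip())
        position = stop + len(end_tag)  # resume just after this end marker
    return hits


def hits_report(text, theme):
    """Format hits like the Hits.doc listing: header, numbered hits, summary."""
    hits = find_hits(text, theme)
    lines = ["Searching For Theme: " + theme]
    lines += [f"{number}. {hit}" for number, hit in enumerate(hits, start=1)]
    lines.append(f"End of Search: {len(hits)} Hits Found" if hits
                 else "End of Search: No Hits Found")
    return "\n".join(lines)


doc = ("Intro [[cost]]the fare is high [[>cost]] middle "
       "[[time]]it takes hours [[>time]] and "
       "[[cost]]medicine is expensive [[>cost]] end.")

print(hits_report(doc, "cost"))
```

The max_hits cap mirrors the loop limit of 1,000 in the macro. Unlike the macro, the sketch leaves the searched text untouched, because it never needs to insert and erase a temporary marker.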
REFERENCES
Berelson, B. 1952. Content analysis in communication research. Glencoe, IL: Free Press.
Bernard, H. R. 1991. About text management and computers. Cultural Anthropology Methods Journal 3:1–4, 7, 12.
Bernard, H. R. 2002. Research methods in anthropology: Qualitative and quantitative approaches. Thousand Oaks, CA: Sage.
Bolton, R. 1984. Computers in ethnographic research: Final report. Washington, DC: National Institute of Education.
Dey, I. 1993. Qualitative data analysis: A user friendly guide for social scientists. London: Routledge and Kegan Paul.
Gillespie, G. W. Jr. 1986. Using word processor macros for computer-assisted qualitative analysis. Qualitative Sociology 9:283–92.
Glaser, B. G., and A. Strauss. 1967. The discovery of grounded theory: Strategies for qualitative research. Chicago: Aldine.
Krippendorff, K. 1980. Content analysis: An introduction to its methodology. Beverly Hills, CA: Sage.
Pool, I. D. S. 1959. Trends in content analysis. Urbana: University of Illinois Press.
Ryan, G. 1993. Using WordPerfect macros to handle field notes I: Coding. Cultural Anthropology Methods Journal 5:10, 11.
Seidel, J., and U. Kelle. 1995. Different functions of coding in the analysis of textual data. In Computer-aided qualitative data analysis: Theory, methods and practice, edited by U. Kelle, 52–61. London: Sage.
Strauss, A., and J. Corbin. 1990. Basics of qualitative research: Grounded theory procedures and techniques. Newbury Park, CA: Sage.
Truex, G. F. 1993. Tagging and typing: Notes on codes in anthropology. Cultural Anthropology Methods Journal 5:3–5.
Weber, R. P. 1990. Basic content analysis. Newbury Park, CA: Sage.
GERY W. RYAN (Ph.D., University of Florida) is a behavioral scientist at RAND Corporation. He has conducted fieldwork on health care choices in the United States, Latin America, and Africa. He also has written and lectured on qualitative data collection and analysis techniques. Before joining RAND, Ryan was the associate director of the Fieldwork and Qualitative Data Laboratory at the UCLA Medical School and assistant professor of anthropology at the University of Missouri–Columbia. He was a coeditor of Cultural Anthropology Methods Journal (1993–98) and is currently on the editorial board of Field Methods. He has published in Social Science & Medicine, Human Organization, and Archives of Medical Research.