30
IMAGES ON THE MOVE
Analytics for a Mixed Methods Approach

Virginia Kuhn
Whether we like it or not, it is the movies that mold, more than any other single force, the opinions, the taste, the language, the dress, the behavior, and even the physical appearance of a public comprising more than 60 percent of the population of the earth. (Panofsky 2002 [1934]: 70)
Everything I want to say about the need for video analytics (broadly defined as the use of computational methods to analyze filmic archives) as a component of humanities research is encoded in the above epigraph. The essay from which it is taken, “Style and Medium in the Motion Pictures” (2002 [1934]), was written over 80 years ago. A highly respected art historian, Panofsky argued that cinema should be considered art, his somewhat qualified claim and resignation notwithstanding. His reference to the public influence of media, as well as his use of statistical evidence (regardless of the precision of his “60 percent”), are emblematic of current debates in the humanities—those framed as “public humanities” and “digital humanities,” respectively. Quantification is a somewhat odd move for someone so famous for close-reading the iconography of visual art, and yet I argue that it provides essential context for qualitative methods. Panofsky’s essay and his attendant lectures were sensationalized simply because of the subject matter: cinema had not been considered art by most critics prior to the 1930s. However, for my current purposes, the richness of Panofsky’s essay lies in his discussion of cinema’s potential, specifically the “dynamization of space” and “spatialization of time” (2002 [1934]: 71). Considering the medium’s complexity as well as its increasing ubiquity, a mixed methods approach to cinema studies is vital because no single method is adequate on its own.

And research is sorely needed. Moving images bombard us, impacting our lives in ways we have yet to adequately understand. The abundance of consumer-grade recording devices and of online networks that can house and stream video combines to render filmic media (also called moving images) a common form of communication and expression, whose sources are varied and numerous. They include analogue media, such as digitized films and television shows; massive databases of instant video (e.g., Netflix, Hulu, and Amazon); hosting sites (e.g., YouTube, Vimeo, and the Internet Archive); TED talks; academic video, including recordings of conferences, lectures, and tutorials; and, finally, incidental footage from surveillance cameras, drones, and body cameras (e.g., GoPro). Further, as depth cameras with motion sensors—for example, RGB-D, Kinect, and Leap Motion—proliferate, they create big, fat files that contribute to the vastness of these databases.

This proliferation of filmic media does not mean these databases are particularly useful for research. Even as the recording of images is now commonplace, their organization, arrangement, and modes of access are far less straightforward. That is to say, while the processes of production have been automated, those for accessing media have not, leaving a vital aspect of culture essentially unavailable to researchers. As Lev Manovich pointed out more than a decade ago, “by the end of the 20th century, the problem was no longer how to create a new media object such as an image; the new problem was how to find an object that already exists somewhere” (2005: 35). Indeed, filmic archives are characterized by incomplete metadata and wildly divergent content-tagging schemas. Since more media is produced daily than can be viewed in a lifetime, the situation will only intensify as time passes (ReelSEO 2015).

This is the climate in which computational approaches to studying media have emerged. Cultural analytics, data analytics, big data, information visualization, distant reading, network analysis—all are founded on the notion that today’s media landscape is too vast for a single human to study with any sort of breadth or, indeed, depth. Yet one of the most troubling aspects of computational approaches to research is the way they are often misunderstood by humanities scholars and, as such, simply dismissed. The issue is not strictly a matter of quantitative versus qualitative methods, or even of formal analysis versus critical theory. Such divides are exacerbated when it comes to computational analysis, which seems somehow to exceed human perception or cognition, orienting toward the hard sciences and away from an emphasis on the individual. To make matters worse, some of the most prominent proponents of computational methods, such as Lev Manovich and Franco Moretti, criticize manual approaches like close reading and instead rely almost entirely on computation. Manovich (2011) argues that, to date, the study of culture has relied on either shallow data, such as statistics, or deep data, such as close reading and thick description, and that these methods should be replaced by the use of pattern recognition to analyze the massive cultural data sets generated and disseminated via social media and web technologies. Moretti argues that the very act of close reading is premised upon an extremely small number of canonical texts, since scholars only invest such effort in the artifacts deemed valuable (2013: 48). Though this is clearly not a zero-sum game—close reading can coexist with distant reading and pattern recognition—such criticisms of close reading have alienated many humanists, and this alienation risks limiting the knowledge that might be gained from computational approaches, which could benefit from the input of researchers who see themselves as more analogue than digital.
From this angle, a mixed methods approach is strategic, since it implies choosing appropriate methods for a particular project rather than simply denouncing any particular one. Before proceeding, it is important to establish the exigency for studying filmic media as well as the specific challenges inherent in this type of computational research. Only then will my argument for video analytics as part of a mixed methods approach fully resonate with other methods, such as close reading, hermeneutics, and ethnography, in media studies and digital humanities. If, however, the choice of methods is viewed in binary terms, then the work produced will necessarily lack the nuance expected by scholarly communities. It is worth remembering, too, that current methodologies, like the academic disciplines from which they proceed, were reified during an analogue era and may need rethinking for a digital or computational context. This rethinking should also make explicit the processes by which knowledge is produced, opening them up to analysis and re-evaluation should they no longer prove viable. Moreover, the complexity of a globally networked world demands a collaborative approach to research across disciplines and institutions, and this collaboration means methods will come under scrutiny (Balsamo 2011: 9; see also Lear Center 2010; Kuhn 2010).

Filmic media both reflects and shapes society, but the precise processes by which this occurs remain unclear, as do their cumulative effects. Study after study reveals the often subliminal impact of visual media on human beings, from stereotype threat to psychic numbing, from the relationship between edit lengths and attention span to the physical effects of vision via mirror neurons (Slovic 2007; Steele 2011). And since the medium operates at the affective level (much of which occurs below the threshold of consciousness, or at least outside the range of verbal language), it is vital to study not only the narrative content but also the formal properties, both among and across genres.

However, none of this research is tenable without the curating, indexing, tagging, and cataloguing necessary to make filmic archives searchable and discoverable. These databases need metadata, or “data about data” (e.g., titles, dates, creators, and descriptions). Metadata derives from the audio and visual features of filmic media and should include information about format, content, context, and provenance. The generation of this metadata may require some degree of automation, in addition to some form of collaborative editing, the most common example of which is Wikipedia.
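To give this abstraction some shape, here is a minimal sketch of what a single video’s record might look like; every field name and value is an illustrative assumption, not an established metadata standard or the schema of any archive discussed in this chapter.

```python
# A hypothetical metadata record covering the four kinds of information named
# above: format, content, context, and provenance. Field names are invented
# for illustration only.
record = {
    "title": "Untitled newsreel",
    "creator": "unknown",
    "date": "1929",
    "format": {"codec": "h264", "duration_s": 240, "resolution": "720x576"},
    "content_tags": ["street", "crowd", "streetcar"],        # what the footage shows
    "context": "digitized from a 35mm print; silent, tinted",
    "provenance": "donated reel, source archive unrecorded",
}

for field, value in record.items():
    print(field, ":", value)
```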
Human Vision, Computer Vision

Automating image analysis is no trivial matter, in part because human vision is quite different from computer vision. In fact, popular programs such as CAPTCHA were created to distinguish between the two. An acronym for Completely Automated Public Turing test to tell Computers and Humans Apart, CAPTCHA distinguishes “bots” from humans by asking people to type distorted letters shown on a screen into a website’s text box. This procedure helps to reduce the likelihood of website hacks; in the process, people also help computers to interpret images as plain text. Compared with computers, human visual perception allows people to sharpen the blurriness of distorted letters and read them as distinct from their backgrounds, literally separating the text from the context. Computers often have difficulty making such distinctions.

Even if a strength of human vision is its ability to contextualize images, this capacity may also result in flawed apprehension, where people either filter out too much information or fill gaps too thoroughly. In Now You See It (2011), Cathy Davidson offers a telling example of such filtering. She notes that the impetus for the book came after a colloquium during which participants were shown a video that tracks “attention blindness,” a phenomenon that effectively cloaks an object in one’s visual field when the body or the brain is otherwise engaged. The now-viral video features six people—three wearing white shirts and three wearing black shirts—passing two basketballs. Viewers are told to count the passes by the white-shirted players as they weave among the other players. In the course of gameplay, a person in a gorilla suit enters the frame, gestures, and then leaves. Most people do not see the gorilla, since they are focused on isolating the three players and counting how many times the ball is passed. The opposite effect can happen, too, as people fill gaps in a visual narrative to give it meaning. This attention blindness is why audiences typically do not notice continuity errors in films. Of course, such lapses in vision are revealed upon repeated viewings and the sustained analysis made possible by technologies that allow video to be stopped, started, slowed, and replayed.

By contrast, computer vision attends to all aspects of the visual field and does not privilege one pixel over another. It scans every detail of an image without filling in narrative gaps. While advances are being made in machine learning or “neural networking,” training an algorithm in what to identify is quite tricky (Hern 2015). The bottom line is that, at present, computerized image detection cannot replace human vision; however, it significantly narrows the conditions for human interpretation of visual materials. In this way, computer vision algorithms function much like search engines, endowing the indexing process with an element of discovery. This automation still requires human intervention, whether to verify results or to train the algorithm in the first place, a process that is labor intensive (Li 2015; Pangburn 2016).
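A small, hedged illustration of that pixel-level indifference: a stock edge detector applies one rule to every pixel, with none of the contextual filling-in that human vision performs. The sketch below uses OpenCV’s off-the-shelf Canny detector; the file names are placeholders, and nothing here is specific to the tools discussed in this chapter.

```python
# A minimal sketch of machine "seeing": every pixel is processed by the same
# rule, with none of the narrative filling-in that human vision performs.
# Assumes OpenCV (pip install opencv-python); "frame.png" is a placeholder.
import cv2

frame = cv2.imread("frame.png")                          # H x W x 3 array of pixels
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)           # collapse color to intensity
edges = cv2.Canny(gray, threshold1=100, threshold2=200)  # flag sharp intensity changes
cv2.imwrite("frame_edges.png", edges)                    # white pixels mark detected edges
```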
Manual Tagging and Crowdsourcing

Clearly, more is better when it comes to annotating the content of moving-image media, since annotation allows for the search and discovery of content. But some types of metadata are easier to generate than others. For instance, the voice and sound tracks of filmic media are less daunting to automate than image tracks. Advances in audio signal processing have paved the way for music-detection applications such as Shazam. These applications are fairly accurate, but voice recognition programs are less precise. Still, transcripts are often available, particularly for studio films, which begin development in script form. There are also several ventures, such as The Closed Captioning Project, that ask users to transcribe video, often as a classroom project, providing searchable captions for researchers. Although this method is still manual, it contributes to the larger repository, and the labor is spread out across users. And even though people tend to add nonlinguistic utterances and inflections when speaking, automating speech recognition remains less daunting than automating image recognition.

Manual tagging is a very different story. The creation of content tags for image-based media is not only labor intensive but also inexact. This is true when labeling objects as well as their contextual elements, or what Sara Shatford Layne (1986) calls “ofness” and “aboutness,” based on Panofsky’s three levels of pictorial meaning (1955). First, there are many human languages and dialects, and within each language different words may have similar meanings. Second, word choices and tagging schemas for image labeling are not straightforward, nor are they ideologically neutral, and this bias impacts any application they undergird. For instance, Anne Balsamo interprets semantic mapping tools such as the Visual Thesaurus and convincingly shows how the deep structures on which such tools are built subtly reinforce the gender stereotypes of a culture (2011: 186–89). Finally, there is an inevitable gap in meaning when words stand in for images. For all of these reasons, then, content tagging is vital to research with filmic media, but it is insufficient on its own.

In addition to tagging ofness and aboutness, or objects and their contextual materials, or “things and stuff” (Heitz & Koller 2008: 2), metadata is also needed to identify the formal properties of filmic media, such as shot type, camera angle, color palette, and the like. These elements shape meaning. For example, low camera angles bestow power on the image in relation to the viewer, while people and objects in the foreground are given more weight than those in the middle ground and background (Kress & van Leeuwen 2006: 140–42). Color likewise functions to make meaning, particularly with regard to the representation of human beings and their skin tones, which are impacted by cool or warm color timing (Kuhn 2012).
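Some of these formal properties can already be computed with simple means. As a hedged sketch (not a tool named in this chapter), a frame’s dominant color palette can be estimated by quantizing pixel values and counting the most frequent bins:

```python
# A hedged sketch: approximate a frame's dominant palette by quantizing each
# color channel to 8 levels and counting the most frequent bins. The file
# name is a placeholder.
import cv2
from collections import Counter

frame = cv2.imread("frame.png")
pixels = frame.reshape(-1, 3) // 32 * 32    # quantize B, G, R to multiples of 32
counts = Counter(map(tuple, pixels))
for color, n in counts.most_common(5):      # five most frequent (B, G, R) bins
    print(color, n)
```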
Between the interpretive slipperiness and the sheer vastness of filmic media, crowdsourcing becomes a viable mode of content tagging. Crowdsourcing describes the combined efforts of thousands or millions of people, each completing a relatively trivial task, the cumulative effect of which contributes to a far larger endeavor, one that would otherwise be untenable. Further, when there are varying interpretations, crowdsourcing can offer a system of checks and balances (Frick 2015). Just as Wikipedia became a leading database of information, crowdsourcing proves attractive in this case, given its ability to use content tags created by many people.

All things considered, then, the optimal approach to research with filmic media would pair the strength of human analysis with the computational power and speed of machines. It would also allow for the study of media’s conceptual and formal qualities. Though utterly necessary, formal analysis is not adequate on its own, since its interpretation hinges on the cultural, political, and technological affordances that shape creation. However, in solving some of the technological issues, it is easy to become preoccupied with tools and lose sight of critical engagement with this media. This risk is one reason why Alan Liu contends that digital humanities needs cultural criticism and must take a more active role in the issues that confront contemporary culture. He suggests that “digital humanists will need to show that thinking critically about metadata, for instance, scales into thinking critically about the power, finance, and other governance protocols of the world” (2012: 495). Indeed, identifying the ways in which seemingly neutral systems like metadata standards are not ideologically neutral and, in fact, impact what can be found and what remains invisible is the hallmark of critical theory.

Cultural critique is also a necessary component of media literacy, but while critique provides an analysis of things as they are—identifying and interpreting problems—it seldom offers alternatives to those problems. A productive approach would generate questions such as the following: Which metadata schema might be more inclusive? How might we mitigate the negative impacts of the sexist representations of women that are rampant in mass media? Further, when screens disappear, giving way to images that are embedded in objects and the environment, and so less obviously staged, the need for critical evaluation, media literacy, and guidelines for the ethical use of media increases dramatically.
Exemplars

Since the combination of computational techniques with critical analysis is such a compelling approach to filmic media, it is rapidly gaining ground among digital humanists and computer scientists. This growth is evidenced by the recent blossoming of projects that deploy computational methods to aid in reading the formal and conceptual features of filmic media (Tsivian 2005; Manovich & Douglass 2008; Casey et al. 2012; Hoyt & Acland 2013). Some of these methods involve annotation tools, while others measure and visualize media in new ways.

Annotation tools include ANVIL (anvil-software.org), created by Michael Kipp, and ELAN, developed at the Max Planck Institute for Psycholinguistics. Both are useful for a single scholar’s analysis, but they are desktop programs lacking networked functionality such as online collaboration and integration with existing film databases, thereby limiting their potential for broad analysis and interoperability. The Semantic Annotation Tool, a project recently launched by Mark Williams and John Bell, promises to help remedy this limitation via a web-based tool with a server that will allow researchers to share their notes with colleagues. Shared annotations will be hugely helpful in making film archives discoverable and searchable.

Still, image-processing applications are needed. Not only do they hold the potential to automate approaches to time-based media, they also provide a way to compare the formal elements of film across genres and linguistic orientations. One of the best-known proponents of image processing via computer vision is Lev Manovich, who founded the Software Studies Initiative in 2005. Manovich has embarked upon several projects that visualize large quantities of images (including photographs, page images, digitized paintings, and film frames), creating mosaics of sorts that use thumbnails of the images themselves to form the larger visualization. For this purpose, the research team at the Software Studies Initiative developed ImagePlot, an application that can be used in combination with ImageJ, a program that helps medical researchers compare magnetic resonance imaging (MRI) and other biological scans and is thus useful for manipulating images and visualizing them in “stacks” that afford a three-dimensional component. This approach is interestingly deployed in the service of film analysis by Kevin Ferguson, who argues for a renewed interest in cinema as a volume, even as it is instantiated in a two-dimensional form (2015).

Elsewhere, Barry Salt and Yuri Tsivian are well known for measuring the shots of a film—the length of each section between edits—and, along with computer scientist Gunars Civjans, have developed a tool called Cinemetrics, which measures the average shot length (ASL) of a particular film. While shot length can be (and has been) measured manually, the ability to compare the ASL of films across time and space has proven transformative for Tsivian, who reports having “felt [his] heart beat faster, for it turned out that between 1917 and 1918 the cutting tempo of Russian films had jumped from the slowest to the fastest in the world” (2005). Frederic Brodbeck, by contrast, combines shot measurement and visualization, creating “movie fingerprints” that are aesthetically pleasing yet short on analytic potential (n.d.). While these types of projects are proliferating, their findings are either purely artistic (and thus not immediately useful) or somewhat obvious (e.g., action films are more frenzied). More conceptual and practical work will be needed before the field can hone this method, possibly combining it with textual annotations and other image-processing applications to make it a more meaningful research tool.

Combining textual and visual tools was the impetus for the Video Analysis Tableau (the VAT), a system that builds on several of these tools to establish a software workbench for video analysis, annotation, and visualization (Kuhn et al. 2015). The VAT—a project I lead and whose team includes supercomputing scientists, media scholars, and digital humanists—employs a host of algorithms for machine reading in addition to user-generated tagging and annotation, using both current and experimental discovery methods. This work is supported by the National Science Foundation’s XSEDE program (Extreme Science and Engineering Discovery Environment), uses open source code, and is built on the Clowder framework as its database and interface. Upon ingestion into the VAT, footage is processed with extractors that segment the video, convert it to a common codec to facilitate previewing, and create a novel visualization that serves as a barcode of sorts (see Figure 30.1). This preprocessing allows users to quickly preview a video by looking at its major segments and then selecting a particular frame to be queried across a specified collection of videos.
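To suggest how little machinery some of these measures require, here is a minimal Python sketch, emphatically not the VAT’s or Cinemetrics’ actual code, that produces both a barcode-style strip of per-frame average colors and a rough average shot length from naive frame differencing; the file name and the cut threshold are illustrative guesses.

```python
# A toy approximation of two measures discussed above: a "barcode" built from
# per-frame average colors, and a rough ASL from crude cut detection.
# Assumes OpenCV and NumPy; "film.mp4" is a placeholder file name.
import cv2
import numpy as np

cap = cv2.VideoCapture("film.mp4")
fps = cap.get(cv2.CAP_PROP_FPS) or 24.0     # fall back if metadata is missing

columns, prev, cuts, frames = [], None, 0, 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    frames += 1
    columns.append(frame.mean(axis=(0, 1)))          # average B, G, R of this frame
    small = cv2.resize(frame, (64, 36)).astype(np.float32)
    if prev is not None and np.abs(small - prev).mean() > 30:
        cuts += 1                                    # big frame-to-frame change = likely cut
    prev = small
cap.release()

# One colored stripe per frame, laid side by side: a barcode of the film.
barcode = np.tile(np.array(columns, dtype=np.uint8)[np.newaxis, :, :], (100, 1, 1))
cv2.imwrite("barcode.png", barcode)

shots = cuts + 1
print("rough ASL: %.2f seconds" % ((frames / fps) / shots))
```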
Searching based on that query frame, the system returns the 10 most similar frames for each of the computer vision algorithms (also known as descriptors), which create histograms of the frame to identify features (such as image boundaries or distributions of color), detect edges, or look for transformations in texture. Each result returned includes a calculation of the relative distance from the query frame to the frame returned, providing a pedagogical component, as users get a glimpse of the deeper structures upon which the VAT is built.
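As a hedged sketch of the histogram family of descriptors, two frames can be compared by binning their pixel colors and measuring the distance between the resulting histograms; the VAT’s actual descriptors and distance measures may differ, and the file names below are placeholders.

```python
# A hedged sketch of one descriptor family named above: compare two frames by
# the distance between their color histograms. Not the VAT's implementation.
import cv2

def color_histogram(path):
    img = cv2.imread(path)
    # 8 bins per channel over the full 0-255 range of B, G, and R
    hist = cv2.calcHist([img], [0, 1, 2], None, [8, 8, 8], [0, 256] * 3)
    return cv2.normalize(hist, hist).flatten()   # scale so totals are comparable

query = color_histogram("query_frame.png")
candidate = color_histogram("candidate_frame.png")

# Chi-square distance: smaller means a more similar color distribution.
print(cv2.compareHist(query, candidate, cv2.HISTCMP_CHISQR))
```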
Figure 30.1 Screen shot of the VAT’s “Movie Slice” barcode visualization, which affords a spatial view of films. Visualization by Dave Bock. Source: Image care of the author.
By assessing the various results returned, users begin to understand the logic of the particular descriptor, encouraging a type of algorithmic literacy. They can then hone their query frame based on this understanding, even as they construct new research questions and pursue further inquiry. For instance, in an early test case, we were looking for depictions of cigarettes and people smoking. The query frame used was the best representation of a person holding a cigarette close to her mouth. The frames returned included copious images from television news, which featured reporters speaking into microphones or holding them out to others whom they were interviewing. This makes sense, as the system searches for small objects held at an angle. To further refine the search, then, we focused on color distribution, looking for instances of white near those of flesh colors. We also had luck finding smoke in a frame by focusing more on the DCT (discrete cosine transform) descriptor, which works at the boundaries of an object, allowing us to find the “fuzzy” borders that indicate smoke over an image.

While the VAT is nascent, its development team is focused on establishing a community of users such that the project’s evolution is directed by and responsive to the scholarly community, rather than an individual or corporation. Indeed, I suspect that—via a mixed methods approach—the VAT and projects like it will enrich cinema and media studies by making visible those artifacts that may have been overlooked by scholars so far.
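For readers curious about the DCT mentioned above, the following speculative sketch illustrates its underlying intuition: soft, “fuzzy” regions such as smoke concentrate their energy in low frequencies, while hard edges push energy into higher ones. This illustrates the transform itself, not the VAT’s implementation.

```python
# A speculative sketch of the DCT intuition. The 8 x 8 block and the reported
# ratio are illustrative; "frame.png" is a placeholder file name.
import cv2
import numpy as np

gray = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)
block = gray[:8, :8]                        # examine one 8 x 8 block of pixels
coeffs = cv2.dct(block)                     # 2-D discrete cosine transform

low = np.abs(coeffs[:2, :2]).sum()          # energy in the lowest frequencies
total = np.abs(coeffs).sum()
print("low-frequency share: %.2f" % (low / total))   # near 1.0 suggests a soft, "fuzzy" region
```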
New Directions: Cinema Studies and Neuroscience

While the projects discussed in the previous section will become more compelling the more they are adopted, there is ample room for continued experimentation with computational approaches to filmic media. Perhaps the most exciting and intriguing emergent work lies at the crossroads of cinema studies and visual neuroscience. Neuroscientists have long used MRIs and fMRIs (functional MRIs, which measure blood flow) to track brain activity in order to study how the brain reacts to various stimuli in real time. While MRIs can be quite uncomfortable and loud, research with sounds and images shows some promise, as researchers use headphones and other padding to dampen the machine’s noise and amplify the sound of the stimulus.

For instance, Pia Tikka (2008) uses fMRI scans to track brain activity while people view filmic media. This work builds on Tikka’s ambitious doctoral dissertation, which includes a monograph, a short film, and video documentation of a gallery installation. The written portion charts the practical and theoretical work of the film pioneer Sergei Eisenstein, as Tikka argues for a research-based practice (as Eisenstein did in 1935) instead of the practice-based research that has dominated university-level arts education. She notes that, rather than “proposing a readymade theory in conclusion, perhaps the extended reach of the mind’s conceptual grasp (read ‘research’) may provide the practical domain of cinema with new insights” (2008: 18). The media elements of the dissertation do just this, documenting her Helsinki gallery installation, which consisted of four chairs wired to track viewers’ skin response and heart rates as they viewed a short film whose subject ranged from the quotidian (a laundromat) to the violent (a rape). This work provides evidence of media’s physical impact on viewers—a notion that is vaguely accepted as true, though seldom given much credence. A systematic study of this impact will go a long way toward understanding the relationship between media inundation and human health.

Filmmaker and critic Peter Wyeth offers a conceptual framework for such efforts, contending that we have shortchanged the extraordinary power of cinema by ignoring the sciences. In The Matter of Vision: Affective Neurobiology and Cinema, he uses recent work in brain science and evolutionary biology to arrive at his three main premises: vision is a primary source of human intelligence, the visual detection of movement is a key function of human perception, and emotion is the mind’s default state, with reason being but a filter of affect (2015: 12). Obviously, if these premises are correct, then they suggest the importance of, and a natural affinity for, cinema—a visually rich medium, full of movement, with an affective dimension. Being familiar with much of the underlying research, I find Wyeth’s arguments plausible. However, without a mixed methods and interdisciplinary research agenda, it will be difficult to assess and enhance this research.

Perhaps it was Panofsky’s early call to treat cinema as art that pushed it away from its early engagement with science, or maybe cinema studies’ roots in literary criticism confined filmic media to the arts and humanities. Regardless of the cause, one thing is clear: that moment has passed.
Further Reading

Dinsman, M. (2016) “The Digital in the Humanities: A Special Interview Series,” Los Angeles Review of Books, retrieved from lareviewofbooks.org/feature/the-digital-in-the-humanities.
Drucker, J. (2015) “Database Narratives in Book and Online,” Journal of Electronic Publishing 18(1), retrieved from quod.lib.umich.edu/j/jep/3336451.0018.113?view=text;rgn=main.
Manovich, L. (2013) “Visualizing Vertov,” Software Studies Initiative, retrieved from softwarestudies.com/cultural_analytics/Manovich.Visualizing_Vertov.2013.pdf.
Ross, S. (2014) “In Praise of Overstating the Case: A Review of Franco Moretti, Distant Reading (London: Verso, 2013),” Digital Humanities Quarterly 8(1), retrieved from www.digitalhumanities.org/dhq/vol/8/1/000171/000171.html.
References

ANVIL annotation software (n.d.) retrieved from anvil-software.org.
Balsamo, A. (2011) Designing Culture: The Technological Imagination at Work, Durham, NC: Duke University Press.
Brodbeck, F. (n.d.) Cinemetrics project website, retrieved from cinemetrics.fredericbrodbeck.de.
Casey, M., M. Williams, and T. Stoll (2012) ACTION: Tools for Cinematic Information Retrieval, retrieved from aum.dartmouth.edu/~action/index.html.
Cinemetrics software (n.d.) retrieved from cinemetrics.lv.
Cutting, J. E., J. E. DeLong, and C. E. Nothelfer (2010) “Attention and the Evolution of Hollywood Film,” Psychological Science 21(3), 432–39.
Davidson, C. (2011) Now You See It: How Technology and Brain Science Will Transform Schools and Business for the 21st Century, New York, NY: Viking.
ELAN annotation software (n.d.) retrieved from tla.mpi.nl/tools/tla-tools/elan.
Ferguson, K. L. (2015) “Volumetric Cinema,” [in]Transition, 17 March.
Frick, W. (2015) “What to Do When People Draw Different Conclusions from the Same Data,” Harvard Business Review blog, retrieved from hbr.org/2015/03/what-to-do-when-people-draw-different-conclusions-from-the-same-data.
Heitz, G. and D. Koller (2008) “Learning Spatial Context: Using Stuff to Find Things,” ECCV 2008: Proceedings of the 10th European Conference on Computer Vision, October 12–18, Marseille, FR.
Henn, S. (2014) “When Women Stopped Coding,” Planet Money, episode 576, National Public Radio, retrieved from www.npr.org/blogs/money/2014/10/21/357629765/when-women-stopped-coding.
Hern, A. (2015) “Yes, Androids Do Dream of Electric Sheep,” The Guardian, 18 June, retrieved from www.theguardian.com/technology/2015/jun/18/google-image-recognition-neural-network-androids-dream-electric-sheep.
Hoyt, E. and C. Acland (2013) “Project Arclight: Analytics for the Study of 20th Century Media,” Digging into Data Challenge, retrieved from diggingintodata.org/awards/2013/project/project-arclight-analytics-study-20th-century-media.
ImagePlot (n.d.) Software Studies Initiative, retrieved from lab.softwarestudies.com/p/imageplot.html.
ImageJ software (n.d.) National Institutes of Health, retrieved from imagej.nih.gov/ij.
Kress, G. and T. van Leeuwen (2006) Reading Images: The Grammar of Visual Design, 2nd edn, New York, NY: Routledge.
Kuhn, V. (2010) “The Techno-Humanist Interaction,” EDUCAUSE Review 45(6), 58–60, retrieved from er.educause.edu/~/media/files/article-downloads/erm1067.pdf.
Kuhn, V. (2012) “The Rhetoric of Remix,” in F. Coppa and J. Levin Russo (eds.) “Fan/Remix Video,” special issue, Transformative Works and Cultures 9, retrieved from dx.doi.org/10.3983/twc.2012.0358.
Kuhn, V., A. Craig, M. Simeone, S. Puthanveetil Satheesan, and L. Marini (2015) “The VAT: Enhanced Video Analysis,” in Proceedings of the 2015 Extreme Science and Engineering Discovery Environment (XSEDE) Conference, Atlanta, July, article 29.
Lear Center (2010) “Creativity and Collaboration in the Academy: Technology and the Future of Research,” Annenberg School of Communication, University of Southern California, retrieved from learcenter.org/project/collab.
Li, F. (2015) “How We’re Teaching Computers to Understand Pictures,” TED talks, retrieved from ted.com/playlists/310/talks_on_artificial_intelligence.
Liu, A. (2012) “Where Is Cultural Criticism in the Digital Humanities?” in M. Gold (ed.) Debates in the Digital Humanities, Minneapolis, MN: University of Minnesota Press, pp. 490–501.
Manovich, L. (2005) The Language of New Media, Cambridge, MA: MIT Press.
Manovich, L. (2011) “Guest Column: Lev Manovich Takes Us From Reading to Pattern Recognition,” The Creators Project, January 20, retrieved from thecreatorsproject.vice.com/blog/guest-column-lev-manovich-takes-us-from-reading-to-pattern-recognition.
Manovich, L. and J. Douglass (2008) “Cultural Analytics,” Software Studies Initiative, retrieved from lab.softwarestudies.com/p/cultural-analytics.html.
Marini, L., et al. (2015) Clowder interface and database, retrieved from isda.ncsa.illinois.edu/clowder.
Moretti, F. (2013) Distant Reading, London, UK: Verso.
Pangburn, D. (2016) “Here’s What Actually Goes into Creating Artificial Intelligence,” The Creators Project, retrieved from thecreatorsproject.vice.com/blog/artificial-neural-network-visualization.
Panofsky, E. (1955) Meaning in the Visual Arts: Papers in and on Art History, Garden City, NY: Doubleday.
Panofsky, E. (2002 [1934]) “Style and Medium in the Motion Pictures,” in A. Vacche (ed.) The Visual Turn, New Brunswick, NJ: Rutgers University Press, pp. 69–94.
ReelSEO (2015) “500 Hours of Video Uploaded to YouTube Every Minute [Forecast],” retrieved from www.reelseo.com/youtube-300-hours.
Shatford Layne, S. (1986) “Analyzing the Subject of a Picture: A Theoretical Approach,” Cataloging & Classification Quarterly 6(3), 39–62.
Simons, D. (2012) “But Did You See the Gorilla? The Problem with Inattentional Blindness,” Smithsonian Magazine, retrieved from www.smithsonianmag.com/science-nature/but-did-you-see-the-gorilla-the-problem-with-inattentional-blindness-17339778/?no-ist.
Slovic, P. (2007) “‘If I Look at the Mass I Will Never Act’: Psychic Numbing and Genocide,” Judgment and Decision Making 2(2), 79–95, retrieved from journal.sjdm.org/7303a/jdm7303a.htm.
Steele, C. (2011) Whistling Vivaldi: How Stereotypes Affect Us and What We Can Do, New York, NY: Norton.
Tikka, P. (2008) Enactive Cinema—Simulatorium Eisensteinense, Helsinki: University of Art and Design.
Tsivian, Y. (2005) Cinemetrics: Movie Measurement and Study Tool Database, retrieved from cinemetrics.lv/index.php.
Wyeth, P. (2015) The Matter of Vision: Affective Neurobiology and Cinema, New Barnet, UK: John Libbey Publishing.