Science & Society
Spinning the web of open science
Social networks for scientists and data sharing, together with open access, promise to change the way research is conducted and communicated
Andrea Rinaldi
Digital and information technologies have drastically changed the way people work, interact professionally and socially and spend their leisure time. Scientific research is not exempt from these changes: the free and rapid flow of information, ideas and documents will both require and foster new habits of collaboration among researchers, stimulate economic activities and even improve public dialogue on science (http://royalsociety.org/policy/projects/science-public-enterprise/report/). “A growing body of evidence suggests that public visibility and constructive conversation on social media networks can be beneficial for scientists, impacting research in a number of key ways,” wrote computational biologist Holly Bik and marine biologist Miriam Goldstein [1]. The digital revolution could herald a new era of “open science” where scientists freely and easily share published work, experimental data, ideas and opinions and mutually benefit from the open and collaborative realm that is emerging in the digital age. Yet, not all is brave in the new digital world and scientists seem to be rather reluctant to engage in what Web 2.0 has to offer in terms of exchanging ideas.

Most scientists use social media for two main reasons: networking with other researchers and gaining public visibility (Fig 1). “Scientists need to be engaged in new media platforms because everyone else is already talking about their thoughts and feelings, having discussions about things they care about, and generally, as the name implies, being social,” commented marine biologist and blogger Christie Wilcox [2]. Beyond making research more visible to funders or policymakers, social media may also help to build dialogue and constructive conversation with the general public, particularly about sensitive topics, such as stem cell research or genetically modified food. “[W]e have to make a concentrated effort to get involved in the public discussion about science. We have to be approachable and available to talk about our research. More than ever, this means to be online and actively engaged in new media,” Wilcox noted [2].
A rapidly growing number of scientific associations and research institutions are already active on social media, notably Facebook and Twitter. To support this trend, “there is a pressing need for scientific institutions to offer formalized training opportunities for graduate students and tenured faculty alike to learn how to effectively use this new technology” [1]. The annual meeting of the American Society for Cell Biology in New Orleans, USA, last December, for instance, offered a well-attended “Social Media for Scientists” session.

Despite the seemingly soaring importance of social media and its possible benefits, however, acceptance is still limited among researchers. “In academia, there is often a particular stigma attached to online activities. Actively maintaining an online profile and participating in social media discussions can be seen as a waste of time and a distraction from research and teaching duties,” wrote Bik and Goldstein [1], mentioning that in 2011 only 2.5% of UK and US academics had established a Twitter account. “We believe this perception is misguided and based on incorrect interpretations of what scientists are actually doing online. When used in a targeted and streamlined manner, social media tools can complement and enhance a researcher’s career” [1].
In addition to the large, popular online media, specialized online networks aim to connect scientists based on their professional interests and offer them a space to share and review articles or data and to collaborate (Fig 2). Yet, only a few initiatives have been able to carve out a significant niche among the scientific community. The San Francisco-based Academia.edu, for example, is a platform to share research papers, monitor the impact of one’s own research through article downloads and other analytics and track the research of colleagues. The company’s platform, whose stated mission is “to accelerate the world’s research,” has raised US$17 million in venture capital (www.academia.edu/), has more than 6 million academic users and attracts over 12 million unique visitors a month. One of the main reasons why scientists sign up to these platforms is the hope of attracting interest in their research and, more importantly, their publications.
Freelance science writer in Cagliari, Italy. E-mail:
[email protected] DOI 10.1002/embr.201438659
EMBO reports Vol 15 | No 4 | 2014
© 2014 The Author
Figure 1. Communicating science online. Flowchart showing a decision tree for scientists who are interested in communicating online. Redrawn from [1]. The chart asks “Who do you want to talk to?” and “Why do you want to talk to them?”, groups the answers under Curation, Community and Creation, and rates the time needed for each platform: Twitter and Pinterest (minimal); Facebook, Google+, Tumblr, guest blog and supported PR (medium); own blog (maximum).
“[U]ltimately that is what ‘publish or perish’ comes down to: how are people going to evaluate my work when I am applying for a job or a grant?”, said Richard Price, Founder and CEO of Academia.edu. “That is where the new set of social networks for scientists come in, like Academia.edu: developing new reputation metrics for scientists. [. . .] Citation metrics often take a few years to emerge after a paper is published. During that time the author wants to demonstrate impact, and we help them do that by showing how many people have read their paper, and from which countries.”

ResearchGate is another social networking site “built by scientists, for scientists” that is rapidly gaining popularity (www.researchgate.net). Founded in 2008, it is currently used by more than 3 million researchers to share papers or data, ask questions, connect to new collaborators or find a job. ResearchGate developed a new metric, the RG Score, to measure scientific reputation based on the number and quality of interactions and on how peers receive and evaluate contributions, such as papers or experimental data. The company, based in Berlin, Germany, has attracted significant investments, including US$35 million from Bill Gates, and is planning to raise money from a marketplace for scientific products and other services on the site (http://venturebeat.com/2013/06/04/researchgate-billgates/).

Another major promise of online research networks is that mutual interest in one’s work and sharing expertise, data and ideas could promote more scientific collaboration. “With large teams of scientists, often based at remote institutions, increasingly needing to work together to tackle biggest-ever research challenges there will be a demand for new tools to help facilitate collaboration,” said Elisabeth Iorns, Co-founder and CEO of Science Exchange, an online marketplace for science experiments (www.scienceexchange.com/). “Specifically, there will be an increasing need for tools that allow researchers to easily find and access other scientists with the expertise required to advance their research projects. In our view, to operate most efficiently these tools also need new methods to reward researchers for participating in these collaborations.” Social networks (and possibly associated payment systems to offer financial incentives) could provide a mechanism for scientists to list their expertise and available infrastructure so that other researchers can easily find them and ask for collaborations. The idea is built on successful examples from other industries where “collaborative economies” have been created to increase the sharing of resources and expertise between peers. “The success of this model could be a major accelerator of innovation in biomedicine by facilitating increased speed and quality, while drastically reducing costs,” said Iorns.

Figure 2. Networking scientists. This image shows the co-authorship network of 8,500 doctors and scientists publishing on hepatitis C virus between 2008 and 2012, and the almost 60,000 co-authorship relationships between them. The data was gathered from the Medline database, processed using a custom Python script and visualized using Gephi. Credit: Andy Lamb.
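The Figure 2 caption mentions a custom Python script for building the co-authorship network, but the script itself is not shown. As a rough illustration only, the sketch below shows one way such a network could be derived from per-paper author lists. The function name `coauthorship_edges`, the sample author names and the input layout are assumptions made for this example, not details of the script used for the figure.

```python
from collections import Counter
from itertools import combinations


def coauthorship_edges(papers):
    """Count how often each pair of authors appears on the same paper."""
    edges = Counter()
    for authors in papers:
        # Deduplicate and sort so (A, B) and (B, A) map to the same key.
        for pair in combinations(sorted(set(authors)), 2):
            edges[pair] += 1
    return edges


# Hypothetical author lists, one per paper (placeholders, not Medline records).
papers = [
    ["Author A", "Author B", "Author C"],
    ["Author A", "Author B"],
    ["Author C", "Author D"],
]

edges = coauthorship_edges(papers)
nodes = {name for pair in edges for name in pair}

print(len(nodes), "authors,", len(edges), "co-authorship relationships")
for (first, second), weight in sorted(edges.items()):
    print(f"{first} -- {second}: {weight} shared paper(s)")
```

Each distinct author pair becomes one edge, weighted by the number of shared papers. An edge list of this kind could then be exported, for example as a CSV file, and loaded into a visualization tool such as Gephi.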
Needless to say, not everyone is convinced that online networking as provided by these start-up companies will become a cornerstone of a future ‘open science’. “It’s easy to measure total users or total PDFs uploaded or other metrics and claim some success. [. . .] But why haven’t any of these platforms truly caught on in the scientific community? Fundamentally, it’s because they are add-ons to “the way things get done” and not replacements for the way scientists work day-to-day or how their careers are judged (i.e., you don’t get promoted for great science tweeting),” wrote Mark Drapeau, neurobiologist and now Director for Innovative Engagement for Microsoft’s public sector business (http://www.huffingtonpost.com/mark-drapeau/social-networks-for-scientists_b_1282692.html). “And while there are some well-intentioned, smart people discussing Science 2.0 and what it would take for that to happen, it is in my opinion extremely unlikely that the entire system of how academic science operates in the U.S. will change within the venture capital-backed funding cycle of one of the science social networking companies like ResearchGate.”

In addition to skepticism, established publishers are another hurdle to networking, limiting the free exchange of research papers. Last December, Elsevier asked Academia.edu to remove thousands of papers from the site (http://svpow.com/2013/12/06/elsevier-is-taking-down-papers-from-academia-edu/). “Academic social networks can provide the opportunity for increasing collaborations between researchers, but when hosting your data on commercial services, even the most academic friendly services don’t put commercial success at risk to defend academic freedom,” commented Michelle Brook, Community Coordinator at the Open Knowledge Foundation, a non-profit organization dedicated to promoting open access (http://okfn.org/).

At the same time, Elsevier itself is investing in social networks for researchers. In April 2013, the company announced the acquisition of Mendeley, an academic social network offering software- and paper-sharing services (www.mendeley.com/). Official sources spoke of a “joint vision” shared by the two companies and big advantages for authors from the integration of Mendeley services into Elsevier’s platform, but, to put it in the words of David Dobbs in The New Yorker, “[M]any Mendeley users felt as if the Galactic Empire had co-opted the Rebel Alliance” (http://www.newyorker.com/online/blogs/elements/2013/04/elsevier-mendeley-journals-science-software.html). The reason for the acquisition, some believe, could be to counteract a potential threat to Elsevier’s business model while tapping into Mendeley’s aggregated data generated by its more than two million users through their searching and sharing patterns.

Indeed, one of the cornerstones of open science is free access not only to journal articles but also to other forms of scientific output, such as raw data sets that are usually not made available to the broader scientific community through traditional research articles. “Good science is reproducible. When primary data and documentation code are not available, as is often the case, the results of the paper are not reproducible, and cannot be confirmed,” said Brook.
“For science to function effectively, and for society to reap the full benefits, it is crucial that outputs from publicly funded research including data and underlying code are made openly available.” Brook also highlighted the major roadblocks to open science: “Two key obstacles at present are a lack of training for open science practices in undergraduate and postgraduate courses, and a lack of incentives for researchers to publish their data openly.”
But as data sharing and the need for open data policies become more important, are the traditional publishers of scientific literature willing to do their part? A recent initiative called Journal Research Data (JoRD) Policy Bank (http://jordproject.wordpress.com) has analyzed the policies of academic publishers to promote linkage between journal articles and underlying research data. The JoRD project was conducted at the Centre for Research Communications at Nottingham University, UK, in partnership with the Research Information Network (RIN). “The aim was to assess the value of an international service which would provide researchers, managers of research data and other stakeholders with an easy source of reference and enable understanding and compliance with these policies,” explained project officer Marianne Bamkin. “The study showed that a service that would summarise and collate journal data policies would be useful and used, and may encourage the sharing and open deposit of data.”

“At the time of the study (that ran from November 2012 through May 2013) we found out that around half of the top 100 and bottom 100 journals from the Thomson Reuters citation index (2011 edition) had no data policy. That means that authors received no guidance of where, when and what data to deposit, related to results stated in a published article,” summarized Bamkin, adding that only a small percentage of the surveyed journals make data sharing a requirement of publication (Fig 3). “We also noted that many researchers, some of them early in their careers, wanted to share data,” Bamkin explained. Yet, surveyed researchers were prevented from doing so for several reasons: not being aware that they could upload data to a repository; being worried about ethical considerations for research participants; being worried about established practices, such as the requirement for original data in doctoral research; and the fact that raw data in most cases would need a certain level of translation to make it understandable.

A few publishers have been paying attention to source data and open data, however. Both Nature Publishing Group and EMBO, in particular, have a strong history of promoting data standards and data release. [Ed: EMBO reports is published by EMBO Press] EMBO Press has its SourceData project, which aims to integrate biological data and structured metadata and make published data freely available to the scientific community in a searchable and re-usable form. “[EMBO Press] will also establish internal curation and semantic enrichment steps with data editors who are embedded in the production process,” commented Thomas Lemberger, Chief Editor of Molecular Systems Biology and responsible for the SourceData platform [3].
Nature Publishing Group is about to launch Scientific Data, a new, open-access publication for data sets. Key to the concept is a new type of content called Data Descriptor, a combination of traditional content and structured information curated in-house, to maximize reuse and enable searching, linking and data mining. “Data Descriptors include detailed descriptions of the methods used to collect the data and technical analyses supporting the quality of the measurements, but do not contain tests of new scientific hypotheses, extensive analyses aimed at providing new scientific insights, or descriptions of fundamentally new scientific methods” (www.nature.com/scientificdata/). Scientific Data will initially focus on data sets from the life science communities, but plans to expand to a wider range of experimental disciplines. In addition, it plans to provide incentives for scientists willing to openly share their experimental data. “There are real barriers to data sharing, one of the biggest being a lack of credit. Data Descriptors will be peer-reviewed, citable publications,” said Andrew Hufton, Scientific Data Managing Editor. “And we are developing a data citation system that will help integrate data into our publications, and help others track data reuse. We want to ensure that scientists who share are recognized and rewarded.”

Figure 3. Can journal data policies encourage the deposition and sharing of research data? Results from the Journal Research Data (JoRD) project (see main text for explanation). Left (“Number of journals with a policy”): out of a total of 371 journals surveyed, the percentage of the top 100 and bottom 100 journals from the Thomson Reuters Science Citation Index (2011 edition) that had a policy for data sharing (full survey data are available at http://jordproject.wordpress.com/). The legend categories are single policy, no policy, multiple policies and unknown; the chart segments are labelled 6%, 30%, 49% and 15%. Right (“Drivers for data sharing”): drivers for data sharing indicated by surveyed researchers, namely openness, data storage, accountability, increased access, societal norms, verifiability, ease of collaboration, increased research efficiency, replication, increased quality and promotion of knowledge (ideas). Credit: Marianne Bamkin, JoRD.
Some other benefits of data sharing and openness are not straightforward. One claim has been that papers with publicly available data sets attract a higher number of citations than similar studies without available data: the so-called citation benefit. However, accurately estimating this citation differential is difficult because many variables may influence citation rate. Refining previous work on this topic, Heather Piwowar, at the National Evolutionary Synthesis Center in Durham (NC, USA), and Todd Vision, University of North Carolina at Chapel Hill (NC, USA), performed a large multivariate analysis of the citation differential for studies in which gene expression microarray data was or was not made available in a public repository. They found that studies that made data available in a public repository received 9% more citations: a robust citation benefit, albeit a smaller one than previously reported [4].

“Our study found that research is actively reused if the authors post it openly online. [. . .] Citations from this reuse means that openly posting research data also helps the authors,” commented Piwowar. “Studies that reuse data begin appearing a few years after the data is posted online, then continue to be published for at least 5 years in growing numbers. [. . .] We found that authors who collect data have usually finished publishing their related papers within the first 2 years, so they don’t have to worry about competition from other authors because they’ve moved on to other projects by the time people start reusing their data.”

And yet, achieving full openness and transparency for the biomedical literature and experimental data is not a given and it largely hinges on whether and how scientists can benefit from being open. Most scientific social networks intend to increase the flow of research information by offering new reputation metrics that might be used by granting agencies and hiring committees. Whether this approach will be sufficient remains to be seen. Yet a changing social and political atmosphere that promotes and even requires openness for all publicly funded research may eventually drive the evolution of open science.

Conflict of interest
The author declares that he has no conflict of interest.
References
1. Bik HM, Goldstein MC (2013) An introduction to social media for scientists. PLoS Biol 11: e1001535
2. Wilcox C (2012) It’s time to e-volve: taking responsibility for science communication in a digital age. Biol Bull 222: 85–87
3. Lemberger T (2014) Tools of discovery. Mol Syst Biol 10: 715
4. Piwowar HA, Vision TJ (2013) Data reuse and the open data citation advantage. PeerJ 1: e175