D-Lib Magazine
January/February 2015

Science 2.0 Repositories: Time for a Change in Scholarly Communication
Massimiliano Assante, Leonardo Candela, Donatella Castelli, Paolo Manghi and Pasquale Pagano

Abstract

Information and communication technology (ICT) advances in research infrastructures are continuously changing the way research and scientific communication are performed. Scientists, funders, and organizations are moving the paradigm of "research publishing" well beyond traditional articles. The aim is to pursue a holistic approach where publishing includes any product (e.g. publications, datasets, experiments, software, web sites, blogs) resulting from a research activity and relevant to the interpretation, evaluation, and reuse of the activity or part of it. The implementation of this vision is today mainly inspired by the scientific communication workflows used for literature, which separate the "where" research is conducted from the "where" research is published and shared. In this paper we claim that this model does not fit the scientific communication practice envisaged in Science 2.0 settings. We present the idea of Science 2.0 Repositories (SciRepos), which meet publishing requirements arising in Science 2.0 by blurring the distinction between research life-cycle and research publishing. SciRepos interface with the ICT services of research infrastructures to intercept and publish research products while providing researchers with social networking tools for discovery, notification, sharing, discussion, and assessment of research products.

1 Introduction

In the last decade, information and communication technology (ICT) advances have deeply changed the way research is conducted within research infrastructures (RIs). A Research Infrastructure is intended as the compound of elements regarding the organization (roles, procedures, etc.), the structure (buildings, laboratories, etc.), and the technology (microscopes, telescopes, sensors, computers, Internet, applications, etc.) underpinning the implementation of scientific research.
Research is based on digital research products, such as datasets, software, and services, and generates further digital products. Along the same lines, scientific communication has changed in order to adapt its underlying business models and mission to these new scenarios. Indeed, the traditional paradigm of research publishing by articles cannot cope with the increasing demands for immediate access to and effective reuse of research results. Scientists, funders, and organizations are therefore pushing for innovative scientific communication workflows (deposition, quality assessment, and dissemination), adopting a holistic approach where publishing includes in principle any product (e.g. publications, datasets, experiments, software, web sites, blogs) that results from a research activity and is relevant to the interpretation, evaluation, and reuse of the activity or part of it. The implementation of this vision is today mainly inspired by the scientific communication workflows used for literature, which separate the place where research is conducted, i.e. RIs, from the place where research is published and shared. In particular, research products are published "elsewhere" and "on date", i.e. when the scientists believe the products obtained so far are mature enough. In our opinion, this model does not fit when other kinds of research products are involved, for which effective interpretation, evaluation, and reuse can be ensured only if publishing happens "within" the RIs and "during" the research activity.

In this paper we present the notion of Science 2.0 Repository (SciRepo). Living in synergy with RIs, SciRepos meet research publishing requirements arising in Science 2.0 settings by blurring the distinction between research life-cycle and research publishing.
In particular, by relying on social networking practices they provide researchers with collaboration-oriented facilities enabling seamless and complete access to any research product in the context leading to it. Finally, we present the idea of a SciRepos platform, a system facilitating the realization of SciRepos on top of existing RIs.

2 Research infrastructures and modern scientific communication workflows

Research Infrastructures are the setting that supports scientists in performing their research activities, which generally consist of running experiments that rely on existing research products (e.g. publications, datasets, software, manuals, services, processes, web sites, blogs) in order to yield new research products. In such scenarios, ICT services are becoming increasingly essential to perform research activities. They may range from simple computers and a connection to the Internet (e.g. web and email) to data centres offering computational resources (e.g. web servers) and services for data management (e.g. document stores, column stores) and processing (e.g. workflow management). ICT services are intended not only for supporting scientific investigation, but also for publishing and re-using the resulting research products.

Today's scientific communication workflows are based on the availability of Internet connections and devices, which make drafting, publishing, and accessing scientific publications in digital form the norm for the average scientist. Moreover, ICT services have been playing a central role in shaping modern forms of scientific communication, which today reach beyond publishing articles in digital format. For example, many RIs provide scientists with ICT tools for processing large quantities of data, and the community invests effort in collecting, curating, and creating research data. Such trends have stimulated funding agencies, organizations, and researchers to find ways to publish research data [8][10].
Evidence of this is provided by the diffusion of data repositories (e.g., GigaDB, Dryad, FigShare, Pangaea) and by the establishment of initiatives studying data citation formats and data citation indexing [13].

Recent investigations are reinforcing these new paradigms by studying the problem of publishing research experiments, intended as the methodological processes or ICT-based workflows necessary to achieve given scientific conclusions [14]. The objective is to offer researchers all the elements needed to repeat ("same experiment, same lab"), replicate ("same experiment, different lab"), reproduce ("same experiment, different configuration"), or reuse ("include part of the experiment into another experiment") [3][5].

Finally, ICT services offer scientists tools through which they can create and share alternative forms of research products, which are not generally regarded as valuable for publishing. Examples are software, web sites, blogs, notes, chat discussions, electronic notebooks, etc. Several studies on "altmetrics" are today being conducted to understand how to enable certified evaluation and citation methodologies for such products [16].

In summary, the advent of ICT facilities is today paving the way towards modern scientific communication workflows, where the act of "publishing" is given a newer, holistic interpretation. Researchers should be able to publish literature, datasets, experiments, and any other form of research outcome they perceive to be important for the interpretation and reuse of their scientific results. The benefits are clear:
3 Methodological barriers to modern scientific communication: research product de-contextualization

When referring to the action of "publishing", most people would refer to the scientific communication practices that are typical of research literature. These practices are (a) supported by policies and services of a "research marketplace", intended as the set of online services thanks to which publications can be shared (e.g. discovered, accessed, cited, referred, interlinked, tagged) by scientists, and (b) applied to selected research products while research activity is still ongoing, i.e. it is up to the scientists involved in a research activity to decide "what" is a candidate research product and "when" to publish it. Publishing consists of the following phases:
Publishing is usually conceived as the concluding step of the research activity lifecycle (cf. Figure 1). It comes conceptually after the research activity step, i.e. the phase leading to the production of research results, although this does not imply that this step is complete. It is expected that a new research activity lifecycle starts by using the results of previous lifecycles, as manifested in published products.