Publishing Scientific Articles in the Digital Era

In the digital era, in which over 4 billion people regularly access the internet, the conventional process of publishing scientific articles in academic journals following peer review is undergoing profound changes. Following physics and mathematics scholars, who started to publish their work on the freely accessible arXiv server in the early 1990s, researchers of all disciplines increasingly publish scientific articles in the form of freely accessible and fully citable preprints before or in parallel to conventional submission to academic journals for peer review. The full transition to open science, I argue in this study, requires expanding the education of students and young researchers to include scholarly communication in the digital era.


Introduction
In the digital era, in which by mid-2020 about 60 percent of the global population (close to 4.57 billion people) actively used the internet [1], publishing scientific articles in scientific journals to share and disseminate scientific advances is in principle no longer necessary. Following physics scholars, who have published their work online on the arXiv server since the early 1990s, scholars of all disciplines increasingly publish the outcomes of their work in the form of scientific papers called "preprints" on one of the numerous preprint servers offering free hosting of (and free access to) scientific articles, before or in parallel to conventional submission of the manuscript to academic journals for peer review.
Cutting time to publication, establishing priority, and eliminating subjective assessments of significance or scope, preprints allow scholars to publish the results of their work in a fully citable form immediately after its completion. Routinely used and cited by physicists, astronomers, computer scientists and mathematicians since the launch of the arXiv preprint server in 1991 [2], preprints are now commonly used also by biologists and life scientists [3]. Finally, their slow uptake by chemists, forecast in 2017 to inevitably accelerate [4], has lately recorded its first inflection point, with some 400 preprints published each month by ChemRxiv since April 2020 and 5,690 preprints published by August 8, 2020, three years after its launch in August 2017 [5].
Citation: Pagliaro M. (2020) Publishing Scientific Articles in the Digital Era. Open Science Journal 5 (3)
The transition to open science, however, requires expanding the education of students and young researchers to include scholarly communication in the digital era. Driven by the existing recruitment and career advancement system based on citations and journal impact factor [6], both tenure-seeking young scholars and tenured professors ("principal investigators" in the research jargon) are chiefly interested in getting citations to their research articles. Preprints (research articles permanently published online with a digital object identifier, DOI) are rapidly becoming highly cited scientific papers. For example, bioRxiv preprints are directly cited in journal articles, regardless of whether the preprint has been subsequently published in a peer-reviewed journal or not [7].
After showing how the advent of the internet, in an almost opposite fashion to what happened to newspaper and magazine publishing, has led to further flourishing of the $25 billion scholarly publishing industry, I show how the unexpected expansion of preprints to all scientific disciplines beyond physics, mathematics and computer science is actually reshaping scientific publishing. The process starts with understanding the system of scientific publishing.

The scientific publishing industry
Published since the late 1600s (the world's oldest scientific journal appeared in France on January 5, 1665 as a twelve-page pamphlet called the Journal des sçavans [8], followed by the Philosophical Transactions of the Royal Society of London, published without interruption since March 1665 [9]), today's scholarly journals are the products of a large industry, mostly based in western Europe and North America, comprised of for-profit and not-for-profit organizations.
Selected figures show the relevance of this industry, whose global annual turnover exceeds $25 billion ($25.2 billion in 2015) [10]. The annual revenues generated in 2017 from English-language scientific, technical and medical (STM) journal publishing alone were about $10 billion. Not-for-profit publishers such as scientific associations make profits like any for-profit organization; the only difference lies in the way these profits are used. Table 1 lists the main scientific publishers in 2019 in terms of number of journals published [11]. However, ranking publishers in terms of articles published shows that, with 106,152 peer-reviewed articles published in 2019 (a 64% increase compared to 2018), MDPI has become the 5th largest academic publisher [12]. In 2017, out of 33,100 academic journals, the number of peer-reviewed English-language journals was about 10,000 [11]. The latter journals published over 3 million articles. The annual growth rate in the number of articles published increased to 4% per year due to the rising number of publishing researchers [11], mostly originating from China and India.
In one of the first studies to include a discussion on the economics of scholarly publishing, Larivière and co-workers found in 2015 that the industry is actually an oligopoly in which the five major publishers in natural and medical sciences in 2013 accounted for 53% of all papers published [13]. In 1973 the share of the five major publishers was slightly more than 20%. Out of 110,000 people employed by the industry, about 40% are based in only three countries: Great Britain, Germany, and The Netherlands [11].
Nicely explaining the unique nature of the scientific publishing market, in which consumers (scholars) are isolated from the purchase (because purchase and use are not directly linked and thus price fluctuations do not influence demand), Larivière's study was published in a so-called "open access" (OA) journal owned by a not-for-profit STM publisher that charges authors a fee, the article processing charge (APC; USD 1,695 as of 2020 [14]), billed upon acceptance of the article.
The revenues of STM journal publishing, indeed, chiefly originate from subscriptions paid by universities and research institutions to access "pay-walled" research articles, as well as from industry, especially the pharmaceutical and chemical industries but also engineering, which usually pays much higher license fees than academia.
Developed since the mid-1990s, the main alternative economic model for scholarly publishing is based on OA journals in which authors either publish their articles for free in journals supported by external funders (74% of OA journals listed in the Directory of Open Access Journals charge no APC [15]), or are charged an APC. Depending on the journal, APCs can vary from $1,250 for publishing in ACS Omega [16] through $3,000 for publishing in PLOS Biology [14] to $5,000 in Advanced Science [17].
Publishing scientific papers on the Web eliminates the need for printing journals and disseminating them via postal mail to subscribers across the world. Indeed, many journals today no longer print scientific articles on paper but publish them only electronically on the Web. Articles are produced and published online in different formats including hypertext in HTML (hypertext markup language), PDF (portable document format) and ePub (open e-book standard format). Some journals exist both in print and in electronic form, with print copies produced according to the number of print subscriptions. Notwithstanding the above, most journals licensed by a library are licensed as electronic copies. Customers and authors wishing to receive a printed copy of a journal issue are billed for its cost, and the selected issue is printed "on demand" and sent via postal mail. Finally, certain publishers earn extra revenues by selling the journal covers (front and back covers) to the authors of selected articles willing to pay the fee requested by the publisher.
"Everything points to the fact", Bartling and Friesike wrote in 2014 introducing one of the most complete (open access) books on open science, "that we are on the brink of a new scientific revolution" [18]. In 2007, a group of major science journal publishers had hired a "public relations" (PR) agent "to combat the open access movement" [19] that aims to make scientific articles freely accessible on the internet. "We're like any firm under siege", commented the manager of a publishers' association. "It's common to hire a PR firm when you're under siege" [19]. Said "siege" by the open science movement apparently had little effect on publishers if, in late 2019, a scholar based in Canada, commenting on a social network on his refusal to write yet another book chapter for free, emphasized how: «The science publishing industry which charges us to publish papers, and then charges us again for content access (through our universities or personal licenses), and not compensating us as associate editors, or being a reviewer, or to write book chapters as an expert, has to change [20]»
In an almost opposite fashion to what happened to general interest magazines or to newspapers, the scientific publishing industry not only was not financially hurt by the digitization process followed by the widespread adoption of the internet, but actually greatly benefited from it. Both production and distribution costs, indeed, dramatically decreased.
Publishers offer numerous services relevant for authors, like semantic annotation and search engine optimization for better retrieval of articles, submitting metadata to many database providers for indexing, and ensuring outreach through many technical systems. Similarly, many of these services are offered by the owners of preprint platforms.
The main service offered by scientific journals to authors submitting their work is peer review following the editorial office decision whether to send the manuscript out for review or not. Manuscript editing and formatting services offered by publishers today are far less important. Accustomed to digital technology, today's scholars are able to produce well formatted manuscripts of their studies using article templates and word processing software programs (word processors). Peer review is generally provided for free by scholars on a voluntary basis following a request from a journal's editor to review a manuscript. The aim is to provide authors with a critical and constructive review helping them in improving their work prior to publication. The process starts with the assessment of the editorial office whether or not to send the manuscript out to reviewers for peer review (Figure 1). Very often, excellent manuscripts are refused at this stage.
The process further lacks transparency because, for manuscripts sent for review, the reviews are usually not published and reviewers remain anonymous. Besides double-blind peer-review in which the reviewers of the paper do not know the identity of the authors, and the authors do not know the identity of the reviewers, this lack of transparency of the conventional peer review process has led to the introduction at selected publishers of open peer review in which reviews are published online next to the scientific article accepted for publication, and even rated by peers [21].

Open access journals and preprints
Noting how progress to open access recently stalled, with only 20% of new papers being published as OA articles, Green has lately called for a digital transformation of scientific publishing going beyond simple digitization of the scientific publication process "using internet-era principles to deliver value" [22]. The main reason why OA journals did not replace pay-walled journals is likely that only a few of them have achieved high impact factor (IF) values. Some actually reached a high IF but, almost invariably, they are those with the highest article publication charges (one exception being Chemical Science, a journal "free to read and free to publish with no APCs" with an IF of 9.346 in 2019 [23]). The IF, however, is a worthless criterion for forecasting the impact of a specific article. For example, up to 75% of the articles in any given journal have lower citation counts than the journal's IF [24]. Still, a recent study on the use of the journal IF in academic review, promotion, and tenure evaluations at North American universities found that 40% of universities explicitly mention the impact factor in their review, promotion, and tenure documents [25].
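The gap between a journal-level average and the citation count of a typical article can be illustrated with a minimal sketch. The citation counts below are hypothetical numbers chosen for illustration only (they are not data from any real journal), and the plain mean is used here as a simplified stand-in for the IF, which is itself an average of citations per citable item.

```python
from statistics import mean, median

# Hypothetical citation counts for eight articles in one journal
# (illustrative numbers only, not data from any real journal).
citations = [0, 1, 1, 2, 2, 3, 7, 40]

# Simplified stand-in for an impact-factor-like average.
journal_mean = mean(citations)
# Citation count of the "typical" (middle) article.
journal_median = median(citations)

# Count the articles whose citations fall below the journal average.
below_mean = sum(1 for c in citations if c < journal_mean)
share_below = below_mean / len(citations)

print(f"mean citations per article: {journal_mean}")
print(f"median citations per article: {journal_median}")
print(f"share of articles below the mean: {share_below:.0%}")
```

In this assumed sample a single highly cited article lifts the average well above the median, so most articles sit below the journal-level figure.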
The same misuse of the impact factor to evaluate researchers was reported in 2019 by the editor of a reputable chemistry journal published in Europe in a plea to authors calling for a better and fairer use of citations: «Since the Impact Factor, and similar citation-based metrics, are likely to prevail and will continue to be used for evaluating not only journals but also researchers who publish in them, could you as an author please pay more attention to the citations, bearing in mind their implications. [26]»
As long as the academic reward system across the world continues to evaluate scholars based on the IF of the journals in which they publish, professors and young researchers will continue to opt for publishing in high IF journals, even though open access journals have significantly more citations than non-OA journals, and even though the payment of high APCs for publishing in high IF OA journals is not justified, since the gain in terms of increased number of citations will be minimal [27].
In brief, professors and principal investigators guiding the work of undergraduate and doctoral students continue to prefer to submit their team's manuscripts to high IF journals because both their own promotion and that of their tenure-seeking post-docs are driven by journal impact factor. This academic closed-loop system explains why most high IF scientific journals and their publishers not only were not impacted by the advent of the internet, but continue to launch new journals at a fast rate. All would remain unchanged in scholarly communication if it were not for the unexpected recent uptake of preprints by scholars beyond physicists, mathematicians and computer scientists, most notably by life scientists.
In principle, it is already possible for a tenure-seeking candidate in the life or medical sciences to present a list of scientific articles and their citations including only preprints posted on one of the 44 preprint servers found by scholars in a systematic analysis [28]. If scientific quality is associated with citations from peers, why should a selection committee or a funding agency make a difference between a citation to a preprint and another to a publication in a peer-reviewed journal? Showing full awareness of this possibility, which would translate into a loss of citations from journals to preprint servers, the authors of the 2018 report of a leading scientific publishers association emphasized in the conclusions how: «There is some concern that preprints (which can be brought up to date) may become a go-to place for the version of record, undermining publisher business models. Concerns have also been raised over the loss of citations from journals to preprints servers, with well over 8,000 citations to bioRxiv reported on Web of Science. [29]»
By publishing at no cost the outcomes of their work in the form of scientific articles (including color pictures and videos) first as fully citable preprints and then in free or low-cost OA journals, researchers can get all the visibility (citations) needed for their career and funding. This shift, inter alia, will help to progressively free the large amount of money (most of the $25 billion yearly revenues of the scientific publishing industry) currently spent by government-funded universities and research centres for accessing scientific articles reporting the outcomes of work mostly financed by the very same governments [30].

Educational guidelines
I suggest five main guidelines to shape a course offering new education on scientific publishing in the digital era (Table 2).

Table 2. Guidelines for a course on scientific publishing in the digital era
1. Teach students the relevance of preprints and how to use prepublication
2. Teach students how to copy-edit their work
3. Teach students key aspects of bibliometrics and impact-based metrics
4. Teach students how to effectively use social media to engage with the public
5. Teach students to keep key rights to their research

First, PhD students and young researchers need to understand the central relevance of preprints for their own career and for society, as preprints dramatically shorten time to publication, thereby accelerating innovation, while allowing scholars to regain control of their own research work [2,3,4,7]. Studies first published as preprints are received and evaluated by the broad scientific community based on their own intrinsic quality, independent of the hosting publication platform (i.e., scientific journal). Numerous university hiring and promotion committees, and the most important national and international funding agencies, now regularly require candidates and applicants to include preprints [31] in their applications and proposals. It is enough to insert the preprint publication with its DOI in a separate list from peer-reviewed articles.
Second, students and junior researchers need to learn how to copy-edit their own work, starting from answering the main question suggested by the editor of another reputable chemistry journal: "is this research meaningful?" [32]. Training PhD students how to write a paper, in other words, should be an important task of every PhD student supervisor [33].
Third, young scholars need a closer knowledge of science evaluation practices and citation-based metrics such as the h-index [34]. By learning how to effectively use statistical data concerning one's own research, a young researcher will better understand the impact of her/his research, and how it is used by peers. Ultimately, rather than getting rid of bibliometric indicators, she/he will learn how to expand and improve their use in a useful and critical fashion [35].
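As a teaching aid for this third guideline, the h-index can be computed in a few lines. The sketch below uses the standard definition (the largest h such that h papers have at least h citations each); the function name and the example citation counts are hypothetical and serve only as an illustration.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the paper at this rank still has enough citations
        else:
            break     # every later paper has fewer citations than its rank
    return h


# Hypothetical citation counts for one researcher's seven papers.
example_citations = [25, 8, 5, 3, 3, 1, 0]
print(h_index(example_citations))  # prints 3
```

Three papers here have at least three citations each, but there are not four papers with at least four, so the h-index is 3 no matter how highly cited the top paper is.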
Fourth, undergraduate and doctoral students need to be trained on how to effectively use social media (for example Twitter, ResearchGate, LinkedIn, VK, Instagram, etc.) to share their research with the public and with researchers within and outside their research field [36]. The aim is not "to engage with the social media" but rather to use social media to engage with the public "in a world that increasingly values public engagement and impact" [37]. By doing so, young researchers will also expand their own professional network. Using a news/social medium like Twitter professionally, for example, scholars in medicine, chemistry, physics, life and earth sciences will be surprised by the number of peers actively using a Twitter account.
Fifth, scholars, usually unfamiliar with the legal aspects of copyright, need to regain control of their work using tools such as the Scholarly Publishing and Academic Rights Coalition Author Addendum [38]. The latter is a legal instrument that modifies the publisher's agreement and allows authors to keep key rights to their research, "instead of blindly giving it away to publishers" [38].

Outlook and perspective
Trying to answer the question of why the Web, created by Berners-Lee in 1991 "to disrupt scientific publishing" [39], actually did not radically change it in the course of the subsequent two decades, Clarke in late 2016 concluded that this was mostly due to the journal's role of "designation", namely a cultural function ("the hardest to replicate through other means") whereby academic institutions and funding agencies base career advancement and award decisions on a scientist's publication record in existing scientific journals of high reputation (i.e., of high impact factor).
Thanks to the advent of preprints in all main scientific fields, the number of citations and thus the impact of a scientist's work is now independent of the publishing platform. For the very first time, a scientist can in principle become a highly cited scholar without having her/his work published in conventional scientific journals following submission and peer review. Having received no formal training on scientific publication, however, most of today's scholars are unaware that "the practice of sending manuscripts to experts outside of the journal's editorial offices for review was not routine until the last half of the 20th century" [39].
Publishers often invoke their key "quality control" role, objecting that preprints would allow everyone to upload any contents. Yet, the seminal article in which Mullis published the discovery of the polymerase chain reaction (PCR) was rejected by Science [40], a journal with a high impact factor, and eventually was published in Methods in Enzymology [41]. Three years later, emphasized Mullis in his Nobel Lecture [40], Science proclaimed PCR "Molecule of the Year".
As to the value of the impact factor, it is instructive to learn that the IF of Methods in Enzymology in 2018 was 1.984, while Mullis' article reporting the PCR discovery up to September 2019 had been cited 7,876 times (Google Scholar). By the same token, the four revolutionary scientific papers on the theory of the photoelectric effect, Brownian motion, special relativity, and mass-energy equivalence published by Einstein in 1905 (the so-called annus mirabilis [42]) appeared in Annalen der Physik, namely a journal whose impact factor in 2018 was slightly more than 3 (3.276), and in 1905 was probably not higher than 1.
«We now receive many more interesting papers than we can publish…» typically reads the e-mail (a letter when Mullis in 1986-1987 repeatedly submitted his revolutionary molecular biology work describing the PCR) with which an editor communicates her/his decision to not send the submitted manuscript to reviewers. «Hence, we regret that we are unable to process your manuscript further and suggest that you consider submission to a more specialized journal».
The time-wasting cycle of conventional scientific publication then starts, with the manuscript going from one journal to another, and peer review starting afresh each time. Eventually, after months or even more than a year, the manuscript is accepted for publication. Whether or not manuscripts sent out for peer review are "selected on the basis of methodological rigor, novelty, quality of writing, and general significance", until the advent of preprints there was little or nothing scholars could do to object to the subjective opinion of journal editors and reviewers. Now, by posting her/his preprint online, an author or a team of authors is granted priority on the discoveries and the novel ideas reported therein. Novelty is established, and any media embargo until the date of online publication, under which there can be no public mention of the upcoming paper before the embargo is lifted, becomes unnecessary.
The relatively modest costs of the infrastructure for preprints will be borne, as happens today, by public and private universities, foundations, scholarly societies, and even by publishers of OA journals requiring an APC. The global scientific output is expected to double approximately every nine years [43]. This huge scientific output, most of which will shortly originate from China and India due to their huge population (amounting to 37% of the world's current population) and excellent scientific schools, will appear first and foremost as preprints. Today's academics, however, "work to the system in which they find themselves" [44]. Hence, further argued Laurillard, calling for a better academic system facilitating and rewarding excellence in teaching and not only in research, promotion of excellence in teaching requires changing the reward system [44]. Said reformed academic system will comprise the ability to evaluate scholars for hiring and career advancement based on true scientific merit, and thus also on citation-based metrics, independent of the publishing platform in which the candidate's papers have been published.
I agree with Clarke [39]: "designation" of a scientist by scientific journals is a cultural trait. The etymology of a word reveals its origin and meaning. The word "submission" originates from the Latin verb submittere and was first recorded in English around the mid-15th century with a clear meaning: "humble obedience" [45]. As we enter the third decade of the 21st century, the time has come for the world's scholars to replace journal article "submissions" with a free and open scholarly publishing system widely based on freely accessible and freely reproducible preprints. In this new system the value (i.e., quality) of scientific articles no longer needs peer review but is open to the evaluation and use of the whole scientific community, which has all the tools and an obvious incentive to separate "wheat from chaff", identifying quality work for further valorization via subsequent utilization (and citation). The time foreseen by Évariste Galois in 1831 during which «on s'associera pour étudier, au lieu d'envoyer aux académies des plis cachetés, on s'empressera de publier ses moindres observations pour peu qu'elles soient nouvelles, et on ajoutera: 'Je ne sais pas le reste'» [46] («scientists will team up to study, instead of sending sealed envelopes to the academies, hastening to publish their slightest observations as long as they are new, adding: 'I do not know the rest'») has finally come.