Separation of feedback, publishing and assessment of scientific studies

I once asked a friend and colleague about a wrong sentence in one of his scientific articles. He is a smart cookie and should have known better than that. His answer was that he knew it was wrong, but the peer reviewer requested that claim. The error was small and completely inconsequential for the results; no real harm was done. I wondered what I would have done.

Peer review has two roles: it provides detailed feedback on your work and it advises the editor on whether the article is good enough for the journal. This feedback normally makes the article better, but it is somewhat uncomfortable to discuss with reviewers who have a lot of power because of their second role.

My experience is that normally you can argue your case with a reviewer. Still, reaching a common understanding can take an additional round of review, which means that the paper is published a few months later. In the worst case, not agreeing with a reviewer can mean that the paper is rejected and you have to submit it to another journal.

It is quite common for reviewers to abuse their power by requesting that their work be cited (more). Mostly this is somewhat subtle and the citation more or less relevant. However, an anonymous reviewer once requested that I cite four articles by one author, only one of which was somewhat relevant. That does not hurt the article, but it is disgusting power abuse and rewards bad behavior. My impression is that these are not all head fakes; when I write a critical review I make sure not to ask for citations to my own work, but recommend some articles of colleagues instead. Multiple colleagues, so as not to get any of them into trouble.

Grassroots journals

I have started a grassroots journal on the homogenization of climate data and only recently started to realize that it also produces a valuable separation of feedback, publishing and assessment of scientific studies. That by itself can lead to a much healthier and more productive quality control system.

A grassroots journal assesses published articles and manuscripts in a field of study. One could also see it as a continually up-to-date review article. At least two reviewers write a review of the strengths and weaknesses of an article, everyone can comment on parts of the article and the editors write a synthesis of the reviews. A grassroots journal does not publish the articles themselves, but collects articles published anywhere.

Every article also gets a quantitative assessment. This is similar to the current estimate of how important an article is from the journal it was able to get into. However, it does not reward people who submit their articles to too prestigious a journal, hoping to get lucky and creating unnecessary work through duplicate reviews. For example, the publisher Frontiers reviews 2.4 million manuscripts and has to bounce about 1 million valid papers.

With traditional journals your manuscript only has to pass the threshold at the time of publication. With the up-to-date rolling review of grassroots journals, articles of lasting value are rewarded.

I would not have minded making a system without a quantitative assessment, but there are real differences between articles, readers need to prioritize their reading, and funding agencies would likely not accept grassroots journals as a replacement for the current system without it.

That is the final aim: getting rid of the current publishing system that holds science back. That grassroots journals immediately provide value is hopefully what makes the transition easier.

The more the assessments made by grassroots journals are accepted, the less it matters where you publish. Currently there is typically one journal, sometimes two, with the right topic and prestige to publish in. The situation for the reader is even more terrible: you often need one specific paper and not just some paper on the topic, and for that specific paper there is only one (legal) supplier. This near-monopolistic market leads to Elsevier making profits of 30 to 50% and it suppresses innovation.

Another symptom of the monopolistic market are the manuscript submission systems, which combine the worst of pre-internet paper submission (every figure a separate file, captions in a separate file) with the internet-age adage “save labor costs by letting your customers do the work” (adding the captions a second time when uploading a figure, with a neat pop-up for special characters).

Separation of powers

Publishing is easy nowadays. ArXiv does it for about one dollar per manuscript. Once scientists can freely choose where to publish, the publishers will have to provide good services at reasonable cost. The most important service would be to provide a broad readership by publishing Open Access.

Maybe it will even go one step further and scientists will simply publish their manuscript on a pre-print server and tell the relevant grassroots journals where to find it. Such scientists would likely still want some feedback from their colleagues on the manuscript. Several initiatives are currently springing up to review manuscripts before they are submitted to journals, for example Peer Community In (PCI). Currently PCI runs several rounds until the reviewers “endorse” a manuscript, so that in principle a journal could publish such a manuscript without further peer review.

With a separate, independent assessment of the published article there would no longer be any need for the “feedback peer reviewers” to give their endorsement. (It doesn’t hurt.) The authors would have much more freedom to decide whether the changes peer reviewers suggest are actually improvements. The authors, and not the reviewers, would decide when the manuscript is finished and can be published. If they make the wrong decisions, that would naturally be reflected in the assessment. If they do not add four citations for a peer reviewer, that would not be a problem.

There is a similar initiative in the life sciences called APPRAISE, but it will only review manuscripts published on pre-print servers. Once the journals are gone, this will amount to the same thing, but I feel that grassroots journals add more immediate value by reviewing all articles on one topic, just like a review article should review the entire literature and not a random part of it.

A vigorously debated topic is whether peer reviews should be open or closed. Recently ASAPbio had this discussion and comprehensively summarized the advantages and disadvantages (well worth reading). Both systems have their strengths and I do not see one of them winning.

This discussion may change when we separate feedback and assessment. Giving feedback is mostly doing the authors a favor and could more easily be done in the open. Rather than cumbersome month-long rounds of review, it would be possible to simply write an email or pick up the phone and clarify contentious points. On the other hand, anonymity makes it easier to give an honest assessment and I expect this part to be mostly performed anonymously. The editors of a grassroots journal determine what is published and can thus ensure that no one abuses their anonymity.

The future

Concluding: in a decade, a researcher writes an article and asks their colleagues for feedback. Once the manuscript no longer changes much, it is sent to an independent proofreading service. Another firm or person takes care of the layout and ensures that the article can still be read in a century by making versions using open standards.

The authors decide when their manuscript is ready to be published and upload it to the article repository. They send a notice to the journals that cover the topic. Journal A makes an assessment. Journals B and C copy this assessment, while journal D also uses it, but requests an additional review for a part that is important to them and writes another synthesis.

Readers add comments to the article using web annotations and the authors reply to them with clarifications. Authors can also add comments to share new insights on what was good and bad about the article.

Two years later a new study shows that one of the choices in the article was not optimal. This part was important for journals C and D and they update their assessment. The authors decide that it is relatively easy to redo their article with a better choice and that the article is sufficiently important to put in some work; they upload the updated study to the repository and the journals update their assessments.


(Cross posted with my general blog: Variable Variability.)

Related reading

APPRAISE (A Post-Publication Review and Assessment In Science Experiment). A similar idea to grassroots journals, but they only want to review pre-prints and will thus only review part of the literature. See also NPR on this initiative.

A related proposal by Gavin Schmidt: Someone C.A.R.E.S. Commentary And Replication in Earth Science (C.A.R.E.S.). Do we need a new venue for post-publication comments and replications?

* Photo of scientific journals by Tobias von der Haar, used under an Attribution 2.0 Generic (https://creativecommons.org/licenses/by/2.0/) license.
* Graph of publishing costs by Dave Gray, used under an Attribution-NonCommercial-NoDerivs 2.0 Generic (CC BY-NC-ND 2.0) license.

A community of community journals

Previous posts were about what one grassroots scientific journal could look like, but if it all works out we will have thousands of them. They can strengthen each other: sharing reviews reduces the workload and also creates a network of trust that shows which journals are reliable.

Peer review is a relatively new invention. It is often seen as gate keeping, but it actually helps new and fringe authors to gain some initial credibility that makes others willing to invest time in understanding their work. This has become more important with the increasing scale and internationalization of science, which make it harder to know who is doing good work. People working in related fields will find it harder to assess the quality of studies and benefit from peer review as readers. With science becoming more interdisciplinary this role has become more important as well.

Credibility of journals

Just like articles, scientific journals also need credibility. Again this is important for authors and readers to find worthwhile studies, and again it is most important for new journals run by ordinary scientists. In the beginning that would be the typical situation for grassroots journals.

In today’s age of science micromanagement the credibility of journals is also important to determine the output of researchers and which articles are listed in the big databases such as the Web of Science, Scopus or Google Scholar.

Journals sharing reviews shows that these scientists trust each other and weaves a network of reliable science. Showing how well the assessments of the general importance of papers correlate with future citations can also demonstrate that editors are doing a good job. We could also ask the authors of the reviewed articles, who are the relevant community, how they assess the quality of the journal.
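As a rough illustration (my own sketch, not part of any existing grassroots journal software), such a check could be as simple as a rank correlation between the editors' importance scores and later citation counts. The numbers below are made up.

```python
from scipy.stats import spearmanr

# Hypothetical data: importance percentiles assigned by the editors at
# review time, and citation counts for the same articles five years later.
importance = [92, 75, 60, 88, 40, 55, 97, 30]
citations = [41, 18, 12, 35, 4, 9, 52, 3]

# Spearman's rho compares rankings, so it does not assume the relation
# between assessed importance and later citations is linear.
rho, p_value = spearmanr(importance, citations)
print(f"Rank correlation between assessment and citations: {rho:.2f} (p = {p_value:.3f})")
```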

Caveats of credibility metrics

People could build up a credible journal in their field and occasionally publish ideologically motivated bunk science. That could hurt the trust of colleagues, but the penalty would be limited if these articles were in another field, and a few bad-quality reviews would be hard to see in future citations. People have built up large networks of fake websites linking to each other to get better rankings in Google. That could theoretically also happen here, but it is something we could detect in the citation scores of the journals.

Some journals may legitimately not have any (or many) ties to other journals. Maybe the editors are new or young, maybe there are no other journals (yet) in related fields, maybe there are conflicts even though both groups are scientifically credible. Such cases would make it hard to assess the credibility of the journal. The only information would be metrics based on future citations to their articles, which take years to become informative.

Human judgement

Google likes to make everything automatic. (And makes it impossible to reach a human when things go wrong.) Automation is good for reducing the workload, but I feel human judgement is key. That is also why I propose grassroots journals and not just one big database where every scientist can vote articles up and down. It is important that editors assess the expertise of the reviewers and that editors make sure every manuscript is reviewed.

If a traditional journal stops functioning, the publisher would jump in to save its reputation. Maybe the editors of several journals could team up and, together with science societies in their field, build some sort of accreditation organisation to replace that role of the publishers. Or editors could elect the members of such a group. A traditional journal has only one publisher, but a grassroots journal could be a member of several such accreditation groups, reducing the danger of power abuse by the “publisher”. Such a group could also be helpful for mediation in case of conflicts among the editors of a journal.

How to create such a network of trust among the journals may be the most difficult part of grassroots publishing. Thus especially here I welcome feedback in the comments below and fresh ideas on how to make this work.

Pros and cons of editors that are part of their scientific community

Scientific publishing is becoming an industry. Last year there was a conference on “peer review in 2030” in London and their first suggestion was to use Artificial Intelligence (AI) to select and verify the reviewers.

Find and invent new ways of identifying, verifying and inviting peer reviewers, focusing on closely matching expertise with the research being reviewed to increase uptake. Artificial intelligence could be a valuable tool in this.

I would argue we should go the other way: go to smaller scales and involve the scientific communities the journals should be serving. Editors are supposed to know their community and know whom to ask. The suggestion to use AI, and the fact that an increasing number of journals ask authors to propose potential reviewers, shows that the scale of the industry has become so large that this is no longer the case. That makes it possible for authors to cheat the system and suggest reviewers with fake email addresses that need to be “verified”.

The large scales also make it harder for the editor to assess the reviews. Especially in American journals reviews are often very poorly done, in my experience, and one regularly gets the impression that a reviewer read only a few paragraphs. Reviews also often contradict each other, apparently without the editor noticing, who just passes them on to the authors.

In a grassroots scientific journal the editor would write a synthesis of the reviews. That is something that many current editors would not be able to do because they are too far away from the topic, running a journal with a wide range of topics.

There is currently a trend towards macro scientific journals that publish anything that is technically okay, covering all scientific fields and not judging importance. A grassroots scientific journal could also be called a “micro journal”, although I do not know whether there really is such a contrast. Macro journals also have many editors. Micro journals too would or could publish everything that is technically okay, independent of whether it is seen as important. They would be focussed on one topic, but all micro journals combined could again be seen as a macro journal. The main difference is that a macro journal is top down and grassroots journals are bottom up.

As an aside, I am still looking for a good name. “Grassroots journal” is nice because it emphasises that it is a bottom up initiative from the scientific community. But I think in English it also has some connotations of political activism, which I hope does not turn people off. “Micro journal” could be an alternative, but if a large editorial team comes together also micro journals could have a quite broad range of topics.

A related concept is the “overlay journal”. These are journals that review manuscripts in repositories, typically ArXiv. The journal Discrete Analysis started by Timothy Gowers is a normal journal in most respects, except that it is free and uses ArXiv to host the articles. A French group has set up Episciences, which provides overlay journal support and facilitates the publication and peer review of informatics and applied mathematics manuscripts hosted by ArXiv, making it easy to set up a journal; they also welcome existing journals moving to their platform. You have to apply to be accepted. The Lund Medical Faculty Monthly highlights an article written by the group every month and writes a small summary.

The term “journal” smells like old paper. But in this case journals could share reviews, merge, split up, and so on, which is not possible with copyrighted paper journals. “Collection” could be an alternative term, but sounds a bit passive. Suggestions for a good term are welcome in the comments below.

Journals sharing reviews builds a network of trust, which will be the topic of my next post. The ability to use (all) reviews of an existing journal or existing journals also makes it easy to start a journal by reducing the barrier to entry.

A low barrier to entry and the openness of the review process help reduce abuses of power. A micro journal being close to its community also means that conflicts of interest are more likely. Thus being able to start an alternative journal should be easy.

A new journal should thus be able to copy the content of an existing journal and then edit it and add to it. It would also be good to have multiple domain names, so that multiple journals on the same topic can exist (using variations on the full name of the journal: journal of statistical homogenisation, journal of homogenisation, homogenisation journal, international journal of homogenisation, homogenisation, homogenisation science, statistics and homogenisation, …).

Related reading

SpotOn report: What might peer review look like in 2030? A report from BioMed Central and Digital Science.

Josh Brown: An Introduction to Overlay Journals.

Participating in grassroots reviewing

There are three ways to participate in grassroots reviewing.

  • Make a comment on (a detail of) the article.
  • Write a review.
  • Write a synthesis assessment.

Comments

Everyone can write comments on the article, the reviews or the synthesis. These can be comments just like ones below a blog post, but with the software of Hypothesis it is very easy to select a part of the manuscript (PDF file or web page) and write a comment about it.
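As an aside, Hypothesis also offers a public web API, so a grassroots journal could in principle collect the annotations made on an article automatically. A minimal sketch of what that could look like; the article URL is hypothetical, and the field names should be checked against the current Hypothesis API documentation:

```python
import requests

ARTICLE_URL = "https://doi.org/10.5194/example-2018-1"  # hypothetical article URL

# Ask the public Hypothesis search endpoint for annotations on this URL.
response = requests.get(
    "https://api.hypothes.is/api/search",
    params={"uri": ARTICLE_URL, "limit": 50},
    timeout=10,
)
response.raise_for_status()

# Each row is one annotation; print who wrote it and the start of its text.
for row in response.json().get("rows", []):
    print(row.get("user"), "-", (row.get("text") or "")[:80])
```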

A comment can be short and only make a remark about one sentence, or as long as a typical review and discuss the full manuscript. I would suggest that anonymous comments should be possible and that all comments be pre-moderated. But those should be settings, so that different communities can handle this differently.

I just read an open review manuscript of an EGU journal and thought something was wrong. There the reviewers are anonymous, but anyone else making comments needs to do so under their own name. For a named comment I would have had to invest more time to be sure my comment made sense. Had an anonymous comment been possible, I would probably have asked a question. The downside is that anonymous comments will likely be of lower quality and increase the number of comments and thus the work for the editors and authors.

Reviews

Reviews would look similar to those written for traditional journals, but they would be published. Furthermore, the reviewer would be asked to assess the importance of the paper, state whether it should be published and classify it.

The classification would mostly be journal dependent. The main categories could be reviews, methodological studies, replications, and experimental and observational work. In statistical homogenisation I would suggest the categories: homogenisation methods, validation and error quantification, properties of inhomogeneities, and applications. These can be subdivided further.

Reviews can be published anonymously, but the editor of the article would need to know who the reviewer is to assess their expertise. (The other editors of the journal should not know this, because they could be authors or have other conflicts of interest.) With named reviews the reviewer can take credit for their work, but my impression is that most prefer to be anonymous; that makes it easier to critique colleagues with whom one still needs to collaborate in the future. In addition, as with comments, writing a named review takes more time. Periodically, for example once a year, a tally can be made of how many articles someone published and reviewed to create some social reward for reviewing.

For specific comments, reviewers can also use Hypothesis to make this easier. Reviews can be updated and can be submitted at any time (post-publication review).

Partial reviews could also be worthwhile: reviews that only assess part of a study, but do not make an overall assessment. For example, for a paper that homogenizes a climate dataset and goes on to analyse climatic changes, the reviewers of the journal on statistical homogenization could make an assessment of the quality of the homogenization, which can inform the editors of a more general climate journal in making their assessment of the complete study.

Synthesis

The synthesis would be written by one of the editors based on the reviews and their own expertise. The synthesis would look like the reviews: a general assessment, a classification and an assessment of the importance of the study.

In case an article is not accepted, I would suggest keeping the page up and listing it in an index of rejected papers; rejections and their justifications are useful information.

The synthesis does not have to be an average of the reviews; the editor can assess the strength of the reviews, and reviewers often only have expertise on part of the paper.

The synthesis could be named or anonymous. I presume different communities have different preferences.

A system for the assessment of the importance of scientific papers

The final assessment in a traditional journal is just a yes or a no. The journal an article is published in gives some information on its importance and helps somewhat in finding relevant articles. There is no post-publication review beyond rare retractions of papers, traditionally in cases of fraud, nowadays increasingly also in cases of big mistakes.

A grassroots journal would be able to provide more information on the value of an article by publishing the reviews of the experts in the field. A grassroots assessment would include

  • A written review that discusses the article’s place in the literature, its strengths and weaknesses and detailed comments.
  • A categorisation of the article to make it easier to find relevant articles.
  • An assessment of the importance of the article.

This blog post is about the assessment of the importance of an article; the rest will follow later. The importance of an article has at least five aspects.

  1. Contribution to the scientific field of the journal.
  2. Impact on the larger scientific community.
  3. The technical quality of the paper.
  4. Importance at the time of publishing.
  5. Importance of the research program.

Let me try to explain these five aspects below. The example I have in the back of my head is my own field, statistical homogenisation of climate station data as part of the large climate change research community. This is a field that works largely on methodological problems, which may have made my proposal less suitable for other fields.

It would be good if this system fitted all fields, or at least many. It should also be as simple as possible; can we merge or skip aspects? Are there aspects missing (that are important for other fields)?

Importance for the field

This metric assesses how large the contribution of the paper to the scientific field is. A paper can be important for the homogenisation community when it helps us understand the algorithms we use or the problems we have in the data, or when it proposes a better homogenisation method. Scientists need to know which papers in the field are the important ones to prioritise their reading.

Impact

The traditional peer review mainly measures the expected impact of an article via the Impact Factor of the journal: how often an article in that journal is cited in the first two or five years after publication. Ideally journals would make a more complete assessment of the importance of submitted manuscripts, with the impact just an emergent property, but now that the Impact Factor is published and taken into account in the micromanagement of science, a high Impact Factor has become a goal in itself.

I would love to get rid of this system, but an assessment of the impact cannot be avoided as long as publish-or-perish micro-management systems are prescribed by politicians. Without something like an impact factor bureaucrats will not be satisfied with a publication in a grassroots journal and scientists would thus not submit their work to such journals.

It would also have some role within science. It would inform scientists in the larger community which papers are important to read. In the case of homogenisation this would be papers that change our assessment of the size of climatic changes or papers on how much-used datasets have been homogenised. These do not necessarily have to be the papers that bring the field itself forward most.

This metric could also include the importance of a paper for the public, or maybe that could be a metric by itself: media attention often leads to more citations, but a separate metric would also stimulate this kind of research. In the public climate “debate” the upper-air warming estimated by satellites plays a large role. These estimates are scientifically not that important because the time series is short and the data are expected to be unreliable, but studies improving this dataset are important for the public debate. Studies on the relationship between vaccines and autism are also scientifically no longer needed, but could still help inform the public and increase vaccination rates. This kind of research is important and we tend to do too little of it, focussing on scientific importance.

Technical quality

This would assess the technical quality of the work. A paper may not give many new insights, and thus score low on the previous metrics, but may dot all the i’s and cross all the t’s very carefully to increase our confidence in our assessment. Papers of high technical quality are good to cite. An extensive, balanced review article would also score high on this aspect.

Important when published

Hopefully grassroots journals would not only review current articles, but also important classical ones. Some of these may not be strictly important any more, for example papers that introduce methods that have now been superseded by better ones, but were important innovations at the time. It would feel bad to give them a low assessment on the previous metrics without at least acknowledging their contribution to the field as classical papers.

Important research program

Some papers will be important as part of a series, for example as the last work of an important series. This could be a new paper on an important method that makes only a small further improvement and would thus by itself not be too important.

Hopefully this will discourage publishing studies in thin salami slices, as only the last paper would be marked as important on this metric and the reviewer can feel free to give lower assessments to the earlier ones. It will often be completely legitimate to publish papers that only make a small additional contribution; improving the best methods and studies is hard.

To summarise and be more concrete: In my own field, a paper can be important

  1. for the homogenisation community when it helps understanding the algorithms we use or proposes a better method;
  2. for the broader climatological community (and general public) because it changes the assessment of climate trends;
  3. for the homogenisation community because it improves our confidence, for example an analytic study helping us understand a previous numerical result;
  4. for the history of the field, for example the first study using the relative homogenisation principle or the papers on the much-used Standard Normal Homogeneity Test;
  5. for the users of homogenisation methods because it improves one of the best homogenisation methods.

The systems used for these assessments should preferably be useful for all (or many) sciences to make it easier for scientists from other disciplines (and for micro-managers) to judge papers. Thus I would very much appreciate feedback on these five aspects, especially from researchers from other fields to see if this would also work there.

If anyone knows of scholarly work on the assessment of the importance of papers, please also leave a comment. (I am not thinking of papers on computing a bibliographic index based on citations, but of assessments of the intrinsic value of papers.)

An important advantage of such an assessment of the importance of a paper is that all papers on one topic, even of different importance, can be published in the same journal. In this way replication studies could also be published, which in traditional journals often do not reach the publication threshold. A grassroots journal still has a lower quality limit: a paper should at least be technically sound.

The importance of a paper can change over time. Maybe it is found that a certain method has an important application that was previously not appreciated. Maybe a problem is found in the paper. Maybe it is superseded by newer work.

The assessment of the papers in a grassroots journal should thus be updated if new comments and reviews come in and also periodically because the field is making progress. These changes should be visible (if only to be able to demonstrate that earlier assessments can predict impact) and should be justified by the editors.

I would suggest making the quality assessments on a percentile scale. Once enough reviews are in, the software should present the reader with calibrated percentiles, to avoid reviewers placing 90% of the studies in the best 50%. (Except for classical papers, where likely only the best ones are reviewed and thus no calibration should be performed.)
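A minimal sketch of what such a calibration could look like (my own illustration, not an existing implementation): convert each reviewer’s raw scores into percentile ranks within that reviewer’s own set of reviews, so that a generous reviewer cannot place most papers in the top half.

```python
from scipy.stats import rankdata

def calibrated_percentiles(raw_scores):
    """Map one reviewer's raw importance scores to percentiles (0-100)
    within that reviewer's own set of reviews; ties get the average rank."""
    ranks = rankdata(raw_scores)               # 1 .. n, averaged for ties
    n = len(raw_scores)
    return [100.0 * (r - 0.5) / n for r in ranks]

# A reviewer who rates almost everything highly still ends up with a spread.
print(calibrated_percentiles([95, 90, 92, 88, 97, 60]))
```

How the calibrated scores of different reviewers are then combined into one number would remain a judgement call for the editors’ synthesis.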

So much for the general ideas. I hope to get some feedback before writing a more concrete proposal later, as well as posts on the other parts of the assessment and on the work of editors and reviewers.

Grassroots scientific publishing (repost)

This is a repost of my first blog post on grassroots scientific publishing written for my science blog Variable Variability.

These were the weeks of peer review. Sophie Lewis wrote her farewell to peer reviewing. Climate Feedback is making it easy for scientists to review journalistic articles with nifty new annotation technology. And Carbon Brief showed that while there is a grey area, it is pretty easy to distinguish between science and nonsense in the climate “debate”, which is one of the functions of peer review. And John Christy and Richard McNider managed to get an article published that I would have advised rejecting as a reviewer. A little longer ago we had the open review of the Hansen sea level rise paper, where the publicity circus resulted in a-scientific elements spraying their graffiti on the journal wall.

Sophie Lewis writes about two recent reviews she was asked to make: one where the reviewers were negative, but the article was published anyway by a volunteer editor, and one where the reviewers were quite positive, but the manuscript was rejected by a salaried editor.

I have had similar experiences. As reviewer you invest your time and heart in a manuscript and root for the ones you like to make it in print. Making the final decision naturally is the task of the editor, but it is very annoying as a reviewer to have the feeling your review is ignored. There are many interesting things you could have done in that time. At least nowadays you get to see the other reviews and hear the final decision more often, which is motivating.

The European Geosciences Union has a range of journals with open review, where you can see the first round of reviews and anyone can contribute reviews. This kind of open review could benefit from the annotation system used by Climate Feedback to review journalistic articles; it makes reviewing easier and the reader can immediately see the text a review refers to. The open annotation system allows you to add comments to any webpage or PDF article or manuscript. You can see it as an extra layer on top of the web.

The reviewer can select a part of the text and add comments, including figures and links to references. Here is an annotated article in the New York Times that Climate Feedback found to be scientifically very credible, where you can see the annotation system in action. You can click on the text with a yellow background to see the corresponding comment, or click on the small symbol at the top right to see all comments. (Examples of articles with low scientific credibility are somehow mostly pay-walled; one would think that the dark money behind these articles would want them to be read widely.)

I got to know annotation via Climate Feedback. We use the annotation system of Hypothes.is and this system was actually not developed to annotate journalistic articles, but for reviewing scientific articles.

The annotation system makes writing a review easier for the reviewer and makes it easier to read reviews. The difference between writing some notes on an article for yourself and writing a peer review becomes gradual this way. It cannot take away having to read the manuscript and trying to understand it. That takes most of the time, but it is the fun part; reducing the time for the tedious part makes it more attractive to review.

Publishing and peer review

Is there a better way to review and publish? The difficult part is no longer the publishing. The central part that remains is the trust of a reader in a source.

It is becoming ironic that the owners of the scientific journals are called “scientific publishers”, because the main task of a publisher is nowadays no longer the publishing. Everyone can do that with a (free) word processor and a (free) web page. The publishers and their journals are mostly brands nowadays. The scientific publisher, the journal, is a trusted name. Trust is slow to build up (and easy to lose), producing huge barriers to entry and leading to near-monopoly profits of scientific publishing houses of 30 to 40%. That is tax-payer money that is not spent on science and that promotes organizations that prefer to keep science unused behind pay-walls.

Peer review performs various functions. It helps to give a manuscript the initial credibility that makes people trust it, that makes people willing to invest time in studying its ideas. If the scientific literature were as abominable as the mitigation-skeptical blog Watts Up With That (WUWT), scientific progress would slow down enormously. At WUWT the unqualified readers are supposed to find out for themselves whether they are being conned or not. Even if they could do so: having every reader do a thorough review is wasteful; it is much more efficient to ask a few experts to vet manuscripts first.

Without peer review it would be harder for new people to get others to read their work, especially if they make a spectacular claim and use unfamiliar methods. My colleagues will likely be happy to read my homogenization papers without peer review. Gavin Schmidt’s colleagues will be happy to read his climate modelling papers and Michael Mann’s colleagues his papers on climate reconstructions. But for new people it would be harder to be heard, for me it would be harder to be heard if I published something on another topic, and for outsiders it would be harder to judge who is credible. The latter is increasingly important the more interdisciplinary science becomes.

Improving peer review

When I was dreaming of a future review system where scientific articles were all in one global database, I used to think of a system without journals or editors. The readers would simply judge the articles and comments, like on Ars Technica or Slashdot. The very active open science movement in Spain has implemented such a peer review system for institutional repositories, where the manuscripts and reviews are judged and reputation metrics are estimated. Let me try to explain why I changed my mind and how important editors and journals are for science.

One of my main worries for a flat database would be that there would be many manuscripts that never got any review. In the current system the editor makes sure that every reasonable manuscript gets a review. Without an editor explicitly asking a scientist to write a review, I would expect that many articles would never get a review. Personal relations are important.

Science is not a democracy, but a meritocracy. Just voting an article up or down does not do the job. It is important that this decision is made carefully. You could try to statistically determine which readers are good at predicting the quality of an article, where quality could be determined by later votes or citations. This would be difficult, however, because it is important that the assessment is made by people with the right expertise, often by people from multiple backgrounds; we have seen how much even something as basic as the scientific consensus on climate change depends on expertise. Try determining expertise algorithmically. The editor knows the reviewers.

While it is not a democracy, the scientific enterprise should naturally be open. Everyone is welcome to submit manuscripts. But editors and reviewers need to be trusted and level headed individuals.

More openness in publishing could in future come from everyone being able to start a “journal” by becoming an editor (or better, by organizing a group of editors) and trying to convince their colleagues that they do a good job. The fun thing about the annotation system is that you can demonstrate that you do a good job using existing articles and manuscripts.

This could provide real value for the reader. Not only would the reviews be visible, it would also be possible to explain why an article was accepted: was it speculative, but really interesting if true (something for experts), or was it simply solid (something for outsiders)? Which parts do the experts debate about? The debate would also continue after acceptance.

The code and the data of every “journal” should be open so that everyone can start a new “journal” with reviewed articles. So that when Heartland offers me a nice amount of dark money to start accepting WUWT-quality articles, a group of colleagues can start a new journal and fix my dark-money “mistakes”, but otherwise have a complete portfolio from the beginning. If they would have to start from scratch that would be a large barrier to entry, which like the traditional system encourages sloppy work, corruption and power abuse.

Peer review is also not just for selecting articles, but also for helping to make them better. Theoretically the author can also ask colleagues to do so, but in practice reviewers are better at finding errors. Maybe because the colleagues who will put in the most effort are your friends, who have the same blind spots? These improvements of the manuscript would also be missing in a pure voting system of “finished” articles. Having a manuscript phase is helpful.

Finally, an editor makes anonymous reviews a lot less problematic, because the editor can delete comments where anonymity seduced people into inappropriate behavior. Anonymity could be abused to make false attacks with impunity. On the other hand, anonymity can also provide protection in case of real problems where there are large power differences.

The advantage of internet publishing is that there is no need for an editor to reject technically correct manuscripts. If the contribution to science is small or if the result is very speculative and quite likely to be found to be wrong in future, the manuscript can still be accepted but simply be given a corresponding grade.

This also points to a main disadvantage of the current dead-tree-inspired system: you get either a yes or a no. There is a bit more information in the journal the author chooses, but that is about it. A digital system can communicate much more subtly with a prospective reader. A speculative article is interesting for experts, but may be best avoided by outsiders until the issues are better understood. Some articles mainly review the state-of-the-art, others provide original research. Some articles have a specific audience: for example the users of a specific dataset or model. Some articles are expected to be more important for scientific progress than others or discuss issues that are more urgent than others. And so on. This information can be communicated to the reader.

The nice thing about the open annotation system is that we can begin reviewing articles before authors start submitting them. We can simply review existing articles as well as manuscripts, such as the ones uploaded to ArXiv. The editors could reject articles that should not have been published in the traditional journals and accept manuscripts from archives. I would trust such an assessment by a knowledgeable editor (team) more than acceptance by a traditional journal.

In this way we can produce collections of existing articles. If the new system provides a better reviewing service to science, the authors at some moment can stop submitting their manuscripts to traditional journals and submit them directly to the editors of a collection. Then we have real grassroots scientific journals that serve science.

For colleagues in the communities it would be clear which of these collections have credibility. However, for outsiders we would also need some system that communicates this, which would traditionally be the role of publishing houses and their high barriers to entry. This could be assessed where collections overlap, preferably again by humans and not by algorithms. For some articles there may be legitimate reasons for differences (hard to assess, other topic of the collection); for other articles an editor not having noticed problems may be a sign of bad editorship. This problem is likely not too hard; in a recent analysis of Twitter discussions on climate change there was a very clear distinction between science and nonsense.

There is still a lot to do, but with the ease of modern publishing and the open annotate system a lot of software is already there. Larger improvements would be tools for editors to moderate review comments (or at least to collapse less valuable comments); Hypothes.is is working on it. A grassroots journal would need a grading system; standardized when possible. More practical tools would include some help in tracking the manuscripts under review and for sending reminders, and the editors of one collection should be able to communicate with each other. The grassroots journal should remain visible even if the editor team stops; that will need collaboration with libraries or science societies.

If we get this working

  • we can say goodbye to frustrated reviewers (well, mostly),
  • goodbye to pay-walled journals in which publicly financed research is hidden from the public and many scientists alike and
  • goodbye to wasting limited research money on monopolistic profits by publishing houses, while
  • we can welcome better review and selection and
  • we are building a system that inherently allows for post-publication peer review.

What do you think?

Related reading

There is now an “arXiv overlay journal”, Discrete Analysis. Articles are published/hosted by ArXiv; otherwise it uses traditional peer review. The announcement mentions software initiatives that make starting a digital journal easy: ScienceOpen, Scholastica, Episciences.org and Open Journal Systems.

Annotating the scholarly web

A coalition for Annotating All Knowledge: a new open layer is being created over all knowledge.

Brian A. Nosek and Yoav Bar-Anan describe a scientific utopia: Scientific Utopia: I. Opening scientific communication. I hope the ideas in the above post makes this transition possible.

Climate Feedback has started a crowdfunding campaign to be able to review more media articles on climate science.

Farewell peer reviewing

7 Crazy Realities of Scientific Publishing (The Director’s Cut!)

Mapped: The climate change conversation on Twitter

I would trust most scientists to use annotation responsibly, but it can also be used to harass vulnerable voices on the web. Genius Web Annotator vs. One Young Woman With a Blog. Hypothesis is discussing how to handle such situations.

Nature Chemistry blog: Post-publication peer review is a reality, so what should the rules be?

A report from the Knowledge Exchange event “Pathways to open scholarship” gives an overview of the different initiatives to make science more open.

Magnificent BBC Reith lecture: A question of trust