Rethinking Peer Review As The World Peer Reviews Claimed Proof That P≠NP

from the there-are-a-lot-more-peers dept

Thu, Aug 12th 2010 3:09pm — Mike Masnick

We recently discussed how incredibly broken the traditional scientific journal system is, in terms of how they tend to lock up access to information. However, that left out a much bigger concern: the peer review system they use doesn't always work very well. There is, of course, the famous case of Hendrik Schön, who was the toast of the physics world, until it was discovered that his "breakthroughs" were frauds -- even though they were peer reviewed. But that, of course, is an extreme case. Even outside of that, though, peer review has always been somewhat questionable, and many have warned in the past that it's not particularly reliable or consistent in judging the quality of research.

This week, the world has been taken by storm by claims from Vinay Deolalikar that he has proved P≠NP, one of the biggest problems (if not the biggest) in math and computer science, and one with potentially huge implications (pdf). However, what's interesting is that the paper started getting a ton of attention prior to any sort of peer review... but all of the attention around it has resulted in people (experts and non-experts alike) around the world beginning to take part in a self-organizing peer review on the fly.

This is leading some to point out that this seems to be a much better method of peer review and should be looked at more seriously (found via Glyn Moody). Apparently, people are realizing that a much more open post-publication peer review process, where anyone can take part, is a lot more effective:

We are starting to see examples of post-publication peer review and see it radically out-perform traditional pre-publication peer review. The rapid demolition [1, 2, 3] of the JACS hydride oxidation paper last year (not least pointing out that the result wasn't even novel) demonstrated the chemical blogosphere was more effective than peer review of one of the premier chemistry journals. More recently 23andMe issued a detailed, and at least from an outside perspective devastating, peer review (with an attempt at replication!) of a widely reported Science paper describing the identification of genes associated with longevity. This followed detailed critiques from a number of online writers.

The post goes on to discuss some of the pros and cons of this kind of post-publication peer review, and to respond to some of the claimed challenges to it as well. Obviously, this particular paper is a bit different in that the fame around the problem, and the interest in the story itself has a lot more people willing to just go and dig into the proof, but that doesn't mean there aren't some useful lessons to be learned here about getting past the old and increasingly ineffective method of peer review used in many places today.

Filed Under: p=np, peer review, post publication

Reader Comments

Anonymous Coward, 12 Aug 2010 @ 4:08pm

This isn't just something that happens with huge questions like P=NP; if you look at the math section on arxiv.org, there are plenty of math papers on much smaller problems that got retracted due to errors caught by others in the field reading the preprint before the peer review process caught any errors.

Of course, there are also plenty of papers that got through the peer review process but needed to be retracted later due to errors. There are also surely many published papers out there with serious errors that no one has caught yet, but if it's a topic that people care about, eventually they will be (and if it's not, then the errors aren't such a big deal in any case).

In the sciences, it's much harder than in math to detect fraud, since if a researcher claims they performed a complicated experiment and got certain results, it may be very difficult to catch by any means short of running a similar experiment. The peer review process there is more about catching major flaws in the design of the experiment, analysis of data, etc. than to check that the data isn't pure fabrication. (If it's an area that people care about, subsequent work will do that.)

In any case, I see no problem with the peer review system here; it just may not be meant to do what you think it's meant to do.

The Declinator, 12 Aug 2010 @ 4:22pm

This is actually not that much of a surprise. It is cool that it is proven, though. Anyway, there are some problems with the Victoria Gill/BBC article. Quoted from the BBC:

"P vs NP is asking - can creativity be automated?" Dr Deolalikar claims that his proof shows that it cannot. It may seem esoteric, but solving P vs NP could have "enormous applications", according to Dr Aaronson. From cracking codes to airline scheduling - any computational problem where you could recognise the right answer, this would tell us if there were a way to automatically find that answer.

The above reads as if the proof actually makes all these scary things possible, but in reality it does exactly the opposite: the proof assures people that there can be some trust in current cryptographic approaches! Secondly, a bit lower there was an odd paragraph on a simple test that would ensure that the proof is correct. It goes as follows: if the author can prove that his proof will not contradict current facts, then it is a good proof. So how does one go about proving the innocence of this proof? The burden of proof (hehe) in this case lies with the accuser: those that claim that his proof is wrong should come forward and explain where it is wrong.

Anyway: Taken from the BBC website:

One way to test a mathematical proof, he said, is to ensure that it only proves things we know are true. "It had better not also prove something that we know to be false." Other mathematicians have responded to Dr Deolalikar's paper by asking him to show that his proof passes this test. "Everyone agrees," said Dr Aaronson, "if he can't answer this, the proof is toast."

Maybe somebody can shed light on this last paragraph?

- Hyman Rosen, 16 Aug 2010 @ 9:30am
  
  Don't prove a falsehood
  (I'm just guessing, but...) In theoretical computer science, one can set up computing systems with an "oracle" which is defined to be able to solve a certain problem in a single step. Given properly constructed oracles (that is, the problems which they solve), one can create a mathematical universe in which P = NP or one in which P != NP. Therefore, the proof technique used to demonstrate P != NP must not be independent of whether the computing system includes oracles, or else it would apply to both augmented universes and therefore prove a falsehood in one of them.
  
  Various proof techniques have undergone this analysis, excluding them from being able to solve the P vs. NP question, because it is possible to construct versions of the mathematical universe where the P vs. NP problem is either known true or known false and the proof techniques cannot separate the augmentations involved.

Eileen (profile), 12 Aug 2010 @ 4:23pm

from a scientist...
While I agree that the way science is published is badly in need of a makeover (largely being answered by open source journals, however slowly they are being adopted), I agree with the above AC that you are somewhat misunderstanding what peer review is for. Peer review is partly to determine if the science is "worthy" of a publication (e.g., most rejections from Nature are not because the science is faulty but rather because it is not noteworthy enough), and of course to ensure the scientist has not done anything egregiously wrong (which would include lacking controls, erroneous conclusions, etc).

But it is certainly not expected to be an exhaustive review of the science for veracity. Something key here: not all published science is expected to be correct! All kinds of slightly incorrect/misinterpreted data get published all the time. A lay person would be a fool for accepting a result simply because it is published, even if it is in Nature. This is how science *really* operates - you build consensus through many researchers and many experiments. Each piece of evidence (publication) is weighed in the balance.

- ChurchHatesTucker (profile), 12 Aug 2010 @ 5:58pm
  
  Re: from a scientist...
  "Something key here: not all published science is expected to be correct! All kinds of slightly incorrect/misinterpreted data get published all the time."
  
  Well, yeah, so there's room for improvement. That's the point of crowd-sourcing review. It's not going to be perfect either (interpreting the crowd is an art in itself) but it'll likely improve things.

  - dude, 13 Aug 2010 @ 5:11am
    
    Re: Re: from a scientist...
    There are many reasons that things are not caught by the reviewer and are not likely to change from crowd sourcing. Two of them are:
    
    1. The lack of knowledge in the minutiae of the work. Usually articles focus on some minor aspect of some phenomenon that occurs in X case. It's just so specific that very few know what's going on.
    
    2. The raw data presented in papers is usually condensed. Basically, data sets can be/are large and almost never presented in whole. So the reviewer reviews the condensed data and sees if your conclusions are logical. But the original data can be screwy or "condensed" poorly, and there is little way of knowing.

    - ChurchHatesTucker (profile), 13 Aug 2010 @ 11:08am
      
      Re: Re: Re: from a scientist...
      1) You'd be surprised at the expertise that is available in the crowd. Teasing out the actual expertise is part of the art that I mentioned above.
      
      2) You'd be surprised at how tenacious the crowd can be at following the chain of evidence (assuming it's present, and at this point why wouldn't it be?)

Anonymous Coward, 12 Aug 2010 @ 4:47pm

I think some people misunderstand what the article's point is.

If there may be a better way to do it, one that can improve the quality of the reviews, why not do it?

- Anonymous Coward, 13 Aug 2010 @ 7:47am
  
  Re:
  "I think some people misunderstand what the article's point is.
  
  If there may be a better way to do it, one that can improve the quality of the reviews, why not do it?"
  
  And what is this way exactly? We've already explained how the described "crowdsourced review" already happens automatically in the current system and serves a different purpose than the peer review process.
  
  "The above reads as if the proof actually makes all these scary things possible, but in reality it does exactly the opposite: the proof assures people that there can be some trust in current cryptographic approaches"
  
  This isn't quite true either: a lot of cryptography is based on the difficulty of factoring large numbers (which are the product of two appropriately chosen random large primes). This problem is in NP (i.e. a factorization can be checked in polynomial time by just multiplying the factors). Thus if P=NP, there's a "fast" (polynomial-time) algorithm for factoring.
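  
  To make the "easy to check" half of that concrete, here is a minimal Python sketch. The function names are made up for illustration, and the trial-division search is only the naive approach, shown for contrast; it is not a claim about how real factoring algorithms or cryptographic libraries work.

```python
from typing import Optional


def verify_factorization(n: int, p: int, q: int) -> bool:
    """Check a claimed factorization n = p * q with nontrivial factors.

    This is a single multiplication plus a few comparisons, so it runs in
    time polynomial in the number of digits of n.
    """
    return 1 < p < n and 1 < q < n and p * q == n


def find_factor_by_trial_division(n: int) -> Optional[int]:
    """Return the smallest nontrivial factor of n, or None if there is none.

    Naive search: the loop can run up to about sqrt(n) times, which is
    exponential in the number of digits of n.
    """
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return None


if __name__ == "__main__":
    n = 101 * 103
    print(verify_factorization(n, 101, 103))
    print(find_factor_by_trial_division(n))
```

  Checking a proposed answer costs one multiplication, while the naive search has to grind through candidate divisors. Whether every problem with a fast check like this also has a fast search is exactly the P vs. NP question.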
  
  Factoring is not, however, known to be NP-hard (i.e. it's not known that there's a polynomial reduction from any NP problem to factoring), so for all we know, it may not be as hard as some other NP problems. It could be the case that P!=NP, yet factoring is "easy" anyway.

Jason, 12 Aug 2010 @ 5:46pm

scaling?
Does this scale to less popular papers? This one attracted a ton of attention but would a less known topic get enough eyes to find anything?

CharlieM (profile), 12 Aug 2010 @ 5:46pm

You are forgetting about the little people...
Mike, while I agree almost completely with your stated problems concerning the peer review process, your example of P≠NP doesn't address a larger concern. World or crowd sourced peer review certainly works for the 2-3 HUGE manuscripts that get out every year, but what about the literally THOUSANDS of smaller publications?

How many people are really going to be interested in reading a new proposed enzymatic mechanism for some arctic fish?

Finally, most of what comes back from the review process when a manuscript is returned is a set of subtle changes (assuming it's not a downright objection) that require minor amendments. Everyone has an opinion; under the current process I only have to address 2-3 before it's accepted. With a crowd sourced process, how is one to deal with conflicting changes, or just erroneous ones?

Finally - while personal bias and conflicting interests are notorious in the review process, it is currently somewhat moderated by the editor and other reviewers. With a crowd sourced review - you can easily get 'scooped' by a big well funded lab who can make the changes and republish the same work in days, when it might take you months.

It's an interesting idea, but IMO the flaws outweigh the benefits.

- Anonymous Coward, 12 Aug 2010 @ 8:26pm
  
  Re: You are forgetting about the little people...
  "you can easily get 'scooped' by a big well funded lab who can make the changes and republish the same work in days, when it might take you months."
  
  and why is this bad exactly?

Jay (profile), 12 Aug 2010 @ 11:32pm

Wikipedia pwns all
Not much to say. But when Wikipedia is doing a lot better than Encyclopedia Britannica, you know the world wants to socialize. Great to see that others realize that perhaps other professions can have good input as well.

Anonymous Coward, 13 Aug 2010 @ 3:57am

I predict this story will be picked up and applied to Global Warming very soon.

freak, 13 Aug 2010 @ 5:57am

Already happening . . .
A few points.

1) I go on a few private forums in my field, math, and pretty much everyone there does share their papers ahead of time, and does ask for criticism. I have little doubt that other fields have similar forums or meeting points, (I know computer science does).

2) A lot of us will not share results publicly, not for fear of scientific disapproval, but of public disapproval. Even on the math forums, even with screening and constant banning, we still get a lot of religious nuts who try to join, or actually do, and who argue, and argue, about math, and try their best to discredit mathematicians.
If the mathematician is in the wrong place, they can succeed. I know of one case where a professor in Arkansas was fired after posting a draft of a paper that some religious nuts argued about. And they were enough to threaten the university and get him fired.
(A mathematician, of all studies. Physics, it's possible to argue about. Physicists argue about it all the time. Math? Math follows logically from axioms. If there are no contradictions in a theorem, it cannot be false given its axioms)

Looking at other pseudo-science, there's no doubt in my mind that anyone studying, say, quantum mechanics, anything medical to do with vaccines, evolution, biology, DNA, astronomy, etc. would want to be very careful when releasing a paper to the public before sending it to a journal.

Chris Ball (profile), 13 Aug 2010 @ 7:12am

A journal editor's perspective
As a former editor of a peer-reviewed academic journal (admittedly in the social sciences), I have a slightly different take on peer review. The main role of our peer review process was not to determine what was "good" or "right", but to determine what was worth publishing. High quality journals get a lot of submissions, and, sadly, a lot of them aren't very good. But, since most of them deal with very specialized subject matter, it is often hard for a generalist editor to tell the good papers from the duds. So, for those papers that pass our basic internal review, we get the opinion of someone who knows about the specific area to help us tell the good from the bad.
Most of the submissions end up being published somewhere, as they should be--heck, most of them are already posted in some form on SSRN before we even get them--so the traditional system of peer reviewed journals doesn't really serve a gatekeeper role per se. Rather, the journal's role is mostly editorial (i.e. making the papers the best that they can be) and curatorial (i.e. bringing attention to them). This, I think, is an important value-added service.
So while I agree that the peer review that goes on after publication is more important than the traditional process, this realization hardly heralds the end of traditional peer review. That said, it's probably important for people to be aware of the difference between pre- and post-publication peer review, and not assume that because a paper has been published in a reputable journal it has been thoroughly vetted for correctness.

Anonymous Coward, 13 Aug 2010 @ 8:38am

Peer review is by far the best way to verify research. It isn't perfect but nothing else comes close for reliability, neutrality, and exposing fraud.

Len Ellis, 13 Aug 2010 @ 12:38pm

Like Wikipedia in reverse? Instead of lots of independent folks building an article, they take it apart? Same strengths and weaknesses?

orbitalinsertion (profile), 13 Aug 2010 @ 2:47pm

In related news,
http://scienceblogs.com/pharyngula/2010/08/another_publisher_stonewalls_o.php
"Another publisher stonewalls on how he screwed up"

Epically bad "peer review" on an epically bad "paper". +1 for crowd review.

Michael Winikoff, 14 Aug 2010 @ 1:57pm

Peer review (like democracy) isn't ideal. But it's better than publishing things without any sort of review!

The examples you give for "crowd reviewing" are exceptional because the papers in question were high-publicity papers with highly significant (revolutionary even?) results.

Most research papers make relatively modest and incremental contributions. Many papers are barely cited when they are published. Putting them up and waiting for reviewers isn't likely to work.

Crowd source reviewing appears to work very well for high-profile papers, but I can't see it working for all papers, unless the volume of papers published per year drops significantly, or researchers decide to spend a lot less time writing research papers, and a lot more time finding papers on the web to review.

Cameron Neylon, 14 Aug 2010 @ 11:36pm

What is peer review anyway?
I'm the author of the linked post, so thanks to Mike for bringing it to a wider audience and thanks all for the discussion. Just to establish bona fides, I am both an academic editor on a couple of journals and have 20-odd peer reviewed papers, so I've seen the process from both sides of the fence.

One of the things that strikes me about a lot of the comments here is that they betray very different views of what peer review is for and what it does. Is it supposed to determine how good the evidence is? Or how important the paper is? Or to protect the mainstream media from making fools of themselves? The bottom line is that we have very little evidence that pre-publication closed peer review is any good at any of these things.

In a sense, as scientists this should be obvious. You would never make a strong claim based on a very small number of measurements, which rarely come to the same result, and then hide the actual data away where no-one can see it. Yet this is exactly what we do with pre-publication peer review in most cases. It also costs a fortune and, as was pointed out on my post, it's increasingly difficult for editors to find people willing to take the time to do a good job.

What I think this episode shows is that for high profile papers this approach can work very well (though, as a commenter on my post also pointed out, it really requires a positive convenor/moderator to guide the discussion; in the case of the Deolalikar paper, RJ Lipton has been playing this role very ably). The question is whether similar approaches can be applied to the 1.5M other papers published each year.

My view is that, for the ones that matter, it could work, and that for the rest it is better to just publish them without review. If no-one is ever going to cite them then why bother to review them? And in particular, why get two people who are probably not the right people to review them rather than putting them out so that the right people can find them themselves?

To argue that not publishing is better you'd have to make the case that the cost of review, and the opportunity costs of preventing publication, are outweighed by the quality improvement, or the positive effects of not publishing. I haven't seen any evidence that supports that assertion.