In response to failed replications, some researchers argue that replication studies are especially convincing when the people performing the replication are ‘competent’ ‘experts’.
Paul Bloom
has recently remarked: “Plainly, a failure to replicate means a lot
when it’s done by careful and competent experimenters, and when it’s clear that
the methods are sensitive enough to find an effect if one exists. Many failures
to replicate are of this sort, and these are of considerable scientific value.
But I’ve read enough descriptions of failed replications to know how badly some
of them are done. I’m aware as well that some attempts at replication are done
by undergraduates who have never run a study before. Such replication attempts
are a great way to train students to do psychological research, but when they
fail to get an effect, the response of the scientific community should be: Meh.”
This mirrors the response by John Bargh after replications of the elderly priming studies yielded no significant
effects: “The attitude that just anyone can have the expertise to conduct
research in our area strikes me as more than a bit arrogant and condescending,
as if designing and conducting these studies were mere child's play.” “Believe
it or not, folks, a PhD in social psychology actually means something; the four
or five years of training actually matters.”
So where is the evidence we should ‘meh’ replications by novices that
show no effect? And how do we define a ‘competent’ experimenter? And can we
justify the intuition that a non-significant finding by undergraduate students
is ‘meh’, when we are more than willing to submit the work by those same undergraduates for publication when the outcome is statistically significant?
One way to define a competent experimenter is simply by
looking at who managed to observe the effect in the past. However, this won’t do.
If we look at the elderly
priming literature, a p-curve
analysis gives no reason to assume anything more is going on than p-hacking. Thus, merely finding a
significant result in the past should not be our definition of competence. It
is a good definition of an ‘expert’, where the difference between an expert and a novice is the amount of expertise one has in researching a topic. But I see no reason
to believe expertise and competence are perfectly correlated.
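For readers less familiar with the p-curve analysis mentioned above: the core idea is that when a true effect exists, significant p-values should pile up at low values (a right-skewed curve), whereas p-hacking in the absence of an effect tends to produce a flat or left-skewed curve. As a minimal sketch (with made-up p-values, and using Fisher's method for the basic right-skew test rather than the full p-curve app by Simonsohn and colleagues):

```python
import numpy as np
from scipy import stats

# Hypothetical significant p-values from a set of published studies.
p_values = np.array([0.012, 0.034, 0.041, 0.047, 0.049])

# Conditional on being significant, a p-value is uniform on (0, 0.05) under
# H0, so dividing by 0.05 gives a 'pp-value' that is uniform on (0, 1).
pp_values = p_values[p_values < 0.05] / 0.05

# Fisher's method: a small combined p-value indicates right skew, i.e.
# evidential value; here the result is clearly non-significant.
chi2 = -2 * np.sum(np.log(pp_values))
p_right_skew = stats.chi2.sf(chi2, df=2 * len(pp_values))
print(f"Test for right skew (evidential value): p = {p_right_skew:.2f}")
```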
There are cases where competence matters, as Paul Meehl reminds
us in his lecture series (video 2, at 46:30). He discusses a situation where Dayton Miller provided evidence in support of the ether drift, long after Einstein’s relativity theory had explained it away. This is perhaps the opposite of a replication showing a null effect, but the competence of Miller, who had the reputation of being a very reliable experimenter, is clearly taken into account by Meehl. It took
until 1955 before the ‘occult result’ observed by Miller was explained by a
temperature confound.
Showing that you can reliably reproduce findings is an
important sign of competence – if this has been done without relying on
publication bias and researchers’ degrees of freedom. This could easily be done
in a single well-powered pre-registered replication study, but in recent years I am not aware of any researchers who have demonstrated their competence by reproducing a contested finding in a pre-registered study. I definitely understand that researchers prefer to spend their time in other ways than defending their past research. At the same time, I’ve seen many researchers who spend a lot
of time writing papers criticizing replications that yield null results.
Personally, I would say that if you are going to invest in defending your
study, and data collection doesn’t take too much time, the most convincing
demonstration of competence is a pre-registered study showing the effect.
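To give an idea of what ‘well-powered’ means here, a quick sample size calculation (purely illustrative; the assumed effect size of d = 0.4 is hypothetical and not taken from any specific contested finding):

```python
from statsmodels.stats.power import TTestIndPower

# Participants needed per group for an independent-samples t-test,
# assuming a hypothetical effect of d = 0.4, 90% power, and alpha = .05.
n_per_group = TTestIndPower().solve_power(effect_size=0.4, power=0.90,
                                          alpha=0.05, ratio=1.0)
print(f"Participants needed per group: {n_per_group:.0f}")  # about 133
```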
So, the idea that there are competent researchers who can
reliably demonstrate the presence of effects that are not observed by others is
not supported by empirical data (so far). In the extreme case of clear
incompetence, there is no need for an empirical justification, as the importance
of competence to observe an effect is trivially true. It might very well be true
under less trivial circumstances. These circumstances are probably not
experiments that occur completely in computer cubicles, where people are guided
through the experiment by a computer program. I can’t see how the expertise of
experimenters has a large influence on psychological effects in these
situations. This is also one of the reasons (along with the 50 participants randomly
assigned to four between-subjects conditions) why I don’t think the ‘experimenter
bias’ explanation for the elderly priming studies by Doyen
and colleagues is particularly convincing (see Lakens & Evers,
2014).
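To make the sample size argument concrete: with around 50 participants spread over four between-subjects conditions, any pairwise comparison rests on roughly 12 to 13 participants per cell. A back-of-the-envelope power calculation (my own illustration, not a number reported by Doyen and colleagues or in Lakens & Evers, 2014) shows how little such a design can detect:

```python
from statsmodels.stats.power import TTestIndPower

# Power of an independent-samples t-test with 13 participants per cell,
# assuming a hypothetical medium-sized effect of d = 0.5 and alpha = .05.
power = TTestIndPower().power(effect_size=0.5, nobs1=13, ratio=1.0, alpha=0.05)
print(f"Power with 13 per cell and d = 0.5: {power:.2f}")  # roughly 0.22
```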
In a recent pre-registered replication project re-examining the
ego-depletion effect, both experts and novices performed replication studies. Although this paper
is still in press, preliminary reports at conferences and on social media tell
us the overall effect is not reliably different from 0. Is expertise a
moderator? I have it on good authority that the answer is: No.
This last set of studies shows the importance of getting
experts involved in replication efforts, since it allows us to empirically
examine the idea that competence plays a big role in replication success. There
are, apparently, people who will go ‘meh’ whenever non-experts perform
replications. As is clear from my post, I am not convinced the correlation
between expertise and competence is 1, but in light of the importance of social
aspects of science, I think experts in specific research areas should get more
involved in registered replication efforts of contested findings. In my book, and
regardless of the outcome of such studies, performing pre-registered studies examining the robustness of your findings is a clear sign of competence.