David Oliver, a partner of the Vorys Houston office, authored a column entitled “W.Va. Doesn't Understand The Scientific Method: Part 1”, which appeared in the February 13, 2014 edition of Product Liability Law360. The full text of the column is included below.


W.Va. Doesn't Understand The Scientific Method: Part 1

Milward v. Acuity has spawned another troubling anti-science opinion: Harris v. CSX Transportation Inc. Whereas Milward held that credentialed wise men should be allowed to testify that an effect that has never been observed (indeed one that could not be detected by any analytical method known to man) actually exists, Harris holds that such seers may further testify that an effect that would be observable (if it existed) and which has been repeatedly sought in the wake of its putative cause yet invariably not observed actually exists nonetheless.

How are plaintiffs pulling it off? By convincing some judges that testing, the essence of the scientific method, need not be done in laboratories and need not be independently reproducible.

These courts have decided that biomedical discoveries can reliably be made, at least in the courtroom, merely by having an expert witness test in his mind a suggested causal association by running it through whatever causal criteria he thinks appropriate and weighing the resulting evidence according to his subjective judgment — really, it's that bad.

Two courts have now granted paid-for hypotheses a status equal to or higher than — depending on the jury's verdict — that of scientific knowledge (conjectures that have been severely and repeatedly tested and have repeatedly passed the tests).

Now, we could point out that hypotheses generated by the process endorsed by these courts pan out perhaps 1-in-100 times (i.e., the method's rate of error is 99 percent) and that the process therefore flunks a key Daubert factor, but that ignores the real ugliness here: an attack on the scientific method itself.

It has been said that the point of the scientific method is to let nature speak for herself. By observing, measuring and recording, scientists listen to her. By generating hypotheses about the order in which the observations occurred, they attempt to make sense of what she's saying. By testing their hypotheses (i.e., by attempting to reproduce the observed effect in a controlled setting), scientists ask if they've misunderstood her. By publishing their results, scientists communicate what they've learned and invite others to try to reproduce and build upon it.

This method of objectively assessing a phenomenon, guessing at what it implies about how the world works, testing that guess and then reporting the results along with the method, materials and measurements involved ushered in the world we know today. It also dislodged those who in the past had sought to speak for nature; those whose power and place had been derived from their ability to explain the world by way of plausible and compelling stories that served some narrative. They were dislodged first because the scientific method proved a better lodestone and second because the method, once applied to ourselves, revealed human intuition and judgment to be woefully prone to bias, fear, superstition and prejudice.

Luckily for the would-be oracles who made their living as expert witnesses, it took a long time and extreme abuse of the word scientific before the law came to terms with the scientific method. Finally, Daubert accepted the central tenet of the scientific method (i.e., that to be scientific a theory must be testable) and thus necessarily accepted that the law could not lead science, as it was obviously unequipped and ill-suited for testing theories.

The law would have to follow science. Other opinions refined the arrangement until we got to where we are today (at least where we are in Texas). Now an expert's opinion must be founded on scientific knowledge and may not reach beyond what can reasonably be inferred from it (i.e., the analytical gap between what is known and what follows from that knowledge doesn't require much of a leap; it's really just a matter of deduction). A case that's kept us busy the last month provides an example.

The plaintiffs' decedent had died of acute myelogenous leukemia ("AML") and his family blamed benzene exposure. The battle was fought not over whether benzene can cause AML — though there are some interesting arguments to be made on the subject — but rather over whether plaintiff was exposed to enough and whether the risk posed by the exposure was considerable.

The experts did have some leeway on issues like retrospective exposure estimation and whether the latency period was too long, as on both sides there were scientific studies demonstrating the effect in question. Yet in the main the experts' opinions overlapped, differing only according to the testimony of the fact witnesses on which their side relied. The jury thus was to decide which of two competing pictures of plaintiff's workplace made the most sense and not whether benzene causes AML. Surely that's the sort of case for which trial by jury was designed.

However, many still chafe against nature's tyranny and argue for the old ways; for human judgment unconstrained by measurement, testing and thus the embarrassing possibility (likelihood actually) of having their beliefs publicly refuted.

So some argue that nature is too coy and that she refuses to reveal what they're sure must be true. Others just don't like what she has to say. And of course there's the whole financial angle given that a lot more lawsuits could be filed and won if nature could be made to speak on command or if the subjective judgment of experts could be re-elevated to the status of pronouncements by nature.

So what to do? One solution is to adopt the "if you can't beat 'em, join 'em" motto and bank on the truism that if you can't find a way to generate a statistically significant association between an effect and what you suspect is its cause then you're too stupid to be a scientist. But that plan first ran afoul of the courts when it was recognized that, for example, improper epidemiological methodology had been employed to generate the results (see, e.g., Merrell Dow v. Havner), and more recently as it has become evident that there's a crisis in the biomedical sciences: many if not most statistically significant results cannot be reproduced, and it's because many and probably most reported findings involving small effects (generally an increased risk of 3-fold or less) are false.
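The arithmetic behind that last point can be sketched in a few lines. This is only an illustration and not anything drawn from the cases: the function name, the 0.05 significance threshold, the 80 percent power figure and the three prior probabilities are all assumptions chosen for the example, and the calculation ignores bias and selective reporting, which would only make matters worse.

```python
def positive_predictive_value(prior, power, alpha):
    """Share of statistically significant findings that reflect a real effect.

    prior: assumed fraction of tested hypotheses that are actually true
    power: assumed chance a study detects a true effect
    alpha: significance threshold, i.e. chance a null effect tests "significant"
    """
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


# Hypothetical inputs for illustration only; none of these come from a study.
for prior in (0.5, 0.1, 0.01):
    ppv = positive_predictive_value(prior, power=0.8, alpha=0.05)
    print(f"prior={prior:.2f} -> share of significant results that are true: {ppv:.2f}")
```

Under these assumed inputs, roughly 94 percent of significant results are true when half of the tested hypotheses are true, about 64 percent when one in ten is, and only about 14 percent when one in a hundred is. The less plausible the hunches being tested, the less a lone statistically significant result means.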

What to do then? You need a method a court will say is valid and you need a test that can't be mathematically demonstrated to generally produce bad results and that also can't be run by someone else lest she falsify your theory and ruin all the fun.

What about equating a decision-theory process like weighing the evidence or applying the so-called A. Bradford Hill criteria to, say, significance testing of statistical inferences (e.g., epidemiology) or to fluorescent labeling of macromolecules for quantitative analysis of biochemical reactions? Now you're on to something!

Because the weights assigned to bits of scientific evidence are necessarily matters of judgment, experts can now test their own theories by weighing the evidence in the scales of their own judgment. And any theory that passes this test gets to be called scientific knowledge and, best of all, can never be refuted. A jury can then decide which of two competing pictures of, say, the anthrax disease propagation process (e.g., miasma vs. germ theory) is the correct one. Robert Koch would be appalled, but the Harris court bought it.

The decedent in Harris worked for a railroad and claimed his multiple myeloma (MM) had been caused by exposure to diesel fumes. The problem was that every epidemiological study of railroad workers (i.e., every known test of the potential relationship between working for a railroad and MM) failed to show that MM was associated with railroad work. In fact, every study designed specifically to test the theory that MM follows diesel exhaust exposure by railroad workers has failed to demonstrate an association, much less causation.

The plaintiff tried to reframe the question by saying there's benzene in diesel exhaust smoke and that benzene has been associated with MM, but the problem was that there's benzene in cigarette smoke too; far more in fact than in diesel smoke, and yet MM risk is not increased with cigarette smoking. The plaintiff then re-reframed the question by arguing that some molecules found in diesel exhaust had been associated with lung cancer and "oh, by the way," some of the chromosomal changes found in Harris' pathology were sometimes seen in people (with a different disease) exposed to benzene.

In sum, there was no evidence drawn from observations of the world (i.e., the scientific method) to demonstrate that diesel exhaust was a cause of MM in railroad workers, and the trial court excluded the experts' opinions.

On appeal, the West Virginia Supreme Court of Appeals latched onto the following quote from Milward, which I'll break into its three points:

  • "The fact that the role of judgment in the weight of the evidence approach is more readily apparent than it is in other methodologies does not mean that the approach is any less scientific." This is where the need for independently verifiable testing is deleted from the scientific method.
  • "No matter what methodology is used, an evaluation of data and scientific evidence to determine whether an inference of causation is appropriate requires judgment and interpretation." This is where the need for a theory to have passed a serious test (i.e., that the effect has been observed to actually follow the putative cause in an experiment or retrospective epidemiological study) is eliminated as a requirement for a theory to constitute "scientific knowledge."
  • "The use of judgment in the weight of the evidence methodology is similar to that in differential diagnosis, which we have repeatedly found to be a reliable method of medical diagnosis." This is the punch line. A method for ruling out known diseases to infer the one from which a patient is actually suffering is transformed into a way to rule in by human judgment heretofore unknown causes of that disease without any objective evidence that, to paraphrase Hume, whenever the putative cause occurred the effect has routinely been observed to follow.

Given the foregoing as the court's misunderstanding of the scientific method it should come as no surprise that it concluded "the experts in the instant case did not offer new or novel methodologies. The epidemiological, toxicological, weight of the evidence and Bradford Hill methodologies they used are recognized and highly respected in the scientific community."

The effort to conflate statistical hypothesis testing and pharmacokinetic assays with subjective human judgment was complete and the trial court's ruling was reversed.

So now, in West Virginia, it's enough for an expert to say, in response to the question "Has your hypothesis been tested?": "Yes, I have weighed the data that gave rise to the hunch in my brain pan and I can now report that it convincingly passed that test and may reliably be considered 'scientific knowledge.'" Ugh.