Our court system is all about law, obviously, but sometimes the legal field intersects with research fields. Two points where the law overlaps with research methods are the use of scientific testimony and the conduct of pretrial research projects like mock trials or focus groups. In both settings, a failure to appreciate basic research norms can lead to misunderstanding and misuse. Problems are most acute when statistics are used by those who aren't statistical experts and don't have statistical experts on their teams. As part of its "Ten Simple Rules" series, the open-access journal PLOS Computational Biology has published a widely read essay by Carnegie Mellon University's Robert E. Kass and colleagues entitled "Ten Simple Rules for Effective Statistical Practice." The paper is aimed at researchers who, finding themselves with larger and larger amounts of data, are increasingly tempted to "go fishing" for relationships within the data, but it also carries some important implications for consumers of research. "The paper is an instant 'must-read' for anyone who cares about good and reproducible science," says CMU psychologist Michael Tarr. In this post, I will share Kass's ten rules and discuss the implications both for scientific testimony and for pretrial research.
One, Statistical Methods Should Enable Data to Answer Scientific Questions
Collaborating with statisticians is often most helpful early in an investigation because inexperienced users of statistics often focus on which technique to use to analyze data, rather than considering all of the ways the data may answer the underlying scientific question.
The questions lead, and the statistics follow. That means that any investigation ought to begin with research questions, and only then employ statistics to look for answers to those questions. The alternative is to simply sift through the data looking for any and all correlations that might be there. Remember that a .05 level of statistical significance means a one-in-twenty chance of a false positive when there is no real relationship: If you simply try a large number of correlations, you are bound to find many that are "significant" but not necessarily meaningful. Both your expert's analysis and your mock trial should begin with research questions that then guide the use of statistics.
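To make the fishing problem concrete, here is a minimal sketch (not from the Kass paper; the variables and numbers are invented for illustration): even when every variable is pure random noise, testing 100 pairs at the .05 level will reliably "find" a handful of significant-looking correlations.

```python
import random

random.seed(1)

def noise_series(n=50):
    # A variable that is, by construction, pure random noise.
    return [random.gauss(0, 1) for _ in range(n)]

def correlation(x, y):
    # Pearson correlation coefficient, computed from scratch.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

# Test 100 pairs of unrelated variables. With n = 50 observations,
# |r| > 0.28 is roughly the two-tailed .05 significance cutoff.
false_positives = sum(
    1 for _ in range(100)
    if abs(correlation(noise_series(), noise_series())) > 0.28
)
print(false_positives)  # typically around 5 "findings" from pure noise
```

Roughly one in twenty of these noise-only comparisons clears the significance bar, which is exactly what the .05 level predicts, and why a "significant" result from an unplanned fishing expedition proves little.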
Two, Signals Always Come With Noise
Variability comes in many forms, but it is crucial to understand when it is meaningful signal and when it is noise in order to express uncertainty. It also helps to identify likely sources of systematic error.
Whenever you are dealing with research or the fruits of research, there is an important practical need to separate the reliable findings from the accidental associations: the signal from the noise. As the current political contest focuses on polling, we are reminded that any single poll can be off. That is why single-state polls were often less reliable during the primaries, and why the aggregated results become more reliable as we move into the general election. In scientific testimony, there is a difference between what is observed and what rises to the level of a conclusion. Similarly, in a mock trial, some takeaways are observed across several groups, while other observations are idiosyncratic to one person or one group.
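The polling point can be sketched with a toy simulation (the support level and poll size are invented): any single poll of 800 respondents wobbles noticeably around the true value, while an average of ten such polls wobbles far less.

```python
import random

random.seed(7)

TRUE_SUPPORT = 0.52   # hypothetical "true" level of support
POLL_SIZE = 800

def one_poll():
    # Each respondent independently supports the candidate with
    # probability TRUE_SUPPORT; the poll reports the sample share.
    hits = sum(1 for _ in range(POLL_SIZE) if random.random() < TRUE_SUPPORT)
    return hits / POLL_SIZE

# Repeat the exercise many times to measure how much results vary.
single = [one_poll() for _ in range(400)]
averaged = [sum(one_poll() for _ in range(10)) / 10 for _ in range(400)]

def spread(values):
    # Standard deviation: how far a typical result strays from the mean.
    m = sum(values) / len(values)
    return (sum((v - m) ** 2 for v in values) / len(values)) ** 0.5

print(round(spread(single), 4))    # single polls stray noticeably
print(round(spread(averaged), 4))  # averages of ten stray far less
```

That shrinking spread is the whole case for poll aggregation: averaging does not remove systematic error, but it does quiet the random noise.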
Three, Plan Ahead, Really Ahead
Asking questions at the design stage can save headaches at the analysis stage. Careful data collection also can greatly simplify analysis and make it more rigorous.
All research ought to begin with a research plan. For the testifying expert, this means having a road map in advance focusing on the planned analysis, methods, and data. Laying that out in advance, and communicating that plan during testimony, helps to avoid the perception of foregone conclusions and cherry-picking. For the mock trial researcher, planning ahead means identifying the research questions, the messages and scenarios to test, and the reporting approach in advance.
Four, Worry About Data Quality
When it comes to data analysis, "garbage in produces garbage out." The complexity of modern data collection requires many assumptions about the function of technology, often including data preprocessing technology, which can have profound effects that can easily go unnoticed.
In a mock trial, one factor that carries a profound effect that can easily go unnoticed is the method used to recruit mock jurors. When companies rely on "frequent flyers," or individuals who answer advertisements or volunteer to list themselves in databases in order to join many focus groups, the results can suffer due to the unrepresentativeness of the participants. In scientific testimony, data quality can be hard for jurors to appreciate, and that creates a need to teach some basic concepts like sample size, generalization, and random selection. Ultimately, you will want to clearly answer a fundamental comparative question for the jurors: What understandable thing did our expert do that their expert didn't?
Five, Statistical Analysis Is More Than a Set of Computations
Statistical software provides tools to assist analyses, not to define them. The scientific context is critical, and the key to principled statistical analysis is to bring analytical methods into close correspondence with scientific questions.
Statistics convey an automatic aura of respectability, and can seem meaningful even to people who don't really understand them. In the polls we're hearing about as part of the political season, for example, we hear about the margin of error, but then in the next breath we hear the anchor attributing meaning to differences that fall within that margin. When they're used as a way of just saying, "Trust me, it's science," statistics can do more harm than good. Ultimately, it all comes back to the research question, and an analysis is useful only if it meaningfully and reliably answers that question.
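As a rough guide (the poll numbers here are hypothetical), the commonly reported 95 percent margin of error for a simple random sample can be sketched like this. Note that the gap between two candidates is subject to roughly twice the margin quoted for a single candidate's number, which is why a "lead" inside the margin means little.

```python
def margin_of_error(p, n, z=1.96):
    # 95% margin of error for a proportion from a simple random sample.
    return z * (p * (1 - p) / n) ** 0.5

# Hypothetical poll: candidate A at 48%, candidate B at 45%, n = 600.
n = 600
a, b = 0.48, 0.45
moe = margin_of_error(0.5, n)  # worst-case p = 0.5 is the usual reported figure

print(round(moe * 100, 1))  # roughly a 4-point margin of error
# The difference between two candidates carries about double the margin,
# so a 3-point "lead" here is well inside the noise:
print((a - b) > 2 * moe)
```

The anchor's mistake in the example above is treating that 3-point gap as a finding, when the arithmetic says it is indistinguishable from a tie.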
Six, Keep It Simple
Simplicity trumps complexity. Large numbers of measurements, interactions among explanatory variables, nonlinear mechanisms of action, missing data, confounding, sampling biases and other factors can require an increase in model complexity. But, keep in mind that a good design, implemented well, can often allow simple methods of analysis to produce strong results.
In published scientific articles, there often seems to be a race to see how complex the work can be: how dense the model, how sophisticated the analysis, how exotic the statistical tests. These kinds of advances are helpful, but scientists cannot lose sight of the ultimate goal to communicate. This is nowhere more true than in court testimony, where the final filter on methodology and utility is likely to be a jury's understanding. The goal is simplicity. The same goes for a mock trial: Simplify a case down to its core, then get a reaction to that core. The more you introduce issues that aren't fully developed or contested, or that won't necessarily be part of the trial, the more you are introducing noise that dulls the signal.
Seven, Provide Assessments of Variability
A basic purpose of statistical analysis is to help assess uncertainty, often in the form of a standard error or confidence interval, and one of the great successes of statistical modeling and inference is that it can provide estimates of standard errors from the same data that produce estimates of the quantity of interest. When reporting results, it is essential to supply some notion of statistical uncertainty.
There is a basic principle that applies to any research result: You could have gotten a different result. Mock trial clients should remember that, and steer clear of the notion that their mock trial is reliably predictive of their actual trial results. Because the result depends so strongly on the sample, we have lately adopted the practice of including a risk assessment. For example, if 25 percent of our mock jurors ended up fitting a strongly pro-plaintiff profile, we ask, "How would the results have been different if 30 percent or 50 percent of the jurors who showed up that day had fit that profile?" The testifying expert can convey that same variability by spelling out those contingencies or by being explicit about their own standard for "reasonable certainty."
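That kind of risk assessment reduces to simple arithmetic. In this invented illustration, suppose jurors fitting the pro-plaintiff profile vote for the plaintiff 80 percent of the time and everyone else 30 percent of the time (both rates hypothetical); the expected plaintiff vote then shifts substantially as the profile's share of the panel moves.

```python
# Invented illustration: jurors fitting a "pro-plaintiff profile" vote for
# the plaintiff 80% of the time; everyone else, 30% of the time.
P_PROFILE, P_OTHER = 0.80, 0.30

def expected_plaintiff_share(profile_fraction):
    # Weighted average of the two groups' plaintiff-vote rates.
    return profile_fraction * P_PROFILE + (1 - profile_fraction) * P_OTHER

for frac in (0.25, 0.30, 0.50):
    share = expected_plaintiff_share(frac)
    print(f"{frac:.0%} profile jurors -> {share:.1%} expected plaintiff votes")
```

Moving the profile's share from 25 percent to 50 percent of the panel swings the expected plaintiff vote by more than ten points, which is exactly why a single day's sample should not be read as a prediction.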
Eight, Check Your Assumptions
Widely available statistical software makes it easy to perform analyses without careful attention to inherent assumptions, and this risks inaccurate, or even misleading, results. It is therefore important to understand the assumptions embodied in the methods and to do whatever possible to understand and assess those assumptions.
In science, when you use a particular tool, that tool carries assumptions. In olden times (and that means grad school for me), when you calculated statistics by hand, those assumptions were often clearer. Now, using sophisticated statistical software packages, you just click a box before selecting "run." As a result, users can end up not entirely sure what they're running, and may be violating any number of assumptions made by a particular tool. The lesson for experts, then, is to know what you're assuming and to make those assumptions clear. In a mock trial, researchers also make assumptions about the representativeness of the panel, and about the representativeness of the presentations (particularly the attempt to mock up the other side's case). Make it as realistic as possible, then be clear-eyed about anything that fails to match what you're trying to test.
Nine, When Possible, Replicate!
Ideally, replication is performed by an independent investigator. The scientific results that stand the test of time are those that get confirmed across a variety of different, but closely-related, situations. In many contexts, complete replication is very difficult or impossible, as in large-scale experiments such as multi-center clinical trials. In those cases, a minimum standard would be to follow Rule 10.
Any analysis carries a possibility of error. It can be measurable and can be small, but it is still there. That is why large-scale studies will often take several approaches, relying on "multi-method convergence" to show that many paths all lead to the same destination. While rules against cumulative testimony generally prevent loading up multiple experts on the same conclusion, in cases with many experts, there are often productive points of overlap where one expert can reinforce another's conclusions. Replication can also work in a mock trial. It isn't always economical to run multiple mock trials, but in some cases, it is. When you can repeat the project on different days with different samples, you enjoy a big advantage in seeing which conclusions stick and which are transitory.
Ten, Make Your Analysis Reproducible
Given the same set of data, together with a complete description of the analysis, it should be possible to reproduce the tables, figures and statistical inferences. Dramatically improve the ability to reproduce findings by being very systematic about the steps in the analysis, by sharing the data and code used to produce the results and by following accepted statistics best practices.
This last piece of advice is aimed at scholarly journals. Sometimes, to preserve space, they will publish only a bare-bones description of what was actually done, leaving unanswered questions about the stimulus, the method, and the analysis. That undercuts credibility by making it impossible for another researcher to reproduce the study. Expert testimony, even when it doesn't rest on an experiment or a survey, carries that same burden to be reproducible. How would another expert come to the same conclusion? It comes down to the same "Show your work" advice you received in school. In a mock trial, there is a similar need to be clear. How does something rise to the level of being reportable rather than idiosyncratic? Careful researchers will make sure they aren't introducing bias by reporting findings consistent with what's expected and omitting findings that aren't.
Bottom line: Research in any context needs to be conducted honestly and communicated clearly.