This article was published in the Environmental, Product Liability, and Public Policy sections of Law360 on January 20, 2016. © Copyright 2016, Portfolio Media, Inc., publisher of Law360.

Selecting a toxicity value for use in setting regulatory limits raises questions “on the frontiers of science,” which require a delicate balancing of the science with “policy judgments.” Such calculations can well “lead to risk estimates that, although plausible, are believed to be more likely to overestimate than to underestimate the risk to human health and the environment.”1 Increasingly, such regulatory risk decisions may involve unarticulated, nontransparent agency practices and preferences. Because all the draft bills reforming the Toxic Substances Control Act (TSCA) contemplate increasing the number of substances whose risk will be assessed or reassessed, and other U.S. Environmental Protection Agency and state regulatory programs commonly utilize EPA reference doses (RfDs) and reference concentrations (RfCs), it is timely to reflect on what portion of these regulatory decisions should be science-based, what portion should be policy-based and whether the nontransparent application of agency lore is appropriate at all. This article uses EPA’s development of an RfC for the noncancer effects from exposure to trichloroethylene (TCE) in air as a case in point.

Background

In 2011, EPA’s Toxicological Review of TCE concluded that “an overall review of the weight of evidence in humans and experimental animals is suggestive of the potential for developmental toxicity with TCE exposure” (the noncancer classification determination) and selected 2 µg/m3 as the residential inhalation RfC (the noncancer risk quantification decision). The RfC level selected was based primarily (but not exclusively) on a 2003 drinking water ingestion study allegedly showing a statistically significant association between exposure and heart malformations in rat fetuses (the Johnson study). Specifically, the Johnson study was used by EPA to quantify the RfC and justify the use of an average concentration over 24 hours to trigger regulatory action. There is little dispute that TCE exposures should be classified as having noncancer effects at some levels, but there is significant scientific debate concerning the concentration and the time period over which the concentration should be averaged for short-term exposure.

Unless specified otherwise, RfCs are derived for “chronic” exposures (up to a lifetime), but they may also be derived for acute (less than or equal to 24 hours), short-term (greater than 24 hours, up to 30 days) and subchronic (greater than 30 days, up to 10 percent of a lifetime) exposures. The 2011 TCE RfC was “derived for chronic exposure duration,” but “chronic” was not defined. However, EPA’s Regions 3 and 9, some states and, most recently, EPA’s Office of Toxic Substance have used a “novel” 24-hour averaging period based on a “presumption that a single exposure of a chemical at a crucial window of fetal development, as in the case of cardiac development,” may produce adverse developmental effects (see, for example, the 2014 Toxic Substances Control Act Work Plan for TCE: Degreasing, Spot Cleaning and Arts & Crafts Uses, citing one sentence in a 1991 EPA Developmental Guideline). The leap from the traditional 30-year exposure period utilized previously for TCE and most other chemicals significantly increases the frequency and stringency of the TCE remediation trigger for hazardous waste cleanups. It has also resulted in this novel approach being misunderstood by some regulators, many environmentalists and most members of the plaintiffs’ bar (in remediation or product liability litigation) as meaning injury is likely to occur if residents are exposed to more than 2 µg/m3 for 24 hours (or if workers are exposed to more than 8 µg/m3 for 24 hours). EPA’s novel theory has already affected some companies and federal agencies and is likely to have increasing economic impacts.

The primary focus of this article is the unusual level of scientific uncertainty surrounding the TCE RfC level and the averaging time for exposure to TCE in air and how to address this changing set of regulatory assumptions.

Reliance on the Johnson Study to Derive the RfC 

EPA’s reliance on the Johnson study raises significant scientific questions for the following reasons:

  • Two of the four exposure levels reported in the Johnson “study” are data from a previously published study from the same laboratory conducted at a different time. The original study found no statistically significant increase in heart defects in rat pups. Generally, combining data from different studies can be a valuable method for summarizing evidence, but many respected toxicologists caution against combining such data where two sets of controls lead to substantially different results.
  • The study’s quality problems and its failure to follow generally accepted laboratory methods have been documented (in most cases, by the EPA itself), e.g.:
    • The research was conducted over a six-year period and combined control data was used for comparison to treated groups.
    • The precise dates that each individual control animal was in study, even the year in which some of the data was taken, are unknown.
    • There are possible imprecisions of exposure characterization.
    • The biological mechanism explaining how TCE might cause fetal heart defects was not in the 2011 EPA Toxicological Review, has not been peer reviewed and disagrees with some of the prior independent expert reviews of modes of action for TCE and similar chemicals.
    • The dose-response curve is atypical (i.e., there are effects at the lowest and highest exposure levels, but not the intermediate exposure levels).
    • The study disagrees with five other published studies reporting the results of oral administration of TCE to rodents during fetal development.
    • No researcher (including one of the lead authors of the Johnson study) has replicated the results of this drinking water rat study.
    • None of the five separate inhalation studies of TCE exposure to rats reported cardiac effects in fetuses.  
  • Many regulators and independent experts decided not to rely on the Johnson fetal heart malformation study, including:
    • EPA’s past evaluations of TCE and EPA’s evaluations of 1,1-dichloroethylene and trichloroacetic acid, which evaluated and rejected the Johnson study. Indeed, seven of the 11 EPA toxicologists who updated EPA’s Toxicological Review of TCE in 2014 rated their confidence in the Johnson study as low (four more rated their confidence low to medium, and one EPA toxicologist dissented from the findings).
    • A 2009 National Academy of Sciences (NAS) study of TCE.
    • A California regulatory review.
    • 2013 and 2014 Agency for Toxic Substances and Disease Registry (ATSDR) studies.
    • Even the 2006 NAS committee that generally agreed with the EPA’s use of the Johnson study recommended that a laboratory replicate the Johnson study results. This replication has not occurred, and EPA has repeatedly rejected offers by industry groups to perform a replication of the study that would presumably address the scientific concerns raised about this study.

Similarly, EPA’s use of an average TCE air concentration measured over a 24-hour period is even more questionable because:

  • This time period is based not on data, but rather on a long-extant, but rarely utilized, sentence in a 1991 guidance document and a relatively unusual study that involved long-term exposure.
  • As far as the author is aware, neither the EPA nor any other governmental authority has previously applied the 1991 EPA Developmental Guidance in this manner to other chemicals. This 1991 guideline, in fact, is archived on EPA’s website.
  • The 1991 guidance is ambiguous because it uses the undefined term “single exposure” (not 24 hours).
  • More generally, 24-hour reference doses are not utilized as regulatory action levels. Neither the EPA nor an independent expert review group has defined or proposed that a single exposure is 24 hours or any particular time period for use in determining when the TCE indoor air levels trigger action.
  • No NAS committee or other expert group recommended this approach.
  • The ATSDR has noted that there is no scientific support for the use of a 24-hour period.
  • The RfC resulting from the use of the Johnson study is inconsistent with EPA’s existing acute exposure guideline of 5,000 µg/m3 and existing experience with worker exposures well in excess of 2 µg/m3.

In summary, the TCE RfC and its application by regulators, and potentially through its misapplication in litigation, are likely to have significant impacts. However, the degree of scientific uncertainty with the EPA’s determinations is unusually great, as seen just by examining prior EPA reviews, reviews by other agencies and peer reviews by independent experts.

The Gordian Knot: When Does the Scientific Uncertainty Exceed What Is Allowable?

The number of contradictions concerning the setting of the TCE RfC based on developmental toxicity suggests, at a minimum, caution, if not skepticism. Thus, the EPA’s decisions concerning TCE raise legal and policy issues that warrant careful consideration and debate. One issue is whether EPA’s TCE decision exhibits a degree of scientific uncertainty exceeding that which is allowable based on general principles of administrative law and historic EPA practice. Further, what tools can EPA (or, if necessary, a court) use to determine when the scientific uncertainty is “too great”?

Historically, courts have long held that agencies may “err” on the side of overprotection2 and are not necessarily required to support their findings with anything approaching scientific certainty.3 However, judicial deference to the EPA’s judgment has never been unbounded. The TCE RfC bears all of the indicia necessitating a judicial hard look into the EPA’s decision-making.

Courts have refused to defer to an agency’s decision (a) when the agency’s interpretation conflicts with a prior interpretation, (b) when there is reason to suspect that the agency’s interpretation “does not reflect the agency’s fair and considered judgment on the matter in question” or (c) when the agency’s interpretation is “plainly erroneous or inconsistent with the regulation.”4 Additionally, courts have refused to uphold agency decisions when they ignore the findings and recommendations of independent, expert bodies.5 Each of these factors applies in this case.

The EPA, in its 2011 TCE Toxicological Review, changed course concerning how much weight to give to the Johnson study. Similarly, the EPA has changed its view concerning the method by which TCE might cause fetal heart deformation. In fact, the nearly simultaneously released EPA toxicology reviews of trichloroacetic acid and TCE (which reviewed the Johnson study) reached opposite conclusions concerning this study. In essence, it appears that there is a disagreement among EPA scientists on how to weigh the evidence (without explicit acknowledgement or explanation of the change or disagreement). Thus, these decisions should be subject to heightened judicial scrutiny.

Certainly, the findings and recommendations concerning the Johnson study by the 2009 NAS committee, the state of California, the 2006 NAS committee and the ATSDR undercut the normal justification for deference to the EPA.

The decision-making in this matter should be viewed in light of the 2011 finding of the NAS formaldehyde committee (whose members collectively had peer reviewed many EPA toxicological reviews) that there have been persistent “problems encountered with ... [the EPA’s risk] assessments over the years” that have been “identified by multiple groups.” In fact, sometimes EPA “conclusions appear to be based on a subjective view of the overall data and the absence of a causal framework.” The fact that many scientists at EPA and at other state and federal agencies reached different conclusions, together with this overall pattern, suggests that the EPA decision-making on TCE may not reflect a fair and considered scientific judgment of many EPA scientists.

Even if the EPA’s substantive TCE decision were not subject to probing judicial scrutiny, the EPA may not ignore a study’s weaknesses, focus exclusively (or primarily) on positive findings, decide that a lack of mechanism or dose response can be ignored, or, in effect, adopt a zero-risk goal, particularly if the decision appears to be driven by an unofficial practice neither publicly vetted nor grounded in the statute being administered.

The weight-of-the-evidence analysis for TCE is at best poorly and inconsistently articulated and focused only on statistical associations (not causation) and it fails to address the highly unusual and persuasive number of non-EPA scientific judgments to the contrary. As the Reference Manual on Scientific Evidence notes, an “association is not causation” and the precautionary calculation of risk for regulatory purposes is not the same as personal injury. The extensive literature on causation requires systematic identification of relevant evidence, criteria for evaluating the strength of evidence and language for describing the strength of evidence of causation. EPA official guidelines also acknowledge that determining whether an observed association (risk) is causal rather than spurious involves consideration of a number of factors (citing strength, experiment, consistency, plausibility and coherence). The EPA’s toxicological review of TCE fails to articulate a clear rationale for its decisions in weighing the evidence.

Potential Solutions 

The TCE RfC presents an unusual degree of scientific uncertainty, but these problems extend beyond TCE.

First, the EPA should replicate the Johnson study with the modification that the replicate study should follow good laboratory practice guidelines. It is troubling that EPA has shied away from such a reasonable proposal.

Second, more generally, the key issues in the TCE noncancer risk assessment should be reviewed by an NAS committee.

Third, EPA should initiate a transparent process to develop science-based guidance concerning the period of time over which indoor air levels should be averaged to determine if the RfC is exceeded. The shortcomings individually and cumulatively suggest, at a minimum, the need for caution before relying on the Johnson study to make novel toxicological decisions because of this higher-than-normal degree of scientific uncertainty.

Fourth, more generally, there is a need for regulators to articulate which conclusions are based on science and which are based primarily on science policy, particularly when the scientific weight-of-the-evidence analysis may conclude that there is insufficient evidence of causation. There should be no place in regulatory risk decision-making for reliance on agency lore. The decision-making criteria and preferences used in judging the quality of studies and in weighing epidemiological data versus animal data versus mechanistic data must be clearer. In particular, there should be criteria for determining when suggestive data is simply not sufficient to rely upon in setting numerical limits for noncancer effects such as fetal deformation, i.e., when making regulatory decisions based on limited data of poor quality is more detrimental than relying upon assumptions in the absence of that data.

The starting points for this reform should be the recommendations from the 2011 formaldehyde NAS report and the EPA’s response to it. But the process still needs more reform. When quality or novel policy issues arise in the course of a review, the EPA should not wait (often years) to seek expert input, particularly where, as here, other EPA personnel performing similar reviews and governmental bodies’ or independent experts’ reports raise quality or interpretative issues with the same or similar data. Science should not be a game of hide and seek.

Fifth, when novel policy issues are identified, early public and expert review should be sought in parallel with the completion of the review. In the appropriate situations, expedited studies should be performed to promote scientific certainty on key issues, particularly where industry is willing to fund a joint government-industry study.

In the case of TCE, an independent group should be asked to advise the EPA concerning health-protective, yet workable, policy options that may be utilized as the scientific certainty in risk assessment decision-making varies along the spectrum from no reasonable evidence to virtual certainty. It is simply misleading to group chemicals that may have numerous epidemiological studies demonstrating substantial increases in relative risks with chemicals (such as TCE) where the evidence is, by any measure, weak and uncertain. More importantly, grouping chemicals whose toxicity has significantly different degrees of scientific support is likely to misallocate finite regulatory and societal resources.

For example, the level of evidence might trigger more intensive research to fill data gaps on an expeditious timeframe (as recommended nine years ago by the 2006 NAS committee). Similarly, rather than propose a single concentration for the RfC, the extraordinary level of uncertainty may justify the use of a range of potential RfCs (as has been the practice in regulating carcinogens).

Sixth, the EPA peer-review process would benefit from a hard look designed to increase the selection of well-balanced peer reviewers who have actual experience in the key technical issues and the involvement of all stakeholders (governmental, environmental groups and industry) earlier in the process (including providing information and queries to the EPA and peer reviewers).

Each of these reforms could be developed after stakeholder involvement. However, to date, EPA has evinced little inclination to take the initiative, and thus it may be incumbent on independent expert bodies to lead the way.