How and How Not to Test
The key to any good hiring (or other employment) test is the connection between success on the test and success on the job. The Uniform Guidelines on Employee Selection Procedures call this “validity” or “validation.” Under Title VII, as amended by the Civil Rights Act of 1991, a test that adversely impacts women or minorities – by a significant degree – must be “job-related and consistent with business necessity.” That phrase was handed down by the Supreme Court in Griggs v. Duke Power (1971) and was codified in the Civil Rights Act of 1991. A test will generally meet the business necessity standard if it has been properly validated under the Uniform Guidelines. The myriad details and nuances of those guidelines are not the subject here.
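One detail is still worth a brief illustration, because it frames the threshold question of whether a test has an adverse impact "by a significant degree." A common rule of thumb drawn from the Uniform Guidelines is the "four-fifths" rule: a group's selection rate that is less than 80 percent of the rate for the group with the highest rate is generally treated as evidence of adverse impact. The Python sketch below is only an illustration of that arithmetic. The class and function names and the applicant counts are hypothetical, and courts and agencies may also weigh statistical significance and sample size, which the sketch ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupResult:
    """Test outcomes for one applicant group (hypothetical structure)."""
    name: str
    applicants: int
    passed: int

    def __post_init__(self):
        if self.applicants <= 0:
            raise ValueError("each group needs at least one applicant")
        if not 0 <= self.passed <= self.applicants:
            raise ValueError("passed must be between 0 and applicants")

    @property
    def selection_rate(self) -> float:
        return self.passed / self.applicants


def impact_ratios(groups):
    """Divide each group's selection rate by the highest group's rate."""
    top_rate = max(g.selection_rate for g in groups)
    if top_rate == 0:
        raise ValueError("no group had any passing applicants")
    return {g.name: g.selection_rate / top_rate for g in groups}


def flag_adverse_impact(groups, threshold=0.8):
    """Return the groups whose impact ratio falls below four-fifths."""
    return [name for name, ratio in impact_ratios(groups).items()
            if ratio < threshold]


# Made-up numbers, for illustration only.
groups = [
    GroupResult("men", applicants=100, passed=60),
    GroupResult("women", applicants=80, passed=30),
]
ratios = impact_ratios(groups)
flagged = flag_adverse_impact(groups)
print(ratios)
print(flagged)
```

With these made-up counts, women pass at 37.5 percent against 60 percent for men, a ratio of roughly 0.63, which falls below the four-fifths mark and would put the test's validity at issue.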
Those aside, there’s the tactical (oftentimes budget-driven) question of when to validate a test: before it’s rolled out, or after it’s been used, in the event it gets challenged in a discrimination charge or lawsuit? From a legal defense standpoint, the answer is easy – validate on the front end, before the test is used to screen out actual applicants. This is true for at least two reasons: 1) if it turns out the test isn’t valid or can’t be validated, the problem will have been caught (and hopefully corrected) by the time validation is attempted again; 2) courts have generally rejected the employer’s business necessity defense where attempted validation occurred after the fact – after hiring decisions had been made and a charge or lawsuit had been filed. In these cases, judges are skeptical of purported validity evidence because it was generated after the defendant had been accused of discrimination, which makes any validity study appear patently results-oriented and highly suspect.
There is also the question of who should validate the test. Some large employers considering a test or multiple tests have the luxury of “in-house” experts: organizational psychologists or other OD professionals on staff whose job it is to develop and/or validate these selection tools. Employing such individuals may be money well spent, especially if they conduct or oversee job analyses that are the cornerstone of virtually all validation studies. Beyond that, however, the employer that can afford a full-time test developer should also spend the time and money it takes to have the developer’s work product validated by an outside professional – someone who does not depend on a single employer to pay the mortgage or feed her family. Granted, defense experts in the testing space (as in others) are typically paid, and a cynic would say they are all “hired guns.” But most outside testing experts work for multiple companies at any given time, and almost certainly over the course of a year. This is not to suggest that in-house testing professionals are uniformly biased or less informed. However, just as judges are more likely to credit validity studies conducted before litigation than those conducted during litigation, they are more likely to credit “independent” experts who work for multiple employer clients (and perhaps even for plaintiffs and their counsel), as opposed to in-house experts whose livelihood depends primarily or exclusively on the hiring employer’s good graces.
Another hallmark of a good test (and a good study or “technical report” validating that test) is a meaningful/defensible cutoff score. The Uniform Guidelines address this in detail, so cutoff scores are not discussed further here. A word of caution, however: watch out for a seemingly thorough validity report that (i) details how the test (perhaps a customized or at least highly position-specific test) will consistently pick the right candidate; but then (ii) fails to recommend and justify a particular cutoff score, much less explain how that cutoff score is better than others for the particular employer and position in question. See Easterling v. State of Connecticut Dept. of Correction, 3:08-cv-00526 (D. Conn. Aug. 24, 2013) ($3 million settling class claims of women disparately impacted by physical fitness test where corrections department had failed to prove business necessity due to lack of consistent cutoff score).
The Penultimate Question
What role should test results play in hiring decisions? (The ultimate question being whether to test or not to test.) Generalizations must be avoided when it comes to testing. That said, if a test disparately impacts women or minorities, even if it’s been validated, the employer should probably use the test only as a factor – and not the determining factor – in hiring decisions. It can be difficult for plaintiffs to meet their burden of proving that disparate impact was caused by a challenged test if the test is only one of several factors. Suppose one employer gives equal weight to test results, interview scores, and prior relevant experience. Suppose another uses a “pass or fail” interview, “passes” most candidates who interview, then administers the test to all who pass and bases final hiring decisions on test scores. It would be much easier under the second scenario for the plaintiff to prove that disparate impact was caused by the test, rather than the interview or some other part of the defendant’s hiring process.
Under the first scenario, the plaintiff might argue the defendant’s hiring process should be viewed as a whole “because the elements of respondent’s decision-making process are not capable of separation for analysis.” 42 U.S.C. § 2000e-2(k)(1)(B)(i). But that’s not correct. Just because it would be harder to prove causation under the first scenario does not mean it’s impossible. All this is a long way of saying the employer will be better able to defend against disparate impact claims if the test is (i) only a factor in hiring decisions, and (ii) at no juncture the determining factor.
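The two hypothetical hiring processes described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the candidate names, the 0 to 100 scoring scale, the interview pass mark, and the function names are all invented here, and no real employer's process is this simple. The point is to show how, in the second process, the final ranking turns entirely on the test once the interview has been passed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate; every score is on an assumed 0-100 scale."""
    name: str
    test: float
    interview: float
    experience: float


def rank_equal_weight(candidates):
    """First scenario: test, interview, and experience count equally."""
    def composite(c):
        return (c.test + c.interview + c.experience) / 3
    return [c.name for c in sorted(candidates, key=composite, reverse=True)]


def rank_test_decides(candidates, interview_pass_mark=50.0):
    """Second scenario: pass/fail interview, then rank on test score alone."""
    passed = [c for c in candidates if c.interview >= interview_pass_mark]
    return [c.name for c in sorted(passed, key=lambda c: c.test, reverse=True)]


# Made-up candidates, for illustration only.
candidates = [
    Candidate("A", test=90, interview=60, experience=40),
    Candidate("B", test=70, interview=85, experience=90),
    Candidate("C", test=80, interview=70, experience=75),
]

equal_weight_order = rank_equal_weight(candidates)
test_decides_order = rank_test_decides(candidates)
print(equal_weight_order)
print(test_decides_order)
```

In this made-up pool, every candidate clears the interview, so the second process simply reproduces the test ranking, while the first process reverses it because interview and experience carry equal weight. That is why a plaintiff would find it easier to trace a disparity to the test under the second process.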
A few words about dealing with test vendors. First, get beyond the sales reps. Many of these folks are unfamiliar with the Uniform Guidelines, much less with case authority applying those guidelines and the underlying statutes. Purchasing employers who ask enough questions, or who enlist a lawyer to ask them, may get referred to the vendor’s VP of Sales and Marketing or the like. S/he will have heard of the Uniform Guidelines, and will (probably truthfully) be able to say the desired test has been validated in accordance with those guidelines. Although true, that statement may be irrelevant at best and potentially misleading. Unless a test has been validated for a particular employer’s specific job, the test is not valid as to that job, i.e., it isn’t valid for the hiring employer’s purposes. Any reputable vendor has industrial or organizational psychologists on staff who know what validity means, whether the vendor’s instruments have been properly validated for other employers, and what it would take to validate the test at issue for the purchasing employer. Ask to speak with the organizational psychologist on staff.
That a test has been validated for other companies, perhaps in other industries, for use in filling positions that may or may not resemble the jobs in question, typically means nothing. The exception is when validity results for Employer A’s job or jobs are “transportable” to Employer B’s job or jobs. Even then, however, there must be a proper underlying study as to Employer A’s jobs, and a formal job analysis of Employer B’s jobs to determine if they are enough like A’s to make the A results transferable. In short, buyer beware.
That makes the representations and warranties language in any vendor agreement important. A purchasing employer typically wants (and needs) assurances that the test it’s buying is valid as to the jobs for which it will be used. A knowledgeable vendor similarly knows it takes a proper validity study in order to represent and warrant that its test has been validated in relation to the purchasing employer’s jobs in question. That may incent the vendor to pitch a validation study at additional cost. The employer should welcome that offer as a sign the vendor knows what it’s doing. The employer may then negotiate hard, seek a volume discount, or inquire about transportability. But it shouldn’t balk because the vendor wants to sell not only the test but a validation study too. As discussed below, an off-the-shelf test without applicable validity evidence may well be worse – from an employment risk management perspective – than no test at all.
Many “how-not-to’s” are the flip side of the “how to’s”:
- Do not buy a test off the shelf and assume it will be valid because it purports to be designed for/tailored to the positions you are looking to fill.
- Do not rely on generalized claims of “validity” or “validation.”
- Do not assume that validation studies for another employer’s jobs will be transportable to yours — even if the job titles are the same or virtually identical.
Call center jobs illustrate this last point. Some of those jobs entail receiving inbound calls, answering customer questions, and providing particular data upon request. Other call center jobs are essentially sales jobs: they involve outbound calling, they require persistence and resilience; they could involve a scripted sales pitch, but they could also involve unscripted dialogue and require people with an outgoing personality and a certain “marketing flair.” In other words, don’t assume that seemingly similar jobs are alike.
Is a bad test really worse than no test? The lawyerly (and correct) answer is, “it depends.” It depends on just how bad the test is: for example, to what extent does it disparately impact women or minorities? It depends on how the test is used: for example, is it one of many factors used in a “holistic” assessment of candidates? Or is it a “knock-out” – an identifiable hurdle that will be apparent to candidates as such? Lastly, it depends on the alternatives. If the alternative is a wide-ranging criminal background check (risky for use with most jobs and patently illegal in some jurisdictions), then an unvalidated test might be better – both for hiring purposes and in terms of legal risk. But that’s a Hobson’s choice. Informed employers have better choices, including tailor-made or customized tests validated by outside professionals before challenge in litigation. That approach may be somewhat costly, but then again so are class action lawsuits, EEOC pattern-or-practice cases, multiple discrimination charges, turnover and recruitment expenses, and the (probably) unquantifiable but nonetheless substantial cost of an under-performing workforce.