In Bruce Smith, et al. v. City of Boston, Case No. 12-CV-10291 (D. Mass. Nov. 16, 2015), Judge Young of the U.S. District Court for the District of Massachusetts held that the City of Boston Police Department’s (the “Department”) lieutenant-selection process — ranking candidates for promotion based on their scores on a facially neutral exam administered in 2008 (“2008 exam”) — had a racially disparate impact and was not sufficiently job-related to survive scrutiny under Title VII of the Civil Rights Act of 1964.

Judge Young’s 82-page ruling is a case study in how employment tests can be vulnerable to legal challenge and, in that respect, is required reading for corporate counsel and HR professionals facing legal issues relating to promotional practices.

The Court’s ruling on liability represented the first of two phases in the trial. If the parties do not now agree on an appropriate remedy, they will proceed to phase two on damages. Before the trial, the Court had denied the Plaintiffs’ motion for class certification without prejudice, and the Plaintiffs will likely seek class certification again following this decision on liability.

Background To The Case

Plaintiffs, ten African-American police sergeants, brought suit under Title VII, alleging that the multiple-choice examinations the Department administered in 2005 and 2008 to select which sergeants to promote to the rank of lieutenant had a racially disparate impact on minority candidates and were not sufficiently job-related to pass muster under the law. Plaintiffs also asserted a pendent claim under the Massachusetts anti-discrimination statute, Chapter 151B (“Chapter 151B”). The Department disputed that the exams had a disparate impact on minority candidates and claimed that, even if they did, the exams were sufficiently job-related to survive a Title VII challenge.

The Court’s Ruling

As a threshold matter, the Court emphasized that Plaintiffs had not brought a disparate treatment case about conscious racial prejudice, but rather a case alleging that the seemingly benign multiple-choice examination process, while facially neutral, was slanted in favor of white candidates. Id. at 2.

After a ten-day bench trial replete with “dense” statistical analyses on both sides, the Court held that Plaintiffs had met their prima facie burden of showing disparate impact. In so doing, the Court held that promotion rates were not the only appropriate measure of disparate impact; it also considered exam passing rates, average exam scores, and delays in promotion. The Court found that “even in the absence of an adverse impact on promotion rates, an exam can lead to liability for the employer if it functions as a ‘gateway that has a disparate impact on minority hiring.’” Id.

The Court also directly addressed whether statisticians should apply “one-tailed” or “two-tailed” analyses in reviewing disparate impact claims. Setting aside the technical differences between the two tests, the Court opined that it is easier for plaintiffs to prove disparate impact using a one-tailed test. The Court noted that until now “the First Circuit has overtly avoided choosing between the two tests.” Id. at 43. After a lengthy discussion of why the one-tailed test might be appropriate given the Department’s history of discrimination against minority police officers, the Court nonetheless concluded that the weight of the case law favored using a two-tailed statistical test when reviewing disparate impact claims involving facially neutral employment selection procedures. Ultimately, the Court held that the choice between the one-tailed and two-tailed methodology for evaluating statistical significance was not dispositive in this case because the Plaintiffs had met their burden of demonstrating disparate impact on the promotion rates of minorities.
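For readers who want to see why a one-tailed test is the easier standard for a plaintiff, the following is a minimal sketch in Python. It is not the analysis the experts or the Court used; it assumes a simple pooled two-proportion z-test with a normal approximation, and every count in it is hypothetical and chosen only for illustration.

```python
import math

def two_proportion_z(pass_a, n_a, pass_b, n_b):
    """z statistic for the difference in pass rates (group A minus group B),
    using a pooled standard error. An illustrative choice of test."""
    rate_a = pass_a / n_a
    rate_b = pass_b / n_b
    pooled = (pass_a + pass_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (rate_a - rate_b) / std_err

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def p_values(z):
    """Return (one_tailed, two_tailed) p-values.
    One-tailed asks only whether group A's rate is LOWER than group B's.
    Two-tailed asks whether the rates differ in EITHER direction."""
    one_tailed = normal_cdf(z)
    two_tailed = 2 * (1 - normal_cdf(abs(z)))
    return one_tailed, two_tailed

# Hypothetical counts, not figures from the case:
# 20 of 50 minority candidates pass; 90 of 150 other candidates pass.
z = two_proportion_z(20, 50, 90, 150)
one_tailed, two_tailed = p_values(z)
print(f"z = {z:.2f}, one-tailed p = {one_tailed:.4f}, two-tailed p = {two_tailed:.4f}")
```

When the observed difference runs in the direction the plaintiff alleges, the one-tailed p-value is half the two-tailed p-value, so a result can clear a given significance threshold under the one-tailed test and fail it under the two-tailed test.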

Once the Court concluded that Plaintiffs had met their prima facie burden of showing that the test had a disparate impact on minorities, the burden shifted to the Department to prove that its exam was job-related and consistent with business necessity. On this issue, the Court found that the job analyses upon which the test was based were sufficiently current and thorough for the Department to build a valid test. Nevertheless, the Court agreed with Plaintiffs’ expert that those who performed better on the test would not necessarily perform better on the job, and found that the Department’s promotional test was not a fair way to rank candidates.

According to the Department, the questions on the 2008 exam were designed to test knowledge, skills, and abilities (“KSAs”) that would align with the actual work of a lieutenant. The Court disagreed, finding that the exam failed to measure critical skills and abilities, including interpersonal skills, reasoning skills, and the ability to quickly make sound decisions, and that the Department therefore could not demonstrate the exam’s validity. The Court also took issue with the fact that the City had not performed an item analysis, reliability analysis, and disparate impact analysis to ensure the quality of the test scores before administering the test. As a result, the Court found that the 2008 promotional test at issue was not sufficiently reliable and that the Department should not be permitted to use it to rank candidates for promotion.

Finally, the Court held that even if the 2008 exam were a valid screening tool, the Department had failed to meet its burden of showing that the 2008 exam was job related for the position in question and consistent with business necessity. Having made that finding, the Court ruled for the Plaintiffs on liability.

Implications For Employers

When one cuts through all of the detailed and sophisticated statistical analysis in the Court’s 82-page decision, the lesson employers can glean from this ruling is that they should conduct item analysis, reliability analysis, and disparate impact analysis to ensure the quality of their tests before administering them. Employers also should ensure that any test is “job related for the position in question and consistent with business necessity” and includes questions that address all of the skills and abilities necessary for performing the ultimate job. In the end, an employer is better positioned to defeat disparate impact challenges to promotional testing if it can show that the test is sufficiently based on a valid job analysis before it is used as a ranking mechanism.
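As a rough illustration of what the three pre-administration checks might look like in practice, here is a minimal Python sketch. The decision does not prescribe particular statistics, so the choices below are assumptions: proportion correct per question for item analysis, Cronbach's alpha for reliability, and a simple ratio of selection rates for disparate impact. The response data and counts are hypothetical, and a real validation study would be designed by a qualified testing professional.

```python
import statistics

# Hypothetical pilot data: one row per candidate, one column per question
# (1 = correct, 0 = incorrect). Not data from the case.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

def item_difficulty(rows):
    """Item analysis (one simple form): proportion answering each item correctly."""
    return [statistics.mean(column) for column in zip(*rows)]

def cronbach_alpha(rows):
    """Reliability analysis (one common statistic): Cronbach's alpha,
    computed from sample variances of each item and of the total scores."""
    k = len(rows[0])
    item_variances = [statistics.variance(column) for column in zip(*rows)]
    total_variance = statistics.variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

def selection_rate_ratio(selected_a, n_a, selected_b, n_b):
    """Disparate impact analysis (one simple screen): group A's selection
    rate divided by group B's selection rate."""
    return (selected_a / n_a) / (selected_b / n_b)

difficulty = item_difficulty(responses)
alpha = cronbach_alpha(responses)
# Hypothetical counts: 20 of 50 in group A pass; 90 of 150 in group B pass.
ratio = selection_rate_ratio(20, 50, 90, 150)

print("Proportion correct per item:", difficulty)
print(f"Cronbach's alpha: {alpha:.2f}")
print(f"Selection rate ratio: {ratio:.2f}")
```

None of these figures establishes that a test is valid on its own; as the decision illustrates, a test built on a sound job analysis can still fail if it does not measure the skills the job actually requires.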