On November 26, 2012, as mandated by the American Recovery and Reinvestment Act of 2009, the U.S. Department of Health and Human Services’ Office for Civil Rights (OCR) issued guidance on the de-identification of protected health information (PHI) in accordance with the HIPAA Privacy Rule (Privacy Rule). This tool, which is set forth in a navigable Q&A format, reflects contributions from stakeholders with practical, technical, and policy experience at a March 2010 workshop in Washington, D.C. The guidance does not change the Privacy Rule, but rather clarifies the existing regulation as it pertains to de-identification of PHI. It specifically addresses the two acceptable de-identification methodologies under the Privacy Rule: (1) Expert Determination and (2) Safe Harbor. De-identified information does not constitute PHI and is therefore not governed by the Privacy Rule. Guidance on the implementation specifications of these two methodologies thus assists covered entities in creating information that may be freely used for analysis and research without violating the Privacy Rule.

  1. Expert Determination

The Expert Determination method requires a formal determination by a qualified expert who applies statistical or scientific principles and concludes that the risk of identification is “very small.” The term “very small” is not defined numerically; rather, it is assessed using factors such as replicability, data source availability, distinguishability, and access risk. Although the guidance does not establish a bright-line standard for acceptable risk, it is clear that risk mitigation is necessary. Such mitigation may draw on several broad classes of techniques, including suppression, generalization, and perturbation. While the expert is not required to use any specific process under this methodology, the expert’s process must be documented and made available to OCR upon request.
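The three classes of mitigation named above can be illustrated with a minimal Python sketch. The record layout, field names, bin width, and noise bound are hypothetical choices made for illustration, not values taken from the guidance. Applying these transformations does not by itself establish that the remaining risk is very small; that conclusion remains the expert's.

```python
import random

def suppress(record, fields):
    """Suppression: drop the named fields from the record entirely."""
    return {k: v for k, v in record.items() if k not in fields}

def generalize_age(age, width=10):
    """Generalization: replace an exact age with a range such as '40-49'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def perturb(value, max_shift, rng):
    """Perturbation: shift a numeric value by bounded random noise."""
    return value + rng.randint(-max_shift, max_shift)

# Hypothetical record; none of these field names come from the guidance.
record = {"name": "Jane Doe", "age": 47, "systolic_bp": 128}
rng = random.Random(0)  # seeded only so the sketch is repeatable

mitigated = suppress(record, {"name"})
mitigated["age"] = generalize_age(mitigated["age"])
mitigated["systolic_bp"] = perturb(mitigated["systolic_bp"], 3, rng)
```

Which technique suits which field is a judgment call: suppression removes the most information, while generalization and perturbation keep some analytic value at the cost of precision.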

The guidance also notably provides the following clarifications:

  • A “qualified expert” is not required to be specially certified or have a specific professional degree. Rather, an expert requires relevant expertise, which may be gained through professional experience and/or other routes of education or training.
  • Expiration dates do not need to be attached to an expert determination; however, experts must determine whether a time-limited certification is appropriate for a de-identified data set. When such a time limit expires, an expert must examine whether future releases of the data should be subject to additional de-identification processes in order to meet the very small risk requirement under current conditions.
  • Risk of identification may be assessed by estimating the likelihood that data is unique, or “linkable” to only one person, and thus identifies that person.
    • For example, an expert must evaluate the likelihood that two separate data sources can be “linked” to identify an individual. In conducting this evaluation, different risk features should be considered. Patient demographic data, for instance, is a high-risk feature because it is found in many places and is publicly available. Clinical features such as a patient’s blood pressure data, on the other hand, are lower-risk features because such data is accessible to a much smaller set of people.
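One simple way to picture the uniqueness idea is to count how many records carry a combination of demographic values that no other record shares. The sketch below is only an illustration under stated assumptions: the function name `unique_fraction`, the field names, and the sample records are hypothetical, and the guidance does not prescribe this or any other specific metric.

```python
from collections import Counter

def unique_fraction(records, quasi_identifiers):
    """Return the share of records whose combination of the given
    fields appears exactly once in the data set."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(keys) if keys else 0.0

# Hypothetical sample data, invented for this illustration.
records = [
    {"zip3": "021", "birth_year": 1950, "sex": "F"},
    {"zip3": "021", "birth_year": 1950, "sex": "F"},
    {"zip3": "021", "birth_year": 1962, "sex": "M"},
    {"zip3": "945", "birth_year": 1987, "sex": "F"},
]

all_fields = unique_fraction(records, ["zip3", "birth_year", "sex"])
zip_only = unique_fraction(records, ["zip3"])
```

In this sample, two of the four records are unique on all three fields, while only one is unique on `zip3` alone, which shows how adding widely available demographic fields raises the share of records that could be linked to a single person.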

  2. Safe Harbor

In contrast to the statistical basis of Expert Determination, the Safe Harbor methodology requires the removal of 18 specified identifiers of the individual or of relatives, employers, or household members of the individual. Safe Harbor also requires that covered entities have no actual knowledge that the remaining information could identify the subject of the information. The guidance states that “actual knowledge” means clear and direct knowledge that the remaining information could be used, alone or in combination with other information, to identify the subject of the information. The guidance provides examples where “actual knowledge” may exist, such as the inclusion of a well-publicized clinical event, a revealing occupation, a clear familial relationship, or the recipient’s knowledge of the de-identification algorithm. The inclusion of any other unique characteristic may also prevent the information from being considered de-identified. The guidance further clarifies:

  • Data in standardized fields, as well as information in free text unstructured fields, must be properly de-identified.
    • For example, when de-identifying information in an electronic medical record, identifiers must be redacted in fields that include open-field notations, such as progress notes.
  • The inclusion of initials or the last four digits of an individual’s Social Security number fails to properly de-identify PHI.
  • The age of an individual over 89 years of age must be listed as “90 or above.”
  • All unique identifying numbers, codes and characteristics must be removed to meet the Safe Harbor standard, except for a re-identification code meeting certain criteria specified in the Privacy Rule (i.e., the re-identification code is not derived from or related to information about the individual (e.g., does not include an individual’s initials or last four digits of Social Security Number), and the covered entity does not use or disclose the code for any purpose other than enabling the covered entity to re-identify the de-identified information, nor does the covered entity disclose the mechanism for re-identification).
    • Examples of unique identifying characteristics, numbers and codes provided in OCR’s guidance include characteristics such as a recognizable title (e.g., the “current President of State University”), a unique identifying number such as a clinical trial record number, and a unique identifying code such as a bar code unique to a patient.
  • The Privacy Rule does not expressly require the suppression of physician names from de-identified information, as it does for patient names; however, covered entities must consider the “actual knowledge” standard where such information exists within a data set.
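A few of the points above (removing identifier fields, listing ages over 89 as “90 or above,” and assigning a re-identification code that is not derived from the individual) can be sketched in Python. This is a partial illustration, not a complete Safe Harbor implementation: the field set covers only a hypothetical handful of the 18 identifiers, the record layout and function names are assumptions, and free-text fields such as progress notes are not handled at all.

```python
import secrets

# Hypothetical subset of identifier fields; NOT the full list of 18.
IDENTIFIER_FIELDS = {"name", "ssn", "email", "phone", "mrn"}

def top_code_age(age):
    """List any age over 89 as '90 or above'; leave other ages unchanged."""
    return "90 or above" if age > 89 else age

def safe_harbor_sketch(record, lookup):
    """Drop the listed identifier fields, top-code age, and attach a
    random re-identification code that is not derived from the record."""
    cleaned = {k: v for k, v in record.items() if k not in IDENTIFIER_FIELDS}
    if "age" in cleaned:
        cleaned["age"] = top_code_age(cleaned["age"])
    code = secrets.token_hex(16)  # random; unrelated to initials, SSN, etc.
    lookup[code] = record         # mapping stays with the covered entity
    cleaned["reid_code"] = code
    return cleaned

# Hypothetical record, invented for this illustration.
private_lookup = {}
patient = {"name": "John Roe", "ssn": "000-00-0000", "age": 93, "diagnosis": "J45"}
shared = safe_harbor_sketch(patient, private_lookup)
```

The code is generated with `secrets.token_hex` rather than hashed from patient data so that it carries no information about the individual; the `private_lookup` mapping stands in for the re-identification mechanism that the covered entity would keep to itself.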

Under either methodology, covered entities are not required to use data use agreements when sharing de-identified information; however, covered entities may nevertheless require third parties to enter into data use agreements for specific purposes, such as to prohibit the use of information for re-identification. The guidance is clear that merely entering into data use agreements will not be sufficient to meet the requirements under either methodology.

OCR has stated that it intends for this guidance to assist covered entities in understanding “de-identification, the general process by which de-identified information is created, and the options available for performing de-identification.” Although the guidance does offer some clarification, it also leaves several elements open to further interpretation. Possibly in acknowledgment of this lack of completeness, OCR continues to welcome comments and suggestions to improve this tool. The guidance, in its entirety, can be found at: coveredentities/De-identification/guidance.html.