Protecting Privacy When Advancing Medical Breakthroughs
By Sam Wehbe, Marketing Director, Privacy Analytics
Important medical breakthroughs increasingly rely on sophisticated analyses from multiple large databases. Data might reside in electronic medical record (EMR) systems and databases controlled by healthcare providers, hospital systems, government agencies, insurance companies, pharmacy chains, and many other entities.
There is incredible value in linking data from different sources to allow scientific investigators to gain insight and find answers. When Big Data is accessible for medical research and analyses, exciting things can happen. Unfortunately, oceans of data sit in data warehouses, inaccessible and unexamined. There are two main reasons for this. First, many data custodians are concerned about violating federal privacy laws, so they avoid all risk by simply not sharing any data. Second, data custodians who are willing to share data often apply rudimentary approaches that remove or redact so much information that the data is rendered virtually useless to potential researchers.
There is no compromising data privacy. The risks associated with re-identifying patients are serious enough to persuade custodians to hold their data close. However, there are low-risk methods that can de-identify patients’ data without stripping away its value to researchers. These methods address the fields that hold the most analytic value, the indirect identifiers. Indirect identifiers are fields within the data that, while not immediately identifying on their own, can re-identify a patient when used in combination. Examples include gender, date of birth or age, geographic data, language, ethnic origin, high-level diagnoses, and medical procedures.
Legally, HIPAA specifies two standards for the de-identification of health information. The first, Safe Harbor, lists 18 data elements that must be removed or altered to make patients anonymous. This rules-based approach is easy to implement but limits how the data can be used. For instance, Safe Harbor requires that dates be reduced to the year. When looking at the effectiveness of treatments and the progression of diseases, this change can render the data nearly meaningless.
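The effect of reducing dates to the year can be seen in a short Python sketch. The dates and the helper name below are hypothetical, chosen only to illustrate the loss of detail; this is not a full Safe Harbor implementation, which covers many other data elements.

```python
from datetime import date


def to_year(d: date) -> int:
    """Reduce a full date to its year, as a year-only rule would."""
    return d.year


# Hypothetical event dates for one patient (illustrative assumptions).
diagnosis = date(2014, 2, 13)
follow_up = date(2014, 11, 5)

# Interval a researcher could compute from the full dates.
true_gap_days = (follow_up - diagnosis).days

# Interval that remains once only the year is kept.
year_gap = to_year(follow_up) - to_year(diagnosis)

print("Gap from full dates (days):", true_gap_days)
print("Gap from years only:", year_gap)
```

Both events fall in the same year, so the year-only view shows no elapsed time at all, even though the full dates are many months apart.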
Leveraging Better Data Sharing Techniques
The second way of protecting patient privacy lies in the Expert Determination Method, or Statistical Method. This method requires that an expert familiar with the principles and techniques of de-identification examine the data and determine the risk by taking into consideration the sensitivity of the data, the context for its release, and the controls in place.
Expert Determination provides a more specific and granular method for assessing risk and de-identifying health data. It not only protects individual privacy better than Safe Harbor but also ensures the value of the data remains high. Dates under Safe Harbor are reduced to year, but under Expert Determination dates can often be retained within a specified range. When tracking the progression of a disease or its treatment, having dates retained to even the week is crucial.
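As a rough illustration of retaining dates at week granularity, the Python sketch below generalizes each date to the Monday of its week. Rounding to the week start is an assumption made for this example, as are the dates and the function name; the appropriate level of precision in practice would be whatever the expert's risk assessment supports.

```python
from datetime import date, timedelta


def to_week_start(d: date) -> date:
    """Generalize a date to the Monday of the week it falls in."""
    return d - timedelta(days=d.weekday())


# Hypothetical event dates for one patient (illustrative assumptions).
diagnosis = date(2014, 2, 13)
follow_up = date(2014, 11, 5)

true_gap_days = (follow_up - diagnosis).days
week_gap_days = (to_week_start(follow_up) - to_week_start(diagnosis)).days

print("Gap from full dates (days):", true_gap_days)
print("Gap from week-level dates (days):", week_gap_days)
print("Difference (days):", abs(week_gap_days - true_gap_days))
```

Because each date moves by at most six days, the interval between two events computed from week-level dates stays within six days of the true interval, which is usually enough to follow the course of a disease or treatment.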
This methodology has traditionally been out of reach for many organizations because of the training, expertise, and time needed to apply it. Today, however, there are commercially available tools that help health data stewards set up and support their de-identification efforts. New training courses and certification programs, such as the HITRUST de-identification framework training, exist to build a pool of experts in this methodology. Having an in-house specialist is a first step in the right direction. Privacy protection plans need to be revisited and adjusted to support the Expert Determination data sharing model. Using tools such as de-identification software, especially in the hands of a trained expert, is the best way for organizations to put their data to secondary use. Once they are able to automate data sharing, they can regularly unlock the value of their data for research and analytics.
Medical breakthroughs require high-quality, granular data to power innovation. It’s clear that unlocking Big Data in healthcare is possible and has incredible value. With the right approach, leading health organizations can position themselves not only to seize this opportunity, but also to be leaders in innovation.