When Scientific Data is Transparent, We All Benefit

The Environmental Protection Agency (EPA) recently finalized a rule to strengthen transparency in the science underlying regulatory health and safety rules. This transparency refers to making data from epidemiological studies available for evaluation by independent observers, since the studies often serve as the foundation for regulations that address exposure to substances that can affect health. Yet some scientists think this Trump-era rule should be repealed. These critics worry that people’s confidential personal health data may be revealed, which could deter people from participating in future studies. But, beyond regulations, access to personal health data is essential for us to improve public health as we move toward precision medicine.


This EPA rule addresses the so-called “secret science” underlying some regulations that are both the most expensive and, allegedly, the most beneficial in American history. Those regulations, for example, link particulate matter (PM) like different types of dust to premature deaths. These linkages are referred to as dose-response data, which estimate how much health harm results from various levels of exposure to a hazard like PM. The costs (primarily equipment to control PM) are estimated to be in the billions of dollars, but the EPA claims that the rules save thousands of lives. One estimate shows $32 billion in benefits over two decades at a cost of about $14 billion, though others dispute the benefits and question the science behind the rules.
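
To see why the dispute over the underlying science matters, it helps to work through the arithmetic of that estimate. The short sketch below uses only the two figures quoted above; the variable and function names are illustrative, and the "break-even" share is a simple back-of-the-envelope calculation, not something taken from the EPA's own analysis.

```python
# Figures quoted in the article, in billions of dollars over two decades.
benefits_bn = 32.0
costs_bn = 14.0

# Net benefit and benefit-cost ratio implied by those two figures.
net_benefit_bn = benefits_bn - costs_bn
benefit_cost_ratio = benefits_bn / costs_bn


def break_even_fraction(benefits: float, costs: float) -> float:
    """Share of the claimed benefits that must be real for benefits to equal costs."""
    return costs / benefits


print(f"Net benefit: ${net_benefit_bn:.0f} billion")
print(f"Benefit-cost ratio: {benefit_cost_ratio:.2f}")
print(f"Break-even share of claimed benefits: {break_even_fraction(benefits_bn, costs_bn):.0%}")
```

On these numbers the rules come out ahead by $18 billion, but only if at least about 44 percent of the claimed benefits hold up, which is why the dose-response data behind the benefit estimate draws so much scrutiny.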


A main source for PM regulations has been a 1993 Harvard study that associated air pollution with mortality in six U.S. cities. Known as the “Six Cities Study,” its data has not been made publicly available for replication. Insisting on transparency for this and similar rules is not an isolated effort. With 90 percent of scientists agreeing that there is a “replication crisis” in science, the Center for Open Science has over 5,000 scientific journals signed on to follow its guidelines, which include making data available.


But this problem doesn’t stop at regulation, and arguably, regulation is not the most important place for this data to be available. As we move toward precision medicine — where treatments are prescribed for individuals based on their genetics, health status, environmental influences and microbiomes — it will be necessary for doctors and scientists to access large data sets from millions of people.


For precision medicine to move us away from population-based treatments, from drugs to surgeries, we will need so-called “big data” coupled with methods to collect, store, clean, process and interpret that data. To be effective, we will also need advances in artificial intelligence (AI) and machine learning (ML). Large data sets combined with AI and ML will generate new hypotheses that focus on cures for more targeted sets of individuals, including so-called “orphan” diseases — rare diseases that may affect fewer than 200,000 individuals.


One example would be drugs prescribed for cholesterol: there are over 20 possible medications, and the effectiveness of each “varies from person to person.” Physicians will typically prescribe one and then monitor its effectiveness as well as side effects. If that medication doesn’t work or severe side effects emerge, they go on to the next one. With precision medicine, it may be possible to use the data from people with similar profiles to make more precise treatment recommendations.
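
As a rough illustration of what "people with similar profiles" could mean in practice, the sketch below finds the past patients closest to a new patient and picks the drug that worked most often among them. Everything here is an assumption made for illustration: the patient records are made up, the drug labels are placeholders, and a nearest-neighbor comparison on age and baseline LDL is just one simple technique, not a description of how any real precision-medicine system works.

```python
from collections import defaultdict
from math import sqrt

# Entirely made-up records: (age, baseline LDL in mg/dL), drug label, whether it worked.
PAST_PATIENTS = [
    ((45, 160), "drug_A", True),
    ((47, 165), "drug_A", True),
    ((50, 170), "drug_B", False),
    ((72, 190), "drug_B", True),
    ((70, 185), "drug_B", True),
    ((68, 195), "drug_A", False),
]


def distance(a, b):
    """Euclidean distance between two profiles (a real system would rescale features first)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recommend(profile, patients=PAST_PATIENTS, k=3):
    """Return the drug with the best success rate among the k most similar past patients."""
    nearest = sorted(patients, key=lambda p: distance(profile, p[0]))[:k]
    tallies = defaultdict(lambda: [0, 0])  # drug -> [successes, trials]
    for _, drug, worked in nearest:
        tallies[drug][0] += int(worked)
        tallies[drug][1] += 1
    return max(tallies, key=lambda d: tallies[d][0] / tallies[d][1])


print(recommend((46, 162)))  # younger patient, lower LDL
print(recommend((71, 188)))  # older patient, higher LDL
```

With this toy data the two calls print `drug_A` and `drug_B` respectively. The point is only that the recommendation changes with the profile, and that it can be no better than the pool of shared patient data behind it.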


This can only be done if people are willing to share their relevant information and experiences with those drugs, which circles back around to the same issue: patient privacy. While patient privacy is currently protected through the Health Insurance Portability and Accountability Act (HIPAA), many people are concerned about releasing their data.


In addition, there are extensive security and privacy protection measures that apply to most health information used in research. These include limiting who can access the data as well as encrypting it or using techniques that “deidentify” personal information used either in the original research or in reanalyzing the data to check the validity of findings.
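
To make "deidentify" concrete, here is a minimal sketch of one common approach: drop direct identifiers, replace the patient ID with a keyed hash, and coarsen age into a band. The field names, the secret key, and the ten-year age bands are all hypothetical choices for illustration. This is not a statement of what HIPAA requires, and real de-identification generally involves more than these three steps.

```python
import hashlib
import hmac

# Hypothetical secret held only by the data custodian (an assumption for this sketch).
SECRET_KEY = b"replace-with-a-real-secret"

# Hypothetical field names treated as direct identifiers and dropped outright.
DIRECT_IDENTIFIERS = {"name", "address", "phone"}


def pseudonymize(patient_id: str, key: bytes = SECRET_KEY) -> str:
    """Replace an ID with a keyed hash: the same ID always maps to the same pseudonym."""
    return hmac.new(key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def age_band(age: int, width: int = 10) -> str:
    """Coarsen an exact age into a band such as '40-49'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def deidentify(record: dict) -> dict:
    """Return a copy of the record with identifiers removed, hashed, or generalized."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["patient_id"] = pseudonymize(str(record["patient_id"]))
    cleaned["age"] = age_band(int(record["age"]))
    return cleaned


# A made-up record, used only to show the transformation.
record = {
    "patient_id": "A-1001",
    "name": "Jane Doe",
    "address": "1 Main St",
    "phone": "555-0100",
    "age": 47,
    "pm_exposure": 12.5,
    "outcome": "none",
}
print(deidentify(record))
```

The exposure and outcome fields, which are what a reanalysis of dose-response findings would need, pass through unchanged, while the name, address and phone number are gone and the ID can no longer be read directly.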


But we need this data now. In 2016, the 21st Century Cures Act was signed into law, which “is designed to help accelerate medical product development and bring new innovations and advances to patients who need them faster and more efficiently.” Although the numbers cited in the Cures Act may not be precise, it’s worth noting that there are somewhere in the neighborhood of 10,000 diseases but only 500 cures. The type of precision medicine that can uncover cures cannot work without access to patient information.


Even if a new presidential administration finds the EPA transparency rule flawed, that does not change the need for both public and private health care to safely access dose-response data.


Richard Williams is a senior affiliated scholar with the Mercatus Center at George Mason University and former director for social sciences at the FDA’s Center for Food Safety and Applied Nutrition.
