HLS Library Book Talk: “Big Data, Health Law, and Bioethics”

Well, welcome. It’s so nice to have you here. Weather could be a
little bit better, but maybe if it was
a little bit better, you’d be playing outside. So maybe we should be
thankful for the weather. I’m Glenn Cohen. I’m a professor here
at the Law School. I’m the faculty director
for the Petrie-Flom Center for Health Law
Policy, Biotechnology, and Bioethics, AKA the
longest name center you’ve ever heard of. I want to thank the
Harvard Law School Library who helped
put this together with special thanks
to June Casey who organizes all these
great book talks and so many others coming up. Also, our co-sponsor,
the Berkman Klein Center for Internet and
Society at Harvard University. And I want to mention that
immediately after this book talk, we’ll go to
the Pub where we’ll have food and libation for
you, and you can hear more about health law at Harvard. So free beer, free wine,
I think, free food. A great place to
celebrate this event. The book we’re going to
talk about today is edited by myself, Professor Gasser, who
you’ll hear about in a moment and hear from, and
also Holly Lynch, who’s formerly at Petrie-Flom
as Executive Director, now a professor at Harvard– at Penn Medical School. That was a Freudian
slip, because I wish she was still at Harvard. And Effy Vayena
who’s at ETH Zurich. We also received amazing support
from Crissy Hutchison-Jones who’s in the back who also
helps run these amazing events. And my RAs, Ethan Stevenson,
Brian Yost, Colin Herd, and Wilfred Beaye who
helped do the line editing. Those introductions aside,
let me start with a claim I’ve made in a few
other places that I think is really
important, and that is that people talk
about big data the way adolescent boys talk about sex. Everybody says they’re
doing it to each other. Many of them are not quite
sure exactly what it is. If they are doing it,
they’re probably not doing it very well. And certainly, they’re
probably not doing it in a way that both parties to
the interaction enjoy. So what do we mean
when we say big data? Typically we refer to the
three Vs. Lots of hype here, but let’s try to
get down some facts. Many possible definitions, but I think the common denominator is the three Vs. Volume, there's a
vast amount of data. Variety, there’s
significant heterogeneity in the type of data available. And velocity, the speed at
which the data can be processed or analyzed is extremely fast. Now some would add a fourth V– value. But in many cases, I
would say that’s more aspirational than actual. And defined as such, health
care has become one of the key emerging use cases for big data. For example, Fitbit
and Apple’s ResearchKit can provide
researchers with access to vast stores of biometric
data on users from which you’d test hypotheses on
nutrition, fitness, disease progression, treatment
success, and the like. EHRs are quickly being analyzed and, in some instances, there are attempts to monetize them. Countries around the world
are trying to use big data to improve public health. But for all of these wonderful
promises, this book– we are lawyers and ethicists,
after all– focuses a little bit also on the
gloom, the real big questions raised by big data from a
legal and ethical perspective. It has, I would say, at least
two cross-cutting themes. One is the question of
whether the regulation of medical record data,
non-medical record health care data, and non-health
care data that permits inferences about health,
whether a single regulatory regime will be good
for all of that, or whether we, instead, need
multiple regulatory regimes. And that’s important
to emphasize. Because while today–
and I would say HIPAA, our American system of
health care privacy– is focused on the
doctor-patient interaction, the health care record. That’s an extremely important
source of health data, but it’s far from
the only source. Your social media posts,
incredibly important source. Startling finding that
you could actually do fairly good
diagnosis of depression from Instagram filter
choice, for example, revolutionized I think how a
lot of people think about this, but whether it’s
social media, Fitbit, whether it’s health
records, whether it’s pharmacy records, whether
it's purchasing records. Then there's the famous story– it's true, I looked it up, it's not apocryphal– of somebody who comes into Target. In my imagination,
it’s a Southern person. I don’t know if they have
to be a Southern person, but it makes the
story sound better. So this gentleman comes
into Target and says, I want to speak to the manager. And the manager
comes forth and says, why do you want to speak to me? And he says, what
business do you have sending my daughter
all this information about pregnancy? She’s 16 years old. The manager
apologizes profusely. Well, about a month later,
the man comes back to Target and says, I'm here to apologize. The manager says, why are you apologizing? He said, my daughter
is pregnant. Turns out, Target
knew his daughter was pregnant before he did. Imagine now it’s not
just the Target data set, but it’s everything you’ve
purchased on Amazon. It’s your electronic
health records. It’s your Facebook,
your social media, it’s the Nest in your home. How do we think all of
these can be put together or should not be put together? What will be the
effect of big data? This is the second theme: the effect not only on the health care professions, but on medical education, society, and the way we interact with each other. There are 22
chapters in this book plus an introduction
plus an epilogue to discuss these things. They’re written by leading
scholars and practitioners. They cover vast terrain
of knowledge, but I’m just going to mention a
few of the topics covered. Is inferring
knowledge about people from data the same as
a privacy violation? To what extent can
existing regulatory regimes be arbitraged to the
world of big health data? What about pharmacovigilance and the False Claims Act? Can we use big data for that? The Americans with
Disabilities Act. Will we think of
discrimination– if you want to use
that word, maybe that’s pejorative–
but determinations made on the basis of data
that touch on disability or predictions about health
as violations of the Americans with Disabilities Act. If not, is that a problem? How do you deal with cases
where big data undersamples minority populations in terms
of making predictive algorithms and the like for health? What about medical
malpractice or other forms of liability in these settings? Big data and research ethics? Are there duties for us to
share data and under what terms? And how do we think about the
intellectual property regime, in particular patent, but also
trade secrecy as interfacing with questions about big data? Big data. It’s a big topic. But we’ve got some big
people here to talk about it. I’m now going to introduce them. The first one who’s
going to speak to us is Professor Urs Gasser. These people, they
have mile long CVs, so I’m just picking a
couple of highlights. He is the Executive Director
of the Berkman Klein Center and a Professor of Practice
here at Harvard Law School. He’s published more
than 100 articles in professional
journals, and his books include Born Digital, Interop,
and a forthcoming book on the future of
digital privacy. We’ll then hear from
Ameet Sarpatwari. I always hesitate a little
bit to see if I got the name. I got it exactly right. Ameet is the Assistant Director
of the Program on Regulation, Therapeutics, and Law– PORTAL, best acronym ever– at the Brigham and
Women’s Hospital. He’s an instructor of medicine
at Harvard Medical School and associate epidemiologist
at the Brigham. He also teaches public
health law at the university. He studied epidemiology at
the University of Cambridge and studied law at the
University of Maryland. He is a well-known scholar. You’ll often catch
him either on TV or testifying
before legislatures or all of the above. Finally, Carmel Shachar,
the executive director of the Petrie-Flom
Center and a lecturer on law at Harvard Law School. She is a cum laude graduate
of Harvard Law School, but also holds an MPH from
the public health school. She teaches here,
and she is really running a huge
number of projects at the Petrie-Flom Center. That if you stay
for the free booze– I was told to mention
this as much as possible– the free booze
and the free food, you can hear more about what
those exciting projects are. Everything from law and
neuroscience on the one hand to investment
questions on the other to even things we’ve recently
finished doing like working on the health of NFL players. So with that kind of
short introduction, I’m going to turn
it over to Professor Gasser who’s going to go first. [APPLAUSE] Hello, everyone. Good afternoon. I’m delighted to be here. First of all, a little
bit of a warning. I am not an expert
at all when it comes to this intersectionality– actually the previous slide– of health law, big
data, and bioethics. I’m the digital guy. You’ve heard it in
the introduction. I’m working at the
Berkman Klein Center. We look more broadly at the
impact of digital technologies on society. And so I’m relatively
new to this field, what’s happening when you
introduce tech to health. And therefore, my
comments will be a little bit more a
reflection on what have I learned as
I had the pleasure of helping to edit this book. Before I share a
few observations, I want to really say thanks to
Professor Cohen and the team at Petrie-Flom for
inviting us to this party. I learned a ton. And here are a few
things that I’ve learned. So essentially, what I’d
like to do is share with you six observations, six
takeaways that fall roughly into three categories. First category is
understanding the phenomenon. What’s going on as we
introduce advanced technologies such as big data analytics,
AI, the internet of things, you name it, to health. What is happening on the ground? Second cluster of
observations is around what are the
normative issues, and Professor Cohen
already alluded to some of the big normative
questions that arise. And then the third
cluster is about, OK, where do we go from here if we
understand the opportunities as well as the challenges. And we want to make sure we
embrace the opportunities, but also avoid some
of the pitfalls. First observation,
understanding what’s happening. We’ve already heard it
in the introduction. The big lesson
learned, number one. It’s complicated. A lot of things are happening. For those who are not working
in health or in health care, the health system
itself, of course, is of incredible
complexity in many ways. And now, you layer
on top of that the technological complexity. It makes it really an
exercise in managing through a complex system. And that management of
complexity, I think, is a real challenge for
a number of reasons. So first of all, you take the
example of electronic health records, and this
long journey that we are in to increase
interoperability among health records
that then would enable us to analyze
the data from many, many different sources. The reasons why we haven’t
achieved more interoperability are really complex. Have to do with economics,
fear of liability, cultures within the health
systems, and the like. So in other words, as
you introduce technology to this already complicated
and complex system, it gets really hard to
understand even for experts. And one of the big lessons
learned from the book and from the
conversations around it is really how many
different experts you need to get
together in one room to better understand what
is happening right now, what are the promises, and
what are some of the pitfalls. So in that sense, it’s a deeply
interdisciplinary endeavor to talk about big data, and
health law, and bioethics. With that
interdisciplinarity, I think, a number of complications arise. The first one that
I observe is one of semantic interoperability. If you’re a statistician
or a computer scientist who understands the AI part of
whatever diagnostic system is developed, you may have
a very different language and use a very
different terminology from the physician
who ultimately will work with this particular
system or actually the patient who will, whatever,
get a treatment based on what the physician and/or
the AI system determine. So how do we deal with these
different vocabularies? A simple example is
computer scientists and engineers have a different
notion of privacy than lawyers or many of the
people in my space. And it takes a lot
of time and energy to define these vocabularies. Which is important if you want
to understand what’s actually going on, how do
we label things, and what are we
going to do about it. I would even take
it a step further– and Professor Cohen
alluded to it already– it’s also that within
each discipline, the vocabularies are challenged. So what is health data
in this new environment as you described it before? It’s no longer clear
what basic concepts mean. The [INAUDIBLE] would be
privacy in the legal discourse, so we have real
language challenges as we even try to triangulate
the phenomenon here. Last observation
within that class is also many of the
technologies that are now added make
it particularly hard to understand them. Think about AI systems that
reach a degree of complexity and that are really often black
box technologies where even experts can’t explain what the
algorithm exactly was doing leading to particular outcome. So that makes it very
hard to understand even as a descriptive
matter what is actually happening as we inject these
advanced technologies to health care, health discourse. The second observation in
this, or the second cluster of observation, as I said,
is about normative issues. So the book talks
a lot about these technological advancements
and the promises and pitfalls. But ultimately, as
you browse through it, I feel it’s really more a
story about big normative questions about human questions,
about societal questions, value issues. So yes, they are
amplified by technology, but ultimately, pull us
down to big questions that we as societies, we as
citizens, we as patients, we as professionals have to answer. This concerns both some
sort of new challenges, normative challenges– I will get to that in a second– but also existing norm sets. Professor Cohen already
mentioned it by way of example. Do we need to update certain
laws and regulations that have, in the past, established an equilibrium in terms of balancing the different interests of the stakeholders involved? Take the example of
informed consent. If informed consent
doesn’t really work anymore as a
mechanism, or not as in past technological
environments, what does that mean? Do we need to have
something new, or do we need to double
down on informed consent? Or should we focus more on,
OK, forget about consent, but we put restrictions
on how data is used. So these are really, at its
core, normative questions. But it’s not only about
how existing norm sets are challenged by the disruptive
technology and the dynamics that flow from there. It’s also about some of the
hard new societal questions that we’re confronted with. Professor Cohen mentioned
the issue, for instance, of individual privacy versus
public health benefits. So maybe a big data set may
be challenging or infringing on a person’s
privacy, because you could re-identify the person
who appears in a big data set. But the value you can
get out of the data set and the public health
benefits may be enormous. So how do we think about
these value trade-offs in individual privacy versus
the use of technology, here big data, for
the public good? That’s not something where
technology should or can give the answer. That’s really a conversation
we as a society need to have. The second example
in the same category is the issue of inclusion. Turns out– and one chapter
describes it in quite some detail– that many of the big data
sets we’re talking about are not really
inclusive data sets. They don’t necessarily
represent the entire population here in the US. They don't necessarily have
data in it from immigrants, let’s say. Now, what does
that mean if we use that data later on to develop
treatment strategies to make recommendations of all sorts,
to discover some of the patterns that Glenn was mentioning? Are we, again,
excluding populations that would most
benefit potentially from these next
generation technologies as they are applied in health? That’s a big societal,
it’s a human question that, again, goes far beyond
internet of things or big data. This is something that
we as a society need to sort out and figure out. The last cluster, third
set, is really about design. So if we have a long list
of opportunities which are, I think, well-represented
in the book as well as some of these challenges, whether
it’s privacy threats, whether it’s fear of
discriminatory practices, whether it’s bias, whether it’s
cybersecurity threats, if you think about the
internet of things. So it’s a long list. What can we do to embrace the
opportunities on the one hand, but then also manage
some of the challenges as we enter this new
world of AI, IoT, data analytics, and health? The big takeaway from
the book is that there is no silver bullet solution. Otherwise, it would be
a short book, I guess. But rather that we will
need a lot of creativity and embrace essentially
the full instruments available in the toolbox. Professor Cohen, of
course, for obvious reasons put emphasis on law. I will also return to that
in just my last observation. But it’s worth highlighting
that in the book, there are also other
tools described. So for instance,
how could we work with market-based mechanisms to
address some of the challenges? For instance, there
is a chapter arguing we should be innovative and
create some sort of clearing house that would address some
of the problems that arise with the internet
of things as applied in the health care sector. We should think
about standards that are competing on a
marketplace, for instance, standards for privacy. So there are
market-based approaches. Some chapters highlight other
instruments of governance including technologies. So to what extent can we
deal with some of the, let’s say, privacy problems by
looking at privacy enhancing technologies, or
new frontiers, how advanced algorithms can help
to address privacy challenges? The key word here is differential privacy; I could go more into that.
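To make that idea concrete, here is a minimal sketch of differential privacy's classic Laplace mechanism for a counting query. This is an editorial illustration rather than anything from the book or the talk: the function name and numbers are hypothetical, and real deployments need far more care (privacy-budget accounting, composition across queries, and so on).

```python
import numpy as np

def dp_count(true_count: int, epsilon: float) -> float:
    """Release a count with Laplace noise calibrated to sensitivity 1.

    Adding or removing one person changes a counting query by at most 1,
    so noise drawn from Laplace(scale=1/epsilon) yields
    epsilon-differential privacy for this single query.
    """
    return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)

# E.g., "how many patients in this data set take drug X?"
print(dp_count(true_count=1234, epsilon=0.5))  # noisy, but still useful
```

The design choice is exactly the trade-off the speakers keep returning to: a smaller epsilon means stronger privacy and noisier answers.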
So you get a sense it's not only about the law, but there are also other tools available. And the challenge is
how can we find a mix to enable some of the good
uses of the technology while addressing some of
the risks and challenges. Now, the last observation
is really going back to law. One of the big learnings
for me from this chapter on looking at health as
one big huge case study is that health as an
industry or sector is not that different from other
sectors we’ve studied before whether it’s entertainment. Remember, the copyright
wars 15, 20 years ago. Or if you look at what’s
happening to transportation or any other industry
in that the reactions, the legal system,
and how lawyers think about
disruptive technology is actually quite similar
across industries. And so within
health, I think, you see quite clearly that there are
roughly three response modes. One is that we try to apply
the old rules we already have, whether it’s HIPAA
or any other norm set and try to apply that to the new
technology that enters health. And quite often, that
gets us quite far. But of course, we
also know there are limits how far that goes. And then the question arises,
do we need to change the law? We heard already an example,
the False Claims Act. Does that need to be upgraded? Because you can now start
to manipulate data to trick, to commit fraud,
health care fraud. So you can make radical changes. That’s another
response mode that you see in many different sectors. Or you can innovate
more dramatically. Maybe an example
of innovating more dramatically is
the GDPR in Europe. But also some of the
proposals coming out in the book including,
well, maybe we have to rethink privacy
more dramatically in this age of big data. Maybe we don’t think of privacy
anymore as an individual right, but as a group right. Because, in many cases, when
big data becomes relevant, it’s about categorizing certain
populations into groups. And that is what we
may be concerned about, for instance, when it comes
to discriminatory practices. So do we need to have a group
privacy right as opposed to the traditional
right to privacy that we think of as
an individual right? So these are some of the bigger
paradigm changes that are now discussed, and I think
are symptomatic also for conversations
in other areas. The last point, just quickly,
is also going through the book, you see another recurring
theme emerging about the law. Which is often we think
of the law as just a constraint on behavior. The law basically tells
you what not to do. You’re not allowed
to share this data. You’re not allowed
to collect it, or you have all sorts
of banned behavior. That’s some sort of the
common notion of law. But in the book, there
are several instances where it becomes clear
that the law can also play a leveling function
or an enabling function. The leveling function, I
think, is more nascent. We see a lot of discussion
outside health now looking at big companies, big tech
companies that have accumulated so much data that
their algorithms get better and better,
but what about new entrants into the
market, small companies that try to compete? If they don’t have access
to this huge volume of data, as we heard from Professor
Cohen, can they ever compete? So law, of course,
through competition, law and other mechanisms can
address this problem. We may see some of that
also unfolding in health. But certainly what we
see in the chapter, in particular also the one that
Professor Cohen wrote is maybe the law in some instances,
in this context, takes it too far in terms of
locking things down instead of maybe creating an environment
where we have more information flow to the benefit
of health care and to the benefit
of public health. And privacy is, of course,
one of these areas. So Professor Cohen
introduced the idea, could we flip it around
and think about a duty to share information given the
benefits of information sharing in a world where
any progress or much of the progress in
medicine and in health will come likely from data. So really provocative thought,
and I think a great example how law could not only
constrain but unlock and open up some of the
promises of the technology. So these are a few reflections,
again, from an outsider. You will hear now much more
detailed substantive analysis on some of these issues
that I may have mentioned. But I hope this at least
connects the discussion also a little bit to
other conversations we are having be it autonomous
vehicles, AI, ethics, and governance. So thank you so much. Thanks again, Glenn. Thank you. [APPLAUSE] Good afternoon, everyone. And I want to say
thanks so much for Glenn and for Carmel for inviting
me to talk about this, and in general for hosting
these types of great events where I really think it is
essential to bring together stakeholders from
different backgrounds. And here I stand a little bit
at the intersection of people who are epidemiologists,
who are really pushing for a lot of data
sharing as much as possible, but I also stand a little
bit in the world of law where the risk averse nature
of lawyers comes out as well. So what I’m going
to be talking about is the promise of big data. I love Glenn’s analogy to sex. Everybody is doing it. And the promise of big data
in post-approval research for drugs and devices and how
HIPAA, and specifically HITECH, constrains and allows that research, what the ethical considerations of that are, and then end with a little bit
of recommendations in terms of how this
can be facilitated in a responsible way. So to begin: why the increasing need for post-approval research for drugs and devices? We need to first take a look and say, what do we know when a
drug or device is approved? There are limits of
pre-approval studies and that entails frequent exclusion of
key segments of a population– women, children,
ethnic minorities. Also, the fact
that these studies are conducted on short time
lines with small populations, so there’s the inability
to detect rare but serious adverse events. An example, Natalizumab
and progressive multifocal leukoencephalopathy,
a rare but fatal event that would have
never been caught in a pre-approval
clinical trial. It’s only once the
drug is on the market. Further, we need to
stand back and say, what is going on in
general with drug policy? And regardless of whether
or not you’re in Europe or in the United States,
there’s increasing pressure to get drugs out into
the market faster on the basis of
surrogate outcomes. Which we know after the fact
are on occasion later shown to be poorly predictive
and increasing use of accelerated review pathways. So there is an urgent
need here to be able to conduct these
studies in a fast process and a robust method. And what do we have to do that? We’ve got a considerable
amount of raw ingredients. We’ve got insurance
claims, consumer purchases, electronic medical records. So there you see a graph just
on the uptake of the usage of electronic medical records– wearable sensors, social media. So interestingly, people
saying on Twitter, well, I’ve taken this, and
I have this side effect has actually been
mined as a way, the FDA has explored those
opportunities to say, can we catch early adverse
events through that system, and increasingly,
biological registries. So there you just see the
decreasing cost of whole genome sequencing eventually getting
to a point where everybody can have this done if they want to. So this is a beautiful slide
by Isaac Kohane in JAMA, and I’m not going
to unfortunately be able to do it to justice. But here, you can
see the types of data that exist, the structured
and unstructured nature of it, and how all of these
pieces fit together. But really the key
of this is these are all isolated pieces of data. The magic is in
putting them together, and the question is,
what are the systems in law that allow us to do
this or that hinder that? And whether or not that balance
from an ethical perspective is appropriate. So what in terms of data sharing
to aid post-approval research here, what do we have? In terms of the facilitation
of observational study, what does this data sharing do? It enables enhanced capture. So let’s just go through
a couple examples here. So people with relevant
exposures and outcomes. We know in the United
States, there’s considerable churn
in health insurance. So I might have access
to certain people who start a medication
in one claim database. But if I can’t follow
them through over time, I don’t know if they actually
suffer an adverse event. So the only way
I’d be able to do that is by linking those
claims databases together in terms of variables
of interest. My claims data isn’t ever going
to have information on things like BMI or smoking,
because that doesn’t affect whether or
not I get reimbursed by the insurer for that. But the EHR, the
electronic health record, might have that information. So if I'm able to link those together, now I have more variables to study.
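As a concrete sketch of what such a linkage looks like (hypothetical column names and values; real linkages use tokenized identifiers and far more careful matching):

```python
import pandas as pd

# Claims know about dispensings and reimbursed events...
claims = pd.DataFrame({
    "patient_id": [1, 2, 3],
    "drug": ["statin", "statin", "anticoagulant"],
    "outcome": ["none", "mi", "none"],
})
# ...while the EHR knows clinical variables that claims never carry.
ehr = pd.DataFrame({
    "patient_id": [1, 2, 3],
    "bmi": [24.1, 31.7, 27.5],
    "smoker": [False, True, False],
})

# Linking the two adds confounders like BMI and smoking status,
# allowing more rigorous adjustment in the observational analysis.
cohort = claims.merge(ehr, on="patient_id")
print(cohort)
```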
So again, one problem with claims data is that if I'm looking
just at claims data, once people go
into the hospital, it becomes a black box. And so I’d like
to be able to know what happens inside
the hospital, but also outside of the hospital. How does that happen? By sharing this
information together. So by enabling
this data sharing, I’ll be able to have improved
statistical power, more rigorous adjustment
for confounding, and more detailed
subgroup analyses. That is particularly
relevant in the context of precision medicine. So now I need robust
statistical power to be able to say which people
are particularly benefiting from this treatment and
which people are more prone to suffer adverse events? And in this era, with the
amount of data that’s out there, there’s the ability to do this. But the question
is, how does HIPAA constrain this or enable it? And one thing I want
to say is, I think, HIPAA gets a little bit of a bad
rap in the sense that, I think, there is considerable
flexibility built into HIPAA. Now, it remains an open
question of whether or not that flexibility
is sufficient, or is it proper in the
realm that it covers. Because, again, as
Glenn alluded to, we’re only talking about a
certain stream of information that it’s covering. So here– it’s a little
bit of a background here– the security rule covers
electronic protected health information, maintenance,
and transmission. And the privacy rule
covers conditions for the disclosure of PHI. And patient authorization is not
required, among other things, for treatment, payment,
and health care operations, for public health activities,
and what we’re talking about here, for research. So you can get an
IRB waiver if there’s not more than minimal risk. But in the context of
observational data, observational studies
where you can say the risk to the participant
is probably minimal– and we can debate that– the problem with
this process is there have been reviews that have
shown that we’re worried about inconsistent or
ambiguous interpretation of federal regulations. We’re worried about changes
required by certain IRBs and not by others, and what that
does to the overall research protocol. So with that in
mind, what does HIPAA enable in this context in terms
of data sharing independent of IRB authorization? So it permits covered entities
and their business associates to disclose PHI without
patient authorization or IRB waiver of authorization
through two general paths. One is a limited
data set, which removes 16 specific
identifiers, and I’ll come back to this in just a second. The other path is
de-identification. And de-identification involves
turning PHI into non-PHI. And there’s two
paths within that. One is a safe harbor mechanism, which, again, just says that as long as you don't share 18 specific identifiers, you're fine. The other one is a
bit more ambiguous. It says expert determination. So I can share any data
as long as an expert says that the risk is very small that
the information could be used alone or in combination
with other reasonably available information to
identify an individual. What does that mean? We’ll get to that
in just a second. So now, what about those
cookie cutter pathways, the 16 identifiers or
the 18 identifiers? Is this going to allow us to
do the research we want to do? Let’s back up even
further just a second and say, well, why don’t we
just get patient authorization? We’ve talked about
why IRB authorization might be problematic. Why not patient authorization? Here we’re talking about
millions and millions of patients, and the
process of doing that would also be constraining. So assuming that this is a pathway we want to explore and we find public health value in using it, the safe harbor says: these are all the identifiers you cannot share. And as long as you don't share this information, what you're sharing is de-identified information.
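A minimal sketch of what a safe harbor scrub amounts to in code — the field names below are hypothetical stand-ins for the rule's 18 enumerated categories of identifiers, not the regulatory text itself:

```python
# Hypothetical stand-ins for the 18 safe harbor identifier categories.
SAFE_HARBOR_FIELDS = {
    "name", "geographic_subdivision_smaller_than_state",
    "dates_more_granular_than_year", "phone", "fax", "email", "ssn",
    "mrn", "health_plan_id", "account_number", "license_number",
    "vehicle_id", "device_id", "url", "ip_address", "biometric_id",
    "face_photo", "other_unique_id",
}

def scrub(record: dict) -> dict:
    """Drop every field that falls in a safe harbor identifier category."""
    return {k: v for k, v in record.items() if k not in SAFE_HARBOR_FIELDS}

record = {"name": "Jane Doe", "ssn": "000-00-0000",
          "diagnosis": "asthma", "year": 2010}
print(scrub(record))  # {'diagnosis': 'asthma', 'year': 2010}
```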
So in the context of post-approval drug research and surveillance,
there are some things that are flagged as
potentially problematic. A key element in any good
observational research is an element of date. We need to have
some sort of timing here as far as when
a drug was taken and when an adverse
event happened. Device identifiers if we’re
doing drug and device research. Also, addresses. Because addresses are oftentimes
the only pieces of information in which we have to understand
socioeconomic status, which could be an important confounder
in the type of research we want to do. So that being said, well,
there are some pathway problems with just these limited
data sets or the safe harbor pathways. The limited data set,
there is this issue of what I am converting. Only 16 of those identifiers can't be shared, and I can share
elements of dates. But under that pathway,
it remains PHI, and thus, there is still
liability associated with the disclosing party. And then with the safe harbor
pathway, here's the hidden flexibility. Now, if you're a lawyer, you look at what it says: what can I not share? All elements of dates. Well, maybe I don't need to share an element of a date. Maybe I just need to share time-to-event information. So maybe I just say that, from an arbitrary reference point, somebody took a drug 50 days later, and then suffered a myocardial infarction 75 days after that. That could be sufficient.
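Here is a minimal sketch of that time-to-event transformation (hypothetical data; a real pipeline would also have to worry about the leakage issue the speaker raises next):

```python
import pandas as pd

# Hypothetical PHI: real calendar dates for one patient.
events = pd.DataFrame({
    "patient_id": [1, 1],
    "event": ["drug_start", "myocardial_infarction"],
    "date": pd.to_datetime(["2024-03-01", "2024-05-15"]),
})

# Re-express everything as days from a per-patient arbitrary anchor,
# so that no element of a calendar date is shared.
anchor = events.groupby("patient_id")["date"].transform("min")
events["days_from_anchor"] = (events["date"] - anchor).dt.days
print(events[["patient_id", "event", "days_from_anchor"]])
```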
Could that perhaps get us out of this conundrum? Think about what I
said about the need to conduct these studies faster
and in as much as real time as possible. As those studies become quicker
and quicker in duration, we are implicitly going
to violate that rule about sharing elements of date. And here’s just one example. So this is an example of a study
that was conducted in Italy. And we can assume, let’s say
it was conducted in the United States, and somebody was– this involved a vaccination
and whether or not there was an adverse event
associated with a vaccination. This study was conducted
from October 2009 to January 2010 inclusive. And so now, I say, well,
what happens if I actually say that from 200
days arbitrarily after some arbitrary
reference point, the person had a vaccine. And then 40 days after that,
they had an adverse event. By just knowing when
this study took place, I would know that
that person could not have been vaccinated in 2010. It had to be in the
second half of 2009. As a result, have I shared
more than an element of a date more specific or more
granular than a year? I have. So again, showing you
there’s flexibility, but that flexibility
still hits a little bit of a roadblock in terms of how
much information I can share. So then there is this
expert determination pathway and that expert
determination pathway says, well, maybe anything goes. Maybe I can craft
this specifically in terms of what
information I want to share. I just have to
have an expert say that there is a low risk
of re-identification here. Now, the problem
is really what does a low risk of identification
mean, who is an expert, and are there certification
standards around? Your answer to all
of those is these are still developing questions. And because these are
developing questions and because there is
potentially a risk of liability for certifying something
that is actually unsafe, there is the problematic nature
that this pathway is being– it exists– but
is not being used in a way that would be
conducive to the data sharing that HIPAA actually permits. So to back up one
second, and, again, say, well, what are the
ethical considerations at play? And we’ve heard this. There is this
inherent tension here between wanting to put
forward as much data sharing as possible
because that will be beneficial from a
public health standpoint. And that tension
between saying, we want to value people’s
privacy and their right to make determinations
about their data that theoretically they should
own or at least have some say in how that data is used. So here is where
epidemiologists, computer scientists come to the
situation with a little bit of a different
perspective than lawyers. A canonical piece
now in privacy law is the myth of
de-identification. I don’t know if you
guys have read it, but if you haven’t,
you definitely should. It’s by Paul Ohm. But he cites a couple of
examples, most notably one that I particularly enjoy
is by a professor here who, as a graduate student– Latanya Sweeney–
proved that information that Governor Weld
at that time wanted to release into
the public domain was actually not de-identified. So this related to
the hospitalization records of state employees. And what she did is she
basically combined that data with census data, and she
showed that she actually identified from that
de-identified database which records were the governor’s
and mailed them to him.
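The mechanics of that attack are worth seeing. A minimal sketch, with hypothetical names and values (Sweeney's actual work used ZIP code, birth date, and sex as the linking fields):

```python
import pandas as pd

# "De-identified" hospital records: names removed, quasi-identifiers kept.
hospital = pd.DataFrame({
    "zip": ["02138", "02139"],
    "birth_date": ["1945-07-31", "1962-01-15"],
    "sex": ["M", "F"],
    "diagnosis": ["cardiac", "asthma"],
})
# Public voter roll: names attached to the same quasi-identifiers.
voters = pd.DataFrame({
    "name": ["The Governor", "Jane Doe"],
    "zip": ["02138", "02139"],
    "birth_date": ["1945-07-31", "1962-01-15"],
    "sex": ["M", "F"],
})

# The whole attack is just a join on the shared quasi-identifiers.
linked = hospital.merge(voters, on=["zip", "birth_date", "sex"])
print(linked[["name", "diagnosis"]])
```

Sweeney later estimated that ZIP code, full birth date, and sex alone uniquely identify a large majority of Americans, which is exactly why the safe harbor strips those fields.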
So these examples are shown in the law literature as the myth of
de-identification. But when you talk to computer
scientists who do data privacy, they’ll say, look, first of
all, that’s not a case of HIPAA de-identified data. And if you actually
take a look and try to model attack scenarios
about how you can re-identify what is HIPAA de-identified
data under the safe harbor, it is very, very difficult.
Even if you can identify a small subset of individuals,
then the question ethically is, what risk are we willing
to accept as a society as OK in terms of being able to
facilitate this data sharing? The real question is not
is there a case of no risk? There’s never really
a case of no risk. And what we need to have more
public discussion of is what is an acceptable level of risk? And so this slide
just shows examples of how difficult re-identifying
HIPAA de-identified information actually is. So here I just want
to say with that in context, what we put forward
is a couple of recommendations in terms of how
you can potentially start the ball rolling in
developing these certification standards. And we just come up with
a few general principles. Which is that in terms of
the risk thresholds setting, if we are saying what an
expert should be involved in, first, we should say that not
all risks are created equal. And so even though the expert
has a considerable degree of discretion given
regulations, given guidance on what an acceptable
level of risk is, or what a very small
level of risk is, the expert should select a
risk threshold proportional to the potential harm of
the re-identification. Not all re-identifications
are the same. If I re-identify
someone who’s taking a statin versus
re-identifying someone who is taking pre-exposure
prophylaxis for HIV and AIDS, the potential danger I
could cause to that person is different in each context. So let’s consider that as one. Two, let’s model different
types of attack scenarios. We don’t really know who
the attacker is going to be. It could be a prosecutor. It could be somebody
who’s looking for a specific known person. It could be someone
who’s trying to identify any person in the
database, or it could be a marketer,
someone who’s trying to re-identify as
many records as possible. When we are trying to say what the risk is, we need to explore all of these different options.
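A toy sketch of how those two recommendations– proportional thresholds and multiple attacker models– could fit together; every name and number here is made up for illustration:

```python
# Made-up thresholds: the more harmful a re-identification would be,
# the smaller the acceptable probability of it happening.
THRESHOLDS = {"low_harm": 0.09, "moderate_harm": 0.05, "high_harm": 0.01}

def acceptable(attack_risks: dict, harm_level: str) -> bool:
    """Require the worst-case risk across the modeled attacker
    scenarios to clear the threshold chosen for the potential harm."""
    return max(attack_risks.values()) <= THRESHOLDS[harm_level]

# Modeled risks for one proposed release under different attackers.
risks = {"prosecutor": 0.04, "known_person": 0.02, "marketer": 0.008}
print(acceptable(risks, "high_harm"))      # False: tighten the release
print(acceptable(risks, "moderate_harm"))  # True
```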
And then in terms of risk mitigation, we shouldn't be overly
conservative in terms of the level of risk that
we’re willing to assume. And finally, in
terms of the experts, there are other ways
we can go about doing this in terms of furthering
the likelihood that there won’t be risk. And one of those is why
don’t we routinely use data sharing agreements
that condition the way in which the recipient party
will use that information. So hopefully, this
gives you a little bit of a subset of some of
the samplings in the book. But thanks. [APPLAUSE] All right. Hello, everybody. My name is Carmel Shachar. And as Glenn mentioned,
I’m the Executive Director of the Petrie-Flom Center. So before I start
talking on the substance, a few completely
shameless plugs. First of all, working
on this chapter and then joining the
Petrie-Flom Center was not my first
encounter with the Center. I was a student fellow here. And for all of the students
sitting in the audience, checking us out, I
would highly recommend that you go that route. Because who knows, maybe one day
you might steal my job from me. The second thing is if you
are perhaps a little bit past the age of students– and I
certainly don’t see anybody here who appears to
be a day over 23– but there are many
ways for you to get involved in the Center that are
different than being a student fellow. I would highly recommend that
you sign up for a newsletter. It comes out only
every other Friday. We strongly believe
in not spamming. It’ll let you know when all
of these great events happen. So if you enjoy this, there
are many more events to come. I believe that we have
some information in the back that you can take
if you’re curious, or please feel
free to talk to me. All right. Now that I’ve finished the
shameless plug of my section, let’s talk about the
chapter that I contributed to the big data book. My chapter really looked at
a fairly recent Supreme Court case, Gobeille v. Liberty
Mutual Insurance Company, to look at what happens
when law impacts the collection of big data. Because that’s really step one. You can’t do all of
these cool things that you claim to
do with big data if you can’t collect
it in the first place. And here, I’m going
to walk you through, and this is going to
be short, 20 minutes, because I stand between
you and the questions and more importantly,
the free alcohol. I understand. I’m going to introduce
you to two concepts that are not particularly sexy– all-payer claims databases
as well as ERISA– and explain to you how
their intersection led to actually a very
interesting case that had even more interesting public
policy ramifications on how we try to get information
about our health care system and build a better health
care system for everybody. And then I’m going to talk to
you about potential solutions. So first, I said all-payer
claims database or APCDs. I mentioned ERISA. What are these two concepts? So the case I’m talking about
was based on the Vermont all-payer claims database. And that is actually a
great example for us to use. All-payer claims databases contain medical claims, pharmacy claims, dental
claims, as well as additional information
about patient and provider demographics in an area. And it includes public
payer information, which is relatively
easy for us to get– your Medicare, your Medicaid– but it also includes
private payers. Most states with APCDs, until recently, when they wanted to aggregate this information so they could really get a sense of all the claims happening in the state, required all employer-sponsored health insurance– that's the type of insurance I'm willing to bet most of you guys have– to enter its claims information into the APCD. One thing to note is that
the end user of APCDs is usually not the patient. So you’re not going to
the Vermont APCD database to look up and remind
yourself, wait, when did I get that
prescription for antibiotics? Or when did I have that
burn that I got treated? The end users are policy
makers or researchers or people who are really trying to
understand what is going on with our health care system. This is a tool that especially,
in a fragmented system like ours, is really
valuable for getting these overarching data sets. I also mentioned
ERISA, and when I gave this presentation
at our annual conference a couple of years
ago, I used the joke that ERISA is so boring
even lawyers find it boring. I have not learned anything to
disprove the truth of that joke since then. So I will do this very quickly. ERISA came of age in
the 1970s, and it’s meant to regulate
employee benefit plans that employers sponsor. The idea is that
these plans should be consistent across the country. They shouldn’t be
overburdened by states, and there should be
some accountability so that employees know what
kind of benefit plans they have. One of the most important
and interesting concepts there is the concept
of preemption, saying that the
federal government’s laws trump the state laws. ERISA has one of the
broadest preemption clauses of any modern
federal statute, which means that there have been
a whole host of cases trying to figure out at what
point does ERISA stop and state law begin. I often liken ERISA to the fat lady singing. When she sings,
everything’s over. Before Gobeille,
the case I’m going to talk about, the standard was
that a state law was preempted if it had a connection with
or to reference to such an employee benefit plan. Which was still fairly
broad, but had this idea of a strong connection. You’re really touching
something that ERISA should be regulating. One thing to note is that
ERISA does have a reporting requirement, but its
reporting requirement is really limited to
the financial health of the employee benefit plans,
including health insurance plans rather than the APCDs
reporting requirements, which is looking at health
care data explicitly. So now that I have gone
through those two concepts, let’s talk about the
Supreme Court case that inspired me to
write this chapter. The central issue
in this case is whether ERISA preempts the
Vermont state law requiring all health insurers,
including those who are self-funded
by employers, to report claims and health
care service data to the state in order to flesh out the
data set for the APCD. Specifically, the question
was whether ERISA preempted the Vermont statute in
establishing a unified health care database and requiring
health insurers to report health care claims data
that may not necessarily be financial reporting data. Again, I emphasize
ERISA wants you to report financial
data to look at overall is this plan healthy? Is it going to be around to
provide the benefits employees were promised? Whereas the APCD
could, in some ways, care less if your
health insurance plan was around tomorrow. It just wants to know
what your health insurance plan has already paid for. So the Supreme Court concluded
that ERISA did indeed preempt Vermont’s APCD. It said the Vermont law had a
connection with the ERISA plan, that it tried to govern
a central matter of plan administration– in this
case, reporting, disclosure, and record keeping– and that it interfered
with ERISA’s goal of having a national
uniform plan administration. One thing I want to highlight–
and we’re going to talk about it under potential solutions– is that Justice Breyer wrote
a concurrence in which he noted that this
decision would cause serious administrative problems. So he said, this is good law,
but perhaps not good public policy. And he suggested
that the states work with the Department
of Labor, which is the agency charged
with issuing regulations around ERISA as well as the
Department of Health and Human Services to remedy this issue. Justice Ginsburg
wrote the dissent, saying that Vermont’s law should
not be preempted by ERISA. And that was because she felt
the record collection were so different that
having the APCD and requiring to report
this information in did not impose a substantial
burden on ERISA, because they elicited different information
and serve distinct purposes. So now I’ve introduced
you to APCDs and ERISA and explained the
collision between the two. But you might be asking,
OK, why is this a concern? Why should I care about this? How does this
relate to big data? Let’s go back to the sexy
metaphor about teenage boys. Well, first let’s talk
about the overall impact. This is a map of what APCD
land looked like in 2016 right before the case came out. We had about 18
states with APCDs, about 12 interested
in developing them. And a lot of the reason that the
states wanted to do this work is because everybody’s
talking about the runaway cost of health care. You read a million
articles on that. And states felt if
we have an APCD, we can know what is
being spent and that transparency is the first
step to controlling costs. Now in 2018, the map
looks a lot grayer. And the gray are
for states that have decided I’m not really
interested in doing anything with APCDs. So already, you
have a lot of states that were going to try to
do these big uniform data sets that would have been really
interesting to play around with, and walked it back. I also want to say this map is
a little misleading, I think, because the blue states
are states that have APCDs. But after the
Supreme Court case, they don’t have
the complete data. They’re so far from being
a universal data set. So to give you a
sense of it, in 2017, about 151 million
people had employer sponsored insurance,
of which 60% were self-funded by their employer. So these are the plans
that ERISA protects, and that after Gobeille, we
couldn’t collect data on. In Vermont alone, we know that
the ruling eliminated about 20% of the total population
from the data set. That’s a really big chunk. Furthermore, the
people eliminated, it’s not consistent. So in Vermont we know
that a higher percentage of non-elderly women,
59%, as compared to non-elderly men, 55%, are
covered in these employer sponsored plans. So we’re losing more
women, and our data set is getting less representative. And the less
representative our data set is, the worse
the conclusions that we can draw from it is. So we also have
a problem when we have data sets now that
are composed largely of public payer data. Because Medicaid and Medicare
do cover a ton of people. You can get some
really interesting data out of that population alone. But Medicare, obviously
skews very old, and Medicaid skews towards
lower socioeconomic status. And so that’s very
different health care data than when you include
the employer sponsored people. So here I’m going to give
the pitch on why APCDs were really useful and exciting. Here were some of
the types of things that states were trying to do
before the Supreme Court case. So Colorado was trying to use
its APCD for price transparency to try to increase competition
for maternity services and hip and knee replacement,
to improve those prices and bring those costs down. In New England, a
researcher used an APCD to compare outcomes of
services for children enrolled in Medicaid
and children in commercially insured plans. Which now you can’t
necessarily do, because you’re missing that chunk. And they found that there
were serious problems and differences and outcomes
between these two populations. Which might be common
sense, but also is very important to study. Lastly, the opioid
epidemic is obviously a big issue in our country. It was in 2016. Unfortunately, it
continues in 2018. And in Maine, people were
using data from APCDs to try to see what
sort of patterns could indicate that somebody
either had a substance abuse issue or was at risk
of developing it, so that they could flag
those people for earlier and earlier interventions. These are all worthy topics. Some other issues
I want to raise when you try to do big data
work with APCDs when they’re incomplete data sets is
that you lose the ability to catch a lot of things. So Ameet actually mentioned
this drug earlier, but in a large
enough data set, you can see these really rare but
important side effects that can’t be caught as you
reduce the number of data points you have in your data set.
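The sheer-numbers point is easy to quantify. Under a simple independence assumption (an editorial sketch, not a calculation from the book), the chance of seeing at least one case of an adverse event with incidence p among n patients is 1 - (1 - p)^n:

```python
def p_at_least_one(p: float, n: int) -> float:
    """Probability of observing >= 1 event of incidence p among n patients."""
    return 1 - (1 - p) ** n

rate = 1 / 10_000  # a hypothetical rare but serious side effect
for n in (1_000, 10_000, 1_000_000):
    print(f"n={n:>9,}  P(detect >= 1) = {p_at_least_one(rate, n):.3f}")
```

Cut the data set down and the rare signal simply disappears.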
So just from a sheer numbers point, we're losing something. There's also the issue of
the validity of the data after all of these
employers say, no, I want to keep my
information proprietary, especially when you’re
trying to do research on the impact of socioeconomic
status on health and health outcomes. Lastly, we’re losing a
key resource for gauging health status and needs. So for example,
researchers in Tennessee tried to see what the outcome
of using a particular antibiotic would be, and they found that
there were some serious health outcomes enough to
say you should not be using this antibiotic
to treat this condition. Danish researchers tried
to recreate the study using their equivalent of an APCD. And there, because of their
health care structure, it truly was universal
across all economic classes, and they didn’t
find this outcome. And they theorized it
was because the Tennessee researchers had used
Medicaid data only. Because of the confluence of
socioeconomic status and health and health outcomes, they were
only seeing the more vulnerable as opposed to the
general population. So hopefully at this point,
I’ve convinced you guys that having incomplete
data sets is actually a big problem for everybody and
not just for the kind of people who like to write
chapters in books titled Big Data and Health Care. Here, I want to suggest a
couple of potential solutions. And I referenced giving
this talk in 2016 at the Big Data Conference. And when I went back to my
slides to update it in 2018, I found that I needed
to do virtually no updating of
potential solutions, because nothing
really had happened. And I think this is, again,
the issue of sometimes an important development
happens in the law, but for whatever reason,
the public policy community isn’t as invested in
it, and so not much happens. So the biggest
potential solution, and the one that
Justice Breyer noted, was he said the Department of
Labor is responsible for ERISA. They should issue
regulations requiring the collection of this data. This did not happen. The Department of
Labor essentially did what I do when a friend that
I don’t really like texts me to ask if I want to
go out to dinner. They kind of said
yes, and let it die. So the Department
of Labor in 2016 issued proposed regulations
to collect this data, and there were problems
with those regulations. The data they wanted
was not as complete as the states were collecting. They said, let’s collect it
on an annual basis as opposed to a monthly basis. And a lot of people wrote in. There were about
300 comments, which is pretty good
for something that is a little bit
on the drier side, and then they did
absolutely nothing with it. They still have done
nothing with it. It is not on their
list of priorities. And I think they rightly say we
don’t really know health care. We don’t want to get in the
business of collecting health care data, and there just
isn’t the political appetite to do it. There are also some other
issues whether if the Department of Labor tries to
collect this data, are they able to
waive preemption and say, states just
collect it as you want. Would that open them
up to another round of follow-up litigation? The issue of if they
collect the types of data that fall under their
purview, which is just the self-employed plan, OK, then
they have an incomplete data set, and the states have
an incomplete data set, and can we necessarily hook
them up to create one whole one that is going to be as good
as if one entity had collected that data in the first place. And then, again, there’s just
the political appetite for it. It’s not their priority. Of course, private industry
has tried to step in here, and some health care payers
are volunteering their data. So Aetna, Humana,
United Health Care had put together a database
that some researchers have used. But Blue Cross Blue Shield,
which is a major insurer, has not volunteered their data. There’s also the issue
when you volunteer, you volunteer what you want,
and it might not necessarily be what’s needed. There is a group called
the APCD Council, which is trying to spearhead a common
layout to say all states should use this layout in
order to collect data, to make it more uniform, to
ease the administrative burden, and that’s a step in
the right direction. But without regulations
or real big push for people to adopt
that data sharing, that template remains
just a template. So I want to highlight that
saying voluntary contributions are the solution kind of stops
before you tweak it to make voluntary contributions
appealing enough to do on a widespread basis. Some things that we
need to think about are tax incentives to get
health insurers to do this. Incentives to third
party administrators who administer these
plans, again, to say we really should be
sharing our data. Offering incentives
to employers. But then we also need
to talk about what sort of changes to data privacy
regulations and liability regulations do we
need to have to allow these entities to disclose
information that might otherwise be tied up in
fears of violating HIPAA or be tied up in
non-disclosure agreements. And so I wish I could
end on a slide that said things we have accomplished
since 2016 to make sure that we have good
quality health care data. We haven’t. On the other hand, that
leaves an opportunity to really improve
this and make sure that we continue to have
the kind of big data sets that we need in order to do
some of the interesting work described in our book. [APPLAUSE] Hopefully, you got
something interesting out of that exchange. If you want to share what
you got out of that exchange, that would be great too. But we have the first
question, I think, over here. Adrian Gropper,
Patient Privacy Rights. I feel to some extent that I’m listening to a bunch of astronomers before Copernicus, wondering why the planets, these things in the sky, move the way they do, in the sense that both the law and everything else we’re talking about here assume that institutions are where all the data processing and all the information technology rest, and that individuals are not. Patients, people, do not have agency, do not control the technology, do not have ways of automating the authorization for how their data is used for all these wonderful purposes. And that, to me, seems like it’s
rapidly obsolete as a concept, because as Facebook and
everybody else aggregates these data sets in
various ways, we realize how valuable they are. Roche paid $2,000 per patient
for cancer data on a million patients earlier this year, about $2 billion in all. And yet Moore’s law is driving the cost of the technology down to the point where, compared to $2,000 a patient, the cost of controlling this data yourself is trivial. How do we get to this next
phase of technology and law? Who wants to take
the first stab? So I will say, I think Facebook
is a really good example, not only of the
dangers of what happens when we think that companies
and entities own people’s data as opposed to
people themselves, but also in terms of
the market forces. We all know Facebook has
some really terrible privacy practices. But how many of you are pulling
your Facebook and Instagram profiles and joining
some social network that advertises itself as
having better privacy practices? I think I’ve seen
studies indicating that people say they value privacy, but they have yet to act on it. And I think some of that
next stage of the data revolution you talk about
will really only happen when people start
saying, I will only see providers that
give me control and ownership of the data. Or I will only use
apps that explicitly give me ownership of the data. And I think you raise a
great point in the sense that I was focusing on the discrete situation of research, but when we’re talking about HIPAA, we’re talking about a constrained area within a universe in which a lot of commercial data sharing, and frankly a lot of exploitation, is taking place. And there are
definitely solutions that can happen there. And I think one framework
that Urs mentioned that is helpful to think about is this: how do we operate within the existing law, but also learn from the European experience in this wider realm of data privacy about the agency we are going to give individuals over how their information is used? We’ve got a question over here. And I gotta say before
you ask your question– I’m sure it’s going
to be excellent– but that outfit is outstanding. OK, go ahead. Thank you. I have a question for Ameet. You showed a very
interesting slide where there were several streams
of data: medical records, variables, claims, all of that. One of the things
that I was actually wondering about that slide is
where is medical error data? In the US, medical error is the
third leading cause of patient death. In 1999, the National
Academy of Sciences published a report noting
that around 98,000 people die in the US because
of medical error. And two years ago,
researchers from Johns Hopkins University School of Medicine actually updated the report, showing that the actual count is higher. So it’s not 98,000, but actually between 250,000 and 400,000 deaths because of medical error. So I mean, those
numbers are very big, and they really remind us
about how inefficient health care in the US is, and
also how patient death due to medical error is an
under-recognized epidemic. So how do you integrate
that within that slide? So again, excellent question. I think that one of the things
to recognize about that 98,000 figure, that Institute of Medicine report, and then this more recent Hopkins study, is that it took a lot of heat in the medical community, because a lot of it is based on extrapolations from very small samples. And the problem of
medical error is huge. We don’t honestly know
specifically how huge it is. And part of the problem with studying that question systematically using representative data is that we only have access to single streams of information. So some of that information
could come from EHR data, but you need to have EHR
data from hospitals that are academic medical centers that
have great EHR systems plus rural medical
centers that don’t. And how do you start doing this? As we get into a more digital age, we need to be
combine those data streams in an effective process. And part of the barrier is not
necessarily what HIPAA says, but the way that
people interpret HIPAA and the risk-averse nature
that a lot of people have. So I think it’s an
important recognition that medical error is one
of those types of things that could benefit greatly
from greater data sharing. Maybe I’ll just add
a point on this. So one error you did not make, but that people sometimes do make, is conflating medical error with medical malpractice. The best studies we have, by my old colleagues Michelle Mello and Dave Studdert, show that actually the
medical malpractice system does a terrible
job in both directions. It claims things are
medical error that are not medical error, and
it misses many things that are medical error, leaving no recourse available. And the other thing
I’ll say maybe is that even if you could
do this well and measure it through the big data, there’d
be a question about what you’d do with that data. For people who think that patients care about that data, I’ve got to say, the results we see from attempts to communicate care quality through report cards in states like New York and Pennsylvania are very upsetting if your view is, data wants to be free, empower the patient, dot, dot, dot. Turns out, patients tend to still prefer word of mouth over data even when presented with it. Now maybe that’s a
fault of ours in the way we’ve presented the
data in the past, but I would not rely
on patients to be the engine for change here. If anything, it’s likely to
be accreditors, hospitals, insurers, and the like. And just one follow-up point
to that is that one of the Vs we don’t mention in defining big data is veracity, and a lot of big data, frankly, is crap, and so we need to be
conscious of that. Please. So Professor Cohen
touched a little bit on undersampling of minorities in data sets. And I guess something
related to that is that a lot of the time when
machine learning algorithms are making predictions based on
data sets, it’s a black box. So it’s unclear what
it’s doing and how. And I’m sort of interested
in how the law is going to react to that element of AI. Maybe I’ll start, because we do have a five-year project on precision medicine, AI, and the law that’s exactly about black box medicine. Here are a couple of things that I think are interesting about it, though I think it’s quite complicated. One is actually on the
intellectual property side about the trade-offs between
trade secrecy on the one hand and patent on the other, and
difficulties because both European and US patent jurisprudence prevent people from patenting many of these algorithms, and therefore push them towards trade secrecy regimes, which are much less transparent. So that’s one issue. The other, I think, more
interesting question has to do with– well, maybe two more and then I’ll turn it over to someone else– the second one is about medical
education and the conception of the doctor’s own role. I’m not a physician. I have at least one
physician on the dais. But in a future where many of
the decisions and advice you’re giving are things that you
can’t explain in human language because it’s non-interpretable
machine learning, one wonders how one
trains a generation of physicians that will
interact with these systems. The last thing I’ll say
is just about liability. Very large amounts of
uncertainty about what happens when there’s an error. Does the device maker that
has one of these systems built into it have the liability? Does the hospital system have
it for adopting it without doing its due diligence? Does the physician have it? What we know about the
jurisprudence on computer decisioning, which is
probably the best analogy, is that physicians are basically
held to the standard of care they normally are and
have responsibility for this even when
it comes to a system that they’re pushed to use,
even when it comes to a system they don’t understand,
even when it comes to a system that’s built
into their workflow that’s quite difficult for them to override. So if you’re a
physician, I would say this is one of the things
that maybe should concern you about black box algorithms. Just one thought
I wanted to add. I really believe this
question of inclusion is one of the very big
challenges in health, but also more
generally as we talk about AI and the power of data
and algorithms going forward. And specifically here, as you mentioned in your presentations as well, I do believe this is really an ecosystem-level challenge. Some of the promise we’ve
heard is that we have Fitbits and we have all these
alternative sources that may give data points that
then can later be analyzed. But if you look at it, who has access to this technology? Who can actually afford to wear a Fitbit? And who has access in terms of institutions and hospitals, as in the example you just gave? We still have real digital
divides and participation gaps, and so I do feel there is
a deeper challenge not only looking at the big data side
and how to create inclusive data sets, but even more
so who has access to the basic technology
and under what conditions. And then lastly, there is
also a challenge in who is designing these systems. It’s a debate that we are
very much interested in more generally. Is it Silicon Valley now designing health applications, and what are the design biases of our friends on the West Coast? Is that a representative group of designers that thinks about underrepresented communities, for instance? So I really believe
the focus on data and data sets is hugely important. And there is also much progress being made toward more inclusive data sets, but that alone will not do the trick. This is really an ecosystem-level problem, from my perspective at least. So I will jump in to say that
Section 1557 of the Affordable Care Act was designed to try to bring anti-discrimination law specifically into the health care context. It was the first time it was
explicitly applied there. And now I think 1557 has
some interesting potential to reshape the way that
we distribute health care. I think it will probably take a change of administration and/or Congress, and/or several changes of administration, before somebody dusts it off (it is still technically on the books) and does something really creative with it. Having looked at trying to
bring discrimination claims in the context of
insurers’ benefit design, it’s really, really difficult. But there is a kernel
of an argument there. Just nobody has really been
able to effectively pull it out. And I think, at a certain point, you could try to pull it out in this context: if you’re developing algorithms that make medical decisions, and those algorithms are based on data sets that we know to be problematic from a race standpoint, then continuing forward
with that product may look discriminatory. But I would say that’s many
law review articles, many court cases, many clinic
students away from being a fully developed legal theory. So let me ask a question
to you, the panel, and this is kind of a general question. Look back at this conversation 10 years from now– I doubt you’ll remember 10 years from now that we sat here, but let’s just say you do– what do you think your big Homer Simpson d’oh moment is going to be? What is it that you’re going to think you got totally wrong about how the future was going to develop? Or what do you fear you might get wrong in terms of the way you’ve been thinking about this up until now– that the future might
be radically different than you think it’s going to be. So I will go first, and I
will sidestep the question a little bit to say– You can tell she
went to law school, avoiding the hypothetical. Go ahead. Yes. Exactly. I was taught by the best. I will say about 10 years
ago, I started law school. And I knew one person
who had an iPhone, and that was a big deal. That was her defining characteristic: oh, that’s the girl with the iPhone. And I had no idea
that in 10 years, we were all going to carry
around these pocket computers, and I was going to tell
it every little detail of my life through apps. And when you look
at HIPAA, I think some of the challenge of applying HIPAA is that it was written in 1996, way before there was even one girl with an iPhone. There was no one with an iPhone, and nobody could really
think about data in the sense of this personal access point. They thought about it in terms
of computers and offices. And so I think probably
in 10 years from now, we are going to look back and
be like, remember when everybody carried around phones, and you
didn’t just speak to the air or blink your eyes twice and
that sent data to something? So I think it’s going to be a
completely different paradigm. I guess one thing I’m a
little bit worried about is that if we don’t act to regulate the space outside HIPAA in terms of data privacy, and we don’t give more rights to individuals, then despite all the potential promise of data sharing, we may enter an atmosphere and a society that is very rigid in the way it wants to share data. And if we don’t
get ahead of the curve, we are not going to enter a world of the promise of big data, but rather one of the loss of potential that was once there. I think I have two
answers, and the answers contradict each other. On the one hand, I worry that somehow we are in a path-dependency mode where we are not really embracing the opportunity that is ahead to build better systems, to build better health care systems for more people, because of a lack of imagination and due to the constraints within the institutions that came before. So that’s one worry. And then I have the other worry,
which is almost the opposite, that the transformation
we’re going through will be so dramatic and so radical that we’ve vastly underestimated
how transformative it will be. And that the biggest challenge
will go back to your point, again: how do we preserve human autonomy and human agency as things become more institutional, more machine-driven, more machine-decided, and what does that do to us as human beings, and where are the safeguards for that? So this is a contradiction,
but two things I worry about. So Yogi Berra said it’s very
hard to make predictions, especially about the future. But I will say that
it’s much easier to have confidence
in your predictions when you have free beer in
your system or free wine. So I’m told we
should wrap it up, so that we can enjoy downstairs. But please join me in
thanking this amazing panel. [APPLAUSE]
