Guest Post: An Education in ‘Re-identification’: Learning From the Personal Genome Project

By LEILA JAMAL, ScM

Leila Jamal is a genetic counselor in pediatric neurology and a PhD student in Bioethics and Health Policy.  The views expressed here are hers and hers only.

As many are now aware, Latanya Sweeney and her colleagues at Harvard’s Data Privacy Lab recently published a study demonstrating the individual “re-identifiability” of research participants in the Personal Genome Project (PGP).  Despite misleading news coverage overstating the proportion of individuals Sweeney’s team re-identified (using self-reported birthdates, genders, and zip codes), the study has sparked some useful discussions about the implications of ‘re-identifiability’ for genomic research and the ethics of re-identification demonstrations.  For a comprehensive roundup of these issues, I recommend reading the series of perspectives corralled by Michelle N. Meyer in her Online Re-Identification Symposium over at the Petrie-Flom Center’s Bill of Health.  This post will not match the breadth and depth of insight covered by my friends and colleagues there.  My more modest aim is to contemplate what genomic researchers and counselors can learn from the ripple effects of Sweeney’s study.

The PGP is a demonstration project in its own right, with one of its goals being to “explore the opportunities, risks, and impacts of public genomics research”.  As clairvoyants who saw early on the pitfalls of guaranteeing anonymity to participants in whole-genome research, PGP founder George Church and colleagues developed a novel strategy for securing the trust of prospective participants by privileging the principle of “veracity” in their informed consent process.  Accordingly, the PGP informed consent form clearly tells prospective participants that any personal data they contribute to the PGP may be linked to their individual names.

By pursuing this strategy, the PGP nudged a shift in our thinking about the risks of genomic research.  The emphasis on veracity reflects a subordination of concern about risks to individuals posed by anonymity breaches in favor of concern about risks to genomic research posed by breaches of researcher-participant trust.  Since its inception in 2005, the PGP has reciprocated the openness of its participants by developing open-source research tools, hosting them at an annual education meeting, returning their individual research data, and keeping them abreast of the PGP’s activities with blog updates.

In light of the PGP’s emphasis on transparency and data-sharing, a key question raised in the aftermath of the Sweeney et al. study is whether participants had a “right” to be distressed – or even surprised – that their identities were (potentially) made public by a third-party demonstration project.  In a pair of symposium posts, Madeleine Ball and Misha Angrist stress that the possibility of individual participant identification from PGP data is explained thoroughly in the project’s informed consent form, pre-enrollment study guide, and ongoing correspondence with participants.  Their advice to anyone in the PGP with residual concerns about being identifiable? To refrain from sharing ‘sensitive’ data with the PGP, or to withdraw what data they can from the protocol altogether.

On the surface, these suggestions make complete sense and are consistent with the PGP’s fidelity to the principles of veracity and respect for autonomy.  Yet their bottom line makes me uncomfortable.  It reminds me of a recent meeting I attended where Johns Hopkins bioethicist Jeffrey Kahn spoke to a group of communications researchers about the ethical issues raised by using the Twitter API and other internet data in public health research.  Kahn’s suggestion that mining ‘anonymous’ Twitter data (which is stamped by time and location) for health-relevant content could be upsetting or even harmful to some Twitter users was met with a common rebuttal, loosely paraphrased as follows:  “If they don’t want it used, they shouldn’t have put it out there.”

To me, this sounded like the research ethics equivalent of being told I deserve to be catcalled for wearing a skirt in the street.

Obviously, the PGP is not trying to be the street, nor is it trying to be Twitter.  Given the PGP’s specific ethos and aims, some might argue that adopting a “we told you so” approach to informed consent is sufficient to advance the project’s research aims (though I suspect not, given my wholehearted faith in the PGP’s commitment to collecting reliable phenotype data and recruiting diverse participants, not to mention departing from the status quo in research ethics.)  To its credit, the PGP has welcomed the response to Sweeney’s re-identification demonstration as a teaching moment and is soliciting feedback about how to improve its communication with participants.  The PGP’s humility moves me to consider: What are the rest of us taking for granted about research participants’ long-term views regarding secondary uses of their personal data – ‘identifiable’ or otherwise?

In her own re-identification symposium post, Meyer highlights a number of concerns I share (in case I butcher them in what follows, I encourage readers to refer directly to her original words.)  Responding to Angrist’s question about why she remains in the PGP despite misgivings over Sweeney’s findings, Meyer draws an important distinction between a) assuming the risk of individual re-identification to advance biomedical research (which she authorized) and b) providing consent for third parties to use her data with the explicit goal of determining her identity (which she did not).  At the core of Meyer’s qualm is that “choosing to share personal information when asked is different than having that information taken from you without your permission or even knowledge” [emphasis mine].  Her point is that we shouldn’t have to choose ‘both’ or ‘neither’ to participate in genomic research.

The irony of this debate is that the PGP leadership has asked its discontents to withdraw data from the protocol to mitigate their concerns over the risks of being re-identified, when the breach Meyer refers to is one of trust and shared understanding about the purposes of donating her data to research.  Aren’t trust and understanding the very dimensions of the researcher-participant relationship the PGP seeks to preserve with its veracious approach to informed consent?  If so, this is a critical lesson for any of us involved in biomedical research at a time of impending (we think) regulatory reform.  If such misunderstandings can surface in a cohort of scientifically literate and motivated “genomic altruists” despite a rigorous informed consent process, what does this presage for other, less thoughtful research projects in an era of genomic identifiability?  It would suggest that reforming U.S. research ethics regulations to encourage the use of more ‘open’ informed consent protocols administered at a single time point would be insufficient to respect autonomy and voluntariness in research participation.  At best.

As a member of the ethics team for Genetic Alliance’s new Registry for All Diseases [Reg4All], I follow events like these with interest and concern.  Reg4All is committed to building an inclusive, accessible research repository while honoring heterogeneous privacy preferences and facilitating participants’ control over the aspects of data-sharing that matter to them.  Like the PGP, Reg4All will evolve in response to the engagement and feedback of research participants.  In order to listen to them, we must know who they are.  But once we do know, we must REALLY listen.
