Sharing sensitive data requires careful consideration, but it can be done. Find out how.
Sensitive data can be shared!
Sensitive data can be human data (e.g. health and personal data, secret or sacred practices) or ecological data (e.g. location data that may place vulnerable species at risk).
Given the nature of this type of data, you might expect that it can’t be shared and reused. But in many cases, it can be.
1. Explore one of these examples of published sensitive data:
- Remember we met the Pregnancy and Lifestyle study (PALS) dataset in Thing 3 Activity 2 as an example of open data? It shows how sensitive data can be safely de-identified and openly shared. Click on Pregnancy and Lifestyle study (PALS) and then “Go to Data Provider” to see the actual data.
- This one-page story tells how sensitive data from the Australian Longitudinal Study on Women’s Health has been successfully published for almost 20 years. Note that the data is available through conditional access, as introduced in Thing 3: Data sharing and discovery Activity 2.
2. How do you share and publish sensitive data?
- Browse through the ANDS sensitive data webpage.
- Click on the Publishing and sharing sensitive data flowchart to get an overview of issues and solutions.
- If you have time: follow a couple of the links on the sensitive data page which are of particular interest to you.
Consider: Imagine you are either a researcher or a participant in a health survey:
- Participant: what questions might you first ask the researcher about intended sharing and reuse of the survey data?
- Researcher: what responses would you need to prepare to anticipate participants’ questions about publishing “their data for all the world to see”?
De-identification of data
De-identification is a process that balances two needs: making the data safe to release and keeping it useful for reuse. When it is done well, the risk of disclosing information about individuals should be negligible.
- Explore this guide to anonymisation of medical data.
- Discover some tools and resources for information about de-identification of data.
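To make the balance between safe and useful data more concrete, here is a minimal Python sketch of two common de-identification steps: removing direct identifiers and coarsening quasi-identifiers such as age and postcode. It then reports the size of the smallest group of records that share the same quasi-identifier values (the "k" in k-anonymity). The records, field names and generalisation rules are all invented for illustration, and a real project would need to choose its own rules and assess re-identification risk more thoroughly than this.

```python
from collections import Counter

# Hypothetical survey records: every field name and value is invented.
RECORDS = [
    {"name": "A. Smith", "email": "a@example.org", "age": 34, "postcode": "2601", "smoker": True},
    {"name": "B. Jones", "email": "b@example.org", "age": 37, "postcode": "2602", "smoker": False},
    {"name": "C. Brown", "email": "c@example.org", "age": 31, "postcode": "2605", "smoker": False},
    {"name": "D. White", "email": "d@example.org", "age": 52, "postcode": "3000", "smoker": True},
    {"name": "E. Green", "email": "e@example.org", "age": 58, "postcode": "3052", "smoker": False},
]

# Assumption: these fields identify a person on their own and must be removed.
DIRECT_IDENTIFIERS = {"name", "email"}


def generalise(record):
    """Drop direct identifiers and coarsen age and postcode."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    decade = (record["age"] // 10) * 10
    out["age"] = f"{decade}-{decade + 9}"            # e.g. 34 becomes "30-39"
    out["postcode"] = record["postcode"][:2] + "**"  # e.g. "2601" becomes "26**"
    return out


def smallest_group(records, quasi_identifiers):
    """Size of the smallest group sharing the same quasi-identifier values."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(counts.values())


deidentified = [generalise(r) for r in RECORDS]
k = smallest_group(deidentified, ["age", "postcode"])
print(f"Smallest group size after generalisation: {k}")
```

In the raw records every combination of age and postcode is unique, so each person stands out. After generalisation each record shares its age band and postcode prefix with at least one other record. Coarser bands would raise the group size further but make the data less useful, which is the trade-off described above.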
Consider: are there any tools or resources you have come across that could help a researcher de-identify or anonymise their data?
Consent for data sharing
Informed consent is required from human participants before obtaining and publishing data. The best time to obtain consent is before the data are collected. Participants should, at a minimum, be informed about procedures for maintaining privacy and the conditions under which the data will be shared.
Explore one, or more, of the following consent forms that ask for permission to share research data:
- UK Data Archive sample consent form
- Global Alliance for Genomics and Health consent tools (halfway down the page; open and focus on Section C)
- Health Science Alliance Biobank Consent
Consider: would you be willing to sign that consent form?
If you have time, check out the Personal Genome Project’s Global Network Guidelines, which say that "risks of participant re-identification are addressed up front, as an integral part of the consent and enrollment process; neither anonymity nor confidentiality of participant identities or their data are promised to research participants". If you’re really keen, check out this article examining the issue in more depth.