Preserving Privacy in Human Genomic Data
Genome-wide association studies (GWAS) have received intensive attention owing to the rapid decrease in genotyping costs and their promise for genetic diagnostics. GWAS typically focus on associations between single-nucleotide polymorphisms (SNPs) and human traits such as common diseases. However, sharing de-identified raw data, or even only summary statistics, from GWAS can disclose private information about GWAS participants and potentially about other individuals whose genetic data are collected by organizations such as hospitals or gene banks.
In this project, we systematically study genetic privacy disclosure and develop rigorous privacy protection methods for two settings: when advanced analytic methods are applied to GWAS data, and when other types of data (e.g., phenotype data, expression data, reference alignment data, variant call format (VCF) data, PubMed) are linked to GWAS data. We study a range of privacy breach attacks and examine the types of background knowledge that attackers may collect and exploit.