Research Projects Dr. Wang has worked on

Preserving Privacy in Human Genomic Data

September 01, 2015

Genome-wide association studies (GWAS) have received intensive attention because of rapidly falling genotyping costs and their promising potential in genetic diagnostics. GWAS typically focus on associations between single-nucleotide polymorphisms (SNPs) and human traits such as common diseases. However, sharing de-identified raw data, or even only summary statistics, from GWAS can disclose private information about GWAS participants and potentially about other individuals whose genetic data are collected by organizations such as hospitals or gene banks.

SMASH: Semantic Mining of Activity, Social, and Health data

May 01, 2013

Two-thirds of the US population is now overweight or obese, which incurs significant health risks and financial costs to society. Traditionally, support groups and other social reinforcement approaches have been popular and effective in addressing unhealthy behaviors, including those associated with being overweight. Of the factors associated with sustained weight loss, one of the most important is continued intervention with frequent social contacts. Research on the design and implementation of the SMASH (Semantic Mining of Activity, Social, and Health data) system will address a critical need for data mining tools that help explain the influence of healthcare social networks, such as YesiWell, on sustained weight loss, where the data are multi-dimensional, temporal, semantically heterogeneous, and very sensitive. The system design and implementation rest on specific aims that include developing a novel data mining and statistical learning approach for understanding the key factors that enable the spread of healthy behaviors in a social network, and protecting the privacy of human subjects while social network and health data are mined. We consider enforcing differential privacy through a privacy-preserving analysis layer, and we develop novel solutions that preserve differential privacy when mining dynamic health data and social activities of human subjects.
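As a rough illustration of what a privacy-preserving analysis layer could do, the sketch below answers a counting query with the standard Laplace mechanism, one common way to obtain epsilon-differential privacy. It is not the SMASH implementation: the function names, the `steps` field, and the 10,000-step threshold are all assumptions made up for this example.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records that satisfy `predicate`.

    Adding or removing one person's record changes a count by at most 1,
    so the query has sensitivity 1 and Laplace noise with scale 1/epsilon
    gives epsilon-differential privacy for this single query.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical daily step counts; the field name is illustrative only.
    participants = [{"steps": s} for s in (3000, 12000, 8000, 15000)]
    noisy = private_count(participants, lambda r: r["steps"] >= 10000, epsilon=0.5)
    print(f"noisy number of active participants: {noisy:.2f}")
```

A smaller epsilon gives stronger privacy but a noisier answer, and every additional query released about the same individuals consumes more of the overall privacy budget.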

EAGER: Spectral Analysis for Fraud Detection in Large-scale Networks

September 01, 2011

This project takes a unified spectral transformation approach to the challenges of analyzing network topology and identifying fraud patterns in large-scale dynamic networks, combining spectral transformation of the data with network topology visualization. Large-scale social and communication networks contain rich topological information in addition to various structured, semi-structured, and unstructured data. The research characterizes patterns of various attacks in the spectral projection space of the graph topology and develops spectrum-based methods to identify these attacks. The approach, which exploits the spectral space of the underlying interaction structure of the network, is orthogonal to traditional approaches that use content profiling. The ability to perform this spectral analysis depends on the development of complex mathematical techniques. Critical issues being explored include the scalability of the methods to very large data sets, the determination of the dimensionality of the node representation in spectral space, and the interpretation of patterns in spectral space.
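To make the idea of a spectral projection space concrete, the sketch below maps each node of a small undirected graph to its coordinates on the k leading eigenvectors of the adjacency matrix. It is only an illustration under stated assumptions, not the project's method: the function name, the choice of the algebraically largest eigenvalues, and the use of plain power iteration with deflation are ours, and no fraud detection logic is included. It also assumes a connected graph with well-separated leading eigenvalues.

```python
import math


def spectral_coordinates(adj, k=2, iters=500):
    """Return, for each node, its coordinates on the k leading eigenvectors.

    `adj` is a symmetric 0/1 adjacency matrix given as a list of lists.
    Power iteration runs on adj + shift*I, which has the same eigenvectors
    as adj; with shift = maximum degree every eigenvalue is non-negative,
    so the iteration converges to the algebraically largest ones.
    """
    n = len(adj)
    shift = max(sum(row) for row in adj)
    vectors = []
    for j in range(k):
        # Deterministic, non-uniform start vector so runs are reproducible.
        v = [1.0 + ((i * (j + 1)) % 7) * 0.1 for i in range(n)]
        for _ in range(iters):
            w = [shift * v[i] + sum(adj[i][t] * v[t] for t in range(n))
                 for i in range(n)]
            # Deflation: remove components along eigenvectors already found.
            for u in vectors:
                dot = sum(a * b for a, b in zip(w, u))
                w = [a - dot * b for a, b in zip(w, u)]
            norm = math.sqrt(sum(a * a for a in w))
            v = [a / norm for a in w]
        vectors.append(v)
    return [[vec[i] for vec in vectors] for i in range(n)]


if __name__ == "__main__":
    # Two triangles (nodes 0-2 and 3-5) joined by the single edge 2-3.
    edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
    adj = [[0] * 6 for _ in range(6)]
    for a, b in edges:
        adj[a][b] = adj[b][a] = 1
    for node, coords in enumerate(spectral_coordinates(adj)):
        print(node, [round(c, 3) for c in coords])
```

In this toy graph the second coordinate takes opposite signs on the two triangles, so the group structure is visible directly in the projection. This dense pure-Python version costs O(n^2) per iteration and is meant only for small examples; the scalability and dimensionality questions raised above are exactly what it leaves open.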