Ph.D. Candidate: Armin Pourshafeie
Research Advisor: Carlos D. Bustamante
Date: Tuesday, February 18, 2020
Location: Hewlett 101
Title: Decentralized Data Analysis: Genome-Wide Association Studies and other biomedical applications.
The recent growth of data with potential medical ramifications has led to a better understanding of complex disease pathways and risk factors. As this medically relevant data expands, hosting entire datasets in a centralized location has become increasingly expensive and unsafe, and the personal nature of the data makes the risk even more substantial. Meta-analysis techniques often offer a feasible alternative; however, they can introduce bias or may not be applicable in certain cases (e.g., small sample sizes). Federated learning can be used to combine some of the benefits of both data centralization and meta-analysis.
I will describe how federated learning algorithms can be used with multi-party computation or partially homomorphic encryption techniques to perform regression-based association analysis while improving data privacy. I will show that these techniques are practical in many low-dimensional settings and that some even generalize to computationally intensive tasks such as genome-wide association studies (GWAS) at consortium scale. Finally, I will introduce our federated GWAS platform, HyDRA.