Current Funding

Community Data Hub for Integrative Visualization

National Institutes of Health - Common Fund

NIH/OD Federal U24 OD038421

September 2024 - August 2027

Role: Principal Investigator

To develop a transformative cloud-based visualization hub to enable deep, interactive exploration of multimodal biomedical data, using large omics and imaging datasets sourced from CFDE DCCs and external consortia.

Accelerating Discovery with AI and Grammar-based Visual Exploration Interfaces for Biomedical Data Repositories

Advanced Research Projects Agency - Health

ARPA-H AY2AX000028

August 2024 - August 2027

Role: Principal Investigator

To build tools that enable data consumers across diverse audiences to describe their discovery needs through a range of input modalities, including text and speech prompts; to query biomedical data resources using knowledge-driven LLMs that construct discovery interfaces based on those needs; and to visualize and integrate retrieved datasets as part of the discovery process.

Data Analysis Center for Somatic Mosaicism Across Human Tissues Network

NIH/NIDA Federal

UM1 DA058230

May 2023 - April 2028

Role: Co-Investigator

To operate the NIH SMaHT consortium data coordination center, including data analysis, curation and visualization in the SMaHT Data Portal.

KPMP Kidney Mapping and Atlas Project (KMAP)

National Institutes of Health - NIDDK

University of Michigan (NIH) Federal U01 DK133090

September 2022 - June 2027

Role: Co-Principal Investigator

To lead visualization-related research and software implementation activities of the proposed KPMP Tissue Atlas Coordinating Center (KTACC).

Grammar-Driven Genomic Data Visualization

National Institutes of Health - NHGRI

NIH/NHGRI Federal 1R01HG011773

June 2022 - March 2026

Role: Principal Investigator

To create a grammar, recommendation system, and design tools to democratize interactive genomic data visualization.

Cellular Senescence Network (SenNet) Consortium Organization and Data Coordinating Center (CODCC)

National Institutes of Health - NCI

University of Pittsburgh (NIH) Federal U24 CA268108

September 2021 - August 2026

Role: Co-Principal Investigator

To contribute data visualization tools for spatial single-cell data to the SenNet Data Portal.

Biomedical Informatics and Data Science Research Training Program (BIRT)

National Institutes of Health - NLM

NIH/NLM Training Grant T15 LM007092

July 1992 - June 2027

Role: Principal Investigator

To further develop the Harvard Biomedical Informatics and Data Science Research Training (BIRT) program and to contribute to the cadre of highly trained independent and successful researchers in the field of biomedical informatics.

Past Funding

Integrative Visualization of Spatiotemporal Tumor Atlases

National Institutes of Health - NCI

NIH/NCI Federal R33 CA263666

September 2021 - August 2025

Role: Principal Investigator

To develop novel visualization tools for the interpretation of spatiotemporal data from tumor atlas projects.

Data Exploration and Visualization Tools for HuBMAP and a Human Reference Atlas

National Institutes of Health - Common Fund

NIH/OD Federal OT2 OD033758

August 2022 - July 2025

Role: Principal Investigator

To develop the Tools Component for the Human BioMolecular Atlas Program Integration, Visualization & Engagement (HIVE) Collaboration.

Human Cell Annotation Platform Fund

Broad Institute Research Agreement

500584-5500003090

July 2024 - June 2025

Role: Principal Investigator

To operate and extend the Cell Annotation Platform with additional features for curation, community annotation, and integration of single-cell data.

4D Nucleome Center for 3D Structure and Physics of the Genome (Phase 2)

National Institutes of Health - Common Fund

UMass Chan Medical School (NIH) Federal UM1 HG011536

June 2020 - June 2025

Role: Co-Investigator

To determine how the human genome is folded inside the cell nucleus, and how this organization changes as cells go through important transitions during early differentiation and during aging.

Cell Annotation Platform (CAP)

The Broad Institute (Schmidt Futures) Gift

520-364499

June 2020 - March 2025

Role: Principal Investigator

To create the Cell Annotation Platform web application for community-based creation of cell atlases in collaboration with the Human Cell Atlas consortium.

4D Nucleome Data Coordination and Integration Center (Phase 1)

National Institutes of Health - Common Fund

NIH/NCI Federal U01 CA200059

September 2015 - August 2020

Role: Co-Investigator

To develop a platform for data collection, curation, visualization, and analysis of high-order chromatin interaction data.

QuBBD - Statistical & Visualization Methods for PGHD to Enable Precision Medicine

National Institutes of Health (NIBIB) & National Science Foundation

NIH 1R01EB025024

September 2017 - June 2020

Role: Co-Investigator

The proposed research is relevant to public health because it seeks to improve symptoms for patients with inflammatory bowel diseases, which are chronic, lifelong conditions with waxing and waning symptoms. Developing novel statistical and visualization methods to provide a more nuanced understanding of how physical activity and sleep relate to disease activity is relevant to BD2K's mission.

Visualization of (Epi)Genomic Data for Discovery of Disease-Associated Variants

National Institutes of Health - NHGRI

NIH R00HG007583

August 2015 - August 2018

Role: Principal Investigator

This proposal combines an extensive mentored training program for the PI with a research project that aims to develop novel approaches for visualization and exploration that will accelerate the identification and validation of disease-associated variants in large and complex genomics and epigenomics data sets. An increasing number of such variants are discovered in studies that generate and analyze a wide range of molecular data types for thousands of patients or samples. This progress is enabled by the availability of computational analysis pipelines that employ sophisticated statistical methods for next-generation sequencing (NGS) data. Interpretation of analysis results by biological and clinical domain experts, however, is emerging as a major bottleneck due to the amount and complexity of the pipeline outputs. To address this, we propose to develop interactive visualization methods and a web-based infrastructure that will enable domain experts to identify disease-associated variants in large (epi)genomic data sets through visual exploration of computational predictions and the underlying data. This will have a significant impact on the rate at which predictions can be verified, interpreted, and translated into clinically actionable findings. Our first priority is the design of methods and tools to visualize (epi)genomic data in a range of different contexts, for instance by grouping and representing features based on their function, chromatin state, transcriptional activity, or genomic coordinates. We will also develop new non-linear genome representations to compare structural variants across genomes, complementing the functionality of the highly successful genome browsers. We will then investigate how information external to the primary data, for instance from other studies, drug target databases, or biomarker databases, can be applied to guide investigators through the data set.
Finally, we will implement a web-based exploration system for biological and clinical domain experts that combines our interactive visualizations with large-scale public (epi)genomic data sets. The methods and tools developed under this proposal will be generally applicable, and driving biological examples are chosen from The Cancer Genome Atlas (TCGA) and the Encyclopedia of DNA Elements (ENCODE and modENCODE).

Patient Centered Information Commons

National Institutes of Health - NHGRI

NIH 5U54HG007963

Role: Co-Investigator

Center for Stem Cell Bioinformatics

Harvard Stem Cell Institute

August 2014 - August 2018

Role: Principal Investigator

Visualization of (Epi)Genomic Data for Discovery of Disease-Associated Variants

National Institutes of Health - NHGRI

NIH K99HG007583

January 2014 - August 2015

Role: Principal Investigator

This proposal combines an extensive mentored training program for the PI with a research project that aims to develop novel approaches for visualization and exploration that will accelerate the identification and validation of disease-associated variants in large and complex genomics and epigenomics data sets. An increasing number of such variants are discovered in studies that generate and analyze a wide range of molecular data types for thousands of patients or samples. This progress is enabled by the availability of computational analysis pipelines that employ sophisticated statistical methods for next-generation sequencing (NGS) data. Interpretation of analysis results by biological and clinical domain experts, however, is emerging as a major bottleneck due to the amount and complexity of the pipeline outputs. To address this, we propose to develop interactive visualization methods and a web-based infrastructure that will enable domain experts to identify disease-associated variants in large (epi)genomic data sets through visual exploration of computational predictions and the underlying data. This will have a significant impact on the rate at which predictions can be verified, interpreted, and translated into clinically actionable findings. Our first priority is the design of methods and tools to visualize (epi)genomic data in a range of different contexts, for instance by grouping and representing features based on their function, chromatin state, transcriptional activity, or genomic coordinates. We will also develop new non-linear genome representations to compare structural variants across genomes, complementing the functionality of the highly successful genome browsers. We will then investigate how information external to the primary data, for instance from other studies, drug target databases, or biomarker databases, can be applied to guide investigators through the data set.
Finally, we will implement a web-based exploration system for biological and clinical domain experts that combines our interactive visualizations with large-scale public (epi)genomic data sets. The methods and tools developed under this proposal will be generally applicable, and driving biological examples are chosen from The Cancer Genome Atlas (TCGA) and the Encyclopedia of DNA Elements (ENCODE and modENCODE).