All of Us + AnVIL Imputation Service

Imputation can help complete your datasets efficiently and accurately

Our imputation service leverages Terra and uses a large and diverse reference panel that combines genomes from both the All of Us Research Program and AnVIL Centers for Common Disease Genomics.

Visit the All of Us + AnVIL Imputation Service Portal

Why use this imputation service?

No cost during Beta

NIH funding provides users with 2,500 samples for free during the Beta release. After Beta, the service will shift to a paid model based on sample size.

Large, high-quality dataset

515K+ genomes from All of Us and AnVIL.

Secure service

Our service is FedRAMP Moderate and approved by the NIH for use with controlled-access data. View our security documentation here.

Diverse reference panel

Representation across genetic ancestries for more accurate results. Learn more about the reference panel.

The All of Us + AnVIL dataset contains 515,000+ diverse genomes

515,000+ total genomes from All of Us + AnVIL

NUMBER OF GENOMES   GENETIC ANCESTRY        PERCENTAGE*

254,416             European                49%
101,982             African                 20%
90,553              Americas                18%
13,226              East Asian              3%
9,710               South Asian             2%
1,065               Middle Eastern          0.2%
44,627              Remaining Individuals   9%
* Based on computed genetic ancestry on a combined dataset derived from the All of Us Curated Data Repository v8 release and AnVIL Centers for Common Disease Genomics.
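
As a quick sanity check on the figures above, the percentages can be recomputed from the listed genome counts. The sketch below is illustrative only: the variable and function names are our own, and it assumes the seven listed groups account for every genome in the combined dataset.

```python
# Genome counts per computed genetic ancestry group, copied from the table above.
counts = {
    "European": 254_416,
    "African": 101_982,
    "Americas": 90_553,
    "East Asian": 13_226,
    "South Asian": 9_710,
    "Middle Eastern": 1_065,
    "Remaining Individuals": 44_627,
}

# Assumption: the listed groups cover the whole combined dataset.
total = sum(counts.values())


def share(n, total):
    """Return n as a percentage of total."""
    return 100 * n / total


shares = {group: share(n, total) for group, n in counts.items()}

print(f"Total genomes: {total:,}")
for group, pct in shares.items():
    print(f"{group:<22} {counts[group]:>8,} {pct:5.1f}%")
```

Under that assumption the counts sum to 515,579, consistent with the "515,000+" headline, and each share rounds to the percentage shown in the table.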

How does the imputation service work?

Learn about the steps needed to use the imputation service for your research project.

Create an account

Create a Terra account to get started.

Pick your preferred method

Visit our web interface or install our command-line tool in your preferred environment.

Upload your data and launch

Upload your data to our secure environment and select parameters for your specific analysis.

Download your results

Download your results and review them.

Run analyses on the imputed dataset

Optionally, continue your analysis in Terra, where we provide genome-wide analysis and polygenic risk score pipelines for downstream analysis of the imputed dataset.

Documentation

This imputation service from the Broad Institute’s Data Sciences Platform was made possible with support from the National Institutes of Health’s All of Us Research Program, the National Human Genome Research Institute, and the Broad Institute.

This work was made possible by the following National Institutes of Health (NIH) awards: (1) OT2OD035404, "All of Us Data and Research Center (DRC)"; (2) OT2OD03821, "Broad-Color: The Genome Center for the Future of All of Us"; (3) OT2OD002750, "The Broad-LMM-Color Genome Center for All of Us," funded by the NIH Office of the Director; and (4) U24HG010262, "AnVIL: A National Resource for Genomic Data Analysis and Visualization," funded by the National Human Genome Research Institute.

All of Us and the All of Us logo are registered trademarks of the U.S. Department of Health and Human Services.

The inclusion of trade names, logos, trademarks, and references to outside entities does not constitute or imply an endorsement by any Federal entity.

Scientific acknowledgments