Generation and Utilization of Quality Control Set 90pct10dp on OQFE Data

Last updated 2 years ago

Single unfiltered multi-sample VCF files were provided for all UK Biobank whole exome sequencing (WES) releases (200k, 300k, 450k, and the final exome release). To help researchers generate a quality-controlled data set for genotype-phenotype association analyses, a “90pct10dp” QC filter was applied to all UK Biobank aggregate data sets. The filter was derived from analyses of the UK Biobank 200k data release.

This filter retains only variant sites at which at least 90% of the genotypes have DP>10. The filtered data sets are provided in the “helper_files” folders of all UK Biobank WES releases. For details on the analysis and other considerations, please refer to the UKB-RAP documentation: "Details on Processing Whole Exome Datasets to Generate the Quality Control Set".
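The site-level rule described above can be illustrated with a minimal Python sketch. This is not the pipeline used to produce the files in “helper_files”; it only shows the arithmetic of the rule as stated here. The function names `sample_depths` and `passes_90pct10dp` are hypothetical, the sketch assumes a tab-separated VCF data line with a per-sample `DP` entry in the FORMAT column, and it assumes that a genotype with a missing depth counts as not meeting the threshold, which the text above does not specify.

```python
def sample_depths(vcf_line):
    """Extract per-sample DP values from one tab-separated VCF data line.

    Returns a list with one entry per sample: an int, or None when DP is
    absent or missing (".").
    """
    fields = vcf_line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")   # column 9 is FORMAT
    samples = fields[9:]                 # columns 10+ are the samples
    if "DP" not in format_keys:
        return [None] * len(samples)
    dp_index = format_keys.index("DP")
    depths = []
    for sample in samples:
        values = sample.split(":")
        raw = values[dp_index] if dp_index < len(values) else "."
        depths.append(int(raw) if raw.isdigit() else None)
    return depths


def passes_90pct10dp(depths, min_depth=10, min_fraction=0.90):
    """Return True when at least min_fraction of genotypes have DP > min_depth.

    Assumption: genotypes with a missing depth (None) stay in the
    denominator and count as failing.
    """
    if not depths:
        return False
    n_pass = sum(1 for dp in depths if dp is not None and dp > min_depth)
    return n_pass / len(depths) >= min_fraction
```

The comparison is strict (`DP > 10`) because that is how the rule is written above; both thresholds are exposed as parameters so the sketch can be adapted if the detailed documentation defines them differently.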

Details on Processing Whole Exome Datasets to Generate the Quality Control Set