500k WGS FAQ

This FAQ addresses questions related to the new data dispensing functionality that allows users to select which elements of the data to dispense. If you would like more information on the new 500k WGS data release, visit the UK Biobank FAQ.

How can I follow the status regarding platform maintenance?

You can subscribe at https://status.dnanexus.com/

Can I “refresh” existing projects to get the 500k WGS data?

The refresh feature is currently disabled so that as many users as possible can get access to the new data quickly via dispensal.

We recommend that you dispense a new project to get the 500k WGS data, and migrate your data analysis workflows from existing projects to the new project. We will re-enable the “refresh” feature in the future and send a notification once it is available.

How many projects can I dispense data to?

We recommend that each research application dispense data to only one project, out of consideration for other researchers who would like to access the data.

How long will the dispensal process take?

Once your project starts dispensing, each dispense request takes about 4-8 hours. However, because of the size of the 500k WGS data and the large number of people requesting it, your request may wait in the queue for a long time before dispensal starts. Please do not dispense more than one project.

I created a project but it's stuck at "0%".

Your request to dispense data may be queued behind requests from other users. The system services requests in the order they are received. We appreciate your patience during that time. For more information, see https://dnanexus.gitbook.io/uk-biobank-rap/frequently-asked-questions#i-created-a-project-but-its-stuck-at-0-

How do I select what data I’d like to dispense?

You will need to create a new project to access the data; an existing project cannot be refreshed. The project creation screen now includes a section listing the data types available to dispense. For a faster dispensal, select only the data you need. You can dispense the data at project creation or later from the project settings of the new project.

What data should I select for dispensal?

  1. If you are interested in accessing the updated phenotypic, health care and proteomics data, select structured tabular data. This option is selected by default, but can be unselected if the data is not necessary for your project.

  2. If you are interested in accessing the updated imaging data or the population-level WGS pVCF data, select unstructured bulk data files. This option dispenses the population-level WGS pVCF data (600,000 files), but not individual-level WGS data such as CRAM or gVCF files. This choice was made to streamline the new project experience for all users. If your research requires access to the individual-level WGS data (18 million files), return to the project once the initial dispensing is completed and request an additional dispensing of these data files.

    1. Because of the size of the individual-level dispensal, we recommend waiting until demand for the WGS data has decreased.

How do I dispense the individual-level data?

Because of the size of the data (18 million files), we recommend waiting until demand for the WGS data has decreased. If your research requires access to the individual-level WGS data, you will need to request "Additional Bulk Data Files" after your first dispensal request has completed. You can make the request in your project settings by selecting the “Dispense More Data” button.

Can I create a project without having to dispense data?

You can create an empty project without dispensing data by deselecting both checkboxes on the project creation screen.

What fields & data require the “Dispense More Data” step?

See the table below for details.

Where can I find the population-level files from the 500k WGS release after my dispensal is completed?

They can be found at the two locations below:

  1. /Bulk/GATK and GraphTyper WGS/GraphTyper population level WGS variants, pVCF format [500k release]/

  2. /Bulk/DRAGEN WGS/DRAGEN population level WGS variants, pVCF format [500k release]
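
Both folder names contain spaces, a comma, and square brackets, so they need quoting when used on a command line. The following is a minimal sketch, not an official procedure: it only builds shell-safe `dx ls` commands for the two folders listed above. It assumes the dx-toolkit command-line client is installed, that you are logged in, and that the dispensed project is your currently selected project. The helper name `dx_ls_command` is hypothetical and introduced only for illustration.

```python
import shlex

# Folder paths copied verbatim from the list above.
POPULATION_LEVEL_FOLDERS = [
    "/Bulk/GATK and GraphTyper WGS/GraphTyper population level WGS variants, pVCF format [500k release]/",
    "/Bulk/DRAGEN WGS/DRAGEN population level WGS variants, pVCF format [500k release]",
]


def dx_ls_command(folder: str) -> str:
    """Return a shell-safe `dx ls` command for one project folder (hypothetical helper)."""
    # shlex.quote wraps the path in single quotes so the shell passes it to
    # dx as one argument despite the spaces, comma, and brackets.
    return "dx ls " + shlex.quote(folder)


# One command per folder; nothing is executed here.
commands = [dx_ls_command(folder) for folder in POPULATION_LEVEL_FOLDERS]

if __name__ == "__main__":
    for command in commands:
        print(command)
```

Running the script prints two commands that can be pasted into a terminal. Each folder holds a large number of files, so the listing itself may take some time.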
