Accessing Phenotypic Data as a File
Learn how to export selected phenotypic fields into a TSV or CSV file, for easy browsing and analysis.
If you've worked with UK Biobank data prior to using the Research Analysis Platform, you may be aware that UK Biobank distributes the main tabular dataset in a large encoded file with the extension .enc_ukb. To work with the dataset, you first convert this file to TSV or CSV format.
On the Research Analysis Platform, this dataset is dispensed into your project as a Spark SQL database, in Parquet format. You can access this database from within a Spark environment, for example by querying it from inside a Spark JupyterLab session.
If you have existing code that relies on reading just a handful of fields from a file, you may find it easier to extract those fields from the Spark SQL database and dump them into a TSV or CSV file. You can then run your code on that file, or otherwise work with it, without needing a Spark environment.
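As an illustration of the kind of file-based code this workflow supports, here is a minimal sketch that reads only a few named columns from a tab-separated file, using the Python standard library. The helper name read_fields, the column names, and the in-memory sample rows are assumptions made for the example; they stand in for a real exported file.

```python
import csv
import io


def read_fields(handle, wanted, delimiter="\t"):
    """Return one dict per row, keeping only the columns named in `wanted`."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    available = reader.fieldnames or []
    missing = [name for name in wanted if name not in available]
    if missing:
        raise KeyError(f"columns not found in file: {missing}")
    return [{name: row[name] for name in wanted} for row in reader]


# Hypothetical sample standing in for an exported TSV file.
sample = io.StringIO(
    "eid\t31-0.0\t21022-0.0\n"
    "1000001\t0\t54\n"
    "1000002\t1\t61\n"
)

rows = read_fields(sample, ["eid", "31-0.0"])
print(rows)
```

To run the same function on a real file, you would pass an open file handle instead of the in-memory sample, for example one obtained with open(path, newline="").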

Selecting Fields of Interest in the Cohort Browser

Start by navigating to your project and clicking on the name of the dispensed dataset. The Cohort Browser will launch.
In the Cohort Browser, open the Data Preview tab.
Click the "grid" icon at the right end of the Participant ID header row. Then click Add Columns. The Add Columns to Table dialog will open.
Navigate to any field, either directly or via search. Once you've found the field you're looking for, click Add as Column.
Continue locating the fields you're interested in and adding them as columns. As you add more fields, you do not have to wait for the Data Preview to finish loading.
Once you've finished, close the dialog by clicking the X to the right of the Add Columns to Table title. In the Data Preview tab, you'll see the first few rows of the data.
In the upper right corner of the screen, click Views, then click Save View. Enter a name for the view, then save it.

Creating a TSV or CSV File Using Table Exporter

Now convert your saved view into a TSV or CSV file, using the Table Exporter app.
Navigate back to your project and click the Start Analysis button in the upper right corner of the screen. In the Start New Analysis dialog, select the Table Exporter app, then click Run Selected. If this is the first time you've run Table Exporter, you'll be prompted to install it before continuing.

Selecting an Input

Within the Table Exporter app, open the Analysis Inputs tab on the right side of the screen. Then click the Dataset or Cohort or Dashboard tile.
A modal window will open. Select the view that you created and saved in the Cohort Browser.

Configuring Output Options

Within the Options section, configure your output options.
In the Output File Name field, enter a filename prefix. In the Output File Format field, select "CSV" or "TSV." You may find it easier to work with a TSV file downstream, because the values in certain fields contain commas, which complicates parsing a CSV file.
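The following minimal sketch shows why commas inside values can make CSV harder to handle than TSV. It uses Python's csv module to write one hypothetical row both ways; the row contents are invented, and the quoting shown is that of Python's csv writer, which is only assumed to resemble what an exported CSV file contains.

```python
import csv
import io

# Hypothetical row: a participant ID and a text value that contains a comma.
row = ["1000001", "Bread, white"]


def write_row(values, delimiter):
    """Serialize one row with the given delimiter and return it as a string."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow(values)
    return buffer.getvalue().rstrip("\n")


csv_line = write_row(row, ",")
tsv_line = write_row(row, "\t")

# Naive splitting breaks the CSV line into three pieces but leaves the TSV intact.
naive_csv = csv_line.split(",")
naive_tsv = tsv_line.split("\t")

# A real CSV parser honors the quotes and recovers the original two values.
proper_csv = next(csv.reader([csv_line]))

print(naive_csv)
print(naive_tsv)
print(proper_csv)
```

Code that splits lines on the delimiter directly works on the TSV line but not on the CSV line, whereas a quote-aware parser handles both.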
In the Coding Option field, select "RAW" so that you can work with the original UK Biobank data, as UK Biobank itself would supply them. (For example, in the Sex field, you will see the coded value "0" rather than "Female.")
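If you later need readable labels for raw coded values, you can apply a lookup table in your own code. The sketch below is illustrative only: the mapping of "0" to "Female" comes from the example above, while the "1" to "Male" entry and the helper name decode are assumptions that you should check against the UK Biobank data-coding for the field.

```python
# Illustrative lookup for the Sex field. "0" -> "Female" is taken from the text;
# "1" -> "Male" is an assumption to verify against the UK Biobank data-coding.
SEX_LABELS = {"0": "Female", "1": "Male"}


def decode(value, labels):
    """Translate a raw coded value to its label; leave unknown or empty values as-is."""
    return labels.get(value, value)


# Hypothetical raw values as they might appear in one column of an exported file.
raw_values = ["0", "1", "", "0"]
decoded = [decode(value, SEX_LABELS) for value in raw_values]
print(decoded)
```

Passing unknown and empty values through unchanged keeps missing data visible, so nothing is silently relabeled.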
In the Header Style field, select "UKB-FORMAT" to get headers that match the original UK Biobank format (e.g. 123-4.5).
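Headers in this style can be split into their numeric parts when you need to group or filter columns. The sketch below assumes that the three numbers denote the field ID, the instance, and the array index, following UK Biobank's usual column naming; the names FieldHeader and parse_header are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class FieldHeader(NamedTuple):
    field_id: int
    instance: int
    array_index: int


# Matches headers of the form <field>-<instance>.<array>, e.g. "123-4.5".
_HEADER_PATTERN = re.compile(r"^(\d+)-(\d+)\.(\d+)$")


def parse_header(header: str) -> Optional[FieldHeader]:
    """Parse a UKB-format header; return None for columns that do not match."""
    match = _HEADER_PATTERN.match(header)
    if match is None:
        return None
    return FieldHeader(*(int(part) for part in match.groups()))


print(parse_header("123-4.5"))
print(parse_header("eid"))
```

Columns that do not follow the pattern, such as an identifier column, come back as None rather than raising an error, so you can run the function over a whole header row.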

Launching the Table Exporter App and Viewing the Converted File

Click Start Analysis. Once the conversion finishes and the file is ready, you will be notified via email. To access the file, either return to your project, or click the link in the email.