Ultima Genomics | Germline WGS

How to Analyze Ultima Germline WGS Data

Analysis Pipeline for Ultima Germline Whole Genome Sequencing

Secondary Analysis Steps for Germline WGS Data:

Trimming
The Ultima Trimmer software tool divides each read into segments, trims the Ultima adapter sequence, and keeps the Ultima barcode and insert sequence for downstream processing.
Alignment and Tagging
Reads are tagged (Ultima barcode and insert), and the insert sequence is aligned to a reference. For more information on alignment, see the Ultima Aligner (UA) GitHub page.
Demultiplex
Data output is sorted, per sample, into folders based on the Ultima barcodes.
Sample QC
A QC file labeled “<<run filename>>.selfSM.contamination_stats.csv” is generated containing potential contamination or sample swap information (based on verifyBamID).
File Output
Read data is output as a CRAM file ready for variant calling using the Ultima germline variant caller.
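
To illustrate the demultiplexing step above, here is a minimal Python sketch of grouping reads into per-sample folders by barcode. It is not the Ultima implementation: the function name, the (read_id, barcode) representation, the sample-sheet dictionary, and the "undetermined" folder are all assumptions made for illustration.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def plan_demultiplex(reads, barcode_to_sample, out_root="demux"):
    """Group read IDs into per-sample output folders keyed by barcode.

    reads: iterable of (read_id, barcode) pairs (hypothetical representation).
    barcode_to_sample: dict mapping barcode -> sample name (hypothetical sample sheet).
    Reads whose barcode is not in the sheet go to an 'undetermined' folder.
    """
    folders = defaultdict(list)
    for read_id, barcode in reads:
        sample = barcode_to_sample.get(barcode, "undetermined")
        folders[str(PurePosixPath(out_root) / sample)].append(read_id)
    return dict(folders)


if __name__ == "__main__":
    example_reads = [("r1", "AAA"), ("r2", "CCC"), ("r3", "AAA"), ("r4", "GGG")]
    sample_sheet = {"AAA": "sample1", "CCC": "sample2"}
    for folder, read_ids in plan_demultiplex(example_reads, sample_sheet).items():
        print(folder, read_ids)
```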

Running Ultima analysis pipelines on AWS HealthOmics

Ultima Genomics offers pipelines as Ready2Run workflows on AWS HealthOmics. Ready2Run workflows enable you to run these pipelines on AWS HealthOmics by simply loading your data. For more flexibility, such as the use of larger file sizes or changing the reference genome, you can convert Ready2Run workflows to private workflows.
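
As a rough sketch of what starting a Ready2Run run could look like programmatically, the Python below builds a request for the boto3 `omics` client's `start_run` call. The workflow ID, role ARN, S3 URIs, parameter names, and region are placeholders rather than values from Ultima's documentation; consult the workflow's own documentation for the real inputs.

```python
def build_start_run_request(workflow_id, role_arn, output_uri, parameters, run_name):
    """Assemble keyword arguments for the HealthOmics StartRun API.

    All argument values are supplied by the caller; nothing here is
    specific to a particular Ultima workflow.
    """
    return {
        "workflowId": workflow_id,
        "workflowType": "READY2RUN",  # use "PRIVATE" after converting to a private workflow
        "roleArn": role_arn,
        "name": run_name,
        "outputUri": output_uri,
        "parameters": parameters,
    }


def start_run(request, region_name="us-east-1"):
    """Submit the run. Requires boto3 and AWS credentials (region is an assumption)."""
    import boto3  # imported lazily so the request builder works without AWS access

    client = boto3.client("omics", region_name=region_name)
    return client.start_run(**request)


if __name__ == "__main__":
    request = build_start_run_request(
        workflow_id="1234567",  # placeholder ID
        role_arn="arn:aws:iam::111122223333:role/ExampleOmicsRole",  # placeholder
        output_uri="s3://example-bucket/outputs/",  # placeholder
        parameters={"example_input": "s3://example-bucket/sample.cram"},  # hypothetical name
        run_name="example-germline-run",
    )
    print(request)
```

Keeping the request builder separate from the submission call makes it possible to inspect or log the request before anything is sent to AWS.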

To get started, visit the Ultima Genomics repository of workflows compatible with AWS HealthOmics: https://github.com/Ultimagen/healthomics-workflows

Each Ready2Run workflow folder contains the following:

Required WDL file(s)
“How To” documentation that details the workflow and how to run it outside of WDL
Documentation of the WDL inputs and outputs
.json file that lists the parameters for the workflow
Folder with optional input templates with default parameters for the WDL
Folder containing tasks the WDL is running
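
One way to work with the input templates listed above is to load a template and override only the values specific to your sample. The sketch below assumes the template is a flat JSON object; the key names shown (`ExampleWorkflow.input_cram`, `ExampleWorkflow.base_file_name`) are hypothetical and do not come from any actual Ultima template.

```python
import json


def apply_overrides(template_text, overrides):
    """Return template inputs with caller-supplied overrides applied.

    Rejects override keys that are absent from the template, which helps
    catch misspelled parameter names (a design choice of this sketch).
    """
    inputs = json.loads(template_text)
    unknown = sorted(set(overrides) - set(inputs))
    if unknown:
        raise KeyError(f"parameters not present in template: {unknown}")
    inputs.update(overrides)
    return inputs


if __name__ == "__main__":
    # Hypothetical template content, for illustration only.
    template = json.dumps({
        "ExampleWorkflow.input_cram": "<path to CRAM>",
        "ExampleWorkflow.base_file_name": "<sample name>",
    })
    inputs = apply_overrides(
        template,
        {"ExampleWorkflow.input_cram": "s3://example-bucket/sample.cram"},
    )
    print(json.dumps(inputs, indent=2))
```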

For further questions about these workflows, contact us via email: healthomics.support@ultimagen.com

Germline Variant Calling using Efficient DV

Ultima utilizes Efficient DV, an analysis pipeline designed to call variants from aligned CRAM files using a version of DeepVariant adapted for Ultima Genomics data.

For detailed instructions on implementing this workflow please visit: https://github.com/Ultimagen/healthomics-workflows/blob/main/workflows/efficient_dv/howto-germline-calling-efficient-dv.md

Germline CNV Calling

Ultima’s germline CNV analysis pipeline takes a single-sample aligned file (BAM/CRAM) and pre-calculated coverage in BedGraph format as input, and returns CNV calls in BED and VCF formats.
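
For readers unfamiliar with the BedGraph coverage input, the following Python sketch reads standard four-column BedGraph lines (chromosome, 0-based start, exclusive end, value) and computes a length-weighted mean coverage per chromosome. It only illustrates the file format; it is not how the pipeline makes CNV calls, and the function name is an assumption.

```python
from collections import defaultdict


def mean_coverage_by_chrom(bedgraph_lines):
    """Compute length-weighted mean coverage per chromosome from BedGraph lines."""
    weighted_sum = defaultdict(float)
    total_length = defaultdict(int)
    for line in bedgraph_lines:
        line = line.strip()
        # Skip blank lines and optional track/browser/comment header lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end, value = line.split()[:4]
        span = int(end) - int(start)
        weighted_sum[chrom] += span * float(value)
        total_length[chrom] += span
    return {
        chrom: weighted_sum[chrom] / total_length[chrom]
        for chrom in weighted_sum
        if total_length[chrom] > 0
    }


if __name__ == "__main__":
    example = [
        "track type=bedGraph",
        "chr1\t0\t100\t30",
        "chr1\t100\t200\t10",
        "chr2\t0\t50\t40",
    ]
    print(mean_coverage_by_chrom(example))
```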

For detailed instructions on implementing this workflow please visit: https://github.com/Ultimagen/healthomics-workflows/blob/main/workflows/germline_CNV_pipeline/how-to-single-sample-germline-cnv-calling.md

Structural Variant (SV) Calling

The SV calling pipeline detects SVs using an in-house assembly tool, GRIDSS for SV detection, and GRIPSS for filtering the output callset. The pipeline supports both germline and somatic modes.

The pipeline takes aligned BAM/CRAM files as input and outputs a filtered VCF containing annotated SV calls. Its steps are: assembly; realignment with Ultima Aligner; reverting secondary low-MAPQ alignments; GRIDSS identification of SVs; GRIDSS annotation; and a final mode-dependent step. For somatic runs, SVs are filtered using GRIPSS. For germline runs, GermlineLinkVariants links GRIDSS breakends to create an SV.
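
The step order described above can be summarized in a short Python sketch. The step labels are paraphrased from the description and the function name is an assumption; this does not reproduce the workflow's actual task names or interfaces.

```python
def sv_pipeline_steps(mode):
    """Return the ordered SV pipeline steps for 'germline' or 'somatic' mode."""
    if mode not in ("germline", "somatic"):
        raise ValueError(f"unsupported mode: {mode!r}")
    steps = [
        "assembly",
        "realignment with Ultima Aligner",
        "revert secondary low-MAPQ alignments",
        "GRIDSS SV identification",
        "GRIDSS annotation",
    ]
    # The final step differs by mode, per the description above.
    if mode == "somatic":
        steps.append("GRIPSS filtering")
    else:
        steps.append("GermlineLinkVariants breakend linking")
    return steps


if __name__ == "__main__":
    for mode in ("germline", "somatic"):
        print(mode, "->", " | ".join(sv_pipeline_steps(mode)))
```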

For detailed instructions on implementing this workflow please visit: https://github.com/Ultimagen/healthomics-workflows/blob/main/workflows/structural_variant_pipeline/howto-structural-variant-calling.md

Example Variant Calling Report