How to Analyze Ultima Germline WGS Data
Analysis Pipeline for Ultima Germline Whole Genome Sequencing
Secondary Analysis Steps for Germline WGS Data:
Trimming
The Ultima Trimmer software tool divides each read into segments, trims the Ultima adapter sequence, and keeps the Ultima barcode and insert sequence for downstream processing.
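As a toy illustration of the segmentation idea only (this is not the Ultima Trimmer implementation), the sketch below splits a read into adapter, barcode, and insert segments. The adapter sequence, the barcode length, and the adapter-then-barcode-then-insert layout are all hypothetical assumptions chosen for the example.

```python
from typing import NamedTuple, Optional

# Hypothetical values for illustration; these are not real Ultima sequences.
ADAPTER = "ACGTACGT"
BARCODE_LEN = 6


class Segments(NamedTuple):
    barcode: str
    insert: str


def trim_read(read: str, adapter: str = ADAPTER,
              barcode_len: int = BARCODE_LEN) -> Optional[Segments]:
    """Drop a leading adapter and keep the barcode and insert.

    Returns None when the read does not start with the adapter or is too
    short to hold a full barcode (assumed layout: adapter, barcode, insert).
    """
    if not read.startswith(adapter):
        return None
    rest = read[len(adapter):]
    if len(rest) < barcode_len:
        return None
    return Segments(barcode=rest[:barcode_len], insert=rest[barcode_len:])


segments = trim_read("ACGTACGT" + "TTAGGC" + "GATTACA")
print(segments)
```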
Alignment and Tagging
Reads are tagged (Ultima barcode and insert), and the insert sequence is aligned to a reference. For more information on realignment, see the Ultima Aligner (UA) GitHub page.
Demultiplexing
Data output is sorted, per sample, into folders based on the Ultima barcodes.
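The per-sample sorting can be pictured with a small sketch. It assumes a hypothetical barcode-to-sample mapping (a stand-in for a sample sheet), a hypothetical output root named "demux", and an "undetermined" folder for unknown barcodes; none of these names come from the Ultima pipeline.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def plan_demux(reads, barcode_to_sample, out_root="demux"):
    """Map each sample's output folder to the inserts assigned to it.

    reads: iterable of (barcode, insert) pairs.
    barcode_to_sample: dict from barcode to sample name (hypothetical).
    Reads with an unknown barcode go to an 'undetermined' folder, which
    is an assumption of this sketch.
    """
    folders = defaultdict(list)
    for barcode, insert in reads:
        sample = barcode_to_sample.get(barcode, "undetermined")
        folders[str(PurePosixPath(out_root) / sample)].append(insert)
    return dict(folders)


plan = plan_demux(
    [("AAA", "x"), ("CCC", "y"), ("AAA", "z"), ("GGG", "w")],
    {"AAA": "sample1", "CCC": "sample2"},
)
for folder, inserts in plan.items():
    print(folder, inserts)
```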
Sample QC
A QC file named “<<run filename>>.selfSM.contamination_stats.csv” is generated. It contains potential contamination or sample swap information (based on verifyBamID).
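A minimal sketch of screening such a file might look like the following. The column names (SEQ_ID, FREEMIX) follow verifyBamID's selfSM conventions, but whether the CSV uses exactly these headers is an assumption, and the 0.02 threshold is an arbitrary example value, not an Ultima recommendation. Check both against the header of your actual file.

```python
import csv
import io


def flag_contaminated(csv_text, column="FREEMIX", threshold=0.02):
    """Return rows whose contamination estimate exceeds the threshold.

    `column` and `threshold` are assumptions; verify them against the
    header of your actual contamination_stats.csv file.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    flagged = []
    for row in reader:
        if float(row[column]) > threshold:
            flagged.append(row)
    return flagged


# Made-up example content, not real pipeline output.
example = "SEQ_ID,FREEMIX\nsampleA,0.001\nsampleB,0.05\n"
for row in flag_contaminated(example):
    print(row["SEQ_ID"], row["FREEMIX"])
```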
File Output
Read data is output as a CRAM file ready for variant calling using the Ultima germline variant caller.
Running Ultima analysis pipelines on AWS HealthOmics
Ultima Genomics offers pipelines as Ready2Run workflows on AWS HealthOmics. Ready2Run workflows enable you to run these pipelines on AWS HealthOmics by simply bringing your data. For more flexibility, such as the use of larger file sizes or changing the reference genome, you can convert Ready2Run workflows to private workflows.
To get started, visit the UltimaGenomics repository for workflows compatible with AWS HealthOmics:
Each Ready2Run workflow folder contains the following:
Required WDL file(s)
“How To” documentation that details the workflow and how to run it outside of WDL.
Documentation of the WDL inputs and outputs
.json file that lists the parameters for the workflow
Folder with optional input templates with default parameters for the WDL
Folder containing tasks the WDL is running
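For readers who prefer to start a run programmatically, the sketch below assembles a request for the HealthOmics StartRun operation using boto3. It is an illustration under assumptions: the workflow ID, IAM role ARN, S3 output location, and the "base_file_name" parameter are placeholders, and the actual parameter names should be taken from the workflow's .json parameter file and input templates.

```python
import json


def build_start_run_request(workflow_id, role_arn, output_uri,
                            parameters, run_name):
    """Assemble keyword arguments for the HealthOmics StartRun call."""
    return {
        "workflowId": workflow_id,
        "workflowType": "READY2RUN",
        "roleArn": role_arn,
        "name": run_name,
        "outputUri": output_uri,
        "parameters": parameters,
    }


def start_run(request):
    """Submit the run; requires boto3 and configured AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3

    return boto3.client("omics").start_run(**request)


# Placeholder values; replace with your own ID, role, bucket, and inputs.
# "base_file_name" is a hypothetical parameter name used for illustration.
params = json.loads('{"base_file_name": "sample1"}')
request = build_start_run_request(
    workflow_id="1234567",
    role_arn="arn:aws:iam::123456789012:role/ExampleOmicsRole",
    output_uri="s3://example-bucket/outputs/",
    parameters=params,
    run_name="example-germline-run",
)
print(json.dumps(request, indent=2))
```

Keeping the request builder separate from the submission call lets you inspect or log the request before any AWS call is made.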
For more questions about these workflows, contact us via email: healthomics.support@ultimagen.com
Germline Variant Calling using Efficient DV
Ultima uses Efficient DV, an analysis pipeline designed to call variants from aligned CRAM files using a version of DeepVariant adapted for Ultima Genomics data.
For detailed instructions on implementing this workflow please visit: https://github.com/Ultimagen/healthomics-workflows/blob/main/workflows/efficient_dv/howto-germline-calling-efficient-dv.md
Germline CNV Calling
Ultima’s germline CNV analysis pipeline takes a single-sample aligned file (BAM/CRAM) along with precalculated coverage in BedGraph format. The CNV calls are returned in BED and VCF formats.
For detailed instructions on implementing this workflow please visit: https://github.com/Ultimagen/healthomics-workflows/blob/main/workflows/germline_CNV_pipeline/how-to-single-sample-germline-cnv-calling.md
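To make the coverage input concrete, here is a small sketch that reads BedGraph-style lines (chromosome, start, end, value) and computes a length-weighted mean coverage per chromosome. It is only a sanity-check helper written for this example, not part of the CNV pipeline, and the example intervals are made up.

```python
from collections import defaultdict


def mean_coverage_by_chrom(lines):
    """Length-weighted mean coverage per chromosome from BedGraph lines."""
    covered = defaultdict(int)   # bases spanned by intervals
    total = defaultdict(float)   # sum of coverage * interval length
    for line in lines:
        line = line.strip()
        # Skip blank lines, comments, and track/browser header lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end, value = line.split()[:4]
        length = int(end) - int(start)
        covered[chrom] += length
        total[chrom] += float(value) * length
    return {c: total[c] / covered[c] for c in covered if covered[c] > 0}


# Made-up intervals for illustration.
example = [
    "track type=bedGraph",
    "chr1\t0\t100\t30",
    "chr1\t100\t200\t10",
    "chr2\t0\t50\t20",
]
print(mean_coverage_by_chrom(example))
```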
Structural Variant (SV) Calling
The SV calling pipeline detects SVs using an in-house assembly tool, GRIDSS for SV detection, and GRIPSS for filtering the output callset. The pipeline supports both germline and somatic modes.
The pipeline takes aligned BAM/CRAM files as input and outputs a filtered VCF containing annotated SV calls. The steps of the pipeline are as follows: assembly; realignment with Ultima Aligner; reverting secondary low-MAPQ alignments; GRIDSS identification of SVs; GRIDSS annotation; and a final mode-specific step. For somatic runs, the final step is SV filtering using GRIPSS. For germline runs, it is GermlineLinkVariants, which links GRIDSS breakends to create an SV.
For detailed instructions on implementing this workflow please visit: https://github.com/Ultimagen/healthomics-workflows/blob/main/workflows/structural_variant_pipeline/howto-structural-variant-calling.md
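The stage order described in this section can be summarized in a short sketch. It reflects one reading of the description (shared stages followed by a mode-specific final stage); the stage labels are descriptive strings chosen for this example, not identifiers from the workflow's WDL.

```python
# Stages shared by both modes, in the order given in the text.
COMMON_STAGES = [
    "assembly",
    "realignment with Ultima Aligner",
    "revert secondary low-MAPQ alignments",
    "GRIDSS SV identification",
    "GRIDSS annotation",
]

# Final stage per mode, as read from the description above.
FINAL_STAGE = {
    "somatic": "GRIPSS filtering",
    "germline": "GermlineLinkVariants",
}


def sv_stages(mode):
    """Return the ordered stage list for 'germline' or 'somatic' mode."""
    if mode not in FINAL_STAGE:
        raise ValueError(f"unknown mode: {mode!r}")
    return COMMON_STAGES + [FINAL_STAGE[mode]]


for step_number, stage in enumerate(sv_stages("germline"), start=1):
    print(step_number, stage)
```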