Review Article

High-Performance Computing Pipelines for NGS Variant Calling  

Wenzhong Huang
Biomass Research Center, Hainan Institute of Tropical Agricultural Resources, Sanya, 572025, Hainan, China
Computational Molecular Biology, 2025, Vol. 15, No. 3   doi: 10.5376/cmb.2025.15.0015
Received: 18 Apr., 2025    Accepted: 29 May, 2025    Published: 21 Jun., 2025
© 2025 BioPublisher Publishing Platform
This is an open access article published under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Preferred citation for this article:

Huang W.Z., 2025, High-performance computing pipelines for NGS variant calling, Computational Molecular Biology, 15(3): 151-159 (doi: 10.5376/cmb.2025.15.0015)

 

Abstract

With the widespread adoption of next-generation sequencing (NGS), genomic sequencing data have grown exponentially, posing severe computational challenges for variant detection. Traditional variant detection pipelines, such as those based on GATK, are prone to computational and I/O bottlenecks when processing large-scale data. This paper reviews high-performance computing (HPC) pipelines for NGS variant detection. It first introduces the typical workflows and commonly used algorithms of NGS variant detection and analyzes the performance bottlenecks of traditional pipelines. It then describes how HPC architectures and parallel computing models are applied in bioinformatics. On this basis, it focuses on HPC optimization strategies for variant detection pipelines, including task parallelization, I/O optimization, data locality management, and workflow orchestration with schedulers and workflow managers such as SLURM, Nextflow, and Cromwell. Finally, it introduces the application of emerging hardware acceleration technologies such as GPUs and FPGAs in variant detection, discusses performance evaluation metrics and benchmarking frameworks, and compares HPC-driven pipelines with traditional methods.

Keywords
High-performance computing; Mutation detection; Next-generation sequencing; Parallel computing; Workflow