nf-core/gatk4/markduplicates @ 0.0.0-6c4ed3a

This tool locates and tags duplicate reads in a BAM or SAM file, where duplicate reads are defined as originating from a single fragment of DNA.

Latest version: 0.0.0-6c4ed3a
Total downloads: 5
Source: nf-core/modules

Get started

Add the following snippet to your workflow script to include this module.

include { GATK4_MARKDUPLICATES } from 'nf-core/gatk4/markduplicates'

License

MIT License

Process
Name GATK4_MARKDUPLICATES
Input 3 channels
#1 tuple
meta map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

bam file

Sorted BAM file

*.{bam}
#2 fasta file

FASTA file

*.{fasta}
#3 fasta_fai file

FASTA index file

*.{fai}
Output 7 channels
#1 bai tuple
meta map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*.bai file

BAM index file

*.{bam.bai}
#2 bam tuple
meta map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*bam file

Marked duplicates BAM file

*.{bam}
#3 crai tuple
meta map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*.crai file

CRAM index file

*.{cram.crai}
#4 cram tuple
meta map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*cram file

Marked duplicates CRAM file

*.{cram}
#5 metrics tuple
meta map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*.metrics file

Duplicate metrics file generated by GATK

*.{metrics.txt}
#6 versions_gatk4 tuple
${task.process} string

The name of the process

gatk4 string

The name of the tool

gatk --version | sed -n '/GATK.*v/s/.*v//p' eval

The expression to obtain the version of the tool

#7 versions_samtools tuple
${task.process} string

The name of the process

samtools string

The name of the tool

samtools version | sed '1!d;s/.* //' eval

The expression to obtain the version of the tool
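
The two version expressions listed above can be tried without GATK or samtools installed. The following is a minimal sketch, not the module's own code: it feeds assumed sample banners to the same `sed` commands. The banner text and version numbers (`4.6.1.0`, `1.21`) are illustrative assumptions, and the variable names are hypothetical; real output depends on the installed tools.

```shell
#!/usr/bin/env bash
# Assumed sample banners standing in for `gatk --version` and `samtools version`.
# The version numbers are illustrative only.
gatk_banner='The Genome Analysis Toolkit (GATK) v4.6.1.0
HTSJDK Version: 4.1.3
Picard Version: 3.3.0'

samtools_banner='samtools 1.21
Using htslib 1.21'

# Select the line where "GATK" is later followed by a "v", delete everything
# up to and including the last "v" on that line, and print the remainder.
gatk_version=$(printf '%s\n' "$gatk_banner" | sed -n '/GATK.*v/s/.*v//p')

# Delete every line except the first, then delete everything up to and
# including the last space on that line.
samtools_version=$(printf '%s\n' "$samtools_banner" | sed '1!d;s/.* //')

echo "gatk4: $gatk_version"
echo "samtools: $samtools_version"
```

With these sample banners the script prints `gatk4: 4.6.1.0` and `samtools: 1.21`. Both expressions rely on greedy matching, so they assume the version string is the last token on its line.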

Tool gatk4
Description Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.
Homepage https://gatk.broadinstitute.org/hc/en-us
Version 0.0.0-6c4ed3a
Commit ID 6c4ed3a220310b905a1fc9d04f05be2e0837142b
Release Date 23 Apr 2026 15:20:14 (UTC)
Download URL https://registry.nextflow.io/api/v1/modules/nf-core%2Fgatk4%2Fmarkduplicates/0.0.0-6c4ed3a/download
OCI Store URL https://public.cr.seq