nf-core/gtdbtk/classifywf @ 0.0.0-6c4ed3a

GTDB-Tk is a software toolkit for assigning objective taxonomic classifications to bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB).

Latest version: 0.0.0-6c4ed3a
Total downloads: 7
Source: nf-core/modules
Maintainers: @skrakau @abhi18av

Summary

GTDB-Tk is a software toolkit for assigning objective taxonomic classifications to bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB).

Get started

Add the following snippet to your workflow script to include this module.

include { GTDBTK_CLASSIFYWF } from 'nf-core/gtdbtk/classifywf'

License

MIT License

Process
Name GTDBTK_CLASSIFYWF
Input 3 channels
#1 tuple
meta map

Groovy Map containing sample information e.g. [ id:'test', single_end:false, assembler:'spades' ]

bins/* file

A list of one or more bins in FASTA format for classification

*.{fasta,fna,fas,fa}{,.gz}
#2 tuple
db_name string

The name of the GTDB database to use.

db file

Path to a directory containing a GTDB database, as uncompressed from the 'full package' gtdbtk_data.tar.gz file. You can give the 'release' directory here. It must contain the 'metadata' subdirectory.

release[0-9]+/
#3 use_pplacer_scratch_dir boolean

Set to true to reduce pplacer memory usage by writing to disk (slower)
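
The three inputs described above could be wired together roughly as follows. This is a minimal sketch, not the maintainers' reference usage: the parameter names params.bins and params.gtdb_db, both paths, the database name 'gtdb_release' and the meta id 'test' are placeholders chosen for illustration.

```nextflow
// Hypothetical parameters: point these at your own bins and GTDB release directory
params.bins    = 'bins/*.fa'
params.gtdb_db = '/path/to/release_dir'

include { GTDBTK_CLASSIFYWF } from 'nf-core/gtdbtk/classifywf'

workflow {
    // Input #1: tuple of a meta map and the list of bin FASTA files
    ch_bins = Channel
        .fromPath(params.bins)
        .collect()                                  // gather all bins into one list
        .map { bins -> [ [ id: 'test' ], bins ] }   // 'test' is a placeholder sample id

    // Input #2: tuple of a database name and the database directory
    ch_db = Channel.value([ 'gtdb_release', file(params.gtdb_db, checkIfExists: true) ])

    // Input #3: false keeps pplacer in memory; true writes to disk (slower, less memory)
    GTDBTK_CLASSIFYWF(ch_bins, ch_db, false)

    // Print the classification summary tuple (meta, *.summary.tsv)
    GTDBTK_CLASSIFYWF.out.summary.view()
}
```

Collecting the bins first means all genomes are classified in a single GTDB-Tk run under one meta map. To classify bins per sample instead, group them by sample id before building the tuple.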

Output 12 channels
#1 log tuple
meta map

Groovy Map containing sample information e.g. [ id:'test', single_end:false, assembler:'spades' ]

*
${prefix}/${prefix}.log file

GTDB-Tk log file

*.{log}
#2 msa tuple
meta map

Groovy Map containing sample information e.g. [ id:'test', single_end:false, assembler:'spades' ]

*
${prefix}/align/*.msa.fasta.gz file

Multiple sequence alignment file.

*.{msa.fasta.gz}
#3 tree tuple
meta map

Groovy Map containing sample information e.g. [ id:'test', single_end:false, assembler:'spades' ]

*
${prefix}/classify/*.classify.tree file

NJ or UPGMA trees in Newick format produced from a multiple sequence alignment

*.{classify.tree}
#4 failed tuple
meta map

Groovy Map containing sample information e.g. [ id:'test', single_end:false, assembler:'spades' ]

*
${prefix}/identify/*.failed_genomes.tsv file

A TSV summary of the genomes which GTDB-Tk failed to classify.

*.{failed_genomes.tsv}
#5 markers tuple
meta map

Groovy Map containing sample information e.g. [ id:'test', single_end:false, assembler:'spades' ]

*
${prefix}/identify/*.markers_summary.tsv file

A TSV summary file of the lineage markers used for the classification.

*.{markers_summary.tsv}
#6 summary tuple
meta map

Groovy Map containing sample information e.g. [ id:'test', single_end:false, assembler:'spades' ]

*
${prefix}/classify/*.summary.tsv file

A TSV summary file for the classification

*.{summary.tsv}
#7 filtered tuple
meta map

Groovy Map containing sample information e.g. [ id:'test', single_end:false, assembler:'spades' ]

*
${prefix}/align/*.filtered.tsv file

A list of genomes with an insufficient number of amino acids in the MSA

*.{filtered.tsv}
#8 user_msa tuple
meta map

Groovy Map containing sample information e.g. [ id:'test', single_end:false, assembler:'spades' ]

*
${prefix}/align/*.user_msa.fasta.gz file

Multiple sequence alignment file for the user-provided files.

*.{user_msa.fasta.gz}
#9 warnings tuple