Nextflow Registry | nf-core/gtdbtk/classifywf@0.0.0-6c4ed3a

nf-core/gtdbtk/classifywf @ 0.0.0-6c4ed3a

GTDB-Tk is a software toolkit for assigning objective taxonomic classifications to bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB).

GTDB taxonomy, taxonomic classification, metagenomics, classification, genome taxonomy database, bacteria, archaea

Latest version: 0.0.0-6c4ed3a

Total downloads: 7

Source: nf-core/modules

Authors: @skrakau @prototaxites @abhi18av

Maintainers: @skrakau @abhi18av

Get started

Add the following snippet to your workflow script to include this module.

include { GTDBTK_CLASSIFYWF } from 'nf-core/gtdbtk/classifywf'

License

MIT License

Process

`Name`	`GTDBTK_CLASSIFYWF`

Input 3 channels

#1 tuple

`meta` map	Groovy Map containing sample information e.g. [ id:'test', single_end:false, assembler:'spades' ]
`bins/*` file	A list of one or more bins in FASTA format for classification `*.{fasta,fna,fas,fa}{,.gz}`

#2 tuple

`db_name` string	The name of the GTDB database to use.
`db` file	Path to a directory containing a GTDB database, as uncompressed from the 'full package' gtdbtk_data.tar.gz file. You can give the 'release' directory here. It must contain the 'metadata' subdirectory. `release[0-9]+/`

`use_pplacer_scratch_dir` boolean	Set to true to reduce pplacer memory usage by writing to disk (slower)
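
As a rough sketch, the three inputs above might be assembled and passed to the process as follows. The bin glob, the sample `id`, the database name `gtdbtk_db`, and the database path are all placeholders chosen for illustration, not values defined by the module.

```nextflow
include { GTDBTK_CLASSIFYWF } from 'nf-core/gtdbtk/classifywf'

workflow {
    // Input #1: a meta map plus all bins for the sample, gathered into one list.
    // The glob 'bins/*.fa.gz' and the id 'test' are placeholders.
    ch_bins = Channel
        .fromPath('bins/*.fa.gz')
        .collect()
        .map { bins -> [ [ id: 'test' ], bins ] }

    // Input #2: database name plus the uncompressed release directory.
    // Both the name and the path are placeholders.
    ch_db = Channel.value([ 'gtdbtk_db', file('/path/to/gtdb/release') ])

    // Input #3: true makes pplacer write to disk (less memory, slower).
    GTDBTK_CLASSIFYWF(ch_bins, ch_db, false)

    // Outputs are addressed by channel name, for example the classification summary.
    GTDBTK_CLASSIFYWF.out.summary.view()
}
```

Collecting the bins into a single list means the whole set is classified in one task, which matches the `bins/*` staging pattern of the first input.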

Output 12 channels

#1 log tuple

`meta` map	Groovy Map containing sample information e.g. [ id:'test', single_end:false, assembler:'spades' ] `*`
`${prefix}/${prefix}.log` file	GTDB-Tk log file `*.{log}`

#2 msa tuple

`meta` map	Groovy Map containing sample information e.g. [ id:'test', single_end:false, assembler:'spades' ] `*`
`${prefix}/align/*.msa.fasta.gz` file	Multiple sequence alignment file. `*.{msa.fasta.gz}`

#3 tree tuple

`meta` map	Groovy Map containing sample information e.g. [ id:'test', single_end:false, assembler:'spades' ] `*`
`${prefix}/classify/*.classify.tree` file	NJ or UPGMA trees in Newick format produced from a multiple sequence alignment `*.{classify.tree}`

#4 failed tuple

`meta` map	Groovy Map containing sample information e.g. [ id:'test', single_end:false, assembler:'spades' ] `*`
`${prefix}/identify/*.failed_genomes.tsv` file	A TSV summary of the genomes that GTDB-Tk failed to classify. `*.{failed_genomes.tsv}`

#5 markers tuple

`meta` map	Groovy Map containing sample information e.g. [ id:'test', single_end:false, assembler:'spades' ] `*`
`${prefix}/identify/*.markers_summary.tsv` file	A TSV summary of the lineage markers used for the classification. `*.{markers_summary.tsv}`

#6 summary tuple

`meta` map	Groovy Map containing sample information e.g. [ id:'test', single_end:false, assembler:'spades' ] `*`
`${prefix}/classify/*.summary.tsv` file	A TSV summary file for the classification `*.{summary.tsv}`

#7 filtered tuple

`meta` map	Groovy Map containing sample information e.g. [ id:'test', single_end:false, assembler:'spades' ] `*`
`${prefix}/align/*.filtered.tsv` file	A list of genomes with an insufficient number of amino acids in the MSA `*.{filtered.tsv}`

#8 user_msa tuple

`meta` map	Groovy Map containing sample information e.g. [ id:'test', single_end:false, assembler:'spades' ] `*`
`${prefix}/align/*.user_msa.fasta.gz` file	Multiple sequence alignment file for the user-provided files. `*.{user_msa.fasta.gz}`

#9 warnings tuple