nf-core/rundbcan/easycgc @ 0.0.0-6c4ed3a

CAZyme gene cluster (CGC) annotation module for the dbcan pipeline. It annotates carbohydrate-active enzymes (CAZymes) in genomic data using the dbCAN annotation tool.

Latest version: 0.0.0-6c4ed3a
Total downloads: 10
Source: nf-core/modules
Authors: @Xinpeng021001
Maintainers: @Xinpeng021001

Get started

Add the following snippet to your workflow script to include this module.

include { RUNDBCAN_EASYCGC } from 'nf-core/rundbcan/easycgc'

License

MIT License

Process
Name RUNDBCAN_EASYCGC
Input 3 channels
#1 tuple
meta map

Groovy Map containing sample information e.g. [ id:'sample1' ]

input_raw_data file

FASTA file for protein sequences.

*.{fasta,fa,faa}
#2 tuple
meta2 map

Groovy Map containing sample information e.g. [ id:'sample1' ]

input_gff file

GFF file with the gene annotations for the protein sequences.

gff_type string

Type of GFF file. Options are NCBI_prok, JGI, NCBI_euk, and prodigal. This is used to parse the GFF file correctly.

dbcan_db directory

Path to the dbCAN database directory.
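
The inputs above could be wired up roughly as follows. This is a minimal sketch, not the module's official usage: the file paths and sample id are placeholders. The grouping of gff_type into the second tuple and of dbcan_db as the third channel is an assumption based on the "3 channels" count. The output names come from the Output list.

```nextflow
// Sketch only: paths and the sample id are hypothetical placeholders.
include { RUNDBCAN_EASYCGC } from 'nf-core/rundbcan/easycgc'

workflow {
    // Channel #1: sample metadata plus the protein FASTA file
    ch_proteins = Channel.of(
        [ [ id:'sample1' ], file('sample1.faa') ]
    )

    // Channel #2 (assumed layout): metadata, GFF file, and GFF type string.
    // gff_type must be one of NCBI_prok, JGI, NCBI_euk, or prodigal.
    ch_gff = Channel.of(
        [ [ id:'sample1' ], file('sample1.gff'), 'prodigal' ]
    )

    // Channel #3 (assumed): the dbCAN database directory
    ch_dbcan_db = Channel.value( file('/path/to/dbcan_db') )

    RUNDBCAN_EASYCGC( ch_proteins, ch_gff, ch_dbcan_db )

    // Outputs are addressed by the channel names listed under "Output"
    RUNDBCAN_EASYCGC.out.cgc_gff.view()
    RUNDBCAN_EASYCGC.out.cgc_standard_out.view()
}
```

Passing the database as a value channel lets it be reused for every sample if the first two channels carry more than one item.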

Output 10 channels
#1 cgc_gff tuple
meta map

Groovy Map containing sample information e.g. [ id:'sample1']

${prefix}_cgc.gff file

GFF file containing the CAZyme gene clusters (CGC) identified by dbCAN. This file is generated from the dbCAN annotation and contains the locations of CAZyme gene clusters in the genome.

#2 versions
versions.yml file

File containing software versions

versions.yml
#3 diamond_out_tc tuple
meta map

Groovy Map containing sample information e.g. [ id:'sample1']

${prefix}_diamond.out.tc file

TSV file containing the DIAMOND output for transporter annotation.

#4 tf_hmm_results tuple
meta map

Groovy Map containing sample information e.g. [ id:'sample1']

${prefix}_TF_hmm_results.tsv file

TSV file containing the results of transcription factor (TF) annotation.

#5 stp_hmm_results tuple
meta map

Groovy Map containing sample information e.g. [ id:'sample1']

${prefix}_STP_hmm_results.tsv file

TSV file containing the results of signal transduction protein (STP) annotation.

#6 cgc_standard_out tuple
meta map

Groovy Map containing sample information e.g. [ id:'sample1']

${prefix}_cgc_standard_out.tsv file

Standard output file from dbCAN for CAZyme gene clusters (CGC) in a tabular format. This file summarizes the CAZyme gene clusters identified in the genome.

#7 dbcanhmm_results tuple
meta map

Groovy Map containing sample information e.g. [ id:'sample1']

${prefix}_dbCAN_hmm_results.tsv file

TSV file containing the detailed dbCAN HMM results for CAZyme annotation.

#8 dbcansub_results tuple
meta map

Groovy Map containing sample information e.g. [ id:'sample1']

${prefix}_dbCANsub_hmm_results.tsv file

TSV file containing the detailed dbCAN subfamily results for CAZyme annotation.

#9 cazyme_annotation tuple
meta map

Groovy Map containing sample information e.g. [ id:'sample1']

${prefix}_overview.tsv file

TSV file containing the results of dbCAN CAZyme annotation.

#10 dbcandiamond_results tuple
meta map

Groovy Map containing sample information e.g. [ id:'sample1']