Nextflow Registry | nf-core/stitch@0.0.0-6c4ed3a

nf-core/stitch @ 0.0.0-6c4ed3a

STITCH is an R program for reference-panel-free, read-aware, low-coverage sequencing genotype imputation. STITCH runs on a set of samples with sequencing reads in BAM format, together with a list of positions to genotype, and outputs imputed genotypes in VCF format.

imputation genomics vcf bgen cram bam sam

Latest version: 0.0.0-6c4ed3a

Total downloads: 11

Source: nf-core/modules

Authors: @saulpierotti

Maintainers: @saulpierotti

Summary

Get started

Add the following snippet to your workflow script to include this module.

include { STITCH } from 'nf-core/stitch'

License

MIT License

Process

`Name`	`STITCH`

Input 3 channels

#1 tuple

`meta` map	Groovy Map containing information about the set of samples e.g. `[ id:'test' ]`
`collected_crams` file	List of sorted BAM/CRAM/SAM files `*.{bam,cram,sam}`
`collected_crais` file	List of BAM/CRAM/SAM index files `*.{bai,crai,sai}`
`cramlist` file	Text file listing the CRAM files to use in imputation, one per line. Because the CRAM files are staged into the working directory of the process, this file should contain only the file names, without any leading path. `*.txt`
`samplename` file	(Optional) File listing the names of the samples to impute, in the same order as in `cramlist`. One sample name per line. `*.{txt}`
`posfile` file	Tab-separated file describing the variable positions to be used for imputation. Refer to the documentation for the `--posfile` argument of STITCH for more information. `*.tsv`
`input` directory	Folder of pre-generated input RData objects used when STITCH is called with the `--regenerateInput FALSE` flag. It is generated by running STITCH with the `--generateInputOnly TRUE` flag. `input`
`genetic_map` file	(Optional) Genetic map file with 3 whitespace-delimited columns giving position (1-based), genetic rate map in cM/Mbp, and genetic map in cM. If no file is provided, the rate is based on physical distance and the expected rate (expRate). `*.{txt,map}{,gz}`
`rdata` directory	Folder of pre-generated input RData objects used when STITCH is called with the `--regenerateInput FALSE` flag. It is generated by running STITCH with the `--generateInputOnly TRUE` flag. `RData`
`chromosome_name` string	Name of the chromosome to impute. Should match a chromosome name in the reference genome.
`start` integer	Start position of the region to impute.
`end` integer	End position of the region to impute.
`K` integer	Number of ancestral haplotypes to use for imputation. Refer to the documentation for the `--K` argument of STITCH for more information.
`nGen` integer	Number of generations since founding of the population to use for imputation. Refer to the documentation for the `--nGen` argument of STITCH for more information.

#2 tuple

`meta2` map	Groovy Map containing information about the reference genome used e.g. `[ id:'test' ]`
`fasta` file	FASTA reference genome file `*.{fa,fasta}`
`fasta_fai` file	FASTA index file `*.{fai}`

`seed` integer	Seed for random number generation
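
The three input channels above can be wired together roughly as follows. This is a minimal sketch, not a tested pipeline: every path and parameter value (`data/*.cram`, `cramlist.txt`, `positions.tsv`, `genome.fa`, `chr1`, the region bounds, `K`, `nGen`, and the seed) is a hypothetical placeholder, and passing `[]` for optional or unused inputs is assumed from the usual nf-core convention rather than stated on this page. The element order follows the input listing above.

```nextflow
include { STITCH } from 'nf-core/stitch'

workflow {
    // Hypothetical locations; a glob pattern makes file() return a list.
    def crams = file('data/*.cram')
    def crais = file('data/*.cram.crai')

    // Input #1: a single tuple with the sample set, region, and model parameters.
    // Empty lists stand in for optional or unused inputs (assumed convention).
    ch_samples = channel.of([
        [ id: 'test' ],          // meta
        crams,                   // collected_crams
        crais,                   // collected_crais
        file('cramlist.txt'),    // cramlist: file names only, one per line
        [],                      // samplename (optional)
        file('positions.tsv'),   // posfile
        [],                      // input (pre-generated input RData objects)
        [],                      // genetic_map (optional)
        [],                      // rdata
        'chr1',                  // chromosome_name
        1,                       // start
        1000000,                 // end
        4,                       // K
        100                      // nGen
    ])

    // Input #2: reference genome and its index.
    ch_reference = channel.value([
        [ id: 'genome' ],        // meta2
        file('genome.fa'),       // fasta
        file('genome.fa.fai')    // fasta_fai
    ])

    // Input #3: seed for random number generation.
    STITCH(ch_samples, ch_reference, 1)

    // Default output: imputed genotypes, emitted on the `vcf` channel.
    STITCH.out.vcf.view()
}
```

Because the alignment files are staged into the working directory of the process, the placeholder `cramlist.txt` here would need to list bare file names such as `sample1.cram`, not full paths.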

Output 8 channels

#1 vcf tuple

`meta` map	Groovy Map containing sample information e.g. `[ id:'test' ]`
`*.vcf.gz` file	Imputed genotype calls for the positions in `posfile`, in VCF format. This is the default output. `.vcf.gz`

#2 bgen tuple

`meta` map	Groovy Map containing sample information e.g. `[ id:'test' ]`
`*.bgen` file	Imputed genotype calls for the positions in `posfile`, in BGEN format. This is produced if `--output_format bgen` is specified. `.bgen`

#3 input tuple

`meta` map	Groovy Map containing sample information e.g. `[ id:'test' ]`
`input` directory	Folder of pre-generated input RData objects used when STITCH is called with the `--regenerateInput FALSE` flag. It is generated by running STITCH with the `--generateInputOnly TRUE` flag. `input`

#4 plots tuple

`meta` map