nf-core/quilt/quilt2 @ 0.0.0-6c4ed3a

QUILT2 is an R and C++ program for fast genotype imputation from low-coverage sequence using a large phased reference panel in VCF/BCF format.

Latest version: 0.0.0-6c4ed3a
Total downloads: 14
Source: nf-core/modules
Authors: @atrigila
Maintainers: @atrigila

Get started

Add the following snippet to your workflow script to include this module.

include { QUILT_QUILT2 } from 'nf-core/quilt/quilt2'

License

MIT License

Process
Name QUILT_QUILT2
Input 2 channels
#1 tuple
meta map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

bams file

(Mandatory) BAM/CRAM files

*.{bam,cram}
bais file

(Mandatory) BAM/CRAM index files

*.{bai,crai}
bamlist file

(Optional) File with list of BAM/CRAM files to impute. One file per line.

*.{txt}
samplename file

(Optional) File with list of sample names in the same order as in bamlist to impute. One sample name per line.

*.{txt}
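Because bamlist and samplename must line up row for row, it can help to write both files from a single ordered list. The following is a minimal Python sketch, not part of the module: the function name is hypothetical, and falling back to the file name without its extension as the sample name is an assumption made for illustration.

```python
from pathlib import Path


def write_bamlist_and_samplenames(bam_paths, bamlist_path, samplename_path, names=None):
    """Write one BAM/CRAM path per line and, in the same order, one sample name per line."""
    bam_paths = [Path(p) for p in bam_paths]
    # Assumption: when no names are supplied, use the file name without its extension.
    if names is None:
        names = [p.stem for p in bam_paths]
    if len(names) != len(bam_paths):
        raise ValueError("need exactly one sample name per BAM/CRAM file")
    # Both files are written from the same ordering, so row i of one matches row i of the other.
    Path(bamlist_path).write_text("".join(f"{p}\n" for p in bam_paths))
    Path(samplename_path).write_text("".join(f"{n}\n" for n in names))
    return names
```

Passing an explicit `names` list overrides the file-name fallback while keeping the one-to-one check.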
reference_vcf_file file

(Mandatory) Phased reference panel VCF/BCF file for imputation.

*.{vcf,vcf.gz,bcf}
reference_vcf_index file

(Mandatory) Index for the reference panel VCF file.

*.{tbi,csi}
posfile file

(Optional) File with the positions at which to impute, lining up one-to-one with genfile. File is tab-separated with no header, one row per SNP, with col 1 = chromosome, col 2 = physical position (sorted from smallest to largest), col 3 = reference base, col 4 = alternate base. Bases are capitalized.

*.{txt}
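The posfile layout described above (four tab-separated columns, no header, ascending positions, capitalized bases) can be produced with a short script. This is a hedged Python sketch: `write_posfile` is a hypothetical helper, and restricting alleles to single A/C/G/T bases is an assumption drawn from the "one row per SNP" wording.

```python
import csv

VALID_BASES = {"A", "C", "G", "T"}


def write_posfile(path, sites):
    """Write (chrom, pos, ref, alt) records as a headerless, tab-separated posfile."""
    rows = []
    last_pos = None
    for chrom, pos, ref, alt in sites:
        pos = int(pos)
        ref, alt = ref.upper(), alt.upper()  # the description asks for capitalized bases
        # Assumption: "one row per SNP" means single-base A/C/G/T alleles.
        if ref not in VALID_BASES or alt not in VALID_BASES:
            raise ValueError(f"not a single-base SNP at {chrom}:{pos}")
        # Positions must run from smallest to largest.
        if last_pos is not None and pos < last_pos:
            raise ValueError(f"positions not sorted at {chrom}:{pos}")
        last_pos = pos
        rows.append([chrom, pos, ref, alt])
    # All rows are validated before anything is written.
    with open(path, "w", newline="") as handle:
        csv.writer(handle, delimiter="\t", lineterminator="\n").writerows(rows)
    return len(rows)
```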
phasefile file

(Optional) File with truth phasing results. Supersedes genfile if both options are given. File has a header row with a name for each sample, matching what is found in the BAM file. Each subject is then a tab-separated column whose entries give the two alleles (0 = ref, 1 = alt) separated by a vertical bar |, e.g. 0|0 or 0|1. This file therefore has one more row than posfile, which has no header.

*.{txt}
genfile file

(Optional) Path to gen file with high-coverage results. Empty for no genfile. If both genfile and phasefile are given, only phasefile is used, as genfile (unphased genotypes) can be derived from phasefile (phased genotypes). File has a header row with a name for each sample, matching what is found in the BAM file. Each subject is then a tab-separated column, with 0 = hom ref, 1 = het, 2 = hom alt and NA indicating a missing genotype, with rows corresponding to rows of the posfile. This file therefore has one more row than posfile, which has no header. [default ""]

*.{txt}
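The relationship between phasefile and genfile can be illustrated in a few lines: summing the two phased alleles gives the unphased genotype code (0, 1 or 2), and both files carry one header row plus one row per posfile line. This is a minimal Python sketch with hypothetical function names. It does not handle missing phased entries, since the description does not say how those would be encoded.

```python
def phase_to_gen_rows(phase_lines):
    """Turn phasefile lines (header + 'a|b' cells) into genfile rows (header + 0/1/2 cells)."""
    header, *body = [line.rstrip("\n").split("\t") for line in phase_lines]
    gen_rows = [header]
    for fields in body:
        row = []
        for cell in fields:
            first, second = cell.split("|")
            # 0|0 -> 0 (hom ref), 0|1 or 1|0 -> 1 (het), 1|1 -> 2 (hom alt)
            row.append(str(int(first) + int(second)))
        gen_rows.append(row)
    return gen_rows


def has_expected_row_count(table_rows, n_positions):
    """True when the table has a header row plus exactly one row per posfile line."""
    return len(table_rows) == n_positions + 1
```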
chr string

(Mandatory) Chromosome to run. Should match the chromosome name used in the BAM headers.

regions_start integer

(Mandatory) When running imputation, where to start from. The 1-based position x is kept if regionStart <= x <= regionEnd.

regions_end integer

(Mandatory) When running imputation, where to stop.

ngen integer

Number of generations since founding or mixing. Note that the algorithm is relatively robust to this. Use nGen = 4 * Ne / K if unsure.
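The rule of thumb above is simple arithmetic. The description does not define Ne or K, so the sketch below reads them as the effective population size and the number of haplotypes (an assumption). Rounding to the nearest integer and flooring at 1 are also illustrative choices, made only because ngen is an integer input.

```python
def default_ngen(ne, k):
    """Rule of thumb from the parameter description: nGen = 4 * Ne / K."""
    if k <= 0:
        raise ValueError("K must be positive")
    # Assumption: round to the nearest integer and never return less than 1.
    return max(1, round(4 * ne / k))


# Illustrative values only: Ne = 20000 and K = 800 give 4 * 20000 / 800 = 100.
print(default_ngen(20000, 800))
```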

buffer integer

Buffer around the region to perform imputation over. Imputation is run from regionStart-buffer to regionEnd+buffer, and reported for regionStart to regionEnd, including the bases of regionStart and regionEnd.
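The interplay of regions_start, regions_end and buffer amounts to two inclusive, 1-based windows: a wider one that imputation runs over and a narrower one that is reported. This is a small Python sketch of that arithmetic. The function names are hypothetical, and clamping the lower bound at position 1 is an assumption, not documented behaviour.

```python
def imputation_windows(region_start, region_end, buffer):
    """Return ((run_start, run_end), (report_start, report_end)), all 1-based inclusive."""
    if region_start > region_end:
        raise ValueError("region_start must not exceed region_end")
    if buffer < 0:
        raise ValueError("buffer must be non-negative")
    # Assumption: positions are 1-based, so the buffered start is clamped at 1.
    run_start = max(1, region_start - buffer)
    run_end = region_end + buffer
    return (run_start, run_end), (region_start, region_end)


def is_reported(pos, region_start, region_end):
    """A position x is kept if regionStart <= x <= regionEnd."""
    return region_start <= pos <= region_end
```

For example, a region of 1,000,000 to 2,000,000 with a buffer of 500,000 is imputed over 500,000 to 2,500,000 but reported only for 1,000,000 to 2,000,000.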

genetic_map file

(Optional) File with genetic map information: three whitespace-delimited columns giving position (1-based), genetic rate map in cM/Mbp, and genetic map in cM. If no file is included, rate is based on physical distance and expected rate (expRate).

*.{txt,map}{,.gz}
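A genetic map in the three-column layout described above can be sanity-checked before it is passed to the module. The Python sketch below is illustrative only: `read_genetic_map` is a hypothetical helper, and tolerating a single non-numeric header line and requiring strictly increasing positions with non-decreasing cM values are assumptions rather than documented requirements.

```python
def read_genetic_map(lines):
    """Parse 'position rate_cM_per_Mbp cM' rows into (int, float, float) tuples."""
    rows = []
    first = True
    for line in lines:
        fields = line.split()  # splits on any whitespace
        if not fields:
            continue
        try:
            if len(fields) != 3:
                raise ValueError
            row = (int(fields[0]), float(fields[1]), float(fields[2]))
        except ValueError:
            # Assumption: the first non-blank line may be a header and is skipped.
            if first:
                first = False
                continue
            raise ValueError(f"malformed genetic map line: {line!r}") from None
        first = False
        rows.append(row)
    # Assumption: positions strictly increase and the cumulative map never decreases.
    for (pos0, _, cm0), (pos1, _, cm1) in zip(rows, rows[1:]):
        if pos1 <= pos0:
            raise ValueError("positions must be strictly increasing")
        if cm1 < cm0:
            raise ValueError("cumulative cM must not decrease")
    return rows
```

The function takes any iterable of text lines, so a plain file opened with `open(path)` or a gzipped one opened with `gzip.open(path, "rt")` can be passed in directly.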
#2 tuple
meta2 map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

fasta file

(Optional) File with reference genome.

*.{fa,fasta}
fasta_fai file

(Optional) File with reference genome index.

*.{fai}
Output 6 channels
#1 tbi tuple
meta map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*.vcf.gz.tbi file

TBI file of the VCF.

*.{vcf.gz.tbi}
#2 vcf tuple
meta map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*.vcf.gz file

VCF file with both SNP annotation information and per-sample genotype information.

*.{vcf.gz}
#3 plots tuple
meta map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

plots directory

Folder of plots generated during the imputation process.

plots