nf-core/quilt/quilt2 @ 0.0.0-6c4ed3a
Summary
QUILT2 is an R and C++ program for fast genotype imputation from low-coverage sequence using a large phased reference panel in VCF/BCF format.
Get started
Add the following snippet to your workflow script to include this module.
include { QUILT_QUILT2 } from 'nf-core/quilt/quilt2'
License
MIT License
Name: QUILT_QUILT2
| Name | Type | Description | Pattern |
|---|---|---|---|
| meta | map | Groovy Map containing sample information, e.g. `[ id:'test', single_end:false ]` | |
| bams | file | (Mandatory) BAM/CRAM files. | `*.{bam,cram}` |
| bais | file | (Mandatory) BAM/CRAM index files. | `*.{bai,crai}` |
| bamlist | file | (Optional) File with list of BAM/CRAM files to impute. One file per line. | `*.{txt}` |
| samplename | file | (Optional) File with list of sample names to impute, in the same order as in bamlist. One sample name per line. | `*.{txt}` |
| reference_vcf_file | file | (Mandatory) Phased reference panel VCF/BCF file for imputation. | `*.{vcf,vcf.gz,bcf}` |
| reference_vcf_index | file | (Mandatory) Index for the reference panel VCF file. | `*.{tbi,csi}` |
| posfile | file | (Optional) File with the positions at which to impute, lining up one-to-one with genfile. The file is tab separated with no header, one row per SNP, with col 1 = chromosome, col 2 = physical position (sorted from smallest to largest), col 3 = reference base, col 4 = alternate base. Bases are capitalized. | `*.{txt}` |
| phasefile | file | (Optional) File with truth phasing results. Supersedes genfile if both options are given. The file has a header row with a name for each sample, matching what is found in the BAM file. Each subject is then a tab-separated column, with 0 = ref and 1 = alt, separated by a vertical bar, e.g. `0\|0` or `0\|1`. This file therefore has one more row than posfile, which has no header. | `*.{txt}` |
| genfile | file | (Optional) Path to gen file with high-coverage results. Empty for no genfile. If both genfile and phasefile are given, only phasefile is used, because genfile (unphased genotypes) can be derived from phasefile (phased genotypes). The file has a header row with a name for each sample, matching what is found in the BAM file. Each subject is then a tab-separated column, with 0 = hom ref, 1 = het, 2 = hom alt, and NA indicating a missing genotype, with rows corresponding to rows of the posfile. This file therefore has one more row than posfile, which has no header [default ""]. | `*.{txt}` |
| chr | string | (Mandatory) Which chromosome to run. Should match the BAM headers. | |
| regions_start | integer | (Mandatory) When running imputation, where to start from. The 1-based position x is kept if regionStart <= x <= regionEnd. | |
| regions_end | integer | (Mandatory) When running imputation, where to stop. | |
| ngen | integer | Number of generations since founding or mixing. Note that the algorithm is relatively robust to this. Use nGen = 4 * Ne / K if unsure. | |
| buffer | integer | Buffer around the region to perform imputation over. Imputation is run from regionStart-buffer to regionEnd+buffer and reported for regionStart to regionEnd, including the bases at regionStart and regionEnd. | |
| genetic_map | file | (Optional) File with genetic map information: 3 white-space delimited entries giving position (1-based), genetic rate map in cM/Mbp, and genetic map in cM. If no file is provided, the rate is based on physical distance and the expected rate (expRate). | `*.{txt,map}{,gz}` |
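The posfile, phasefile, and genfile layouts described above can be sanity-checked before a run. The following is a minimal Python sketch, not part of the module: the helper names (`parse_posfile`, `parse_phasefile`, `phase_to_genotypes`) and the in-memory sample data are hypothetical, and the sketch assumes a single chromosome per posfile. It only restates the rules given in the descriptions: posfile is tab separated with no header, phasefile has one header row, and an unphased genotype is the count of alt alleles in the phased call.

```python
def parse_posfile(text):
    """Parse posfile text: no header, tab-separated chrom, pos, ref, alt."""
    rows = []
    for line_no, line in enumerate(text.strip().splitlines(), start=1):
        fields = line.split("\t")
        if len(fields) != 4:
            raise ValueError(f"posfile line {line_no}: expected 4 columns, got {len(fields)}")
        chrom, pos, ref, alt = fields
        if ref != ref.upper() or alt != alt.upper():
            raise ValueError(f"posfile line {line_no}: bases must be capitalized")
        rows.append((chrom, int(pos), ref, alt))
    positions = [row[1] for row in rows]
    if positions != sorted(positions):
        raise ValueError("posfile positions must be sorted from smallest to largest")
    return rows


def parse_phasefile(text):
    """Parse phasefile text: header row of sample names, then one row per SNP."""
    lines = text.strip().splitlines()
    samples = lines[0].split("\t")
    calls = []
    for line in lines[1:]:
        row = []
        for cell in line.split("\t"):
            first, second = cell.split("|")
            row.append((int(first), int(second)))
        calls.append(row)
    return samples, calls


def phase_to_genotypes(calls):
    """Collapse phased alleles to genotypes: 0 = hom ref, 1 = het, 2 = hom alt."""
    return [[first + second for first, second in row] for row in calls]


# Hypothetical example data, for illustration only.
posfile_text = "chr1\t100\tA\tG\nchr1\t250\tC\tT\n"
phasefile_text = "sampleA\tsampleB\n0|0\t0|1\n1|1\t1|0\n"

positions = parse_posfile(posfile_text)
samples, calls = parse_phasefile(phasefile_text)

# The phasefile has exactly one more row (its header) than the posfile.
assert len(calls) == len(positions)

genotypes = phase_to_genotypes(calls)
print(samples)    # ['sampleA', 'sampleB']
print(genotypes)  # [[0, 1], [2, 1]]
```

The genotype rows produced this way follow the genfile coding, which is why the module only needs phasefile when both are supplied.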
| Name | Type | Description | Pattern |
|---|---|---|---|
| meta2 | map | Groovy Map containing sample information, e.g. `[ id:'test', single_end:false ]` | |
| fasta | file | (Optional) File with reference genome. | `*.{fa,fasta}` |
| fasta_fai | file | (Optional) File with reference genome index. | `*.{fai}` |
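How regions_start, regions_end, and buffer interact can be illustrated with a small Python sketch. The function names and the numbers are hypothetical. The sketch only restates the documented rule: imputation runs over regionStart-buffer to regionEnd+buffer, while results are reported for 1-based positions x with regionStart <= x <= regionEnd. Clamping the lower bound at 1 is an assumption of this sketch, not documented behaviour of the module.

```python
def imputation_window(region_start, region_end, buffer):
    """Return the 1-based inclusive interval that imputation is run over."""
    if region_start > region_end:
        raise ValueError("region_start must not exceed region_end")
    # Assumption: positions are 1-based, so the lower bound is clamped at 1.
    return max(1, region_start - buffer), region_end + buffer


def is_reported(pos, region_start, region_end):
    """A position is kept in the output if region_start <= pos <= region_end."""
    return region_start <= pos <= region_end


start, end, buf = 1_000_000, 2_000_000, 500_000
window = imputation_window(start, end, buf)
print(window)  # (500000, 2500000)

for pos in (750_000, 1_000_000, 2_000_000, 2_000_001):
    in_window = window[0] <= pos <= window[1]
    print(pos, in_window, is_reported(pos, start, end))
```

Positions such as 750000 fall inside the buffered window and so inform the imputation, but they are not reported because they lie outside regionStart to regionEnd.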
tbi (tuple)
| Name | Type | Description | Pattern |
|---|---|---|---|
| meta | map | Groovy Map containing sample information, e.g. `[ id:'test', single_end:false ]` | |
| `*.vcf.gz.tbi` | file | TBI file of the VCF. | `*.{vcf.gz.tbi}` |
vcf (tuple)
| Name | Type | Description | Pattern |
|---|---|---|---|
| meta | map | Groovy Map containing sample information, e.g. `[ id:'test', single_end:false ]` | |
| `*.vcf.gz` | file | VCF file with both SNP annotation information and per-sample genotype information. | `*.{vcf.gz}` |
plots (tuple)
| Name | Type | Description | Pattern |
|---|---|---|---|
| meta | map | Groovy Map containing sample information, e.g. `[ id:'test', single_end:false ]` | |
| plots | directory | Folder of plots generated during the imputation process. | `plots` |