nf-core/links @ 0.0.0-6c4ed3a

LINKS is a genomics application for scaffolding genome assemblies with long reads, such as those produced by Oxford Nanopore Technologies Ltd. It can be used to scaffold high-quality draft genome assemblies with any long sequences (e.g. ONT reads, PacBio reads, other draft genomes, etc.). It is also used to scaffold contig pairs linked by ARCS/ARKS. This module is for LINKS >=2.0.0 and does not support MPET input.

Latest version: 0.0.0-6c4ed3a
Total downloads: 11
Source: nf-core/modules
Authors: @nschan
Maintainers: @nschan

Get started

Add the following snippet to your workflow script to include this module.

include { LINKS } from 'nf-core/links'

License

MIT License

Process
Name LINKS
Input 2 channels
#1 tuple
meta map

Groovy Map containing sample information e.g. [ id:'sample1' ]

assembly file

(Multi-)fasta file containing the draft assembly

*.{fa,fasta,fa.gz,fasta.gz}
#2 tuple
meta2 map

Groovy Map containing sample information e.g. [ id:'sample1' ]

reads file

fastq file(s) containing the long reads to be used for scaffolding

*.{fq,fastq,fq.gz,fastq.gz}
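
As a minimal sketch, the two input tuples described above could be built and passed to the process as follows. The file names `assembly.fa` and `reads.fastq.gz` and the sample id are placeholders chosen for illustration, not part of the module; the include path is the one given under "Get started".

```nextflow
include { LINKS } from 'nf-core/links'

workflow {
    // Input #1: [ meta, assembly ] -- placeholder path to a draft assembly
    ch_assembly = Channel.of( [ [ id:'sample1' ], file('assembly.fa') ] )

    // Input #2: [ meta2, reads ] -- placeholder path to the long reads
    ch_reads = Channel.of( [ [ id:'sample1' ], file('reads.fastq.gz') ] )

    // The process takes the two tuple channels in the order listed above
    LINKS( ch_assembly, ch_reads )
}
```

Both meta maps here carry the same id only for readability; whether they need to match depends on how the surrounding pipeline pairs assemblies with reads.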
Output 11 channels
#1 log tuple
meta map

Groovy Map containing sample information e.g. [ id:'sample1']

*.log file

text file; Logs execution time / errors / pairing stats.

*.log
#2 bloom tuple
meta map

Groovy Map containing sample information e.g. [ id:'sample1']

*.bloom file

Bloom filter created by shredding the -f input into k-mers of size -k

*.bloom
#3 scaffolds_csv tuple
meta map

Groovy Map containing sample information e.g. [ id:'sample1']

*.scaffolds file

comma-separated file containing the new scaffold(s)

*.scaffolds
#4 pairing_issues tuple
meta map

Groovy Map containing sample information e.g. [ id:'sample1']

*.pairing_issues file

text file; Lists all pairing issues encountered between contig pairs and illogical/out-of-bounds pairing.

*.pairing_issues
#5 versions_links tuple
${task.process} string

The name of the process

links string

The name of the tool

echo \$(LINKS | grep -o 'LINKS v.*' | sed 's/LINKS v//') eval

The expression to obtain the version of the tool

#6 scaffolds_fasta tuple
meta map

Groovy Map containing sample information e.g. [ id:'sample1']

*.scaffolds.fa file

fasta file of the new scaffold sequence

*.scaffolds.fa
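
The named outputs can be read from `LINKS.out` after the process has been called. The sketch below, which again assumes placeholder input paths, prints the scaffold FASTA per sample and the three elements of the `versions_links` tuple; the closure parameter names are illustrative only.

```nextflow
include { LINKS } from 'nf-core/links'

workflow {
    // Placeholder inputs; replace with real channels in a pipeline
    ch_assembly = Channel.of( [ [ id:'sample1' ], file('assembly.fa') ] )
    ch_reads    = Channel.of( [ [ id:'sample1' ], file('reads.fastq.gz') ] )

    LINKS( ch_assembly, ch_reads )

    // scaffolds_fasta emits [ meta, *.scaffolds.fa ]
    LINKS.out.scaffolds_fasta
        .view { meta, fasta -> "${meta.id}: ${fasta}" }

    // versions_links emits [ process name, tool name, version string ]
    LINKS.out.versions_links
        .view { process, tool, version -> "${process} ${tool} ${version}" }
}
```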
#7 scaffolds_graph tuple
meta map

Groovy Map containing sample information e.g. [ id:'sample1']

*.gv file

scaffold graph (for visualizing merges); can be rendered with Graphviz tools such as neato

*.gv
#8 tigpair_checkpoint tuple
meta map

Groovy Map containing sample information e.g. [ id:'sample1']

*.tigpair_checkpoint.tsv file

If -b BASENAME.tigpair_checkpoint.tsv is present, LINKS skips the k-mer pair extraction and contig pairing stages. Delete this file to force LINKS to start from the beginning. This file can be used to:

  1. quickly test parameters (-l min. links / -a min. links ratio),
  2. quickly recover from a crash,
  3. explore very large k-mer spaces,
  4. scaffold with the output of ARCS.

*.tigpair_checkpoint.tsv
#9 pairing_distribution tuple
meta map

Groovy Map containing sample information e.g. [ id:'sample1']

*.pairing_distribution.csv file

comma-separated file; 1st column is the calculated distance for each pair (template) with reads that assembled logically within the same contig. 2nd column is the number of pairs at that distance.

*.pairing_distribution.csv
#10 simplepair_checkpoint tuple