BioWDL: structural-variantcalling

A workflow for calling structural variants.

Inputs for SVcalling

The following is an overview of all available inputs in SVcalling.

Required inputs

SVcalling.bamFile
File
sorted BAM file
SVcalling.bamIndex
File
BAM index (.bai) file
SVcalling.bwaIndex
BwaIndex
Struct containing the BWA reference files
SVcalling.manta.cores
Int — Default: 1
The number of cores required to run the program.
SVcalling.manta.memoryGb
Int — Default: 4
The memory required to run manta.
SVcalling.referenceFasta
File
The reference fasta file
SVcalling.referenceFastaDict
File
Sequence dictionary (.dict) file of the reference
SVcalling.referenceFastaFai
File
Fasta index (.fai) file of the reference
SVcalling.sample
String
The name of the sample
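
A minimal sketch of how the required inputs above could be assembled into an inputs JSON file for a WDL runner. All file paths are hypothetical. The index and dictionary paths are derived from common naming conventions, which is an assumption, and the `fastaFile` / `indexFiles` field names of the `BwaIndex` struct are also assumed and should be checked against the struct definition in the workflow. `SVcalling.manta.cores` and `SVcalling.manta.memoryGb` are left out because they have defaults.

```python
import json


def required_inputs(sample, bam, reference_fasta, bwa_index_files):
    """Build a dict with the required SVcalling inputs.

    Index and dictionary paths are guessed from the usual naming
    conventions (an assumption); pass real paths if yours differ.
    """
    return {
        "SVcalling.sample": sample,
        "SVcalling.bamFile": bam,
        "SVcalling.bamIndex": bam + ".bai",
        "SVcalling.referenceFasta": reference_fasta,
        "SVcalling.referenceFastaFai": reference_fasta + ".fai",
        "SVcalling.referenceFastaDict": reference_fasta.rsplit(".", 1)[0] + ".dict",
        # Field names of the BwaIndex struct are assumed, not taken from this page.
        "SVcalling.bwaIndex": {
            "fastaFile": reference_fasta,
            "indexFiles": bwa_index_files,
        },
    }


# Hypothetical paths, for illustration only.
inputs = required_inputs(
    sample="sample1",
    bam="/data/sample1.bam",
    reference_fasta="/refs/genome.fasta",
    bwa_index_files=[
        "/refs/genome.fasta." + ext for ext in ("amb", "ann", "bwt", "pac", "sa")
    ],
)

print(json.dumps(inputs, indent=2))
```

The printed JSON can be saved to a file and passed to the workflow engine as the inputs file.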

Other common inputs

SVcalling.manta.callRegions
File?
The bed file which indicates the regions to operate on.
SVcalling.manta.callRegionsIndex
File?
The index of the bed file which indicates the regions to operate on.
SVcalling.manta.exome
Boolean — Default: false
Whether or not the data is from exome sequencing.
SVcalling.outputDir
String — Default: "."
The directory the output should be written to.
SVcalling.setId.annsFile
File?
Bgzip-compressed and tabix-indexed file with annotations (see man page for details).
SVcalling.setId.annsFileIndex
File?
The index for annsFile.
SVcalling.setId.inputFileIndex
File?
The index for the input vcf or bcf.
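
The common optional inputs above can be layered on top of an existing inputs dict. The sketch below is illustrative only: the helper name `add_common_inputs` is hypothetical, the paths are made up, and the `.tbi` suffix for the regions index assumes a bgzip-compressed, tabix-indexed BED file. Optional `File?` inputs that are not needed are simply left out of the JSON.

```python
def add_common_inputs(inputs, output_dir=".", call_regions=None, exome=False):
    """Return a copy of `inputs` with the common optional inputs added."""
    updated = dict(inputs)
    updated["SVcalling.outputDir"] = output_dir
    updated["SVcalling.manta.exome"] = exome
    if call_regions is not None:
        # The regions file and its index are supplied together.
        updated["SVcalling.manta.callRegions"] = call_regions
        # Assumed naming convention for a tabix index.
        updated["SVcalling.manta.callRegionsIndex"] = call_regions + ".tbi"
    return updated


# Hypothetical exome run restricted to target regions.
common = add_common_inputs(
    {},
    output_dir="results/sample1",
    call_regions="/refs/targets.bed.gz",
    exome=True,
)
print(common)
```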

Advanced inputs

SVcalling.annotateDH.memory
String — Default: "15GiB"
The memory required to run the programs.
SVcalling.annotateDH.timeMinutes
Int — Default: 1440
The maximum duration (in minutes) the tool is allowed to run.
SVcalling.clever.memory
String — Default: "80GiB"
The memory required to run the programs.
SVcalling.clever.threads
Int — Default: 10
The number of threads required to run the program.
SVcalling.clever.timeMinutes
Int — Default: 2200
The maximum amount of time the job will run in minutes.
SVcalling.delly.genotypeBcf
File?
A BCF with SVs to get genotyped in the samples.
SVcalling.delly.genotypeBcfIndex
File?
The index for the genotype BCF file.
SVcalling.delly.memory
String — Default: "15GiB"
The memory required to run the programs.
SVcalling.delly.timeMinutes
Int — Default: 600
The maximum amount of time the job will run in minutes.
SVcalling.delly2vcf.exclude
String?
Exclude sites for which the expression is true (see man page for details).
SVcalling.delly2vcf.excludeUncalled
Boolean — Default: false
Exclude sites without a called genotype (see man page for details).
SVcalling.delly2vcf.include
String?
Select sites for which the expression is true (see man page for details).
SVcalling.delly2vcf.memory
String — Default: "256MiB"
The amount of memory this job will use.
SVcalling.delly2vcf.samples
Array[String] — Default: []
A list of sample names to include.
SVcalling.delly2vcf.timeMinutes
Int — Default: 1 + ceil(size(inputFile,"G"))
The maximum amount of time the job will run in minutes.
SVcalling.dockerImages
Map[String,String] — Default: {"bcftools": "quay.io/biocontainers/bcftools:1.10.2--h4f4756c_2", "clever": "quay.io/biowdl/clever-toolkit:2.4", "delly": "quay.io/biocontainers/delly:0.8.5--hf3ca161_0", "manta": "quay.io/biocontainers/manta:1.4.0--py27_1", "picard": "quay.io/biocontainers/picard:2.23.2--0", "samtools": "quay.io/biocontainers/samtools:1.10--h9402c20_2", "survivor": "quay.io/biocontainers/survivor:1.0.7--hd03093a_2", "smoove": "quay.io/biocontainers/smoove:0.2.5--0", "duphold": "quay.io/biocontainers/duphold:0.2.1--h516909a_1", "gridss": "quay.io/biowdl/gridss:2.12.2"}
A map describing the docker image used for the tasks.
SVcalling.excludeMisHomRef
Boolean — Default: false
Option to exclude missing and homozygous reference genotypes.
SVcalling.FilterShortReadsBam.memory
String — Default: "1GiB"
The amount of memory this job will use.
SVcalling.FilterShortReadsBam.timeMinutes
Int — Default: 1 + ceil((size(bamFile,"GiB") * 8))
The maximum amount of time the job will run in minutes.
SVcalling.getIntersections.excludeUncalled
Boolean — Default: false
Exclude sites without a called genotype (see man page for details).
SVcalling.getIntersections.include
String?
Select sites for which the expression is true (see man page for details).
SVcalling.getIntersections.memory
String — Default: "256MiB"
The amount of memory this job will use.
SVcalling.getIntersections.samples
Array[String] — Default: []
A list of sample names to include.
SVcalling.getIntersections.timeMinutes
Int — Default: 1 + ceil(size(inputFile,"G"))
The maximum amount of time the job will run in minutes.
SVcalling.getSVtype.exclude
String?
Exclude sites for which the expression is true (see man page for details).
SVcalling.getSVtype.excludeUncalled
Boolean — Default: false
Exclude sites without a called genotype (see man page for details).
SVcalling.getSVtype.memory
String — Default: "256MiB"
The amount of memory this job will use.
SVcalling.getSVtype.samples
Array[String] — Default: []
A list of sample names to include.
SVcalling.getSVtype.timeMinutes
Int — Default: 1 + ceil(size(inputFile,"G"))
The maximum amount of time the job will run in minutes.
SVcalling.gridss.blacklistBed
File?
A BED file with blacklisted regions.
SVcalling.gridss.gridssProperties
File?
A properties file for gridss.
SVcalling.gridss.jvmHeapSizeGb
Int — Default: 64
The size of the JVM heap for assembly and variant calling.
SVcalling.gridss.nonJvmMemoryGb
Int — Default: 10
The amount of memory in Gb to be requested besides JVM memory.
SVcalling.gridss.normalBai
File?
The index for normalBam.
SVcalling.gridss.normalBam
File?
The BAM file for the normal/control sample.
SVcalling.gridss.normalLabel
String?
The name of the normal sample.
SVcalling.gridss.threads
Int — Default: 12
The number of threads to use.
SVcalling.gridss.timeMinutes
Int — Default: ceil((7200 / threads)) + 1800
The maximum amount of time the job will run in minutes.
SVcalling.gridssSvTyped.dockerImage
String — Default: "quay.io/biocontainers/bioconductor-structuralvariantannotation:1.10.0--r41hdfd78af_0"
The docker image used for this task. Changing this may result in errors which the developers may choose not to address.
SVcalling.gridssSvTyped.memory
String — Default: "32GiB"
The amount of memory this job will use.
SVcalling.gridssSvTyped.timeMinutes
Int — Default: 240
The maximum amount of time the job will run in minutes.
SVcalling.manta.timeMinutes
Int — Default: 2880
The maximum amount of time the job will run in minutes.
SVcalling.mateclever.cleverMaxDelLength
Int — Default: 100000
The maximum deletion length to look for in Clever predictions.
SVcalling.mateclever.maxLengthDiff
Int — Default: 30
The maximum length difference between split-read and read-pair deletion to be considered identical.
SVcalling.mateclever.maxOffset
Int — Default: 150
The maximum center distance between split-read and read-pair deletion to be considered identical.
SVcalling.mateclever.memory
String — Default: "250GiB"
The memory required to run the programs.
SVcalling.mateclever.threads
Int — Default: 10
The number of threads required to run the program.
SVcalling.mateclever.timeMinutes
Int — Default: 2880
The maximum amount of time the job will run in minutes.
SVcalling.removeFpDupDel.exclude
String?
Exclude sites for which the expression is true (see man page for details).
SVcalling.removeFpDupDel.excludeUncalled
Boolean — Default: false
Exclude sites without a called genotype (see man page for details).
SVcalling.removeFpDupDel.memory
String — Default: "256MiB"
The amount of memory this job will use.
SVcalling.removeFpDupDel.samples
Array[String] — Default: []
A list of sample names to include.
SVcalling.removeFpDupDel.timeMinutes
Int — Default: 1 + ceil(size(inputFile,"G"))
The maximum amount of time the job will run in minutes.
SVcalling.removeMisHomRR.include
String?
Select sites for which the expression is true (see man page for details).
SVcalling.removeMisHomRR.memory
String — Default: "256MiB"
The amount of memory this job will use.
SVcalling.removeMisHomRR.samples
Array[String] — Default: []
A list of sample names to include.
SVcalling.removeMisHomRR.timeMinutes
Int — Default: 1 + ceil(size(inputFile,"G"))
The maximum amount of time the job will run in minutes.
SVcalling.renameSample.javaXmx
String — Default: "8G"
The maximum memory available to the program. Should be lower than `memory` to accommodate JVM overhead.
SVcalling.renameSample.memory
String — Default: "9GiB"
The memory required to run the programs.
SVcalling.renameSample.timeMinutes
Int — Default: 1 + ceil((size(inputVcf,"GiB") * 2))
The maximum amount of time the job will run in minutes.
SVcalling.runClever
Boolean — Default: false
Whether or not to run clever.
SVcalling.runDupHold
Boolean — Default: false
Option to run duphold annotation and filter FP deletions and duplications.
SVcalling.runSmoove
Boolean — Default: true
Whether or not to run smoove.
SVcalling.setId.collapse
String?
Treat as identical records with <snps|indels|both|all|some|none>, see man page for details.
SVcalling.setId.columns
Array[String] — Default: []
Comma-separated list of columns or tags to carry over from the annotation file (see man page for details).
SVcalling.setId.exclude
String?
Exclude sites for which the expression is true (see man page for details).
SVcalling.setId.force
Boolean — Default: false
Continue even when parsing errors, such as undefined tags, are encountered.
SVcalling.setId.headerLines
File?
Lines to append to the VCF header (see man page for details).
SVcalling.setId.include
String?
Select sites for which the expression is true (see man page for details).
SVcalling.setId.keepSites
Boolean — Default: false
Keep sites which do not pass -i and -e expressions instead of discarding them.
SVcalling.setId.markSites
String?
Annotate sites which are present ('+') or absent ('-') in the -a file with a new INFO/TAG flag.
SVcalling.setId.memory
String — Default: "4GiB"
The amount of memory this job will use.
SVcalling.setId.noVersion
Boolean — Default: false
Do not append version and command line information to the output VCF header.
SVcalling.setId.regions
String?
Restrict to comma-separated list of regions.
SVcalling.setId.regionsFile
File?
Restrict to regions listed in a file.
SVcalling.setId.removeAnns
Array[String] — Default: []
List of annotations to remove (see man page for details).
SVcalling.setId.renameChrs
File?
Rename chromosomes according to the map in file (see man page for details).
SVcalling.setId.samples
Array[String] — Default: []
List of samples for sample stats, "-" to include all samples.
SVcalling.setId.samplesFile
File?
File of samples to include.
SVcalling.setId.singleOverlaps
Boolean — Default: false
Keep memory requirements low with very large annotation files.
SVcalling.setId.threads
Int — Default: 0
Number of extra decompression threads [0].
SVcalling.setId.timeMinutes
Int — Default: 60 + ceil(size(inputFile,"G"))
The maximum amount of time the job will run in minutes.
SVcalling.smoove.memory
String — Default: "15GiB"
The memory required to run the programs.
SVcalling.smoove.timeMinutes
Int — Default: 1440
The maximum duration (in minutes) the tool is allowed to run.
SVcalling.sort.memory
String — Default: "5GiB"
The amount of memory this job will use.
SVcalling.sort.timeMinutes
Int — Default: 1 + ceil(size(inputFile,"G")) * 5
The maximum amount of time the job will run in minutes.
SVcalling.survivor.breakpointDistance
Int — Default: 1000
The maximum distance between the pairwise breakpoints of SVs.
SVcalling.survivor.distanceBySvSize
Boolean — Default: false
A boolean to predict the pairwise distance between the SVs based on their size.
SVcalling.survivor.memory
String — Default: "24GiB"
The memory required to run the programs.
SVcalling.survivor.minSize
Int — Default: 30
The minimum size of SV to be merged.
SVcalling.survivor.strandType
Boolean — Default: true
A boolean to include the strand type of an SV to be merged.
SVcalling.survivor.svType
Boolean — Default: true
A boolean to include the type of SV to be merged.
SVcalling.survivor.timeMinutes
Int — Default: 60
The maximum amount of time the job will run in minutes.
SVcalling.svtypes
Array[String] — Default: ["DEL", "DUP", "INS", "INV", "BND"]
List of svtypes to be further processed and output by the pipeline.
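
Several `timeMinutes` defaults above are expressions rather than fixed numbers. The sketch below reproduces that arithmetic in Python to help estimate the resulting limits. The function names are hypothetical, and it assumes that the `"G"` unit means decimal gigabytes (10^9 bytes) and `"GiB"` means 2^30 bytes. The workflow engine, not this code, evaluates the real defaults.

```python
import math

GB = 10**9    # assumed meaning of the "G" unit
GIB = 2**30   # assumed meaning of the "GiB" unit


def per_gb_minutes(input_bytes):
    """1 + ceil(size(inputFile, "G")), e.g. delly2vcf or getSVtype."""
    return 1 + math.ceil(input_bytes / GB)


def sort_minutes(input_bytes):
    """1 + ceil(size(inputFile, "G")) * 5; the multiplication binds first."""
    return 1 + math.ceil(input_bytes / GB) * 5


def filter_short_reads_minutes(bam_bytes):
    """1 + ceil(size(bamFile, "GiB") * 8)."""
    return 1 + math.ceil(bam_bytes / GIB * 8)


def gridss_minutes(threads=12):
    """ceil(7200 / threads) + 1800."""
    return math.ceil(7200 / threads) + 1800


vcf_size = 2_500_000_000   # a hypothetical 2.5 GB VCF
bam_size = 30 * GIB        # a hypothetical 30 GiB BAM

print(per_gb_minutes(vcf_size))
print(sort_minutes(vcf_size))
print(filter_short_reads_minutes(bam_size))
print(gridss_minutes())
```

If a job is expected to exceed its computed limit, the corresponding `timeMinutes` input can be set explicitly in the inputs JSON.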