BioWDL: gatk-preprocess

A BioWDL workflow for preprocessing BAM files for variant calling. Based on the GATK Best Practices.

Inputs for GatkPreprocess

The following is an overview of all available inputs in GatkPreprocess.

Required inputs

GatkPreprocess.bam
File
The BAM file which should be processed
GatkPreprocess.bamIndex
File
The index for the BAM file
GatkPreprocess.dbsnpVCF
File
A dbSNP VCF file.
GatkPreprocess.dbsnpVCFIndex
File
The index for the dbSNP VCF file.
GatkPreprocess.referenceFasta
File
The reference fasta file
GatkPreprocess.referenceFastaDict
File
Sequence dictionary (.dict) for the reference fasta file
GatkPreprocess.referenceFastaFai
File
Fasta index (.fai) for the reference fasta file
GatkPreprocess.scatters
Array[File]
The BED files defining the regions over which the processing is scattered.
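
As a minimal sketch, the required inputs listed above could be collected into a WDL inputs JSON as shown below. All file paths are hypothetical placeholders, not files shipped with the workflow; only the input names are taken from this overview.

```python
import json

# Hypothetical paths: replace every value with the location of your own files.
required_inputs = {
    "GatkPreprocess.bam": "/data/sample.bam",
    "GatkPreprocess.bamIndex": "/data/sample.bai",
    "GatkPreprocess.dbsnpVCF": "/data/dbsnp.vcf.gz",
    "GatkPreprocess.dbsnpVCFIndex": "/data/dbsnp.vcf.gz.tbi",
    "GatkPreprocess.referenceFasta": "/data/reference.fasta",
    "GatkPreprocess.referenceFastaDict": "/data/reference.dict",
    "GatkPreprocess.referenceFastaFai": "/data/reference.fasta.fai",
    # Array[File] inputs are written as JSON lists.
    "GatkPreprocess.scatters": [
        "/data/scatters/scatter-0.bed",
        "/data/scatters/scatter-1.bed",
    ],
}

# Serialize to the JSON text that would be saved as the inputs file.
inputs_json = json.dumps(required_inputs, indent=2)
print(inputs_json)
```

The printed JSON can be saved to a file and passed to the WDL execution engine of your choice as the workflow inputs.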

Other common inputs

GatkPreprocess.bamName
String — Default: "recalibrated"
The basename for the produced BAM files. This should not include any parent directories; use `outputDir` to change the output directory.
GatkPreprocess.outputDir
String — Default: "."
The directory to which the outputs will be written.
GatkPreprocess.splitSplicedReads
Boolean — Default: false
Whether GATK's SplitNCigarReads should be run to split spliced reads. This should be enabled for RNA-seq samples.
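
The sketch below shows one way the common inputs might be set for an RNA-seq sample. The helper `common_inputs` and the sample names are hypothetical and only for illustration; the check simply enforces the rule that `bamName` is a plain basename, with directories going into `outputDir` instead.

```python
import json
import os


def common_inputs(bam_name, output_dir=".", split_spliced_reads=False):
    """Build the common inputs (hypothetical helper, not part of BioWDL)."""
    # bamName must not contain parent directories; those belong in outputDir.
    if bam_name in ("", ".", "..") or os.path.basename(bam_name) != bam_name:
        raise ValueError("bamName must be a plain basename; set outputDir instead")
    return {
        "GatkPreprocess.bamName": bam_name,
        "GatkPreprocess.outputDir": output_dir,
        "GatkPreprocess.splitSplicedReads": split_spliced_reads,
    }


# RNA-seq sample: enable splitting of spliced reads.
rna_inputs = common_inputs(
    "sample1.recalibrated",
    output_dir="results/sample1",
    split_spliced_reads=True,
)
print(json.dumps(rna_inputs, indent=2))
```

These entries would be merged into the same inputs JSON as the required inputs.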

Advanced inputs

GatkPreprocess.applyBqsr.javaXmxMb
Int — Default: 2048
The maximum memory available to the program in megabytes. Should be lower than `memoryMb` to accommodate JVM overhead.
GatkPreprocess.applyBqsr.memoryMb
Int — Default: javaXmxMb + 512
The amount of memory this job will use in megabytes.
GatkPreprocess.baseRecalibrator.javaXmxMb
Int — Default: 1024
The maximum memory available to the program in megabytes. Should be lower than `memoryMb` to accommodate JVM overhead.
GatkPreprocess.baseRecalibrator.knownIndelsSitesVCFIndexes
Array[File] — Default: []
The indexes for the known variant VCFs.
GatkPreprocess.baseRecalibrator.knownIndelsSitesVCFs
Array[File] — Default: []
VCF files with known indels.
GatkPreprocess.baseRecalibrator.memoryMb
Int — Default: javaXmxMb + 512
The amount of memory this job will use in megabytes.
GatkPreprocess.dockerImages
Map[String,String] — Default: {"picard": "quay.io/biocontainers/picard:2.23.2--0", "gatk4": "quay.io/biocontainers/gatk4:4.1.8.0--py38h37ae868_0"}
The docker images used. Changing this may result in errors which the developers may choose not to address.
GatkPreprocess.gatherBamFiles.compressionLevel
Int?
The compression level of the output BAM.
GatkPreprocess.gatherBamFiles.createMd5File
Boolean — Default: false
Whether to create an MD5 digest file for the output BAM.
GatkPreprocess.gatherBamFiles.javaXmxMb
Int — Default: 1024
The maximum memory available to the program in megabytes. Should be lower than `memoryMb` to accommodate JVM overhead.
GatkPreprocess.gatherBamFiles.memoryMb
Int — Default: javaXmxMb + 512
The amount of memory this job will use in megabytes.
GatkPreprocess.gatherBamFiles.timeMinutes
Int — Default: 1 + ceil((size(inputBams,"G") * 1))
The maximum amount of time the job will run in minutes.
GatkPreprocess.gatherBqsr.javaXmxMb
Int — Default: 256
The maximum memory available to the program in megabytes. Should be lower than `memoryMb` to accommodate JVM overhead.
GatkPreprocess.gatherBqsr.memoryMb
Int — Default: 256 + javaXmxMb
The amount of memory this job will use in megabytes.
GatkPreprocess.gatherBqsr.timeMinutes
Int — Default: 1
The maximum amount of time the job will run in minutes.
GatkPreprocess.splitNCigarReads.javaXmx
String — Default: "4G"
The maximum memory available to the program. Should be lower than `memory` to accommodate JVM overhead.
GatkPreprocess.splitNCigarReads.memory
String — Default: "5G"
The amount of memory this job will use.
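
The default expressions listed above are plain arithmetic and can be reproduced as in the sketch below. The function names are hypothetical, and the input size is assumed to be supplied directly in gigabytes rather than measured from files.

```python
import math


def default_memory_mb(java_xmx_mb, overhead_mb=512):
    """memoryMb default: the JVM heap plus headroom for JVM overhead."""
    return java_xmx_mb + overhead_mb


def gather_bam_files_time_minutes(input_size_gb):
    """timeMinutes default for gatherBamFiles: 1 + ceil(size(inputBams, "G") * 1)."""
    return 1 + math.ceil(input_size_gb * 1)


print(default_memory_mb(2048))                   # applyBqsr: 2560
print(default_memory_mb(1024))                   # baseRecalibrator, gatherBamFiles: 1536
print(default_memory_mb(256, overhead_mb=256))   # gatherBqsr: 512
print(gather_bam_files_time_minutes(12.3))       # 14
```

When overriding `javaXmxMb` without setting `memoryMb`, the job's memory request follows these expressions, so the headroom for JVM overhead is preserved.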