This is not a stable version!
You are currently viewing the documentation for a development version. This documentation is not guaranteed to be up to date, and things will likely change without announcement or a version increment. If no other documentation is available, there are likely no releases for this repository; the content is therefore likely still in development and not production ready. Use at your own risk!
Please be aware that the page you are currently viewing is not for the latest available version!
Inputs for MultisampleCalling
The following is an overview of all available inputs in MultisampleCalling.
Required inputs
- MultisampleCalling.bamFilesAndGenders
-
Array[BamAndGender]
List of structs, each containing a BAM file, a BAM index and a gender. The BAM should be recalibrated beforehand if required. The gender string is optional. Actionable values are 'female', 'f', 'F', 'male', 'm' and 'M'.
- MultisampleCalling.referenceFasta
-
File
The reference fasta file.
- MultisampleCalling.referenceFastaDict
-
File
Sequence dictionary (.dict) file of the reference.
- MultisampleCalling.referenceFastaFai
-
File
Fasta index (.fai) file of the reference
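The required inputs above are usually supplied as a JSON inputs file. The following is a minimal sketch of how such a file could be generated. The struct field names (`file`, `index`, `gender`), the sample file names and the convention for deriving the `.fai` and `.dict` paths are all assumptions made for illustration; check them against the workflow's struct definition and your own data.

```python
import json

# Gender strings the documentation lists as "actionable".
ACTIONABLE_GENDERS = {"female", "f", "F", "male", "m", "M"}


def is_actionable_gender(gender):
    """Return True if the gender string is one of the documented values."""
    return gender in ACTIONABLE_GENDERS


def bam_entry(bam, index, gender=None):
    """Build one BamAndGender entry (field names are assumed)."""
    entry = {"file": bam, "index": index}
    if gender is not None:
        # The gender string is optional, so it is only added when given.
        entry["gender"] = gender
    return entry


def required_inputs(samples, fasta):
    """Return the four required inputs as a dict.

    The .fai and .dict paths are derived from the FASTA path by a common
    naming habit; this is an assumption, not a requirement of the workflow.
    """
    stem = fasta.rsplit(".", 1)[0]
    return {
        "MultisampleCalling.bamFilesAndGenders": [bam_entry(*s) for s in samples],
        "MultisampleCalling.referenceFasta": fasta,
        "MultisampleCalling.referenceFastaFai": fasta + ".fai",
        "MultisampleCalling.referenceFastaDict": stem + ".dict",
    }


# Hypothetical sample files: one sample with a gender, one without.
inputs = required_inputs(
    [("s1.bam", "s1.bai", "female"), ("s2.bam", "s2.bai")],
    "reference.fasta",
)
print(json.dumps(inputs, indent=2))
```

Writing the printed JSON to a file gives an inputs file that a WDL execution engine can consume alongside the workflow.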
Other common inputs
- MultisampleCalling.dbsnpVCF
-
File?
dbSNP VCF file used for checking known sites.
- MultisampleCalling.dbsnpVCFIndex
-
File?
Index (.tbi) file for the dbSNP VCF.
- MultisampleCalling.dontUseSoftClippedBases
-
Boolean — Default:
false
Whether soft-clipped bases should be excluded from the haplotype caller analysis (should be set to 'true' for RNA).
- MultisampleCalling.jointgenotyping
-
Boolean — Default:
true
Whether to perform joint genotyping (using HaplotypeCaller to call GVCFs and merging them with GenotypeGVCFs) or not.
- MultisampleCalling.JointGenotyping.genotypeGvcfs.pedigree
-
File?
Pedigree file for determining the population "founders".
- MultisampleCalling.JointGenotyping.regions
-
File?
A bed file describing the regions to operate on.
- MultisampleCalling.JointGenotyping.Stats.compareVcf
-
File?
When inputVcf and compareVcf are given, the program generates separate stats for the intersection and the complements. By default only sites are compared; samples must be given to also include sample columns.
- MultisampleCalling.JointGenotyping.Stats.compareVcfIndex
-
File?
Index for the compareVcf.
- MultisampleCalling.outputDir
-
String — Default:
"."
The directory where the output files should be located.
- MultisampleCalling.regions
-
File?
A bed file describing the regions to operate on.
- MultisampleCalling.singleSampleCalling.callAutosomal.excludeIntervalList
-
Array[File]+?
Bed files or interval lists describing the regions to NOT operate on.
- MultisampleCalling.singleSampleCalling.callAutosomal.pedigree
-
File?
Pedigree file for determining the population "founders".
- MultisampleCalling.singleSampleCalling.callAutosomal.ploidy
-
Int?
The ploidy with which the variants should be called.
- MultisampleCalling.singleSampleCalling.callX.excludeIntervalList
-
Array[File]+?
Bed files or interval lists describing the regions to NOT operate on.
- MultisampleCalling.singleSampleCalling.callX.pedigree
-
File?
Pedigree file for determining the population "founders".
- MultisampleCalling.singleSampleCalling.callY.excludeIntervalList
-
Array[File]+?
Bed files or interval lists describing the regions to NOT operate on.
- MultisampleCalling.singleSampleCalling.callY.pedigree
-
File?
Pedigree file for determining the population "founders".
- MultisampleCalling.singleSampleCalling.Stats.compareVcf
-
File?
When inputVcf and compareVcf are given, the program generates separate stats for the intersection and the complements. By default only sites are compared; samples must be given to also include sample columns.
- MultisampleCalling.singleSampleCalling.Stats.compareVcfIndex
-
File?
Index for the compareVcf.
- MultisampleCalling.singleSampleGvcf
-
Boolean — Default:
false
Whether to output single-sample GVCFs.
- MultisampleCalling.vcfBasename
-
String — Default:
"multisample"
The basename of the VCF and GVCF files that are output by the workflow.
- MultisampleCalling.XNonParRegions
-
File?
Bed file with the non-PAR regions of X.
- MultisampleCalling.YNonParRegions
-
File?
Bed file with the non-PAR regions of Y
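As an illustration of how the common inputs above might be overridden, the sketch below builds a small set of overrides for an RNA run restricted to a target region. The helper name, the file names and the choice to pass the dbSNP VCF and its index as a pair are assumptions of this example, not part of the workflow.

```python
import json


def common_inputs(rna=False, regions=None, dbsnp=None, output_dir="."):
    """Return overrides for a few of the common inputs.

    dbsnp is an optional (vcf, tbi_index) pair; supplying the two together
    is an assumption of this sketch.
    """
    prefix = "MultisampleCalling."
    overrides = {prefix + "outputDir": output_dir}
    if rna:
        # The documentation recommends 'true' for RNA data.
        overrides[prefix + "dontUseSoftClippedBases"] = True
    if regions is not None:
        overrides[prefix + "regions"] = regions
    if dbsnp is not None:
        vcf, index = dbsnp
        overrides[prefix + "dbsnpVCF"] = vcf
        overrides[prefix + "dbsnpVCFIndex"] = index
    return overrides


# Hypothetical file names for illustration only.
overrides = common_inputs(
    rna=True,
    regions="targets.bed",
    dbsnp=("dbsnp.vcf.gz", "dbsnp.vcf.gz.tbi"),
    output_dir="results",
)
print(json.dumps(overrides, indent=2, sort_keys=True))
```

Inputs left out of the overrides keep the defaults listed above, so only values that differ from the defaults need to appear in the inputs file.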
Advanced inputs
- MultisampleCalling.calculateRegions.intersectAutosomalRegions.memory
-
String — Default:
"~{512 + ceil(size([regionsA, regionsB],"MiB"))}MiB"
The amount of memory needed for the job.
- MultisampleCalling.calculateRegions.intersectAutosomalRegions.timeMinutes
-
Int — Default:
1 + ceil(size([regionsA, regionsB],"GiB"))
The maximum amount of time the job will run in minutes.
- MultisampleCalling.calculateRegions.intersectX.memory
-
String — Default:
"~{512 + ceil(size([regionsA, regionsB],"MiB"))}MiB"
The amount of memory needed for the job.
- MultisampleCalling.calculateRegions.intersectX.timeMinutes
-
Int — Default:
1 + ceil(size([regionsA, regionsB],"GiB"))
The maximum amount of time the job will run in minutes.
- MultisampleCalling.calculateRegions.intersectY.memory
-
String — Default:
"~{512 + ceil(size([regionsA, regionsB],"MiB"))}MiB"
The amount of memory needed for the job.
- MultisampleCalling.calculateRegions.intersectY.timeMinutes
-
Int — Default:
1 + ceil(size([regionsA, regionsB],"GiB"))
The maximum amount of time the job will run in minutes.
- MultisampleCalling.calculateRegions.inverseBed.memory
-
String — Default:
"~{512 + ceil(size([inputBed, faidx],"MiB"))}MiB"
The amount of memory needed for the job.
- MultisampleCalling.calculateRegions.inverseBed.timeMinutes
-
Int — Default:
1 + ceil(size([inputBed, faidx],"G"))
The maximum amount of time the job will run in minutes.
- MultisampleCalling.calculateRegions.mergeBeds.memory
-
String — Default:
"~{512 + ceil(size(bedFiles,"MiB"))}MiB"
The amount of memory needed for the job.
- MultisampleCalling.calculateRegions.mergeBeds.outputBed
-
String — Default:
"merged.bed"
The path to write the output to.
- MultisampleCalling.calculateRegions.mergeBeds.timeMinutes
-
Int — Default:
1 + ceil(size(bedFiles,"G"))
The maximum amount of time the job will run in minutes.
- MultisampleCalling.calculateRegions.scatterAutosomalRegions.memory
-
String — Default:
"256MiB"
The amount of memory this job will use.
- MultisampleCalling.calculateRegions.scatterAutosomalRegions.prefix
-
String — Default:
"scatters/scatter-"
The prefix of the output files. Output will be named like: <PREFIX><N>.bed, in which N is an incrementing number. Default 'scatter-'.
- MultisampleCalling.calculateRegions.scatterAutosomalRegions.splitContigs
- Boolean — Default:
false
If set, contigs are allowed to be split up over multiple files.
- MultisampleCalling.calculateRegions.scatterAutosomalRegions.timeMinutes
- Int — Default:
2
The maximum amount of time the job will run in minutes.
- MultisampleCalling.dockerImages
- Map[String,String] — Default:
{"bedtools": "quay.io/biocontainers/bedtools:2.23.0--hdbcaa40_3", "picard": "quay.io/biocontainers/picard:2.23.2--0", "gatk4": "quay.io/biocontainers/gatk4:4.1.8.0--py38h37ae868_0", "chunked-scatter": "quay.io/biocontainers/chunked-scatter:1.0.0--py_0", "bcftools": "quay.io/biocontainers/bcftools:1.10.2--h4f4756c_2"}
Specifies which Docker images should be used for running this pipeline.
- MultisampleCalling.JointGenotyping.gatherGvcfs.intervals
- Array[File] — Default:
[]
Bed files or interval lists describing the regions to operate on.
- MultisampleCalling.JointGenotyping.gatherGvcfs.javaXmx
- String — Default:
"4G"
The maximum memory available to the program. Should be lower than `memory` to accommodate JVM overhead.
- MultisampleCalling.JointGenotyping.gatherGvcfs.memory
- String — Default:
"5GiB"
The amount of memory this job will use.
- MultisampleCalling.JointGenotyping.gatherGvcfs.timeMinutes
- Int — Default:
1 + ceil((size(gvcfFiles,"G") * 8))
The maximum amount of time the job will run in minutes.
- MultisampleCalling.JointGenotyping.gatherVcfs.compressionLevel
- Int — Default:
1
The compression level at which the output files are written.
- MultisampleCalling.JointGenotyping.gatherVcfs.javaXmx
- String — Default:
"4G"
The maximum memory available to the program. Should be lower than `memory` to accommodate JVM overhead.
- MultisampleCalling.JointGenotyping.gatherVcfs.memory
- String — Default:
"5GiB"
The amount of memory this job will use.
- MultisampleCalling.JointGenotyping.gatherVcfs.timeMinutes
- Int — Default:
1 + ceil(size(inputVCFs,"GiB")) * 2
The maximum amount of time the job will run in minutes.
- MultisampleCalling.JointGenotyping.gatherVcfs.useJdkDeflater
- Boolean — Default:
true
If true, uses the Java deflater to compress the output files. If false, uses the optimized Intel deflater.
- MultisampleCalling.JointGenotyping.gatherVcfs.useJdkInflater
- Boolean — Default:
false
If true, uses the Java inflater. If false, uses the optimized Intel inflater.
- MultisampleCalling.JointGenotyping.genotypeGvcfs.annotationGroups
- Array[String] — Default:
["StandardAnnotation"]
Which annotation groups will be used for the annotation.
- MultisampleCalling.JointGenotyping.genotypeGvcfs.javaXmx
- String — Default:
"6G"
The maximum memory available to the program. Should be lower than `memory` to accommodate JVM overhead.
- MultisampleCalling.JointGenotyping.genotypeGvcfs.memory
- String — Default:
"7GiB"
The amount of memory this job will use.
- MultisampleCalling.JointGenotyping.genotypeGvcfs.timeMinutes
- Int — Default:
120
The maximum amount of time the job will run in minutes.
- MultisampleCalling.JointGenotyping.sampleIds
- Array[String] — Default:
[]
Sample IDs which should be analysed by the stats tools.
- MultisampleCalling.JointGenotyping.scatterRegions.memory
- String — Default:
"256MiB"
The amount of memory this job will use.
- MultisampleCalling.JointGenotyping.scatterRegions.prefix
- String — Default:
"scatters/scatter-"
The prefix of the output files. Output will be named like: <PREFIX><N>.bed, in which N is an incrementing number. Default 'scatter-'.
- MultisampleCalling.JointGenotyping.scatterRegions.splitContigs
- Boolean — Default:
false
If set, contigs are allowed to be split up over multiple files.
- MultisampleCalling.JointGenotyping.scatterRegions.timeMinutes
- Int — Default:
2
The maximum amount of time the job will run in minutes.
- MultisampleCalling.JointGenotyping.Stats.afBins
- String?
Allele frequency bins, a list (0.1,0.5,1) or a file (0.1 0.5 1).
- MultisampleCalling.JointGenotyping.Stats.applyFilters
- String?
Require at least one of the listed FILTER strings (e.g. "PASS,.").
- MultisampleCalling.JointGenotyping.Stats.collapse
- String?
Treat as identical records with <snps|indels|both|all|some|none>, see man page for details.
- MultisampleCalling.JointGenotyping.Stats.depth
- String?
Depth distribution: min,max,bin size [0,500,1].
- MultisampleCalling.JointGenotyping.Stats.exclude
- String?
Exclude sites for which the expression is true (see man page for details).
- MultisampleCalling.JointGenotyping.Stats.exons
- File?
Tab-delimited file with exons for indel frameshifts (chr,from,to; 1-based, inclusive, bgzip compressed).
- MultisampleCalling.JointGenotyping.Stats.firstAlleleOnly
- Boolean — Default:
false
Include only 1st allele at multiallelic sites.
- MultisampleCalling.JointGenotyping.Stats.include
- String?
Select sites for which the expression is true (see man page for details).
- MultisampleCalling.JointGenotyping.Stats.memory
- String — Default:
"256MiB"
The amount of memory this job will use.
- MultisampleCalling.JointGenotyping.Stats.regions
- String?
Restrict to comma-separated list of regions.
- MultisampleCalling.JointGenotyping.Stats.samplesFile
- File?
File of samples to include.
- MultisampleCalling.JointGenotyping.Stats.splitByID
- Boolean — Default:
false
Collect stats for sites with ID separately (known vs novel).
- MultisampleCalling.JointGenotyping.Stats.targets
- String?
Similar to regions but streams rather than index-jumps.
- MultisampleCalling.JointGenotyping.Stats.targetsFile
- File?
Similar to regionsFile but streams rather than index-jumps.
- MultisampleCalling.JointGenotyping.Stats.threads
- Int — Default:
0
Number of extra decompression threads [0].
- MultisampleCalling.JointGenotyping.Stats.timeMinutes
- Int — Default:
1 + 2 * ceil(size(select_all([inputVcf, compareVcf]),"G"))
The maximum amount of time the job will run in minutes.
- MultisampleCalling.JointGenotyping.Stats.userTsTv
- String?
<TAG[:min:max:n]>. Collect Ts/Tv stats for any tag using the given binning [0:1:100].
- MultisampleCalling.JointGenotyping.Stats.verbose
- Boolean — Default:
false
Produce verbose per-site and per-sample output.
- MultisampleCalling.scatterSize
- Int?
The size of the scattered regions in bases. Scattering is used to speed up certain processes. The genome will be separated into multiple chunks (scatters), each of which will be processed in its own job, allowing for parallel processing. Higher values will result in a lower number of jobs. The optimal value here will depend on the available resources.
- MultisampleCalling.scatterSizeMillions
- Int — Default:
1000
Same as scatterSize, but the value is multiplied by 1000000 to get scatterSize. This allows for setting larger values more easily.
- MultisampleCalling.singleSampleCalling.callAutosomal.contamination
- Float?
Equivalent to HaplotypeCaller's `-contamination` option.
- MultisampleCalling.singleSampleCalling.callAutosomal.emitRefConfidence
- String — Default:
if gvcf then "GVCF" else "NONE"
Whether to include reference calls. Three modes: 'NONE', 'BP_RESOLUTION' and 'GVCF'.
- MultisampleCalling.singleSampleCalling.callAutosomal.javaXmxMb
- Int — Default:
4096
The maximum memory available to the program in megabytes. Should be lower than `memoryMb` to accommodate JVM overhead.
- MultisampleCalling.singleSampleCalling.callAutosomal.memoryMb
- Int — Default:
javaXmxMb + 512
The amount of memory this job will use in megabytes.
- MultisampleCalling.singleSampleCalling.callAutosomal.outputMode
- String?
Specifies which type of calls we should output. Same as HaplotypeCaller's `--output-mode` option.
- MultisampleCalling.singleSampleCalling.callX.contamination
- Float?
Equivalent to HaplotypeCaller's `-contamination` option.
- MultisampleCalling.singleSampleCalling.callX.emitRefConfidence
- String — Default:
if gvcf then "GVCF" else "NONE"
Whether to include reference calls. Three modes: 'NONE', 'BP_RESOLUTION' and 'GVCF'.
- MultisampleCalling.singleSampleCalling.callX.javaXmxMb
- Int — Default:
4096
The maximum memory available to the program in megabytes. Should be lower than `memoryMb` to accommodate JVM overhead.
- MultisampleCalling.singleSampleCalling.callX.memoryMb
- Int — Default:
javaXmxMb + 512
The amount of memory this job will use in megabytes.
- MultisampleCalling.singleSampleCalling.callX.outputMode
- String?
Specifies which type of calls we should output. Same as HaplotypeCaller's `--output-mode` option.
- MultisampleCalling.singleSampleCalling.callY.contamination
- Float?
Equivalent to HaplotypeCaller's `-contamination` option.
- MultisampleCalling.singleSampleCalling.callY.emitRefConfidence
- String — Default:
if gvcf then "GVCF" else "NONE"
Whether to include reference calls. Three modes: 'NONE', 'BP_RESOLUTION' and 'GVCF'.
- MultisampleCalling.singleSampleCalling.callY.javaXmxMb
- Int — Default:
4096
The maximum memory available to the program in megabytes. Should be lower than `memoryMb` to accommodate JVM overhead.
- MultisampleCalling.singleSampleCalling.callY.memoryMb
- Int — Default:
javaXmxMb + 512
The amount of memory this job will use in megabytes.
- MultisampleCalling.singleSampleCalling.callY.outputMode
- String?
Specifies which type of calls we should output. Same as HaplotypeCaller's `--output-mode` option.
- MultisampleCalling.singleSampleCalling.mergeSingleSampleGvcf.intervals
- Array[File] — Default:
[]
Bed files or interval lists describing the regions to operate on.
- MultisampleCalling.singleSampleCalling.mergeSingleSampleGvcf.javaXmx
- String — Default:
"4G"
The maximum memory available to the program. Should be lower than `memory` to accommodate JVM overhead.
- MultisampleCalling.singleSampleCalling.mergeSingleSampleGvcf.memory
- String — Default:
"5GiB"
The amount of memory this job will use.
- MultisampleCalling.singleSampleCalling.mergeSingleSampleGvcf.timeMinutes
- Int — Default:
1 + ceil((size(gvcfFiles,"G") * 8))
The maximum amount of time the job will run in minutes.
- MultisampleCalling.singleSampleCalling.mergeSingleSampleVcf.compressionLevel
- Int — Default:
1
The compression level at which the output files are written.
- MultisampleCalling.singleSampleCalling.mergeSingleSampleVcf.javaXmx
- String — Default:
"4G"
The maximum memory available to the program. Should be lower than `memory` to accommodate JVM overhead.
- MultisampleCalling.singleSampleCalling.mergeSingleSampleVcf.memory
- String — Default:
"5GiB"
The amount of memory this job will use.
- MultisampleCalling.singleSampleCalling.mergeSingleSampleVcf.timeMinutes
- Int — Default:
1 + ceil(size(inputVCFs,"GiB")) * 2
The maximum amount of time the job will run in minutes.
- MultisampleCalling.singleSampleCalling.mergeSingleSampleVcf.useJdkDeflater
- Boolean — Default:
true
If true, uses the Java deflater to compress the output files. If false, uses the optimized Intel deflater.
- MultisampleCalling.singleSampleCalling.mergeSingleSampleVcf.useJdkInflater
- Boolean — Default:
false
If true, uses the Java inflater. If false, uses the optimized Intel inflater.
- MultisampleCalling.singleSampleCalling.Stats.afBins
- String?
Allele frequency bins, a list (0.1,0.5,1) or a file (0.1 0.5 1).
- MultisampleCalling.singleSampleCalling.Stats.applyFilters
- String?
Require at least one of the listed FILTER strings (e.g. "PASS,.").
- MultisampleCalling.singleSampleCalling.Stats.collapse
- String?
Treat as identical records with <snps|indels|both|all|some|none>, see man page for details.
- MultisampleCalling.singleSampleCalling.Stats.depth
- String?
Depth distribution: min,max,bin size [0,500,1].
- MultisampleCalling.singleSampleCalling.Stats.exclude
- String?
Exclude sites for which the expression is true (see man page for details).
- MultisampleCalling.singleSampleCalling.Stats.exons
- File?
Tab-delimited file with exons for indel frameshifts (chr,from,to; 1-based, inclusive, bgzip compressed).
- MultisampleCalling.singleSampleCalling.Stats.firstAlleleOnly
- Boolean — Default:
false
Include only 1st allele at multiallelic sites.
- MultisampleCalling.singleSampleCalling.Stats.include
- String?
Select sites for which the expression is true (see man page for details).
- MultisampleCalling.singleSampleCalling.Stats.memory
- String — Default:
"256MiB"
The amount of memory this job will use.
- MultisampleCalling.singleSampleCalling.Stats.regions
- String?
Restrict to comma-separated list of regions.
- MultisampleCalling.singleSampleCalling.Stats.samplesFile
- File?
File of samples to include.
- MultisampleCalling.singleSampleCalling.Stats.splitByID
- Boolean — Default:
false
Collect stats for sites with ID separately (known vs novel).
- MultisampleCalling.singleSampleCalling.Stats.targets
- String?
Similar to regions but streams rather than index-jumps.
- MultisampleCalling.singleSampleCalling.Stats.targetsFile
- File?
Similar to regionsFile but streams rather than index-jumps.
- MultisampleCalling.singleSampleCalling.Stats.threads
- Int — Default:
0
Number of extra decompression threads [0].
- MultisampleCalling.singleSampleCalling.Stats.timeMinutes
- Int — Default:
1 + 2 * ceil(size(select_all([inputVcf, compareVcf]),"G"))
The maximum amount of time the job will run in minutes.
- MultisampleCalling.singleSampleCalling.Stats.userTsTv
- String?
<TAG[:min:max:n]>. Collect Ts/Tv stats for any tag using the given binning [0:1:100].
- MultisampleCalling.singleSampleCalling.Stats.verbose
- Boolean — Default:
false
Produce verbose per-site and per-sample output.
- MultisampleCalling.singleSampleCalling.statsRegions
- File?
Which regions need to be analysed by the stats tools.
- MultisampleCalling.singleSampleCalling.timeMinutes
- Int — Default:
ceil((size(bam,"G") * 120))
The time in minutes expected for each haplotype caller task. Will be exposed as the time_minutes runtime attribute.
- MultisampleCalling.standardMinConfidenceThresholdForCalling
- Float?
Minimum confidence threshold used by haplotype caller.
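Several of the advanced defaults above are expressions rather than fixed values. To make the arithmetic concrete, the sketch below re-implements a few of them in Python. The function names are hypothetical. The unit table (treating "G" as 10^9 bytes and "MiB"/"GiB" as binary units) and the rule that an explicit scatterSize takes precedence over scatterSizeMillions are assumptions about how the expressions are evaluated, not statements taken from the workflow source.

```python
import math

# Bytes per unit. Treating "G" as decimal and "MiB"/"GiB" as binary is an
# assumption about how the execution engine interprets WDL size units.
UNIT_BYTES = {"MiB": 2**20, "GiB": 2**30, "G": 10**9}


def size(byte_counts, unit):
    """Mimic WDL size(): total size of the files, expressed in `unit`."""
    return sum(byte_counts) / UNIT_BYTES[unit]


def intersect_memory(regions_a_bytes, regions_b_bytes):
    # Mirrors: "~{512 + ceil(size([regionsA, regionsB],"MiB"))}MiB"
    total = size([regions_a_bytes, regions_b_bytes], "MiB")
    return f"{512 + math.ceil(total)}MiB"


def gather_vcfs_minutes(vcf_bytes):
    # Mirrors: 1 + ceil(size(inputVCFs,"GiB")) * 2
    # The multiplication is applied before the addition.
    return 1 + math.ceil(size(vcf_bytes, "GiB")) * 2


def gather_gvcfs_minutes(gvcf_bytes):
    # Mirrors: 1 + ceil((size(gvcfFiles,"G") * 8))
    return 1 + math.ceil(size(gvcf_bytes, "G") * 8)


def haplotype_caller_memory_mb(java_xmx_mb=4096):
    # Mirrors: javaXmxMb + 512, leaving headroom for JVM overhead.
    return java_xmx_mb + 512


def effective_scatter_size(scatter_size=None, scatter_size_millions=1000):
    # Assumption: an explicit scatterSize wins over scatterSizeMillions.
    if scatter_size is not None:
        return scatter_size
    return scatter_size_millions * 1_000_000


print(intersect_memory(3 * 2**20, 2**20))   # two bed files, 4 MiB in total
print(gather_vcfs_minutes([3 * 2**30]))     # one 3 GiB VCF
print(gather_gvcfs_minutes([2 * 10**9]))    # one 2 GB GVCF
print(haplotype_caller_memory_mb())
print(effective_scatter_size())
```

Under these assumptions, raising javaXmxMb for a calling task also raises its default memoryMb by the same amount, so the two rarely need to be set separately.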