sequence-classification

A workflow for processing metagenomics data sets using Centrifuge.

Inputs for Classification

The following is an overview of all available inputs in Classification.

Required inputs

Classification.dockerImagesFile
File
The file specifying the docker images used for this workflow. Changing this may result in errors which the developers may choose not to address.
Classification.sampleConfigFile
File
Samplesheet describing input fasta/fastq files.
Classification.sampleWorkflow.centrifuge.inputFormat
String — Default: "fastq"
The format of the read file(s).
Classification.sampleWorkflow.centrifuge.minHitLength
Int — Default: 22
Minimum length of partial hits.
Classification.sampleWorkflow.centrifuge.phred64
Boolean — Default: false
If set to true, phred+64 encoding is used.
Classification.sampleWorkflow.centrifugeIndex
Array[File]+
The files of the index for the reference genomes.
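
The required inputs above could be collected into an inputs file along the following lines. This is a minimal sketch, assuming the workflow is run with an engine that accepts a JSON inputs file (as Cromwell does); all file paths and the index file names are hypothetical placeholders, and the three Centrifuge settings are simply set to their documented defaults.

```python
import json

# All paths below are hypothetical placeholders; substitute your own files.
inputs = {
    "Classification.dockerImagesFile": "dockerImages.yml",
    "Classification.sampleConfigFile": "samplesheet.csv",
    # Array[File]+ : at least one index file must be given.
    "Classification.sampleWorkflow.centrifugeIndex": [
        "index/reference.1.cf",
        "index/reference.2.cf",
        "index/reference.3.cf",
    ],
    # The values below are the defaults documented above.
    "Classification.sampleWorkflow.centrifuge.inputFormat": "fastq",
    "Classification.sampleWorkflow.centrifuge.minHitLength": 22,
    "Classification.sampleWorkflow.centrifuge.phred64": False,
}

# Serialize to JSON; Python's False becomes JSON's false.
inputs_json = json.dumps(inputs, indent=2)
print(inputs_json)
```

The printed text can be saved to a file and passed to the workflow engine as the inputs file.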

Other common inputs

Classification.outputDirectory
String — Default: "."
The directory to which the outputs will be written.
Classification.sampleWorkflow.centrifuge.excludeTaxIDs
String?
A comma-separated list of taxonomic IDs that will be excluded from the classification procedure.
Classification.sampleWorkflow.centrifuge.hostTaxIDs
String?
A comma-separated list of taxonomic IDs that will be preferred in the classification procedure.
Classification.sampleWorkflow.centrifuge.reportMaxDistinct
Int?
Search for at most this many distinct, primary assignments for each read or pair.
Classification.sampleWorkflow.centrifuge.trim3
Int?
The number of bases to trim from the 3' (right) end of each read before alignment.
Classification.sampleWorkflow.centrifuge.trim5
Int?
The number of bases to trim from the 5' (left) end of each read before alignment.
Classification.sampleWorkflow.qualityControl.adapterForward
String? — Default: "AGATCGGAAGAG"
The adapter to be removed from the first-end or single-end reads.
Classification.sampleWorkflow.qualityControl.adapterReverse
String? — Default: "AGATCGGAAGAG"
The adapter to be removed from the second-end reads.
Classification.sampleWorkflow.qualityControl.contaminations
Array[String]+?
Contaminants/adapters to be removed from the reads.
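
Since `excludeTaxIDs` and `hostTaxIDs` are plain strings holding comma-separated IDs rather than arrays, a small helper can build them from a list. The sketch below is illustrative only: the helper name `tax_id_list`, the output directory, the taxonomic IDs, and the trim values are all assumptions chosen for the example.

```python
def tax_id_list(tax_ids):
    """Join taxonomic IDs into the comma-separated string the inputs expect."""
    # int() rejects anything that is not a whole-number ID.
    return ",".join(str(int(tax_id)) for tax_id in tax_ids)


# Example values only; pick IDs and trim lengths that suit your data.
common_inputs = {
    "Classification.outputDirectory": "results",
    "Classification.sampleWorkflow.centrifuge.excludeTaxIDs": tax_id_list([562, 1280]),
    "Classification.sampleWorkflow.centrifuge.hostTaxIDs": tax_id_list([9606]),
    "Classification.sampleWorkflow.centrifuge.trim5": 5,
    "Classification.sampleWorkflow.centrifuge.trim3": 3,
}
```

These entries can be merged into the same inputs file as the required inputs.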

Advanced inputs

Classification.convertDockerImagesFile.dockerImage
String — Default: "quay.io/biocontainers/biowdl-input-converter:0.2.1--py_0"
The docker image used for this task. Changing this may result in errors which the developers may choose not to address.
Classification.convertDockerImagesFile.memory
String — Default: "128M"
The maximum amount of memory the job will need.
Classification.convertDockerImagesFile.timeMinutes
Int — Default: 1
The maximum amount of time the job will run in minutes.
Classification.convertSampleConfig.checkFileMd5sums
Boolean — Default: false
Whether or not the MD5 sums of the files mentioned in the samplesheet should be checked.
Classification.convertSampleConfig.old
Boolean — Default: false
Whether or not the old samplesheet format should be used.
Classification.convertSampleConfig.skipFileCheck
Boolean — Default: true
Whether or not checking the existence of the files mentioned in the samplesheet should be skipped.
Classification.convertSampleConfig.timeMinutes
Int — Default: 1
The maximum amount of time the job will run in minutes.
Classification.multiqcTask.clConfig
String?
Equivalent to MultiQC's `--cl-config` option.
Classification.multiqcTask.comment
String?
Equivalent to MultiQC's `--comment` option.
Classification.multiqcTask.config
File?
Equivalent to MultiQC's `--config` option.
Classification.multiqcTask.dataFormat
String?
Equivalent to MultiQC's `--data-format` option.
Classification.multiqcTask.dirs
Boolean — Default: false
Equivalent to MultiQC's `--dirs` flag.
Classification.multiqcTask.dirsDepth
Int?
Equivalent to MultiQC's `--dirs-depth` option.
Classification.multiqcTask.exclude
Array[String]+?
Equivalent to MultiQC's `--exclude` option.
Classification.multiqcTask.export
Boolean — Default: false
Equivalent to MultiQC's `--export` flag.
Classification.multiqcTask.fileList
File?
Equivalent to MultiQC's `--file-list` option.
Classification.multiqcTask.fileName
String?
Equivalent to MultiQC's `--filename` option.
Classification.multiqcTask.flat
Boolean — Default: false
Equivalent to MultiQC's `--flat` flag.
Classification.multiqcTask.force
Boolean — Default: false
Equivalent to MultiQC's `--force` flag.
Classification.multiqcTask.fullNames
Boolean — Default: false
Equivalent to MultiQC's `--fullnames` flag.
Classification.multiqcTask.ignore
String?
Equivalent to MultiQC's `--ignore` option.
Classification.multiqcTask.ignoreSamples
String?
Equivalent to MultiQC's `--ignore-samples` option.
Classification.multiqcTask.interactive
Boolean — Default: true
Equivalent to MultiQC's `--interactive` flag.
Classification.multiqcTask.lint
Boolean — Default: false
Equivalent to MultiQC's `--lint` flag.
Classification.multiqcTask.megaQCUpload
Boolean — Default: false
Opposite to MultiQC's `--no-megaqc-upload` flag.
Classification.multiqcTask.memory
String?
The amount of memory this job will use.
Classification.multiqcTask.module
Array[String]+?
Equivalent to MultiQC's `--module` option.
Classification.multiqcTask.pdf
Boolean — Default: false
Equivalent to MultiQC's `--pdf` flag.
Classification.multiqcTask.sampleNames
File?
Equivalent to MultiQC's `--sample-names` option.
Classification.multiqcTask.tag
String?
Equivalent to MultiQC's `--tag` option.
Classification.multiqcTask.template
String?
Equivalent to MultiQC's `--template` option.
Classification.multiqcTask.timeMinutes
Int — Default: 2 + ceil((size(reports,"G") * 8))
The maximum amount of time the job will run in minutes.
Classification.multiqcTask.title
String?
Equivalent to MultiQC's `--title` option.
Classification.multiqcTask.zipDataDir
Boolean — Default: true
Equivalent to MultiQC's `--zip-data-dir` flag.
Classification.sampleWorkflow.centrifuge.memory
String — Default: "16G"
The amount of memory available to the job.
Classification.sampleWorkflow.centrifuge.threads
Int — Default: 4
The number of threads to be used.
Classification.sampleWorkflow.kReport.isCountTable
Boolean — Default: false
The format of the file is `taxID<tab>COUNT`.
Classification.sampleWorkflow.kReport.memory
String — Default: "4G"
The amount of memory available to the job.
Classification.sampleWorkflow.kReport.minimumLength
Int?
Require a minimum alignment length to the read.
Classification.sampleWorkflow.kReport.minimumScore
Int?
Require a minimum score for reads to be counted.
Classification.sampleWorkflow.kReport.noLCA
Boolean — Default: false
Do not report the LCA of multiple assignments, but report count fractions at the taxa.
Classification.sampleWorkflow.kReport.showZeros
Boolean — Default: false
Show clades that have zero reads.
Classification.sampleWorkflow.kReport.timeMinutes
Int — Default: 10
The maximum amount of time the job will run in minutes.
Classification.sampleWorkflow.krona.memory
String — Default: "4G"
The amount of memory available to the job.
Classification.sampleWorkflow.krona.timeMinutes
Int — Default: 1
The maximum amount of time the job will run in minutes.
Classification.sampleWorkflow.qualityControl.Cutadapt.bwa
Boolean?
Equivalent to cutadapt's --bwa flag.
Classification.sampleWorkflow.qualityControl.Cutadapt.colorspace
Boolean?
Equivalent to cutadapt's --colorspace flag.
Classification.sampleWorkflow.qualityControl.Cutadapt.compressionLevel
Int — Default: 1
The compression level if gzipped output is used.
Classification.sampleWorkflow.qualityControl.Cutadapt.cores
Int — Default: 4
The number of cores to use.
Classification.sampleWorkflow.qualityControl.Cutadapt.cut
Int?
Equivalent to cutadapt's --cut option.
Classification.sampleWorkflow.qualityControl.Cutadapt.discardTrimmed
Boolean?
Equivalent to cutadapt's --discard-trimmed option.
Classification.sampleWorkflow.qualityControl.Cutadapt.discardUntrimmed
Boolean?
Equivalent to cutadapt's --discard-untrimmed option.
Classification.sampleWorkflow.qualityControl.Cutadapt.doubleEncode
Boolean?
Equivalent to cutadapt's --double-encode flag.
Classification.sampleWorkflow.qualityControl.Cutadapt.errorRate
Float?
Equivalent to cutadapt's --error-rate option.
Classification.sampleWorkflow.qualityControl.Cutadapt.front
Array[String] — Default: []
A list of 5' ligated adapter sequences to be cut from the given first or single end fastq file.
Classification.sampleWorkflow.qualityControl.Cutadapt.frontRead2
Array[String] — Default: []
A list of 5' ligated adapter sequences to be cut from the given second end fastq file.
Classification.sampleWorkflow.qualityControl.Cutadapt.infoFilePath
String?
Equivalent to cutadapt's --info-file option.
Classification.sampleWorkflow.qualityControl.Cutadapt.interleaved
Boolean?
Equivalent to cutadapt's --interleaved flag.
Classification.sampleWorkflow.qualityControl.Cutadapt.length
Int?
Equivalent to cutadapt's --length option.
Classification.sampleWorkflow.qualityControl.Cutadapt.lengthTag
String?
Equivalent to cutadapt's --length-tag option.
Classification.sampleWorkflow.qualityControl.Cutadapt.maq
Boolean?
Equivalent to cutadapt's --maq flag.
Classification.sampleWorkflow.qualityControl.Cutadapt.maskAdapter
Boolean?
Equivalent to cutadapt's --mask-adapter flag.
Classification.sampleWorkflow.qualityControl.Cutadapt.matchReadWildcards
Boolean?
Equivalent to cutadapt's --match-read-wildcards flag.
Classification.sampleWorkflow.qualityControl.Cutadapt.maximumLength
Int?
Equivalent to cutadapt's --maximum-length option.
Classification.sampleWorkflow.qualityControl.Cutadapt.maxN
Int?
Equivalent to cutadapt's --max-n option.
Classification.sampleWorkflow.qualityControl.Cutadapt.memory
String — Default: "~{300 + 100 * cores}M"
The amount of memory this job will use.
Classification.sampleWorkflow.qualityControl.Cutadapt.minimumLength
Int? — Default: 2
Equivalent to cutadapt's --minimum-length option.
Classification.sampleWorkflow.qualityControl.Cutadapt.nextseqTrim
String?
Equivalent to cutadapt's --nextseq-trim option.
Classification.sampleWorkflow.qualityControl.Cutadapt.noIndels
Boolean?
Equivalent to cutadapt's --no-indels flag.
Classification.sampleWorkflow.qualityControl.Cutadapt.noMatchAdapterWildcards
Boolean?
Equivalent to cutadapt's --no-match-adapter-wildcards flag.
Classification.sampleWorkflow.qualityControl.Cutadapt.noTrim
Boolean?
Equivalent to cutadapt's --no-trim flag.
Classification.sampleWorkflow.qualityControl.Cutadapt.noZeroCap
Boolean?
Equivalent to cutadapt's --no-zero-cap flag.
Classification.sampleWorkflow.qualityControl.Cutadapt.overlap
Int?
Equivalent to cutadapt's --overlap option.
Classification.sampleWorkflow.qualityControl.Cutadapt.pairFilter
String?
Equivalent to cutadapt's --pair-filter option.
Classification.sampleWorkflow.qualityControl.Cutadapt.prefix
String?
Equivalent to cutadapt's --prefix option.
Classification.sampleWorkflow.qualityControl.Cutadapt.qualityBase
Int?
Equivalent to cutadapt's --quality-base option.
Classification.sampleWorkflow.qualityControl.Cutadapt.qualityCutoff
String?
Equivalent to cutadapt's --quality-cutoff option.
Classification.sampleWorkflow.qualityControl.Cutadapt.restFilePath
String?
Equivalent to cutadapt's --rest-file option.
Classification.sampleWorkflow.qualityControl.Cutadapt.stripF3
Boolean?
Equivalent to cutadapt's --strip-f3 flag.
Classification.sampleWorkflow.qualityControl.Cutadapt.stripSuffix
String?
Equivalent to cutadapt's --strip-suffix option.
Classification.sampleWorkflow.qualityControl.Cutadapt.suffix
String?
Equivalent to cutadapt's --suffix option.
Classification.sampleWorkflow.qualityControl.Cutadapt.timeMinutes
Int — Default: 1 + ceil((size([read1, read2],"G") * 12.0 / cores))
The maximum amount of time the job will run in minutes.
Classification.sampleWorkflow.qualityControl.Cutadapt.times
Int?
Equivalent to cutadapt's --times option.
Classification.sampleWorkflow.qualityControl.Cutadapt.tooLongOutputPath
String?
Equivalent to cutadapt's --too-long-output option.
Classification.sampleWorkflow.qualityControl.Cutadapt.tooLongPairedOutputPath
String?
Equivalent to cutadapt's --too-long-paired-output option.
Classification.sampleWorkflow.qualityControl.Cutadapt.tooShortOutputPath
String?
Equivalent to cutadapt's --too-short-output option.
Classification.sampleWorkflow.qualityControl.Cutadapt.tooShortPairedOutputPath
String?
Equivalent to cutadapt's --too-short-paired-output option.
Classification.sampleWorkflow.qualityControl.Cutadapt.trimN
Boolean?
Equivalent to cutadapt's --trim-n flag.
Classification.sampleWorkflow.qualityControl.Cutadapt.untrimmedOutputPath
String?
Equivalent to cutadapt's --untrimmed-output option.
Classification.sampleWorkflow.qualityControl.Cutadapt.untrimmedPairedOutputPath
String?
Equivalent to cutadapt's --untrimmed-paired-output option.
Classification.sampleWorkflow.qualityControl.Cutadapt.wildcardFilePath
String?
Equivalent to cutadapt's --wildcard-file option.
Classification.sampleWorkflow.qualityControl.Cutadapt.zeroCap
Boolean?
Equivalent to cutadapt's --zero-cap flag.
Classification.sampleWorkflow.qualityControl.extractFastqcZip
Boolean — Default: false
Whether to extract FastQC's report zip files.
Classification.sampleWorkflow.qualityControl.FastqcRead1.adapters
File?
Equivalent to fastqc's --adapters option.
Classification.sampleWorkflow.qualityControl.FastqcRead1.casava
Boolean — Default: false
Equivalent to fastqc's --casava flag.
Classification.sampleWorkflow.qualityControl.FastqcRead1.contaminants
File?
Equivalent to fastqc's --contaminants option.
Classification.sampleWorkflow.qualityControl.FastqcRead1.dir
String?
Equivalent to fastqc's --dir option.
Classification.sampleWorkflow.qualityControl.FastqcRead1.format
String?
Equivalent to fastqc's --format option.
Classification.sampleWorkflow.qualityControl.FastqcRead1.javaXmx
String — Default: "1750M"
The maximum memory available to the program. Should be lower than `memory` to accommodate JVM overhead.
Classification.sampleWorkflow.qualityControl.FastqcRead1.kmers
Int?
Equivalent to fastqc's --kmers option.
Classification.sampleWorkflow.qualityControl.FastqcRead1.limits
File?
Equivalent to fastqc's --limits option.
Classification.sampleWorkflow.qualityControl.FastqcRead1.memory
String — Default: "2G"
The amount of memory this job will use.
Classification.sampleWorkflow.qualityControl.FastqcRead1.minLength
Int?
Equivalent to fastqc's --min_length option.
Classification.sampleWorkflow.qualityControl.FastqcRead1.nano
Boolean — Default: false
Equivalent to fastqc's --nano flag.
Classification.sampleWorkflow.qualityControl.FastqcRead1.noFilter
Boolean — Default: false
Equivalent to fastqc's --nofilter flag.
Classification.sampleWorkflow.qualityControl.FastqcRead1.nogroup
Boolean — Default: false
Equivalent to fastqc's --nogroup flag.
Classification.sampleWorkflow.qualityControl.FastqcRead1.threads
Int — Default: 1
The number of cores to use.
Classification.sampleWorkflow.qualityControl.FastqcRead1.timeMinutes
Int — Default: 1 + ceil(size(seqFile,"G")) * 4
The maximum amount of time the job will run in minutes.
Classification.sampleWorkflow.qualityControl.FastqcRead1After.adapters
File?
Equivalent to fastqc's --adapters option.
Classification.sampleWorkflow.qualityControl.FastqcRead1After.casava
Boolean — Default: false
Equivalent to fastqc's --casava flag.
Classification.sampleWorkflow.qualityControl.FastqcRead1After.contaminants
File?
Equivalent to fastqc's --contaminants option.
Classification.sampleWorkflow.qualityControl.FastqcRead1After.dir
String?
Equivalent to fastqc's --dir option.
Classification.sampleWorkflow.qualityControl.FastqcRead1After.format
String?
Equivalent to fastqc's --format option.
Classification.sampleWorkflow.qualityControl.FastqcRead1After.javaXmx
String — Default: "1750M"
The maximum memory available to the program. Should be lower than `memory` to accommodate JVM overhead.
Classification.sampleWorkflow.qualityControl.FastqcRead1After.kmers
Int?
Equivalent to fastqc's --kmers option.
Classification.sampleWorkflow.qualityControl.FastqcRead1After.limits
File?
Equivalent to fastqc's --limits option.
Classification.sampleWorkflow.qualityControl.FastqcRead1After.memory
String — Default: "2G"
The amount of memory this job will use.
Classification.sampleWorkflow.qualityControl.FastqcRead1After.minLength
Int?
Equivalent to fastqc's --min_length option.
Classification.sampleWorkflow.qualityControl.FastqcRead1After.nano
Boolean — Default: false
Equivalent to fastqc's --nano flag.
Classification.sampleWorkflow.qualityControl.FastqcRead1After.noFilter
Boolean — Default: false
Equivalent to fastqc's --nofilter flag.
Classification.sampleWorkflow.qualityControl.FastqcRead1After.nogroup
Boolean — Default: false
Equivalent to fastqc's --nogroup flag.
Classification.sampleWorkflow.qualityControl.FastqcRead1After.threads
Int — Default: 1
The number of cores to use.
Classification.sampleWorkflow.qualityControl.FastqcRead1After.timeMinutes
Int — Default: 1 + ceil(size(seqFile,"G")) * 4
The maximum amount of time the job will run in minutes.
Classification.sampleWorkflow.qualityControl.FastqcRead2.adapters
File?
Equivalent to fastqc's --adapters option.
Classification.sampleWorkflow.qualityControl.FastqcRead2.casava
Boolean — Default: false
Equivalent to fastqc's --casava flag.
Classification.sampleWorkflow.qualityControl.FastqcRead2.contaminants
File?
Equivalent to fastqc's --contaminants option.
Classification.sampleWorkflow.qualityControl.FastqcRead2.dir
String?
Equivalent to fastqc's --dir option.
Classification.sampleWorkflow.qualityControl.FastqcRead2.format
String?
Equivalent to fastqc's --format option.
Classification.sampleWorkflow.qualityControl.FastqcRead2.javaXmx
String — Default: "1750M"
The maximum memory available to the program. Should be lower than `memory` to accommodate JVM overhead.
Classification.sampleWorkflow.qualityControl.FastqcRead2.kmers
Int?
Equivalent to fastqc's --kmers option.
Classification.sampleWorkflow.qualityControl.FastqcRead2.limits
File?
Equivalent to fastqc's --limits option.
Classification.sampleWorkflow.qualityControl.FastqcRead2.memory
String — Default: "2G"
The amount of memory this job will use.
Classification.sampleWorkflow.qualityControl.FastqcRead2.minLength
Int?
Equivalent to fastqc's --min_length option.
Classification.sampleWorkflow.qualityControl.FastqcRead2.nano
Boolean — Default: false
Equivalent to fastqc's --nano flag.
Classification.sampleWorkflow.qualityControl.FastqcRead2.noFilter
Boolean — Default: false
Equivalent to fastqc's --nofilter flag.
Classification.sampleWorkflow.qualityControl.FastqcRead2.nogroup
Boolean — Default: false
Equivalent to fastqc's --nogroup flag.
Classification.sampleWorkflow.qualityControl.FastqcRead2.threads
Int — Default: 1
The number of cores to use.
Classification.sampleWorkflow.qualityControl.FastqcRead2.timeMinutes
Int — Default: 1 + ceil(size(seqFile,"G")) * 4
The maximum amount of time the job will run in minutes.
Classification.sampleWorkflow.qualityControl.FastqcRead2After.adapters
File?
Equivalent to fastqc's --adapters option.
Classification.sampleWorkflow.qualityControl.FastqcRead2After.casava
Boolean — Default: false
Equivalent to fastqc's --casava flag.
Classification.sampleWorkflow.qualityControl.FastqcRead2After.contaminants
File?
Equivalent to fastqc's --contaminants option.
Classification.sampleWorkflow.qualityControl.FastqcRead2After.dir
String?
Equivalent to fastqc's --dir option.
Classification.sampleWorkflow.qualityControl.FastqcRead2After.format
String?
Equivalent to fastqc's --format option.
Classification.sampleWorkflow.qualityControl.FastqcRead2After.javaXmx
String — Default: "1750M"
The maximum memory available to the program. Should be lower than `memory` to accommodate JVM overhead.
Classification.sampleWorkflow.qualityControl.FastqcRead2After.kmers
Int?
Equivalent to fastqc's --kmers option.
Classification.sampleWorkflow.qualityControl.FastqcRead2After.limits
File?
Equivalent to fastqc's --limits option.
Classification.sampleWorkflow.qualityControl.FastqcRead2After.memory
String — Default: "2G"
The amount of memory this job will use.
Classification.sampleWorkflow.qualityControl.FastqcRead2After.minLength
Int?
Equivalent to fastqc's --min_length option.
Classification.sampleWorkflow.qualityControl.FastqcRead2After.nano
Boolean — Default: false
Equivalent to fastqc's --nano flag.
Classification.sampleWorkflow.qualityControl.FastqcRead2After.noFilter
Boolean — Default: false
Equivalent to fastqc's --nofilter flag.
Classification.sampleWorkflow.qualityControl.FastqcRead2After.nogroup
Boolean — Default: false
Equivalent to fastqc's --nogroup flag.
Classification.sampleWorkflow.qualityControl.FastqcRead2After.threads
Int — Default: 1
The number of cores to use.
Classification.sampleWorkflow.qualityControl.FastqcRead2After.timeMinutes
Int — Default: 1 + ceil(size(seqFile,"G")) * 4
The maximum amount of time the job will run in minutes.
Classification.sampleWorkflow.qualityControl.runAdapterClipping
Boolean — Default: defined(adapterForward) || defined(adapterReverse) || length(select_first([contaminations, []])) > 0
Whether or not adapters should be removed from the reads.
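
Several defaults above are expressions rather than fixed values. The sketch below mirrors them in Python to show what they evaluate to. It assumes that WDL's `size(..., "G")` yields the size in gigabytes as a float, which is passed in directly here; the function names are hypothetical and not part of the workflow.

```python
import math


def multiqc_time_minutes(reports_gb):
    """multiqcTask.timeMinutes: 2 + ceil(size(reports, "G") * 8)."""
    return 2 + math.ceil(reports_gb * 8)


def cutadapt_time_minutes(reads_gb, cores=4):
    """Cutadapt.timeMinutes: 1 + ceil(size([read1, read2], "G") * 12.0 / cores)."""
    return 1 + math.ceil(reads_gb * 12.0 / cores)


def fastqc_time_minutes(seq_file_gb):
    """Fastqc*.timeMinutes: 1 + ceil(size(seqFile, "G")) * 4.

    The ceil applies to the size alone; the result is then multiplied by 4.
    """
    return 1 + math.ceil(seq_file_gb) * 4


def cutadapt_memory(cores=4):
    """Cutadapt.memory: "~{300 + 100 * cores}M"."""
    return f"{300 + 100 * cores}M"


def run_adapter_clipping(adapter_forward, adapter_reverse, contaminations):
    """runAdapterClipping: true if any adapter or contamination is defined.

    None stands in for an undefined optional WDL input.
    """
    return (
        adapter_forward is not None
        or adapter_reverse is not None
        or len(contaminations or []) > 0
    )
```

For example, with the default of 4 cores, `cutadapt_memory()` gives `"700M"`, and 2 GB of reads gives a Cutadapt time limit of 7 minutes.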

Do not set these inputs!

The following inputs should ***not*** be set, even though womtool may show them as being available inputs.

* Classification.sampleWorkflow.qualityControl.FastqcRead1.NoneFile
* Classification.sampleWorkflow.qualityControl.FastqcRead1.NoneArray
* Classification.sampleWorkflow.qualityControl.FastqcRead2.NoneFile
* Classification.sampleWorkflow.qualityControl.FastqcRead2.NoneArray
* Classification.sampleWorkflow.qualityControl.FastqcRead1After.NoneFile
* Classification.sampleWorkflow.qualityControl.FastqcRead1After.NoneArray
* Classification.sampleWorkflow.qualityControl.FastqcRead2After.NoneFile
* Classification.sampleWorkflow.qualityControl.FastqcRead2After.NoneArray
* Classification.multiqcTask.finished
* Classification.multiqcTask.dependencies
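
If an inputs template is generated automatically, the inputs listed above may end up in it. The following is a minimal sketch of stripping them from an inputs dictionary before use; the helper name `drop_forbidden_inputs` and the example dictionary are hypothetical.

```python
QC_PREFIX = "Classification.sampleWorkflow.qualityControl."
FASTQC_TASKS = ("FastqcRead1", "FastqcRead2", "FastqcRead1After", "FastqcRead2After")

# The ten inputs that should not be set, built from the list above.
DO_NOT_SET = {
    f"{QC_PREFIX}{task}.{name}"
    for task in FASTQC_TASKS
    for name in ("NoneFile", "NoneArray")
} | {
    "Classification.multiqcTask.finished",
    "Classification.multiqcTask.dependencies",
}


def drop_forbidden_inputs(inputs):
    """Return a copy of the inputs without the keys that must not be set."""
    return {key: value for key, value in inputs.items() if key not in DO_NOT_SET}


# Hypothetical template containing one forbidden key.
template = {
    "Classification.outputDirectory": ".",
    "Classification.multiqcTask.finished": True,
}
cleaned = drop_forbidden_inputs(template)
```

Listing the keys explicitly, rather than matching on a suffix, avoids removing any input that is not named in this section.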