PacBio-subreads-processing

A workflow for preprocessing PacBio reads.

This is not a stable version!
You are currently viewing the documentation for a development version. This documentation is not guaranteed to be up to date, and things will likely change without announcement or a version increment. If no other documentation is available, there are likely no releases of this repository yet; the content is then still in development and not production ready. Use at your own risk!

SubreadsProcessing

Inputs

Required inputs

SubreadsProcessing.subreadsConfigFile
File — Default: None
Configuration file describing input subread BAMs and barcode files.
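
WDL execution engines generally accept workflow inputs as a JSON object keyed by the fully qualified input name. As a minimal sketch, the Python snippet below renders such an object containing only the required input. The path `path/to/subreads_config` is a hypothetical placeholder, and the helper `render_inputs` is illustrative rather than part of this workflow; the layout of the configuration file itself is not shown here.

```python
import json

# Hypothetical path: replace it with the location of your own configuration file.
inputs = {
    "SubreadsProcessing.subreadsConfigFile": "path/to/subreads_config",
}


def render_inputs(values):
    """Serialize workflow inputs as an indented JSON document."""
    return json.dumps(values, indent=2, sort_keys=True)


inputs_json = render_inputs(inputs)
print(inputs_json)
```

The printed text can be saved to a file and passed to the execution engine as the inputs file.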

Other common inputs

SubreadsProcessing.ccs.minReadQuality
Float — Default: 0.99
Minimum predicted accuracy in [0, 1].

SubreadsProcessing.generateFastq
Boolean — Default: false
Generate fastq files from demultiplexed bam files.

SubreadsProcessing.lima.minLength
Int — Default: 50
Minimum sequence length after clipping.

SubreadsProcessing.lima.minPasses
Int — Default: 0
Minimal number of full passes.

SubreadsProcessing.lima.minScore
Int — Default: 0
Reads below the minimum barcode score are removed from downstream analysis.

SubreadsProcessing.lima.minScoreLead
Int — Default: 10
The minimal score lead required to call a barcode pair significant.

SubreadsProcessing.refine.requirePolyA
Boolean — Default: false
Require full-length (FL) reads to have a poly(A) tail and remove it.
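
The common inputs listed above can be overridden in the same inputs object as the required one. The sketch below is one possible way to assemble them in Python while checking the documented `[0, 1]` range of `ccs.minReadQuality`. The function `build_inputs` and its argument names are hypothetical helpers, not part of the workflow; only the input keys and defaults are taken from this page.

```python
import json


def build_inputs(config_file, min_read_quality=0.99, generate_fastq=False,
                 require_poly_a=False):
    """Return an inputs mapping using the documented defaults unless overridden."""
    # The documentation states that the predicted accuracy lies in [0, 1].
    if not 0.0 <= min_read_quality <= 1.0:
        raise ValueError("ccs.minReadQuality must lie in [0, 1]")
    return {
        "SubreadsProcessing.subreadsConfigFile": config_file,
        "SubreadsProcessing.ccs.minReadQuality": min_read_quality,
        "SubreadsProcessing.generateFastq": generate_fastq,
        "SubreadsProcessing.refine.requirePolyA": require_poly_a,
    }


# Hypothetical path; stricter accuracy threshold and FASTQ output enabled.
example = build_inputs("path/to/subreads_config",
                       min_read_quality=0.999,
                       generate_fastq=True)
print(json.dumps(example, indent=2))
```

Inputs that are left out of the object keep the defaults listed on this page, so only the values that differ need to be written down.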

Advanced inputs

SubreadsProcessing.bam2FastqLima.compressionLevel
Int — Default: 1
Gzip compression level [1-9].

SubreadsProcessing.bam2FastqLima.memory
String — Default: "2G"
The amount of memory available to the job.

SubreadsProcessing.bam2FastqLima.seqIdPrefix
String? — Default: None
Prefix for sequence IDs in headers.

SubreadsProcessing.bam2FastqLima.splitByBarcode
Boolean — Default: false
Split output into multiple fastq files, by barcode pairs.

SubreadsProcessing.bam2FastqLima.timeMinutes
Int — Default: 15
The maximum amount of time the job will run in minutes.

SubreadsProcessing.bam2FastqRefine.compressionLevel
Int — Default: 1
Gzip compression level [1-9].

SubreadsProcessing.bam2FastqRefine.memory
String — Default: "2G"
The amount of memory available to the job.

SubreadsProcessing.bam2FastqRefine.seqIdPrefix
String? — Default: None
Prefix for sequence IDs in headers.

SubreadsProcessing.bam2FastqRefine.splitByBarcode
Boolean — Default: false
Split output into multiple fastq files, by barcode pairs.

SubreadsProcessing.bam2FastqRefine.timeMinutes
Int — Default: 15
The maximum amount of time the job will run in minutes.

SubreadsProcessing.ccs.byStrand
Boolean — Default: false
Generate a consensus for each strand.

SubreadsProcessing.ccs.logLevel
String — Default: "WARN"
Set log level. Valid choices: (TRACE, DEBUG, INFO, WARN, FATAL).

SubreadsProcessing.ccs.maxLength
Int — Default: 50000
Maximum draft length before polishing.

SubreadsProcessing.ccs.memory
String — Default: "2G"
The amount of memory available to the job.

SubreadsProcessing.ccs.minLength
Int — Default: 10
Minimum draft length before polishing.

SubreadsProcessing.ccs.minPasses
Int — Default: 3
Minimum number of full-length subreads required to generate CCS for a ZMW.

SubreadsProcessing.ccs.timeMinutes
Int — Default: 1440
The maximum amount of time the job will run in minutes.

SubreadsProcessing.ccsCores
Int — Default: 2
The number of CPU cores to be used by ccs.

SubreadsProcessing.ccsMode
Boolean — Default: true
CCS mode: use optimal alignment options.

SubreadsProcessing.dockerImages
Map[String,String] — Default: {"bam2fastx": "quay.io/biocontainers/bam2fastx:1.3.0--he1c1bb9_8", "biowdl-input-converter": "quay.io/biocontainers/biowdl-input-converter:0.2.1--py_0", "ccs": "quay.io/biocontainers/pbccs:4.2.0--1", "fastqc": "quay.io/biocontainers/fastqc:0.11.9--0", "isoseq3": "quay.io/biocontainers/isoseq3:3.3.0--0", "lima": "quay.io/biocontainers/lima:1.11.0--0", "multiqc": "quay.io/biocontainers/multiqc:1.9--pyh9f0ad1d_0"}
The docker image(s) used for this workflow. Changing this may result in errors which the developers may choose not to address.

SubreadsProcessing.fastqcLima.adapters
File? — Default: None
Equivalent to fastqc's --adapters option.

SubreadsProcessing.fastqcLima.casava
Boolean — Default: false
Equivalent to fastqc's --casava flag.

SubreadsProcessing.fastqcLima.contaminants
File? — Default: None
Equivalent to fastqc's --contaminants option.

SubreadsProcessing.fastqcLima.dir
String? — Default: None
Equivalent to fastqc's --dir option.

SubreadsProcessing.fastqcLima.extract
Boolean — Default: false
Equivalent to fastqc's --extract flag.

SubreadsProcessing.fastqcLima.javaXmx
String — Default: "1750M"
The maximum memory available to the program. Should be lower than `memory` to accommodate JVM overhead.

SubreadsProcessing.fastqcLima.kmers
Int? — Default: None
Equivalent to fastqc's --kmers option.

SubreadsProcessing.fastqcLima.limits
File? — Default: None
Equivalent to fastqc's --limits option.

SubreadsProcessing.fastqcLima.memory
String — Default: "2G"
The amount of memory this job will use.
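
Because `javaXmx` should stay below `memory`, it can help to compare the two strings before submitting a run. The following is a minimal sketch of such a check. It assumes that the suffixes `K`, `M` and `G` denote powers of 1024 and that no other suffixes occur; both are assumptions of this example, and the helper names `to_bytes` and `heap_fits` are hypothetical.

```python
import re

# Assumption: suffixes are binary multiples. Other spellings (for example
# "GB" or "GiB") are deliberately rejected by this sketch.
_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def to_bytes(value):
    """Convert a string such as "1750M" or "2G" to a number of bytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KMG])", value.strip().upper())
    if match is None:
        raise ValueError(f"unsupported memory string: {value!r}")
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit])


def heap_fits(java_xmx, memory):
    """Return True when the JVM heap limit is strictly below the job memory."""
    return to_bytes(java_xmx) < to_bytes(memory)


# The documented defaults leave headroom for JVM overhead.
print(heap_fits("1750M", "2G"))
```

When raising `memory` for large inputs, `javaXmx` can be raised as well, as long as the same check still passes.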

SubreadsProcessing.fastqcLima.minLength
Int? — Default: None
Equivalent to fastqc's --min_length option.

SubreadsProcessing.fastqcLima.nano
Boolean — Default: false
Equivalent to fastqc's --nano flag.

SubreadsProcessing.fastqcLima.noFilter
Boolean — Default: false
Equivalent to fastqc's --nofilter flag.

SubreadsProcessing.fastqcLima.nogroup
Boolean — Default: false
Equivalent to fastqc's --nogroup flag.

SubreadsProcessing.fastqcLima.timeMinutes
Int — Default: 1 + ceil(size(seqFile,"G")) * 4
The maximum amount of time the job will run in minutes.

SubreadsProcessing.fastqcRefine.adapters
File? — Default: None
Equivalent to fastqc's --adapters option.

SubreadsProcessing.fastqcRefine.casava
Boolean — Default: false
Equivalent to fastqc's --casava flag.

SubreadsProcessing.fastqcRefine.contaminants
File? — Default: None
Equivalent to fastqc's --contaminants option.

SubreadsProcessing.fastqcRefine.dir
String? — Default: None
Equivalent to fastqc's --dir option.

SubreadsProcessing.fastqcRefine.extract
Boolean — Default: false
Equivalent to fastqc's --extract flag.

SubreadsProcessing.fastqcRefine.javaXmx
String — Default: "1750M"
The maximum memory available to the program. Should be lower than `memory` to accommodate JVM overhead.

SubreadsProcessing.fastqcRefine.kmers
Int? — Default: None
Equivalent to fastqc's --kmers option.

SubreadsProcessing.fastqcRefine.limits
File? — Default: None
Equivalent to fastqc's --limits option.

SubreadsProcessing.fastqcRefine.memory
String — Default: "2G"
The amount of memory this job will use.

SubreadsProcessing.fastqcRefine.minLength
Int? — Default: None
Equivalent to fastqc's --min_length option.

SubreadsProcessing.fastqcRefine.nano
Boolean — Default: false
Equivalent to fastqc's --nano flag.

SubreadsProcessing.fastqcRefine.noFilter
Boolean — Default: false
Equivalent to fastqc's --nofilter flag.

SubreadsProcessing.fastqcRefine.nogroup
Boolean — Default: false
Equivalent to fastqc's --nogroup flag.

SubreadsProcessing.fastqcRefine.timeMinutes
Int — Default: 1 + ceil(size(seqFile,"G")) * 4
The maximum amount of time the job will run in minutes.

SubreadsProcessing.libraryDesign
String — Default: "same"
Barcode structure of the library design.

SubreadsProcessing.lima.guess
Int — Default: 0
Try to guess the barcodes used, based on the provided mean score threshold; 0 deactivates guessing.

SubreadsProcessing.lima.guessMinCount
Int — Default: 0
Minimum number of ZMWs observed to whitelist barcodes.

SubreadsProcessing.lima.logLevel
String — Default: "WARN"
Set log level. Valid choices: (TRACE, DEBUG, INFO, WARN, FATAL).

SubreadsProcessing.lima.maxInputLength
Int — Default: 0
Maximum input sequence length; 0 means deactivated.

SubreadsProcessing.lima.maxScoredAdapters
Int — Default: 0
Analyze at most the provided number of adapters per ZMW; 0 means deactivated.

SubreadsProcessing.lima.maxScoredBarcodePairs
Int — Default: 0
Only use up to N barcode pair regions to find the barcode; 0 means use all.

SubreadsProcessing.lima.maxScoredBarcodes
Int — Default: 0
Analyze at most the provided number of barcodes per ZMW; 0 means deactivated.

SubreadsProcessing.lima.memory
String — Default: "2G"
The amount of memory available to the job.

SubreadsProcessing.lima.minEndScore
Int — Default: 0
Minimum end barcode score threshold, applied individually to the leading and trailing ends.

SubreadsProcessing.lima.minRefSpan
Float — Default: 0.5
Minimum reference span relative to the barcode length.

SubreadsProcessing.lima.minScoringRegion
Int — Default: 1
Minimum number of barcode regions with sufficient relative span to the barcode length.

SubreadsProcessing.lima.minSignalIncrease
Int — Default: 10
The minimal score difference, between first and combined, required to call a barcode pair different.

SubreadsProcessing.lima.peek
Int — Default: 0
Demux the first N ZMWs and return the mean score; 0 deactivates peeking.

SubreadsProcessing.lima.peekGuess
Boolean — Default: false
Try to infer the subset of barcodes used by peeking at the first 50,000 ZMWs.

SubreadsProcessing.lima.scoredAdapterRatio
Float — Default: 0.25
Minimum ratio of scored vs sequenced adapters.

SubreadsProcessing.lima.scoreFullPass
Boolean — Default: false
Only use subreads flanked by adapters for barcode identification.

SubreadsProcessing.lima.timeMinutes
Int — Default: 30
The maximum amount of time the job will run in minutes.

SubreadsProcessing.limaCores
Int — Default: 2
The number of CPU cores to be used by lima.

SubreadsProcessing.multiqcTask.clConfig
String? — Default: None
Equivalent to MultiQC's `--cl-config` option.

SubreadsProcessing.multiqcTask.comment
String? — Default: None
Equivalent to MultiQC's `--comment` option.

SubreadsProcessing.multiqcTask.config
File? — Default: None
Equivalent to MultiQC's `--config` option.

SubreadsProcessing.multiqcTask.dataFormat
String? — Default: None
Equivalent to MultiQC's `--data-format` option.

SubreadsProcessing.multiqcTask.dirs
Boolean — Default: false
Equivalent to MultiQC's `--dirs` flag.

SubreadsProcessing.multiqcTask.dirsDepth
Int? — Default: None
Equivalent to MultiQC's `--dirs-depth` option.

SubreadsProcessing.multiqcTask.exclude
Array[String]+? — Default: None
Equivalent to MultiQC's `--exclude` option.

SubreadsProcessing.multiqcTask.export
Boolean — Default: false
Equivalent to MultiQC's `--export` flag.

SubreadsProcessing.multiqcTask.fileList
File? — Default: None
Equivalent to MultiQC's `--file-list` option.

SubreadsProcessing.multiqcTask.fileName
String? — Default: None
Equivalent to MultiQC's `--filename` option.

SubreadsProcessing.multiqcTask.flat
Boolean — Default: false
Equivalent to MultiQC's `--flat` flag.

SubreadsProcessing.multiqcTask.force
Boolean — Default: false
Equivalent to MultiQC's `--force` flag.

SubreadsProcessing.multiqcTask.fullNames
Boolean — Default: false
Equivalent to MultiQC's `--fullnames` flag.

SubreadsProcessing.multiqcTask.ignore
String? — Default: None
Equivalent to MultiQC's `--ignore` option.

SubreadsProcessing.multiqcTask.ignoreSamples
String? — Default: None
Equivalent to MultiQC's `--ignore-samples` option.

SubreadsProcessing.multiqcTask.interactive
Boolean — Default: true
Equivalent to MultiQC's `--interactive` flag.

SubreadsProcessing.multiqcTask.lint
Boolean — Default: false
Equivalent to MultiQC's `--lint` flag.

SubreadsProcessing.multiqcTask.megaQCUpload
Boolean — Default: false
Opposite of MultiQC's `--no-megaqc-upload` flag.

SubreadsProcessing.multiqcTask.memory
String? — Default: None
The amount of memory this job will use.

SubreadsProcessing.multiqcTask.module
Array[String]+? — Default: None
Equivalent to MultiQC's `--module` option.

SubreadsProcessing.multiqcTask.pdf
Boolean — Default: false
Equivalent to MultiQC's `--pdf` flag.

SubreadsProcessing.multiqcTask.sampleNames
File? — Default: None
Equivalent to MultiQC's `--sample-names` option.

SubreadsProcessing.multiqcTask.tag
String? — Default: None
Equivalent to MultiQC's `--tag` option.

SubreadsProcessing.multiqcTask.template
String? — Default: None
Equivalent to MultiQC's `--template` option.

SubreadsProcessing.multiqcTask.timeMinutes
Int — Default: 2 + ceil((size(reports,"G") * 8))
The maximum amount of time the job will run in minutes.
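
The `timeMinutes` defaults of the fastqc and MultiQC tasks are expressions of the input size rather than constants, and they round at different points: fastqc rounds the size up first and then multiplies by 4, while MultiQC multiplies by 8 first and then rounds up. The sketch below reproduces both expressions in Python for illustration. It takes the size in gigabytes as a plain number, which is an assumption of this example; the function names are hypothetical.

```python
import math


def fastqc_time_minutes(seq_file_gb):
    """Mirror of: 1 + ceil(size(seqFile, "G")) * 4"""
    return 1 + math.ceil(seq_file_gb) * 4


def multiqc_time_minutes(reports_gb):
    """Mirror of: 2 + ceil((size(reports, "G") * 8))"""
    return 2 + math.ceil(reports_gb * 8)


# Compare both defaults for a small and a larger input.
for size_gb in (0.5, 2.5):
    print(size_gb, fastqc_time_minutes(size_gb), multiqc_time_minutes(size_gb))
```

Setting `timeMinutes` explicitly in the inputs replaces the computed default for that task.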

SubreadsProcessing.multiqcTask.title
String? — Default: None
Equivalent to MultiQC's `--title` option.

SubreadsProcessing.multiqcTask.zipDataDir
Boolean — Default: true
Equivalent to MultiQC's `--zip-data-dir` flag.

SubreadsProcessing.outputDirectory
String — Default: "."
The directory to which the outputs will be written.

SubreadsProcessing.refine.cores
Int — Default: 2
The number of cores to be used.

SubreadsProcessing.refine.logLevel
String — Default: "WARN"
Set log level. Valid choices: (TRACE, DEBUG, INFO, WARN, FATAL).

SubreadsProcessing.refine.memory
String — Default: "2G"
The amount of memory available to the job.

SubreadsProcessing.refine.minPolyALength
Int — Default: 20
Minimum poly(A) tail length.

SubreadsProcessing.refine.timeMinutes
Int — Default: 30
The maximum amount of time the job will run in minutes.

SubreadsProcessing.runIsoseq3Refine
Boolean — Default: false
Run isoseq3 refine for de novo transcript reconstruction. Do not set this to true when analysing DNA reads.

SubreadsProcessing.splitBamNamed
Boolean — Default: true
Split bam file(s) by resolved barcode pair name.


Generated using WDL AID (0.1.1)