BioWDL: ChIP-seq

A BioWDL pipeline for processing ChIP-seq data.

Category: Multi-Sample, Experimental

Currently viewing version: develop

View the Project on GitHub biowdl/ChIP-seq


There are no releases available for this repository. The content is therefore likely still under development and not production ready. Use at your own risk!

This pipeline can be used to analyse ChIP-seq data. It performs quality control, mapping and peak-calling.

Usage

To run the complete multi-sample pipeline, run pipeline.wdl using Cromwell:

java -jar cromwell-<version>.jar run -i inputs.json pipeline.wdl
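
When the pipeline is launched from a script rather than typed by hand, the command above can be assembled as an argument list. The following is a minimal sketch: the helper name cromwell_command is hypothetical, and the JAR file name is a placeholder for whichever Cromwell version is installed.

```python
import shlex
from typing import List


def cromwell_command(cromwell_jar: str,
                     inputs_json: str = "inputs.json",
                     workflow: str = "pipeline.wdl") -> List[str]:
    """Build the argument list for `java -jar <jar> run -i <inputs> <workflow>`."""
    return ["java", "-jar", cromwell_jar, "run", "-i", inputs_json, workflow]


# The JAR file name is a placeholder; substitute the Cromwell version in use.
command = cromwell_command("cromwell-<version>.jar")

# Print a shell-quoted form of the command for inspection.
print(shlex.join(command))

# To actually start the run, pass the list to subprocess:
#   import subprocess
#   subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.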

The inputs JSON can be generated using WOMtools as described in the WOMtools documentation.

The primary inputs are described below. Additional inputs (such as precommands and JAR paths) are also available; use the WOMtools command mentioned above to see all available inputs.

| field | type | default | description |
|---|---|---|---|
| outputDir | String | | The output directory. |
| refDict | File | | The reference dict file. |
| refFasta | File | | The reference fasta file. |
| refFastaIndex | File | | The index for the reference fasta. |
| sample.library.readgroup.mapping.bwaMem.indexFiles | Array[File] | | The index files for BWA mem. |
| sample.library.readgroup.mapping.bwaMem.referenceFasta | File | | The reference fasta file from the BWA mem index. |
| sampleConfigFiles | Array[File] | | The sample configuration files. |

Every input name must be prefixed with "pipeline." (for example, pipeline.outputDir). Types follow the WDL data types: a File is given as the location of the file (a string in JSON). Types ending in ? are optional; types ending in + require at least one element.
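
As an illustration of the prefix rule, the sketch below builds an inputs JSON from the primary inputs listed in the table. The helper name build_inputs and every path are placeholders chosen for this example, not values shipped with the pipeline; the BWA index extensions shown are the ones BWA normally writes next to the reference.

```python
import json
from typing import Any, Dict


def build_inputs(primary: Dict[str, Any], prefix: str = "pipeline.") -> Dict[str, Any]:
    """Return a copy of `primary` with the workflow prefix added to every key."""
    return {prefix + name: value for name, value in primary.items()}


# Placeholder reference location; replace with a real path.
bwa_reference = "/path/to/bwa_index/reference.fasta"

primary_inputs = {
    "outputDir": "/path/to/output",
    "refDict": "/path/to/reference.dict",
    "refFasta": "/path/to/reference.fasta",
    "refFastaIndex": "/path/to/reference.fasta.fai",
    # File values are plain strings in JSON; Array[File] becomes a list of strings.
    "sample.library.readgroup.mapping.bwaMem.indexFiles": [
        bwa_reference + ext for ext in (".amb", ".ann", ".bwt", ".pac", ".sa")
    ],
    "sample.library.readgroup.mapping.bwaMem.referenceFasta": bwa_reference,
    "sampleConfigFiles": ["/path/to/samples.yml"],
}

inputs_json = json.dumps(build_inputs(primary_inputs), indent=2)
print(inputs_json)
```

The printed text can be saved as inputs.json and passed to Cromwell with the -i option.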

Sample configuration

The sample configuration should be a YAML file which adheres to the following structure:

samples:
  <sample>:
    libraries:
      <library>:
        readgroups:
          <readgroup>:
            R1: <Path to first-end FastQ file.>
            R1_md5: <MD5 checksum of first-end FastQ file.>
            R2: <Path to second-end FastQ file.>
            R2_md5: <MD5 checksum of second-end FastQ file.>

Replace the text between < > with appropriate values. R2 values may be omitted in the case of single-end data. Multiple readgroups can be added per library and multiple libraries may be given per sample.
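
Filling in the MD5 fields by hand is error-prone, so a configuration entry can also be generated. The sketch below is one possible approach, using only the Python standard library: the function names md5sum and readgroup_entry are hypothetical, the sample, library and readgroup names are made up, and the output is written with plain string formatting, so paths containing characters that are special in YAML would need quoting.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Optional


def md5sum(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def readgroup_entry(sample: str, library: str, readgroup: str,
                    r1: str, r2: Optional[str] = None) -> str:
    """Render a sample configuration with one sample, library and readgroup."""
    lines = [
        "samples:",
        f"  {sample}:",
        "    libraries:",
        f"      {library}:",
        "        readgroups:",
        f"          {readgroup}:",
        f"            R1: {r1}",
        f"            R1_md5: {md5sum(r1)}",
    ]
    # R2 is omitted for single-end data.
    if r2 is not None:
        lines.append(f"            R2: {r2}")
        lines.append(f"            R2_md5: {md5sum(r2)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Demonstration with a tiny throwaway FastQ file (single-end).
    with tempfile.TemporaryDirectory() as tmp:
        fastq = Path(tmp) / "example_R1.fastq"
        fastq.write_text("@read1\nACGT\n+\nIIII\n")
        print(readgroup_entry("sample1", "lib1", "rg1", str(fastq)))
```

For paired-end data, pass the second FastQ path as r2 and both R2 fields are added to the entry.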

Tool versions

Included in the repository is an environment.yml file. This file lists all the tool versions with which the workflow was tested. You can use conda and this file to create an environment with all the correct tools.

Output

TBD

About

This pipeline is part of BioWDL, developed by the SASC team.

Contact

For any questions related to this pipeline, please use the GitHub issue tracker or contact the SASC team directly at: sasc@lumc.nl.