
FastQC is a quality analysis tool for FASTQ-formatted sequence reads. More information is available at https://www.bioinformatics.babraham.ac.uk/projects/fastqc/

  1. Run FastQC on the initial raw data
    1. Output is an HTML report plus one zipped output for each input file
  2. Remove adapters and quality trim (Trimmomatic)
  3. Re-run FastQC on the cleaned data
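
Each zipped output contains a summary.txt file with one tab-separated line per analysis module (status, module name, input filename). As a minimal sketch of checking results without opening every HTML report, the helper below prints the modules that did not pass. The function name failed_modules is hypothetical and is not part of FastQC.

```shell
#!/bin/bash
# Hypothetical helper (not part of FastQC): read FastQC summary.txt content
# on stdin and print every module whose status is WARN or FAIL.
# Each summary line is tab-separated: STATUS <tab> module name <tab> input file.
# Output columns: input file, status, module name.
failed_modules() {
    awk -F'\t' '$1 == "WARN" || $1 == "FAIL" { printf "%s\t%s\t%s\n", $3, $1, $2 }'
}
```

One possible use is to stream the summary straight out of a zipped output, for example: unzip -p sample_fastqc.zip '*/summary.txt' | failed_modules (sample_fastqc.zip is a placeholder name).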

Torque/PBS examples

  1. See the trimmomatic article for a complete workflow for QC, trimming, QC. 
    1. FastQC will accept wildcards in the target filenames, so only one command is typically needed to run on all files in a directory.
    2. The Kmer report is turned off in the bioinfo installation. If you find this output useful (I do), you can provide a limits file (passed with --limits) that turns it on. Feel free to use or copy mine:
      /home/mgribsko/reference/fastqc.limits
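
If you would rather build your own limits file, the following is a rough sketch. It assumes the limits file disables the module with a line of the form "kmer ignore 1" and that changing the value to 0 turns the module back on; check this against the limits file shipped with your FastQC version. The function name enable_kmer is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: copy a FastQC limits file, switching the Kmer module on.
# Assumption: the module is disabled by a line "kmer ignore 1" and enabled
# by changing the value to 0. All other lines are copied unchanged.
# Usage: enable_kmer SOURCE_LIMITS DEST_LIMITS
enable_kmer() {
    local src=$1 dest=$2
    awk '$1 == "kmer" && $2 == "ignore" { print "kmer\tignore\t0"; next } { print }' \
        "$src" > "$dest"
}
```

The source file would be the default limits file from your FastQC installation (its location depends on the install), and the destination is whatever path you later pass to --limits.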
FastQC on all fastq files in a directory
#!/bin/bash
#PBS -q standby
#PBS -N fastqc
#PBS -l walltime=4:00:00
#PBS -l nodes=1:ppn=20
#PBS -l epilogue=/home/mgribsko/jobs/epilogue.sh
#PBS -l naccesspolicy=shared
 
cd "$PBS_O_WORKDIR"
module load fastqc

# Run FastQC on all fastq files in the directory, one file per thread
# (--threads matches ppn above).
#
# Optional settings:
#   --outdir fastqc                                   send output to directory fastqc (the directory must already exist)
#   --limits /home/mgribsko/reference/fastqc.limits   use the specified limits file (fastqc.limits)
fastqc --threads 20 *.fastq
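
The two optional settings noted in the script comments can be combined with the same command. A minimal sketch follows, assuming a hypothetical wrapper function named run_fastqc. It creates the output directory first, because FastQC requires that directory to exist.

```shell
#!/bin/bash
# Hypothetical wrapper: run FastQC on the given files, writing reports to
# an output directory that is created first if it does not exist.
# Usage: run_fastqc OUTDIR LIMITS_FILE FILE...
run_fastqc() {
    local outdir=$1 limits=$2
    shift 2
    mkdir -p "$outdir" || return 1
    fastqc --threads 20 --outdir "$outdir" --limits "$limits" "$@"
}
```

In the job script this could be called as: run_fastqc fastqc /home/mgribsko/reference/fastqc.limits *.fastq. For the re-run on cleaned data, passing a different directory name keeps the two sets of reports apart.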