FastQC is a quality analysis tool for FASTQ-formatted sequence reads.  More information at

  1. Run on initial raw data
    1. Output is an HTML report plus a zipped archive, one of each per input file
  2. Remove adapters and quality trim (Trimmomatic)
  3. Re-run on cleaned data
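
Because each input yields one HTML report and one zip archive, it is easy to check that a run covered every file before moving on to trimming. The following is a minimal sketch, assuming FastQC names its reports `<stem>_fastqc.html` and `<stem>_fastqc.zip`, where the stem is the input name without a trailing `.gz`/`.bz2` and `.fastq`/`.fq`. The function names `fastqc_stem` and `missing_reports` are hypothetical helpers, not part of FastQC.

```shell
#!/usr/bin/env bash

# Print the stem FastQC is assumed to use for its report names:
# the file name without directory, compression suffix, or fastq suffix.
fastqc_stem() {
    local name=${1##*/}     # drop any leading directory
    name=${name%.gz}        # drop compression suffixes, if present
    name=${name%.bz2}
    name=${name%.fastq}     # drop the sequence-file suffix
    name=${name%.fq}
    printf '%s\n' "$name"
}

# List the inputs that lack an HTML report or a zip archive in OUTDIR.
# Usage: missing_reports OUTDIR FILE...
missing_reports() {
    local outdir=$1 f stem
    shift
    for f in "$@"; do
        stem=$(fastqc_stem "$f")
        if [ ! -e "$outdir/${stem}_fastqc.html" ] || [ ! -e "$outdir/${stem}_fastqc.zip" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

For example, `missing_reports . *.fastq` prints the inputs in the current directory that still need a report; no output means every file has both.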

Torque/PBS examples

  1. See the Trimmomatic article for a complete workflow: QC, trimming, then QC again.
    1. FastQC accepts multiple input files, so with a shell wildcard in the target filenames one command is typically enough to run on all files in a directory.
    2. The Kmer report is turned off in the bioinfo installation. If you find this output useful (I do), you can provide a limits file that turns it on. Feel free to use or copy mine (the file named in the --limits option of the script below).
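
As an alternative to copying an existing file, a limits file with the Kmer module enabled can be derived from the `limits.txt` that ships with FastQC. This is a minimal sketch, assuming the stock file disables the module with a line of the form `kmer ignore 1` (whitespace-separated), as in FastQC 0.11.x. The function name `enable_kmer` and the file names in the usage note are hypothetical.

```shell
#!/usr/bin/env bash

# Copy a FastQC limits file, switching the Kmer module on.
# Assumption: the module is disabled by a line "kmer ignore 1";
# setting the last field to 0 re-enables it. All other lines are kept.
# Usage: enable_kmer SOURCE_LIMITS NEW_LIMITS
enable_kmer() {
    local src=$1 dst=$2
    awk '$1 == "kmer" && $2 == "ignore" { print "kmer\tignore\t0"; next }
         { print }' "$src" > "$dst"
}
```

Usage would look like `enable_kmer /path/to/FastQC/Configuration/limits.txt my.limits`, followed by `fastqc --limits my.limits ...`. The source path is a placeholder; the location of the stock file depends on the installation.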
FastQC on all fastq files in a directory
#PBS -q standby
#PBS -N fastqc
#PBS -l walltime=4:00:00
#PBS -l nodes=1:ppn=20
#PBS -l epilogue=/home/mgribsko/jobs/
#PBS -l naccesspolicy=shared
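module load fastqc

# Torque/PBS jobs start in the home directory by default;
# move to the directory the job was submitted from
cd "$PBS_O_WORKDIR"

# run fastqc on all fastq files in the current directory
# optional settings (add to the command below to use them):
# --outdir fastqc                                   # send output to directory fastqc; the directory must exist
# --limits /home/mgribsko/reference/fastqc.limits   # use the specified configuration file (fastqc.limits)
fastqc --threads 20 *.fastq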
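
Two details of the script are easy to trip over: `--outdir` requires a directory that already exists, and in bash an unmatched `*.fastq` pattern is passed to the command as a literal string. The wrapper below is a minimal sketch that handles both. The function name `run_fastqc_dir` and the `FASTQC` override variable are hypothetical; the override exists only so the command line can be inspected without running FastQC.

```shell
#!/usr/bin/env bash

# Run FastQC on every .fastq file in the current directory.
# Usage: run_fastqc_dir [OUTDIR] [THREADS]   (defaults: fastqc, 20)
run_fastqc_dir() {
    local outdir=${1:-fastqc} threads=${2:-20}

    # Expand the pattern to an empty list when nothing matches.
    # Note: this leaves nullglob switched off afterwards.
    shopt -s nullglob
    local files=(*.fastq)
    shopt -u nullglob

    if [ "${#files[@]}" -eq 0 ]; then
        echo "no .fastq files in $PWD" >&2
        return 1
    fi

    # --outdir does not create the directory, so create it first.
    mkdir -p "$outdir"

    # FASTQC is a hypothetical hook, e.g. FASTQC=echo for a dry run.
    ${FASTQC:-fastqc} --threads "$threads" --outdir "$outdir" "${files[@]}"
}
```

In a job script, a call such as `run_fastqc_dir fastqc 20` could stand in for the final `fastqc` line; setting `FASTQC=echo` first prints the command instead of running it.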