
This example shows how to use a UNIX shell (bash) loop to iterate over a list of files. It is slightly complicated because several different file names are derived from each file name in the input list.

Step-by-step guide

#!/bin/bash
#PBS -N trim-RNA
#PBS -q scholar
#PBS -l nodes=1:ppn=20
#PBS -l walltime=24:00:00
#PBS -l epilogue=/home/mgribsko/jobs/epilogue.sh
cd "$PBS_O_WORKDIR"
module load trimmomatic

# this is the list of file names for the R1 samples. The R2 file names are generated below, so only the R1 names are needed
# a space-delimited list inside quotes; the '\' at the end of the first line continues the list onto the next line
sample="021938_JD-6_ATCACG_run492_L007_R1_001.fastq 021939_JD-8_CGATGT_run492_L007_R1_001.fastq 021940_JD-5_TTAGGC_run492_L007_R1_001.fastq \
021941_MCA-2504_TGACCA_run492_L007_R1_001.fastq 021942_MCA-2952_ACAGTG_run492_L007_R1_001.fastq 021943_MCA-2974_GCCAAT_run492_L007_R1_001.fastq"
# the trimmomatic trimming steps, stored in a variable for convenience
trimmer="ILLUMINACLIP:adapter.fa:2:20:7 LEADING:10 TRAILING:10 SLIDINGWINDOW:4:13 MINLEN:30 "
# my data files are in a different directory, so I define the path and store it in the variable data
# no trailing slash here, because the slash is added where $data is used below
data='../../clean'
# loop over all the file names in $sample
for r1 in $sample; do
    # copy the file name, r1, to a new variable, r2, then substitute R2 for R1 in the name
    # this generates the name of the R2 file
    r2=$r1
    r2="${r1/R1/R2}"
    # uncomment the next line to print the input file names for debugging
    #echo file: "$data/$r1" "$data/$r2"
    # generate the names for the four output files, R1.paired, R1.unpaired, R2.paired, and R2.unpaired,
    # from the names of the R1 and R2 input files
    # notice I skipped copying the name of r1 into r1p, and just substituted .paired.fastq for .fastq, putting the result in a new variable
    r1p="${r1/.fastq/.paired.fastq}"
    r1u="${r1/.fastq/.unpaired.fastq}"
    r2p="${r2/.fastq/.paired.fastq}"
    r2u="${r2/.fastq/.unpaired.fastq}"
    # run the trimmomatic command; note the path to the data files, $data
    # $trimmer is deliberately left unquoted so that it splits into separate trimming steps
    trimmomatic PE -threads 20 "$data/$r1" "$data/$r2" "$r1p" "$r1u" "$r2p" "$r2u" $trimmer
done
# this is the end of the loop
############################################################
# post-cleaning - fastqc
############################################################
module load fastqc
# -p avoids an error if the directory already exists from an earlier run
mkdir -p fastqc_after
fastqc -q -t 20 -o fastqc_after *.fastq
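
The file names in the loop above are all derived with bash pattern substitution, ${variable/pattern/replacement}, which replaces only the first match of the pattern. The following is a minimal sketch, not part of the job script, that prints the derived names for one R1 file without reading or writing any files. The second file name (tricky) is hypothetical and is included only to show why a stricter pattern such as _R1_ may be safer when a sample name could itself contain "R1".

```shell
#!/bin/bash
# Sketch only: derive names from one R1 file name and print them.
r1="021938_JD-6_ATCACG_run492_L007_R1_001.fastq"

# ${var/pattern/replacement} replaces the FIRST match of pattern in $var
r2="${r1/R1/R2}"
r1p="${r1/.fastq/.paired.fastq}"
r1u="${r1/.fastq/.unpaired.fastq}"

echo "$r2"
echo "$r1p"
echo "$r1u"

# Hypothetical sample name that contains "R1" before the read marker
tricky="CR1-7_ATCACG_L007_R1_001.fastq"
loose="${tricky/R1/R2}"       # changes the sample name, not the read marker
strict="${tricky/_R1_/_R2_}"  # anchoring on the underscores hits the read marker
echo "$loose"
echo "$strict"
```

For the six file names used in this job, R1 occurs only once in each name, so the simpler pattern gives the same result as the stricter one.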

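As an alternative to the space-delimited string, the list of R1 file names could be held in a bash array, which needs no line continuation and allows one name per line. The sketch below is an assumption about how that might look, not the script used for this job. It is also a dry run: it only builds and prints the start of each trimmomatic command, so it can be tested on a login node before submitting. The variable names samples and commands are introduced here for illustration.

```shell
#!/bin/bash
# Sketch only: R1 names in an array, commands printed rather than run.
samples=(
    021938_JD-6_ATCACG_run492_L007_R1_001.fastq
    021939_JD-8_CGATGT_run492_L007_R1_001.fastq
)
data='../../clean'

commands=()
for r1 in "${samples[@]}"; do
    # derive the R2 name from the R1 name
    r2="${r1/_R1_/_R2_}"
    # store the command text instead of executing it
    commands+=("trimmomatic PE -threads 20 $data/$r1 $data/$r2")
done

# print one command per line for inspection
printf '%s\n' "${commands[@]}"
```

Once the printed commands look right, the same loop body can call trimmomatic directly, as in the job script above.
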
There are many tutorials on UNIX/bash shell programming. Here is one that is brief, non-technical, and covers the basics:

bash scripting tutorial