
Drell-Yan analysis Procedure

This twiki documents the most important steps of the Drell-Yan cross section measurement. It is intended to familiarize you with the technical aspects of the analysis procedure. 

The PDF of AN-13-420 attached here contains notes giving the name of the macro used to produce each plot. A table with the same information is also provided.

Step 1: Producing ntuples

  • Samples
  • The CMSSW_53X MC samples are used for the 8 TeV analysis. Below is the list of starting GEN-SIM-RECO samples used in the muon and electron analyses:

DYToMuMuM-10To20 & Powheg-Pythia6 & CT10TuneZ2star 

DYToMuMuM-20 & Powheg-Pythia6 & CT10TuneZ2star 

DYToMuMuM-200 & Powheg-Pythia6 & TuneZ2star 

DYToMuMuM-400 & Powheg-Pythia6 & TuneZ2star 

DYToMuMuM-500 & Powheg-Pythia6 & TuneZ2star 

DYToMuMuM-700 & Powheg-Pythia6 & TuneZ2star 

DYToMuMuM-800 & Powheg-Pythia6 & TuneZ2star 

DYToMuMuM-1000 & Powheg-Pythia6 & TuneZ2star

DYToMuMuM-1500 & Powheg-Pythia6 & TuneZ2star 

DYToMuMuM-2000 & Powheg-Pythia6 & TuneZ2star 

DYToEEM-10To20 & Powheg-Pythia6 & CT10TuneZ2star 

DYToEEM-20 & Powheg-Pythia6 & CT10TuneZ2star 

DYToEEM-200 & Powheg-Pythia6 & TuneZ2star 

DYToEEM-400 & Powheg-Pythia6 & TuneZ2star 

DYToEEM-500 & Powheg-Pythia6 & TuneZ2star 

DYToEEM-700 & Powheg-Pythia6 & TuneZ2star 

DYToEEM-800 & Powheg-Pythia6 & TuneZ2star 

DYToEEM-1000 & Powheg-Pythia6 & TuneZ2star 

DYToEEM-1500 & Powheg-Pythia6 & TuneZ2star 

DYToEEM-2000 & Powheg-Pythia6 & TuneZ2star 

DYToTauTauM-10To20 & Powheg-Pythia6-tauola & TuneZ2star 

DYToTauTauM-20 & Powheg-Pythia6-tauola &CT10TuneZ2star 

WJetsToLNu & madgraph-tarball & TuneZ2star 

WWJetsTo2L2Nu &  madgraph-tauola & TuneZ2star 

WZJetsTo2L2Q &  madgraph-tauola& TuneZ2star 

WZJetsTo3LNu &  madgraph-tauola& TuneZ2star 

ZZJetsTo2L2Nu &  madgraph-tauola& TuneZ2star 

ZZJetsTo2L2Q &  madgraph-tauola& TuneZ2star 

ZZJetsTo4L &  madgraph-tauola& TuneZ2star 

TTMtt-700to1000 &  Powheg-tauola& TuneZ2star 

TTMtt-1000toInf  &  Powheg-tauola& TuneZ2star 

TTJetsFullLeptMGDecays &  madgraph& TuneZ2star 

TTJetsFullLeptMGDecays &  madgraph& TuneZ2star 

TT & Powheg-tauola & TuneZ2star 

TW & Powheg-tauola & TuneZ2star 

TbarW & Powheg-tauola & TuneZ2star 

QCDPt-15to20MuPt5Enriched & Pythia6 &TuneZ2star 

QCDPt-20to30MuPt5Enriched & Pythia6 &TuneZ2star 

QCDPt-30to50MuPt5Enriched & Pythia6 &TuneZ2star 

QCDPt-50to80MuPt5Enriched & Pythia6 &TuneZ2star

QCDPt-80to120MuPt5Enriched & Pythia6 & TuneZ2star

QCDPt-120to150MuPt5Enriched & Pythia6 &TuneZ2star 

QCDPt-150MuPt5Enriched & Pythia6 & TuneZ2star 

MC generation is 53X

  • DATA:
  • We use the SingleMu, DoubleMu (DoubleMuParked), DoubleElectron, MuEG and Photon (SinglePhoton, SinglePhotonParked) Primary Datasets (PD), 22Jan2013 ReReco version:

    /DoubleMu/Run2012A-22Jan2013-v1/AOD : 190645-193621

    /DoubleElectron/Run2012A-22Jan2013-v1/AOD

    /DoubleMuParked/Run2012B-22Jan2013-v1/AOD : 193834-196531

    /DoubleElectron/Run2012B-22Jan2013-v1/AOD

    /DoubleMuParked/Run2012C-22Jan2013-v1/AOD : 198049-203742

    /DoubleElectron/Run2012C-22Jan2013-v1/AOD

    /DoubleMuParked/Run2012D-22Jan2013-v1/AOD : 203777-208686

    /DoubleElectron/Run2012D-22Jan2013-v1/AOD

    /SingleMu/Run2012A-22Jan2013-v1/AOD : 190645-193621

    /SingleMu/Run2012B-22Jan2013-v1/AOD : 193834-196531

    /SingleMu/Run2012C-22Jan2013-v1/AOD : 198049-203742

    /SingleMu/Run2012D-22Jan2013-v1/AOD : 203777-208686

    /MuEG/Run2012A-22Jan2013-v1/AOD

    /MuEG/Run2012B-22Jan2013-v1/AOD

    /MuEG/Run2012C-22Jan2013-v1/AOD

    /MuEG/Run2012D-22Jan2013-v1/AOD

    /Photon/Run2012A-22Jan2013-v1/AOD

    /SinglePhoton/Run2012B-22Jan2013-v1/AOD

    /SinglePhoton/Run2012C-22Jan2013-v1/AOD

    /SinglePhotonParked/Run2012D-22Jan2013-v1/AOD

  • JSON: Cert_190456-208686_8TeV_22Jan2013ReReco_Collisions12_JSON.txt (22Jan2013 ReReco)
  • The double muon and double electron samples are used for the main analysis; the single muon samples are used for the efficiency correction estimation steps. The other samples are used for background estimation.
  • Relevant software: CMSSW_5_3_3_patch2
    • Use the latest global tags for MC and data for a given release, as documented here
    • DY analysis package (Purdue), DY analysis package (MIT) are used 
cmsrel CMSSW_5_3_4
cd CMSSW_5_3_4/src
cmsenv
git clone git@github.com:ASvyatkovskiy/DrellYanAnalysis .
scram b -j8
export DYWorkDir=$PWD/DimuonAnalysis/DYPackage
cd $DYWorkDir/ntuples

Note that an slc5 machine is necessary for proper compilation. The code might not compile out of the box on slc6 or with later CMSSW release versions (it would need to be ported first).
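The certification JSON listed under DATA above is normally applied at the job-submission level, but it can be handy to check a (run, lumi section) pair by hand. The following is a minimal sketch, assuming the usual layout of such files: a dictionary mapping the run number (as a string) to a list of [first, last] lumi-section ranges. The function names and the toy mask values are made up for illustration and are not part of the analysis package.

```python
import json

def load_lumi_mask(text):
    """Parse certification JSON text into {run: [(first, last), ...]}."""
    raw = json.loads(text)
    return {int(run): [tuple(r) for r in ranges] for run, ranges in raw.items()}

def is_certified(mask, run, lumi):
    """True if (run, lumi) falls inside any certified lumi-section range."""
    return any(first <= lumi <= last for first, last in mask.get(run, []))

# Toy mask in the assumed layout of the certification file (values are made up).
example = '{"190645": [[10, 110]], "190704": [[1, 3], [7, 9]]}'
mask = load_lumi_mask(example)
print(is_certified(mask, 190645, 50))   # lumi 50 lies inside [10, 110]
print(is_certified(mask, 190704, 5))    # lumi 5 lies between the two ranges
```

For the real file, read its contents with open(...).read() and pass the text to load_lumi_mask.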

To run a quick local test of the ntuple-maker:

cmsRun ntuple_cfg.py

To produce the ntuples over the full dataset, use CRAB:

crab -create -submit -cfg crab.cfg
crab -get all -c <crab_0_datetime>

Notes on the ntuples used

Ntuples in several TTree formats are used in the analysis. First, the Purdue flat format: it contains branches of fundamental data types such as int, bool and float. It stores only the branches used for the dimuon analysis and is optimized for storage: the total size of the ntuples is within 2TB, so they can fit in the scratch space. The signal and background MC samples are located at: /mnt/hadoop/store/user/asvyatko/DYstudy/dataAnalysis13/rootfiles/, and the data ntuples are located at: /mnt/hadoop/store/group/ewk/DY2013/:

1) In /mnt/hadoop/store/group/ewk/DY2013/

drwxrwxr-x 228 cms1005 cmsprio 4096 Jun 25  2013 Data_RunAJan2013

drwxrwxr-x 268 cms1005 cmsprio 4096 Oct 26  2013 Data_RunAJan2013_Oct

drwxrwxr-x 341 cms1005 cmsprio 4096 Oct 26  2013 Data_RunBJan2013_Oct_p1

drwxrwxr-x 438 cms1005 cmsprio 4096 Oct 27  2013 Data_RunBJan2013_Oct_p2

drwxrwxr-x 319 cms1005 cmsprio 4096 Jun 26  2013 Data_RunBJan2013_p1

drwxrwxr-x 340 cms1005 cmsprio 4096 Oct  7  2013 Data_RunBJan2013_p1_ES

drwxrwxr-x 399 cms1005 cmsprio 4096 Jun 26  2013 Data_RunBJan2013_p2

drwxrwxr-x 440 cms1005 cmsprio 4096 Oct  6  2013 Data_RunBJan2013_p2_ES

drwxrwxr-x 475 cms1005 cmsprio 4096 Oct 30  2013 Data_RunCJan2013_Oct_p1

drwxrwxr-x 431 cms1005 cmsprio 4096 Oct 27  2013 Data_RunCJan2013_Oct_p2

drwxrwxr-x 460 cms1005 cmsprio 4096 Jun 27  2013 Data_RunCJan2013_p1

drwxrwxr-x 474 cms1005 cmsprio 4096 Oct 12  2013 Data_RunCJan2013_p1_ES

drwxrwxr-x 402 cms1005 cmsprio 4096 Jun 28  2013 Data_RunCJan2013_p2

drwxrwxr-x 432 cms1005 cmsprio 4096 Oct  6  2013 Data_RunCJan2013_p2_ES

drwxrwxr-x 493 cms1005 cmsprio 4096 Oct 28  2013 Data_RunDJan2013_Oct_p1

drwxrwxr-x 417 cms1005 cmsprio 4096 Oct 27  2013 Data_RunDJan2013_Oct_p2

drwxrwxr-x 489 cms1005 cmsprio 4096 Jun 29  2013 Data_RunDJan2013_p1

drwxrwxr-x 494 cms1005 cmsprio 4096 Oct  6  2013 Data_RunDJan2013_p1_ES

drwxrwxr-x 418 cms1005 cmsprio 4096 Jun 29  2013 Data_RunDJan2013_p2

drwxrwxr-x 415 cms1005 cmsprio 4096 Oct  6  2013 Data_RunDJan2013_p2_ES

drwxrwxr-x 304 cms1005 cmsprio 4096 Apr 22  2013 QCD120to170

drwxrwxr-x 103 cms1005 cmsprio 4096 Apr 17  2013 QCD15to20

drwxrwxr-x 223 cms1005 cmsprio 4096 Apr 22  2013 QCD170to300

drwxrwxr-x 329 cms1005 cmsprio 4096 Apr 26  2013 QCD20to30

drwxrwxr-x 333 cms1005 cmsprio 4096 Apr 19  2013 QCD30to50

drwxrwxr-x 347 cms1005 cmsprio 4096 Apr 19  2013 QCD50to80

drwxrwxr-x 333 cms1005 cmsprio 4096 Apr 19  2013 QCD80to120

2) In /mnt/hadoop/store/user/asvyatko/DYstudy/dataAnalysis13/rootfiles/

drwxrwxr-x   83 cms1005 root 4096 Nov  1  2013 DYM1000_PUOct

drwxrwxr-x  398 cms1005 root 4096 Nov  7  2013 DYM1020_PUOct_FULL

drwxrwxr-x   83 cms1005 root 4096 Nov  1  2013 DYM1500_PUOct

drwxrwxr-x   83 cms1005 root 4096 Nov  1  2013 DYM2000_PUOct

drwxrwxr-x   83 cms1005 root 4096 Nov  1  2013 DYM200_PUOct

drwxrwxr-x  428 cms1005 root 4096 May 20  2014 DYM20_MadGraph_Light

drwxrwxr-x 1001 cms1005 root 4096 Nov  1  2013 DYM20_PUOct_Stefano

drwxrwxr-x   83 cms1005 root 4096 Nov  1  2013 DYM400_PUOct

drwxrwxr-x   83 cms1005 root 4096 Nov  1  2013 DYM500_PUOct

drwxrwxr-x   83 cms1005 root 4096 Nov  1  2013 DYM700_PUOct

drwxrwxr-x   83 cms1005 root 4096 Nov  1  2013 DYM800_PUOct

drwxrwxr-x  103 cms1005 root 4096 May 14  2014 DYtautau1020_Light

drwxrwxr-x  463 cms1005 root 4096 May 17  2014 DYtautau20_Light

drwxrwxr-x  420 cms1005 root 4096 May 14  2014 WJets_Light

drwxrwxr-x  203 cms1005 root 4096 May  7  2014 WWJetsTo2L2Nu_Light

drwxrwxr-x   42 cms1005 root 4096 May 10  2014 WZJets3LNu_Light

drwxrwxr-x   62 cms1005 root 4096 May  9  2014 WZJetsTo2L2Q_Light

drwxrwxr-x   22 cms1005 root 4096 May  7  2014 ZZJetsTo2L2Nu_Light

drwxrwxr-x   39 cms1005 root 4096 May  7  2014 ZZJetsTo2L2Q_Light

drwxrwxr-x  324 cms1005 root 4096 May  9  2014 ZZJetsTo4L_Light

drwxrwxr-x   86 cms1005 root 4096 May  9  2014 tW_Light

drwxrwxr-x   83 cms1005 root 4096 May  8  2014 tbarW_Light

drwxrwxr-x  204 cms1005 root 4096 May  6  2014 tt1000_Light

drwxrwxr-x  211 cms1005 root 4096 May  7  2014 tt700_Light

drwxrwxr-x  203 cms1005 root 4096 May  8  2014 ttjets_v1_Light

drwxrwxr-x  256 cms1005 root 4096 May  7  2014 ttjets_v2_Light


Second, the Purdue objectified format, which contains both electron and muon branches. The tree contains 101 branches, and the average compression factor of the tree is 3.50. It is less efficient in the analysis and takes relatively more time to process, as it must be stored on the FUSE-mounted partition. Its size is about 5TB. It has 165 branches in the tree, and the average tree compression factor is 35.69. Ntuples of this type are located at: /mnt/hadoop/store/group/ewk/DY2013/

Finally, the MIT ntuple format, which contains electron, photon and e-mu branches stored as TClonesArrays of objects. Its size is very small (about 1TB), and it is stored here: /mnt/hadoop/store/user/asvyatko/DYstudy/dataAnalysis13/MITrootfiles. It contains over 311 branches in the tree, and the average tree compression factor is 2.14.

-rw-r--r-- 1 cms1005 root   587300667 Nov 25 21:42 r12a-del-j22-v1_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root  2671161333 Nov 25 21:43 r12b-del-j22-v1_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root  4052114248 Nov 25 21:46 r12c-del-j22-v1_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root  4035054951 Nov 25 21:48 r12d-del-j22-v1_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root   123602971 Nov 25 21:48 s12-qcdbc170250-v7a_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root   118476429 Nov 25 21:48 s12-qcdbc2030-v7a_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root   135250578 Nov 25 21:49 s12-qcdbc250350-v7a_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root   138742992 Nov 25 21:49 s12-qcdbc3080-v7a_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root   129070704 Nov 25 21:49 s12-qcdbc350-v7a_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root   131205237 Nov 25 21:49 s12-qcdbc80170-v7a_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root  2035355456 Nov 25 21:50 s12-qcdem170250-v7a_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root  2248026709 Nov 25 21:51 s12-qcdem2030-v7a_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root  2179316770 Nov 25 21:52 s12-qcdem250350-v7a_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root  2107927634 Nov 25 21:53 s12-qcdem3080-v7a_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root  1004102720 Nov 25 21:53 s12-qcdem350-v7a_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root  2209661151 Nov 25 21:54 s12-qcdem80170-v7a_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root  2992897217 Nov 25 21:55 s12-tt2l-mg-v7c_massLowTo700_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root   156479786 Nov 25 22:01 s12-ttj-m1000-pwg-v7a_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root   398757471 Nov 25 22:01 s12-ttj-m700-1000-pwg-v7a_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root  5491290256 Nov 25 22:03 s12-wjets-v7a_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root   593033131 Nov 25 22:04 s12-wtop-dr-v7a_ntuple.root

-rw-r--r-- 1 cms1005 root   588396554 Nov 25 22:04 s12-wtopb-dr-v7a_ntuple.root

-rw-r--r-- 1 cms1005 root  2012835745 Nov 25 22:05 s12-wwj-v7a_ntuple.root

-rw-r--r-- 1 cms1005 root   706844616 Nov 25 22:05 s12-wz2q2l-v7a_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root  2425203042 Nov 25 22:06 s12-wz3ln-v7a_ntuple.root

-rw-r--r-- 1 cms1005 root   138511853 Nov 25 22:06 s12-zeem1000to1500-v7a_ntuple.root

-rw-r--r-- 1 cms1005 root  8777207914 Nov 25 23:28 s12-zeem1020-v7a_genpho_ntuple.root

-rw-r--r-- 1 cms1005 root  1870967267 Nov 25 22:07 s12-zeem1020-v7a_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root   134906207 Nov 25 22:07 s12-zeem1500to2000-v7a_ntuple.root

-rw-r--r-- 1 cms1005 root 51629600670 Nov 25 23:49 s12-zeem20-v7a_genpho_ntuple.root

-rw-r--r-- 1 cms1005 root 16910573790 Nov 25 22:26 s12-zeem20-v7a_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root   141741517 Nov 25 22:08 s12-zeem200-v7a_ntuple.root

-rw-r--r-- 1 cms1005 root   156961474 Nov 25 22:07 s12-zeem2000-v7a_ntuple.root

-rw-r--r-- 1 cms1005 root   129440032 Nov 25 22:08 s12-zeem200to400-v7a_ntuple.root

-rw-r--r-- 1 cms1005 root 16888262832 Nov 25 22:15 s12-zeem20to200-v7a_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root 16910013460 Nov 25 22:21 s12-zeem20to500-v7a_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root   148722935 Nov 25 22:27 s12-zeem400-v7a_ntuple.root

-rw-r--r-- 1 cms1005 root    86869728 Nov 25 22:27 s12-zeem400to500-v7a_ntuple.root

-rw-r--r-- 1 cms1005 root   152797927 Nov 25 22:27 s12-zeem500-v7a_ntuple.root

-rw-r--r-- 1 cms1005 root   113790925 Nov 25 22:27 s12-zeem500to700-v7a_ntuple.root

-rw-r--r-- 1 cms1005 root   130202483 Nov 25 22:27 s12-zeem500to800-v7a_ntuple.root

-rw-r--r-- 1 cms1005 root   152764716 Nov 25 22:28 s12-zeem700-v7a_ntuple.root

-rw-r--r-- 1 cms1005 root    70450352 Nov 25 22:28 s12-zeem700to800-v7a_ntuple.root

-rw-r--r-- 1 cms1005 root   155869368 Nov 25 22:28 s12-zeem800-v7a_ntuple.root

-rw-r--r-- 1 cms1005 root   102010215 Nov 25 22:28 s12-zeem800to1000-v7a_ntuple.root

-rw-r--r-- 1 cms1005 root 42680408069 Nov 26 00:08 s12-zmmm20-v7a_genpho_ntuple.root

-rw-r--r-- 1 cms1005 root   691278867 Nov 25 22:40 s12-ztt1020-v7a_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root  3381097268 Nov 25 22:42 s12-zttm20-v7a_tight-loose_skim.root

-rw-r--r-- 1 cms1005 root  1005986880 Nov 25 22:43 s12-zz2l2n-v7a_ntuple.root

-rw-r--r-- 1 cms1005 root  2321028278 Nov 25 22:43 s12-zz2l2q-v7a_ntuple.root

-rw-r--r-- 1 cms1005 root  1615199949 Nov 25 22:44 s12-zz4l-v7a_tight-loose_skim.root

Step 2: FEWZ reweighting

To get started with FEWZ, follow the tutorial dedicated to exactly this topic (https://twiki.cern.ch/twiki/bin/viewauth/CMS/FEWZHowToUse_QuickTutorial). It covers in depth setting up the environment, preparing the configuration files, etc. An in-depth understanding of this tutorial is a prerequisite for working with FEWZ. After the FEWZ cross sections are prepared, there are a few other things one needs to do with them.

First, parse the raw output files into maps which can be used in the further analysis. The files have names of the form 'NNLO.NNLO*dat'. The FEWZ output files which serve as input for all the macros below can be found at

/mnt/hadoop/store/user/asvyatko/DYstudy/dataAnalysis13/FEWZraw.tar folder

python FEWZ_map_converter_2012_MY.py

To assemble these maps into ROOT files do:

root -l MapYM2D.C

Additional scripts are available to make plotting comparisons (to make a meaningful comparison one needs multiple sets of FEWZ outputs, i.e. with different PDF sets or at different orders):

root -l comparer.C

For studies of the PDF uncertainties based on FEWZ use the following parser:

python FEWZ_map_converter_2012_MY_PDF.py

For studies of the replicas and PDF constraints with the NNPDF reweighting technique use the following workflow:

python FEWZ_replica_preprocessor.py
python FEWZ_map_converter_2012_MY_replica.py
python FEWZ_replica_wrapper.py
root -l MapYM2D_replicas.C

Step 3: Event Selection

Once the ntuples are ready, one can proceed to the actual physics analysis. The first step of the analysis is the event selection. Currently, we use the so-called cut-based approach to discriminate between signal and background. For more on event selection read chapter 5 in the analysis note CMS-AN-13-420. Before starting to run a macro, set up the working area. Find all the necessary scripts in:

cd $DYWorkDir/test/ControlPlots

The code for event selection consists of 3 main files (and a few auxiliary ones). The first is the TSelector class, which is customized for the event selection used in a given analysis; the necessary weights (pileup, FEWZ and momentum scale correction) are applied in the macro. The Monte-Carlo weights are also hardcoded inside the macro for each MC sample used. The second is the wrapper ROOT macro, which calls the TSelector to run on a given dataset. This wrapper is shown below and explained step by step:

// The macro takes 3 arguments, which are passed from the python script:
//  - the histogram name (invariant mass or, for instance, rapidity),
//  - ntuple weight or custom weight (deprecated: we always use the custom weight),
//  - the type of momentum scale correction (also deprecated: the correction does not
//    depend on the run range in the 8 TeV analysis); it is still used below to select the run era.
void analyseYield(const char* WHICHHIST, const char* NTUPLEWEIGHT, const char* MOMCORRTYPE) {

  // Depending on the directory with data, the protocol used to access data will be different: "file" or "xrootd" are the most commonly used.
  TString protocol = "file://";
  //TString protocol = "root://xrootd.rcac.purdue.edu/";


  // Pointer to the location of the data used. Can be on /mnt/hadoop or on the scratch
  TString dirname = "/mnt/hadoop/store/group/ewk/DY2013/";

  // Next, the TFileCollection is created. This section is specific for each dataset: data or MC, so we prepare this wrapper macro for each sample
  TFileCollection* c1 = new TFileCollection("data","data");
  // The splitting by runs/eras happens here: RunAB, RunC1, RunC2, RunD1, RunD2. This is handy for studies of run dependencies.
  // Compare through a TString: applying == to a const char* and a string literal compares pointers, not contents.
  TString era(MOMCORRTYPE);
  if (era == "RunAB") c1->Add(protocol+dirname+"Data_RunAJan2013_Oct"+"/*.root");
  if (era == "RunAB") c1->Add(protocol+dirname+"Data_RunBJan2013_Oct_p1"+"/*.root");
  if (era == "RunAB") c1->Add(protocol+dirname+"Data_RunBJan2013_Oct_p2"+"/*.root");
  if (era == "RunC1") c1->Add(protocol+dirname+"Data_RunCJan2013_Oct_p1"+"/*.root");
  if (era == "RunC2") c1->Add(protocol+dirname+"Data_RunCJan2013_Oct_p2"+"/*.root");
  if (era == "RunD1") c1->Add(protocol+dirname+"Data_RunDJan2013_Oct_p1"+"/*.root");
  if (era == "RunD2") c1->Add(protocol+dirname+"Data_RunDJan2013_Oct_p2"+"/*.root");

  //Set the location of ProofLite Sandbox. It is more convenient to use the custom path rather than $HOME/.proof
  gEnv->SetValue("ProofLite.Sandbox", "<path to your working dir>/test/ControlPlots/proofbox/");
  
  //splitting criteria: how many worker nodes to use for the run: using more than 10-15 nodes usually will cause instability and lead to a crash subsequently
  TProof* p = TProof::Open("workers=20"); 
  p->RegisterDataSet("DATA", c1,"OV");
  p->ShowDataSets();
 
  //Deprecated - just leave as is, always
  TObjString* useNtupleWeightFlag = new TObjString(NTUPLEWEIGHT);
  p->AddInput(new TNamed("useNtupleWeightFlag",NTUPLEWEIGHT));

  //The histogram should always be "invm" - it will give both 1D and 2D histograms. But if one needs to study N-1 selection, then the string should be the name of the cut to exclude
  TObjString* histogramThis = new TObjString(WHICHHIST);
  p->AddInput(new TNamed("histogramThis",WHICHHIST));
  //This is now useless, but for later studies it might become useful again, if there is a run dependency for the momentum scale correction
  TObjString* momCorrType = new TObjString(MOMCORRTYPE);
  p->AddInput(new TNamed("momCorrType",MOMCORRTYPE));

  gROOT->Time();
  p->SetParameter("PROOF_LookupOpt", "all");
  //This invokes the TSelector: "recoTree/DiMuonTree" is the name of the ROOT tree inside the file, "EventSelector_CP.C" is the name of the TSelector
  p->Process("DATA#/recoTree/DiMuonTree","EventSelector_CP.C+");
}
  • There is one extra level here: the python script. It calls the ROOT wrapper macro above and typically looks like this:
#!/usr/bin/env python
from subprocess import Popen

#This is normally just "invm", but for N-1 control plots like 18-25 in the AN-13-420 one needs to set it to a custom cut name, for instance one of:
#['relPFisoNoEGamma','chi2dof','trackerHits','pixelHits','CosAngle','muonHits','nMatches','dxyBS','vtxTrkProb','trigMatches','pT','eta']
histos = ['invm'] 
 
#normally one needs to run over all of them. Splitting to a set of runs is useful because loading very large number of files into one session can cause instability
eras = ['RunAB','RunC1','RunC2','RunD1','RunD2'] 
#Simply invoke ROOT wrapper macro using Popen
for run in eras:
    for hist in histos:
        Popen('root -b -l -q \'analyseYield.C(\"'+hist+'\",\"False\",\"'+run+'\")\'',shell=True).wait()
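The nested quote escaping in the Popen call above is easy to get wrong. As a hedged alternative, the sketch below builds the same ROOT invocation as an argument list, so that no shell quoting is needed, and offers a dry-run mode for checking the commands before launching them. The helper names build_root_command and run_all are hypothetical and not part of the DY package.

```python
import subprocess

def build_root_command(macro, hist, ntuple_weight, era):
    """Return the argv list that runs `macro` in ROOT batch mode."""
    call = '{0}("{1}","{2}","{3}")'.format(macro, hist, ntuple_weight, era)
    return ["root", "-b", "-l", "-q", call]

def run_all(histos, eras, dry_run=True):
    """Run (or just list) one ROOT job per (era, histogram) pair, sequentially."""
    commands = [build_root_command("analyseYield.C", h, "False", e)
                for e in eras for h in histos]
    if not dry_run:
        for cmd in commands:
            # check_call raises CalledProcessError on a non-zero exit status
            subprocess.check_call(cmd)
    return commands

cmds = run_all(["invm"], ["RunAB", "RunC1"])
for c in cmds:
    print(" ".join(c))
```

Passing dry_run=False would launch the jobs one after another, which matches the sequential .wait() behaviour of the original loop.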

Once this is understood, one can run the macro. To produce plots like 35-37, use the analyseYield_mc.py and analyseYield_data.py scripts, which call the TSelector wrapper for the DY analysis (as described above):

mkdir runfolder
python analyseYield_mc.py
python analyseYield_data.py

Important information about the reweightings. The pileup weight is read from the ntuple, directly from the branch, on a per-event basis. The FEWZ weights are extracted from the theoretical calculation and are provided as arrays inside the efficiencyWeightToBin2012.C file located in the same directory (or any other directory, as long as there is an appropriate include in the header of the TSelector). The FEWZ weights are looked up inside the code based on the GEN-level dimuon pT, rapidity and mass, as follows, only for signal MC:

//look up FEWZ weight

FEWZ_WEIGHT = weight(genDiMuPt, fabs(genRapidity), genMass, true);
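The internals of weight() live in efficiencyWeightToBin2012.C and are not reproduced here. To illustrate the general idea of a binned weight lookup, the sketch below uses a two-dimensional (mass, |y|) map. The bin edges, the weight values, the omission of the pT dimension and the choice of returning 1.0 outside the map are all assumptions made for illustration, not the analysis values.

```python
from bisect import bisect_right

# Hypothetical bin edges and weights, for illustration only.
MASS_EDGES = [15.0, 20.0, 60.0, 120.0, 200.0, 1500.0]
ABSY_EDGES = [0.0, 1.2, 2.4]
WEIGHTS = [  # one row per mass bin, one column per |y| bin
    [1.10, 1.05],
    [1.04, 1.02],
    [1.00, 0.99],
    [0.98, 0.97],
    [0.95, 0.96],
]

def find_bin(edges, value):
    """Index of the bin containing value, or -1 if value is out of range."""
    if value < edges[0] or value >= edges[-1]:
        return -1
    return bisect_right(edges, value) - 1

def lookup_weight(gen_mass, gen_rapidity):
    """Weight of the (mass, |y|) bin; 1.0 when the event is outside the map."""
    i = find_bin(MASS_EDGES, gen_mass)
    j = find_bin(ABSY_EDGES, abs(gen_rapidity))
    if i < 0 or j < 0:
        return 1.0
    return WEIGHTS[i][j]

print(lookup_weight(91.0, -0.5))
```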

Finally, the Rochester momentum scale correction recipe is described here: http://www-cdf.fnal.gov/~jyhan/cms_momscl/cms_rochcor_manual.html

A few words about the normalization. The data events are not renormalized. The MC events are weighted according to the probability of each event to be observed in a real collision and according to the number of events generated in the sample. Therefore:

Event_weight ~ (Cross section x filter efficiency)/(Number of generated events)

For better accuracy we use the number of events actually run over, rather than the number generated. We calculate it in the event loop and apply it in the EventSelector::Terminate() method. In both the 7 and 8 TeV analyses, we normalize the MC stack (signal and backgrounds) to the number of events in data in the Z peak region (before the efficiency corrections). A special post-processing macro takes care of this:

python postprocessor.py
cp runfolder/stack* ../Inputs/rawYield

This python script adds up the individual ROOT files with hadd and invokes the ROOT macros parser.C and parser_2D.C, which have a method for normalizing the MC stack to data in the Z peak region.
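The two normalization steps described above can be sketched as follows. This is only an illustration of the arithmetic, not the code in parser.C: the function names are hypothetical, the 60-120 GeV Z peak window is an assumed value, and the toy yields are made up.

```python
def mc_event_weight(xsec_pb, filter_eff, n_processed):
    """Per-event MC weight: cross section x filter efficiency / events processed."""
    if n_processed <= 0:
        raise ValueError("n_processed must be positive")
    return xsec_pb * filter_eff / n_processed

def z_peak_scale(data, mc, edges, lo=60.0, hi=120.0):
    """Data/MC ratio of the yields summed over bins fully inside [lo, hi]."""
    in_peak = [k for k in range(len(data))
               if edges[k] >= lo and edges[k + 1] <= hi]
    n_data = sum(data[k] for k in in_peak)
    n_mc = sum(mc[k] for k in in_peak)
    return n_data / n_mc

# Toy mass binning and yields (made up); mc is the summed signal+background stack.
edges = [20.0, 60.0, 76.0, 106.0, 120.0, 200.0]
data = [500.0, 800.0, 9000.0, 700.0, 300.0]
mc = [450.0, 760.0, 8400.0, 840.0, 310.0]

scale = z_peak_scale(data, mc, edges)
mc_normalised = [scale * y for y in mc]   # every bin is scaled by the same factor
print(scale)
```

The same scale factor is applied to all bins, so the shape of the MC stack is unchanged and only its overall normalization is fixed in the Z peak region.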

After that, switch to the Dielectron working directory and produce necessary yield histograms before continuing with the style plotting. First, you will have to get a custom rootlogon file. Note: some of the libraries loaded in this rootlogon might interfere with the Proof environment in your machine.

cd ../FullChain
cp ../rootlogon_MIT.C ~/.rootlogon.C

Inspect the wrapper_EE.sh file inside and set the do_selection flag to 1 (true), and check the input files to run on are properly specified in the conf_file 

//top of the file
filename_data="../config_files/test.conf"
//scroll down a little
do_selection=1
do_prepareYields=1

Then run in two steps: (1) produce reduced ntuples, (2) prepare binned yields for analysis

./wrapper_EE.sh

To switch between 1D and 2D cases open the ../Include/DYTools.hh file and change the flag to const int study2D=1;.

After that, the style macro is used to plot the publication quality plots.

cd ../style/DY
root -l plot.C

This plots the 1D yield distributions (the switch between the electrons and muons is done manually inside the macro by adjusting the paths).

To plot the 2D distributions do:

root -l ControlPlots_2D.C

Step 4: Acceptance and Efficiency estimation

Another constituent of the cross-section measurement is the acceptance-efficiency.

  • Acceptance is determined using GEN level information

To be able to produce the acceptance and efficiency one needs to change to a different folder, and run a different TSelector. But the general flow TSelector->ROOT wrapper->python wrapper is almost the same:

cd $DYWorkDir/AccEffMCtruth
python analyseMCtruth.py

The script produces a ROOT file with histograms of the mass and rapidity spectra after the acceptance cuts, the selection cuts, or both. These are then used to calculate the acceptances, efficiencies and acceptance-efficiency products, with and without pileup and FEWZ reweighting, by executing:

root -l plotMCtruth.C
root -l plotMCtruth_2D.C
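As an illustration of what these macros compute, the sketch below derives per-bin acceptance, efficiency and their product from three sets of MC-truth counts. It assumes the usual definitions (acceptance = events passing the acceptance cuts over all generated events; efficiency = events also passing the selection over events in acceptance) and a simple binomial uncertainty, which holds only for unweighted counts. The function names and the numbers are hypothetical.

```python
from math import sqrt

def ratio_with_error(n_pass, n_total):
    """Binomial ratio and its uncertainty; (0.0, 0.0) for an empty denominator."""
    if n_total <= 0:
        return 0.0, 0.0
    r = n_pass / float(n_total)
    return r, sqrt(r * (1.0 - r) / n_total)

def acc_eff(n_gen, n_acc, n_sel):
    """Per-bin (acceptance, efficiency, product) from MC-truth counts."""
    out = []
    for g, a, s in zip(n_gen, n_acc, n_sel):
        acc, _ = ratio_with_error(a, g)   # passing acceptance / generated
        eff, _ = ratio_with_error(s, a)   # passing selection / in acceptance
        out.append((acc, eff, acc * eff))
    return out

# Toy counts in two mass bins (made up).
n_gen = [1000, 2000]
n_acc = [400, 1000]
n_sel = [300, 900]
result = acc_eff(n_gen, n_acc, n_sel)
print(result)
```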

To get the corresponding distributions in the electron channel change to FullChain folder:

cd ../FullChain
//doAcceptance = 1
//doEfficiency = 1 
./wrapper_EE.sh

The macro outputs a ROOT file with a name starting with out1* or out2*, containing the histograms corresponding to the acceptance, the efficiency and their product. To produce the publication-level plots, the style macro described in the previous section needs to be used again:

cd ../style/DY
root -l plot.C

Note that whenever you use the style/plot.C macro, you have to configure the mode inside. This looks something like this:

  bool drawFig1   = true; // data vs mc before acc correction 
  bool drawFig2   = false; // acc in muon
  bool drawFig3   = false; // acc in electron
  bool drawFig4   = false; // eff on reco, trig
  bool drawFig5   = false; // eff on iso
  bool drawFig6   = false; // unfolding matrix
  bool drawFig7   = false; // FSR
  bool drawFig8   = false; // final plot
  bool drawFig9   = false;  // data vs mc after acc correction
  bool drawFig10   = false;  // eff in electron
  bool drawFig11   = false; // final plot with histogram type

Set the flag of the figure you would like to plot to true. To get the 2D plots do:

root -l plot_acc_2D.C

Step 5: Data-driven efficiency correction

This step is performed only in the muon channel; the electron efficiency scale factors are obtained from the EGamma group and are not re-measured independently.

Next, the data-driven efficiency corrections are applied. This is done using the standard CMSSW recipe, so a lot of additional packages need to be checked out. Follow this twiki: https://twiki.cern.ch/twiki/bin/viewauth/CMS/MuonTagAndProbe to set up your working area for the ntuple production (alternatively, one can use the trees already produced).

  • The procedure has two steps. The first is the T&P tree production, which is rerun seldom (ideally once), since it depends only on the definitions of the tag and probe:
cd ../ntuples
cmsRun tpTree_Producer_Data.py
cmsRun tpTree_Producer_MC.py
  • If you haven't produced the T&P trees, you can always use the official ntuples located as described in the MuonTagAndProbe twiki.
  • The second step of the procedure is the fitting: a separate job for the trigger and for all the muon-ID related efficiencies, which is rerun frequently and usually interactively (to change the binning or the definitions):
cd ../test/TagAndProbe
cmsRun fitMuonID_data_all_2012_1_53X.py
cmsRun fitMuonID_mc_all_2012_1_53X.py

After familiarizing yourself with the TagAndProbe package, you need to produce the muon efficiencies as a function of pT and eta. You can use the wrapper.py script specifying which variables to bin the efficiency in and what runs/MC samples to process.

Finally, produce the plots with

 python auxPlotProducer.py
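A common way to turn the fitted tag-and-probe efficiencies into a correction is a data/MC scale factor per (pT, eta) bin. The sketch below shows that arithmetic under two assumptions that are not stated in this page: the data and MC uncertainties are uncorrelated, and the per-muon factors can be multiplied to obtain a per-event correction (this may not hold for the trigger efficiency). The function names and the numbers are hypothetical.

```python
def scale_factor(eff_data, err_data, eff_mc, err_mc):
    """Data/MC efficiency ratio with uncorrelated error propagation."""
    sf = eff_data / eff_mc
    rel = ((err_data / eff_data) ** 2 + (err_mc / eff_mc) ** 2) ** 0.5
    return sf, sf * rel

def event_weight(sf_mu1, sf_mu2):
    """Dimuon event correction as the product of the two per-muon factors."""
    return sf_mu1 * sf_mu2

# Toy efficiencies for one (pT, eta) bin (made up).
sf, sf_err = scale_factor(0.90, 0.009, 0.95, 0.0095)
print(sf, sf_err)
print(event_weight(sf, sf))
```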

Step 5: Background estimation

QCD data driven background estimation

In 8 TeV analysis, the main method to estimate the QCD background in the dimuon channel is the ABCD method (the fake-rate method is used in the electron channel). Before starting, let me summarize the ABCD method in a nutshell:

ABCD method

1) Choose 2 variables that are assumed to be independent

2) If there is no correlation, the ratios of yields should be the same: N_A / N_B = N_C / N_D

3) In our study, use two variables: sign of muon pair, muon isolation

4) QCD fraction in each region has a dependence. We produce the correction factor for each region: B, C, D

5) Produce N_B, N_C, N_D from data sample, and estimate N_A from them at the end (applying the correction factors)
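The relation in step 2 gives N_A = N_B x N_C / N_D. The sketch below applies it, with optional per-region correction factors. Treating each correction factor as a multiplicative QCD fraction applied to the corresponding control-region yield is an assumption made here for illustration; the exact definition used in the ABCD2vari scripts may differ. The function name and the numbers are hypothetical.

```python
def abcd_estimate(n_b, n_c, n_d, f_b=1.0, f_c=1.0, f_d=1.0):
    """Estimate the QCD yield in signal region A from control regions B, C, D."""
    # Corrected QCD yields in the control regions (assumed multiplicative factors).
    qcd_b, qcd_c, qcd_d = f_b * n_b, f_c * n_c, f_d * n_d
    if qcd_d <= 0:
        raise ValueError("region D must have a positive QCD yield")
    # N_A / N_B = N_C / N_D  =>  N_A = N_B * N_C / N_D
    return qcd_b * qcd_c / qcd_d

# Toy yields in one mass bin (made up).
print(abcd_estimate(200.0, 50.0, 1000.0))
print(abcd_estimate(200.0, 50.0, 1000.0, f_b=0.9, f_c=0.8, f_d=0.95))
```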

Now, let's go step by step.

First, change to the ABCD folder:

cd $DYWorkDir/ABCDmethod

The procedure consists of a few steps and is driven by the wrapper.py script located inside the folder:

Popen("python QCDFrac_p1.py",shell=True).wait()
Popen("python qcdFracHadder.py rootfiles",shell=True).wait()
Popen("python ABCD2vari_init.py",shell=True).wait()
Popen("python ABCD2vari_p1.py",shell=True).wait()

Thus, for each of the MC samples and for the real data a sequence of scripts is run. First, the QCDFrac_*.py scripts, which invoke the EventSelector_Bkg.C TSelector class for various values of charge and isolation (the variables defining the signal and background regions); the coefficients are calculated from the histograms filled. Second, the qcdFracHadder.py script is run on the output of the first step. It is a utility script which repacks the histograms in an appropriate format. Third, the ABCD2vari_init.py script, which actually performs the evaluation of the ABCD coefficients in each region. Finally, the ABCD2vari_*.py scripts invoke the EventSelector_Bkg2.C TSelector class, passing the ABCD coefficients as TObjString objects inside the macro.

The post-processing and the output harvesting step is performed by the following python script: 

python abcdPostprocessor.py

It uses the output of the second TSelector as an input, hadds it and produces a ROOT file with the histogram which is then used in the analysis.

E-mu data-driven background estimation method

To estimate all the non-QCD backgrounds we employ the so-called e-mu data driven background estimation method. The same method is applied in the muon and electron channels. The code used for that purpose was originally adapted from Manny and it uses the so-called Bambu workflow. First, let's change into the e-mu working directory:

cd $DYWorkDir/EmuBackground

First, reduced ntuples are generated from the original Bambu ntuples:

root -l (shared libraries should compile if they have not already done so)
root [0] .L selectEmuEvents.C+
root [1] selectEmuEvents("../config_files/data_emu.conf")
root [2] .q

The above script needs to be run twice, in 2 modes: SS (same-sign pairs) and OS (opposite-sign pairs). The switch is done in the selectEmuEvents.C script by switching:

if (!isOppositeSign) continue; //if (isOppositeSign) continue;

And also changing the ntupDir name. One will have to edit the data_emu.conf to point to the local ntuples before running.

After running this step, the reduced ntuples should be output to a directory (../root_files/selected_events/DY/ntuples/EMU/). One also needs to run selectEvents.C to generate the reduced electron ntuples. These ntuples must contain two branches, mass (dilepton invariant mass) and weight. After this is done, the e-mu macro can be run:

#compile code
gmake eMuBkgExe
#This should produce the binary eMuBkgExe. There are many options to run it. See the possible options below
./eMuBkgExe #run emu method for 1D analysis and produce plots
./eMuBkgExe --doDMDY #run 2D analysis and produce plots
./eMuBkgExe --doDMDY --saveRootFile #same as above but output ROOT file with yield, statistical and systematic info as true2eBkgDataPoints.root

This macro is also run in two regimes, using the SS and OS ntuples as input, and the corresponding true2eBackground files are produced and saved. We need to run both the SS and OS cases because we rely on them for the estimation of the missing QCD contribution in the e-mu spectrum. These true2eBackground files and the dilepton yields serve as input to the final step of the e-mu background estimation, the production of a final ROOT file with histograms:

root -l calculateEMu.C
root -l calculateEMu_2D.C
root -l calculateEMu_EE.C
root -l calculateEMu_2D_EE.C

As you can see, four different macros are run, covering electrons and muons in 1D and 2D.

One other source of background considered in this analysis is the photon-induced background. This background is irreducible and is not estimated based on MC. The bulk of the calculation is done in FEWZ3, by switching the photon-induced components on and off. Once the output files are ready, one can simply parse them and extract the bin-by-bin correction:

python dimitriExtracter.py
python dimitriParser.py
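To illustrate what such a bin-by-bin correction can look like, here is a minimal sketch. It assumes the correction is built from two per-bin FEWZ cross section lists, one with the photon-induced component switched on and one with it switched off. The function name pi_correction and the exact definitions of the fraction and the scale factor are illustrative assumptions, not taken from dimitriParser.py.

```python
def pi_correction(xsec_pi_on, xsec_pi_off):
    """Per-bin photon-induced fraction and scale factor (illustrative definitions).

    xsec_pi_on  : cross sections with the photon-induced component switched on
    xsec_pi_off : cross sections with the photon-induced component switched off
    """
    if len(xsec_pi_on) != len(xsec_pi_off):
        raise ValueError("inputs must have the same binning")
    fractions, factors = [], []
    for on, off in zip(xsec_pi_on, xsec_pi_off):
        if on <= 0.0:
            # Empty bin: no correction can be derived, leave the bin untouched.
            fractions.append(0.0)
            factors.append(1.0)
            continue
        fractions.append((on - off) / on)  # assumed: share of the yield that is photon induced
        factors.append(off / on)           # assumed: factor that removes that share
    return fractions, factors


fractions, factors = pi_correction([10.0, 4.0], [9.0, 4.0])
```

In this toy input the first bin has a 10% photon-induced share and the second bin has none.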

The following script can be used to visualize and compare the PI background yields:

root -l PIvalidation.C

Once the correction is prepared in a root file, it is simply loaded in the shapeR plotting macro as discussed in the sections below.

Step 6: Unfolding

Unfolding is applied to correct for migration of entries between bins caused by mass resolution effects (the FSR correction is taken into account as a separate step, although it also uses the unfolding technique). In the 8 TeV analysis, we use the iterative Bayesian unfolding technique. The unfolding code provides a common interface between the channels, which keeps the treatment symmetric and eases the combination and systematic studies. Both the iterative Bayesian and the matrix inversion technique (used in 7 TeV) are implemented and described below.
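For readers new to the technique, the idea behind iterative Bayesian unfolding can be sketched in a few lines of plain Python. This is a textbook-style illustration, not the RooUnfold code used in the analysis. The function name bayes_unfold, the response[reco][true] layout and the default of 4 iterations are choices made for this sketch only.

```python
def bayes_unfold(response, measured, prior, n_iter=4):
    """Iterative Bayesian unfolding (illustrative sketch).

    response[j][i] : probability that an event from true bin i is measured in bin j
    measured[j]    : measured yield in bin j
    prior[i]       : starting guess for the true spectrum (any positive shape)
    """
    n_true, n_meas = len(prior), len(measured)
    # Efficiency of each true bin: probability to be measured in any bin.
    eff = [sum(response[j][i] for j in range(n_meas)) for i in range(n_true)]
    total = float(sum(prior))
    p = [x / total for x in prior]
    unfolded = list(prior)
    for _ in range(n_iter):
        # Expected measured shape for the current prior.
        folded = [sum(response[j][k] * p[k] for k in range(n_true))
                  for j in range(n_meas)]
        unfolded = []
        for i in range(n_true):
            acc = 0.0
            for j in range(n_meas):
                if folded[j] > 0.0:
                    # Bayes theorem: P(true i | measured j) times the measured yield.
                    acc += response[j][i] * p[i] / folded[j] * measured[j]
            unfolded.append(acc / eff[i] if eff[i] > 0.0 else 0.0)
        norm = sum(unfolded)
        if norm <= 0.0:
            break
        # The unfolded spectrum becomes the prior of the next iteration.
        p = [u / norm for u in unfolded]
    return unfolded


smearing = [[0.8, 0.2],
            [0.2, 0.8]]
result = bayes_unfold(smearing, measured=[90.0, 60.0], prior=[1.0, 1.0])
```

With a diagonal response matrix the measured spectrum is returned unchanged, which is a quick sanity check similar in spirit to the closure test.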

Unfolding with MC requires three things:

  • Producing the response matrix
  • Making the histogram of measured events
  • Making the true histogram & closure test

First, change to the unfolding working directory (common for electrons and muons):

cd ../Unfold

The main steps for unfolding procedure go as follows:

1. Produce the response matrix.

2. Produce the unfolded yield.

3. Visualize the yields and ratios of yields.

The first step is rather time consuming, and is done by:

python ResMatrix.py

which takes care of the response matrix production for both the 1D and 2D cases. The inputs to this macro are ntuples, and the output is a ROOT file with the true and measured MC yields as well as the response matrix. To visualize the resulting response matrices, do:

root -l Matrix.C

The above macro accesses the ROOT file produced in the previous step and simply draws the matrices. A switch inside the macro, documented with an inline comment, selects between the 1D and 2D plots.
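As an illustration of what the response matrix contains, the sketch below fills a migration matrix from (true mass, reconstructed mass, weight) triplets and normalizes each true-mass column. It is a simplified stand-in for ResMatrix.py, not a copy of it. The function names, the toy binning and the decision to drop out-of-range reconstructed masses (no overflow handling) are assumptions of this sketch.

```python
from bisect import bisect_right


def find_bin(edges, x):
    """Return the bin index of x, or None if x is outside [edges[0], edges[-1])."""
    if x < edges[0] or x >= edges[-1]:
        return None
    return bisect_right(edges, x) - 1


def build_response(events, edges):
    """Build response[reco_bin][true_bin] from (m_true, m_reco, weight) triplets."""
    n = len(edges) - 1
    counts = [[0.0] * n for _ in range(n)]
    true_yield = [0.0] * n
    for m_true, m_reco, weight in events:
        i = find_bin(edges, m_true)
        if i is None:
            continue  # true mass outside the binning: ignored in this sketch
        true_yield[i] += weight
        j = find_bin(edges, m_reco)
        if j is not None:
            counts[j][i] += weight  # reco outside the binning counts as inefficiency
    # Normalize each true-bin column so that it holds migration probabilities.
    response = [[counts[j][i] / true_yield[i] if true_yield[i] > 0.0 else 0.0
                 for i in range(n)] for j in range(n)]
    return response, true_yield


edges = [15.0, 20.0, 30.0, 45.0]  # toy mass bin edges in GeV
events = [(16.0, 17.0, 1.0), (18.0, 22.0, 1.0), (25.0, 26.0, 1.0),
          (25.0, 26.0, 1.0), (40.0, 100.0, 1.0)]
response, true_yield = build_response(events, edges)
```

In the toy example half of the events generated in the first bin migrate into the second bin, so the first column of the matrix is (0.5, 0.5, 0).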

Once this is done, one can continue to apply the unfolding technique. Open the unfold.C file and familiarize yourself with the various flags (preprocessor macros, to be more precise), namely:

#define USE_OVERFLOWS true
#define DATA_DRIVEN_BKG true
#define DO_ROO true
#define DO_BAYES true
#define FEWZ_CORRECTED true

The input to this file is the true, measured MC yields and the response matrix as well as the observed, reconstructed signal and background MC yields. A possibility to read the data-driven backgrounds is also implemented and controlled by the DATA_DRIVEN_BKG flag.

The Bayesian unfolding is performed using the RooUnfold package, which is described here: http://hepunx.rl.ac.uk/~adye/software/unfold/RooUnfold.html, and it relies on the code committed here https://github.com/ASvyatkovskiy/DrellYanAnalysis/tree/master/DimuonAnalysis/DYPackage/test/Unfold/RooUnfold-1.1.1. The alternatives are response matrix inversion (which uses this implementation: https://github.com/ASvyatkovskiy/DrellYanAnalysis/tree/master/DimuonAnalysis/DYPackage/test/Unfold/include) or singular value decomposition (SVD). For convenience, the latter two options are available to the user, but they are not enabled by default. To enable the SVD one can set:

#define DO_ROO true
#define DO_BAYES false

To enable matrix inversion:

#define DO_ROO false
#define DO_BAYES false

The default setting is DO_ROO true and DO_BAYES true, as we use the iterative Bayesian method by default for 8 TeV in both channels. First run in the closure test mode by setting the run to 'POWHEG' inside the wrapper unfold.py, then switch to the '' (empty) flag, which does the actual unfolding. We do not make any distinction between the run ranges in 8 TeV, as the scale corrections are run independent:

python unfold.py

It outputs 1 ROOT file with the yields before and after unfolding.
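The flag combinations described in this step can be summarized as a small lookup. The helper below is purely illustrative and is not part of unfold.py or unfold.C. The combination DO_ROO false with DO_BAYES true is not described in this procedure, so the sketch refuses it instead of guessing.

```python
def unfolding_method(do_roo, do_bayes):
    """Map the DO_ROO / DO_BAYES flags to the unfolding method they select."""
    if do_roo and do_bayes:
        return "iterative Bayesian (RooUnfold), the 8 TeV default"
    if do_roo and not do_bayes:
        return "SVD (RooUnfold)"
    if not do_roo and not do_bayes:
        return "response matrix inversion"
    # This combination is not documented in the procedure, so do not guess.
    raise ValueError("DO_ROO false with DO_BAYES true is not described")


for flags in [(True, True), (True, False), (False, False)]:
    print(flags, "->", unfolding_method(*flags))
```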

Repeat all for 2D:

python unfold_2D.py

Similarly to 1D, it calls the unfold_2D.C macro which takes the true, measured MC yields and the response matrix as well as the observed, reconstructed signal and background MC yields as inputs. A possibility to read the data-driven backgrounds is also implemented and controlled by the DATA_DRIVEN_BKG flag. It outputs 1 ROOT file with the yields before and after unfolding.

Then, to repeat for electrons, change to the FullChain folder and set the appropriate flags for running the response matrix production step:

cd ../FullChain
vim wrapper_EE.sh
do_unfolding=1

That summarizes the unfolding step. The output of this step is used in the following analysis steps.

Step 7: FSR correction

The effect of FSR is manifested by photon emission off the final state lepton. It changes the dilepton invariant mass, so that the measured dilepton invariant mass differs from the propagator (Z/gamma*) mass.

For our analysis we estimate the effect of FSR and the corresponding correction by estimating the unfolding correction in invariant mass and rapidity bins. This is done by applying exactly the same unfolding procedure as for the mass resolution effects described above. A minor difference is that we also apply bin-by-bin corrections for event classes that do not enter the response matrix.

Change to the FSRunfold directory:

cd ../FSRunfold

and run steps similar to those above:

python FSRResMatrix.py

This script will give you the response matrix in 1D and 2D and also the additional bin-by-bin corrections for events not entering the response matrix. In addition, there is an option to run fully bin-by-bin as a cross check. If you inspect the contents of this python script, you will be able to understand what is actually done. After the jobs are complete, you need to merge the individual root files using hadd, and then run the fracEff.C script to extract the additional corrections:

root -l fracEff.C
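For orientation, a fully bin-by-bin FSR correction of the kind mentioned above as a cross check can be sketched as follows. The assumption here is that the correction factor in each bin is the ratio of the pre-FSR to the post-FSR MC yield, applied multiplicatively to the data. The function name and the inputs are hypothetical and do not reproduce FSRResMatrix.py or fracEff.C.

```python
def bin_by_bin_fsr(post_fsr_mc, pre_fsr_mc, post_fsr_data):
    """Apply a per-bin pre-FSR / post-FSR MC ratio to the data yields (sketch)."""
    if not (len(post_fsr_mc) == len(pre_fsr_mc) == len(post_fsr_data)):
        raise ValueError("all inputs must share the same binning")
    corrected = []
    for post_mc, pre_mc, data in zip(post_fsr_mc, pre_fsr_mc, post_fsr_data):
        # Bins without MC statistics are left uncorrected in this sketch.
        factor = pre_mc / post_mc if post_mc > 0.0 else 1.0
        corrected.append(data * factor)
    return corrected


pre_fsr_data = bin_by_bin_fsr([100.0, 200.0], [110.0, 190.0], [50.0, 80.0])
```

Unlike the response matrix approach, this ignores bin-to-bin migrations, which is why it is suitable only as a cross check.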

Similarly to the detector resolution unfolding step, you can inspect the response matrix:

root -l Matrix.C

To get all these quantities in the electron channel, as in the detector resolution unfolding case, you just need to:

cd ../FullChain
vim wrapper_EE.py
doUnfoldingFsr=1
python wrapper_EE.py

Step 8: Cross section calculation

Once all the constituents of the cross section are in place, one can continue with the cross section calculation. First, the results are calculated in each individual channel and then they are combined using the BLUE method (as described in Step 10). To calculate the 1D cross section in the muon channel, do:

cd ../ShapeR
python shapeDY.py

This will produce an output root file in the ../Outputs directory. All the necessary input files are expected to be available in the ../Inputs directory. To get the 2D cross section, change to:

cd ../shapeR2D
python shapeR2D.py

The output file is also created in the ../Outputs directory. To get the electron cross section, proceed as usual:

cd ../FullChain
//set do_crossSectionFsr=1
./wrapper_EE.sh

This step has to be run twice, switching the flag between 1D and 2D.

This produces the necessary root files with the histograms of the cross section and uncertainties.
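To make the arithmetic of this step concrete, here is a minimal sketch. It assumes the per-bin cross section is the unfolded yield divided by acceptance, efficiency, an efficiency scale factor and the integrated luminosity, and that the shape is obtained by normalizing to the cross section summed over the Z peak bins. These definitions and all names are assumptions made for illustration; shapeDY.py is the authoritative implementation.

```python
def cross_section(unfolded, acceptance, efficiency, eff_scale, lumi):
    """Per-bin cross section, assuming sigma = N / (A * eff * scale * L)."""
    return [n / (a * e * s * lumi)
            for n, a, e, s in zip(unfolded, acceptance, efficiency, eff_scale)]


def normalize_to_z_peak(xsec, z_bins):
    """Divide every bin by the cross section summed over the Z peak bins."""
    sigma_z = sum(xsec[i] for i in z_bins)
    if sigma_z <= 0.0:
        raise ValueError("Z peak cross section must be positive")
    return [x / sigma_z for x in xsec]


# Toy numbers only: one bin, 1000 unfolded events, 20 units of luminosity.
sigma = cross_section([1000.0], [0.5], [0.8], [1.0], lumi=20.0)
shape = normalize_to_z_peak([1.0, 8.0, 1.0], z_bins=[1])
```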

Step 9: Covariance matrix calculation

The covariance matrix gives the uncertainties of the measurements together with the correlations between the analysis bins and different systematic sources. There are several distinct steps in the covariance analysis: (1) calculation of the unfolding covariance matrix, (2) calculation of the efficiency covariance matrix and (3) calculation of the FSR covariance matrix. The code necessary for these calculations is available in:

cd ../Covariance

Note that this approach works well only for the muon channel; for the electron channel the result is produced by Andrius and the electron group using a different technique.

The inputs necessary for the covariance matrix calculation, derived in the previous steps, are (1) the efficiency correlation matrix produced with the EffCorrAndSyst code package, (2) the mass resolution unfolding response matrix, (3) the FSR response matrix, (4) the systematic uncertainty tables in ASCII format and (5) optionally the theory ROOT files, if one wants to perform chi2 tests. Change to the covariance directory:

cd ../Covariance

Then open a ROOT session and load the covariance matrix macro:

root -l 
.L covariantMatrix.C

Then consecutively call the "write" functions in the same ROOT session:

writeEffCorrToRootFile()
writeShapeToRootFile()
writeSystToRootFile()
writeTheoryToRootFile()

Here:

  • writeEffCorrToRootFile(case) stores the efficiency covariance in covMatrices_case_store.root,
  • writeShapeToRootFile(case) stores data in shape_r_case_store.root,
  • writeSystToRootFile(case) stores data in shape_rSyst_case_store.root,
  • writeTheoryToRootFile(case) stores predictions in shape_r_Theory_case_store.root (the latter is needed for chi2 tests only).

The above files are used by estimateCovMatrix(case), which does the actual covariance matrix estimation:

estimateCovMatrix()

which produces covariance_finalResults_case.root. As specified in the text, one can pass arguments to the function (1D, 2D, normalized or unnormalized), but I prefer to change the defaults and run with no argument, as this makes it easier to keep track of your steps. The arguments (with default values) are "2D"/"1D", "mu"/"el", "inAcc"/"fullAcc", "preFSR"/"postFSR". Only relevant combinations are covered ("2D" with "el" will crash).
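As a reminder of how a covariance matrix relates to the inputs listed above, the sketch below turns a correlation matrix and per-bin uncertainties into a covariance matrix and adds several sources together. Summing the sources assumes they are mutually independent, which is an assumption of this illustration. The function names are hypothetical and do not correspond to functions in covariantMatrix.C.

```python
def covariance_from_correlation(corr, sigma):
    """cov[i][j] = corr[i][j] * sigma[i] * sigma[j]."""
    n = len(sigma)
    return [[corr[i][j] * sigma[i] * sigma[j] for j in range(n)]
            for i in range(n)]


def sum_covariances(matrices):
    """Element-wise sum of covariance matrices from independent sources."""
    n = len(matrices[0])
    total = [[0.0] * n for _ in range(n)]
    for matrix in matrices:
        for i in range(n):
            for j in range(n):
                total[i][j] += matrix[i][j]
    return total


# Toy two-bin example: a correlated source plus an uncorrelated one.
eff_cov = covariance_from_correlation([[1.0, 0.5], [0.5, 1.0]], [2.0, 4.0])
stat_cov = [[1.0, 0.0], [0.0, 1.0]]
total_cov = sum_covariances([eff_cov, stat_cov])
```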

After the covariance matrix is produced, one can visualize it with:

root -l quickPlotter.C
root -l drawNicePlots_DYStoyan.C

The final step of the covariance matrix analysis is production of the ASCII files for HEPDATA or for BLUE studies.

python wrapper.py
python combine.py

Step 10: Electron-muon combination with the BLUE method

Having the root files for the individual cross section measurements in the dielectron and dimuon channels, we need to combine them for higher precision. The combination is performed with the BLUE method, which takes 2 vectors of measured values of the cross section and the covariance matrices.
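The principle of the BLUE combination can be shown for a single bin with two measurements. The sketch below uses the standard two-measurement BLUE weights. It ignores bin-to-bin correlations, which the full combination in resultCombiner.C takes from the covariance matrices, so it is a simplified illustration rather than the analysis code. The function name blue_combine is hypothetical.

```python
def blue_combine(x_ee, x_mumu, var_ee, var_mumu, cov=0.0):
    """Best linear unbiased estimate of two measurements of the same quantity.

    Returns (combined value, combined variance, (weight_ee, weight_mumu)).
    """
    denom = var_ee + var_mumu - 2.0 * cov
    if denom <= 0.0:
        raise ValueError("degenerate covariance: cannot form BLUE weights")
    w_ee = (var_mumu - cov) / denom  # the weights minimize the combined variance
    w_mumu = 1.0 - w_ee              # and sum to one, so the estimate is unbiased
    combined = w_ee * x_ee + w_mumu * x_mumu
    variance = (var_ee * var_mumu - cov * cov) / denom
    return combined, variance, (w_ee, w_mumu)


# Toy numbers: the more precise channel receives the larger weight.
value, variance, weights = blue_combine(10.0, 20.0, var_ee=1.0, var_mumu=4.0)
```

For two uncorrelated measurements of equal precision this reduces to the plain average with half the variance.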

First of all, switch the rootlogon for this task:

cp ../../rootlogon_defaultgluon.C ~/.rootlogon.C

Next, we need to make sure that the inputs are in the form the BLUE macro expects (i.e. ASCII, not root):

cd ../BLUE
python bluePrinter.py
python bluePrinter_2D.py

These macros take a ROOT file as input and output an ASCII file. The input ROOT files are produced using the macros in the ShapeR and shapeR2D folders and are supposed to be in the ../Outputs directory.

We can use the txt2Plot.py macro to validate the txt input by visualizing it.

After we have the inputs in proper format, we just need to run the resultCombiner.C macro. To pass all the inputs properly (which should be in the current folder), we specify them in the wrapper.py script and run it as

python wrapper.py
python wrapper_2D.py

The output will again be in ASCII format, but we normally need it in root. So we have to run another converter after we have finished:

python outToPeople.py
python outToPeople_2D.py

After that, we have the root file with the cross section histogram in the same format as for the individual cross sections, and we can visualize it (produce a plot for the publication) in the same way as we did for the other cross sections in the previous section:

cd ../ShapeR
root -l plot_Comb.C //for 1D
root -l plotter.C   //for 2D, make sure that the input file is pointing at the combination

Step 11: Double ratio calculations

Once the cross sections have been obtained, the double ratios (ratios of the normalized differential and double-differential cross sections) can be calculated. Most of the macros for the double ratio calculation are located in the ../ShapeR folder, so first change to that folder:

cd ../ShapeR

Given the input ROOT files with the cross sections at 8 TeV, and the old archived ROOT files with the 7 TeV results, the only thing we need to produce in addition is the properly combined uncertainties. This is done by the following few macros (note that the 1D uncertainties file is automatically produced by the ShapeR/shapeDY.py macro, so there is no need to worry about it):

python uncertEE.py
python uncert_2D.py
python uncertEE_2D.py 

The electron uncertainties are produced from the input files provided by the electron group. They are combined with the acceptance and modeling uncertainties provided by us.

Once the uncertainties are prepared, the double ratios can be produced. The macros do what their names suggest: they calculate the ratio of the 8 TeV and 7 TeV normalized cross sections per mass/rapidity bin, and assign the proper uncertainty, which is taken from the output of the previous step:

python doubleratio_1Dee.py
python doubleratio_1Dmu.py
python doubleratio_1Dcomb.py
python doubleratio_2Dmu.py
python doubleratio_2Dcomb.py
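The calculation performed by these macros can be sketched as follows, assuming for illustration that the 8 TeV and 7 TeV uncertainties are uncorrelated, so that the relative uncertainties add in quadrature. The actual macros take properly combined uncertainties from the previous step. The function name double_ratio and the toy inputs are hypothetical.

```python
import math


def double_ratio(r8, err8, r7, err7):
    """Per-bin ratio of 8 TeV to 7 TeV normalized cross sections with uncertainty."""
    ratios, errors = [], []
    for a, da, b, db in zip(r8, err8, r7, err7):
        if a == 0.0 or b == 0.0:
            raise ValueError("cannot form a ratio with an empty bin")
        ratio = a / b
        # Uncorrelated inputs assumed: relative uncertainties add in quadrature.
        rel = math.sqrt((da / a) ** 2 + (db / b) ** 2)
        ratios.append(ratio)
        errors.append(abs(ratio) * rel)
    return ratios, errors


ratios, errors = double_ratio([2.0], [0.2], [1.0], [0.1])
```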

Step 12: Plotting the final results

The final results are the absolute cross sections in bins of mass and rapidity in the dielectron and dimuon channels and their combination, as well as the double ratios. To plot the 1D differential cross sections do:

cd ../ShapeR

root -l plot_Comb.C

root -l plot_EE.C

root -l plot_MuMu.C

To plot the 1D double ratios (switch between the lepton channels is inside):

root -l plot_dratio.C

To plot the 2D cross sections and double ratios do:

root -l plotter.C
root -l plot_dratio2D.C


Appendix: List of macros used and paths

Figure id | Macro path | Comments
1 | Produced by electron group |
2 | No macro needed, direct TH1D::Draw from the ntuple | Plot just shows the contents of the branch pileUpReweigth
3 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/ControlPlots/plot_PU.C | Plot can differ slightly depending on which MC sample is used as a starting point, but should overall be close to 1.0 at all masses
4 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/TagAndProbe/auxPlotProducer.py | This requires producing the T&P ntuples and running the fit; only then can these plots be produced (see the tutorial above). Additional configuration inside the script is necessary to choose the binning type on the x axis and the efficiency type (reconstruction, isolation, trigger)
5 | Produced by electron group |
6 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/FEWZValidation/plotWeights.C, https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/FEWZValidation/visualizeMap.C | This requires running FEWZ to get the cross sections as a function of PT. Once the cross section maps are produced, they can be visualized. The binning needs to be adjusted between plots 6-9 inside the macro.
7 | Same as 6) |
8 | Same as 6) |
9 | Same as 6) |
10 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/FEWZValidation/validate1D.C | Simple comparison of the weighted and unweighted mass spectra
11 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/FEWZValidation/validate1D.C | Same as 10) but the difference is plotted instead of the superimposed spectra
12 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/FEWZValidation/validate1D.C | Comparison of the POWHEG/FEWZ ratios with and without weights applied
13 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/Inputs/sys/modeling2013_1D_smoothed_41.root | Plotted directly from the root file, after the smoothing is applied to the uncertainty. Simply open the root file and use the TH1::Draw() method
14 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/FEWZValidation/plotWeights.C | FEWZ weight distribution used in MC
15 | Same as 14) |
16 | Same as 14) |
17 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/FEWZValidation/validate1D.C | The last plot at the bottom of the macro. It is essentially the same as plots 14-17 but in a compact format.
18 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/ControlPlots/ControlPlots.py | The proper histogram needs to be checked in the list called histos. In addition, the root files produced by default will not contain properly filled N-1 histograms for the variables used in the selection, so one needs to perform a separate rerun for each variable (it is suggested to leave this plot for the end).
19 | Same as 18) |
20 | Same as 18) |
21 | Same as 18) |
22 | Same as 18) | File needs to be converted to the format expected by the plotting macro
23 | Same as 18) | File needs to be converted to the format expected by the plotting macro
24 | Same as 18) | File needs to be converted to the format expected by the plotting macro
25 | Same as 18) | File needs to be converted to the format expected by the plotting macro
26 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/EmuBackground/eMuMethod/calculateEMu*C | A group of macros in this folder produces this and a few of the following plots; the "EE" and "2D" in the macro names indicate which one is supposed to be used. In addition, there are flags to choose whether to run the closure test.
27 | Same as 26) |
28 | Same as 26) |
29 | Same as 26) |
30 | Same as 26) |
31 | | Comparison plots, obtained from other analysis notes
32 | https://github.com/ASvyatkovskiy/DYAnalysis/tree/master/test/ABCDmethod | No macro, produced from the root file which is produced with the code in the ABCDmethod directory
33 | Same as 32) |
34 | Produced by electron group |
35 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/style/plot.C | Needs adjustment of the macro to plot inside the plot.C macro, also adjustment of the legends and input root file paths inside the DY.C in that folder
36 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/style/ControlPlots_2D.C | Switch the input file names between the electrons and muons
37 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/style/ControlPlots_2D.C | Add "_EE" to the file name and change the labels
38 | https://github.com/ASvyatkovskiy/DYAnalysis/tree/master/test/PIbkg | The procedure to prepare the scale factors is rather lengthy, but once the FEWZ cross sections are available, one can parse them into a root file and plot with the .C macros available in the folder (see the tutorial above for more details)
39 | Same as 38) |
40 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/ControlPlots/rochStudy.C |
41 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/FullChain/escaleVal.C |
42 | Same as 41) | Needs adjustment of the input histogram name, which corresponds to either using the full sample or a given eta-eta class
43 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/PullTests/testPulls.C |
44 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/style/plot.C | Search for "resMatrix" in the file, and make sure the input file name is pointing to the right file.
45 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/Unfold/yield.C | Can also be produced automatically by the unfold.C script, so either of the two can be used.
46 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/Unfold/Matrix.C | There is a switch between 1D and 2D in the macro
47 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/Unfold/unfold_2D.C | Produced as a side plot by the main unfolding macro.
48 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/Unfold/plotYieldRatio.C |
49 | Produced by electron group |
50 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/style/plot.C |
51 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/style/plot_acc_2D.C |
52 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/GAPInclusion/plotComp.C |
53 | Produced by electron group |
54 | Produced directly from T&P ntuple, no macro used |
55 | Produced by electron group |
56 | Produced by electron group |
57 | Produced by electron group |
58 | Produced by electron group |
59 | Produced by electron group |
60 | Produced by electron group |
61 | Produced by electron group |
62 | Produced by electron group |
63 | Produced by electron group |
64 | Produced by electron group |
65 | Produced by electron group |
66 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/TagAndProbe/auxPlotProducer.py |
67 | Same as 66) |
68 | Same as 67) |
69 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/Inputs/effcorr/effCorrStyle1Dplot.C | Style plot. All the macros to prepare this plot are in https://github.com/ASvyatkovskiy/DYAnalysis/tree/master/test/EffCorrAndSys
70 | Same as 69) |
71 | Produced directly from T&P ntuple, no macro used |
72 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/style/plot.C | Needs to switch to the acceptance plots in the plot.C and also point to the proper path inside the DY.C
73 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/FSRunfold/fracEff.C |
74 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/style/plot.C | Needs to switch to the acceptance plots in the plot.C and also point to the proper path inside the DY.C
75 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/FSRunfold/fracEff.C |
76 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/FSRunfold/Matrix.C | There is a switch between 1D and 2D inside the macro
77 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/FullChain/dressingValidator.C | Macro for ad-hoc dressed lepton studies performed with the MIT ntuple workflow
78 | Same as 77) |
79 | Same as 77) |
80 | Same as 77) |
81 | Produced directly from T&P ntuple, no macro used |
82 | Produced directly from T&P ntuple, no macro used |
83 | Produced directly from T&P ntuple, no macro used |
84 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/Inputs/sys/FSR2D_out8TeV.root | Plotted directly from the root file, all the histograms are available there
85 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/Inputs/sys/pdfu_8TeV.root | Plotted directly from the root file, all the histograms are available there
86 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/Covariance/covariantMatrix.C |
87 | Same as 86) |
88 | Same as 86) |
89 | Same as 86) |
90 | Same as 86) |
91 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/ShapeR/plot_MuMu.C, https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/ShapeR/plot_EE.C |
92 | Same as 91) | The isShapeR flag needs to be set properly; it switches between the shape R and the absolute cross section inside the macro
93 | https://github.com/ASvyatkovskiy/DYAnalysis/blob/master/test/shapeR2D/plotter.C |
94 | Same as 93) | Properly set the EE/MuMu or Comb flags, also the shapeR vs absolute cross section

 

 
