Questions for tau Dilepton Analysis Blessing, April 28 2004
Sarah Demers, Kevin McFarland, Tony Vaiciulis
Analysis page with links to talks
For Documentation see:
CDF Note 6954: Checking muon to tau fake with Z -> mumu data
CDF Note 6923: Optimizing HT and Jet ET cuts in the ttbar tau dilepton Analysis
CDF Note 6922: A Z Mass Cut to reduce Z -> tautau background in the tau dilepton Analysis
CDF Note 6921: Event Selection and Acceptance for Top to Tau Dilepton Analysis
CDF Note 6784: Estimating the Jet to Hadronic Tau Fake Rate
CDF Note 6408: Determining the Electron Fake Rate to Hadronic taus from the Data
Question 1:
The analysis would benefit from more MC statistics
Answer 1:
This analysis, with roughly one signal event and one background event
expected, is far from limited by MC statistics. While more MC statistics
would lower our uncertainties,
our background summary tables show that our largest
uncertainties in background are due to the jet->tau fakes (both
statistical and systematic uncertainties) which we get entirely
from the data. In our acceptance the systematic error is a factor
of two larger than the statistical error.
Question 2:
Muon, Electron and Tau ID should be added to paper.
Answer 2:
The tau ID is documented in CDF Note 6921 and we added tables showing
our electron and muon ID to the note, though they are documented in
many other places as well. The updated note will be posted by Tuesday,
April 27th.
Question 3:
What are track quality requirements on the shoulder tracks for the tau?
Answer 3:
The shoulder track is required to have a Pt greater than 1 GeV and to
be a deftrack with at least 10 axial and 10 stereo hits.
Question 4:
PDF uncertainty should not be limited by statistics. Could you use the
generator level method?
Answer 4:
We changed our method of determining the systematic uncertainty
due to uncertainties in the PDFs. We now use the method
recommended at the April 4, 2004, Joint Physics Meeting.
This method, which includes an event-by-event reweighting of the ttopei
sample using X and Q^2 and different PDF sets, results in a 1%
systematic due to PDF uncertainty.
More details:
The default signal Monte Carlo sample, Pythia ttopei, is based on the
CTEQ5L Parton Distribution Function set using alpha_s = 0.118.
We consider here the systematic uncertainty caused by varying the internal
parameters of the PDF set as well as varying alpha_s and the choice of
PDF group.
With the new PDF set CTEQ6M, the CTEQ group made available
40 complementary PDF sets CTEQ6M.01...CTEQ6M.40, each of which represents
an up or down variation along one of the twenty eigenvectors (corresponding
to the ~20 free parameters) which collectively form an orthonormal
basis set spanning the PDF parameter space (hep-ph/0201195 v3 and
references therein). Each up and down variation pair
represents the range of PDF behavior that is consistent with the current
global data.
Each event in the Pythia ttopei inclusive ttbar sample is reweighted
according to the ratio of the CTEQ6M PDF values and the CTEQ6M.xx PDF
values (see April 2, 2004, Joint Physics Meeting minutes). Then all normal
selection cuts are applied using full simulation and reconstruction.
The total acceptance change, with respect to CTEQ6M, caused by all of the
variations is 0.6%.
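The reweighting step can be sketched as follows. This is only an illustration under stated assumptions, not the analysis code: the PDF callables, the event fields and the toy densities are hypothetical stand-ins (a real implementation would evaluate CTEQ6M and CTEQ6M.xx through a PDF library), and the weighted acceptance is taken to be the sum of weights of selected events over the sum of weights of all events.

```python
from typing import Callable, Iterable, NamedTuple

# Hypothetical PDF interface: pdf(flavor, x, q2) -> parton density.
Pdf = Callable[[int, float, float], float]


class Event(NamedTuple):
    flavor1: int   # flavor code of the first incoming parton
    x1: float      # momentum fraction of the first parton
    flavor2: int
    x2: float
    q2: float      # hard-scattering scale Q^2
    passed: bool   # True if the event passes all selection cuts


def pdf_weight(ev: Event, pdf_new: Pdf, pdf_ref: Pdf) -> float:
    """Ratio of varied to reference PDF values for both incoming partons."""
    new = pdf_new(ev.flavor1, ev.x1, ev.q2) * pdf_new(ev.flavor2, ev.x2, ev.q2)
    ref = pdf_ref(ev.flavor1, ev.x1, ev.q2) * pdf_ref(ev.flavor2, ev.x2, ev.q2)
    return new / ref


def reweighted_acceptance(events: Iterable[Event], pdf_new: Pdf, pdf_ref: Pdf) -> float:
    """Weighted fraction of events that pass the selection."""
    total = 0.0
    selected = 0.0
    for ev in events:
        w = pdf_weight(ev, pdf_new, pdf_ref)
        total += w
        if ev.passed:
            selected += w
    return selected / total


# Toy densities and toy events, invented purely to exercise the functions.
def toy_ref(flavor: int, x: float, q2: float) -> float:
    return (1.0 - x) ** 3 / x


def toy_var(flavor: int, x: float, q2: float) -> float:
    return (1.0 - x) ** 3.2 / x


events = [
    Event(21, 0.10, 21, 0.12, 175.0 ** 2, True),
    Event(2, 0.20, -2, 0.15, 175.0 ** 2, True),
    Event(21, 0.05, 21, 0.30, 175.0 ** 2, False),
    Event(1, 0.25, -1, 0.08, 175.0 ** 2, False),
]

nominal = reweighted_acceptance(events, toy_ref, toy_ref)
varied = reweighted_acceptance(events, toy_var, toy_ref)
print(f"relative acceptance change: {100.0 * (varied - nominal) / nominal:.2f}%")
```

Reweighting the reference set to itself gives unit weights, so the nominal value reduces to the plain fraction of selected events.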
In a similar way, a relative event by event reweighting is done using
the CTEQ5L and MRST72 PDF sets resulting in a 0.8% change in acceptance
due to the choice of PDF group.
To estimate the acceptance sensitivity to the value of alpha_s
we compare the PDF sets MRST72 (alpha_s = 0.1175) and MRST75
(alpha_s = 0.1125) using the reweighting procedure described above.
The acceptance change is 0.1%.
Considering the three contributions of 0.6%, 0.8%, and 0.1% we take 1%
as the systematic uncertainty due to PDF uncertainty.
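As a cross-check of the 1% figure, the three contributions can be added in quadrature. The text above does not say which combination rule was used, so quadrature is an assumption made here for illustration.

```python
import math

# Acceptance changes in percent, from the text above.
contributions = {
    "eigenvector variations": 0.6,
    "choice of PDF group": 0.8,
    "alpha_s": 0.1,
}

# Quadrature sum (assumed combination rule).
total = math.sqrt(sum(value ** 2 for value in contributions.values()))
print(f"quadrature sum: {total:.2f}%")
```

Under that assumption the sum is very close to 1%, consistent with the quoted systematic.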
Question 5:
For acceptance (a_s): Why do you take half the difference as uncertainty when
you compare the central a_s value with the -1 sigma variation?
Answer 5:
Our method to estimate the systematic effect due to uncertainty in alpha_s
has been changed to adhere to the recommendation presented at the Joint
Physics Meeting of April 4, 2004. See answer to question 4.
Question 7:
What is the default for acceptance used in table 6 of note 6921? (Should
6.4% -> 11% and 3.7% -> 2%?)
Answer 7:
Table 6 has been updated with a better method to estimate
the systematic uncertainty due to uncertainties in FSR.
The default acceptance has been added to the table for clarification
(1.08 +- 0.06). I think it is more clear now what is done.
Note that our final acceptance central value of 1.03 is different
from the 1.08 listed above. The comparisons described above were done
with our "old" muon veto cut (vetoing on tau candidates that can be
matched to muon stubs) instead of our "new" muon veto cut (requiring the
Et/seed track Pt of the tau candidate to be greater than 0.5).
Also, we did not go through the steps of subtracting out the background due
to our jet fake method in these systematic samples. We therefore use 1.08
here in order to compare apples to apples.
Question 8:
Uncertainty on Jet ID? You are requiring at least two Jets?
Answer 8:
Our jet ID uncertainty is dominated by the uncertainty on the jet energy
scale. We take this into account in our systematic error. The larger point
of the question, that CDF should consider the effect of our jet clustering
algorithm, is noted. We are using JetCluModule for our jet clustering in
this analysis and we do not have a systematic error associated with that
choice.
Question 9:
What is the uncertainty on the kinematics? (Theta of muon and electron,
transverse energy of muon and electron?)
Answer 9:
We take into account scale factors due to electron and muon ID, triggers, and
reconstruction. This corrects for differences between the data and MC as
far as the transverse energies and angular distributions are concerned.
Question 10:
What uncertainty is introduced by the opposite sign requirement?
Answer 10:
The range of momenta of our electrons and muons is such that we are not
worried about an error as a result of mismeasuring their signs. The same
holds for the tracks in the tau. The possibility of miscounting the tracks
in the taus does introduce a possible uncertainty. However, we have some
confidence that we are modeling this well because of the agreement we see
in the W->taunu study between data and Monte Carlo tau candidate
track multiplicity distributions.
Question 11:
What is the error on the luminosity?
Answer 11:
We have an error of 6% on the luminosity, with a central value of
193.5 pb^-1 .
Question 12:
2.5% is given as sys uncertainty on electron/muon ID (6319) but the note
referenced does not support this number. What is the explanation?
Answer 12:
This systematic uncertainty should be 5% (as it is currently in the analysis
and documentation.) The large systematic error comes from the assumption
that our electron and muon ID is not a function of the number of jets in the
event, or the local environment in the event. We are consistent with other
analyses in the top group with this 5% value for the systematic. (See the
lepton + jets analysis CDF Note 6844, p21.) The CDF Note sourced as
documentation for the 5% systematic is CDF Note 6858, "High pT Lepton ID
Efficiency Scale Factor Studies" which has not yet been posted.
Question 13:
Why is the error on trigger efficiency neglected (muons in particular)?
Answer 13:
The first two versions of CDF Note 6921 had some of the electron and muon
scale factors labeled incorrectly. We are now using the blessed results from
the top group for winter 2003/spring 2004, and they are as follows:
CEM ID scale factor: 0.965 +/- 0.006
CMUP ID scale factor: 0.94 +/- 0.01
CMX ID scale factor: 1.015 +/- 0.007
CEM trigger efficiency: 0.966 +/- 0.001
CMUP trigger efficiency: 0.890 +/- 0.009
CMX trigger efficiency: 0.966 +/- 0.007
CMUP reconstruction scale factor: 0.927 +/- 0.010
CMX reconstruction scale factor: 0.992 +/- 0.011
Note that there is no scale factor associated with electron reconstruction.
These scale factors are applied to the Monte Carlo backgrounds (Z->tautau,
WW, WZ) as well as to the acceptance (ttbar inclusive Monte Carlo, ttopei).
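For orientation, the per-lepton correction implied by these numbers can be sketched as a product of the listed factors. The multiplicative combination and the quadrature propagation of relative errors are assumptions made for this illustration; the text above only says the factors are applied to the Monte Carlo.

```python
import math

# (value, uncertainty) pairs copied from the list above.
CEM = [(0.965, 0.006), (0.966, 0.001)]                    # ID, trigger
CMUP = [(0.94, 0.01), (0.890, 0.009), (0.927, 0.010)]     # ID, trigger, reconstruction
CMX = [(1.015, 0.007), (0.966, 0.007), (0.992, 0.011)]    # ID, trigger, reconstruction


def combine(factors):
    """Product of the factors, with relative errors added in quadrature."""
    value = 1.0
    rel_err_sq = 0.0
    for central, err in factors:
        value *= central
        rel_err_sq += (err / central) ** 2
    return value, value * math.sqrt(rel_err_sq)


for name, factors in (("CEM", CEM), ("CMUP", CMUP), ("CMX", CMX)):
    value, err = combine(factors)
    print(f"{name}: {value:.3f} +/- {err:.3f}")
```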
Question 14:
In section 6 (note 6921) it would be appropriate to quote and compare with
Run1 published results PRL 79:3585-3590 (1997)
Answer 14:
We added a table to the results section that shows the predicted and measured
events seen in Run1.
Question 15:
We would prefer not to quote any limits, especially not with the present
statistics.
Answer 15:
We think that setting the limit on r_tau is a conservative thing
to do and that it really shows what this analysis is intended to
measure. However, we will present results in a way
that the top group is comfortable with, with the input of the statistics
committee.
Question 16:
Why is the quoted electron scale factor for monte carlo 0.965 for electrons
and not 1 as seen in Z/W x-section?
Answer 16:
The electron and muon scale factors we use are listed in the answer
to question 13. We use values blessed by the top group for the
winter 2003/spring 2004 conferences that correspond to the electron
and muon cuts we use in our analysis.
Question 17:
Can you estimate ISR systematic in a more sensible way than turning it off
completely?
Answer 17:
A better method to estimate the systematic uncertainty due
to ISR uncertainty might be to use the Pythia ttbar samples
ttopbe (less ISR) and ttopce (more ISR).
However, due to the size of the samples we expect that
the statistical uncertainty will prevent a more meaningful
estimate of the ISR systematic than what we already have.
Question 18:
Are you treating FSR and underlying event as the same thing? If by FSR you
mean radiation off one of the outgoing legs in the hard scattering process (what I think it should be) then the two are different. UE is how well you model the soft part of the collision due to remnants of the incoming protons.
Answer 18:
We have improved our method of estimating the FSR systematic.
We now estimate the acceptance uncertainty due to imprecise knowledge
of Final State Radiation by comparing the inclusive ttbar Pythia sample
ttopde (less FSR) with the Pythia sample ttopee (more FSR).
The difference in acceptance is
smaller than the statistical uncertainty of 7%, which we
therefore use as an estimate of the systematic uncertainty.
So our method has improved, but the result is still limited
by the statistical uncertainty of the Monte Carlo samples.
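The rule described here (quote the statistical precision of the comparison when the observed shift is smaller than it) can be written down in a few lines. The acceptance values below are made-up placeholders, not numbers from the analysis, and the choice to normalize to the mean of the two samples is an assumption.

```python
import math


def systematic_from_comparison(acc_less, err_less, acc_more, err_more):
    """Relative systematic: the larger of the observed shift and its statistical error."""
    mean = 0.5 * (acc_less + acc_more)
    shift = abs(acc_more - acc_less) / mean
    stat = math.hypot(err_less, err_more) / mean
    return max(shift, stat)


# Hypothetical acceptances (arbitrary units) with statistical errors.
syst = systematic_from_comparison(1.00, 0.05, 1.02, 0.05)
print(f"assigned systematic: {100.0 * syst:.1f}%")
```

With these placeholder inputs the shift is smaller than its statistical error, so the statistical error is what gets quoted.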
Question 19:
About PDFs: I do not understand why you vary alpha_s
in PDFs. Alpha_s in PDF should be in agreement with
the one used in the hard scattering calculation as
this is the way they were defined and fitted from
available HEP data. Besides, I believe there is a
pretty developed approach based on CTEQ6.1 (I am
no expert, but I presume the NLO differential x-section
is available for tt-bar) when you calculate acceptance
by varying the eigenvectors and summing up the deviations
in acceptance in quadratures.
Answer 19:
The CTEQ fit to world data assumes a fixed alpha_s. A variation of
alpha_s within its uncertainties can cause a change in PDFs
which may change our acceptance. A variation of alpha_s is recommended
by the CTEQ group as a way to explore a somewhat orthogonal
direction in uncertainty space. We have improved our method
of estimating acceptance uncertainty by using the "eigenvector varying"
method that you mention. See answer to question 4. It was recently
decided at a CDF Joint Physics Group meeting that the systematic
uncertainty in acceptance due to PDF uncertainties should be
determined by a combination of the "eigenvector varying" method
and a variation in alpha_s and this is what we have done.
Question 20:
Again, ID efficiency for taus using embedding will not
work for values related to hadronic shower lateral
profile. I talked to Pasha about it a long time ago and
I think he agreed with me. I think that MC is not
reliable at all and unless you drop the calorimeter
isolation altogether, efficiency of this cut should
be measured from W->tau.
Answer 20:
We have measured a tau ID scale factor of 0.95+/-0.10
using W->taunu data and Monte Carlo.
This scale factor is applied to all of our Monte Carlo calculations that
include real taus (acceptance, Z->tautau, WW, WZ). The study is documented
in CDF Note 6921.
Question 21:
Acceptance systematics: I am not sure what is the
effect of deficiencies in MET simulation on your
acceptance (you use cuts on MET, HT etc.). I think
it has to be evaluated in some way. Even if it is
small, I think it is important to know that it is
small.
Answer 21:
We chose our missing Et cut to be very efficient so our acceptance
is not sensitive to small changes in the value of our missing Et cut.
To quantify this, we see that we gain 1.5% of our acceptance by
decreasing our missing Et cut by 5 GeV from 20 GeV to 15 GeV, and we lose
2% of our acceptance when we increase our missing Et cut from 20 GeV
to 25 GeV. These numbers (1.5% and 2%) are dwarfed by the 5.8% systematic
uncertainty that we measure as a result of the jet energy scale
uncertainty.
Below you can see the shape of the missing Et after analysis cuts in the
e,tau (top) and mu,tau (bottom) channels. The Z mass veto has not been applied.
Question 22:
Fake rates and OS/LS ratio in W+jet events. Do you
understand what's going on with the OS/LS ratio in
0,1-jet bins? I think that the procedure when you
first apply OS cut and then you apply fake rate to
this sample is incorrect. It is the same problem
of OS/LS in W+jet background for WW analysis (Avi
and Co agreed that this ratio is very far from
1:1 or even 2:1 that you seem to use). Problem is
that events to which you apply fake rate do not
have a well defined charge (unless you cut hard
on track isolation). On the contrary, W+jet events
that do pass all cuts are heavily OS (b/c they
are made to be OS by the production mechanism).
So even if they have OS:LS=2:1 for "dirty" tau fakes,
this ratio can be as high as 5:1 for "clean" tau
fakes.
Answer 22:
We do not "use" any ratio of OS/LS events, but only measure the fakes
in the data. It should be clear that we do not assume anything about
the OS/LS ratio.
If there is a dependence on isolation of the tau candidate we
account for that to some extent with our fake rate, which is parameterized
as a function of calorimeter isolation.
Our large systematic errors
cover the differences seen in the dijet samples (jet20, jet50, jet70)
as well as the SUMET sample. Our check of the fake rates is in the
jet multiplicity tables.
Question 23:
Fake rates: you say that jet->tau fake rate is of
the order of 1%. I think this is a typo. If you
do fakes as a function of ET and Iso, for events
with good isolation the fake rate should be huge (I
hope it is still less than real tau id efficiency,
otherwise fake rate technique will not work at all).
I am actually puzzled with what you do when you
apply fake rate only to events that do not pass
all ID cuts. I understand the f/(1-f) idea, but
it has to be applied to each bin in ET vs Iso
separately and it should not work at all if f is
not small (the region where iso is small, and most
of passing fakes are from there). I think the way
you apply fake rates is not clear (I understand
how you calculate it) and I want to understand it
in better detail. Could you expand this section and
give some numbers?
Answer 23:
We have six categories of isolation with 10 bins of Et in each isolation
category, or 60 total categories. Our calorimeter isolation is defined
as the amount of transverse energy deposited in a cone of radius R = 1.0
(in eta and phi), not including calorimeter towers belonging to the tau
candidate, divided by the tau candidate transverse energy. Our six bins of
isolation are as follows:
Isolation bin 1: 0.0 < tau Iso <= 0.05
Isolation bin 2: 0.05 < tau Iso <= 0.10
Isolation bin 3: 0.10 < tau Iso <= 0.20
Isolation bin 4: 0.20 < tau Iso <= 0.30
Isolation bin 5: 0.30 < tau Iso <= 0.50
Isolation bin 6: tau Iso > 0.50
The bins of Et begin with 15 GeV and are 10 GeV wide. Note that the fake
rate we calculate with the dijet and SUMET samples is a
relative fake rate, meaning our denominator includes several
of the tau candidate cuts and our numerator is after all tau ID cuts.
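The binning just described can be expressed as a small lookup. This is a sketch with hypothetical function names; placing candidates above the last Et bin edge into the last bin is an assumption, since the treatment of overflow is not stated here.

```python
from bisect import bisect_left

# Upper edges of isolation bins 1-5; bin 6 is everything above 0.50.
ISO_UPPER_EDGES = [0.05, 0.10, 0.20, 0.30, 0.50]
ET_MIN_GEV = 15.0
ET_WIDTH_GEV = 10.0
N_ET_BINS = 10


def iso_bin(iso):
    """Isolation bin number (1-6); upper edges are inclusive."""
    return bisect_left(ISO_UPPER_EDGES, iso) + 1


def et_bin(et_gev):
    """Et bin index (0-9), or None below the 15 GeV threshold."""
    if et_gev < ET_MIN_GEV:
        return None
    index = int((et_gev - ET_MIN_GEV) // ET_WIDTH_GEV)
    return min(index, N_ET_BINS - 1)  # assumption: overflow goes into the last bin


print(iso_bin(0.07), et_bin(32.0))
```

The Et index returned here matches the "bin" column (0 to 9) used in the fake rate tables.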
The following are the details of the fake rate calculation, where we show
the number of denominator and numerator events in the Jet 50 dataset used to
determine the fake rates (with errors) and the fake rate (with
statistical errors) for each isolation/Et bin. We use the jet50
fake rate because the Et spectrum of the jets is most similar
to our tau candidate Et spectrum in the W+jet events.
The actual fake rate, in percent, is in the "fake" column:
bin den denE num numE fake fakeE
Iso1 bins:
0 115.00 10.72 11.00 3.32 9.57 2.74
1 355.00 18.84 31.00 5.57 8.73 1.50
2 893.00 29.88 38.00 6.16 4.26 0.68
3 1653.00 40.66 47.00 6.86 2.84 0.41
4 2253.00 47.47 47.00 6.86 2.09 0.30
5 2007.00 44.80 24.00 4.90 1.20 0.24
6 1425.00 37.75 9.00 3.00 0.63 0.21
7 1000.00 31.62 3.00 1.73 0.30 0.17
8 643.00 25.36 1.00 1.00 0.16 0.16
9 378.00 19.44 0.00 -0.00 0.00 0.00
Iso2 bins:
0 555.00 23.56 26.00 5.10 4.68 0.90
1 1644.00 40.55 55.00 7.42 3.35 0.44
2 3281.00 57.28 54.00 7.35 1.65 0.22
3 4874.00 69.81 70.00 8.37 1.44 0.17
4 4832.00 69.51 42.00 6.48 0.87 0.13
5 3187.00 56.45 12.00 3.46 0.38 0.11
6 1762.00 41.98 3.00 1.73 0.17 0.10
7 949.00 30.81 2.00 1.41 0.21 0.15
8 462.00 21.49 0.00 -0.00 0.00 0.00
9 249.00 15.78 0.00 -0.00 0.00 0.00
Iso3 bins:
0 2737.00 52.32 82.00 9.06 3.00 0.33
1 6045.00 77.75 86.00 9.27 1.42 0.15
2 8599.00 92.73 76.00 8.72 0.88 0.10
3 8408.00 91.70 59.00 7.68 0.70 0.09
4 5039.00 70.99 24.00 4.90 0.48 0.10
5 2232.00 47.24 8.00 2.83 0.36 0.13
6 983.00 31.35 2.00 1.41 0.20 0.14
7 389.00 19.72 0.00 -0.00 0.00 0.00
8 160.00 12.65 0.00 -0.00 0.00 0.00
9 88.00 9.38 0.00 -0.00 0.00 0.00
Iso4 bins:
0 4355.00 65.99 78.00 8.83 1.79 0.20
1 6157.00 78.47 48.00 6.93 0.78 0.11
2 5635.00 75.07 29.00 5.39 0.51 0.10
3 3287.00 57.33 14.00 3.74 0.43 0.11
4 1260.00 35.50 2.00 1.41 0.16 0.11
5 461.00 21.47 0.00 -0.00 0.00 0.00
6 135.00 11.62 0.00 -0.00 0.00 0.00
7 50.00 7.07 0.00 -0.00 0.00 0.00
8 18.00 4.24 0.00 -0.00 0.00 0.00
9 11.00 3.32 0.00 -0.00 0.00 0.00
Iso5 bins:
0 8503.00 92.21 76.00 8.72 0.89 0.10
1 7578.00 87.05 46.00 6.78 0.61 0.09
2 4417.00 66.46 10.00 3.16 0.23 0.07
3 1707.00 41.32 1.00 1.00 0.06 0.06
4 508.00 22.54 0.00 -0.00 0.00 0.00
5 171.00 13.08 1.00 1.00 0.58 0.58
6 61.00 7.81 0.00 -0.00 0.00 0.00
7 28.00 5.29 0.00 -0.00 0.00 0.00
8 7.00 2.65 0.00 -0.00 0.00 0.00
9 4.00 2.00 0.00 -0.00 0.00 0.00
Iso6 bins:
0 21303.00 145.96 100.00 10.00 0.47 0.05
1 8388.00 91.59 45.00 6.71 0.54 0.08
2 2560.00 50.60 6.00 2.45 0.23 0.10
3 672.00 25.92 0.00 -0.00 0.00 0.00
4 257.00 16.03 0.00 -0.00 0.00 0.00
5 74.00 8.60 0.00 -0.00 0.00 0.00
6 29.00 5.39 0.00 -0.00 0.00 0.00
7 8.00 2.83 0.00 -0.00 0.00 0.00
8 5.00 2.24 0.00 -0.00 0.00 0.00
9 1.00 1.00 0.00 -0.00 0.00 0.00
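The columns above can be reproduced from the counts alone. The formulas below are inferred from the tabulated numbers rather than stated in the text: the fake rate is num/den in percent, and the quoted fakeE values are consistent with a binomial error (denE and numE match simple square roots of the counts).

```python
import math


def fake_rate_percent(den, num):
    """Fake rate and binomial statistical error, both in percent."""
    rate = num / den
    err = math.sqrt(rate * (1.0 - rate) / den)
    return 100.0 * rate, 100.0 * err


# First row of the Iso1 table: den = 115, num = 11.
rate, err = fake_rate_percent(115, 11)
print(f"{rate:.2f} +/- {err:.2f}")
```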
Note that there are large systematic errors (26%) associated with the
fake rates in addition to the statistical errors shown (from Jet
50) and the statistical errors from the W+jets data that the fake
rate is applied to. Also, we do not apply a fake rate of 0.0 to
any of the events in the W+jets sample, but we use the last bin
(or last few bins) with numerator events
to average with the remaining bins to make up for the lack of statistics.
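One possible reading of this averaging is to pool the counts of the last populated Et bin with all the empty bins above it and use the pooled rate for each of them. The sketch below implements only that reading; it is an assumption, and it does not cover an empty bin sitting between two populated ones (as in Iso5) or pooling over more than one populated bin.

```python
def pool_trailing_empty_bins(den, num):
    """Per-bin fake rates in percent, with trailing empty bins pooled."""
    rates = [100.0 * n / d for d, n in zip(den, num)]
    populated = [i for i, n in enumerate(num) if n > 0]
    if not populated:
        return rates
    start = populated[-1]  # last bin with numerator events
    pooled = 100.0 * sum(num[start:]) / sum(den[start:])
    return rates[:start] + [pooled] * (len(rates) - start)


# Iso1 counts from the Jet 50 table.
iso1_den = [115, 355, 893, 1653, 2253, 2007, 1425, 1000, 643, 378]
iso1_num = [11, 31, 38, 47, 47, 24, 9, 3, 1, 0]
iso1_rates = pool_trailing_empty_bins(iso1_den, iso1_num)
print([round(r, 2) for r in iso1_rates])
```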
Our total W+jet sample that the fake rate is applied to is about 103 events
with:
7% in Iso1
16% in Iso2
25% in Iso3
11% in Iso4
12% in Iso5
30% in Iso6
Given the distribution of our events we think it is a true statement that
the average fake rate we apply is on the order of 1%. The details of the
fake rates can be found in CDF Note 6784.
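A rough cross-check of the "order of 1%" statement can be made from the numbers on this page. The sketch below takes the Et-inclusive Jet 50 fake rate in each isolation category and weights it by the fraction of W+jet events in that category. Using the Jet 50 Et spectrum in place of the W+jet candidate spectrum is an assumption of this sketch, not the procedure of the analysis, which applies the rate bin by bin.

```python
# Jet 50 denominator and numerator counts per Et bin, copied from the tables above.
DEN = {
    "Iso1": [115, 355, 893, 1653, 2253, 2007, 1425, 1000, 643, 378],
    "Iso2": [555, 1644, 3281, 4874, 4832, 3187, 1762, 949, 462, 249],
    "Iso3": [2737, 6045, 8599, 8408, 5039, 2232, 983, 389, 160, 88],
    "Iso4": [4355, 6157, 5635, 3287, 1260, 461, 135, 50, 18, 11],
    "Iso5": [8503, 7578, 4417, 1707, 508, 171, 61, 28, 7, 4],
    "Iso6": [21303, 8388, 2560, 672, 257, 74, 29, 8, 5, 1],
}
NUM = {
    "Iso1": [11, 31, 38, 47, 47, 24, 9, 3, 1, 0],
    "Iso2": [26, 55, 54, 70, 42, 12, 3, 2, 0, 0],
    "Iso3": [82, 86, 76, 59, 24, 8, 2, 0, 0, 0],
    "Iso4": [78, 48, 29, 14, 2, 0, 0, 0, 0, 0],
    "Iso5": [76, 46, 10, 1, 0, 1, 0, 0, 0, 0],
    "Iso6": [100, 45, 6, 0, 0, 0, 0, 0, 0, 0],
}
# Share of the W+jet sample in each isolation category, in percent (from the list above).
FRACTION = {"Iso1": 7, "Iso2": 16, "Iso3": 25, "Iso4": 11, "Iso5": 12, "Iso6": 30}


def inclusive_rate(category):
    """Et-inclusive Jet 50 fake rate for one isolation category, in percent."""
    return 100.0 * sum(NUM[category]) / sum(DEN[category])


def average_fake_rate():
    """Average of the inclusive rates, weighted by the W+jet isolation fractions."""
    total_weight = sum(FRACTION.values())
    weighted = sum(FRACTION[c] * inclusive_rate(c) for c in FRACTION)
    return weighted / total_weight


print(f"average fake rate: {average_fake_rate():.2f}%")
```

Under these assumptions the weighted average comes out somewhat below 1%, in line with the statement above.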
For comments/questions, email Sarah at demers@fnal.gov, Tony at
vaiciuli@fnal.gov and Kevin at ksmcf@fnal.gov