Questions for tau Dilepton Analysis Blessing, April 28 2004

Sarah Demers, Kevin McFarland, Tony Vaiciulis

Analysis page with links to talks


For Documentation see:
CDF Note 6954: Checking muon to tau fake with Z -> mumu data
CDF Note 6923: Optimizing HT and Jet ET cuts in the ttbar tau dilepton Analysis
CDF Note 6922: A Z Mass Cut to reduce Z -> tautau background in the tau dilepton Analysis
CDF Note 6921: Event Selection and Acceptance for Top to Tau Dilepton Analysis
CDF Note 6784: Estimating the Jet to Hadronic Tau Fake Rate
CDF Note 6408: Determining the Electron Fake Rate to Hadronic taus from the Data


Question 1:
The analysis would benefit from more MC statistics


Answer 1:
This analysis, with roughly one signal event and one background event expected, is far from limited by MC statistics. While more MC statistics would lower our uncertainties, our background summary tables show that our largest background uncertainties come from the jet->tau fakes (both statistical and systematic), which we get entirely from the data. In our acceptance, the systematic error is a factor of two larger than the statistical error.



Question 2:
Muon, Electron and Tau ID should be added to paper.


Answer 2:
The tau ID is documented in CDF Note 6921 and we added tables showing our electron and muon ID to the note, though they are documented in many other places as well. The updated note will be posted by Tuesday, April 27th.



Question 3:
What are track quality requirements on the shoulder tracks for the tau?


Answer 3:
The shoulder track is required to have a Pt greater than 1 GeV and to be a deftrack with at least 10 axial and 10 stereo hits.


Question 4:
PDF uncertainty should not be limited by statistics. Could you use the generator level method?


Answer 4:
We changed our method of determining the systematic uncertainty due to uncertainties in the PDFs. We now use the method recommended at the April 4, 2004, Joint Physics Meeting. This method, which includes an event by event reweighting in the ttopei sample using X and Q^2 and different PDF sets, results in a 1% systematic due to PDF uncertainty.

More details:

The default signal Monte Carlo sample, Pythia ttopei, is based on the CTEQ5L Parton Distribution Function set using alpha_s = 0.118. We consider here the systematic uncertainty caused by varying the internal parameters of the PDF set as well as varying alpha_s and the choice of PDF group.

With the new PDF set CTEQ6M, the CTEQ group made available 40 complementary PDF sets CTEQ6M.01...CTEQ6M.40 each of which represents an up or down variation along one of the twenty eigenvectors (corresponding to the ~20 free parameters) which collectively form an orthonormal basis set spanning the PDF parameter space (hep-ph/0201195 v3 and references therein). Each up and down variation pair represents the range of PDF behavior that is consistent with the current global data. Each event in the Pythia ttopei inclusive ttbar sample is reweighted according to the ratio of the CTEQ6M PDF values and the CTEQ6M.xx PDF values. (See April 2, 2004, Joint Physics Meeting minutes) Then all normal selection cuts are applied using full simulation and reconstruction. The total acceptance change, with respect to CTEQ6M, caused by all of the variations is 0.6%.
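
The reweighting step described above can be sketched roughly as follows. This is only an illustration, not the analysis code: make_toy_pdf is a hypothetical stand-in for a real parton density (no actual CTEQ set is evaluated), the event tuples are invented, and forming the weight as the product over both incoming partons of (varied PDF / reference PDF) is an assumption about how the ratio is built.

```python
from typing import Callable, Iterable, Tuple

Pdf = Callable[[float, float], float]
# Hypothetical event record: (x1, x2, Q^2, passes all selection cuts)
Event = Tuple[float, float, float, bool]


def make_toy_pdf(a: float, b: float) -> Pdf:
    """Return a toy density x^-a (1-x)^b. NOT a real CTEQ PDF set."""
    def pdf(x: float, q2: float) -> float:
        # q2 is accepted for interface realism but ignored by this toy shape.
        return x ** (-a) * (1.0 - x) ** b
    return pdf


def event_weight(x1: float, x2: float, q2: float,
                 varied: Pdf, reference: Pdf) -> float:
    """Per-event weight: ratio of varied to reference PDF for both partons."""
    return (varied(x1, q2) * varied(x2, q2)) / (
        reference(x1, q2) * reference(x2, q2))


def weighted_acceptance(events: Iterable[Event],
                        varied: Pdf, reference: Pdf) -> float:
    """Acceptance = (sum of weights passing cuts) / (sum of all weights)."""
    total = 0.0
    passed = 0.0
    for x1, x2, q2, passes in events:
        w = event_weight(x1, x2, q2, varied, reference)
        total += w
        if passes:
            passed += w
    return passed / total


def relative_acceptance_change(events: Iterable[Event],
                               varied: Pdf, reference: Pdf) -> float:
    """Fractional shift of the acceptance relative to the reference PDF."""
    events = list(events)
    base = weighted_acceptance(events, reference, reference)
    new = weighted_acceptance(events, varied, reference)
    return (new - base) / base


# Invented events, for illustration only.
events = [
    (0.10, 0.20, 100.0, True),
    (0.30, 0.10, 100.0, False),
    (0.20, 0.20, 100.0, True),
    (0.05, 0.40, 100.0, False),
]
reference_pdf = make_toy_pdf(0.5, 3.0)
varied_pdf = make_toy_pdf(0.5, 3.2)
shift = relative_acceptance_change(events, varied_pdf, reference_pdf)
```

In the real procedure the pass/fail flag comes from the full simulation and reconstruction, and the shift is evaluated once per CTEQ6M.xx set.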

In a similar way, a relative event by event reweighting is done using the CTEQ5L and MRST72 PDF sets resulting in a 0.8% change in acceptance due to the choice of PDF group. To estimate the acceptance sensitivity to the value of alpha_s we compare the PDF sets MRST72 (alpha_s = 0.1175) and MRST75 (alpha_s = 0.1125) using the reweighting procedure described above. The acceptance change is 0.1%. Considering the three contributions of 0.6%, 0.8%, and 0.1% we take 1% as the systematic uncertainty due to PDF uncertainty.
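
As a quick cross-check of the 1% figure, the three contributions can be combined in quadrature. The text above does not state how the three numbers were combined, so the quadrature sum here is an assumption; it happens to reproduce the quoted 1%.

```python
import math


def quadrature_sum(contributions):
    """Combine independent uncertainties by adding them in quadrature."""
    return math.sqrt(sum(c * c for c in contributions))


# Acceptance changes in percent, as quoted in the answer:
# eigenvector variations, choice of PDF group, alpha_s variation.
pdf_contributions_percent = [0.6, 0.8, 0.1]

total_percent = quadrature_sum(pdf_contributions_percent)
print(f"PDF systematic (quadrature, assumed): {total_percent:.2f}%")
```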


Question 5:
For acceptance (a_s): Why do you take half the difference as uncertainty when you compare the central a_s value with the -1 sigma variation?


Answer 5:
Our method to estimate the systematic effect due to uncertainty in alpha_s has been changed to adhere to the recommendation presented at the Joint Physics Meeting of April 4, 2004. See answer to question 4.



Question 7:
What is the default for acceptance used in table 6 of note 6921? (Should 6.4% -> 11% and 3.7% -> 2%?)


Answer 7:
Table 6 has been updated with a better method to estimate the systematic uncertainty due to uncertainties in FSR. The default acceptance has been added to the table for clarification (1.08 +- 0.06). We think it is now clearer what is done.

Note that our final acceptance central value of 1.03 is different from the 1.08 listed above. The comparisons described above were done with our "old" muon veto cut (vetoing on tau candidates that can be matched to muon stubs) instead of our "new" muon veto cut (requiring the Et/seed track Pt of the tau candidate to be greater than 0.5). Also, we did not go through the steps of subtracting out the background due to our jet fake method in these systematic samples. We therefore use 1.08 here in order to compare apples to apples.



Question 8:
Uncertainty on Jet ID? You are requiring at least two Jets?


Answer 8:
Our jet ID uncertainty is dominated by the uncertainty on the jet energy scale. We take this into account in our systematic error. The larger point of the question, that CDF should consider the effect of our jet clustering algorithm, is noted. We are using JetCluModule for our jet clustering in this analysis and we do not have a systematic error associated with that choice.


Question 9:
What is the uncertainty on the kinematics? (Theta of muon and electron, transverse energy of muon and electron?)


Answer 9:
We take into account scale factors due to electron and muon ID, triggers, and reconstruction. This corrects for differences between the data and MC as far as the transverse energies and angular distributions are concerned.



Question 10:
What uncertainty is introduced by the opposite sign requirement?


Answer 10:
The momentum range of our electrons and muons is such that we are not worried about an error from mismeasuring their charge signs. The same holds for the tracks in the tau. Miscounting the tracks in the taus does introduce a possible uncertainty. However, we have some confidence that we are modeling this well because of the agreement we see in the W->taunu study between data and Monte Carlo tau candidate track multiplicity distributions.


Question 11:
What is the error on the luminosity?


Answer 11:
We have an error of 6% on the luminosity, with a central value of 193.5 pb^-1.


Question 12:
2.5% is given as sys uncertainty on electron/muon ID (6319) but the note referenced does not support this number. What is the explanation?


Answer 12:
This systematic uncertainty should be 5% (as it is currently in the analysis and documentation.) The large systematic error comes from the assumption that our electron and muon ID is not a function of the number of jets in the event, or the local environment in the event. We are consistent with other analyses in the top group with this 5% value for the systematic. (See the lepton + jets analysis CDF Note 6844, p21.) The CDF Note sourced as documentation for the 5% systematic is CDF Note 6858, "High pT Lepton ID Efficiency Scale Factor Studies" which has not yet been posted.


Question 13:
Why is the error on trigger efficiency neglected (muons in particular)?


Answer 13:
The first two versions of CDF Note 6921 had some of the electron and muon scale factors labeled incorrectly. We are now using the blessed results from the top group for winter 2003/spring 2004, and they are as follows:

CEM ID scale factor: 0.965 +/- 0.006
CMUP ID scale factor: 0.94 +/- 0.01
CMX ID scale factor: 1.015 +/- 0.007
CEM trigger efficiency: 0.966 +/- 0.001
CMUP trigger efficiency: 0.890 +/- 0.009
CMX trigger efficiency: 0.966 +/- 0.007
CMUP reconstruction scale factor: 0.927 +/- 0.010
CMX reconstruction scale factor: 0.992 +/- 0.011

Note that there is no scale factor associated with electron reconstruction. These scale factors are applied to the Monte Carlo backgrounds (Z->tautau, WW, WZ) as well as to the acceptance (ttbar inclusive Monte Carlo, ttopei).
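
For illustration, the overall correction per lepton type can be sketched as the product of the listed factors. The answer does not spell out how the factors are combined, so both the multiplication and the quadrature combination of relative errors (which treats the factors as uncorrelated) are assumptions.

```python
import math

# (value, uncertainty) pairs copied from the list above.
SCALE_FACTORS = {
    "CEM": [(0.965, 0.006), (0.966, 0.001)],                    # ID, trigger
    "CMUP": [(0.94, 0.01), (0.890, 0.009), (0.927, 0.010)],     # ID, trigger, reco
    "CMX": [(1.015, 0.007), (0.966, 0.007), (0.992, 0.011)],    # ID, trigger, reco
}


def combine(factors):
    """Multiply factors; add relative errors in quadrature (assumed uncorrelated)."""
    value = math.prod(v for v, _ in factors)
    relative_error = math.sqrt(sum((e / v) ** 2 for v, e in factors))
    return value, value * relative_error


combined = {name: combine(factors) for name, factors in SCALE_FACTORS.items()}
for name, (value, error) in combined.items():
    print(f"{name}: {value:.3f} +/- {error:.3f}")
```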


Question 14:
In section 6 (note 6921) it would be appropriate to quote and compare with Run1 published results PRL 79:3585-3590 (1997)


Answer 14:
We added a table to the results section that shows the predicted and measured events seen in Run1.


Question 15:
We would prefer not to quote any limits, especially not with the present statistics.


Answer 15:
We think that setting the limit on r_tau is a conservative thing to do and that it really shows what this analysis is intended to measure. However, we will present results in a way that the top group is comfortable with, with the input of the statistics committee.


Question 16:
Why is the quoted electron scale factor for monte carlo 0.965 for electrons and not 1 as seen in Z/W x-section?


Answer 16:
The electron and muon scale factors we use are listed in the answer to question 13. We use values blessed by the top group for the winter 2003/spring 2004 conferences that correspond to the electron and muon cuts we use in our analysis.


Question 17:
Can you estimate ISR systematic in a more sensible way than turning it off completely?


Answer 17:
A better method to estimate the systematic uncertainty due to ISR uncertainty might be to use Pythia ttbar samples ttopbe (less ISR) and ttopce (more ISR). However, due to the size of the samples we expect that the statistical uncertainty will prevent a more meaningful estimate of the ISR systematic than what we already have.



Question 18:
Are you treating FSR and underlying event as the same thing? If by FSR you mean radiation off one of the outgoing legs in the hard scattering process (what I think it should be) then the two are different. UE is how well you model the soft part of the collision due to remnants of the incoming protons.


Answer 18:
We have improved our method of estimating the FSR systematic. We now estimate the acceptance uncertainty due to imprecise knowledge of Final State Radiation by comparing the inclusive ttbar Pythia sample ttopde, with less FSR, to the Pythia sample ttopee, with more FSR. The difference in acceptance is smaller than the statistical uncertainty of 7%, which we therefore use as an estimate of the systematic uncertainty. So our method has improved, but the result is still limited by the statistical uncertainty of the Monte Carlo samples.



Question 19:
About PDFs: I do not understand why you vary alpha_s in PDFs. Alpha_s in PDF should be in agreement with the one used in the hard scattering calculation as this is the way they were defined and fitted from available HEP data. Besides, I believe there is a pretty developed approach based on CTEQ6.1 (I am no expert, but I presume the NLO differential x-section is available for tt-bar) when you calculate acceptance by varying the eigenvectors and summing up the deviations in acceptance in quadrature.


Answer 19:
The CTEQ fit to world data assumes a fixed alpha_s. A variation of alpha_s within its uncertainties can cause a change in PDFs which may change our acceptance. A variation of alpha_s is recommended by the CTEQ group as a way to explore a somewhat orthogonal direction in uncertainty space. We have improved our method of estimating acceptance uncertainty by using the "eigenvector varying" method that you mention. See answer to question 4. It was recently decided at a CDF Joint Physics Group meeting that the systematic uncertainty in acceptance due to PDF uncertainties should be determined by a combination of the "eigenvector varying" method and a variation in alpha_s and this is what we have done.


Question 20:
Again, ID efficiency for taus using embedding will not work for values related to hadronic shower lateral profile. I talked to Pasha about it a long time ago and I think he agreed with me. I think that MC is not reliable at all and unless you drop the calorimeter isolation altogether, efficiency of this cut should be measured from W->tau.


Answer 20:
We have measured a tau ID scale factor of 0.95 +/- 0.10 using W->taunu data and Monte Carlo. This scale factor is applied to all of our Monte Carlo calculations that include real taus (acceptance, Z->tautau, WW, WZ). The study is documented in CDF Note 6921.


Question 21:
Acceptance systematics: I am not sure what is the effect of deficiencies in MET simulation on your acceptance (you use cuts on MET, HT etc.). I think it has to be evaluated in some way. Even if it is small, I think it is important to know that it is small.


Answer 21:
We chose our missing Et cut to be very efficient so our acceptance is not sensitive to small changes in the value of our missing Et cut. To quantify this, we see that we gain 1.5% of our acceptance by decreasing our missing Et cut by 5 GeV from 20 GeV to 15 GeV, and we lose 2% of our acceptance when we increase our missing Et cut from 20 GeV to 25 GeV. These numbers (1.5% and 2%) are dwarfed by the 5.8% systematic uncertainty that we measure as a result of the jet energy scale uncertainty.

Below you can see the shape of the missing Et after analysis cuts in the e,tau (top) and mu,tau (bottom) channels. The Z mass veto has not been applied.
missingEt.gif



Question 22:
Fake rates and OS/LS ratio in W+jet events. Do you understand what's going on with the OS/LS ratio in 0,1-jet bins? I think that the procedure when you first apply OS cut and then you apply fake rate to this sample is incorrect. It is the same problem of OS/LS in W+jet background for WW analysis (Avi and Co agreed that this ratio is very far from 1:1 or even 2:1 that you seem to use). Problem is that events to which you apply fake rate do not have a well defined charge (unless you cut hard on track isolation). On the contrary, W+jet events that do pass all cuts are heavily OS (b/c they are made to be OS by the production mechanism). So even if they have OS:LS=2:1 for "dirty" tau fakes, this ratio can be as high as 5:1 for "clean" tau fakes.


Answer 22:
We do not "use" any ratio of OS/LS events, but only measure the fakes in the data. It should be clear that we do not assume anything about the OS/LS ratio. If there is a dependence on isolation of the tau candidate we account for that to some extent with our fake rate, which is parameterized as a function of calorimeter isolation. Our large systematic errors cover the differences seen in the dijet samples (jet20, jet50, jet70) as well as the SUMET sample. Our check of the fake rates is in the jet multiplicity tables.


Question 23:
Fake rates: you say that jet->tau fake rate is of the order of 1%. I think this is a typo. If you do fakes as a function of ET and Iso, for events with good isolation the fake rate should be huge (I hope it is still less than real tau id efficiency, otherwise fake rate technique will not work at all). I am actually puzzled with what you do when you apply fake rate only to events that do not pass all ID cuts. I understand the f/(1-f) idea, but it has to be applied to each bin in ET vs Iso separately and it should not work at all if f is not small (the region where iso is small, and most of passing fakes are from there). I think the way you apply fake rates is not clear (I understand how you calculate it) and I want to understand it in better detail. Could you expand this section and give some numbers?


Answer 23:
We have six categories of isolation with 10 bins of Et in each isolation category, or 60 total categories. Our calorimeter isolation is defined as the amount of transverse energy deposited in a cone of radius R = 1.0 (in eta and phi), not including calorimeter towers belonging to the tau candidate, divided by the tau candidate transverse energy. Our six bins of isolation are as follows:

Isolation bin 1: 0.0 < tau Iso <= 0.05
Isolation bin 2: 0.05 < tau Iso <= 0.10
Isolation bin 3: 0.10 < tau Iso <= 0.20
Isolation bin 4: 0.20 < tau Iso <= 0.30
Isolation bin 5: 0.30 < tau Iso <= 0.50
Isolation bin 6: tau Iso > 0.50
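
The isolation binning above can be expressed as a small lookup, sketched here for illustration. The function name and the choice to place an isolation of exactly 0.0 in bin 1 are assumptions; the bin edges are taken from the list above.

```python
from bisect import bisect_left

# Upper edges (inclusive) of isolation bins 1 to 5; bin 6 is everything above 0.50.
ISO_UPPER_EDGES = [0.05, 0.10, 0.20, 0.30, 0.50]


def isolation_bin(tau_iso: float) -> int:
    """Return the isolation bin number (1 to 6) for a tau candidate.

    bisect_left keeps a value equal to an edge in the lower bin,
    matching the "<=" upper bounds in the list above.
    """
    if tau_iso < 0.0:
        raise ValueError("isolation ratio cannot be negative")
    return bisect_left(ISO_UPPER_EDGES, tau_iso) + 1


for iso in (0.03, 0.05, 0.07, 0.25, 0.50, 0.75):
    print(iso, "-> Isolation bin", isolation_bin(iso))
```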

The bins of Et begin at 15 GeV and are 10 GeV wide. Note that the fake rate we calculate with the dijet and SUMET samples is a relative fake rate, meaning our denominator includes several of the tau candidate cuts and our numerator is after all tau ID cuts. The following are the details of the fake rate calculation, where we show the number of denominator and numerator events in the Jet 50 dataset used to determine the fake rates (with errors) and the fake rate (with statistical errors) for each isolation/Et bin. We use the jet50 fake rate because the Et spectrum of the jets is most similar to our tau candidate Et spectrum in the W+jet events. The actual fake rate is in the "fake" column, in percent:

bin den denE num numE fake fakeE

Iso1 bins:
0 115.00 10.72 11.00 3.32 9.57 2.74
1 355.00 18.84 31.00 5.57 8.73 1.50
2 893.00 29.88 38.00 6.16 4.26 0.68
3 1653.00 40.66 47.00 6.86 2.84 0.41
4 2253.00 47.47 47.00 6.86 2.09 0.30
5 2007.00 44.80 24.00 4.90 1.20 0.24
6 1425.00 37.75 9.00 3.00 0.63 0.21
7 1000.00 31.62 3.00 1.73 0.30 0.17
8 643.00 25.36 1.00 1.00 0.16 0.16
9 378.00 19.44 0.00 -0.00 0.00 0.00

Iso2 bins:
0 555.00 23.56 26.00 5.10 4.68 0.90
1 1644.00 40.55 55.00 7.42 3.35 0.44
2 3281.00 57.28 54.00 7.35 1.65 0.22
3 4874.00 69.81 70.00 8.37 1.44 0.17
4 4832.00 69.51 42.00 6.48 0.87 0.13
5 3187.00 56.45 12.00 3.46 0.38 0.11
6 1762.00 41.98 3.00 1.73 0.17 0.10
7 949.00 30.81 2.00 1.41 0.21 0.15
8 462.00 21.49 0.00 -0.00 0.00 0.00
9 249.00 15.78 0.00 -0.00 0.00 0.00

Iso3 bins:
0 2737.00 52.32 82.00 9.06 3.00 0.33
1 6045.00 77.75 86.00 9.27 1.42 0.15
2 8599.00 92.73 76.00 8.72 0.88 0.10
3 8408.00 91.70 59.00 7.68 0.70 0.09
4 5039.00 70.99 24.00 4.90 0.48 0.10
5 2232.00 47.24 8.00 2.83 0.36 0.13
6 983.00 31.35 2.00 1.41 0.20 0.14
7 389.00 19.72 0.00 -0.00 0.00 0.00
8 160.00 12.65 0.00 -0.00 0.00 0.00
9 88.00 9.38 0.00 -0.00 0.00 0.00

Iso4 bins:
0 4355.00 65.99 78.00 8.83 1.79 0.20
1 6157.00 78.47 48.00 6.93 0.78 0.11
2 5635.00 75.07 29.00 5.39 0.51 0.10
3 3287.00 57.33 14.00 3.74 0.43 0.11
4 1260.00 35.50 2.00 1.41 0.16 0.11
5 461.00 21.47 0.00 -0.00 0.00 0.00
6 135.00 11.62 0.00 -0.00 0.00 0.00
7 50.00 7.07 0.00 -0.00 0.00 0.00
8 18.00 4.24 0.00 -0.00 0.00 0.00
9 11.00 3.32 0.00 -0.00 0.00 0.00

Iso5 bins:
0 8503.00 92.21 76.00 8.72 0.89 0.10
1 7578.00 87.05 46.00 6.78 0.61 0.09
2 4417.00 66.46 10.00 3.16 0.23 0.07
3 1707.00 41.32 1.00 1.00 0.06 0.06
4 508.00 22.54 0.00 -0.00 0.00 0.00
5 171.00 13.08 1.00 1.00 0.58 0.58
6 61.00 7.81 0.00 -0.00 0.00 0.00
7 28.00 5.29 0.00 -0.00 0.00 0.00
8 7.00 2.65 0.00 -0.00 0.00 0.00
9 4.00 2.00 0.00 -0.00 0.00 0.00

Iso6 bins:
0 21303.00 145.96 100.00 10.00 0.47 0.05
1 8388.00 91.59 45.00 6.71 0.54 0.08
2 2560.00 50.60 6.00 2.45 0.23 0.10
3 672.00 25.92 0.00 -0.00 0.00 0.00
4 257.00 16.03 0.00 -0.00 0.00 0.00
5 74.00 8.60 0.00 -0.00 0.00 0.00
6 29.00 5.39 0.00 -0.00 0.00 0.00
7 8.00 2.83 0.00 -0.00 0.00 0.00
8 5.00 2.24 0.00 -0.00 0.00 0.00
9 1.00 1.00 0.00 -0.00 0.00 0.00

Note that there are large systematic errors (26%) associated with the fake rates in addition to the statistical errors shown (from Jet 50) and the statistical errors from the W+jets data that the fake rate is applied to. Also, we do not apply a fake rate of 0.0 to any of the events in the W+jets sample, but we use the last bin (or last few bins) with numerator events to average with the remaining bins to make up for the lack of statistics.
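
The tabulated numbers can be reproduced with a short sketch. The binomial error formula below is inferred from the table values (it matches the fakeE column, for example 2.74 for the first Iso1 bin) rather than stated in the text. The pooling function is only one possible reading of how empty bins are averaged with the last populated bin; both function names are hypothetical.

```python
import math


def fake_rate_percent(den: float, num: float):
    """Fake rate num/den and its binomial statistical error, both in percent."""
    p = num / den
    error = math.sqrt(p * (1.0 - p) / den)
    return 100.0 * p, 100.0 * error


def pooled_tail_rate_percent(bins):
    """Pool (den, num) pairs, e.g. the last populated Et bin plus the empty
    bins after it, into a single rate. Assumed interpretation of the averaging."""
    den = sum(d for d, _ in bins)
    num = sum(n for _, n in bins)
    return fake_rate_percent(den, num)


# First Iso1 row of the table: den = 115, num = 11.
rate, error = fake_rate_percent(115, 11)
print(f"fake = {rate:.2f}%, fakeE = {error:.2f}%")

# Iso1 bins 8 and 9 pooled: (643, 1) and (378, 0).
tail_rate, tail_error = pooled_tail_rate_percent([(643, 1), (378, 0)])
print(f"pooled tail fake = {tail_rate:.3f}% +/- {tail_error:.3f}%")
```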

Our total W+jet sample that the fake rate is applied to is about 103 events with:

7% in Iso1
16% in Iso2
25% in Iso3
11% in Iso4
12% in Iso5
30% in Iso6

Given the distribution of our events we think it is a true statement that the average fake rate we apply is on the order of 1%. The details of the fake rates can be found in CDF Note 6784.


For comments/questions, email Sarah at demers@fnal.gov, Tony at vaiciuli@fnal.gov and Kevin at ksmcf@fnal.gov