Practical Statistics for Particle Physics
Kyle Cranmer, New York University
Center for Cosmology and Particle Physics
CERN School HEP, Romania, Sept. 2011
Lecture 2
Outline

Lecture 1: Building a probability model
‣ preliminaries, the marked Poisson process
‣ incorporating systematics via nuisance parameters
‣ constraint terms
‣ examples of different "narratives" / search strategies

Lecture 2: Hypothesis testing
‣ simple models, the Neyman-Pearson lemma, and the likelihood ratio
‣ composite models and the profile likelihood ratio
‣ review of ingredients for a hypothesis test

Lecture 3: Limits & Confidence Intervals
‣ the meaning of confidence intervals as inverted hypothesis tests
‣ asymptotic properties of likelihood ratios
‣ Bayesian approach
Hypothesis Testing
Hypothesis testing

One of the most common uses of statistics in particle physics is Hypothesis Testing (e.g. for discovery of a new particle)
‣ Null Hypothesis, H0: e.g. background-only
‣ Alternate Hypothesis, H1: e.g. signal-plus-background
‣ one makes a measurement and then needs to decide whether to reject or accept H0
‣ assume one has the pdf for the data under the two hypotheses

[Figure: probability density vs. number of events observed under the two hypotheses for a simple counting experiment]
Hypothesis testing

Before we can make much progress with statistics, we need to decide what it is that we want to do.
‣ first let us define a few terms:
  ● Rate of Type I error: α
  ● Rate of Type II error: β
  ● Power = 1 − β

Treat the two hypotheses asymmetrically
‣ the Null is special
  ● fix the rate of Type I error, call it "the size of the test"

Now one can state a well-defined goal
‣ Maximize power for a fixed rate of Type I error
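As a concrete numerical sketch of these definitions, the snippet below computes the size and power of a simple counting test; the yields and the critical value are illustrative assumptions, not numbers from the slides.

```python
# A minimal sketch (numbers assumed): size and power of a counting test that
# rejects H0 (background-only) when the observed count n >= n_crit.
from scipy.stats import poisson

b, s = 100.0, 30.0      # assumed expected background and signal yields
n_crit = 130            # assumed critical value defining the rejection region

alpha = poisson.sf(n_crit - 1, b)        # Type I error rate: P(n >= n_crit | H0)
beta = poisson.cdf(n_crit - 1, b + s)    # Type II error rate: P(n < n_crit | H1)
print(f"size alpha = {alpha:.3g}, power 1 - beta = {1 - beta:.3g}")
```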
Hypothesis testing

The idea of a "5σ" discovery criterion for particle physics is really a conventional way to specify the size of the test
‣ usually 5σ corresponds to α = 2.87 · 10⁻⁷
  ● e.g. a very small chance that we reject the standard model

In the simple case of number counting it is obvious what region is sensitive to the presence of a new signal
‣ but in higher dimensions it is not so easy

[Figure: probability density vs. events observed for the counting example, and scatter plots of several discriminating variables (x0, x1, x2) under the two hypotheses, where the sensitive region is not obvious]
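A quick sanity check of the quoted convention, mapping between the one-sided Z-value and the test size α:

```python
# Converting a one-sided Z-value to the corresponding test size alpha, and back.
from scipy.stats import norm

print(norm.sf(5.0))        # alpha ~ 2.87e-7 for Z = 5
print(norm.isf(2.87e-7))   # Z ~ 5.0
```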
Hypothesis testing

Finding an optimal decision boundary:
‣ maybe select events with "cuts", e.g. xi < ci and xj < cj, separating an H0-like region from an H1-like region
‣ but which region of a given size has the most power?

The Neyman-Pearson lemma: the region W of observable space defined by the likelihood-ratio contour

  P(x|H1) > kα P(x|H0),

with kα chosen so that W has size α, maximizes the power. Any other region of the same size will have less power.

The likelihood ratio is an example of a Test Statistic, i.e. a real-valued function that summarizes the data in a way relevant to the hypotheses that are being tested.
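The following sketch illustrates the lemma on an assumed pair of toy two-dimensional densities (the means, covariances, sample sizes, and size α are illustrative choices, not from the slides): the threshold kα is set so the region has size α under H0, and the power is estimated from H1 samples.

```python
# Illustrative sketch (assumed toy densities): the likelihood ratio P(x|H1)/P(x|H0)
# as a test statistic for a two-dimensional observable x.
import numpy as np
from scipy.stats import multivariate_normal

f_H0 = multivariate_normal(mean=[0.5, 0.5], cov=0.05 * np.eye(2))  # assumed background pdf
f_H1 = multivariate_normal(mean=[1.0, 1.0], cov=0.05 * np.eye(2))  # assumed signal pdf

def log_likelihood_ratio(x):
    """ln P(x|H1) - ln P(x|H0): the Neyman-Pearson test statistic."""
    return f_H1.logpdf(x) - f_H0.logpdf(x)

# choose k_alpha so that the region {x : LR(x) > k_alpha} has size alpha under H0
alpha = 0.05
x_H0 = f_H0.rvs(size=200_000, random_state=1)
k_alpha = np.quantile(log_likelihood_ratio(x_H0), 1.0 - alpha)

# estimate the power of that region from samples of H1
x_H1 = f_H1.rvs(size=200_000, random_state=2)
power = np.mean(log_likelihood_ratio(x_H1) > k_alpha)
print(f"threshold ln k_alpha = {k_alpha:.2f}, power = {power:.3f}")
```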
A short proof of Neyman-Pearson

Consider the contour of the likelihood ratio that has a given size: the region W where P(x|H1) < kα P(x|H0) has probability 1−α under H0, and its complement W^C, where P(x|H1) > kα P(x|H0), is the critical region of size α.

Now consider a variation of the contour that has the same size (i.e. the same probability under H0): remove a piece B from inside the critical region and add a piece A from outside it, with

  P(A|H0) = P(B|H0).

Because the added region A lies outside the contour of the likelihood ratio, where P(x|H1) < kα P(x|H0), we have an inequality:

  P(A|H1) < kα P(A|H0) = kα P(B|H0).

And for the region B we lost, which lies inside the contour where P(x|H1) > kα P(x|H0), we also have an inequality:

  P(B|H1) > kα P(B|H0).

Together they give

  P(A|H1) < P(B|H1),

so the new region has less power.
2 discriminating variables

Often one uses the output of a neural network or multivariate algorithm in place of a true likelihood ratio.
‣ That's fine, but what do you do with it?
‣ If you have a fixed cut for all events, this is what you are doing:

  q = ln Q = −s + ln( 1 + s fs(x, y) / (b fb(x, y)) )

[Figure: events with discriminating variables (x, y) mapped to the quantity q, with the distributions fs(q) and fb(q) and a fixed cut on q]
Experiments vs. Events

Ideally, you want to cut on the likelihood ratio for your experiment
‣ equivalent to a sum of log likelihood ratios over events:

  q12 = q1 + q2

Easy to see that this includes experiments where one event had a high likelihood ratio and the other one was relatively small.

[Figure: per-event values q1 and q2 combined into q12, with the distributions fb(q12) and fs+b(q12), the size α, and the power 1−β]
LEP Higgs

A simple likelihood ratio with no free parameters:

  Q = L(x|H1) / L(x|H0)
    = [ Π_i^{Nchan} Pois(ni | si + bi) Π_j^{ni} ( si fs(xij) + bi fb(xij) ) / (si + bi) ]
      / [ Π_i^{Nchan} Pois(ni | bi) Π_j^{ni} fb(xij) ]

  q = ln Q = −s_tot + Σ_i^{Nchan} Σ_j^{ni} ln( 1 + si fs(xij) / (bi fb(xij)) )

[Figure: (a) the LEP probability density of −2 ln Q at mH = 115 GeV/c², showing the observed value and the expectations for background and for signal plus background; (b) the observed and expected −2 ln Q as a function of mH (GeV/c²)]
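A hedged sketch of this construction for a single assumed channel (the yields and shapes below are illustrative, not the LEP model): it evaluates −2 ln Q per pseudo-experiment and builds its distributions under both hypotheses.

```python
# A minimal sketch (assumed toy model) of the LEP-style test statistic
# q = ln Q = -s + sum_j ln(1 + s f_s(x_j) / (b f_b(x_j))) for one channel.
import numpy as np
from scipy.stats import norm, expon

s, b = 10.0, 100.0                    # assumed expected signal and background yields
f_s = norm(loc=115.0, scale=3.0)      # assumed signal shape of the discriminating variable
f_b = expon(loc=80.0, scale=40.0)     # assumed background shape

def minus_two_ln_Q(x):
    """-2 ln Q for one channel, given the per-event discriminating variables x."""
    return -2.0 * (-s + np.sum(np.log(1.0 + s * f_s.pdf(x) / (b * f_b.pdf(x)))))

rng = np.random.default_rng(0)

def toy(mu):
    """Generate one pseudo-experiment with signal strength mu and return -2 ln Q."""
    n = rng.poisson(mu * s + b)
    n_sig = rng.binomial(n, mu * s / (mu * s + b)) if mu > 0 else 0
    x = np.concatenate([f_s.rvs(n_sig, random_state=rng),
                        f_b.rvs(n - n_sig, random_state=rng)])
    return minus_two_ln_Q(x)

q_b  = np.array([toy(0.0) for _ in range(2000)])   # background-only toys
q_sb = np.array([toy(1.0) for _ in range(2000)])   # signal-plus-background toys
print(np.median(q_b), np.median(q_sb))
```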
The Test Statistic and its distribution

[Figure: schematic probability densities of a test statistic under the background-only and signal-plus-background hypotheses, with the observed value dividing "signal-like" from "background-like" regions and defining 1−CLb and CLs+b]

The "test statistic" is a single number that quantifies the entire experiment. It could just be the number of events observed, but often it's more sophisticated, like a likelihood ratio.

What test statistic do we choose? And how do we build its distribution? Usually with "toy Monte Carlo", but what about the uncertainties... what do we do with the nuisance parameters?
The Marked Poisson model

Recall our marked Poisson model
‣ observables: n events, each with some value of the discriminating variable m
‣ auxiliary measurements: ai
‣ parameters: α

  P(m, a|α) = Pois(n | s(α)+b(α)) × Π_j [ s(α) fs(mj|α) + b(α) fb(mj|α) ] / (s(α)+b(α)) × Π_{i∈syst} G(ai | αi, σi)

Useful to separate the parameters into α = (µ, ν)
‣ parameters of interest µ: cross sections, masses, coupling constants, ...
‣ nuisance parameters ν: reconstruction efficiencies, energy scales, ...
‣ note: not all of the nuisance parameters need to have constraint terms

[Figure: ATLAS Preliminary (simulation), H → llνν transverse mass distributions for mH = 200-500 GeV at √s = 7 TeV, showing Signal and Total BG (tt, ZZ, WZ, WW, Z, W); for the signal, the transverse mass distribution peaks near the Higgs boson mass]
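A minimal sketch of evaluating this likelihood for one channel with a single nuisance parameter; the shapes fs and fb, the nominal yields, and the 10% constraint below are assumptions for illustration, not the general model above.

```python
# A hedged sketch of the marked Poisson likelihood
# P(m, a|alpha) = Pois(n|s+b) * prod_j [s f_s(m_j) + b f_b(m_j)]/(s+b) * G(a|nu, sigma).
import numpy as np
from scipy.stats import norm, expon, poisson

f_s = norm(loc=125.0, scale=2.0)       # assumed signal shape in the discriminating variable m
f_b = expon(loc=100.0, scale=50.0)     # assumed background shape

def nll(m, a, mu, nu, s_nom=20.0, b_nom=200.0, sigma=0.1):
    """-ln L for one channel, with s = mu*s_nom, b = nu*b_nom and a Gaussian
    constraint on the background scale nu from the auxiliary measurement a."""
    s, b = mu * s_nom, nu * b_nom
    n = len(m)
    log_pois = poisson.logpmf(n, s + b)
    log_marks = np.sum(np.log((s * f_s.pdf(m) + b * f_b.pdf(m)) / (s + b)))
    log_constraint = norm.logpdf(a, loc=nu, scale=sigma)
    return -(log_pois + log_marks + log_constraint)

m_example = np.array([112.0, 126.0, 131.0])   # example event values of m
print(nll(m_example, a=1.02, mu=1.0, nu=1.0))
```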
Our canonical example: number counting

From our general model

  P(m, a|α) = Pois(n | s(α)+b(α)) × Π_j [ s(α) fs(mj|α) + b(α) fb(mj|α) ] / (s(α)+b(α)) × Π_{i∈syst} G(ai | αi, σi)

consider a simple number-counting model with s(α) → s, b(α) → b, and replace the constraint G(a|µ,σ) → Pois(noff | τ b) with τ known. This is the "on/off" problem:

  P(non, noff | s, b) = Pois(non | s + b) Pois(noff | τ b)

Here the sideband measurement is also modeled as a Poisson process, and the expected background in the sideband is related to the main measurement by the known ratio τ. The signal rate s is the parameter of interest and the background rate b is a nuisance parameter.

We could simply use non as our test statistic, but to calculate the p-value we need to know the distribution of non:

  p = Σ_{non = nobs}^∞ Pois(non | s + b)

Observations:
‣ The distribution of non explicitly depends on both s and b.
‣ The distribution of noff is independent of s.
‣ If τ b is very different from noff, then the data are not consistent with the model parameters; however, the p-value derived from non alone is not small.
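A small illustration of the on/off model (the counts and τ are assumed): the naive p-value obtained by plugging in the sideband estimate of b and ignoring its uncertainty.

```python
# Naive p-value for the on/off problem, treating b as perfectly known.
from scipy.stats import poisson

tau = 1.0                         # assumed ratio of sideband to signal-region background
n_on_obs, n_off_obs = 130, 100    # assumed observed counts

b_hat = n_off_obs / tau                       # naive background estimate from the sideband
p_naive = poisson.sf(n_on_obs - 1, b_hat)     # P(n_on >= n_obs | s = 0, b = b_hat)
print(f"p-value treating b as perfectly known: {p_naive:.2e}")
```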
With nuisance parameters: Hybrid Solutions

The goal of Bayesian-frequentist hybrid solutions is to provide a frequentist treatment of the main measurement, while eliminating nuisance parameters (i.e. dealing with systematics) with an intuitive Bayesian technique.

The uncertain background rate b is 'smeared', 'randomized', or 'integrated out' when generating the distribution of the test statistic:

  P(non | s) = ∫ db Pois(non | s + b) π(b),    p = Σ_{non = nobs}^∞ P(non | s)

Tracing the origin of π(b): one can think of π(b) as the posterior resulting from the sideband measurement and some prior η(b):

  π(b) = P(b | noff) = P(noff | b) η(b) / ∫ db P(noff | b) η(b)

But what is the nature of π(b)? The important fact that often evades serious thought is that π(b) plays the role of a Bayesian prior for the main measurement, which may or may not be well justified. It may be justified by previous measurements based on Monte Carlo, sidebands, or control samples, but even in those cases one does not escape an underlying Bayesian prior η(b). The point here is not about the use of Bayesian inference, but about clear accounting: P(noff | b) is an objective probability density that can be used in a frequentist context, while η(b) is the original Bayesian prior (e.g. assigned knowledge used when generating toy Monte Carlo).

In a purely frequentist approach we instead need a test statistic that depends on both non and noff, and we must consider both as random.

Recommendations (from the quoted text): where possible, express uncertainty on a parameter as a statistical (e.g. random) process (i.e. Pois(noff | τ b) in the on/off model); when using Bayesian techniques, explicitly express and separate the prior from the objective part of the probability density function.
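A sketch of the hybrid (prior-predictive) p-value for the same assumed counts as above, taking η(b) flat so that π(b) is a Gamma posterior from the sideband:

```python
# Hybrid p-value: P(n_on|s=0) = ∫ db Pois(n_on|b) pi(b), with pi(b) the posterior
# from the sideband under a flat prior eta(b). Numbers are illustrative assumptions.
import numpy as np
from scipy.stats import gamma, poisson

tau = 1.0
n_on_obs, n_off_obs = 130, 100    # assumed observed counts

# with a flat prior eta(b), the posterior pi(b|n_off) is Gamma(n_off + 1, scale = 1/tau)
posterior_b = gamma(a=n_off_obs + 1, scale=1.0 / tau)

rng = np.random.default_rng(0)
b_samples = posterior_b.rvs(size=200_000, random_state=rng)
p_hybrid = np.mean(poisson.sf(n_on_obs - 1, b_samples))   # average the tail probability over pi(b)
print(f"hybrid (prior-predictive) p-value: {p_hybrid:.2e}")
```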
Does it matter?

This on/off problem has been studied extensively.
‣ instead of arguing about the merits of the various methods, just go and check their rate of Type I error
‣ the results indicated a large discrepancy between the "claimed" significance and the "true" significance for various methods
  ● e.g. 5σ is really ~4σ for some points
So, yes, it does matter.

[Figures: critical boundaries xcrit(y) of the various methods (no systematics, Z5σ, ZN, profile constructions) for btrue = 100, with concentric ovals showing contours of LG; and, for the Gaussian-mean problem with relative Δb, claimed vs. true significance at ZN = 5 as a function of the relative background uncertainty, showing regions of undercoverage]

Excerpt shown on the slides (PhyStat 2005 proceedings): the profile construction is an approximation of the full Neyman construction that does not necessarily cover; to the extent that the profile likelihood ratio provides similar tests, it has good coverage properties, and its main motivation is that it scales well with the number of nuisance parameters.

http://www.physics.ox.ac.uk/phystat05/proceedings/files/Cranmer_LHCStatisticalChallenges.ps
Follow-up work by Bob Cousins & Jordan Tucker [physics/0702156]
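Putting the two sketches above side by side (same assumed counts) shows the effect on the claimed significance:

```python
# Comparing the claimed Z with and without the uncertainty on b (illustrative numbers).
import numpy as np
from scipy.stats import gamma, norm, poisson

tau, n_on, n_off = 1.0, 130, 100                              # same assumed counts as above
p_naive = poisson.sf(n_on - 1, n_off / tau)                   # b treated as perfectly known
b_samples = gamma(a=n_off + 1, scale=1.0 / tau).rvs(size=200_000, random_state=1)
p_hybrid = np.mean(poisson.sf(n_on - 1, b_samples))           # hybrid / prior-predictive
print(f"claimed Z ignoring the uncertainty on b: {norm.isf(p_naive):.2f}")
print(f"Z including it via the hybrid method:   {norm.isf(p_hybrid):.2f}")
```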
The Profile Likelihood Ratio

Consider our general model with a single parameter of interest µ
‣ let µ=0 be no signal, µ=1 the nominal signal

In the LEP approach the likelihood ratio is equivalent to:

  QLEP = P(m | µ=1, ν) / P(m | µ=0, ν)

‣ but this variable is sensitive to the uncertainty on ν and makes no use of the auxiliary measurements a

Alternatively, one can define the profile likelihood ratio

  λ(µ) = P(m, a | µ, ν̂̂(µ; m, a)) / P(m, a | µ̂, ν̂)

‣ where ν̂̂(µ; m, a) is the best fit with µ fixed (the constrained maximum likelihood estimator; it depends on the data)
‣ and µ̂ and ν̂ are the best fit with both left floating (unconstrained)
‣ the Tevatron used QTev = λ(µ=1)/λ(µ=0) as a generalization of QLEP
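A minimal sketch of the two fits for a toy counting model with one nuisance parameter (the nominal yields and constraint width are assumptions); the general model on the slide would be profiled the same way.

```python
# Profile likelihood ratio lambda(mu) = L(mu, nuhathat(mu)) / L(muhat, nuhat):
# fit once with mu fixed (nu profiled) and once with mu and nu both floating.
import numpy as np
from scipy.optimize import minimize, minimize_scalar
from scipy.stats import norm, poisson

S_NOM, B_NOM, SIGMA = 20.0, 100.0, 0.1   # assumed nominal signal, background, constraint width

def nll(mu, nu, n_obs, a_obs):
    """-ln L for n ~ Pois(mu*S_NOM + nu*B_NOM) with a Gaussian constraint on nu from a."""
    rate = mu * S_NOM + nu * B_NOM
    if rate <= 0:
        return np.inf                      # guard against unphysical rates during the fit
    return -(poisson.logpmf(n_obs, rate) + norm.logpdf(a_obs, loc=nu, scale=SIGMA))

def minus_two_ln_lambda(mu, n_obs, a_obs):
    # conditional fit: nu-hat-hat(mu), with mu fixed
    cond = minimize_scalar(lambda nu: nll(mu, nu, n_obs, a_obs),
                           bounds=(0.1, 3.0), method="bounded")
    # unconditional fit: mu-hat and nu-hat, both floating
    uncond = minimize(lambda p: nll(p[0], p[1], n_obs, a_obs),
                      x0=[1.0, 1.0], method="Nelder-Mead")
    return 2.0 * (cond.fun - uncond.fun)

print(minus_two_ln_lambda(mu=0.0, n_obs=130, a_obs=1.0))
```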
An example

Essentially, you need to fit your model to the data twice: once with everything floating, and once with the signal fixed to 0:

  λ(µ=0) = P(m, a | µ=0, ν̂̂(µ=0; m, a)) / P(m, a | µ̂, ν̂)

[Figure: ATLAS, VBF H(120)→ττ→lh at √s = 14 TeV, 30 fb⁻¹; fits to the Mττ distribution of signal candidates with and without the signal contribution. In the quoted excerpt, the ai are the parameters used to parameterize the fake-tau background and ν represents all nuisance parameters of the model (σH, mZ, σZ, rQCD, a1, a2, a3); the background shapes and normalizations try to accommodate the excess near mττ = 120 GeV, but the control samples constrain the variation. The accompanying table gives the significance calculated from the profile likelihood ratio for the ll-channel, the lh-channel, and the combined fit for various Higgs boson masses.]
Properties of the Profile Likelihood Ratio

After a close look at the profile likelihood ratio

  λ(µ) = P(m, a | µ, ν̂̂(µ; m, a)) / P(m, a | µ̂, ν̂)

one can see the function is independent of the true values of ν
‣ though its distribution might depend indirectly

Wilks's theorem states that, under certain conditions, the distribution of −2 ln λ(µ=µ0), given that the true value of µ is µ0, converges to a chi-square distribution
‣ more on this tomorrow, but the important points are:
‣ the "asymptotic distribution" is known and it is independent of ν
  ● more complicated if parameters have boundaries (e.g. µ ≥ 0)

Thus, we can calculate the p-value for the background-only hypothesis without having to generate Toy Monte Carlo!
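For the on/off counting model, −2 ln λ(µ=0) has a closed form, so Wilks's theorem gives an asymptotic significance directly; this sketch (with the same assumed counts as earlier) uses the convention of quoting Z = 0 when the best-fit signal is negative.

```python
# Asymptotic significance Z = sqrt(-2 ln lambda(0)) for the on/off model.
import numpy as np

def asymptotic_Z(n_on, n_off, tau):
    """sqrt(-2 ln lambda(0)) using the closed-form conditional MLE of b at s = 0."""
    b_hathat = (n_on + n_off) / (1.0 + tau)      # best-fit b with s fixed to 0
    q0 = 2.0 * (n_on * np.log(n_on / b_hathat)
                + n_off * np.log(n_off / (tau * b_hathat)))
    return np.sqrt(q0) if n_on * tau > n_off else 0.0   # Z = 0 if the best-fit signal is negative

print(asymptotic_Z(n_on=130, n_off=100, tau=1.0))        # assumed counts, as in the earlier sketches
```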
Toy Monte Carlo

Explicitly build the distribution by generating "toys" / pseudo-experiments assuming a specific value of µ and ν.
‣ randomize both the main measurement m and the auxiliary measurements a
‣ fit the model twice for the numerator and denominator of the profile likelihood ratio
‣ evaluate −2 ln λ(µ) and add it to a histogram

The choice of µ is straightforward: typically µ=0 and µ=1, but the choice of ν is less clear
‣ more on this tomorrow

This can be very time consuming. The plots below use millions of toy pseudo-experiments on a model with ~50 parameters.

[Figure: distributions of the profile likelihood ratio for the 2-channel and 5-channel combinations under the signal-plus-background and background-only hypotheses, with the observed test statistic corresponding to 3.35σ and 4.4σ respectively]
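A sketch of the toy procedure for the same assumed counting model used earlier: randomize the main measurement (here just n) and the auxiliary measurement a, fit twice per toy, histogram −2 ln λ(0), and compare with the observed value.

```python
# Building the distribution of -2 ln lambda(mu=0) with toy pseudo-experiments
# (same toy counting model as the earlier profile-likelihood sketch; values assumed).
import numpy as np
from scipy.optimize import minimize, minimize_scalar
from scipy.stats import norm, poisson

S_NOM, B_NOM, SIGMA = 20.0, 100.0, 0.1

def nll(mu, nu, n_obs, a_obs):
    rate = mu * S_NOM + nu * B_NOM
    if rate <= 0:
        return np.inf                      # guard against unphysical rates during the fit
    return -(poisson.logpmf(n_obs, rate) + norm.logpdf(a_obs, loc=nu, scale=SIGMA))

def q0(n_obs, a_obs):
    """-2 ln lambda(mu=0) from the conditional and unconditional fits."""
    cond = minimize_scalar(lambda nu: nll(0.0, nu, n_obs, a_obs),
                           bounds=(0.1, 3.0), method="bounded")
    uncond = minimize(lambda p: nll(p[0], p[1], n_obs, a_obs),
                      x0=[1.0, 1.0], method="Nelder-Mead")
    return 2.0 * (cond.fun - uncond.fun)

rng = np.random.default_rng(0)

def toy(mu=0.0, nu=1.0):
    n = rng.poisson(mu * S_NOM + nu * B_NOM)   # randomize the main measurement
    a = rng.normal(nu, SIGMA)                  # randomize the auxiliary measurement
    return q0(n, a)

toys = np.array([toy() for _ in range(500)])   # background-only toys (slow: two fits each)
q0_obs = q0(n_obs=130, a_obs=1.0)
p_value = np.mean(toys >= q0_obs)
print(f"q0_obs = {q0_obs:.2f}, p-value from toys = {p_value:.3f}")
```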
What makes a statistical method

To describe a statistical method, you should clearly specify
‣ choice of a test statistic
  ● simple likelihood ratio (LEP): QLEP = Ls+b(µ=1) / Lb(µ=0)
  ● ratio of profiled likelihoods (Tevatron): QTEV = Ls+b(µ=1, ν̂̂) / Lb(µ=0, ν̂̂′)
  ● profile likelihood ratio (LHC): λ(µ) = Ls+b(µ, ν̂̂) / Ls+b(µ̂, ν̂)
‣ how you build the distribution of the test statistic
  ● toy MC randomizing nuisance parameters according to π(ν)
    • aka Bayes-frequentist hybrid, prior-predictive, Cousins-Highland
  ● toy MC with nuisance parameters fixed (Neyman Construction)
  ● assuming the asymptotic distribution (Wilks and Wald, more tomorrow)
‣ what condition you use for a limit or discovery
  ● more on this tomorrow
Experimentalist Justification

So far this looks a bit like magic. How can you claim that you incorporated your systematic just by fitting the best value of your uncertain parameters and making a ratio? It won't work unless the parametrization is sufficiently flexible.

So check by varying the settings of your simulation, and see if the profile likelihood ratio is still distributed as a chi-square.

[Figure: ATLAS, L dt = 10 fb⁻¹; distributions of the log likelihood ratio for the nominal fast simulation and for variations (smeared missing pT, four Q² scale variations, leading-order tt, leading-order WWbb, full simulation)]

Here it is pretty stable, but it's not perfect (and this is a log plot, so it hides some pretty big discrepancies).

For the distribution to be independent of the nuisance parameters, your parametrization must be sufficiently flexible.
A very important point

If we keep pushing this point to the extreme, the physics problem goes beyond what we can handle practically.

The p-values are usually predicated on the assumption that the true distribution is in the family of functions being considered
‣ e.g. we have sufficiently flexible models of signal & background to incorporate all systematic effects
‣ but we don't believe we simulate everything perfectly
‣ ...and when we parametrize our models we have usually further approximated our simulation.
  ● nature -> simulation -> parametrization

At some point these approaches are limited by honest systematic uncertainties (not statistical ones). Statistics can only help us so much after this point. Now we must be physicists!