Statistics Forum Follow-up
info for Physics Coordination
24 June, 2011
Glen Cowan, Eilam Gross
Main questions
What do we see as the main way forward with CMS?
What do we recommend in the short term (summer 2011)?
What do we recommend after summer 2011?
The way forward with CMS
We met again with CMS on the evening of 23 June 2011
(ATLAS: Cowan, Gross, Murray, Read, Cranmer; CMS:
Cousins, Lyons, Dorigo, Demortier).
Cousins more or less ruled out supporting either CLs or PCL as
a long-term recommendation for CMS. We tried to clarify whether
this was his view or that of CMS. He believes that his own view,
which is to use Feldman-Cousins unified (two-sided) intervals,
would be followed in CMS.
We replied that the prevailing view in ATLAS has been to
quote a one-sided upper limit, and it was difficult to envisage
adopting F-C in place of this. So at present there is no single
frequentist method that would have long-term support from both
ATLAS and CMS.
In the short term, there is support for CLs in both collaborations
as an interim solution to allow for comparison of limits.
The way forward with CMS (2)
Bayesian methods emerged as a solution with support from
both sides. They had always been viewed as a useful complement
to the frequentist limit. Furthermore, one can study and report
the frequentist properties of Bayesian intervals (i.e., the
fraction of times they would cover the true parameter value),
and in many examples this coverage turns out to be very good.
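As an illustration (not shown in the talk), the coverage statement can be checked with a short toy Monte Carlo. The sketch below, in Python with NumPy/SciPy, assumes the Gaussian problem x ~ Gauss(μ, σ) with a flat prior for μ ≥ 0, for which the 95% credible upper limit has the closed form μ_up = x + σ Φ⁻¹(1 − α Φ(x/σ)); the function and variable names are illustrative.

    # Toy MC estimate of the frequentist coverage of a Bayesian 95% upper
    # limit for x ~ Gauss(mu, sigma) with a flat prior on mu >= 0.
    import numpy as np
    from scipy.stats import norm

    def bayes_upper_limit(x, sigma=1.0, alpha=0.05):
        # Flat-prior posterior quantile in closed form:
        # mu_up = x + sigma * PhiInv(1 - alpha * Phi(x / sigma))
        return x + sigma * norm.ppf(1.0 - alpha * norm.cdf(x / sigma))

    rng = np.random.default_rng(seed=1)
    sigma, n_toys = 1.0, 100_000
    for mu_true in [0.0, 0.5, 1.0, 2.0, 4.0]:
        x = rng.normal(mu_true, sigma, n_toys)            # toy measurements
        covered = mu_true <= bayes_upper_limit(x, sigma)  # did the limit cover?
        print(f"mu_true = {mu_true:3.1f}: coverage = {covered.mean():.3f}")

For true μ well above zero the estimated coverage comes out near the nominal 95%, while near the physical boundary the interval overcovers, consistent with the statement above.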
Both sides agreed to consider Bayesian methods with priors
chosen to have good frequentist properties as a common method.
At a more detailed level it will take some more time to agree
on and implement the procedures. So in the short term this is
not a realistic solution for analyses where Bayesian methods have
not already been developed.
Recommendation on minimum power for PCL: from 16% to 50%
For summer 2011 (and beyond), we recommend quoting PCL
limits with a minimum power of 50%. The reasons for
moving the minimum power to 50% are both theoretical and
practical:
50% avoids the possibility of having a conservative treatment
of systematics lead to a stronger limit.
Some computational issues related to low-count analyses are
less problematic with 50%.
There is a slight reduction in the burden on the analyst, since
the 50% quantile (median) needed for the power constraint is
easier to find than the 16% quantile (−1σ error band); see the
sketch below.
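The following is a minimal sketch of what the constraint does in the Gaussian problem, assuming a 95% CL one-sided limit μ_up = x + 1.645σ; the code and names are illustrative, not an official implementation.

    # Power-constrained limit for x ~ Gauss(mu, sigma): the observed
    # one-sided 95% CL upper limit may not fall below the min_power
    # quantile of the limit distribution under the background-only model.
    from scipy.stats import norm

    def pcl_upper_limit(x, sigma=1.0, alpha=0.05, min_power=0.50):
        z = norm.ppf(1.0 - alpha)          # 1.645 for 95% CL
        mu_up = x + z * sigma              # unconstrained one-sided limit
        # Under mu = 0, mu_up = x + z*sigma with x ~ Gauss(0, sigma), so the
        # min_power quantile is sigma*(PhiInv(min_power) + z):
        # 0.50 gives the median (1.645*sigma), 0.16 roughly the -1 sigma band.
        floor = sigma * (norm.ppf(min_power) + z)
        return max(mu_up, floor)

    print(pcl_upper_limit(x=-2.0))                   # constrained to 1.645
    print(pcl_upper_limit(x=-2.0, min_power=0.16))   # floor near 0.65
    print(pcl_upper_limit(x=1.0))                    # unconstrained: 2.645

In realistic analyses the floor is obtained from the distribution of limits under the background-only model, and the median of that distribution is typically easier to estimate than its 16% quantile.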
Recommendation on minimum power for PCL: from 16% to 50% (2)
A 50% minimum power gives a slight reduction in the
“psychological burden” on conference speakers, in that one would
less often see a sizable difference between PCL and CLs, and then
only in cases where a strong downward fluctuation leads to a
stronger CLs limit (see the graph on the next page, and recall
that under the background-only model, μ̂ lies between −1σ and
+1σ 68% of the time).
Owing to the short notice before EPS, it may be desirable to
leave the minimum power at 16% for the short term. This
should depend on whether groups feel they need more time to
shift from 16% to 50%. In practice this step should not take any
more time, and in some cases will save time.
Upper limits for Gaussian problem
[Figure: 95% CL upper limits as a function of the measurement; the measurement and the (unknown) true value are indicated on the plot.]
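Curves of this kind can be redrawn from the closed-form Gaussian expressions. The sketch below (my own plotting choices, with σ = 1) overlays the classical one-sided, PCL (50% minimum power) and CLs limits, using μ_up = x + σ Φ⁻¹(1 − α Φ(x/σ)) for CLs in this model.

    # 95% CL upper limits versus the measured x for x ~ Gauss(mu, 1).
    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.stats import norm

    alpha = 0.05
    z = norm.ppf(1.0 - alpha)
    x = np.linspace(-4.0, 4.0, 400)

    classical = x + z                              # unconstrained one-sided
    pcl50 = np.maximum(classical, z)               # PCL with 50% minimum power
    cls = x + norm.ppf(1.0 - alpha * norm.cdf(x))  # CLs, Gaussian closed form

    plt.plot(x, classical, label="classical one-sided")
    plt.plot(x, pcl50, label="PCL (min. power 50%)")
    plt.plot(x, cls, label="CLs")
    plt.xlabel("measurement x")
    plt.ylabel("95% CL upper limit on mu")
    plt.legend()
    plt.show()

The behaviour described on the previous page is visible here: PCL and CLs track each other for upward fluctuations, while for sufficiently strong downward fluctuations the CLs limit falls below the PCL floor at 1.645σ.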
Conclusions
We recommend using PCL with a minimum power of 50% as
the primary result.
For the short term, we also support reporting CLs to
allow for comparison with CMS.
In the longer term, the Bayesian approach appears to have
common support in both ATLAS and CMS. This will take some
time to implement for many analyses; for others it is already
available.
Search analyses should also report the discovery significance (p-value of the background-only hypothesis).
Extra material (repeated from 23 June talk)
ATLAS/CMS discussions on one-sided limits
Some prefer to report one-sided frequentist upper limits (CLs,
PCL); others prefer unified (Feldman-Cousins) limits, where
the lower edge may or may not exclude zero.
The prevailing view in the ATLAS Statistics Forum has been that
in searches for new phenomena, one wants to know whether a cross
section is excluded on the basis that its predicted rate is too high
relative to the observation, not excluded on some other grounds
(e.g., a mixture of too high or too low).
Among statisticians there is support for both approaches.
Discussions concerning flip-flopping
One-sided limits (CLs, PCL) can suffer from “flip-flopping”, i.e.,
violation of coverage probability if one decides, based on the data,
whether to report an upper limit or a measurement with error bars
(two-sided interval).
This can be avoided by “always” reporting:
(1) An upper limit based on a one-sided test.
(2) The discovery significance (equivalent to p-value
of background-only hypothesis).
In practice, “always” can mean “for every analysis carried out
as a search”, i.e., until the existence of the process is well
established (e.g., 5σ).
That is, we only require what is done in practice to map
approximately onto the idealized infinite ensemble.
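The effect is easy to exhibit with toys. In this sketch (assumed recipe: quote a 95% one-sided upper limit unless the measurement exceeds 3σ, in which case quote a 95% central interval; the 3σ switch point is an illustrative choice, not from the talk), the coverage dips below the nominal 95% in the transition region of true μ:

    # Coverage of a flip-flopping procedure for x ~ Gauss(mu, 1).
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(seed=1)
    z1, z2 = norm.ppf(0.95), norm.ppf(0.975)   # one-sided / central quantiles
    n_toys = 200_000

    for mu in np.arange(0.0, 6.5, 0.5):
        x = rng.normal(mu, 1.0, n_toys)
        one_sided = x < 3.0                    # no "discovery": quote upper limit
        covered = np.where(one_sided,
                           mu <= x + z1,           # mu < x + 1.645
                           np.abs(x - mu) <= z2)   # x +/- 1.96
        print(f"mu = {mu:3.1f}: coverage = {covered.mean():.3f}")

Always reporting the one-sided limit (together with the discovery significance) removes the data-dependent choice and restores the nominal coverage.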
Discussions on CLs and F-C
CLs has been criticized as a method for preventing spurious
exclusion because it leads to significant overcoverage that is in
practice not communicated to the reader.
This was the motivation behind PCL.
We have also not supported using the upper edge of a Feldman-Cousins
interval as a substitute for a one-sided upper limit, since
when used in this way F-C has lower power.
Furthermore F-C unified intervals protect against small (or null)
intervals by counting the probability of upward data fluctuations,
which are not relevant if the goal is to establish an upper limit.
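To make the comparison concrete, a brute-force Neyman construction of the unified intervals for the Gaussian problem can be placed next to the one-sided limit. The implementation below is my own sketch (coarse grids, σ = 1), not code from either collaboration; its F-C upper edge comes out above x + 1.645, i.e., weaker when used as an upper limit:

    # Feldman-Cousins (unified) 95% intervals for x ~ Gauss(mu, 1), mu >= 0,
    # by numerical Neyman construction with likelihood-ratio ordering.
    import numpy as np
    from scipy.stats import norm

    dx = 0.01
    xg = np.arange(-8.0, 12.0, dx)      # grid of possible measurements
    mug = np.arange(0.0, 8.0, 0.01)     # grid of hypothesized mu values

    def fc_accept(mu, cl=0.95):
        # Acceptance region for mu: take x values in decreasing order of
        # R = f(x|mu) / f(x|mu_best), mu_best = max(x, 0), until P >= cl.
        pdf = norm.pdf(xg, mu)
        ratio = pdf / norm.pdf(xg, np.maximum(xg, 0.0))
        order = np.argsort(-ratio)
        cum = np.cumsum(pdf[order] * dx)
        accept = np.zeros(xg.size, dtype=bool)
        accept[order[:np.searchsorted(cum, cl) + 1]] = True
        return accept

    def fc_upper_edge(x_obs):
        # Largest hypothesized mu whose acceptance region contains x_obs.
        i = int(np.argmin(np.abs(xg - x_obs)))
        contained = np.array([fc_accept(mu)[i] for mu in mug])
        return mug[np.nonzero(contained)[0].max()]

    for x_obs in [-1.0, 0.0, 1.0, 2.0]:
        print(f"x = {x_obs:4.1f}: F-C upper edge = {fc_upper_edge(x_obs):.2f}, "
              f"one-sided 95% limit = {x_obs + norm.ppf(0.95):.2f}")

The gap between the two columns reflects the probability the unified construction spends on protecting the lower edge against upward fluctuations, which is the point made above.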
Discussions concerning PCL
PCL has been criticized because it does not obviously map onto a
Bayesian result for some choice of prior (CLs = Bayesian for
special cases, e.g., x ~ Gauss(μ, σ) with a constant prior for μ ≥ 0).
We are not convinced of the need for this. The frequentist properties
of PCL are well defined, and as with all frequentist limits one
should not interpret them as representing Bayesian credible intervals.
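For reference, the parenthetical CLs–Bayes correspondence above is straightforward to check numerically; the sketch below (grid choices mine, σ = 1) integrates the flat-prior posterior directly and compares with the closed-form CLs limit:

    # Check that the 95% flat-prior Bayesian upper limit equals the CLs
    # limit for x ~ Gauss(mu, 1) with mu >= 0.
    import numpy as np
    from scipy.stats import norm

    x_obs, alpha = 0.7, 0.05
    mu = np.linspace(0.0, 12.0, 200_001)
    post = norm.pdf(x_obs, mu)            # flat-prior posterior (unnormalized)
    cdf = np.cumsum(post) / post.sum()    # posterior CDF on the grid
    bayes_up = mu[np.searchsorted(cdf, 1.0 - alpha)]
    cls_up = x_obs + norm.ppf(1.0 - alpha * norm.cdf(x_obs))
    print(f"Bayesian: {bayes_up:.4f}   CLs: {cls_up:.4f}")  # agree to grid precision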
Further criticism of PCL relates to the unconstrained limit,
which could exclude all values of μ; a remnant of this problem
could survive after application of the power constraint (cf.
“negatively biased relevant subsets”).
PCL does not have negatively biased relevant subsets (nor does
our unconstrained limit, as it never excludes μ = 0).
On both points the debate is still ongoing.