Shifting Standards: Experiments in Particle Physics in the Twentieth Century
Ebook, 573 pages, 7 hours

About this ebook

In Shifting Standards, Allan Franklin provides an overview of notable experiments in particle physics. Using papers published in Physical Review, the journal of the American Physical Society, as his basis, Franklin details the experiments themselves, their data collection, the events witnessed, and the interpretation of results. From these papers, he distills the dramatic changes to particle physics experimentation from 1894 through 2009.

Franklin develops a framework for his analysis, viewing each example according to exclusion and selection of data; possible experimenter bias; details of the experimental apparatus; size of the data set, apparatus, and number of authors; rates of data taking along with analysis and reduction; distinction between ideal and actual experiments; historical accounts of previous experiments; and personal comments and style.

From Millikan's tabletop oil-drop experiment to the Compact Muon Solenoid apparatus, which measures approximately 4,000 cubic meters (not including accelerators) and lists over 2,000 authors, Franklin's study follows the decade-by-decade evolution of scale and standards in particle physics experimentation. As he shows, where once there were only one or two collaborators, now it takes a village. Similar changes are seen in data collection: in 1909 Millikan's data set comprised 175 oil drops, of which he used 23 to determine the value of e, the charge of the electron; in contrast, the 1988-1992 E791 experiment using the Collider Detector at Fermilab, investigating the hadroproduction of charm quarks, recorded 20 billion events. As we also see, data collection took a quantum leap in the 1950s with the use of computers. Events are now recorded at rates of a few hundred per second, and analysis rates have progressed similarly.

Employing his epistemology of experimentation, Franklin deconstructs each example to view the arguments offered and the correctness of the results. Overall, he finds that despite the metamorphosis of the process, the role of experimentation has remained remarkably consistent through the years: to test theories and provide a factual basis for scientific knowledge, to encourage new theories, and to reveal new phenomena.
Language: English
Release date: Nov 24, 2018
ISBN: 9780822979197

    SHIFTING STANDARDS

    Experiments in Particle Physics in the Twentieth Century

    ALLAN FRANKLIN

    UNIVERSITY OF PITTSBURGH PRESS

    Published by the University of Pittsburgh Press, Pittsburgh, Pa., 15260

    Copyright © 2013, University of Pittsburgh Press

    All rights reserved

    Manufactured in the United States of America

    Printed on acid-free paper

    10 9 8 7 6 5 4 3 2 1

    Cataloging-in-Publication data is available from the Library of Congress

    ISBN 10: 0-8229-4430-8

    ISBN 13: 978-0-8229-4430-0

    ISBN-13: 978-0-8229-7919-7 (electronic)

    Contents

    Preface

    Prologue: The Rise of the Sigmas

    Introduction

    Chapter 1. Some Measurements of the Temperature Variation in the Electrical Resistance of a Sample of Copper

    Chapter 2. Do Falling Bodies Move South?

    Chapter 3. The Isolation of an Ion, a Precision Measurement of Its Charge, and the Correction of Stokes’s Law

    Chapter 4. Directed Quanta of Scattered X-rays

    Chapter 5. A Determination of e/m for an Electron by a New Deflection Method

    Chapter 6. An Uncertain Interlude

    Chapter 7. Electron Polarization

    Chapter 8. Mean Lifetime of V-Particles and Heavy Mesons

    Chapter 9. Detection of the Free Antineutrino

    Chapter 10. Measurement of the Ke2+ Branching Ratio

    Chapter 11. Determination of Kl3 Form Factors from Measurements of Decay Correlations and Muon Polarizations

    Chapter 12. Bad Data: An Interlude

    Chapter 13. Measurement of the Antineutron-Proton Cross Section at Low Energy

    Chapter 14. New Measurements of Properties of the Ω− Hyperon

    Chapter 15. The Coherent Scattering of Neutrinos

    Chapter 16. Search for Neutral Weakly Interacting Massive Particles in the Fermilab Tevatron Wideband Neutrino Beam

    Chapter 17. Measurement of the B+ Total Cross Section and B+ Differential Cross Section dσ/dpT in pp̄ Collisions at √s = 1.8 TeV

    Chapter 18. B Meson Decays to Charmless Meson Pairs Containing η or η’ Mesons

    Chapter 19. The Case of the Disappearing Sigmas

    Conclusion

    Notes

    References

    Index

    Preface

    In his one-paragraph short story On Exactitude in Science, the great Argentinian writer Jorge Luis Borges presents a literary forgery, where the following quotation is fictitiously attributed to Suárez Miranda: . . . In that Empire, the Art of Cartography attained such Perfection that the map of a single Province occupied the entirety of a City, and the map of the Empire, the entirety of a Province. In time, those Unconscionable Maps no longer satisfied, and the Cartographers Guilds struck a Map of the Empire whose size was that of the Empire, and which coincided point for point with it (Borges 1971, 123). A writer who wishes to discuss experiment, even if restricted only to physics, would be faced with a similar problem. Experimental practice is so varied that in order to discuss it fully one might very well have to summarize all experimental papers. I suspect that there are no typical experiments that could be used as exemplars. In this book I will discuss the changes in experimental practice in the twentieth century, restricting myself to experiments in particle physics. There are several reasons for this restriction. First, it makes the discussion tractable. Second, some of the changes discussed, particularly those of scale, are best illustrated in particle physics. Third, particle physics is the field of physics I know best. I strongly believe that knowledge of the science is essential for historical and philosophical study of any science. Although I believe that some of the changes discussed are valid for other fields of experimental physics, and possibly for other scientific disciplines, I do not have sufficient knowledge of those fields to make any useful comments.

    The reader will notice, particularly in the prologue, that numerous elementary particles, such as the π, Λ, μ, ω, and ρ, are mentioned in the text. None of the discussion demands knowledge of these particles and their properties. The symbols should be read as the names of characters in the studies. They do, however, serve to show that the practices discussed are used in a wide variety of experiments in particle physics. In addition, including the particle names is required for historical accuracy.

    Similarly, various mathematical and statistical techniques are mentioned. Some, like the standard deviation (sigma) and χ², are discussed briefly in the text and in more detail in the notes. Others, such as boosted decision trees, neural network discriminants, and matrix element discriminants, are merely mentioned. These are advanced techniques and certainly exceed my own knowledge. The important point is that several of these methods are used in a single experiment and their agreement argues for the robustness and correctness of the results. In addition, these results are checked against actual data, providing more confidence in the results.

    This project would not have started without a question, combined with later valuable discussions, from Harry Collins, who, to use his felicitous expression, is my onetime bitter academic enemy and now valued colleague. Harry asked me about the changing standards for discovery in high-energy physics, which led to the study documented in the prologue. The discussions, along with Harry’s fine book Gravity’s Ghost, were essential in this work and also helped in my investigation of other aspects of experiment that have changed with time.

    This book could not have been completed without the assistance of my colleagues in the experimental high-energy physics group at the University of Colorado: John Cumalat, Brian Drell, Bill Ford, Eduardo Luiggi, Jim Smith, Kevin Stenson, Keith Ulmer, and Steve Wagner. They took time out from actually doing physics to answer my questions and to provide very helpful discussions and material. Thanks also to Josh Ellenbogen for discussions on experiment in general and in particular on the philosophy of Duhem. I am also grateful to Jim Bogen and Giora Hon for carefully reading the manuscript and offering constructive and helpful suggestions. Thanks are also due to Alex Wolfe for his careful and thoughtful editing of the manuscript.

    None of this work would have been possible without the support of my wife, and best friend, Cynthia Betts.

    Prologue. The Rise of the Sigmas

    Before beginning the discussion about changes in the experimental practices of particle physics in the twentieth century, it is worthwhile to introduce readers to changes in the field’s most notable experimental standard, namely, its use of standard deviation to measure the accuracy and credibility of results.

    The discovery of the top quark was announced in three papers published in 1994 and 1995 by the Collider Detector at Fermilab (CDF) collaboration. The first two papers, one in the letters journal Physical Review Letters (Abe et al. 1994a) and the other in the more archival journal Physical Review D (Abe et al. 1994b),¹ were both titled "Evidence for Top Quark Production in p̄p Collisions at √s = 1.8 TeV."² The third paper, also published in Physical Review Letters, was titled "Observation of Top Quark Production in p̄p Collisions with the Collider Detector at Fermilab" (Abe et al. 1995). The difference between the papers was that, in the interval between the writing of the first two papers and the writing of the third, the CDF group had acquired more data and had obtained a statistically more significant result. The "Evidence for" papers presented a result that had a statistical significance of 2.8 standard deviations (or sigmas [σ]), whereas the "Observation of" paper reported a 5-standard-deviation effect.

    This was an early indication of what has now become a general policy for papers on high-energy physics.³ Editors for Physical Review Letters and Physical Review have remarked that papers that are titled "Observation of" must report at least a five-standard-deviation effect (see the discussion below and tables P.4 and P.5 for statistical evidence on this policy). Anything less statistically significant can be titled only "Evidence for" or a similar phrase.⁴ According to several colleagues in high-energy physics, this unwritten policy is strongly adhered to and enforced by research groups themselves in the writing of papers, even before submission to a journal. It is also enforced by referees for the journals. This has become a significant issue within the high-energy physics community (see the discussion of the discovery of the Higgs boson below). Groups prefer to publish "Observation of" papers, which have become the gold standard for a discovery.⁵ In fact, it is reasonable to say that "Observation of" is synonymous with "discovery of," at least for the high-energy physics community. Thus, for example, in recent work on single-top-quark production, Fermilab issued a press release, "Fermilab collider experiments discover rare single top quark," announcing the discovery only after both the CDF (Aaltonen et al. 2009) and D0 (Abazov et al. 2009) groups at the laboratory had reported five-standard-deviation effects, meaning that both papers used "Observation of" in their respective titles.⁶

    Not everyone within the high-energy physics community is happy about the strict enforcement of the five-sigma rule. I have been told in private communication that there have been, on occasion, rather heated discussions between authors and editors about that enforcement. As it now stands, if you want to publish an "Observation of" paper in either Physical Review Letters or Physical Review D, it must report a five-sigma effect.

    The Beginning

    The use of standard deviations as a measure of the significance or credibility of high-energy physics results seems to have started in the early 1960s.⁸ This is not to say that experimental physicists had not previously quoted an experimental uncertainty with their results, usually as a probable error or a standard deviation, but rather that the use of the number of standard deviations as a measure of credibility began at this time. Initially, there was no agreed-upon criterion for the number of standard deviations that signaled a significant result. Papers reported effects as low as 1.5 standard deviations (Connolly et al. 1963). It should be noted that although no significant claim was made for this result, it was mentioned: "For events satisfying these criteria, their 3π effective mass spectrum . . . is examined for evidence of a peak at the φ mass. There is a deviation at M(3π) = 1020 MeV of about 1.5 standard deviations above background" (375). This result was consistent with the mass of the known φ particle. An interesting question, and one that will be discussed later, is whether the criteria for the confirmation of an already reported result and for a discovery claim should be the same or different.

    On occasion, significant claims were made on the basis of what we would now consider rather weak evidence. A case in point is the claim made by Baltay and collaborators (1966): "On the basis of our result we conclude that C [charge conjugation or particle-antiparticle] invariance is violated in η-meson decay into three pions" (1224). Their conclusion was based on an asymmetry between those η decays in which the positive pion had more energy and those in which the negative pion had more energy. They found that the asymmetry A, defined as (N+ − N−) / (N+ + N−), where N+ is the number of decays in which the positive pion had more energy and N− the number of decays in which the negative pion had more energy, was equal to 0.072 ± 0.028, a 2.57-sigma effect: "The probability of obtaining a result |A| ≥ 0.072 because of random fluctuations in the data of this experiment, if there were no C-invariance, is 1.08 × 10⁻²" (Baltay et al. 1966, 1226). The group then combined their result with two other, less significant results, which gave asymmetries A = 0.058 ± 0.034 and A = 0.087 ± 0.053, each less than a 2-sigma effect and rather weak evidence on its own: "Combining our data with the [earlier data], we obtain A = 0.068 ± 0.020. The probability, in the same sense as before, of obtaining this result, there being no C-invariance violation, is 8 × 10⁻⁴" (Baltay et al. 1966, 1226–27). That probability corresponds to a 3.3-sigma effect, which would have justified a discovery claim at the time. As Samuel Goudsmit, the editor of Physical Review at the time, remarked, "an effect of less than three standard deviations is quite insufficient in such an important and subtle experiment" (Goudsmit 1971, 137). The effect was later shown to be a statistical fluctuation and not a real effect.
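
    The arithmetic behind these figures can be sketched in a few lines of Python. This is an illustration only, not the procedure used by Baltay and collaborators: it assumes a Gaussian, two-sided test, the function names are invented for the example, and the decay counts passed to `asymmetry` are hypothetical. Under those assumptions the quoted asymmetries give roughly 2.6 and 3.4 standard deviations, with probabilities close to, but not identical with, the values quoted in the paper.

```python
import math


def asymmetry(n_plus, n_minus):
    """Return A = (N+ - N-) / (N+ + N-) and its binomial standard error."""
    total = n_plus + n_minus
    a = (n_plus - n_minus) / total
    sigma = math.sqrt((1.0 - a * a) / total)
    return a, sigma


def n_sigma(value, error):
    """Number of standard deviations by which `value` differs from zero."""
    return abs(value) / error


def two_sided_p(z):
    """Gaussian probability of a fluctuation of at least z sigma, either sign."""
    return math.erfc(z / math.sqrt(2.0))


# Hypothetical counts, chosen only so that A comes out at 0.072.
a, sigma_a = asymmetry(536, 464)
print(f"hypothetical counts: A = {a:.3f} +/- {sigma_a:.3f}")

# Asymmetries and errors as quoted in the text above.
for value, error in [(0.072, 0.028), (0.068, 0.020)]:
    z = n_sigma(value, error)
    print(f"A = {value} +/- {error}: {z:.2f} sigma, p = {two_sided_p(z):.1e}")
```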

    In a paper reporting the existence of possible resonances in the Ξπ and KK̄ systems, Bertanza and coworkers remarked that "the deviations from the phase-space predictions are 3 standard deviations for the Ξπ and 2.5 standard deviations for the KK̄ effective-mass-squared distribution. This estimate of error is based on the square root of the total number of events in the bins containing the peaks in the Ξπ and KK̄ Dalitz plots" (Bertanza et al. 1962, 180).⁹ The method used here to estimate the number of standard deviations differs slightly from that which is used currently. The experimenters used the total number of events, equal to the signal plus the background (S + B), to calculate the standard deviation, which in this method of calculation is equal to the square root of the total number of events. (See figure P.1, where the plotted xs indicate the calculated background, and the signal is the number of events above the background curve.) They then calculated the number of standard deviations of the signal by dividing the signal by that standard deviation. This results in a lower value for the number of standard deviations of the signal than does current practice and lowers the significance of the results. Current practice calculates the number of standard deviations as S/√B, where S is the signal above background and B is the background. The standard deviation is √B, which gives a larger number of standard deviations to the signal and a greater statistical significance. The older method calculates the probability that the total number of events might fluctuate downward and give only the background. The current method calculates the probability that the background will fluctuate upward and give the total number of events. This is a small but significant difference in method. The issue is still, at least on occasion, a subject for discussion. In a paper published in 2004 the COSY-TOF collaboration made the following statement about their calculation of the statistical significance of their result:

    The first alternative is the naive estimation NS/√NB where NS is the number of events corresponding to the signal on top of the fitted background and NB is the number of background events in the chosen area. . . . In the present case this leads to a significance of 5.9 σ. . . . This estimator however neglects the statistical uncertainty of the background and therefore usually overestimates the significance of the peak. A more conservative method which is reliable for cases where the background is smooth and well-fixed in its shape uses the estimator NS/√(NS + NB). In our case this method leads to a significance of 4.7 σ. The third expression taking into account the full uncertainty of a statistically independent background which should underestimate the significance of the signal is given by NS/√((NS + NB) + NB). This leads to a value of 3.7 σ. (Abdel-Barry et al. 2004, 132).
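
    The three estimators in this passage are easy to compare side by side. The following Python sketch is only an illustration: the function names are invented, and the counts NS = 60 and NB = 105 are hypothetical values chosen so that the three results fall near the quoted 5.9, 4.7, and 3.7 σ. They are not taken from the COSY-TOF paper.

```python
import math


def naive_significance(n_s, n_b):
    """N_S / sqrt(N_B): ignores the statistical uncertainty of the background."""
    return n_s / math.sqrt(n_b)


def conservative_significance(n_s, n_b):
    """N_S / sqrt(N_S + N_B): uses the total number of events in the region."""
    return n_s / math.sqrt(n_s + n_b)


def independent_background_significance(n_s, n_b):
    """N_S / sqrt((N_S + N_B) + N_B): adds the full background uncertainty."""
    return n_s / math.sqrt((n_s + n_b) + n_b)


# Hypothetical signal and background counts (not from the paper).
n_s, n_b = 60, 105
for estimator in (naive_significance,
                  conservative_significance,
                  independent_background_significance):
    print(f"{estimator.__name__}: {estimator(n_s, n_b):.2f} sigma")
```

    For any positive signal and background the three values are ordered from largest to smallest, which is why the choice of estimator can move the same peak across a four- or five-sigma threshold.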

    Often, other methods were used in addition to standard deviations. Not all data are suitable for analysis with standard deviations. If one wants to investigate which of two mathematical formulas better fits a set of data, then χ² analysis may be more appropriate. Bertanza and collaborators also used the χ² test: "An analysis has been made of the effective-mass-squared distributions of the Ξπ and KK̄ systems by means of the χ² test. The probabilities that the observed distributions originate from their corresponding phase-space distributions are < 0.0001 for the Ξπ system and < 0.01 for the KK̄ system. The large χ² arise mainly from the single peaks appearing in each curve" (Bertanza et al. 1962, 182–83). In order to calculate the χ² value one must know the standard deviation: $\chi^2 = \sum_{i=1}^{N} \left([\text{Observed events}]_i - [\text{Expected events}]_i\right)^2 / \sigma_i^2$. For randomly distributed events, σ² is equal to the expected number of events. If the data are a good fit to the proposed curve or hypothesis, the χ² per degree of freedom should be approximately one.¹⁰ It is also possible to translate χ² into a probability and then into standard deviations, although this is not always done. A large χ² indicates that the results do not fit the hypothesis, whereas a smaller χ² indicates a better fit.
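
    As a minimal sketch of the χ² calculation just described, the Python below sums the squared deviations bin by bin, taking σ² in each bin to be the expected number of events. The histogram is invented for the illustration, and the number of degrees of freedom is assumed to equal the number of bins (that is, no fitted parameters).

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all bins.

    Assumes randomly distributed counts, so sigma^2 = expected events per bin.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical five-bin histogram with a single peak in the fourth bin.
observed = [12, 9, 11, 25, 10]
expected = [10, 10, 10, 10, 10]

chi2 = chi_square(observed, expected)
dof = len(observed)  # assumption: no parameters were fitted to the data
print(f"chi2 = {chi2:.1f} for {dof} degrees of freedom, chi2/dof = {chi2 / dof:.2f}")
```

    In this made-up example nearly all of the χ² comes from the one bin containing the peak, the same pattern Bertanza and collaborators describe for their distributions.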

    Sometimes the number of standard deviations is not mentioned at all. The paper that announced the discovery of the η meson by Pevsner and collaborators (1961) noted that there were 36 events in the appropriate region with an overestimated background of 12 events, a signal of 24 events, with no further comment.¹¹ The experimenters did, however, also present a graph of their data (figure P.1).¹² The smaller peak on the left, the η meson, is clearly visible.¹³ The combination of the two was deemed sufficient to establish their claim by both the authors and the physics community.

    A similar method was used by Alston and collaborators (1961) in reporting a new resonance in the K-π system. They noted that they had a signal of 22 events above a background of 3 events, a total of 25 events, and presented a graph of their data (figure P.2). No number of standard deviations for the signal was given, but the peak was clearly visible in the graph. In this case the authors did use standard deviations to determine the spin of the new particle: Experimentally for the 21 events lying in the K* mass range between 870 and 900 Mev is 0.275. Using S = 2 [S is the spin of the particle], . . . we find the expected value of ≥ 0.429 with a standard deviation of 0.051. The experimental result thus deviates from the range of values expected for S = 2 by three standard deviations. For S > 2 the discrepancy is even greater. On the other hand, the experimental result is consistent (within errors) with S = 0 or S = 1 (Alston et al. 1961, 301). A three-sigma signal seemed to be sufficient.

    Image: Figure P.1. Histogram of the three-pion mass distribution for 233 events. From Pevsner et al. (1961).

    Image: Figure P.2. Mass spectrum of the K⁰π⁻ system. The solid line represents the phase-space normalized to background events. From Alston et al. (1961).

    The trend of using standard deviations as a measure of significance continued in the early 1970s.¹⁴ A survey of papers in Physical Review Letters in 1971 reveals that experimenters were now tending to cite only results with four standard deviations or more, although, once again, the method was not used exclusively. The origin of this change was attributed to Arthur Rosenfeld, a member of the Particle Data Group, which produced the standard reference guide of particle properties, Review of Particle Physics. According to the story, which received wide circulation within the high-energy physics community, Rosenfeld pointed out that given the large number of graphs that were plotted each year by experimenters, one would expect to see a significant number of three-standard-deviation effects even if the data were distributed randomly and no particles or resonances were present. Rosenfeld (1975) discussed the issue in print. In discussing the existence of the κ(725), a Kπ resonance that had been reported five times but subsequently disappeared, Rosenfeld stated the following: "We compiled and histogrammed (by computer) 60,000 new Kπ events, and found no substantial further evidence, and went on to ask how frequently such striking statistical fluctuations should be expected at some given mass in the Kπ system. (At the time about 2 million bubble chamber events were being measured annually, and about a thousand physicists were hunting through 10,000 to 20,000 mass histograms each year, in search of striking features, real or imagined.) We concluded that the five κ claims were just about what we should expect" (564–65).

    The probability of a three-sigma effect in a single bin is 0.27 percent.¹⁵ In a 1,000 bin graph, however, the probability of observing a three-standard-deviation effect, if the data are distributed randomly, is 93 percent. The criterion was then changed to four sigma, which has a probability in a single bin of 0.0064 percent and a probability of 6 percent in a 1,000 bin graph. The question of what sample space to use for calculating such probabilities is important and will be discussed in more detail below.
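
    These percentages can be reproduced with a short calculation. The Python sketch below assumes a two-sided Gaussian probability for a single bin and treats the bins as statistically independent; the function names are invented for the illustration.

```python
import math


def single_bin_probability(z):
    """Two-sided Gaussian probability of a fluctuation of at least z sigma."""
    return math.erfc(z / math.sqrt(2.0))


def any_bin_probability(z, n_bins):
    """Probability that at least one of n_bins independent bins shows such a fluctuation."""
    return 1.0 - (1.0 - single_bin_probability(z)) ** n_bins


for z in (3, 4):
    p_one = single_bin_probability(z)
    p_any = any_bin_probability(z, 1000)
    print(f"{z} sigma: single bin {100 * p_one:.4f}%, 1,000 bins {100 * p_any:.0f}%")
```

    With 1,000 bins this gives about 0.27 percent and 93 percent for three sigma, and about 0.006 percent and 6 percent for four sigma, in line with the figures above.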

    Carmony and collaborators (1971), in reporting an Observation of a New KN(1760) in the Kπ and Kππ Systems, stated that decays into K*(890) π and Kρ with the same mass and width are observed with 4-standard-deviation significance (1160). There were, however, some problems with their analysis. The proposed KN(1420) shows up clearly in the K⁰π⁰π− spectrum and both the known K*(890) and the KN(1420) appear in the K+π− spectrum (figure P.3). The experimenters fitted the data with a third resonance at a mass of 1760 MeV.¹⁶ As the authors themselves admitted, "the fitted curve still has a very low probability for describing the data between 1.9 and 2.2 GeV [in the K+π− spectrum]. It appears possible that other structures may exist in this region, although with the current data it is not possible to extract their parameters. The KN(1760) is not clearly separated from these structures" (Carmony et al. 1971, 1161, emphasis added). One may get a good fit to the proposed hypothesis in the area of interest, but it may fail elsewhere. This analysis raises the question of whether one is fitting the data only to what one is looking for without considering alternative hypotheses. In this case the experimenters might have considered fitting a smooth curve to the region between 1.6 and 2.2 GeV. The problem of what hypotheses to use for fitting data is one that we will see again.

    The use of χ² analysis is further illustrated in the work of Foley and collaborators (1971). At this time there was a controversy as to whether the A2(1320) meson consisted of a single peak or of two slightly separated peaks (the split A2 controversy).¹⁷ The experimenters fitted their observed mass spectrum with both a single-peak distribution and a double-peak distribution. For the former they obtained a χ² of 35.1 for 37 degrees of freedom, a 55 percent probability, whereas the double-peak fit yielded a χ² of 149 for 39 degrees of freedom, a probability of very close to zero,¹⁸ which they regarded as "unacceptably bad" (Foley et al. 1971, 419). The superiority of the one-peak fit is seen clearly in figure P.4. The group also remarked that "within the region 1.2 to 1.4 GeV, the observed distribution about the fitted curve is consistent with statistical fluctuations. However the bins 1.415 to 1.435 contain 37 events where 18 are predicted by the fit. The probability that this is a statistical fluctuation is < 10⁻⁴. . . . Hence we conclude that there is probably a narrow peak in the K⁰K⁻ mass at ~1.425 GeV" (Foley et al. 1971, 415).¹⁹ The observed effect is more than four standard deviations, although this is not explicitly mentioned.
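
    The statement that 37 events where 18 are predicted amounts to more than four standard deviations can be checked directly. The Python sketch below is an illustration only: it uses the S/√B estimate described earlier and, as an assumption about how such a probability might be obtained, the exact Poisson probability of counting 37 or more events when 18 are expected. The function name is invented.

```python
import math


def poisson_tail(n_observed, mean):
    """Probability of counting n_observed or more events for a Poisson mean."""
    term = math.exp(-mean)  # probability of counting exactly zero events
    below = 0.0
    for k in range(n_observed):
        below += term            # add P(k)
        term *= mean / (k + 1)   # step to P(k + 1)
    return 1.0 - below


observed, predicted = 37, 18
excess_sigma = (observed - predicted) / math.sqrt(predicted)
print(f"excess: {excess_sigma:.2f} standard deviations")
print(f"Poisson probability of >= {observed} events: {poisson_tail(observed, predicted):.1e}")
```

    Both numbers agree with the passage: the excess is about 4.5 standard deviations, and the Poisson probability is below 10⁻⁴.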

    Image: Figure P.3. (a) K-π mass spectrum for reaction K⁺n → K⁺π⁻p with −tnp < 1 GeV², fitted with a superposition of two Breit-Wigner distributions and a fourth-order polynomial background (solid line). The insert shows a fit including a third Breit-Wigner distribution for the KN(1760) and the polynomial background as a dash-dotted line. The dashed curve represents a double-Regge background. (b) Kππ mass spectrum for the combined sample for events from reactions K⁺n → K⁰π⁺π⁻p and K⁺n → K⁺π⁻π⁰p with −tnp < 1 GeV² and Δ°(1236) (1.18 to 1.32 GeV) excluded, together with a superimposed fit. From Carmony et al. (1971).

    Image: Figure P.4. The K⁻K_S⁰ effective-mass spectrum for K⁻K_S⁰ events with a recoil mass in the region 0.76 to 1.06 GeV. From Foley et al. (1971).

    Another problem that arises in this type of analysis, and in later experiments, is how one calculates the appropriate number of standard deviations when selection criteria, or cuts, are applied to the data.²⁰ Consider the claim made by Ming Ma and Colton (1971) that they had found "Evidence for a Narrow N*(1470)." They studied the reactions 1) pp → p(nπ+) and 2) pp → p(pπ⁰). "Although the pπ⁰ mass spectrum shows no significant enhancements, an enhancement near 1.46 GeV/c² can be seen in reaction 1)" (Ming Ma and Colton 1971, 334) (figure P.5). In order to enhance the observed effect, a cut was made requiring that Mπ+p be greater than 2.4 GeV/c², which would reduce a known source of background. When this cut was made, peaks stood out clearly at approximately 1.47 and 1.65 GeV/c²: "The combined signal in the mass region between 1.425 and 1.500 Gev/c² stands at the 6-standard deviation level [this was for the combined spectra for reaction 1) and 2) shown in figure P.6]" (335). If one looks at the Review of Particle Physics, 2010 (Nakamura et al. 2010), however, one finds no mention of a nucleon resonance at 1470 MeV/c².²¹ A six-standard-deviation effect had disappeared.

    The possible overreliance on standard deviation estimates as a measure of significance suggested above is clearly illustrated in the work of Maglich and collaborators (1971). In this paper a claim was made that a new neutral boson with mass 953 (+1, −2.5) MeV had been discovered by a "20-standard deviation peak" (Maglich et al. 1971, 1479; emphasis added). The group explicitly rejected the previously discovered η’(958) as an explanation for their result: "If the mass value of the η’, which is 957.7 ± 0.8 is fixed in the program, it is rejected with a confidence level P(χ²) < 10⁻⁴" (1479). Their results are shown in figure P.7. As one can see, they were able to observe the known π⁰, η⁰, and ω⁰ particles at their previously measured masses. The insert in figure P.7 shows the details of the spectrum in the region of 900 MeV. The claimed 20-standard-deviation effect has a probability of approximately 10⁻⁸⁶, or about the same probability as getting 288 consecutive heads in the toss of a fair coin. Yet if one looks at Nakamura et al. (2010), only the η’(958) is listed. There are other instances in which results that have large reported statistical significance have disappeared. This, of course, raises the question of whether the standard deviations are being calculated correctly or whether the result is an artifact caused by the application of selection criteria.²² In this case one might suggest that the experimenters underestimated the uncertainty in their mass determination and had mistaken a slightly displaced η’(958) for a new particle.

    Image: Figure P.5. Invariant-mass distributions of the bracketed pion-nucleon pair. From Ming Ma and Colton (1971).

    Image: Figure P.6. Combined plot for the invariant mass for π⁺n combinations for Mπ+p > 2.4 GeV/c² and for π⁰p with the other proton greater than 2.4 GeV/c². The solid curve represents fits by a quadratic polynomial background plus two Breit-Wigner functions. From Ming Ma and Colton (1971).

    Image: Figure P.7. Data at 3 GeV for full target minus empty target. The fit shown includes resonance terms plus phase-space background. The inset shows the details in the region of 900 MeV. From Maglich et al. (1971).

    It is clear that the four-standard-deviation criterion for a significant effect was well established by the mid-1970s. In "Search for Heavy Narrow Resonances Produced by Photons with Energies up to 11.8 GeV," Theodosiou et al. (1976) stated that "within the sensitivity of the experiment no evidence for any narrow new resonances was found" (126).²³ They noted that "the limits quoted . . . are based on the number of events that one would have detected for a 4-standard-deviation signal on top of the background" (128; emphasis added). In order to demonstrate that their experimental apparatus and analysis procedure would have detected such an effect, they plotted both the observed e+e− mass-squared spectrum and that spectrum with a four-standard-deviation bump added to it, which was accomplished by adding 160 events from the decay of a resonance with mass squared equal to 2.15 GeV² and a width of 0.29 GeV² (figure P.8).²⁴ The added bump is quite clearly visible. The four-standard-deviation criterion was also used by Abolins and collaborators (1976) in their "Search for Narrow Two-body Enhancements" at Fermilab. Their method was to compare their observed invariant-mass spectra for two-particle states produced in neutron-beryllium interactions (π+π−, π+K−, and pp̄) with smoothed distributions generated with randomized tracks from different events. This was intended to search for ≥ 4-standard-deviation enhancements. "One such enhancement is observed in the π+K− mass distribution at mπK = 2.29 ± 0.03 GeV. . . . We note that this ‘4σ’ enhancement has a purely statistical probability of ~3%" (Abolins et al. 1976, 418). The reader may be somewhat surprised at this estimate because the probability of a four-σ deviation in a single bin is 0.0064 percent. The experimenters obtained this estimate by multiplying that probability by five hundred, the total number of bins.²⁵ In a footnote they remarked that "for comparison two ~3 standard deviation peaks are observed: pp̄ at 2.66 GeV and K−π+ at 2.42 GeV" (420). These were not noted in the text. A three-sigma effect was insufficient.
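
    The arithmetic behind the ~3% figure can be sketched as follows. This is an illustration only, assuming a two-sided Gaussian tail for the single-bin probability and independent bins; the function names are hypothetical and are not taken from Abolins et al.

```python
import math

def two_sided_tail(n_sigma):
    """Probability that a Gaussian variable deviates from its mean
    by at least n_sigma standard deviations, in either direction."""
    return math.erfc(n_sigma / math.sqrt(2.0))

def trials_corrected(p_single, n_bins):
    """Chance of at least one such fluctuation somewhere in n_bins bins.

    Returns the simple product used in the text and the value
    1 - (1 - p)^N, which assumes the bins are independent.
    """
    simple = p_single * n_bins
    independent = 1.0 - (1.0 - p_single) ** n_bins
    return simple, independent

p_bin = two_sided_tail(4.0)                  # single-bin probability
simple, independent = trials_corrected(p_bin, 500)

print(f"single bin: {100 * p_bin:.4f} percent")
print(f"500 bins (product): {100 * simple:.1f} percent")
print(f"500 bins (independent bins): {100 * independent:.1f} percent")
```

    Under these assumptions the single-bin probability comes out near 0.006 percent and the 500-bin figure near 3 percent, in line with the numbers quoted above.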

    The increasing importance of standard deviations as a criterion of significance is shown in "Search for Charmed Hadrons Using a Direct-Muon Trigger" (Bunnell et al. 1976). The experimenters noted that "figure [P.9] shows the distribution of standard deviations from the smoothed background curve for a representative data sample. While we see no significant deviations from Poisson statistics, there are three bins with probability less than one over the number of bins studied, i.e., ≤ 1.3 × 10⁻⁵" (Bunnell et al. 1976, 87). A probability of ≤ 1.3 × 10⁻⁵ corresponds to slightly more than four standard deviations. The experimenters found three such peaks and did not regard any of them as establishing the existence of a new particle, but they did note that, "because of its quantum numbers, the enhancement at m(K+K−) = 1984 MeV/c² is the most promising one for further investigation" (87).
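
    The conversion from a probability to a number of standard deviations can be checked with a short sketch. It assumes Gaussian tails, and because the text does not say whether the Bunnell et al. probability is one-sided or two-sided, both conventions are shown; the function name is hypothetical.

```python
from statistics import NormalDist

def sigma_from_p(p, two_sided=False):
    """Number of Gaussian standard deviations whose tail probability is p.

    With two_sided=True, p is split equally between the two tails.
    """
    tail = p / 2.0 if two_sided else p
    return NormalDist().inv_cdf(1.0 - tail)

p = 1.3e-5
print(f"one-sided: {sigma_from_p(p):.2f} standard deviations")
print(f"two-sided: {sigma_from_p(p, two_sided=True):.2f} standard deviations")

# "One over the number of bins studied" implies roughly this many bins.
print(f"implied number of bins: {1.0 / p:.0f}")
```

    Either convention gives a value a little above four standard deviations.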

    Image: Figure P.8. (a) The measured e+e− mass spectrum. In (b) a four-standard-deviation bump is generated by adding 160 events from the decay of a resonance of m₀² = 2.15 GeV² with Δm₀² = 0.29 GeV² to the spectrum. From Theodosiou et al. (1976).

    Image: Figure P.9. Distribution of standard deviations from a smooth background from invariant mass plots. From Bunnell et al. (1976).

    Image: Figure P.10. A plot of the parameter α indicating detection of π-μ atoms. From Coombes et al. (1976).

    There were, however, also papers that made no use of standard deviations. In a search for π-μ bound states Coombes et al. (1976) presented a graph of their results for α, the parameter of interest (figure P.10), claiming that "the data show a clear peak at the predicted point containing a total of 21 events with an estimated background of 3 events. . . . We conclude that we have observed Coulomb bound states of pions and muons" (251). The combination of the graph, the size of the signal, and the agreement with theory was sufficient to establish the existence of the effect.

    The Reign of Four Standard Deviations

    The reign of four standard deviations continued into the 1980s, albeit without the rigor that the statistical significance of a result would assume in the 2000s.²⁶ Although the four-standard-deviation criterion was acknowledged, papers were published with lower statistical significance. In some cases the number of events above background and the statistical uncertainty were given, but the number of standard deviations was neither calculated nor presented. This was left as an exercise for the reader. In addition, there was no correlation between the statistical significance of a result, the number of standard deviations of that result, and the title of the paper. Thus, we have papers titled "Evidence for" and "Observation of" that report results having the same statistical significance.

    Consider "Evidence for Two Narrow pp̄ Resonances at 2020 MeV and 2200 MeV" (Benkheiri et al. 1977). The experimenters presented graphs of their results (figure P.11) and noted that for the 2020 MeV peak "the total sample gives a signal/noise = 153/409 (~7.6 s.d.). For the 2200 peak the sample of events with a Δ⁰(1232) selection gives a signal/noise = 58/82 (~6.5 s.d.)" (485). They concluded that "we observe two narrow pp̄ peaks at 2020 and 2200 MeV, with a significance of more than 6 s.d." (485). In contemporary work this would qualify for an "Observation of" title. In examining the decay angular distributions they remarked that "the compatibility between the 2200 MeV resonance and the background exists only at 3 s.d." (485; emphasis added). The group also remarked on a suggestive peak at approximately 1930 MeV but presented only a figure (figure P.12, the small peak on the left) and no number of events or standard deviations.
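
    The quoted signal/noise figures are roughly reproduced by dividing the signal by the square root of the background. That rule is an assumption here (a common rough estimate of a Poisson fluctuation in the background), not a procedure the text attributes to Benkheiri et al.; the function name is hypothetical.

```python
import math

def s_over_sqrt_b(signal, background):
    """Rough significance: signal events divided by the expected
    Poisson fluctuation (square root) of the background count."""
    if background <= 0:
        raise ValueError("background must be positive")
    return signal / math.sqrt(background)

peaks = {
    "2020 MeV peak (153/409)": (153, 409),
    "2200 MeV peak (58/82)": (58, 82),
}
for label, (signal, background) in peaks.items():
    print(f"{label}: {s_over_sqrt_b(signal, background):.1f} s.d.")
```

    With this rule the first peak comes out close to the quoted ~7.6 s.d. and the second slightly below the quoted ~6.5 s.d., so the experimenters may have used a somewhat different estimate.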

    That same resonance at 1930 MeV was reported in "Observation of a Narrow pp̄ Enhancement at 1940 MeV" (Daum et al. 1980). The experimenters remarked that the situation with the narrow baryonium states was quite confusing. Only the 1936 MeV resonance had been seen in more than one experiment, but it had not been seen in all experiments. The group took data with beams of protons and of positive and negative pions. The resonance was seen only in the proton data. They reported a peak of 36 ± 9 events after background subtraction and observed "a 4-standard-deviation enhancement in pp̄ states produced inclusively in 93 GeV proton interactions, while there is no enhancement in the pion-induced reactions" (478). Four standard deviations was, at the time, sufficient for an observation.

    A somewhat different case was presented by Russell and collaborators (1981). In this paper the group was investigating a particle that was already regarded as well established. They were attempting to better determine its mass. The experimenters presented a graph of the combined data for pK⁰S and p̄K⁰S final states (figure P.13), and they stated that "a clear signal is seen at the charmed baryon mass. A fit of the mass distribution to a single resonance and a smooth quadratic background gives a signal of 55 ± 10 events above a background of 85 events" (Russell et al. 1981, 800). The group also commented that "for the π−π−π+ mode, where a signal has been previously observed in photoproduction, an enhancement of marginal significance is present in the inclusive mass distribution" (801; emphasis added). The number of signal events given was 140 ± 48 events, a 2.9-standard-deviation effect. This was regarded as "of marginal significance."
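
    When a paper gives only a fitted yield and its uncertainty, the reader's "exercise" is a single division. The sketch below applies it to the yields quoted above; treating the ratio as a number of standard deviations assumes the quoted uncertainty is a one-standard-deviation Gaussian error, and the function name is hypothetical.

```python
def sigma_from_yield(signal, uncertainty):
    """Number of standard deviations by which a fitted signal yield
    differs from zero, given its one-standard-deviation uncertainty."""
    if uncertainty <= 0:
        raise ValueError("uncertainty must be positive")
    return signal / uncertainty

yields = {
    "Daum et al. (1980), 36 ± 9 events": (36, 9),
    "Russell et al. (1981), 55 ± 10 events": (55, 10),
    "Russell et al. (1981), 140 ± 48 events": (140, 48),
}
for label, (signal, uncertainty) in yields.items():
    print(f"{label}: {sigma_from_yield(signal, uncertainty):.1f} s.d.")
```

    The three cases give 4.0, 5.5, and 2.9 standard deviations, respectively.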

    Image: Figure P.11. The distribution of the pp̄ invariant mass at 9 GeV/c. (a) All the events. (b) Events with invariant mass pFπ− in the Δ⁰(1232) region 1175 < M(pFπ−) < 1300 MeV. (c) Events with invariant mass pFπ− in the N⁰(1520) region 1450 < M(pFπ−) < 1600 MeV. (d) Events with invariant mass pFπ− outside the regions of (b) and (c). The full curves represent the fit of the data with a smooth background and one or two Breit-Wigner resonances. The dotted curve under the peak represents the contribution of the background. From Benkheiri et al. (1977).

    Image: Figure P.12. The distribution of the pp̄ invariant mass with a cosθ < 0 and a 1175 < M(pFπ−) < 1300 MeV selection for the sample of the two 9 + 12 GeV/c runs. From Benkheiri et al. (1977).

    Image: Figure P.13. pK⁰S + p̄K⁰S mass distribution. The shaded portion of the distribution represents the data taken with six radiation lengths of lead in the beam, normalized to equal hadron flux on the target. From Russell et al. (1981).

    Roos et al.’s (1982) "Review of Particle Properties," the Particle Data Group’s discussion of particle resonances, stated that "in general we accept such peaks if they are experimentally reliable, of high statistical significance or observed in several different production experiments" (xi). No explicit criterion was given, however, for "high statistical significance."

    Thus, we see that in the early 1980s the four-standard-deviation criterion seems to have been in effect, although it was not always explicitly used or mentioned. This continued through the 1980s. In 1986, for example, we find four papers whose titles begin "Observation of" in volume 56 of Physical Review Letters, and in three of these four papers, the significance of the results is expressed in standard deviations, although, as usual, graphs of the data and numbers of events are also included. The fourth paper, however, presented no standard deviations but showed the result in a graph.

    Image: Figure P.14. KK̄ invariant-mass distribution for the full sample of 5.8 × 10⁶ J/ψ for (a) the K+K− final state and for (b) the K⁰S K⁰S final state, where the four-pion background is shown cross-hatched. Fits to the 1.9–2.6 GeV/c² mass region are displayed in the inserts. From Baltrusaitis et al. (1986).

    In the first of these papers, "Observation of a Narrow KK̄ State in J/ψ Radiative Decays," Baltrusaitis et al. (1986) looked for such a state, named the ξ, in the decay of the J/ψ particle. For the K+K− final decay state they reported a statistical significance of ~4.5 standard deviations, whereas for the K⁰S K⁰S final state the significance was ~3.6 standard deviations. The results for both decay states were combined and are shown in figure P.14. Baltrusaitis et al. report that the significance of the signal, determined by comparing the likelihood for the fit described above (which included the new K+K− state) with that obtained for a fit containing no signal, "is found to be 4.5 standard deviations (s.d.). Other choices of background parameterization lead to fits in which the statistical significance of the ξ signal varies from 3.9 to 5.8 s.d." (109). A similar procedure for the K⁰S K⁰S final state yielded results that varied from 3 to 4.7 standard deviations. Baltrusaitis and collaborators varied the background parameters used to guard against the possibility that the observed effect was an artifact produced by the particular choice of background. This is, as discussed later, an important safeguard that is not always employed.
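
    One widely used way to turn a comparison of two fit likelihoods into a number of standard deviations is the square root of twice the difference in log-likelihood. The sketch below shows that convention only; it is an asymptotic approximation for a single added signal parameter, the text does not say that Baltrusaitis et al. used exactly this conversion, and the log-likelihood values in the example are invented for illustration.

```python
import math

def significance_from_log_likelihoods(lnl_with_signal, lnl_background_only):
    """Approximate significance, in standard deviations, from the
    maximized log-likelihoods of fits with and without a signal term."""
    delta = lnl_with_signal - lnl_background_only
    if delta < 0:
        raise ValueError("the fit with a signal term should not be worse")
    return math.sqrt(2.0 * delta)

def delta_lnl_for(n_sigma):
    """Log-likelihood improvement that corresponds to n_sigma
    under the same convention."""
    return 0.5 * n_sigma ** 2

# Hypothetical log-likelihood values, chosen only to illustrate the call.
print(significance_from_log_likelihoods(-100.0, -110.125))

# Improvement implied by the quoted range of 3.9 to 5.8 s.d.
for n_sigma in (3.9, 4.5, 5.8):
    print(f"{n_sigma} s.d. -> delta lnL = {delta_lnl_for(n_sigma):.2f}")
```

    Refitting with different background shapes changes the background-only likelihood, which is why the quoted significance moves when the background parameterization is varied.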

    In "Observation of a New Charmed Meson" (Albrecht et al. 1986), the group reported a new state, the D*⁰(2420), which decayed into D*±(2010)π∓. For one set of selection criteria, they noted that "a prominent peak is seen around 410 MeV. A Breit-Wigner form for the signal, plus a threshold factor times a second-order polynomial for the background were fitted to the mass difference distribution. . . . The statistical significance of the enhancement is 3.9 standard deviations" (550). They combined the data for the two final states and remarked that "the combined significance of the effect is 4.9 standard deviations" (551) (figure P.15).

    Similarly, in "Observation of a Narrow Enhancement in ΦKK and Φππ Final States Produced in 400-GeV p-N Interactions" (Green et al. 1986), the experimenters used both χ² analysis and standard-deviation analysis. For the ΦKK final state they state that "there is a clear enhancement in ΦKK. . . . The chi squared per degree of freedom (χ²/DF) is 51/36 [note that this has a probability of 5 percent of getting a worse fit to their data] with a 4.3 standard-deviation (σ) excess of 222 ± 52 events above background" (1640). For Φππ they found a χ²/DF = 35/39 (note that this has a probability of 65 percent of getting a worse fit to their data) and a 3.9-standard-deviation excess of 213 ± 55 events in their peak. Graphs of their data were also presented.
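
    The bracketed probabilities of getting a worse fit are upper-tail probabilities of the χ² distribution. The sketch below evaluates them with a standard power series for the regularized incomplete gamma function, using only the Python standard library; the function name is hypothetical, and the code is a check on the quoted percentages rather than the experimenters' procedure.

```python
import math

def chi2_survival(chi2, dof):
    """Probability that a chi-square variable with dof degrees of
    freedom exceeds chi2, i.e. the chance of a worse fit."""
    a = dof / 2.0
    x = chi2 / 2.0
    if x <= 0:
        return 1.0
    # Series: gamma(a, x) = x^a e^(-x) * sum_n x^n / (a (a+1) ... (a+n))
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= x / (a + n)
        total += term
    log_prefactor = a * math.log(x) - x - math.lgamma(a)
    lower = total * math.exp(log_prefactor)  # regularized lower tail
    return 1.0 - lower

for label, chi2, dof in (("ΦKK", 51, 36), ("Φππ", 35, 39)):
    print(f"{label}: chi2/DF = {chi2}/{dof}, "
          f"P(worse fit) = {100 * chi2_survival(chi2, dof):.0f} percent")
```

    The two cases give about 5 percent and about 65 percent, matching the bracketed notes.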

    Image: Figure P.15. Distribution of the mass difference m(D*+(2010) π−) – m(D*+(2010)). From Albrecht et al. (1986).

    Image: Figure P.16. (a) The φπ mass spectrum for Pφπ > 2.5 GeV/c for the combined ϒ(4S) and continuum data samples. (b) The φπ mass spectrum for Pφπ < 2.5 GeV/c for the ϒ(4S) sample. From Haas et al. (1986).

    In "Observation of the Decay B → FX" (Haas et al. 1986), however, neither the statistical significance of the result nor the number of events in their peak was given. Clear peaks shown in figure P.16 were deemed sufficient to establish the existence of the
