The following list of papers discussing null hypothesis significance testing as a method of inference covers the period 2001-2011. The list is intended to supplement previous compilations through 1997 by Marks Nester at http://warnercnr.colostate.edu/~anderson/nester.html, and another through 2001 by Bill Thompson at http://warnercnr.colostate.edu/~anderson/thompson1.html. The list includes a few papers prior to 2001 that did not appear in either of these two previous compilations. The list may not be complete, either because a paper was simply missed, or because the primary focus of a paper did not seem to be relevant to significance testing. Internet links are provided where available. Abstracts are usually available at no charge, but some sites require a fee or subscription for the full text.

Anderson, D. R. (2008). *Model Based Inference in the Life Sciences: A Primer on Evidence*. New York: Springer.

Anttonen, R. G. (1970). "The significance of the null." *The Journal of Educational Research* **63**(10): 438-440.

Balluerka Lasa, N., A. I. Vergara Iraeta, et al. (2009). "Calculating the main alternatives to null-hypothesis-significance testing in between-subject experimental designs." *Psicothema* **21**(1): 141-151.

Balluerka, N., J. Gómez, et al. (2005). "The controversy over null hypothesis significance testing revisited." *Methodology: European Journal of Research Methods for the Behavioral and Social Sciences* **1**(2): 55-70.

Beaulieu-Prévost, D. (2007). "Statistical decision and falsification in science: Going beyond the null hypothesis." In *Cognitive Decision-Making: Empirical and Foundational Issues*, ed. Hardy-Vallee, B. Cambridge: Cambridge Scholars Publishing.

Beninger, P. G., I. Boldina, and S. Katsanevakis (2012). "Strengthening statistical usage in marine ecology." *Journal of Experimental Marine Biology and Ecology* **426-427**: 97-108.

Berger, J. O. (2003). "Could Fisher, Jeffreys and Neyman have agreed on testing?" *Statistical Science* **18**(1): 1-32.

Blaich, C. F. (1998). "The null-hypothesis significance-test procedure: Can't live with it, can't live without it." *Behavioral and Brain Sciences* **21**(2): 194-195.

Bonett, D., and T. Wright (2007). "Comments and recommendations regarding the hypothesis testing controversy." *Journal of Organizational Behavior* **28**(6): 647-659.

Bookstein, F. (1998). "Statistical significance testing was not meant for weak corroborations of weaker theories." *Behavioral and Brain Sciences* **21**(2): 195-196.

Brosi, B. J., and E. G. Biber (2009). "Statistical inference, Type II error, and decision making under the U.S. Endangered Species Act." *Frontiers in Ecology and the Environment* **7**(9): 487-494.

Buckland, S. T., D. R. Anderson, K. P. Burnham, J. L. Laake, D. L. Borchers and L. Thomas (2001). *Introduction to Distance Sampling: Estimating Abundance of Biological Populations*. New York: Oxford University Press.

Burnham, K. P., and D. R. Anderson (2002). *Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach*. New York: Springer.

Butcher, J. A., J. E. Groce, et al. (2007). "Persistent controversy in statistical approaches in wildlife sciences: a perspective of students." *The Journal of Wildlife Management* **71**(7): 2142-2144.

Camp, R. J., N. E. Seavy, et al. (2008). "A statistical test to show negligible trend: comment." *Ecology* **89**(5): 1469-1472.

Chinipardaz, R., and A. Abtahi (2008). "Testing a point null hypothesis: The comparison of *p*-Values and Bayesian evidence in multivariate normal distribution." *Pakistan Journal of Statistics* **24**(2): 123-133.

Chisholm, R., and R. Taylor (2007). "Null-hypothesis significance testing and the critical weight range for Australian mammals." *Conservation Biology* **21**(6): 1641-1645.

Chow, S. (1988). "Significance test or effect size?" *Psychological Bulletin* **103**(1): 105-110.

Chow, S. (1998). "The null-hypothesis significance-test procedure is still warranted." *Behavioral and Brain Sciences* **21**(2): 228-235.

Chow, S. L. (1998). "Précis of Statistical significance: Rationale, validity, and utility." *Behavioral and Brain Sciences* **21**: 169-239.

Cole, R., and G. McBride (2004). "Assessing impacts of dredge spoil disposal using equivalence tests: implications of a precautionary (proof of safety) approach." *Marine Ecology Progress Series* **279**: 63-72.

Cormack, R. M. (1988). "Statistical challenges in the environmental sciences: A personal view." *Journal of the Royal Statistical Society, Series A* **151**: 201-210.

Cowgill, G. (1977). "The trouble with significance tests and what we can do about it." *American Antiquity* **42**(3): 350-368.

Dahiru, T. (2008). "P-value, a true test of statistical significance? A cautionary note." *Annals of Ibadan Postgraduate Medicine* **6**: 21-26.

Dar, R. (1998). "Null hypothesis tests and theory corroboration: Defending NHSTP out of context." *Behavioral and Brain Sciences* **21**(2): 196-197.

Denis, D. (2003). "Alternatives to null hypothesis significance testing." *Theory & Science* **4**(1).

D'Errico, G. E. (2009). "Issues in significance testing." *Measurement* **42**(10): 1478-1481.

Dienes, Z. (2011). "Bayesian versus orthodox statistics: Which side are you on?" *Perspectives on Psychological Science* **6**(3): 274-290.

Dixon, P. M., and J. H. K. Pechmann (2008). "A statistical test to show negligible trend: A reply." *Ecology* **89**(5): 1473.

Eberhardt, L. L. (2003). "What should we do about hypothesis testing?" *The Journal of Wildlife Management* **67**(2): 241-247.

Eguchi, T., and T. Gerrodette (2009). "A Bayesian approach to line-transect analysis for estimating abundance." *Ecological Modelling* **220**: 1620-1630.

Erwin, E. (1998). "The logic of null hypothesis testing." *Behavioral and Brain Sciences* **21**(2): 197-198.

Fernandez-Duque, E. (1997). "Comparing and combining data across studies: Alternatives to significance testing." *Oikos* **79**(3): 616-618.

Fidler, F. (2002). "The fifth edition of the APA Publication Manual: Why its statistics recommendations are so controversial." *Educational and Psychological Measurement* **62**(5): 749-770.

Fidler, F., M. A. Burgman, et al. (2006). "Impact of criticism of null-hypothesis significance testing on statistical reporting practices in conservation biology." *Conservation Biology* **20**(5): 1539-1544.

Finch, S., G. Cumming, et al. (2001). "Reporting of statistical inference in the Journal of Applied Psychology: Little evidence of reform." *Educational and Psychological Measurement* **61**(2): 181-210.

Fraley, R., and M. Marks (2007). "The null hypothesis significance-testing debate and its implications for personality research." In *Handbook of Research Methods in Personality Psychology*, ed. Robins, R. W., R. C. Fraley and R. F. Krueger, 149-169. New York: The Guilford Press.

Freedman, D. A. (1983). "A note on screening regression equations." *The American Statistician* **37**: 152-155.

Frick, R. (1998). "Chow's defense of null-hypothesis testing: Too traditional?" *Behavioral and Brain Sciences* **21**(2): 199.

Gelman, A., J. B. Carlin, H. S. Stern and D. B. Rubin (2004). *Bayesian Data Analysis*. Boca Raton, FL: Chapman & Hall/CRC.

Gelman, A., and H. Stern (2006). "The difference between 'significant' and 'not significant' is not itself statistically significant." *The American Statistician* **60**(4): 328-331.

Gerrodette, T. (2011). "Inference without significance: Measuring support for hypotheses rather than rejecting them." *Marine Ecology* **32**: 404-418.

Gerrodette, T., B. L. Taylor, R. Swift, S. Rankin, L. A. Jaramillo and L. Rojas-Bracho (2011). "A combined visual and acoustic estimate of 2008 abundance, and change in abundance since 1997, for the vaquita, *Phocoena sinus*." *Marine Mammal Science* **27**(2): E79-E100.

Gibbons, J. M., N. M. J. Crout, et al. (2007). "What role should null-hypothesis significance tests have in statistical education and hypothesis falsification?" *Trends in Ecology and Evolution* **22**(9): 445-446.

Glück, J., and O. Vitouch (1998). "Stranded statistical paradigms: The last crusade." *Behavioral and Brain Sciences* **21**(2): 200-201.

Good, I. J. (1985). "Tail-area probabilities and Bayes factors as distances from the null hypothesis." *Journal of Statistical Computation and Simulation* **20**(4): 325-325.

Good, I. J. (1958). "Significance tests in parallel and in series." *Journal of the American Statistical Association* **53**: 799-813.

Goodie, A. S. (2004). "Null hypothesis statistical testing and the balance between positive and negative approaches." *Behavioral and Brain Sciences* **27**(3): 338-339.

Goodman, S. N. (2001). "Of *p*-values and Bayes: A modest proposal." *Epidemiology* **12**(3): 295-297.

Goodman, D. (2004a). "Methods for joint inference from multiple data sources for improved estimates of population size and survival rates." *Marine Mammal Science* **20**(3): 401-423.

Goodman, D. (2004b). "Taking the prior seriously: Bayesian analysis without subjective probability." In *The Nature of Scientific Evidence: Statistical, Philosophical, and Empirical Considerations*, ed. M. L. Taper and S. R. Lele. Chicago: University of Chicago Press.

Goodman, S. N., and S. Greenland (2007). "Why most published research findings are false: Problems in the analysis." *PLoS Medicine* **4**(4): e168.

Gray, B. R., and M. M. Burlew (2007). "Estimating trend precision and power to detect trends across grouped count data." *Ecology* **88**(9): 2364-2372.

Gregg, A. P., and C. Sedikides (2004). "Is social psychological research really so negatively biased?" *Behavioral and Brain Sciences* **27**(3): 340-341.

Guthery, F. S. (2008). "Statistical ritual versus knowledge accrual in wildlife science." *Journal of Wildlife Management* **72**(8): 1872-1875.

Guthery, F. S., J. J. Lusk, et al. (2001). "The fall of the null hypothesis: Liabilities and opportunities." *The Journal of Wildlife Management* **65**(3): 379-384.

Hagen, R. (1997). "In praise of the null hypothesis significance test." *American Psychologist* **52**(1): 15-24.

Hamilton, W. I. (1993). "Testing the Null Hypothesis." *Journal of Forestry* **91**(1): 5.

Harris, R. (1998). "'With friends like this...': Three flaws in Chow's defense of significance testing." *Behavioral and Brain Sciences* **21**(2): 202-203.

Hobbs, N. T., and R. Hilborn (2006). "Alternatives to statistical hypothesis testing in ecology: A guide to self teaching." *Ecological Applications* **16**(1): 5-19.

Hobbs, N. T., S. Twombly, et al. (2006). "Deepening ecological insights using contemporary statistics." *Ecological Applications* **16**(1): 3-4.

Hoekstra, R., S. Finch, et al. (2006). "Probability as certainty: Dichotomous thinking and the misuse of *p* values." *Psychonomic Bulletin & Review* **13**(6): 1033-1037.

Hoenig, J. M., and D. M. Heisey (2001). "The abuse of power: The pervasive fallacy of power calculations for data analysis." *The American Statistician* **55**(1): 19-24.

Hunter, J. (1998). "Testing significance testing: A flawed defense." *Behavioral and Brain Sciences* **21**(2): 204.

Hurlbert, S. H., and C. M. Lombardi (2009). "Final collapse of the Neyman-Pearson decision theoretic framework and rise of the neoFisherian." *Annales Zoologici Fennici* **46**(5): 311-349.

Ioannidis, J. P. A. (2005). "Why most published research findings are false." *PLoS Medicine* **2**(8): e124.

Ioannidis, J. P. A. (2007). "Why most published research findings are false: Author's reply to Goodman and Greenland." *PLoS Medicine* **4**(6): e215.

Jaramillo-Legorreta, A., L. Rojas-Bracho, R. L. Brownell, Jr., A. J. Read, R. R. Reeves, K. Ralls and B. L. Taylor (2007). "Saving the vaquita: Immediate action, not more data." *Conservation Biology* **21**(6): 1653-1655.

Johnson, D. H. (2002). "The role of hypothesis testing in wildlife science." *The Journal of Wildlife Management* **66**(2): 272-276.

Jones, L., and J. Tukey (2000). "A sensible formulation of the significance test." *Psychological Methods* **5**(4): 411-414.

Kelley, J. (2009). "The perils of *p*-values: Why tests of statistical significance impede the progress of research." *Handbook of Evidence-Based Psychodynamic Psychotherapy*: 367-377.

Kihlstrom, J. (1998). "If you've got an effect, test its significance; if you've got a weak effect, do a meta-analysis." *Behavioral and Brain Sciences* **21**(2): 205-206.

Kluger, A. N., and J. Tikochinsky (2001). "The error of accepting the 'theoretical' null hypothesis: The rise, fall, and resurrection of commonsense hypotheses in psychology." *Psychological Bulletin* **127**(3): 408-423.

Krueger, L. (1998). "The Ego has landed! The .05 level of statistical significance is soft (Fisher) rather than hard (Neyman/Pearson)." *Behavioral and Brain Sciences* **21**(2): 207-208.

Läärä, E. (2009). "Statistics: reasoning on uncertainty, and the insignificance of testing null." *Annales Zoologici Fennici* **46**(2): 138-157.

Lecoutre, B., M.-P. Lecoutre, et al. (2001). "Uses, abuses and misuses of significance tests in the scientific community: Won't the Bayesian choice be unavoidable?" *International Statistical Review* **69**(3): 399-417.

Lee, M. D., and K. J. Pope (2006). "Model selection for the rate problem: A comparison of significance testing, Bayesian, and minimum description length statistical inference." *Journal of Mathematical Psychology* **50**(2): 193-202.

Lee, M. D., and E.-J. Wagenmakers (2005). "Bayesian statistical inference in psychology: Comment on Trafimow (2003)." *Psychological Review* **112**(3): 662-668.

Lew, M. J. (2006). "Principles: When there should be no difference - how to fail to reject the null hypothesis." *Trends in Pharmacological Sciences* **27**(5): 274-278.

Link, W. A., and R. J. Barker (2010). *Bayesian Inference with Ecological Applications*. New York: Academic Press.

Loftus, G. R. (1996). "Psychology will be a much better science when we change the way we analyze data." *Current Directions in Psychological Science* **5**(6): 161-171.

Lombardi, C. M., and S. H. Hurlbert (2009). "Misprescription and misuse of one-tailed tests." *Austral Ecology* **34**(4): 447-468.

Luft, H. S. (2000). "Identifying and assessing the null hypothesis." *Health Services Research* **34**(6): 1265-1271.

Mackintosh, N. J. (1987). "From null hypothesis to null dogma." *Behavioral and Brain Sciences* **10**(4): 689-695.

Malgady, R. (2000). "Myths about the null hypothesis and the path to reform." In *Handbook of Cross-cultural and Multicultural Personality Assessment*, ed. R. H. Dana, 49-62. Mahwah, NJ: Lawrence Erlbaum Associates.

Martien, K. K., and B. L. Taylor (2003). "Limitations of hypothesis-testing in defining management units for continuously distributed species." *Journal of Cetacean Research and Management* **5**(3): 213-218.

Martínez del Rio, C., S. W. Buskirk, et al. (2007). "Response to Gibbons *et al*.: Null-hypothesis significance tests in education and inference." *Trends in Ecology and Evolution* **22**(9): 446.

McBride, G. B. (2002). "Statistical methods helping and hindering environmental science and management." *Journal of Agricultural, Biological, and Environmental Statistics* **7**(3): 300-305.

McIlroy, D. R. (2005). "Failing to reject the null hypothesis does not mean that the null hypothesis is true." *Anesthesia and Analgesia* **100**(6): 1868-1869.

Meeks, S. L., and R. B. D'Agostino (1983). "A note on the use of confidence-limits following rejection of a null hypothesis." *The American Statistician* **37**(2): 134-136.

Mogie, M. (2004). "In support of null hypothesis significance testing." *Proceedings: Biological Sciences*. London: The Royal Society. **271**(3): S82-S84.

Morrison, D. E., and R. Henkel, Eds. (1970). *The Significance Test Controversy: A Reader*. New Brunswick, NJ: Transaction Publishers (reprinting).

Mundry, R., and C. L. Nunn (2009). "Stepwise model fitting and statistical inference: Turning noise into signal pollution." *The American Naturalist* **173**(1): 119-123.

Nakagawa, S., and I. C. Cuthill (2007). "Effect size, confidence interval and statistical significance: A practical guide for biologists." *Biological Reviews* **82**(4): 591-605.

Nester, M. R. (1998). "Significance tests cannot be justified in theory-corroboration experiments." *Behavioral and Brain Sciences* **21**(2): 213.

Ngatia, M., D. Gonzalez, et al. (2010). "Equivalence versus classical statistical tests in water quality assessments." *Journal of Environmental Monitoring* **12**(1): 172-177.

Nicholls, N. (2001). "The insignificance of significance testing." *Bulletin of the American Meteorological Society* **82**(5): 981-986.

Oakes, W. F. (1975). "On alleged falsity of null hypothesis." *Psychological Record* **25**(2): 265-272.

Paley, J., H. Cheyne, et al. (2008). "The null hypothesis: A reply." *Journal of Advanced Nursing* **64**(2): 209-210.

Palm, G. (1998). "Significance testing–does it need this defence?" *Behavioral and Brain Sciences* **21**(2): 214-215.

Peres-Neto, P. (1999). "How many statistical tests are too many? The problem of conducting multiple ecological inferences revisited." *Marine Ecology Progress Series* **176**: 303-306.

Poitevineau, J., and B. Lecoutre (1998). "Some statistical misconceptions in Chow's statistical significance." *Behavioral and Brain Sciences* **21**(2): 215.

Rigby, A. (1999). "Getting past the statistical referee: Moving away from *p*-values and towards interval estimation." *Health Education Research* **14**(6): 713.

Rindskopf, D. (1998). "Null-hypothesis tests are not completely stupid, but Bayesian statistics are better." *Behavioral and Brain Sciences* **21**(2): 215-216.

Robinson, D. H., and H. Wainer (2002). "On the past and future of null hypothesis significance testing." *The Journal of Wildlife Management* **66**(2): 263-271.

Rojas-Bracho, L., R. R. Reeves and A. Jaramillo-Legorreta (2006). "Conservation of the vaquita, *Phocoena sinus*." *Mammal Review* **36**(3): 179-216.

Salsburg, D. (2001). *The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century*. New York: W. H. Freeman / Holt Paperbacks.

Schwann, N. M., and J. Horrow (2005). "Failing to reject the null hypothesis does not mean that the null hypothesis is true - In response." *Anesthesia and Analgesia* **100**(6): 1869.

Scialli, A. R. (1992). "Confidence and the null hypothesis." *Reproductive Toxicology* **6**(5): 383-384.

Sedlmeier, P. (2009). "Beyond the significance test ritual." *Zeitschrift für Psychologie/Journal of Psychology* **217**(1): 1-5.

Sellke, T., M. J. Bayarri, et al. (2001). "Calibration of *p* values for testing precise null hypotheses." *The American Statistician* **55**(1): 62-71.

Shrader-Frechette, K. (2011). "Randomization and rules for causal inferences in biology: When the biological emperor (significance testing) has no clothes." *Biological Theory* **6**: 154-161.

Shrout, P. E. (1997). "Should significance tests be banned? Introduction to a special section exploring the pros and cons." *Psychological Science* **8**: 1-2.

Siegfried, T. (2010). "Odds are, it's wrong." *Science News* **177**(7): 26.

Silva-Aycaguer, L. C., P. Suarez-Gil, et al. (2010). "The null hypothesis significance test in health sciences research (1995-2006): Statistical analysis and interpretation." *BMC Medical Research Methodology* **10**: 44.

Simonne, E., M. Ozores-Hampton, et al. (2007). "So, you wanted to accept the null hypothesis? Analysis and interpretation of fertilizer trials in the BMP era." *HortScience* **42**(3): 440.

Smedslund, G. (2008). "All bachelors are unmarried men (*p* < 0.05)." *Quality & Quantity* **42**(1): 53-73.

Sohn, D. (2000). "Does the finding of statistical significance justify the rejection of the null hypothesis?" *Behavioral and Brain Sciences* **23**(2): 293-294.

Soper, H. V., D. V. Cicchetti, et al. (1988). "Null hypothesis disrespect in neuropsychology: Dangers of alpha and beta errors." *Journal of Clinical and Experimental Neuropsychology* **10**(2): 255-270.

Stallings, W., and S. Singhal (1969). "Confidence level and significance level: Semantic confusion or logical fallacy." *The Journal of Experimental Education* **37**(4): 57-59.

Stam, H., and G. Pasay (1998). "The historical case against null-hypothesis significance testing." *Behavioral and Brain Sciences* **21**(2): 219-220.

Stephens, P. A., S. W. Buskirk, et al. (2006). "Inference in ecology and evolution." *Trends in Ecology and Evolution* **22**(4): 192-197.

Sterne, J. A. C., and G. Davey Smith (2001). "Sifting the evidence - what's wrong with significance tests?" *Physical Therapy* **81**(8): 1464-1469.

Stürzebecher, E., M. Cebulla, et al. (2005). "Automated auditory response detection: Statistical problems with repeated testing." *International Journal of Audiology* **44**(2): 110-117.

Tachibana, T. (1982). "A comment on confusion in open-field studies: Abuse of null-hypothesis significance test." *Physiology & Behavior* **29**(1): 159-161.

Taper, M. L., and S. R. Lele (2004). *The Nature of Scientific Evidence: Statistical, Philosophical, and Empirical Considerations*. Chicago: The University of Chicago Press.

Tassinary, L. (1998). "Significance tests: Necessary but not sufficient." *Behavioral and Brain Sciences* **21**(2): 221-222.

Thomas, L., S. T. Buckland, E. A. Rexstad, J. L. Laake, S. Strindberg, S. L. Hedley, J. R. B. Bishop, T. A. Marques and K. P. Burnham (2010). "Distance software: Design and analysis of distance sampling surveys for estimating population size." *Journal of Applied Ecology* **47**: 5-14.

Thompson, C. F., and A. J. Neill (1993). "Statistical power and accepting the null hypothesis." *Animal Behaviour* **46**(5): 1012.

Trafimow, D. (2003). "Hypothesis testing and theory evaluation at the boundaries: Surprising insights from Bayes's theorem." *Psychological Review* **110**(3): 526-535.

Trout, J. D. (1999). "Measured realism and statistical inference: An explanation for the fast progress of 'hard' psychology." *Philosophy of Science* **66**(3): S260-S272.

Turan, F. N., and M. Senocak (2007). "Evaluating 'superiority', 'equivalence' and 'non-inferiority' in clinical trials." *Annals of Saudi Medicine* **27**(4): 284-288.

Vokey, J. (1998). "Statistics without probability: Significance testing as typicality and exchangeability in data analysis." *Behavioral and Brain Sciences* **21**(2): 225-226.

Wagenmakers, E.-J. (2007). "A practical solution to the pervasive problems of *p* values." *Psychonomic Bulletin & Review* **14**(5): 779-804.

Whittingham, M. J., P. A. Stephens, et al. (2006). "Why do we still use stepwise modelling in ecology and behaviour?" *Journal of Animal Ecology* **75**(5): 1182-1189.

Zumbo, B. (1998). "A viable alternative to null-hypothesis testing." *Behavioral and Brain Sciences* **21**(2): 227-228.