JACIII Vol.10 No.3 pp. 260-264
doi: 10.20965/jaciii.2006.p0260


Testing Hypotheses on Simulated Data: Why Traditional Hypotheses-Testing Statistics Are Not Always Adequate for Simulated Data, and How to Modify Them

Richard Aló*, Vladik Kreinovich**, and Scott A. Starks**

*Center for Computational Sciences and Advanced Distributed Simulation, University of Houston-Downtown, One Main Street, Houston, TX 77002, USA

**Pan-American Center for Earth and Environmental Studies, University of Texas at El Paso, El Paso, TX 79968, USA

Received: February 22, 2005
Accepted: December 21, 2005
Published: May 20, 2006
Keywords: hypothesis testing, simulated data

To check whether a new algorithm is better, researchers use traditional statistical techniques for hypotheses testing. In particular, when the results are inconclusive, they run more and more simulations (n_2 > n_1, n_3 > n_2, …, n_m > n_{m-1}) until the results become conclusive. In this paper, we point out that these results may be misleading. Indeed, in the traditional approach, we select a statistic and then choose a threshold for which the probability of this statistic “accidentally” exceeding this threshold is smaller than, say, 1%. It is very easy to run additional simulations with ever-larger n. The probability of error is still 1% for each n_i, but the probability that we reach an erroneous conclusion for at least one of the values n_i increases as m increases. In this paper, we design new statistical techniques oriented towards experiments on simulated data, techniques that would guarantee that the error stays under, say, 1% no matter how many experiments we run.
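The effect described in the abstract can be illustrated with a small Monte Carlo sketch. It is not the technique designed in the paper, which the abstract does not spell out. It assumes data drawn under the null hypothesis (standard normal, known unit variance), a one-sided z-test, and sample sizes that double at each look; the function name `false_positive_rate` and all parameter values are hypothetical choices for illustration. The "corrected" variant uses a standard union-bound split of the error budget (alpha/2, alpha/4, ...), whose per-look levels sum to less than alpha however many looks are taken.

```python
import random
from statistics import NormalDist


def false_positive_rate(sizes, alphas, trials=4000, seed=0):
    """Estimate P(at least one look rejects H0) when H0 is true.

    sizes  -- increasing sample sizes n_1 < n_2 < ... < n_m
    alphas -- one-sided significance level used at each look
    """
    rng = random.Random(seed)
    # Critical z-value for each look (upper-tail test).
    thresholds = [NormalDist().inv_cdf(1.0 - a) for a in alphas]
    rejections = 0
    for _ in range(trials):
        total, count = 0.0, 0
        for n, z_crit in zip(sizes, thresholds):
            # Extend the same sample up to n points, as when
            # "running more simulations" after an inconclusive result.
            total += sum(rng.gauss(0.0, 1.0) for _ in range(n - count))
            count = n
            # z-statistic for mean 0 and unit variance: sum / sqrt(n).
            if total / count ** 0.5 > z_crit:
                rejections += 1
                break  # the experimenter stops at the first "significant" look
    return rejections / trials


alpha = 0.01
sizes = [10 * 2 ** i for i in range(6)]  # assumed schedule: 10, 20, ..., 320

# Traditional approach: the same 1% threshold at every look.
naive = false_positive_rate(sizes, [alpha] * len(sizes))

# Union-bound split (an assumed, generic correction): levels sum to < alpha.
split = [alpha / 2 ** (i + 1) for i in range(len(sizes))]
corrected = false_positive_rate(sizes, split)

print(f"same threshold at every look: {naive:.4f}")
print(f"geometric split of alpha:     {corrected:.4f}")
```

Under these assumptions, the first estimate should come out well above the nominal 1%, while the second should stay near or below it. The geometric split is conservative and costs statistical power at later looks, which is one reason a purpose-built technique can be preferable.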

Cite this article as:
Richard Aló, Vladik Kreinovich, and Scott A. Starks, “Testing Hypotheses on Simulated Data: Why Traditional Hypotheses-Testing Statistics Are Not Always Adequate for Simulated Data, and How to Modify Them,” J. Adv. Comput. Intell. Intell. Inform., Vol.10, No.3, pp. 260-264, 2006.
