JACIII Vol.10 No.3 pp. 260-264
doi: 10.20965/jaciii.2006.p0260
(2006)

Paper:

Testing Hypotheses on Simulated Data: Why Traditional Hypotheses-Testing Statistics Are Not Always Adequate for Simulated Data, and How to Modify Them

Richard Aló*, Vladik Kreinovich**, and Scott A. Starks**

*Center for Computational Sciences and Advanced Distributed Simulation, University of Houston-Downtown, One Main Street, Houston, TX 77002, USA

**Pan-American Center for Earth and Environmental Studies, University of Texas at El Paso, El Paso, TX 79968, USA

Received: February 22, 2005
Accepted: December 21, 2005
Published: May 20, 2006
Keywords: hypothesis testing, simulated data
Abstract

To check whether a new algorithm is better, researchers use traditional statistical techniques for hypotheses testing. In particular, when the results are inconclusive, they run more and more simulations ($n_2 > n_1$, $n_3 > n_2$, …, $n_m > n_{m-1}$) until the results become conclusive. In this paper, we point out that these results may be misleading. Indeed, in the traditional approach, we select a statistic and then choose a threshold for which the probability of this statistic “accidentally” exceeding this threshold is smaller than, say, 1%. It is very easy to run additional simulations with ever-larger $n$. The probability of error is still 1% for each $n_i$, but the probability that we reach an erroneous conclusion for at least one of the values $n_i$ increases as $m$ increases. In this paper, we design new statistical techniques oriented towards experiments on simulated data, techniques that would guarantee that the error stays under, say, 1% no matter how many experiments we run.
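The effect described in the abstract can be illustrated with a small simulation. The sketch below is not the technique developed in the paper. It assumes a one-sided z-test on unit-variance normal data with no true difference between the algorithms, and it compares two threshold schedules: the same 1% level at every checkpoint, and a simple halving schedule $\alpha_i = \alpha / 2^i$, whose levels sum to less than $\alpha$, so that the union bound keeps the overall error below $\alpha$. The function names, the sample sizes, and the halving schedule are all illustrative assumptions.

```python
import random
from statistics import NormalDist


def any_error_if_independent(alpha: float, m: int) -> float:
    """Chance of at least one false alarm in m tests, IF the tests were independent."""
    return 1.0 - (1.0 - alpha) ** m


def halving_schedule(alpha: float, m: int) -> list[float]:
    """Hypothetical per-checkpoint levels alpha/2, alpha/4, ...; their sum is below alpha."""
    return [alpha / 2 ** (i + 1) for i in range(m)]


def false_positive_rate(sample_sizes, alphas, trials=2000, seed=0):
    """Fraction of null experiments that look 'significant' at any checkpoint.

    Each trial draws N(0, 1) observations (so the null hypothesis is true) and
    checks the one-sided z statistic sum / sqrt(n) at every size in sample_sizes.
    """
    rng = random.Random(seed)
    # One-sided critical value for each checkpoint's level.
    cuts = [NormalDist().inv_cdf(1.0 - a) for a in alphas]
    hits = 0
    for _ in range(trials):
        total, n, rejected = 0.0, 0, False
        for n_i, cut in zip(sample_sizes, cuts):
            while n < n_i:  # extend the same run up to the next checkpoint
                total += rng.gauss(0.0, 1.0)
                n += 1
            if total / n ** 0.5 > cut:
                rejected = True  # keep drawing so both schedules see identical data
        hits += rejected
    return hits / trials


if __name__ == "__main__":
    alpha = 0.01
    sizes = [10 * 2 ** i for i in range(7)]  # n_1 < n_2 < ... < n_m
    print("independent-tests formula:", any_error_if_independent(alpha, len(sizes)))
    print("fixed 1% at every n_i:", false_positive_rate(sizes, [alpha] * len(sizes)))
    print("halving schedule:", false_positive_rate(sizes, halving_schedule(alpha, len(sizes))))
```

Because both schedules are evaluated on the same random stream, and every halved level is stricter than the fixed 1% level, the halving schedule can never flag more runs than the fixed one. The price is lower power at later checkpoints, which is the kind of trade-off a dedicated procedure for simulated data would aim to manage better.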

Cite this article as:
Richard Aló, Vladik Kreinovich, and Scott A. Starks, “Testing Hypotheses on Simulated Data: Why Traditional Hypotheses-Testing Statistics Are Not Always Adequate for Simulated Data, and How to Modify Them,” J. Adv. Comput. Intell. Intell. Inform., Vol.10, No.3, pp. 260-264, 2006.

Last updated on Jun. 24, 2021