Testing Hypotheses on Simulated Data: Why Traditional Hypotheses-Testing Statistics Are Not Always Adequate for Simulated Data, and How to Modify Them

Richard Aló; Vladik Kreinovich; Scott A. Starks

doi:10.20965/jaciii.2006.p0260


JACIII Vol.10 No.3 pp. 260-264


(2006)

Paper:


Testing Hypotheses on Simulated Data: Why Traditional Hypotheses-Testing Statistics Are Not Always Adequate for Simulated Data, and How to Modify Them

Richard Aló*, Vladik Kreinovich**, and Scott A. Starks**

*Center for Computational Sciences and Advanced Distributed Simulation, University of Houston-Downtown, One Main Street, Houston, TX 77002, USA

**Pan-American Center for Earth and Environmental Studies, University of Texas at El Paso, El Paso, TX 79968, USA

Received:

February 22, 2005

Accepted:

December 21, 2005

Published:

May 20, 2006

Keywords:

hypothesis testing, simulated data

Abstract

To check whether a new algorithm is better, researchers use traditional statistical techniques for hypotheses testing. In particular, when the results are inconclusive, they run more and more simulations (n₂ > n₁, n₃ > n₂, ..., n_m > n_{m-1}) until the results become conclusive. In this paper, we point out that these results may be misleading. Indeed, in the traditional approach, we select a statistic and then choose a threshold for which the probability of this statistic “accidentally” exceeding this threshold is smaller than, say, 1%. It is very easy to run additional simulations with ever-larger n. The probability of error is still 1% for each n_i, but the probability that we reach an erroneous conclusion for at least one of the values n_i increases as m increases. In this paper, we design new statistical techniques oriented towards experiments on simulated data, techniques that would guarantee that the error stays under, say, 1% no matter how many experiments we run.
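The effect described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' procedure; it is only a minimal illustration under stated assumptions: the differences between two algorithms are drawn as N(0, 1) noise (so the null hypothesis is true), a one-sided z-test with known unit variance is applied at each look, the sample sizes follow a hypothetical doubling schedule, and the stand-in correction uses per-look levels α/2, α/4, ... whose sum stays below α (a union-bound argument). All function and variable names are illustrative.

```python
import math
import random
from statistics import NormalDist


def z_threshold(alpha):
    """One-sided critical value z such that P(Z > z) = alpha for Z ~ N(0, 1)."""
    return NormalDist().inv_cdf(1.0 - alpha)


def false_positive_rate(sample_sizes, alphas, trials=2000, seed=0):
    """Estimate P(at least one look rejects) when the null hypothesis is true.

    Each trial accumulates N(0, 1) draws and tests the running sum at every
    n in sample_sizes against the critical value for the matching alpha.
    The full sample is always drawn, so two calls with the same seed and
    the same sample_sizes see exactly the same simulated data.
    """
    rng = random.Random(seed)
    thresholds = [z_threshold(a) for a in alphas]
    rejections = 0
    for _ in range(trials):
        total, count, rejected = 0.0, 0, False
        for n, z_crit in zip(sample_sizes, thresholds):
            while count < n:
                total += rng.gauss(0.0, 1.0)
                count += 1
            # z-statistic of the sample mean for known unit variance
            if total / math.sqrt(n) > z_crit:
                rejected = True
        if rejected:
            rejections += 1
    return rejections / trials


# Assumed schedule of ever-larger simulations: 25, 50, 100, 200, 400 runs.
sample_sizes = [25 * 2**i for i in range(5)]
alpha = 0.01

# Traditional approach: the same 1% threshold is reused at every look.
fixed = [alpha] * len(sample_sizes)
# Stand-in correction (an assumption, not the paper's method):
# alpha/2, alpha/4, ... so the levels sum to less than alpha.
shrinking = [alpha / 2 ** (i + 1) for i in range(len(sample_sizes))]

naive_rate = false_positive_rate(sample_sizes, fixed)
guarded_rate = false_positive_rate(sample_sizes, shrinking)

print("fixed 1% threshold at every look:", naive_rate)
print("shrinking thresholds            :", guarded_rate)
```

Because both calls share a seed and always draw the full sample, they are evaluated on identical data, and the stricter thresholds can only remove rejections. The shrinking schedule keeps the total error budget below α for any number of looks, at the cost of needing stronger evidence at later looks.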

Cite this article as:

R. Aló, V. Kreinovich, and S. Starks, “Testing Hypotheses on Simulated Data: Why Traditional Hypotheses-Testing Statistics Are Not Always Adequate for Simulated Data, and How to Modify Them,” J. Adv. Comput. Intell. Intell. Inform., Vol.10 No.3, pp. 260-264, 2006.


References

[1] P. R. Cohen, “Empirical Methods for Artificial Intelligence,” MIT Press, Cambridge, Massachusetts, 1995.
[2] P. R. Cohen, I. Gent, and T. Walsh, “Empirical Methods for Artificial Intelligence and Computer Science,” Tutorial at the 17th National Conference on Artificial Intelligence AAAI’2000, Austin, TX, July 30-August 3, 2000.
[3] I. Gent and T. Walsh, “An Empirical Analysis of Search in GSAT,” Journal of Artificial Intelligence Research, Vol.1, pp. 47-59, 1993.
[4] C. McGeoch, P. Sanders, R. Fleischer, P. R. Cohen, and D. Precup, “Using Finite Experiments to Study Asymptotic Performance,” In: R. Fleischer, B. Moret, and M. Schmidt (eds.), Experimental Algorithmics, Springer-Verlag, Berlin, Heidelberg, New York, pp. 93-124, 2002.
[5] D. J. Sheskin, “Handbook of Parametric and Nonparametric Statistical Procedures,” Chapman & Hall/CRC, Boca Raton, Florida, 2004.
[6] H. M. Wadsworth Jr., “Handbook of Statistical Methods for Engineers and Scientists,” McGraw-Hill, New York, 1990.

This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.
