Research Paper:
Applying Three-Arm Design for Assessing Distance Learning
Hsin-Neng Hsieh, Chien-Chou Chen, and Hung-Yi Lu

Department of Statistics and Information Science, Fu Jen Catholic University
No.510 Zhongzheng Rd., Xinzhuang Dist., New Taipei 242062, Taiwan
Corresponding author
The COVID-19 pandemic has transformed teaching methods, shifting instruction from traditional face-to-face teaching to distance learning. To explore the effectiveness of digital distance teaching in STEM subjects, this study uses mathematical statistics as a case study to analyze and compare student learning outcomes across three digital distance teaching methods. Differences in mean values among three sample groups are conventionally assessed with analysis of variance (ANOVA), but ANOVA rests on two major assumptions, normality and homogeneity of variances, which may not be satisfied in practical applications. In clinical trials, by contrast, superiority or noninferiority tests in a three-arm design are often conducted to confirm the effectiveness of a new drug compared to a control group (old drug or placebo). These tests can be applied under variance heterogeneity, which is the primary distinction from the ANOVA approach. We employed superiority and noninferiority tests in the three-arm design to assess student learning effectiveness across the three digital distance teaching methods, using Fieller’s and bootstrap methods. An extensive simulation study showed that Fieller’s method performed satisfactorily: it adequately controlled the empirical type I error rate and was uniformly more powerful than the bootstrap method. Accordingly, Fieller’s method yields stable test results in small samples, making it suitable for educational settings with limited sample sizes. Finally, the proposed methods are illustrated using real-world data.
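As a rough illustration of the two approaches named above, the following Python sketch tests a retention-type hypothesis H0: (mu_E - mu_C) / (mu_R - mu_C) <= theta for three independent groups without pooling variances. It is a minimal sketch under stated assumptions, not necessarily the authors' exact procedure: the function names, the group labels (exp, ref, ctl), the margin theta, and the percentile bootstrap are all illustrative choices, and the first function is a Welch-type contrast statistic of the kind that Fieller-type intervals are obtained from by inversion.

```python
import math
import random
import statistics


def retention_statistic(exp, ref, ctl, theta):
    """Welch-type statistic for H0: (mu_E - mu_C) / (mu_R - mu_C) <= theta.

    Assuming mu_R > mu_C, the ratio hypothesis is rewritten as the linear
    contrast mu_E - theta * mu_R - (1 - theta) * mu_C <= 0.
    Returns (t, df), where df is a Satterthwaite approximation.
    Names and labels here are hypothetical, chosen for illustration.
    """
    weights = (1.0, -theta, -(1.0 - theta))
    groups = (exp, ref, ctl)
    contrast = sum(w * statistics.fmean(g) for w, g in zip(weights, groups))
    # Each group contributes its own variance term; nothing is pooled,
    # so unequal variances across groups are allowed.
    parts = [w ** 2 * statistics.variance(g) / len(g)
             for w, g in zip(weights, groups)]
    se2 = sum(parts)
    df = se2 ** 2 / sum(p ** 2 / (len(g) - 1) for p, g in zip(parts, groups))
    return contrast / math.sqrt(se2), df


def bootstrap_lower_bound(exp, ref, ctl, alpha=0.05, n_boot=2000, seed=0):
    """Percentile-bootstrap lower confidence bound for the ratio
    (mean_E - mean_C) / (mean_R - mean_C).

    Each group is resampled with replacement independently of the others.
    """
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_boot):
        e, r, c = (statistics.fmean(rng.choices(g, k=len(g)))
                   for g in (exp, ref, ctl))
        if r != c:  # skip resamples where the ratio is undefined
            ratios.append((e - c) / (r - c))
    ratios.sort()
    return ratios[int(alpha * len(ratios))]
```

In this sketch, a large value of t relative to the upper-alpha quantile of a t distribution with df degrees of freedom, or a bootstrap lower bound above theta, would point toward rejecting H0. Setting theta to a value below 1 corresponds to a noninferiority margin, while theta = 1 compares the first group directly against the second.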
Figure: Power of two methods in three-arm design
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.