# A Survey of Video-Based Crowd Anomaly Detection in Dense Scenes

## Junjie Ma, Yaping Dai, and Kaoru Hirota

School of Automation, Beijing Institute of Technology

5 Zhongguancun South Street, Haidian District, Beijing 100081, China

Population growth has made incidents at large-scale crowd events more likely than ever. Over the past decades, automated crowd scene analysis using computer vision has attracted considerable attention. However, severe occlusions and complex crowd behaviors make such analysis challenging. Dense crowd anomaly detection is a key aspect of crowd scene analysis, and a number of computer-vision-based works addressing it have been presented. This work surveys computer vision techniques for analyzing dense crowd scenes. It covers two aspects: crowd density estimation and abnormal event detection. Some open problems and perspectives are discussed at the end.
