
JACIII Vol.20 No.1 pp. 13-25
doi: 10.20965/jaciii.2016.p0013
(2016)

Paper:

Motion-Based Depth Estimation for 2D to 3D Video Conversion

Fan Guo*,**, Jin Tang*, and Beiji Zou*,**,†

*School of Information Science and Engineering, Central South University
Changsha, Hunan 410083, China

**Mobile Health Ministry of Education, China Mobile Joint Laboratory
Changsha, Hunan 410012, China

†Corresponding author

Received:
July 5, 2015
Accepted:
September 11, 2015
Online released:
January 19, 2016
Published:
January 20, 2016
Keywords:
depth estimation, video, stereoscopic conversion, motion, virtual view
Abstract
Recent advances in 3D technology have increased the importance of stereoscopic content creation and processing, making the conversion of existing 2D videos into 3D very important for the growing 3D market. The most difficult task in 2D-to-3D video conversion is estimating the depth map from single-view frame images. In this paper, we therefore propose a novel motion-based 2D-to-3D video conversion method. The method first determines the motion type using optical flow estimation, and then performs a different depth estimation process depending on that type: for global motion, depth from motion parallax provides the final depth map, while for local motion, depth from a template combined with bilateral filtering produces the depth map. Finally, the left- and right-view images are synthesized to generate realistic stereoscopic results for viewers. During this process, visual artifacts in the synthesized virtual views are effectively eliminated by recovering the separation and loss of foreground objects. A comparative study and quantitative evaluation against other conversion methods demonstrate that the proposed method can produce results of better overall quality.
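
As a rough illustration of the pipeline summarized above, the following Python/OpenCV sketch walks through the four stages: motion-type classification from optical flow, depth from motion parallax for global motion, template-based depth with bilateral refinement for local motion, and simple left/right view synthesis. It is a minimal approximation under assumed choices, not the paper's implementation: Farneback flow stands in for the authors' flow estimator, and the motion threshold, global/local ratio, vertical depth template, filter parameters, and disparity scale are all illustrative values; the paper's artifact-removal step for the synthesized views is omitted.

import cv2
import numpy as np

def classify_motion(prev_gray, next_gray, global_ratio=0.7):
    # Dense optical flow between consecutive grayscale frames.
    # Farneback flow is a stand-in; the thresholds below are assumptions.
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag = np.linalg.norm(flow, axis=2)
    moving = mag > 1.0  # assumed motion threshold, in pixels
    # If most of the frame moves (e.g., a camera pan), call the motion global.
    motion_type = "global" if moving.mean() > global_ratio else "local"
    return flow, motion_type

def depth_from_motion_parallax(flow):
    # Global motion: larger apparent motion is read as closer to the camera.
    mag = np.linalg.norm(flow, axis=2)
    return cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

def depth_from_template(frame_bgr):
    # Local motion: start from a top-far/bottom-near depth template and
    # smooth it with a bilateral filter; parameters are illustrative.
    h, w = frame_bgr.shape[:2]
    column = np.linspace(0.0, 255.0, h, dtype=np.float32)[:, None]
    template = np.tile(column, (1, w))
    refined = cv2.bilateralFilter(template, d=9, sigmaColor=50, sigmaSpace=7)
    return refined.astype(np.uint8)

def synthesize_views(frame_bgr, depth, max_disp=16):
    # Crude depth-image-based rendering: backward-warp the frame left and
    # right by half the disparity each way. No hole filling is done here.
    h, w = frame_bgr.shape[:2]
    disp = (depth.astype(np.float32) / 255.0) * max_disp
    xs, ys = np.meshgrid(np.arange(w, dtype=np.float32),
                         np.arange(h, dtype=np.float32))
    left = cv2.remap(frame_bgr, xs + disp / 2, ys, cv2.INTER_LINEAR)
    right = cv2.remap(frame_bgr, xs - disp / 2, ys, cv2.INTER_LINEAR)
    return left, right

For two consecutive frames, one would classify the motion, pick the matching depth estimator, and pass the resulting depth map to synthesize_views; the paper's recovery of separated or lost foreground objects would slot in between those last two steps.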
Cite this article as:
F. Guo, J. Tang, and B. Zou, “Motion-Based Depth Estimation for 2D to 3D Video Conversion,” J. Adv. Comput. Intell. Intell. Inform., Vol.20 No.1, pp. 13-25, 2016.
