JACIII Vol.20 No.1 pp. 13-25
doi: 10.20965/jaciii.2016.p0013


Motion-Based Depth Estimation for 2D to 3D Video Conversion

Fan Guo*,**, Jin Tang*, and Beiji Zou*,**,†

*School of Information Science and Engineering, Central South University
Changsha, Hunan 410083, China
**Mobile Health Ministry of Education, China Mobile Joint Laboratory
Changsha, Hunan 410012, China
†Corresponding author

Received: July 5, 2015
Accepted: September 11, 2015
Online released: January 19, 2016
Published: January 20, 2016
Keywords: depth estimation, video, stereoscopic conversion, motion, virtual view

Recent advances in 3D technology have increased the importance of stereoscopic content creation and processing. Converting existing 2D videos into 3D videos is therefore important for the growing 3D market. The most difficult task in 2D-to-3D video conversion is estimating a depth map from single-view frame images. In this paper, we propose a novel motion-based 2D-to-3D video conversion method. The method first determines the motion type using optical flow estimation, and then performs a different depth estimation process for each motion type. For global motion, depth from motion parallax provides the final depth map. For local motion, depth from a template, together with a bilateral filter, is used to produce the depth map. Finally, the left- and right-view images are synthesized to generate realistic stereoscopic results for viewers. During this process, visual artifacts in the synthesized virtual views are effectively eliminated by recovering the separation and loss of foreground objects. A comparative study and quantitative evaluation against other conversion methods demonstrate that the proposed method may yield better overall quality.
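
The motion-type branching described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the two thresholds, the vertical-gradient template, and the convention that 255 means "near" are all assumptions made for illustration. The dense flow field is taken as given (rows of `(dx, dy)` vectors), and the bilateral filtering and view synthesis steps are omitted.

```python
import math


def flow_magnitude(flow):
    """Per-pixel magnitude of a dense flow field given as rows of (dx, dy)."""
    return [[math.hypot(dx, dy) for dx, dy in row] for row in flow]


def classify_motion(mag, moving_thresh=0.5, global_ratio=0.6):
    """Label a frame 'global' if most pixels move, otherwise 'local'.

    Both thresholds are hypothetical values chosen for this sketch.
    """
    total = sum(len(row) for row in mag)
    moving = sum(1 for row in mag for m in row if m > moving_thresh)
    return "global" if moving / total >= global_ratio else "local"


def depth_from_parallax(mag):
    """Scale flow magnitude to [0, 255]; faster pixels are treated as nearer."""
    peak = max(max(row) for row in mag)
    if peak == 0:
        return [[0 for _ in row] for row in mag]
    return [[round(255 * m / peak) for m in row] for row in mag]


def depth_from_template(height, width):
    """Assumed template: top rows far (0), bottom rows near (255)."""
    denom = max(height - 1, 1)
    return [[round(255 * y / denom)] * width for y in range(height)]


def estimate_depth(flow):
    """Pick a depth cue according to the detected motion type."""
    mag = flow_magnitude(flow)
    kind = classify_motion(mag)
    if kind == "global":
        return kind, depth_from_parallax(mag)
    return kind, depth_from_template(len(flow), len(flow[0]))


# A camera pan moves every pixel, so the frame is treated as global motion.
pan = [[(1.0, 0.0), (2.0, 0.0)], [(3.0, 0.0), (4.0, 0.0)]]
pan_kind, pan_depth = estimate_depth(pan)

# A single moving pixel in a static scene is treated as local motion.
still = [[(0, 0), (0, 0)], [(0, 0), (2, 0)], [(0, 0), (0, 0)]]
still_kind, still_depth = estimate_depth(still)
```

In a fuller pipeline, the template depth for local motion would presumably be refined with an edge-preserving bilateral filter before the left and right views are rendered; that refinement is left out here to keep the sketch short.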

