Research Paper:
Efficient Distortion Mitigation in Equirectangular Images for Two-View Pose Estimation
Taisei Ando*, Junwoon Lee*, Mitsuru Shinozaki**, Toshihiro Kitajima**, Qi An*, and Atsushi Yamashita*

*The University of Tokyo
5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8563, Japan
Corresponding author
**Technology Innovation R&D Dept. II, Research & Development Headquarters, KUBOTA Corporation
Sakai, Japan
This study proposes a method that efficiently reduces distortion effects in equirectangular images. Spherical cameras provide a wide field of view, which is advantageous for localization tasks. To apply standard image processing techniques, spherical images are commonly converted into equirectangular images via equirectangular projection, which introduces geometric distortions that can impair localization accuracy. Existing approaches to distortion mitigation frequently face a trade-off between accuracy and processing speed. To overcome this limitation, we propose a method that mitigates distortion effects while reducing computational cost. Our method incorporates a novel strategy for image rotation and region selection, improving the computational efficiency of feature detection and description. Experimental results on two-view pose estimation, an essential component of localization, show that our method achieves the fastest processing speed while maintaining accuracy comparable to that of existing distortion mitigation techniques.
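The core operation behind a rotation-and-region-selection strategy of this kind can be illustrated with a short sketch. The Python/OpenCV code below is a minimal illustration under our own assumptions, not the authors' implementation: it remaps an equirectangular image under a sphere rotation so that a chosen viewing direction is moved toward the low-distortion band near the equator, then crops that band as a loose stand-in for region selection. The function name `rotate_equirectangular`, the axis conventions, the pixel-center offsets, and the placeholder file name `equirect.jpg` are all ours.

```python
import cv2
import numpy as np

def rotate_equirectangular(img, R):
    """Remap an equirectangular image so that the underlying sphere is
    rotated by R (3x3 rotation matrix, source direction -> output direction).

    For every output pixel we compute its viewing direction on the unit
    sphere, rotate it back into the source frame, and sample the source
    image at the corresponding location.
    """
    h, w = img.shape[:2]
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    lon = (u + 0.5) / w * 2.0 * np.pi - np.pi   # longitude in [-pi, pi)
    lat = np.pi / 2.0 - (v + 0.5) / h * np.pi   # latitude in (-pi/2, pi/2)

    # Unit viewing direction of each output pixel, shape (h, w, 3).
    d = np.stack([np.cos(lat) * np.cos(lon),
                  np.cos(lat) * np.sin(lon),
                  np.sin(lat)], axis=-1)

    # Inverse rotation: output direction -> source direction (d @ R == R.T @ d).
    d_src = d @ R

    lon_s = np.arctan2(d_src[..., 1], d_src[..., 0])
    lat_s = np.arcsin(np.clip(d_src[..., 2], -1.0, 1.0))

    # Back to source pixel coordinates; wrap longitude across the +/-180 deg seam.
    map_x = np.mod((lon_s + np.pi) / (2.0 * np.pi) * w - 0.5, w).astype(np.float32)
    map_y = ((np.pi / 2.0 - lat_s) / np.pi * h - 0.5).astype(np.float32)
    return cv2.remap(img, map_x, map_y, cv2.INTER_LINEAR)

# Example: a 90-degree rotation about the x-axis moves the zenith (heavily
# stretched in the equirectangular image) onto the low-distortion equator.
theta = np.pi / 2.0
R = np.array([[1.0, 0.0, 0.0],
              [0.0, np.cos(theta), -np.sin(theta)],
              [0.0, np.sin(theta), np.cos(theta)]])
rotated = rotate_equirectangular(cv2.imread("equirect.jpg"), R)
band = rotated[rotated.shape[0] // 3 : 2 * rotated.shape[0] // 3]  # region selection
```

Rotating content toward the equator before detection matters because equirectangular distortion grows with latitude: a rotation such as the 90° example above moves the heavily stretched polar region into the equatorial band, where standard feature detectors behave closer to their perspective-image assumptions.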
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.