Paper:
Spatio-Temporal Gradient Flow for Efficient Motion Estimation in Sparse Point Clouds
Shuncong Shen*, Toshio Ito*,**, and Toshiya Hirose*

*College of Engineering, Shibaura Institute of Technology
3-7-5 Toyosu, Koto-ku, Tokyo 135-8548, Japan
Corresponding author
**Hyper Digital Twins Co., Ltd.
2-1-17 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan
With the rapid development of three-dimensional sensors such as LiDAR, there is an increasing demand for accurate motion estimation from point cloud data in dynamic tasks like autonomous driving and robot navigation. To address the limitations of traditional methods in efficiency and accuracy when handling sparse point clouds containing multiple objects, non-rigid motion, and noise, this paper presents an unsupervised motion estimation framework called Spatio-Temporal Gradient Flow (STG-Flow). Unlike traditional methods, this approach neither relies on large labeled datasets nor assumes rigid-body motion. STG-Flow segments point clouds from consecutive frames by combining global density statistics with supervoxel clustering, and it adaptively adjusts the clustering parameters through an upper- and lower-bound filtering mechanism to mitigate the effects of extreme cases. After segmentation, optical flow refinement is applied to each local cluster using spatio-temporal gradient constraints, together with a multi-level robust optimization strategy and domain grouping, which improves the stability and accuracy of motion estimation even under large displacements. Experiments demonstrate that STG-Flow achieves more accurate object-level motion estimation in sparse scenarios. Its registration accuracy is comparable to that of the iterative closest point method, while its computational efficiency is approximately ten times higher, demonstrating strong real-time performance and robustness.
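As a rough illustration of the per-cluster refinement step described above, the sketch below applies a spatio-temporal gradient constraint (a 3D analogue of the Lucas-Kanade optical-flow equation, grad(rho) . v + d(rho)/dt ≈ 0) to a voxelized density grid and solves for a single cluster velocity in least squares. This is a minimal sketch under assumed choices: the function names (voxelize, estimate_cluster_velocity), voxel size, and damping term are illustrative and not the authors' STG-Flow implementation, which additionally uses density-statistics-driven supervoxel segmentation, adaptive parameter filtering, and multi-level robust optimization for large displacements.

```python
# Hypothetical sketch of per-cluster velocity estimation from spatio-temporal
# gradients of a voxelized point density field. Not the authors' code.
import numpy as np

def voxelize(points, origin, voxel_size, dims):
    """Accumulate points into a 3D density grid (assumed intermediate representation)."""
    grid = np.zeros(dims)
    idx = np.floor((points - origin) / voxel_size).astype(int)
    valid = np.all((idx >= 0) & (idx < np.array(dims)), axis=1)
    np.add.at(grid, tuple(idx[valid].T), 1.0)
    return grid

def estimate_cluster_velocity(pts_t0, pts_t1, voxel_size=0.2, dt=0.1):
    """Least-squares solution of grad(rho) . v + d(rho)/dt = 0 for one cluster."""
    pts = np.vstack([pts_t0, pts_t1])
    origin = pts.min(axis=0) - voxel_size
    dims = tuple(np.ceil((pts.max(axis=0) - origin) / voxel_size).astype(int) + 2)
    rho0 = voxelize(pts_t0, origin, voxel_size, dims)
    rho1 = voxelize(pts_t1, origin, voxel_size, dims)
    gx, gy, gz = np.gradient((rho0 + rho1) / 2.0, voxel_size)   # spatial gradients
    rho_t = (rho1 - rho0) / dt                                   # temporal gradient
    G = np.stack([gx.ravel(), gy.ravel(), gz.ravel()], axis=1)   # N x 3 design matrix
    b = -rho_t.ravel()
    A = G.T @ G + 1e-6 * np.eye(3)   # small damping for sparse, ill-conditioned clusters
    return np.linalg.solve(A, G.T @ b)   # 3D velocity estimate for the cluster

# Toy usage: one cluster translating by a sub-voxel 0.05 m in x between frames 0.1 s apart.
rng = np.random.default_rng(0)
cluster = rng.uniform(0.0, 2.0, size=(2000, 3))
print(estimate_cluster_velocity(cluster, cluster + np.array([0.05, 0.0, 0.0])))
```

The damping term only guards against ill-conditioned normal equations when a cluster is very sparse; handling that regime robustly is what the paper's multi-level optimization and domain grouping address.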
STG-Flow: motion in sparse point clouds
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.