College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, Jiangsu, China
XIE Mingyang (1988- ), male, Ph.D., associate research fellow at the College of Automation Engineering, Nanjing University of Aeronautics and Astronautics. His main research interest is intelligent robot technology.
XU Xin (2001- ), male, master's student at the College of Automation Engineering, Nanjing University of Aeronautics and Astronautics. His main research interest is autonomous navigation of mobile robots.
YANG Chen (2004- ), male, research assistant at the College of Automation Engineering, Nanjing University of Aeronautics and Astronautics. His main research interest is robot path planning.
YU Zirui (2005- ), male, research assistant at the College of Automation Engineering, Nanjing University of Aeronautics and Astronautics. His main research interest is robot path planning.
WANG Yao (2005- ), female, research assistant at the College of Automation Engineering, Nanjing University of Aeronautics and Astronautics. Her main research interest is robot path planning.
Received: 2025-06-21; Revised: 2025-12-02; Accepted: 2025-12-04; Published in print: 2025-12-15
XIE Mingyang, XU Xin, YANG Chen, et al. Feature-enhanced visual-inertial SLAM method for low-texture dynamic environment[J]. Chinese Journal of Intelligent Science and Technology, 2025, 7(4): 433-443. DOI: 10.11959/j.issn.2096-6652.202543.
To address the difficulty that SLAM systems in low-texture dynamic environments struggle to extract sufficient stable feature points and are prone to incorrect matching, which degrades the accuracy and robustness of pose estimation, this paper proposes a monocular visual-inertial SLAM system designed for low-texture dynamic scenarios, improving feature extraction, feature matching, and dynamic feature point discrimination. First, deep learning-based SuperPoint feature extraction and LightGlue feature matching modules replace the frontend of ORB-SLAM3, significantly enhancing the robustness of feature extraction and matching in low-texture areas. Second, YOLOv8-seg is integrated for semantic segmentation of dynamic regions, and IMU pre-integration is leveraged to estimate camera pose changes; a joint dynamic point removal mechanism combines the two to achieve finer-grained filtering of dynamic points, thereby enhancing system accuracy and robustness under dynamic interference. Finally, the performance of the proposed method was validated through comparative experiments on public datasets and in real-world scenarios. Compared with existing visual and visual-inertial SLAM approaches, the proposed system reduces the root mean square error of the absolute trajectory error by 88.4% or more, and its standard deviation by 90% or more, in real-world low-texture dynamic scenes, exhibiting superior positioning accuracy and robustness.
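As a rough illustration of the joint dynamic point removal described in the abstract, the sketch below keeps a feature that falls inside a segmented dynamic region only when its reprojection under the IMU-predicted camera motion still agrees with its matched location. This is a minimal NumPy sketch under assumed conventions (pixel keypoints with known depth, a hypothetical `reproj_thresh` parameter), not the paper's actual implementation:

```python
import numpy as np

def reject_dynamic_points(pts_prev, pts_curr, depths, K, R_pred, t_pred,
                          dyn_mask, reproj_thresh=2.0):
    """Joint dynamic-point rejection (illustrative sketch).

    pts_prev, pts_curr: (N, 2) matched pixel keypoints in two frames.
    depths: (N,) depths of pts_prev; K: 3x3 camera intrinsics.
    R_pred, t_pred: camera motion predicted from IMU pre-integration.
    dyn_mask: H x W binary mask of segmented dynamic regions.
    Returns a boolean keep-mask over the N matches.
    """
    # Back-project previous keypoints to 3D using their depths.
    ones = np.ones((len(pts_prev), 1))
    pix_h = np.hstack([pts_prev, ones])                     # homogeneous pixels
    P_prev = (np.linalg.inv(K) @ pix_h.T).T * depths[:, None]
    # Predict their position in the current frame from the IMU motion.
    P_curr = (R_pred @ P_prev.T).T + t_pred
    proj = (K @ P_curr.T).T
    proj = proj[:, :2] / proj[:, 2:3]
    residual = np.linalg.norm(proj - pts_curr, axis=1)
    # Points outside dynamic masks are kept; points inside are kept
    # only when the IMU-predicted reprojection residual stays small.
    u = pts_curr[:, 0].astype(int).clip(0, dyn_mask.shape[1] - 1)
    v = pts_curr[:, 1].astype(int).clip(0, dyn_mask.shape[0] - 1)
    in_dyn_region = dyn_mask[v, u] > 0
    return ~in_dyn_region | (residual < reproj_thresh)
```

The geometric check is what makes the filtering "finer-grained": a person standing still, for example, lies inside a semantically dynamic mask but still satisfies the IMU-predicted reprojection, so its features survive.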
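The reported metrics (RMSE and standard deviation of the absolute trajectory error) are conventionally computed after rigidly aligning the estimated trajectory to ground truth. A generic NumPy sketch of that evaluation, assuming time-associated (N, 3) position arrays and an SE(3) Kabsch/Umeyama alignment without scale (not the paper's evaluation code):

```python
import numpy as np

def ate_stats(gt, est):
    """RMSE and std of absolute trajectory error after SE(3) alignment.

    gt, est: (N, 3) ground-truth and estimated positions, time-associated.
    """
    mu_g, mu_e = gt.mean(0), est.mean(0)
    # Kabsch: best rotation mapping centered est onto centered gt.
    H = (est - mu_e).T @ (gt - mu_g)
    U, _, Vt = np.linalg.svd(H)
    S = np.diag([1, 1, np.sign(np.linalg.det(Vt.T @ U.T))])  # reflection guard
    R = Vt.T @ S @ U.T
    t = mu_g - R @ mu_e
    # Per-pose translational error after alignment.
    err = np.linalg.norm((R @ est.T).T + t - gt, axis=1)
    return np.sqrt((err ** 2).mean()), err.std()
```

Tools such as evo implement the same metric with additional options (time association, scale alignment for monocular runs).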