Indoor scene 3D reconstruction based on 3D Gaussian splatting and semantic segmentation
LIU Tingshan, LIU Wei, ZHANG Yang, FENG Qingjuan, TIAN Mi
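Abstract:
Three-dimensional scene reconstruction is a key technology for applications such as robot navigation and virtual reality. Existing simultaneous localization and mapping (SLAM) algorithms based on 3D Gaussian splatting (3DGS) apply a single uniform rendering strategy to all objects. As a result, they cannot accurately reproduce the appearance of materials with special optical properties, such as glass and screens, which degrades the visual quality of the reconstructed scene. To address this problem, a SLAM algorithm based on 3DGS and semantic segmentation is proposed. Special objects such as glass and screens are identified automatically through joint inference with large vision models, and a 3D object label mapping strategy is designed to transfer the 2D recognition results into 3D space. In addition, a joint optimization loss function that fuses multimodal constraints is designed, improving the rendering realism of special objects while preserving the geometric reconstruction accuracy and localization stability of the SLAM algorithm. Experimental results show that the proposed algorithm achieves better rendering quality than existing methods in glass and screen regions while maintaining good localization accuracy.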
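The abstract does not specify how the 3D object label mapping is carried out. As a rough illustration of the general idea of moving 2D recognition results into 3D space, the sketch below projects 3D points (for example, Gaussian centers) into each labeled frame with a pinhole camera model and keeps the label each point lands on most often. The function names, the `(fx, fy, cx, cy)` intrinsics tuple, the class ids (0 = background, 1 = glass, 2 = screen), and the majority-vote rule are all assumptions made for this example, not the authors' method.

```python
from collections import Counter


def project_point(point_w, R, t, K):
    """Project a world-space point to pixel coordinates (pinhole model).

    R (3x3 nested list) and t (length 3) map world to camera coordinates;
    K is the hypothetical intrinsics tuple (fx, fy, cx, cy).
    Returns (u, v, depth), or None if the point is behind the camera.
    """
    x, y, z = (
        sum(R[i][j] * point_w[j] for j in range(3)) + t[i] for i in range(3)
    )
    if z <= 1e-6:
        return None
    fx, fy, cx, cy = K
    return fx * x / z + cx, fy * y / z + cy, z


def vote_labels(centers, frames, K, background=0):
    """Give each 3D center the label it most often projects onto.

    frames is a list of (R, t, mask) tuples, where mask[v][u] is an
    integer class id from a 2D segmentation (assumed convention).
    Occlusion is ignored in this simplified sketch.
    """
    votes = [Counter() for _ in centers]
    for R, t, mask in frames:
        height, width = len(mask), len(mask[0])
        for i, center in enumerate(centers):
            proj = project_point(center, R, t, K)
            if proj is None:
                continue  # point is behind this camera
            u, v = int(round(proj[0])), int(round(proj[1]))
            if 0 <= u < width and 0 <= v < height:
                votes[i][mask[v][u]] += 1
    # Points never observed in any frame fall back to the background label.
    return [c.most_common(1)[0][0] if c else background for c in votes]


if __name__ == "__main__":
    identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    mask = [[0, 0, 0], [0, 1, 2], [0, 0, 0]]
    centers = [(0, 0, 1), (1, 0, 1), (0, 0, -1)]
    print(vote_labels(centers, [(identity, (0, 0, 0), mask)], (1, 1, 1, 1)))
```

A practical system would also need a depth or visibility test so that points hidden behind other surfaces do not inherit labels from the objects in front of them; that step is omitted here for brevity.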
Keywords: 3D reconstruction; simultaneous localization and mapping (SLAM); semantic segmentation; 3D Gaussian splatting
Foundation: Industry-sponsored project of 北京科艺空间文化有限公司 (No. 9162423320)
DOI: 10.16508/j.cnki.11-5866/n.2025.06.006