A lightweight face detection model based on frequency-spatial convolution
FAN Jiaxuan (范佳炫), HAN Jing (韩晶), TENG Shangzhi (滕尚志), LÜ Xueqiang (吕学强)
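Abstract:
Small-scale and occluded faces are difficult to detect accurately, and existing face detectors have limited real-time performance. To address this, a lightweight face detection backbone network, FSNet, is proposed. Frequency-spatial convolution (FSConv) is introduced: by fusing global frequency modeling with local spatial modeling, it efficiently captures multi-band information and dynamically aggregates features, which strengthens the representation of small-scale faces and complex occlusion scenes. In addition, a hybrid depthwise-separable dual-branch downsampling block (HDWBlock) is designed. It combines the complementary strengths of max pooling and depthwise separable convolution so that downsampling preserves both high-level semantics and locally discriminative features, improving multi-scale feature extraction. Experimental results show that, within the RetinaFace framework, FSNet reaches a detection accuracy of 91.22% on the Hard subset of the public WIDER FACE dataset. Its accuracy on small-scale and occluded faces is comparable to that of mainstream models, while its parameter count and computational cost are significantly lower. FSNet therefore achieves higher inference efficiency without sacrificing detection accuracy, showing good lightweight characteristics and potential for practical application.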
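The abstract does not specify the internal layout of HDWBlock, so the following is only a minimal NumPy sketch of the general idea it describes. One branch downsamples with 2x2 max pooling. The other uses a stride-2 depthwise 3x3 convolution followed by a 1x1 pointwise convolution, and the two outputs are joined along the channel axis. The function names, kernel sizes, strides, random weights, and the concatenation-based fusion are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on a (C, H, W) array; H and W must be even."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

def depthwise_conv3x3_s2(x, dw):
    """Stride-2 depthwise 3x3 convolution with zero padding of 1.

    x: (C, H, W) with even H and W, dw: (C, 3, 3) -> (C, H/2, W/2).
    """
    c, h, w = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((c, h // 2, w // 2), dtype=np.float64)
    for i in range(3):
        for j in range(3):
            # Strided slice of the padded input seen by kernel tap (i, j).
            out += dw[:, i, j, None, None] * xp[:, i:i + h:2, j:j + w:2]
    return out

def pointwise_conv(x, pw):
    """1x1 convolution that mixes channels; pw has shape (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", pw, x)

def hybrid_downsample(x, dw, pw):
    """Hypothetical dual-branch downsampling: max pooling + depthwise separable conv."""
    pooled = max_pool_2x2(x)                                # pooling branch
    conv = pointwise_conv(depthwise_conv3x3_s2(x, dw), pw)  # depthwise separable branch
    # Assumed fusion: concatenate both branches along the channel axis.
    return np.concatenate([pooled, conv], axis=0)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 16))   # (channels, height, width)
dw = rng.standard_normal((8, 3, 3))    # one 3x3 kernel per input channel
pw = rng.standard_normal((8, 8))       # 1x1 channel-mixing weights
y = hybrid_downsample(x, dw, pw)
print(y.shape)  # (16, 8, 8)
```

In this sketch the spatial resolution is halved while the channel count doubles, because each branch contributes 8 channels. A real block could instead sum the branches or project them back to a smaller width; the abstract does not say which.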
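Keywords: face detection; lightweight backbone network; frequency-spatial convolution; hybrid downsampling module
Foundation: National Natural Science Foundation of China (62202061); Beijing Natural Science Foundation (4232025, 4254096); General Science and Technology Project of the Beijing Municipal Education Commission Scientific Research Program (KM202311232002)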
DOI: 10.16508/j.cnki.11-5866/n.2025.05.009