2025, Vol. 52, No. 06, pp. 1-7+67
Research on an Image Segmentation Method Based on an Asymmetric Encoder-Decoder Architecture
Foundation: Sichuan Science and Technology Program Cooperation Project (2023YFG0036)
Email: wanglynn@scu.edu.cn
DOI:
摘要 (Abstract):

Traditional image segmentation methods are limited in capturing global long-range dependencies, and their large parameter counts and high computational complexity impose a heavy computational burden in practice, especially in resource-constrained scenarios. To address these problems, an asymmetric encoder-decoder architecture is proposed. Within the classical Mamba-based U-Net segmentation framework, Visual State Space (VSS) blocks are combined with convolutions (CNN) to form VSS-CNN hybrid modules in the encoder, while the decoder retains only VSS modules. An improved spatial attention mechanism and channel attention mechanism, connected in series, form the skip connections between encoder and decoder, yielding an ultra-lightweight model with strong local feature extraction and long-range dependency modeling. Experiments on the public ISIC2017 dataset show that, while maintaining segmentation accuracy, the proposed model uses 99.94% fewer parameters than a traditional pure vision Mamba model, 75.51% fewer than the lightest existing vision Mamba U-Net model, and 99.84% fewer than the classical U-Net model. The model substantially reduces computation while preserving good segmentation accuracy, and thus meets the demands of real-time applications.
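The serial attention skip connection described above can be sketched in a few lines. The page does not specify the internal design of the "improved" spatial and channel attention blocks or how the gated encoder features are fused with the decoder, so the CBAM-style spatial gate, the SE-style channel gate, and the additive fusion below are assumptions for illustration only.

```python
# Minimal PyTorch sketch of a serial spatial + channel attention skip connection.
# The exact "improved" attention design is not given on this page; the gates
# below are assumed (CBAM-style spatial, SE-style channel) purely to illustrate
# the serial arrangement described in the abstract.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel gate (assumed form)."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.fc(x)


class SpatialAttention(nn.Module):
    """CBAM-style spatial gate built from pooled channel statistics (assumed form)."""

    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg_map = x.mean(dim=1, keepdim=True)            # per-pixel channel mean
        max_map = x.amax(dim=1, keepdim=True)            # per-pixel channel max
        gate = self.sigmoid(self.conv(torch.cat([avg_map, max_map], dim=1)))
        return x * gate


class AttentionSkip(nn.Module):
    """Spatial then channel attention applied in series to the encoder feature
    map before it is fused with the decoder feature map."""

    def __init__(self, channels: int):
        super().__init__()
        self.spatial = SpatialAttention()
        self.channel = ChannelAttention(channels)

    def forward(self, enc_feat: torch.Tensor, dec_feat: torch.Tensor) -> torch.Tensor:
        gated = self.channel(self.spatial(enc_feat))
        return gated + dec_feat                          # additive fusion (assumed)


if __name__ == "__main__":
    skip = AttentionSkip(channels=32)
    enc = torch.randn(1, 32, 64, 64)
    dec = torch.randn(1, 32, 64, 64)
    print(skip(enc, dec).shape)                          # torch.Size([1, 32, 64, 64])
```

The point of the serial arrangement is that the spatial gate first re-weights locations on the encoder feature map and the channel gate then re-weights channels, so the skip path forwards only the most relevant encoder information to the decoder.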

Abstract:

Traditional image segmentation techniques often rely on deep learning models based on convolutional neural network (CNN) and Transformer architectures. Although these models excel at local feature extraction, they are limited in capturing long-range dependencies. Moreover, such models tend to have large parameter counts and high computational complexity, which imposes a significant computational burden, especially in resource-constrained environments. To address these issues, this paper proposes a lightweight image segmentation method based on Mamba. By combining Mamba's efficient architecture with the classical U-Net structure, the method targets the challenges that segmentation models face in mobile-device scenarios, such as large parameter sizes and processing speeds that are inadequate for real-time applications. Specifically, the method incorporates Visual State Space (VSS) blocks, used alongside convolutions (CNN) to form hybrid building blocks that capture extensive contextual information, and designs an asymmetric encoder-decoder structure. Experiments on the public ISIC2017 dataset show that, while maintaining segmentation accuracy, the proposed model reduces the parameter count by 99.94% compared with traditional pure vision Mamba models, by 75.51% compared with the lightest existing vision Mamba U-Net model, and by 99.84% compared with the classic U-Net model. The designed model achieves a significant reduction in computational complexity while maintaining excellent segmentation accuracy, thus meeting the demands of real-time applications.
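The asymmetric layout itself can be shown with a small, self-contained skeleton: encoder stages use a hybrid convolution + VSS block, while decoder stages keep only a VSS block. The real VSS block relies on a selective-scan (Mamba) kernel that is not reproduced here; `PlaceholderVSS` is a simplified stand-in (depthwise convolution with a sigmoid gate) so the skeleton runs end to end, and the channel widths, depth, and additive skip fusion are assumptions rather than the authors' configuration. The attention-gated skip from the earlier sketch would slot in where the plain addition appears.

```python
# Self-contained PyTorch sketch of the asymmetric encoder-decoder layout:
# hybrid Conv + VSS blocks in the encoder, VSS-only blocks in the decoder.
# PlaceholderVSS is NOT a selective scan; it only stands in for the VSS block
# so the skeleton is runnable. Widths, depth and fusion rule are assumptions.
import torch
import torch.nn as nn


class PlaceholderVSS(nn.Module):
    """Stand-in for a Visual State Space (VSS) block; not a selective scan."""

    def __init__(self, channels: int):
        super().__init__()
        self.norm = nn.GroupNorm(1, channels)            # LayerNorm-like over channels
        self.dw = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)
        self.gate = nn.Conv2d(channels, channels, 1)
        self.proj = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        h = self.norm(x)
        h = self.proj(self.dw(h) * torch.sigmoid(self.gate(h)))
        return x + h                                     # residual connection


class HybridEncoderBlock(nn.Module):
    """VSS-CNN hybrid block: local conv branch followed by a (placeholder) VSS block."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )
        self.vss = PlaceholderVSS(out_ch)

    def forward(self, x):
        return self.vss(self.conv(x))


class VSSDecoderBlock(nn.Module):
    """Decoder stage keeps only the (placeholder) VSS block after upsampling."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.up = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=2, stride=2)
        self.vss = PlaceholderVSS(out_ch)

    def forward(self, x, skip):
        x = self.up(x) + skip                            # additive skip fusion (assumed)
        return self.vss(x)


class AsymmetricUNet(nn.Module):
    def __init__(self, in_ch=3, num_classes=1, widths=(8, 16, 32)):
        super().__init__()
        c1, c2, c3 = widths
        self.enc1 = HybridEncoderBlock(in_ch, c1)
        self.enc2 = HybridEncoderBlock(c1, c2)
        self.enc3 = HybridEncoderBlock(c2, c3)
        self.pool = nn.MaxPool2d(2)
        self.dec2 = VSSDecoderBlock(c3, c2)
        self.dec1 = VSSDecoderBlock(c2, c1)
        self.head = nn.Conv2d(c1, num_classes, kernel_size=1)

    def forward(self, x):
        e1 = self.enc1(x)                                # full resolution
        e2 = self.enc2(self.pool(e1))                    # 1/2 resolution
        e3 = self.enc3(self.pool(e2))                    # 1/4 resolution
        d2 = self.dec2(e3, e2)                           # back to 1/2
        d1 = self.dec1(d2, e1)                           # back to full resolution
        return self.head(d1)


if __name__ == "__main__":
    model = AsymmetricUNet()
    print(model(torch.randn(1, 3, 256, 256)).shape)      # torch.Size([1, 1, 256, 256])
    print(sum(p.numel() for p in model.parameters()))    # small parameter count
```

Keeping the decoder to VSS-only blocks and using narrow channel widths is what drives the parameter count down, while the convolutional branch retained in the encoder's hybrid blocks preserves local detail extraction where it matters most.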


Basic Information:

CLC Number: TP391.41

Citation:

[1] 陈春霞, 丑西平, 晏杭坤, et al. Research on an Image Segmentation Method Based on an Asymmetric Encoder-Decoder Architecture [J]. 机械 (Machinery), 2025, 52(06): 1-7+67.
