Title
Dual structure-aware image filterings for semi-supervised medical image segmentation
01
Introduction
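Accurate medical image segmentation plays an important role in computer-aided diagnosis (CAD) systems. Traditional supervised segmentation methods have achieved impressive results by using large amounts of annotated data. However, manual segmentation is tedious and time-consuming. In recent years, semi-supervised segmentation methods have attracted wide attention because they improve the accuracy of segmentation models by exploiting easily obtained unlabeled images.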
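Mainstream semi-supervised segmentation methods are based on consistency regularization (Zhao et al., 2023; Wang et al., 2023b; Yang et al., 2023; Basak and Yin, 2023; Lei et al., 2022; Jin et al., 2022; Xiang et al., 2022; Basak et al., 2022; Lyu et al., 2022; Su et al., 2024; Adiga et al., 2024), whose goal is to produce consistent results under variations at the image level and/or the model level. In particular, many methods generate variations at the image level (Yu et al., 2019; Xu et al., 2021; You et al., 2022a,b; Bai et al., 2023). A popular image-variation strategy adopts the weak-to-strong paradigm (Fan et al., 2022; Liu et al., 2022b; Yang et al., 2023), in which the prediction on a weakly augmented version supervises the prediction on a strongly augmented version. The augmented versions are usually generated by simple random augmentations (e.g., Gaussian noise; Huang et al., 2022), adversarial perturbations (Peiris et al., 2021; Wang et al., 2023a), and CutMix (Chen et al., 2021; Yang et al., 2023). Model-level variations mainly adopt the mean teacher framework (Tarvainen and Valpola, 2017) or a co-training strategy (Qiao et al., 2018; Chen et al., 2021). In the mean teacher framework, the teacher network is usually obtained from the student network via an exponential moving average (EMA). The co-training strategy trains two independent networks or decoders with different initializations and uses the output of each model to supervise the training of the other.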
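The EMA step used by mean-teacher style frameworks can be sketched in a few lines. This is a minimal illustration, assuming the network weights are flattened into a plain list of floats; the function name `ema_update` and the `decay` value are hypothetical and not taken from the paper.

```python
def ema_update(teacher, student, decay=0.99):
    """Return the teacher weights after one EMA step toward the student weights.

    Assumption: both networks are represented as equally long lists of floats.
    """
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]


# Toy usage: the teacher moves a small step toward the student.
teacher_weights = [0.0, 1.0]
student_weights = [1.0, 1.0]
updated_teacher = ema_update(teacher_weights, student_weights, decay=0.9)
```

A `decay` close to 1 makes the teacher change slowly, which is why its predictions are typically smoother than the student's.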
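Recently, consistency regularization methods that use pseudo labels for supervision have achieved impressive performance in semi-supervised segmentation (Chen et al., 2021; Yang et al., 2023; Basak and Yin, 2023; Lyu et al., 2022; Liu et al., 2022a). For example, CPS (Chen et al., 2021) generates different pseudo labels with two differently initialized networks and applies mutual supervision between them. Thanks to effective strong image augmentations (e.g., CutMix; Yun et al., 2019) used as image-level variations, these methods perform very well on natural images and avoid overfitting to erroneous pseudo labels (Chen et al., 2021; Yang et al., 2023; Liu et al., 2022b). However, these existing image-level variations do not exploit structural information, which is crucial for medical images. Moreover, the distribution variance in medical images is less pronounced than in natural images, which makes semi-supervised medical image segmentation more prone to overfitting noisy pseudo labels because of confirmation bias (Arazo et al., 2020).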
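The mutual supervision idea behind CPS can be illustrated with a small pure-Python sketch. It assumes each network outputs per-pixel class logits as nested lists; the helper names (`softmax`, `cross_entropy`, `cps_loss`) are illustrative and this is not the reference CPS implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Cross-entropy of one pixel's logits against a hard class index."""
    return -math.log(softmax(logits)[target])


def cps_loss(logits_a, logits_b):
    """Each network is supervised by the other's hard pseudo label.

    Assumption: in a real training loop the pseudo labels are treated as
    constants, i.e. no gradient flows through the argmax.
    """
    loss = 0.0
    for za, zb in zip(logits_a, logits_b):
        pseudo_a = max(range(len(za)), key=za.__getitem__)  # argmax of network A
        pseudo_b = max(range(len(zb)), key=zb.__getitem__)  # argmax of network B
        loss += cross_entropy(za, pseudo_b) + cross_entropy(zb, pseudo_a)
    return loss / len(logits_a)


# Toy usage with a single pixel and two classes.
loss_agree = cps_loss([[2.0, 0.0]], [[3.0, 0.0]])
loss_disagree = cps_loss([[2.0, 0.0]], [[0.0, 3.0]])
```

The loss is small when the two networks agree, including when they agree on a wrong label, which is the confirmation bias the paragraph above describes.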
Abstract
Semi-supervised image segmentation has attracted great attention recently. The key is how to leverage unlabeled images in the training process. Most methods maintain consistent predictions of the unlabeled images under variations (e.g., adding noise/perturbations, or creating alternative versions) at the image and/or model level. Medical images often carry prior structural information, which most image-level variations have not explored well. In this paper, we propose novel dual structure-aware image filterings (DSAIF) as the image-level variations for semi-supervised medical image segmentation. Motivated by connected filtering, which simplifies an image by filtering in a structure-aware tree-based image representation, we resort to the dual contrast-invariant Max-tree and Min-tree representations. Specifically, we propose a novel connected filtering that removes topologically equivalent nodes (i.e., connected components) having no siblings in the Max/Min-tree. This results in two filtered images that preserve the topologically critical structure. Applying the proposed DSAIF to mutually supervised networks decreases the consensus of their erroneous predictions on unlabeled images. This helps to alleviate the confirmation bias issue of overfitting to noisy pseudo labels of unlabeled images, and thus effectively improves the segmentation performance. Extensive experimental results on three benchmark datasets demonstrate that the proposed method significantly and consistently outperforms some state-of-the-art methods. The source code will be made publicly available.
Method
3.1. Overview
The semi-supervised semantic segmentation task aims to enhance segmentation performance by leveraging a small set $\mathcal{D}_l = \{(x_l, y_l)\}$ of $N$ labeled images along with a large collection $\mathcal{D}_u = \{x_u\}$ of $M$ unlabeled images, where $N \ll M$. We follow the classical consistency regularization-based semi-supervised medical image segmentation framework, which is often composed of image-level variations and model-level variations on unlabeled images.

For the image-level variations, we resort to the dual contrast-invariant Max-tree and Min-tree representations (see Section 3.2 for the construction) for connected filterings. We propose novel dual structure-aware image filterings (DSAIF) as the image-level variations. More specifically, we propose a novel type of connected filtering that preserves only the topologically critical nodes of the Max/Min-tree. The corresponding filtering, named upper/lower structure-aware image filtering and denoted USAIF/LSAIF, yields two different images that have the same topological structure as the original one. We also leverage the invariance of the Max/Min-tree with respect to monotonically increasing contrast changes to further increase appearance diversity while preserving the topological image structure.

For the model-level variations, we simply adopt the cross pseudo supervision (CPS) method (Chen et al., 2021) as a baseline example to illustrate our method in Fig. 1. It is noteworthy that DSAIF can also be applied to other mutual supervision frameworks such as MC-Net (Wu et al., 2021), MC-Net+ (Wu et al., 2022a), and Co-BioNet (Peiris et al., 2023). The pipeline of the proposed framework using MC-Net, MC-Net+, and Co-BioNet as baselines is depicted in the Supplementary Material.
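To make the filtering rule concrete, the following toy sketch builds the Max-tree of a 1D signal by brute-force threshold decomposition and removes every node that has no sibling. It is not the authors' implementation and rests on several assumptions: the signal is 1D (connected components are runs of indices), the root is always kept, pixels of a removed node take the level of the nearest surviving ancestor, the Min-tree filtering is obtained by duality (negating the signal), and the area threshold $\tau$ studied in the ablation is not modeled. All function names are hypothetical.

```python
def upper_components(signal, h):
    """Connected components (maximal index runs) of the upper level set {signal >= h}."""
    comps, run = [], []
    for i, v in enumerate(signal):
        if v >= h:
            run.append(i)
        elif run:
            comps.append(frozenset(run))
            run = []
    if run:
        comps.append(frozenset(run))
    return comps


def build_max_tree(signal):
    """Return ({component: level}, {component: parent}) for the Max-tree of a 1D signal."""
    level = {}
    for h in sorted(set(signal)):
        for comp in upper_components(signal, h):
            level[comp] = h  # ascending h: the highest level of a component wins
    parent = {}
    for comp in level:
        # Components are nested or disjoint, so the smallest strict superset is the parent.
        supersets = [p for p in level if comp < p]
        parent[comp] = min(supersets, key=len) if supersets else None
    return level, parent


def usaif(signal):
    """Upper filtering sketch: drop every Max-tree node that has no sibling."""
    level, parent = build_max_tree(signal)
    children = {}
    for comp, par in parent.items():
        children.setdefault(par, []).append(comp)
    # Assumption: the root (no parent) is always kept.
    kept = {c for c, p in parent.items() if p is None or len(children[p]) > 1}
    # Each pixel takes the level of the smallest surviving component containing it.
    return [level[min((c for c in kept if i in c), key=len)]
            for i in range(len(signal))]


def lsaif(signal):
    """Lower filtering sketch via duality: Min-tree of f is the Max-tree of -f."""
    return [-v for v in usaif([-v for v in signal])]


signal = [0, 1, 3, 1, 2, 1, 0]
upper = usaif(signal)
lower = lsaif(signal)

# A strictly increasing contrast change commutes with the filtering in this sketch.
contrast = lambda v: v * v + 5  # strictly increasing on non-negative values
commutes = usaif([contrast(v) for v in signal]) == [contrast(v) for v in upper]
```

In this toy example both peaks of the signal survive in `upper` and `lower`, while the intermediate plateau is flattened differently in each, which mirrors the idea of two images with the same topology but different appearance.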
Conclusion
We propose a novel image-level variation method named dual structure-aware image filterings (DSAIF) for semi-supervised medical image segmentation. Specifically, we leverage the dual Max-tree and Min-tree image representations and remove all nodes having no siblings in the corresponding tree. This is equivalent to removing all topologically equivalent regions while preserving topologically critical ones, resulting in two images with diverse appearances that have the same topological structure as the original image. By incorporating the proposed DSAIF into mutually supervised networks, the consensus on erroneous predictions for unlabeled images is decreased. This helps to alleviate the confirmation bias issue, where models tend to overfit to noisy pseudo labels, thereby enhancing segmentation performance. Extensive experimental results on three widely used benchmark datasets demonstrate that the proposed method significantly and consistently outperforms the state-of-the-art methods. In the future, we would like to explore DSAIF in more semi-supervised medical image segmentation frameworks, and to use the tree of shapes for more structure-aware filters. Combining DSAIF with other topological analysis tools is also an interesting direction to explore.
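One plausible way to quantify the "consensus on erroneous predictions" is the Dice overlap between the error masks of the two networks. The exact definitions of Eq. (7) and Eq. (8) are not reproduced in this summary, so the sketch below is an assumption: it uses flat binary masks, and the names `error_mask` and `error_consensus` are hypothetical.

```python
def dice(mask_a, mask_b):
    """Dice overlap of two binary masks given as flat lists of 0/1."""
    intersection = sum(a & b for a, b in zip(mask_a, mask_b))
    total = sum(mask_a) + sum(mask_b)
    # Convention chosen here: two empty masks share no erroneous consensus.
    return 2.0 * intersection / total if total else 0.0


def error_mask(prediction, truth):
    """1 where the prediction disagrees with the ground truth, else 0."""
    return [int(p != t) for p, t in zip(prediction, truth)]


def error_consensus(pred_1, pred_2, truth):
    """Dice between the two networks' error masks (higher means shared mistakes)."""
    return dice(error_mask(pred_1, truth), error_mask(pred_2, truth))


truth = [0, 1, 1, 0, 0]
pred_1 = [0, 1, 0, 1, 0]  # wrong at positions 2 and 3
pred_2 = [0, 1, 0, 0, 1]  # wrong at positions 2 and 4
consensus = error_consensus(pred_1, pred_2, truth)
```

A lower value means the two networks make their mistakes in different places, so neither is reinforced by the other's wrong pseudo labels.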
Figure
Fig. 1. The pipeline of the proposed DSAIF framework using the mutual supervision of CPS (Chen et al., 2021) as the model-level variations. We propose novel dual structure-aware image filterings (DSAIF) based on the Max/Min-tree representation as the image-level variations. We remove every node (marked in red) without siblings in the Max/Min-tree, which is topologically equivalent to its ancestor node.
Fig. 2. An illustrative example of the proposed DSAIF. For the Max-tree (e) and Min-tree (f) built on the original image (b), we remove every node (marked in red) without siblings, which is topologically equivalent to its ancestor node. The two images reconstructed from the filtered Max/Min-tree, denoted USAIF (a) and LSAIF (c), have the same topological structure as the original image but quite different appearances. The numbers after the letters in (a), (b), and (c) represent the gray level of the region. The numbers in parentheses in (d–e) (resp. (f–g)) denote the level $h$ in Eq. (1) (resp. Eq. (2)).
Fig. 3. An illustrative example of leveraging the contrast-invariance property (a) of the Max/Min-tree in DSAIF. Applying monotonically increasing contrast changes before DSAIF increases the appearance diversity while preserving the same topological structure as the original images.
Fig. 4. Some qualitative results of DSAIF on the LA dataset (Xiong et al., 2021) (first row), the Pancreas-CT dataset (Clark et al., 2013) (middle row), and the PROMISE12 dataset (Litjens et al., 2014) (bottom row). The changed images in (b) are obtained by applying a monotonically increasing contrast change to the original ones.
Fig. 5. Some qualitative segmentation results of DSAIF on the LA dataset (Xiong et al., 2021) (first two rows), the Pancreas-CT dataset (Clark et al., 2013) (middle two rows), and the PROMISE12 dataset (Litjens et al., 2014) (bottom two rows).
Fig. 6. Quantitative trend analysis on the LA dataset under the CPS baseline for different labeled data portions.
Fig. 7. Quantitative trend analysis on the Pancreas-NIH dataset under the CPS baseline for different labeled data portions.
Fig. 8. Quantitative trend analysis on the PROMISE12 dataset under the CPS baseline for different labeled data portions.
Fig. 9. (a) Dice score $D_e$, defined in Eq. (7), between the erroneous predictions of two mutually supervised networks on unlabeled training images of the PROMISE12 dataset (Litjens et al., 2014) during the training process. (b) Dice score $D_r$, defined in Eq. (8), between the correct predictions of two mutually supervised networks on unlabeled training images of the PROMISE12 dataset (Litjens et al., 2014) during the training process.
Fig. 10. (a) (resp. (b)) Dice score between the ground truth and the outputs of network $f_{\theta_1}$ (resp. $f_{\theta_2}$) on unlabeled training images of the PROMISE12 dataset (Litjens et al., 2014) at different iterations of the training process.
Table
Table 1. Quantitative evaluation on the LA dataset (Xiong et al., 2021). We report the mean and standard deviation obtained over three runs.
Table 2. Quantitative evaluation on the Pancreas-NIH dataset (Clark et al., 2013). We report the mean and standard deviation obtained over three runs.
Table 3. Quantitative evaluation on the PROMISE12 dataset (Litjens et al., 2014). We report the mean and standard deviation obtained over three runs.
Table 4. Ablation study on the LA dataset (Xiong et al., 2021) under 10% labeled data using CPS (Chen et al., 2021) as the baseline. We report the mean and standard deviation obtained over three runs.
Table 5. Ablation study on the area threshold $\tau$ involved in the proposed DSAIF on the LA dataset (Xiong et al., 2021) under 10% labeled data using CPS (Chen et al., 2021) as the baseline.
Table 6. Cross-dataset performance on prostate segmentation. We report the mean and standard deviation over three runs.