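Deep Unfolding Network with Spatial Alignment for Multi-Modal MRI Reconstruction | Literature Express: Generative Models and Transformers in Medical Imaging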

Oldlee · November 27, 2024

Title

Deep unfolding network with spatial alignment for multi-modal MRI reconstruction

01

Introduction

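Magnetic resonance imaging (MRI) is a widely used medical imaging technique thanks to its non-invasiveness, high resolution, and excellent soft-tissue contrast. However, MRI scanning is relatively slow, both because spatial encoding requires repeated signal acquisitions and because of hardware limitations. The lengthy process can make patients uncomfortable and cause them to move, which introduces motion artifacts into the images and may compromise subsequent diagnosis. Accelerating MRI acquisition is therefore of great clinical importance. One feasible strategy is to acquire less k-space data and then reconstruct the fully sampled image from the under-sampled data.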
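To make the acceleration strategy concrete, the following is a minimal NumPy sketch of retrospective under-sampling with a 1D equispaced column mask, followed by a zero-filled reconstruction. It assumes a single-coil, real-valued test image; the function names, the `center_fraction` value, and the random test image are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def equispaced_mask(width, acceleration, center_fraction=0.08):
    """1D column mask: every `acceleration`-th column plus a sampled center band.

    `center_fraction` is an assumed value chosen for illustration.
    """
    mask = np.zeros(width, dtype=bool)
    mask[::acceleration] = True
    num_center = max(1, int(round(width * center_fraction)))
    start = (width - num_center) // 2
    mask[start:start + num_center] = True
    return mask

def undersample(image, mask):
    """Return the masked, centered k-space and the zero-filled reconstruction."""
    kspace = np.fft.fftshift(np.fft.fft2(image, norm="ortho"))
    masked = kspace * mask[np.newaxis, :]  # keep only the sampled columns
    zero_filled = np.fft.ifft2(np.fft.ifftshift(masked), norm="ortho")
    return masked, zero_filled

rng = np.random.default_rng(0)
image = rng.random((64, 64))                 # stand-in for a fully sampled slice
mask = equispaced_mask(64, acceleration=4)
masked_kspace, zero_filled = undersample(image, mask)
print("sampled columns:", int(mask.sum()), "of", mask.size)
```

The zero-filled image shows the aliasing that a reconstruction method has to remove. With an all-ones mask the function returns the original image.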

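Compressed sensing MRI (CS-MRI) methods can reconstruct accurately from under-sampled data at sampling rates well below what the Nyquist sampling theorem requires (Lustig et al., 2007). They design hand-crafted regularizers based on different priors, such as structured sparsity (Lai et al., 2016; Yang et al., 2015) and non-local sparsity (Qu et al., 2014; Eksioglu, 2016), and incorporate them into the optimization to constrain the solution space. Despite their theoretical guarantees, hand-crafting an optimal regularizer remains challenging. As an alternative, deep learning-based methods have attracted wide attention in MRI reconstruction for their accuracy and speed (Wang et al., 2016; Sriram et al., 2020; Zhu et al., 2018). By learning robust feature representations, they achieve remarkable reconstruction performance. However, most deep learning-based methods are black boxes that lack interpretability, which clinical practice requires. To address this black-box problem, deep unfolding networks (Yang et al., 2020a; Xin et al., 2022; Huang et al., 2024; Zhang et al., 2022; Jiang et al., 2023b,a) were proposed to integrate the imaging model and domain knowledge into the network. They unfold the iterations of an optimization algorithm into a deep neural network, which makes the learning process interpretable.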
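For orientation, the regularized problem that CS-MRI methods solve is commonly written in the generic single-coil form below. The symbols are assumptions introduced here for illustration and are not the paper's notation: $x$ is the image, $y$ the under-sampled k-space data, $F$ the Fourier transform, $M$ the sampling mask, $R$ the regularizer, and $\lambda$ its weight.

```latex
\hat{x} = \arg\min_{x} \; \frac{1}{2}\,\lVert M F x - y \rVert_2^2 + \lambda\, R(x)
```

A deep unfolding network keeps the data-fidelity term explicit and replaces the step associated with $R$ by a learned module, repeated for a fixed number of stages.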

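Despite the promise of deep unfolding networks for MRI reconstruction, most methods exploit information from a single modality only. In clinical practice, however, MRI images of different contrasts are usually acquired, because each modality reveals different tissue and organ characteristics and the complementary information across modalities supports more accurate diagnosis. Although images of different modalities show different signal types, they correspond spatially and depict the same anatomy. Studies (Xiang et al., 2018; Feng et al., 2023; Bian et al., 2022; Lei et al., 2023) show that the reconstruction of one modality (the target modality) can be improved by exploiting information from another (the reference modality). However, these multi-modal MRI reconstruction methods all assume that the images are perfectly aligned, which is rare in practice. Misalignment between modalities can degrade reconstruction performance because the correlation between modalities is not fully exploited.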

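To mitigate inter-modality misalignment, Lai et al. (2017) proposed alternating between registration and reconstruction so that the reference modality is aligned with the intermediate reconstruction of the target modality. However, traditional iterative optimization is relatively time-consuming. Deep learning methods that account for inter-modality misalignment achieve better reconstruction and spatial alignment accuracy, but they still share two main limitations: (1) the spatial alignment task is not adaptively integrated with the reconstruction process, resulting in insufficient complementarity between the two; (2) the overall framework lacks interpretability. For example, Xuan et al. (2022) and Liu et al. (2021) only align the reference image with the under-sampled target image before reconstruction, ignoring the negative effect that artifacts in the under-sampled image may have on spatial alignment.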

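To address these limitations, this paper proposes a Deep Unfolding Network with Spatial Alignment (DUN-SA). Specifically, we first derive a novel joint alignment-reconstruction model with an aligned cross-modal prior term designed to compensate for misalignment between modalities. We then relax it into a cross-modal spatial alignment task and a multi-modal reconstruction task, and propose an optimization algorithm that solves the two alternately: a gradient-based algorithm for spatial alignment and the half-quadratic splitting (HQS) algorithm for reconstruction. Next, we unfold the iterative stages of the algorithm and design corresponding network modules, which yields an end-to-end deep unfolding network with interpretability. The main contributions of this paper are as follows: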

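• We propose a novel joint alignment-reconstruction model for multi-modal MRI reconstruction, in which an aligned cross-modal prior term compensates for misalignment between modalities while the cross-modal prior is learned. Using gradient-based and HQS techniques, we design an optimization algorithm that solves the model alternately.

• By unfolding the iterative stages of the proposed algorithm and combining them with specially designed network modules, we build a deep unfolding network, DUN-SA, with clear interpretability.

• We design an Aligned Inter-modality Prior Learning Block (AIPLB) that learns the cross-modal prior from the aligned image of the reference modality, and use a Denoising Block (DB) to fully exploit the intra-modality prior.

• Through extensive experiments on the fastMRI, IXI, In-house, and BraTs 2018 datasets, we show that the proposed DUN-SA outperforms existing state-of-the-art methods in both quantitative and qualitative reconstruction results.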
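The HQS technique mentioned above alternates a prior step with a data-consistency step that has a closed form in k-space. The sketch below shows that alternation for a generic single-coil problem. It is not the paper's algorithm: the spatial alignment step and the cross-modal prior are omitted, the denoiser is a hand-written placeholder rather than a learned module, and all names, the penalty weight `mu`, and the unshifted FFT layout of `mask` are assumptions.

```python
import numpy as np

def fft2c(x):
    """Orthonormal 2D FFT (unshifted layout)."""
    return np.fft.fft2(x, norm="ortho")

def ifft2c(k):
    """Orthonormal 2D inverse FFT (unshifted layout)."""
    return np.fft.ifft2(k, norm="ortho")

def data_consistency(z, y, mask, mu):
    """Closed-form minimizer of 0.5*||M F x - y||^2 + 0.5*mu*||x - z||^2.

    `y` is assumed to be zero at unsampled locations and `mask` to be 0/1.
    """
    return ifft2c((mask * y + mu * fft2c(z)) / (mask + mu))

def placeholder_denoiser(x):
    """Stand-in prior step (assumption): average each pixel with its 4 neighbours."""
    neighbours = (np.roll(x, 1, 0) + np.roll(x, -1, 0)
                  + np.roll(x, 1, 1) + np.roll(x, -1, 1))
    return (x + neighbours) / 5.0

def hqs_reconstruct(y, mask, mu=0.5, num_stages=5):
    """Alternate the prior step and the data-consistency step."""
    x = ifft2c(y)  # zero-filled initialization
    for _ in range(num_stages):
        z = placeholder_denoiser(x)           # prior (auxiliary variable) update
        x = data_consistency(z, y, mask, mu)  # image update
    return x
```

In the data-consistency step, sampled k-space entries become a weighted average of the measurement and the prior estimate, while unsampled entries are taken from the prior estimate alone. In an unfolded network, the placeholder denoiser is where a learned module would go.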

Abstract

Multi-modal Magnetic Resonance Imaging (MRI) offers complementary diagnostic information, but some modalities are limited by the long scanning time. To accelerate the whole acquisition process, MRI reconstruction of one modality from highly under-sampled k-space data with another fully-sampled reference modality is an efficient solution. However, the misalignment between modalities, which is common in clinic practice, can negatively affect reconstruction quality. Existing deep learning-based methods that account for inter-modality misalignment perform better, but still share two main common limitations: (1) The spatial alignment task is not adaptively integrated with the reconstruction process, resulting in insufficient complementarity between the two tasks; (2) the entire framework has weak interpretability. In this paper, we construct a novel Deep Unfolding Network with Spatial Alignment, termed DUN-SA, to appropriately embed the spatial alignment task into the reconstruction process. Concretely, we derive a novel joint alignment-reconstruction model with a specially designed aligned cross-modal prior term. By relaxing the model into cross-modal spatial alignment and multi-modal reconstruction tasks, we propose an effective algorithm to solve this model alternatively. Then, we unfold the iterative stages of the proposed algorithm and design corresponding network modules to build DUN-SA with interpretability. Through end-to-end training, we effectively compensate for spatial misalignment using only reconstruction loss, and utilize the progressively aligned reference modality to provide inter-modality prior to improve the reconstruction of the target modality. Comprehensive experiments on four real datasets demonstrate that our method exhibits superior reconstruction performance compared to state-of-the-art methods.

Method

In Section 3, we propose a joint alignment-reconstruction model for multi-modal MRI reconstruction and design an optimization algorithm in Section 3.1. Then, we unfold the iterative stages of the proposed algorithm with corresponding network modules and build DUN-SA in Section 3.2. Finally, we introduce the details of network parameters and network training in Section 3.3.
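The overall control flow of an unfolded alternating scheme can be sketched as a loop over stages. The skeleton below is inferred only from the module names in the Fig. 1 caption and the symbols in the Fig. 14 caption (SAM, AIPLB, DB, DCB; $x_t$, $\phi_t$, $z_t$, $s_t$). The order of the updates, the function signatures, and the zero-filled initialization are assumptions, and every module is passed in as a hypothetical callable rather than implemented.

```python
import numpy as np

def run_unfolded_stages(y, x_ref, num_stages,
                        align, warp, inter_prior, denoise, data_consistency):
    """Hypothetical stage loop; each callable stands in for a network module."""
    x = np.fft.ifft2(y, norm="ortho")  # zero-filled start (assumption)
    phi = None                         # alignment parameters, e.g. a deformation
    for _ in range(num_stages):
        phi = align(x, x_ref, phi)          # spatial alignment module (SAM)
        x_ref_aligned = warp(x_ref, phi)    # warped reference image
        z = inter_prior(x, x_ref_aligned)   # aligned inter-modality prior (AIPLB)
        s = denoise(x)                      # intra-modality denoising prior (DB)
        x = data_consistency(z, s, y)       # data consistency (DCB)
    return x, phi
```

Because the alignment update sees the current reconstruction at every stage, alignment and reconstruction can refine each other. This is the property the text contrasts with aligning once before reconstruction.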

Conclusion

In this paper, we propose a novel joint alignment and reconstruction model for multi-modal MRI reconstruction. By developing an aligned cross-modal prior term, we integrate the spatial alignment task into the reconstruction process. We design an optimization algorithm for solving it and then unfold each iterative stage into the corresponding network module. As a result, we have constructed a deep unfolding network with interpretability, termed DUN-SA. Through end-to-end training, we fully leverage both intra-modality and inter-modality priors. Comprehensive experiments conducted on four real datasets have demonstrated that the proposed DUN-SA outperforms current state-of-the-art methods in both quantitative and qualitative assessments. Additionally, we have verified that DUN-SA is relatively robust to misalignment, with minimal impact on spatial alignment even as acceleration factors increase.

Figure

Fig. 1. The overall structure of the proposed Deep Unfolding Network with Spatial Alignment (DUN-SA) consists of SAM (Spatial Alignment Module) and RM (Reconstruction Module). The RM is composed of AIPLB (Aligned Inter-modality Prior Learning Block), DB (Denoising Block), and DCB (Data Consistency Block). SAM is used to solve the spatial alignment task, while RM is for the reconstruction task. Specifically, AIPLB is used to learn the aligned inter-modality prior, DB is used to learn the denoising prior, and DCB is used to enforce the data consistency constraint.

Fig. 2. Architecture of the Spatial Alignment Module (SAM).

Fig. 3. Detailed configurations of SA-Net, ProxNet_Z, and ProxNet_S.

Fig. 4. Visual comparison with representative methods for 4× acceleration under 1D equispaced subsampling mask on the fastMRI dataset. First row: Reconstructed images by different methods; second row: Zoomed-in region of interest; third row: Equispaced mask of 4× acceleration and error maps of different methods.

Fig. 5. Visual comparison with representative methods for 8× acceleration under 1D equispaced subsampling mask on the IXI dataset. First row: Reconstructed images by different methods; second row: Zoomed-in region of interest; third row: Equispaced mask of 8× acceleration and error maps of different methods.

Fig. 6. Visual comparison with representative methods for 8× acceleration under 1D equispaced subsampling mask on the In-house dataset. First row: Reconstructed images by different methods; second row: Zoomed-in region of interest; third row: Equispaced mask of 8× acceleration and error maps of different methods.

Fig. 7. Visual comparison with representative methods for 8× acceleration under 1D random subsampling mask on the BraTs 2018 dataset. First row: Reconstructed images by different methods; second row: Zoomed-in region of interest; third row: Random mask of 8× acceleration and error maps of different methods.

Fig. 8. Comparison of the learning trajectories of different models on the fastMRI dataset for 4× acceleration under equispaced mask.

Fig. 9. Quantitative comparison of multi-modal MRI reconstruction on the fastMRI dataset with different scales of simulated spatial misalignment. Left y-axes are for reconstruction performances ("DUN-SA", "SAN", "Single Modality SAN" and "Single Modality DUN-SA") while the right y-axes are for the "Difference" between "DUN-SA" and "SAN".

Fig. 10. The visualization of the effects of spatial alignment on the fastMRI dataset. (a) shows the original fully-sampled T1 image. (b) represents the results of aligning an under-sampled T2 image with a fully-sampled T1 image by traditional method. (c) depicts the results of integrating traditional spatial alignment with reconstruction for joint optimization. (d) displays the result of DUN-SA. Details are shown in the first row: zoomed-in view of aligned T1 images and third row: zoomed-in views of checkerboard visualizations.

Fig. 11. The visualization of the effects of spatial alignment on the In-house dataset. (a), (b) represent fully-sampled T2-weighted image and fully-sampled T1-weighted image, respectively. (c), (e), and (g) depict T1-weighted images aligned using SAN under different acceleration factors, while (d), (f), and (h) display T1-weighted images aligned using DUN-SA under different acceleration factors. In the second row, a grid is used to facilitate observation of the spatial position of each aforementioned image, with zoomed-in views presented in the third row. In the fourth row, checkerboard visualizations are employed to demonstrate the misalignment between T2-weighted image and T1-weighted image/aligned T1-weighted image, and the last row magnifies the corresponding areas to display the details more clearly.

Fig. 12. The PSNR and SSIM curves on the fastMRI, IXI, In-house and BraTs datasets with different numbers of stages k.

Fig. 13. Visual comparison with effect of each component for 8× acceleration under 1D equispaced subsampling mask on the fastMRI dataset. First row: Reconstructed images by different methods; second row: Zoomed-in region of interest; third row: Equispaced mask of 8× acceleration and error maps of different methods.

Fig. 14. Visualization of intermediate results at stage t: ground truth x_gt, reconstructed image x_t, warped reference image T(x_ref, φ_t) denoted as x_ref^t, inter-modality prior z_t, and intra-modality prior s_t.
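Figs. 10 and 11 use checkerboard visualizations to expose misalignment between two images. A minimal NumPy sketch of such a composite is given below; the function name and the `tile` size are illustrative assumptions, and both inputs are assumed to be 2D arrays of the same shape and comparable intensity range.

```python
import numpy as np

def checkerboard(image_a, image_b, tile=8):
    """Interleave two same-shaped 2D images in alternating square tiles."""
    rows, cols = np.indices(image_a.shape)
    use_a = ((rows // tile) + (cols // tile)) % 2 == 0
    return np.where(use_a, image_a, image_b)
```

When the two images are well aligned, anatomical edges continue smoothly across tile borders. Misalignment appears as broken or shifted edges at the borders.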

Table

Table 1. Quantitative evaluation of DUN-SA vs. other methods on the fastMRI dataset for 4× and 8× acceleration under equispaced and random 1D subsampling masks, where T1-weighted images are used as reference modality to assist the reconstruction of T2-weighted images. Best results are emphasized in bold, and the second best are emphasized with an underline.

Table 2. Quantitative evaluation of DUN-SA vs. other methods on the IXI dataset for 4× and 8× acceleration under equispaced and random 1D subsampling masks, where PD-weighted images are used as reference modality to assist the reconstruction of T2-weighted images. Best results are emphasized in bold, and the second best are emphasized with an underline.

Table 3. Quantitative evaluation of DUN-SA vs. other methods on the In-house dataset for 4× and 8× acceleration under equispaced and random 1D subsampling masks, where T1-weighted images are used as reference modality to assist the reconstruction of T2-weighted images. Best results are emphasized in bold, and the second best are emphasized with an underline.

Table 4. Quantitative evaluation of DUN-SA vs. other methods on the BraTs 2018 dataset for 4× and 8× acceleration under equispaced and random 1D subsampling masks, where the Reference/Target modalities are T2/FLAIR; T1/FLAIR; T1CE/FLAIR; T1/T1CE; T1CE/T2. Best results are emphasized in bold.

Table 5. Evaluation of adaptation to scenarios with imperfect reference data. US-Half/Rec-Half (US-All/Rec-All) indicates that half (all) of the reference modality data are under-sampled/reconstructed.

Table 6. Effect of each component on the performance of DUN-SA on the fastMRI, IXI, In-house and BraTs datasets for 4× and 8× acceleration under equispaced and random subsampling masks, measured in PSNR and SSIM.

Table 7. Complexity analysis of representative models.
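The tables report reconstruction quality in PSNR and SSIM. As a reference for the first metric, here is a minimal sketch of peak signal-to-noise ratio in NumPy. The default `data_range` (the reference image's max minus min) is an assumption; the convention used for the paper's numbers is not stated in this digest.

```python
import numpy as np

def psnr(reference, estimate, data_range=None):
    """Peak signal-to-noise ratio in dB between a reference and an estimate."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    if data_range is None:
        # Assumed convention: dynamic range of the reference image.
        data_range = reference.max() - reference.min()
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)
```

Higher values indicate a smaller mean squared error relative to the image's dynamic range.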