Title
Fourier Convolution Block with global receptive field for MRI reconstruction
01
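Paper overview
Reconstructing images from undersampled magnetic resonance imaging (MRI) signals can significantly reduce scan time and improve clinical practice. Methods based on convolutional neural networks (CNNs) perform well in MRI reconstruction, but their limited receptive field (RF) may prevent them from capturing global features. This matters especially for reconstruction because aliasing artifacts are distributed globally. Recent progress in Vision Transformers has further underlined the importance of a large receptive field. This study proposes a novel global Fourier Convolution Block (FCB) that has a whole-image receptive field and low computational complexity, obtained by moving the regular spatial-domain convolution into the frequency domain. Visualizations of the effective receptive field and of the trained kernels show that FCB enlarges the receptive field of reconstruction models in practice. FCB was evaluated on four popular CNN architectures using brain and knee MRI datasets. Models with FCB outperformed the baseline models in PSNR and SSIM and recovered richer detail and texture.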
Abstract
Reconstructing images from under-sampled Magnetic Resonance Imaging (MRI) signals significantly reduces scan time and improves clinical practice. However, Convolutional Neural Network (CNN)-based methods, while demonstrating great performance in MRI reconstruction, may face limitations due to their restricted receptive field (RF), hindering the capture of global features. This is particularly crucial for reconstruction, as aliasing artifacts are distributed globally. Recent advancements in Vision Transformers have further emphasized the significance of a large RF. In this study, we proposed a novel global Fourier Convolution Block (FCB) with a whole-image RF and low computational complexity by transforming the regular spatial-domain convolutions into the frequency domain. Visualizations of the effective RF and trained kernels demonstrated that FCB improves the RF of reconstruction models in practice. The proposed FCB was evaluated on four popular CNN architectures using brain and knee MRI datasets. Models with FCB achieved higher PSNR and SSIM than baseline models and recovered more detail and texture.
Method
3.1. Fourier convolution
We will demonstrate the capability of Fourier Convolution to achieve both local and global RF in discrete scenarios. Fourier Convolution involves element-wise multiplication in the frequency domain. The Discrete Fourier Transform (DFT) transforms 2D images in the neural network to the frequency domain and is formulated as:
$$X(k_1, k_2) = \sum_{n=0}^{N-1}\sum_{m=0}^{N-1} \frac{x(n, m)}{N^2} \exp\left[-\frac{2\pi j}{N}(n k_1 + m k_2)\right]$$
The Fourier Convolution is then formulated as the element-wise multiplication between the spectrum and a production kernel $W$ of the same size:
$$Y(k_1, k_2) = X(k_1, k_2) \odot W(k_1, k_2)$$
With the definition of inverse DFT (IDFT):
$$x(n, m) = \sum_{k_1=0}^{N-1}\sum_{k_2=0}^{N-1} X(k_1, k_2) \exp\left[\frac{2\pi j}{N}(n k_1 + m k_2)\right]$$
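it can be deduced that Fourier Convolution is equivalent, up to a constant normalization factor, to spatial convolution:
$$\mathrm{IDFT}\{Y\} = x \ast w$$
where "∗" denotes convolution with circular padding, with kernel indices taken modulo $N$:
$$(x \ast w)(n, m) = \sum_{p=0}^{N-1}\sum_{q=0}^{N-1} x(p, q)\, w\big((n - p) \bmod N,\ (m - q) \bmod N\big)$$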
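The equivalence can be checked by a short derivation. The sketch below assumes that $w$ is defined as the IDFT of $W$ and is extended periodically with period $N$ in both indices, and it uses the DFT normalization written above (factor $1/N^2$ on the forward transform, none on the inverse). Under those assumptions a constant $1/N^2$ appears in front of the circular convolution; since $W$ is a learned kernel, this constant can be absorbed into it.

```latex
\begin{align}
\mathrm{IDFT}\{Y\}(n,m)
&= \sum_{k_1=0}^{N-1}\sum_{k_2=0}^{N-1} X(k_1,k_2)\,W(k_1,k_2)\,
   \exp\!\left[\tfrac{2\pi j}{N}(nk_1+mk_2)\right] \\
&= \frac{1}{N^2}\sum_{p=0}^{N-1}\sum_{q=0}^{N-1} x(p,q)
   \sum_{k_1=0}^{N-1}\sum_{k_2=0}^{N-1} W(k_1,k_2)\,
   \exp\!\left[\tfrac{2\pi j}{N}\big((n-p)k_1+(m-q)k_2\big)\right] \\
&= \frac{1}{N^2}\sum_{p=0}^{N-1}\sum_{q=0}^{N-1} x(p,q)\,
   w\big((n-p) \bmod N,\ (m-q) \bmod N\big)
\end{align}
```

The second line substitutes the definition of $X$ and swaps the order of summation; the third recognizes the inner double sum as the IDFT of $W$ evaluated at the shifted index.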
The convolution kernel $w$ is the IDFT of the FCB kernel $W$ and has size $N \times N$, but it can be expressed as a zero-padded version of some smaller kernel $w'$ of size $K \times K$ ($1 \le K \le N$):
$$w(n, m) = \begin{cases} w'(n, m), & 0 \le n < K,\ 0 \le m < K \\ 0, & \text{otherwise} \end{cases}$$
This implies that, although the production kernel $W$ in the Fourier domain always has the same size as the input, its spatial counterpart can be a zero-padded convolution kernel whose size $K \times K$ ranges from $1 \times 1$ to the input size $N \times N$. Fourier convolution can therefore provide a global RF, since it can be equivalent to a convolution whose kernel is as large as the input image. A local RF is also available when the Fourier convolution corresponds to a spatial convolution with a small kernel.
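The equivalence described in this section can be checked numerically. The following is a minimal NumPy sketch, not the authors' implementation: the function names `circular_conv2d` and `fourier_conv2d` are hypothetical, and NumPy puts the $1/N^2$ normalization on the inverse transform rather than the forward one, so the two results match exactly instead of up to a constant.

```python
import numpy as np


def circular_conv2d(x, w_small):
    """Direct circular convolution of an N x N image with a K x K kernel."""
    K = w_small.shape[0]
    y = np.zeros_like(x, dtype=float)
    for a in range(K):
        for b in range(K):
            # np.roll(x, (a, b))[n, m] == x[(n - a) % N, (m - b) % N]
            y += w_small[a, b] * np.roll(x, shift=(a, b), axis=(0, 1))
    return y


def fourier_conv2d(x, w_small):
    """Same operation as an element-wise product in the frequency domain."""
    N = x.shape[0]
    K = w_small.shape[0]
    w = np.zeros((N, N))
    w[:K, :K] = w_small          # zero-pad the K x K kernel to N x N
    W = np.fft.fft2(w)           # frequency-domain kernel, always N x N
    Y = np.fft.fft2(x) * W       # element-wise multiplication
    return np.fft.ifft2(Y).real  # imaginary part is only round-off for real inputs


rng = np.random.default_rng(0)
x = rng.standard_normal((16, 16))
for K in (1, 3, 16):             # local, small, and whole-image kernels
    w_small = rng.standard_normal((K, K))
    assert np.allclose(circular_conv2d(x, w_small), fourier_conv2d(x, w_small))
```

The frequency-domain kernel has the same shape for every `K`; only its spatial counterpart changes, which is why a single block can realize anything from a 1 × 1 to a whole-image receptive field.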
Conclusion
This paper introduced a novel convolution block design with a global receptive field for MRI reconstruction CNNs. The experimental results showed that the proposed FCB effectively improved the reconstruction performance of the baseline CNN models. At different undersampling rates, models enhanced with FCB achieved better quantitative metrics on various datasets, even with additional noise. Furthermore, these models were better at recovering intricate details, including texture and edges. Notably, the models with FCB outperformed Vision Transformers, which are considered mainstream models with large receptive fields. FCB models also surpassed methods that embed k-space data to enhance long-range connections. FCB also demonstrated low computational complexity, with experiments showing that its runtime is significantly lower than that of Vision Transformers and comparable to that of traditional CNNs with an 11 × 11 convolution kernel.
Visualization demonstrated that the proposed FCB effectively scales up the RF of CNNs. Unlike other approaches that focus on the architecture design of CNNs, our approach targets the basic convolution layer in CNNs. Some other works use cascaded or pyramidal architectures (Schlemper et al., 2017; Chen et al., 2022; Sriram et al., 2020a) to capture long-distance correlation. Our proposed FCB serves as a plug-and-play block that has the potential to be incorporated into these various CNN architectures. Moreover, FCB processes data in the hidden layers as real values and exploits conjugate symmetry in the frequency domain to save memory and time. For methods that build complex-valued CNNs (Wang et al., 2020; Cole et al., 2021), FCB could also be smoothly integrated in complex mode.
The proposed FCB approach still has some limitations. Although its runtime memory usage is comparable to that of spatial convolution, the number of parameters in FCB is still significantly higher, leading to increased storage costs. This presents a tradeoff between memory and speed when a large RF is desired. Additionally, the proposed FCB involves repeated FFTs and IFFTs, which may reduce computing efficiency. Some works (Ayat et al., 2019; Watanabe and Wolf, 2021) have attempted to design pure spectral-based CNNs to address this issue, but the question of activation functions in the Fourier domain remains open. Future research in this area could explore ways to improve the computing efficiency of FCB.
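The conjugate-symmetry saving and the parameter tradeoff mentioned above can be illustrated with NumPy's real-input FFT. This is a sketch under stated assumptions, not the paper's code: the helper names are hypothetical, and `fcb_params` assumes one learnable complex kernel stored on the half spectrum per channel, which may differ from how the paper counts parameters in its Table 1.

```python
import numpy as np


def full_spectrum_conv(x, w):
    """Circular convolution using the full N x N complex spectrum."""
    return np.fft.ifft2(np.fft.fft2(x) * np.fft.fft2(w)).real


def half_spectrum_conv(x, w):
    """Same result using only the non-redundant half of the spectrum.

    For real input, X(-k) = conj(X(k)), so rfft2 keeps N x (N // 2 + 1) entries.
    """
    N = x.shape[0]
    return np.fft.irfft2(np.fft.rfft2(x) * np.fft.rfft2(w), s=(N, N))


def spatial_params(K):
    """Real parameters in one K x K spatial kernel."""
    return K * K


def fcb_params(N):
    """Real parameters in one half-spectrum complex kernel (assumed layout)."""
    return 2 * N * (N // 2 + 1)  # real and imaginary parts


N = 32
rng = np.random.default_rng(0)
x = rng.standard_normal((N, N))  # real-valued hidden feature map
w = rng.standard_normal((N, N))  # real-valued whole-image kernel

assert np.fft.rfft2(x).shape == (N, N // 2 + 1)
assert np.allclose(full_spectrum_conv(x, w), half_spectrum_conv(x, w))
assert fcb_params(N) > spatial_params(11)  # storage grows with the input size
```

The half spectrum roughly halves the memory and multiplications of the frequency-domain product, while the kernel size still scales with the input resolution rather than with a fixed `K`.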
Figure
Fig. 1. Illustration of the proposed Fourier Convolution Block (FCB). (a) Three traditional convolution layers with kernel sizes of 3 × 3, 11 × 11, and 21 × 21. (b) The equivalent FCBs, each of size $N \times N$, corresponding to the traditional convolution layers with the different kernel sizes shown above.
Fig. 2. The architecture of the baseline models and the proposed convolution blocks (only one iteration of MoDL and VSNet is shown). Abbreviations: DW = Depth-wise Convolution, PW = Point-wise Convolution. The convolution blocks are colored consistently with the model views, with DW layers highlighted in red to indicate that the convolution operation there can be replaced by FCB.
Fig. 3. A reconstruction example of T2-weighted data from the validation set of the brain dataset. The top shows the result at 8× acceleration and the bottom shows the result at 12× acceleration. The PSNR and SSIM of each single reconstructed image are noted in the top right-hand corner. The second and fifth rows show the reconstruction of the zoomed region marked in the ground truth. The third and sixth rows show the residual error in this zoomed region, with all errors multiplied by 10 for better visualization.
Fig. 4. A reconstruction example of T1-weighted data from the validation set of the brain dataset.
Fig. 5. A reconstruction example from the validation set of the knee dataset. The top shows the result at 8× acceleration and the bottom shows the result at 12× acceleration.
Fig. 6. An example of brain T1-weighted data reconstructed by F-MoDL and other methods.
Fig. 7. PSNR results on the knee validation dataset at 8× acceleration when FCB is applied to different layers in the model. (a) Results for UNet. (b) Results for MoDL. (c) Results for VSNet.
Fig. 8. Spectral visualization of convolution kernels in UNet and F-UNet (FCB was deployed in the last 6 layers). (a) Spectral maps of UNet. (b) Spectral maps of F-UNet. The spectral amplitudes of the convolution kernels, or of the kernels in FCB, are shown from left to right along the depth of UNet. The upper and lower rows correspond to the double convolution layers in each UNet block. The rank of each kernel spectrum is noted in the bottom right-hand corner of its map.
Fig. 9. (a) Effective receptive field (ERF) of UNet. (b) ERF of F-UNet. (c) Point spread function (PSF) of the 2D Poisson sampling pattern. The ERF of F-UNet covers a larger region, closer to the sampling PSF.
Table
Table 1. Comparison of the computation operations and parameters between regular convolution and FCB.
Table 2. Quantitative results on the validation sets of the brain and knee datasets at 8× and 12× acceleration.
Table 3. Quantitative results on the validation set using Cartesian and radial masks.
Table 4. Quantitative results on the validation set of the brain dataset at 8× acceleration when 10% or 20% Gaussian noise is added.
Table 5. Quantitative results compared with other methods.
Table 6. Comparison of the reconstruction performance and runtime of UNet with different kernel sizes, FasterFC-UNet, and F-UNet on the knee dataset at 8× acceleration.
Table 7. Ablation study assessing the impact of the modifications and the re-parametrization method on the knee validation dataset at 8× acceleration.