Title
Reinforced physiology-informed learning for image completion frompartial-frame dynamic PET imaging
01
文献速递介绍
Positron emission tomography (PET) is a widely used molecular imaging technique that visualizes metabolic and physiological processes in vivo through radioactive tracers (Hooker and Carson, 2019; Beyer et al., 2020). After tracer injection, data can be acquired in two modes: static and dynamic (Boellaard, 2009). Compared with static PET, dynamic PET continuously tracks the distribution of the tracer in the body over time, enabling noninvasive characterization and quantitative evaluation through temporal information (Wu et al., 2024; Meng et al., 2024), which in turn facilitates the extraction of tracer kinetics and related physiological information.

Physiology-based kinetic models are central to dynamic PET data analysis (Zhang et al., 2020; Wang et al., 2020; Chen et al., 2021), helping to explain how the tracer moves from blood into tissue and how fast it is metabolized within tissue. Overall, dynamic PET imaging and its kinetic modeling show growing potential in drug development, disease diagnosis, and treatment-response monitoring (Meikle et al., 2021).

However, in standard clinical settings, dynamic scans for different tracers require complex acquisition protocols. Dynamic scans with commonly used tracers such as 18F-fluorodeoxyglucose (18F-FDG) typically last more than 60 min to ensure complete tracer clearance from normal tissue (Wang et al., 2022). Long scans burden both patients and medical staff, reduce scanner throughput, and aggravate motion artifacts caused by patient movement. Moreover, long dynamic PET scans inevitably generate a large number of data frames, posing a major challenge for the acquisition and storage of massive datasets (Feng et al., 2021; Hu et al., 2020).

Missing-frame completion is an effective strategy for shortening the total acquisition time of dynamic PET. Scan time is usually reduced with a shortened single-time-window protocol or a dual-time-window protocol, in which images for the missing time frames are interpolated or extrapolated with conventional linear interpolation or kinetic models. Scott et al. (2019) incorporated cerebral blood flow (CBF) information from arterial spin labeling (ASL) MRI into pharmacokinetic modeling and used the model to extrapolate the missing 30-min time-activity curves (TACs). Kolinger et al. (2021) and Wang et al. (2022) proposed dual-time-window protocols that shorten the acquisition via two separate PET scans and then estimate the missing data points in the resting period with linear interpolation or kinetic-model-based interpolation.

To improve prediction accuracy, several methods combining deep learning with kinetic models have emerged. Hong et al. (2023) used neural ordinary differential equations (N-ODE) to mimic analytical methods in a data-driven manner and predict TACs over longer time ranges. Liang et al. (2023) used a U-Net as a feature extractor to capture prior knowledge of image structure, then fed the extracted feature vectors into a parameter generator to obtain compartment-model parameters and generate the missing TACs. However, these deep learning methods require large training datasets for supervised learning and cannot schedule patient scans flexibly; their training is restricted to a single completion pattern (e.g., predicting the last 30 min of images from an initial 30-min scan), and they must also compute analytical solutions of the kinetic model, increasing the computational burden.

Inspired by these studies (Wang et al., 2022; Hong et al., 2023; Liang et al., 2023), this paper proposes a novel unsupervised deep learning method for completing partial-frame dynamic PET images. Specifically, the method embeds the physiological principles inherent to dynamic PET into the network, guiding the network output to approximate the known partial data; by training the network to update the kinetic parameters, it completes the missing images without supervision. The proposed framework is built on physics-informed neural networks (PINN) and time implicit neural representations (TINR).

To our knowledge, PINNs exploit the strong fitting capability of neural networks by incorporating physical laws into the network loss function as prior knowledge (Karniadakis et al., 2021; Podina et al., 2023; Markidis, 2021; de Vries et al., 2023). These physical laws include terms reflecting boundary and initial conditions in the spatiotemporal domain, as well as physics residual terms at selected points in the solution domain, known as collocation points (Cuomo et al., 2022; Chen et al., 2020; Kashinath et al., 2021). TINR, in turn, uses neural networks to continuously represent the values of 1D sequences or 2D images at different time points (Fons et al., 2022).

The proposed method leverages the spatiotemporal representation capability of implicit neural representations (INR) to reinforce the original physiological constraints, extracting latent physiological information from partial-frame dynamic PET data to recover the missing frame images and predict the kinetic parameters. In summary, we constrain the network with data terms, boundary terms, and reinforced physiology residual terms, and experimentally verify the effectiveness of these constraints and their insensitivity to the trade-off parameters. We first design and compare three flexible scanning schemes and select the best one for detailed analysis; the proposed model is then applied to simulation data, real rat data, and patient data to verify its feasibility and performance. The main contributions of this work are as follows:

(1) The designed reinforced physiology-informed neural network (RPINN) does not require analytical solutions of the kinetic model; instead, it embeds physiological constraints directly into the loss function, reducing computational complexity and improving efficiency.

(2) RPINN is an unsupervised deep learning completion method that requires no specific training dataset and can be applied to different species and organs with limited data, giving it broad applicability.

(3) RPINN can substantially shorten the scan time of dynamic PET and can be flexibly applied to different shortened PET scanning protocols, increasing scanner throughput, improving patient comfort, and reducing errors caused by patient motion.
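As background for the kinetic modeling discussed above, the two-tissue compartment model that underlies this work can be sketched numerically. The following is a minimal illustration only, not the authors' implementation: the input function `Cp` and the rate constants are arbitrary example values, and a simple forward-Euler integrator stands in for a proper ODE solver.

```python
import numpy as np

def simulate_2tcm(t, K1, k2, k3, k4, VB, Cp):
    """Forward-Euler simulation of the two-tissue compartment model.

    C1 is the free (non-metabolized) compartment, C2 the bound (metabolized)
    compartment; Cp is the plasma input function (AIF)."""
    C1 = np.zeros_like(t)
    C2 = np.zeros_like(t)
    cp = Cp(t)
    for i in range(1, len(t)):
        dt = t[i] - t[i - 1]
        dC1 = K1 * cp[i - 1] - (k2 + k3) * C1[i - 1] + k4 * C2[i - 1]
        dC2 = k3 * C1[i - 1] - k4 * C2[i - 1]
        C1[i] = C1[i - 1] + dt * dC1
        C2[i] = C2[i - 1] + dt * dC2
    # The measured tissue activity includes a fractional blood volume term VB
    return (1 - VB) * (C1 + C2) + VB * cp

# Hypothetical AIF and rate constants (arbitrary example values, not from the paper)
Cp = lambda t: 8.0 * t * np.exp(-t / 0.5) + 0.8 * np.exp(-t / 20.0)
t = np.linspace(0, 60, 6000)  # minutes
tac = simulate_2tcm(t, K1=0.1, k2=0.4, k3=0.05, k4=0.01, VB=0.05, Cp=Cp)
```

With a small k4, activity accumulates in the bound compartment over the scan, which is why FDG protocols run long enough for the late-time behavior to emerge.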
Abstract
Dynamic positron emission tomography (PET) imaging using 18F-FDG typically requires over an hour to acquire a complete time series of images. Therefore, reducing dynamic PET scan time is crucial for minimizing errors caused by patient movement and increasing the throughput of the imaging equipment. However, shortening the scanning time will lead to the loss of images in some frames, affecting the accuracy of PET parameter estimation. In this paper, we proposed a method that combined physiology-informed learning with time implicit neural representations for kinetic modeling and missing-frame dynamic PET image completion. Based on the two-tissue compartment model, three types of constraint terms were constructed for network training, including data terms, boundary terms, and reinforced physiology residual terms. The method works effectively without the need for specific training datasets, making it feasible even with limited data. Three commonly used scanning schemes were defined to verify the feasibility of the proposed method, and the performance was evaluated based on simulation data and real rat data. The best-performing scheme was selected for detailed analysis of PET images and parameter maps on datasets of four human organs obtained from Biograph Vision Quadra. Our method outperforms traditional nonlinear least squares (NLLS) fitting in both reconstruction quality and computational efficiency. The metrics calculated from different organs, such as the brain (SSIM > 0.98) and the thorax (PSNR > 40), show that the proposed network can achieve promising performance.
Method
2.1. Problem formulation
In the proposed RPINN model, the inverse problem of image completion can be formulated as:

$$\arg\min_{\theta,\omega} \sum_{l=1}^{N} L\left(S_l G F_{\theta,\omega}(t),\, X_l\right) + \mu \left\| D F_{\theta,\omega}(t) - g\left(F_{\theta,\omega}(t), \omega, t\right) \right\|^2 \tag{1}$$

The first term in Eq. (1) calculates the error between the PET images predicted by the model and the known $N$-frame partial images $X_K = \{X_l\}_{l=1}^{N}$, quantifying it through the loss function $L$. Here, $\theta$ and $\omega$ represent the network parameters and kinetic parameters to be optimized, respectively. $F_{\theta,\omega}(t) \in \mathbb{R}^{N_T \times N_P}$ represents the activity values of the tissue at $N_T$ discrete time points for $N_P$ voxels. By multiplying by the integral operator $G \in \mathbb{R}^{N_A \times N_T}$, the PET images for all $N_A$ frames are obtained. The row vector $S_l \in \mathbb{R}^{1 \times N_A}$ extracts the corresponding $l$th frame image. The second term of Eq. (1) represents the physiological constraint of the kinetic model, enforced through the differential equations governing the tracer distribution; $\mu$ is the regularization coefficient, $D$ is the differential operator, and $g$ represents the right-hand side of the kinetic differential equation. By optimizing the parameters $\theta$ and $\omega$, we can reconstruct the missing frame images $X_E = \{X_l\}_{l=N+1}^{N_A}$:

$$X_E = S_l G F_{\theta^*,\omega^*}(t), \quad l \in \{N+1, \ldots, N_A\} \tag{2}$$
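The objective in Eq. (1) combines a data-fidelity term on the known frames with a physiology residual. A toy NumPy sketch of this objective, assuming finite differences as a stand-in for the differential operator $D$ and a generic right-hand side $g$ (the paper's actual network and operators are not reproduced here), might look like:

```python
import numpy as np

def rpinn_style_loss(F, G, X_known, known_idx, g, omega, t, mu):
    """Toy version of Eq. (1): data term on known frames + physiology residual.

    F: (NT, NP) activity at NT time points for NP voxels (the network output).
    G: (NA, NT) integral operator turning continuous activity into NA frames.
    X_known: (N, NP) known partial frames; known_idx: their frame indices.
    g: callable g(F, omega, t) -> (NT, NP), RHS of the kinetic ODE."""
    frames = G @ F                                    # (NA, NP): all frame images
    data_term = np.sum((frames[known_idx] - X_known) ** 2)
    dF_dt = np.gradient(F, t, axis=0)                 # finite-difference surrogate for D
    residual = dF_dt - g(F, omega, t)
    return data_term + mu * np.sum(residual ** 2)

# Tiny example with a hypothetical linear kinetic RHS: dF/dt = -omega * F
NT, NP, NA = 50, 4, 10
t = np.linspace(0, 1, NT)
F = np.exp(-2.0 * t)[:, None] * np.ones((NT, NP))     # exact solution for omega = 2
G = np.kron(np.eye(NA), np.ones((1, NT // NA)) / (NT // NA))  # frame averaging
g = lambda F, omega, t: -omega * F
loss = rpinn_style_loss(F, G, (G @ F)[:5], np.arange(5), g, 2.0, t, mu=0.1)
```

Because `F` here satisfies the assumed ODE exactly and the first five frames match the "known" data, the loss is near zero up to finite-difference error; in the actual method, minimizing this quantity over $\theta$ and $\omega$ drives the network toward a physiologically consistent completion.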
Conclusion
In this work, we presented RPINN, an unsupervised deep learning approach for dynamic PET image completion from partial frame data. The primary contribution of this method is its ability to significantly reduce scan times and increase instrument throughput without requiring training datasets. We demonstrated the flexibility and effectiveness of our approach through three flexible scanning schemes. Quantitative and qualitative analyses on simulations, small-animal PET scans, and particularly on real patient data from a total-body PET scanner (Biograph Vision Quadra) validate the high performance, accuracy, and robustness of the proposed method in complex physiological environments. RPINN shows significant potential as a practical and powerful tool for fast dynamic PET imaging.
Results
4.1. Simulation study
Fig. 3(a) illustrates the prediction results of scheme 1 under two different phantom datasets. In both datasets, compared to the initial frames, the prediction error of scheme 1 consistently increases in the later frames. The error map in Fig. 3(a) also demonstrates this trend. Therefore, the correction using information from the later frames, as implemented in schemes 2 and 3, is necessary. The predicted AIF curves for the two datasets under scheme 1 are shown in Fig. 3(b). Fig. 4 shows the CRC-STD curves for the tumor regions predicted using different schemes for two frames. It can be observed that, in both datasets, scheme 2 achieves higher CRC with a lower STD. Scheme 1 exhibits the worst stability and performance in tumor prediction for the later frames.

Fig. 5(a) shows the linear fit results of the least squares method between the true and predicted kinetic parameters on the Zubal dataset. It can be observed that across all three schemes, the network generally exhibits smaller prediction errors for $k_1$ and $V_B$, but larger deviations for $k_4$. Schemes 2 and 3 incorporate information from both early and late frames, leading to more accurate predictions of $k_1$, $k_2$, and $k_4$. For $V_B$, the results of the three schemes are similar, while scheme 1 performs best for $k_3$ prediction. Overall, scheme 2 shows the best predictive performance. Fig. 5(b) shows the evaluation metrics between the predicted partial-frame images and the ground truth. Frames 21–25 are the common frames predicted by all three schemes. In scheme 1, the image quality metrics show significant fluctuations and larger errors in the late frames, whereas the metrics of scheme 2 are more stable and have the smallest deviations overall. The results indicate that the images from the last 10 min contain more information than those from 20–30 min.
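The per-frame image-quality metrics reported above (RMSE, MAE, bias, PSNR) are standard quantities; a hedged sketch of how such metrics might be computed follows. Note the exact normalization and PSNR peak convention used in the paper are not stated, so the ground-truth maximum is used here as one common choice.

```python
import numpy as np

def frame_metrics(pred, gt):
    """Per-frame metrics between a predicted and a ground-truth image.

    PSNR uses the ground-truth peak value as the signal maximum (one common
    convention; the paper's exact definition is not specified)."""
    err = pred - gt
    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))
    bias = np.mean(err)
    psnr = 10.0 * np.log10(gt.max() ** 2 / np.mean(err ** 2))
    return {"RMSE": rmse, "MAE": mae, "Bias": bias, "PSNR": psnr}

# Toy example: a noisy copy of a synthetic frame
rng = np.random.default_rng(0)
gt = rng.random((64, 64))
pred = gt + 0.01 * rng.standard_normal((64, 64))
m = frame_metrics(pred, gt)
```

Tracking these metrics frame-by-frame, as in Fig. 5(b), is what reveals the late-frame instability of scheme 1 versus the stability of scheme 2.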
Figure

Fig. 1. Network framework. The scanning scheme uses patient data as an example. The total scan time was 65 min, with all three schemes using the known 30 min images to predict the remaining time frames.

Fig. 2. The prediction errors of the three networks under three different schemes.

Fig. 3. The prediction results of dynamic PET images generated by the Zubal phantom and XCAT phantom under scheme 1: (a) shows the predicted 4-frame images, error maps, and evaluation metrics under the two datasets, respectively. The last three frames are predicted unsupervisedly. (b) shows the predicted blood input function under the two datasets. Both the images and the AIF were normalized by dividing by the maximum value of the AIF.

Fig. 4. CRC–STD curves of the tumor ROI under three different schemes on two datasets: (a) Frame 23 and (b) Frame 25. Solid lines and dashed lines correspond to data generated from the Zubal and XCAT phantoms, respectively. Marks were plotted every 500 epochs.

Fig. 5. Model performance test. (a) Linear regression of rate constants ($k_1$, $k_2$, $k_3$, $k_4$, $V_B$) estimated with different schemes. The horizontal axis is the ground truth, the vertical axis is the estimated value, and the fitting results are shown above. (b) RMSE, MAE, SSIM, PSNR, and the bias of the images predicted by the proposed model with different schemes.

Fig. 6. A slice of a rat kidney. From top to bottom are the ground truth and the prediction results of the three schemes, respectively. The red box highlights the predicted time-frame images for each scheme.

Fig. 7. Frame-wise comparison of (a) SSIM and (b) PSNR metrics across three schemes. Boxplots show distributions from 16 rats at four temporal frames (52, 54, 56, 58). *, ***, and ns represent 𝑃 < 0.05, 𝑃 < 0.001, and non-significant, respectively.

Fig. 8. (a) A slice of the thorax. (b) A slice of the liver. The ground truth and the corresponding prediction results are shown from top to bottom for each slice.

Fig. 9. Bland-Altman plots for the three predicted frames in the liver region. The dashed line represents the mean difference, while the solid lines indicate the 95% confidence intervals of the differences. The 𝑥-axis represents the mean of the ground truth and predicted values for each voxel, while the 𝑦-axis represents the error between the ground truth and predicted values.
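For reference, the quantities plotted in a Bland-Altman analysis can be computed as in the generic sketch below (note that the horizontal lines are conventionally the mean difference and the 95% limits of agreement, mean ± 1.96 SD; the data here are synthetic):

```python
import numpy as np

def bland_altman(gt, pred):
    """Per-voxel Bland-Altman quantities for predicted vs. ground-truth values."""
    gt, pred = np.ravel(gt), np.ravel(pred)
    mean_vals = (gt + pred) / 2.0            # x-axis: mean of the two measurements
    diffs = gt - pred                        # y-axis: difference (error)
    md = diffs.mean()                        # mean difference (dashed line)
    sd = diffs.std(ddof=1)
    loa = (md - 1.96 * sd, md + 1.96 * sd)   # 95% limits of agreement (solid lines)
    return mean_vals, diffs, md, loa

# Synthetic example: a prediction with a constant bias plus noise
rng = np.random.default_rng(1)
gt = rng.random(1000)
pred = gt + 0.05 + 0.01 * rng.standard_normal(1000)
_, _, md, (lo, hi) = bland_altman(gt, pred)
```

A systematic offset between prediction and ground truth shows up as a nonzero mean difference, while frame-to-frame noise widens the limits of agreement.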

Fig. 10. Original and predicted kidney images (grayscale, top two rows) with corresponding error maps (bottom three rows) for frames 55–61: (a) MAE, (b) Variance, and (c) Bias. All metrics were calculated within the kidney region.

Fig. 11. (a) shows examples of predicted and ground truth images at frame 61 for three brain slices. (b) shows predicted and ground truth TACs for the corresponding regions. The dashed lines represent the predicted frames, and the solid lines indicate the known ground truth.

Fig. 12. Evaluation metrics of the predicted images across four datasets under both methods. *, **, and *** denote 𝑃 < 0.05, 𝑃 < 0.01, and 𝑃 < 0.001, respectively.

Fig. 13. Example of the generated parametric Ki images obtained from four regions. The reference images and the Ki images from our proposed method were obtained by performing a linear regression fit to the last 13 frames of the Patlak plot data points. The main difference between the two methods is that the 65 min data points were generated by the proposed network, with 35 min of data being unsupervised.
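The Patlak graphical analysis referenced in this caption fits a line to transformed TAC data, with the slope giving Ki. A minimal sketch under standard Patlak assumptions (irreversible uptake, late-time linearity) follows; the input function and parameter values are hypothetical, and the tissue curve is constructed to be exactly Patlak-consistent so the fit recovers the known slope.

```python
import numpy as np

def patlak_ki(t, ct, cp, n_late):
    """Estimate Ki from the Patlak-plot slope using the last n_late points.

    Patlak coordinates: x = (integral of Cp from 0 to t) / Cp(t), y = Ct(t) / Cp(t);
    the slope of the late linear portion is Ki, the intercept the distribution volume."""
    # Trapezoidal cumulative integral of the plasma input
    cumint = np.concatenate([[0.0], np.cumsum(0.5 * (cp[1:] + cp[:-1]) * np.diff(t))])
    x = cumint / cp
    y = ct / cp
    ki, intercept = np.polyfit(x[-n_late:], y[-n_late:], 1)
    return ki, intercept

# Synthetic check: build Ct exactly from a known Ki = 0.02 and intercept V = 0.3
t = np.linspace(1.0, 60.0, 60)                 # minutes (avoid Cp ~ 0 at t = 0)
cp = 5.0 * np.exp(-t / 30.0)                   # hypothetical plasma input
cumint = np.concatenate([[0.0], np.cumsum(0.5 * (cp[1:] + cp[:-1]) * np.diff(t))])
ct = 0.02 * cumint + 0.3 * cp                  # Patlak-consistent tissue TAC
ki, v = patlak_ki(t, ct, cp, n_late=13)
```

Fitting only the late points, as the paper does with the last 13 frames, reflects the Patlak requirement that the reversible compartments have equilibrated before the plot becomes linear.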

Fig. B.14. Five mask position diagrams of the four datasets.
Table

Table 1. Kinetic parameter settings.

Table 2. Evaluation metrics of the rat kidney dataset for the predicted time-frame images.

Table 3. Metrics between the true TACs and predicted TACs across different regions of the brain.

Table 4. Evaluation metrics for the parametric images of the four datasets and the five masked regions within each dataset.

Table 5. Image evaluation metrics of four methods across four datasets.

Table 6. The effect of various kinetic parameter initialization methods on network performance, including six sets of uniform distribution, five sets of random initialization with different means and variances, and nonlinear least squares initialization.