Title
Identification of Parkinson’s disease PACE subtypes and repurposing treatments through integrative analyses of multimodal data
Introduction
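Parkinson’s disease (PD) is a serious neurodegenerative disorder whose clinical presentation and progression are markedly heterogeneous. This study aimed to address that heterogeneity through integrative analysis of multiple data modalities. Using machine learning and deep learning, we analyzed clinical progression data (at least 5 years of follow-up) from individuals with de novo PD to characterize each individual’s phenotypic progression trajectory and to classify PD subtypes. We discovered three PD subtypes with distinct progression patterns: a slow-progressing subtype (PD-I), with mild baseline severity and a slow progression rate; a moderate-progressing subtype (PD-M), with mild baseline severity but a moderate progression rate; and a rapid-progressing subtype (PD-R), with the fastest symptom progression. We found that the cerebrospinal fluid (CSF) P-tau/α-synuclein ratio and atrophy in certain brain regions are potential markers of these subtypes.
Network-based analyses of genetic and transcriptomic data identified molecular modules associated with each subtype. For example, the PD-R-specific module suggested STAT3, FYN, BECN1, APOA1, NEDD4, and GATA2 as potential driver genes of PD-R. It also suggested the neuroinflammation, oxidative stress, metabolism, PI3K/AKT, and angiogenesis pathways as potential drivers of rapid PD progression (i.e., PD-R). In addition, using network-based approaches and cell-line drug-gene signature data, we identified repurposable drug candidates that target these subtype-specific molecular modules. We further estimated the treatment effects of these drugs using two large real-world patient databases; the resulting real-world evidence highlighted the potential of metformin in ameliorating PD progression.
In conclusion, this work contributes to a better understanding of the clinical and pathophysiological complexity of PD progression and helps accelerate precision medicine.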
Method
Study cohorts for PD subtyping
The present study included two longitudinal PD cohorts for identifying PD subtypes: the Parkinson’s Progression Markers Initiative (PPMI, http://www.ppmi-info.org) [25] and the Parkinson Disease Biomarkers Program (PDBP).
Results
In this investigation, we used data from participants in the Parkinson’s Progression Markers Initiative (PPMI) study, an international observational PD study that systematically collected clinical, biospecimen, multi-omics, and brain imaging data from its participants [25]. Our analysis included 406 de novo PD participants (PD diagnosis within the last 2 years and untreated at enrollment) in the PPMI cohort, comprising 140 (34.5%) women and 266 (65.5%) men, with an average age of 59.6 ± 10.0 years at PD onset; 188 healthy control (HC) volunteers, comprising 67 (35.6%) women and 121 (64.4%) men; and 61 participants who had dopamine transporter scans without evidence of dopaminergic deficit (SWEDD), comprising 23 (37.7%) women and 38 (62.3%) men (see Supplementary Table 1). Specifically, we developed a deep learning model to capture PD phenotypic progression trends using more than 5 years of longitudinal clinical assessments of the de novo PD, HC, and SWEDD participants. We then clustered the de novo PD participants on their learned progression profiles to derive subtypes. We further examined individuals’ neuroimaging, CSF, genetic, and transcriptomic data to identify subtype-specific biomarkers and molecular modules. To demonstrate the robustness of our method in capturing PD progression trends and identifying subtypes, we replicated our deep learning model among participants in the Parkinson Disease Biomarkers Program (PDBP) [26]. More details on the participants included in this study can be found in the “Methods” and Supplementary Table 1.
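The cohort percentages quoted above can be re-derived from the reported counts. The short Python sketch below does only that arithmetic; the dictionary layout and the function name `sex_breakdown` are illustrative choices, not part of the study.

```python
# Counts of (women, men) per group, as reported in the Results text.
cohort = {
    "de novo PD": (140, 266),
    "HC": (67, 121),
    "SWEDD": (23, 38),
}

def sex_breakdown(women, men):
    """Return (group size, % women, % men), percentages rounded to 1 decimal."""
    total = women + men
    return total, round(100 * women / total, 1), round(100 * men / total, 1)

summary = {group: sex_breakdown(w, m) for group, (w, m) in cohort.items()}

for group, (total, pct_women, pct_men) in summary.items():
    print(f"{group}: n={total}, women {pct_women}%, men {pct_men}%")
```

The group sizes and percentages produced this way agree with the figures in the text (406, 188, and 61 participants).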
Figure
Fig. 1 | A diagram illustrating the present analysis. a Collecting longitudinal clinical data from the Parkinson’s Progression Markers Initiative (PPMI) and Parkinson’s Disease Biomarkers Program (PDBP) cohorts and conducting the necessary data cleaning and preprocessing. b Development of a deep phenotypic progression embedding (DPPE) model to learn a progression embedding vector for each individual, which encodes that individual’s PD symptom progression trajectory. c Cluster analysis of the learned embedding vectors to identify PD subtypes, each of which reveals a unique PD progression pattern. d Identifying CSF biomarkers and imaging markers of the discovered PD subtypes. e Construction of PD subtype-specific molecular modules based on genetic and transcriptomic data, along with human protein-protein interactome (PPI) network analyses, using network medicine approaches. f In silico drug repurposing based on subtype-specific molecular profiles, and validation of the drug candidates’ treatment efficiency by analyzing large-scale real-world patient databases, i.e., INSIGHT and OneFlorida+. g Architecture of the DPPE model. DPPE uses two long short-term memory (LSTM) units: an encoder that receives an individual’s longitudinal clinical records and compacts them into a low-dimensional embedding space, and a decoder that takes the individual’s embedding vector and reconstructs the original clinical records. DPPE was trained by minimizing the reconstruction difference.
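The training objective described for panel g can be written compactly. The notation below is a sketch under assumptions rather than the authors’ formulation: the symbols are introduced here for illustration, and the squared-error form of the “reconstruction difference” is assumed, since the caption does not specify the loss. Here x_{i,t} is the clinical feature vector of individual i at visit t, T_i is that individual’s number of visits, and z_i is the learned progression embedding.

```latex
\begin{aligned}
\mathbf{z}_i &= \mathrm{LSTM}_{\mathrm{enc}}\!\left(\mathbf{x}_{i,1}, \dots, \mathbf{x}_{i,T_i}\right), \\
\left(\hat{\mathbf{x}}_{i,1}, \dots, \hat{\mathbf{x}}_{i,T_i}\right) &= \mathrm{LSTM}_{\mathrm{dec}}\!\left(\mathbf{z}_i\right), \\
\mathcal{L} &= \sum_{i=1}^{N} \sum_{t=1}^{T_i} \left\lVert \mathbf{x}_{i,t} - \hat{\mathbf{x}}_{i,t} \right\rVert_2^2 .
\end{aligned}
```

Under this reading, the vectors z_i are the inputs to the cluster analysis in panel c.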
Fig. 2 | Progression patterns of the three PD subtypes within the PPMI cohort. a Averaged progression trajectories in clinical manifestations by subtypes, with shading indicating standard error of the mean (SEM). b Sankey diagrams showing evolution patterns of motor phenotypes (tremor dominant, indeterminate, and postural instability and gait difficulty [PIGD]) by subtypes. c Sankey diagrams showing evolution patterns of cognition phenotypes (normal cognition, mild cognitive impairment [MCI], and dementia) by subtypes. d Sankey diagrams showing evolution patterns of mood phenotypes (normal, mild depression, moderate depression, and severe depression) by subtypes. e Sankey diagrams showing evolution patterns of sleep phenotypes (REM sleep behavior disorder [RBD] negative and positive) by subtypes.
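The mean trajectory with SEM shading in panel a reduces to a per-visit mean and standard error. The Python sketch below shows that calculation using the usual definition, SEM = sample standard deviation / sqrt(n); the scores are made-up placeholder values, not study data, and `mean_and_sem` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, stdev

def mean_and_sem(values):
    """Return (mean, SEM), where SEM = sample SD / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("SEM needs at least two observations")
    return mean(values), stdev(values) / sqrt(n)

# Placeholder scores for one subtype, keyed by years since baseline
# (illustrative numbers only, not taken from the PPMI cohort).
scores_by_year = {
    0: [1.0, 2.0, 3.0, 4.0],
    1: [2.0, 3.0, 5.0, 6.0],
}

# One (mean, SEM) pair per visit: the line and the half-width of the shading.
trajectory = {year: mean_and_sem(vals) for year, vals in scores_by_year.items()}

for year, (m, sem) in trajectory.items():
    print(f"year {year}: mean={m:.2f}, SEM={sem:.3f}")
```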
Fig. 3 | CSF biomarkers and neuroimaging markers of the identified subtypes. a CSF biomarkers by PD subtypes. On each box plot, the central mark indicates the median value and the bottom and top edges of the box indicate the interquartile range (IQR), with whiskers covering the most extreme values within 1.5 × IQR. b Regions showing significant signals in 1-year brain atrophy between a pair of subtypes. 1-year brain atrophy was measured by cortical thickness and white matter volume from 34 regions of interest (ROIs), defined by the Desikan-Killiany atlas (averaged over the left and right hemispheres). Color density denotes significance in terms of -log10(P).
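The box plot convention in panel a (median, IQR box, whiskers at the most extreme values within 1.5 × IQR of the box edges) can be sketched in a few lines of Python. The quartile method is an assumption: the caption does not say how quartiles were computed, so this sketch uses the standard library’s `statistics.quantiles` with `method="inclusive"`. The input values are arbitrary illustrative numbers.

```python
from statistics import quantiles

def box_plot_stats(values):
    """Median, quartiles, whisker ends, and outliers under the 1.5 x IQR rule."""
    data = sorted(values)
    # Quartile method is an assumption; plotting libraries differ slightly here.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": inside[0],    # most extreme value within the lower fence
        "whisker_high": inside[-1],  # most extreme value within the upper fence
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }

stats = box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
print(stats)
```

In this example the value 100 lies beyond the upper fence, so the upper whisker stops at 9 and 100 is reported as an outlier.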
Fig. 4 | PD-R subtype-specific molecular modules revealing potential biological mechanisms of rapid PD progression. a Genetic molecular module of PD-R. b Pathways enriched based on the genetic molecular module of PD-R. c A subnetwork of the transcriptomic molecular module of PD-R. The entire transcriptomic molecular module of PD-R can be found in Supplementary Fig. 9. d Pathways enriched based on the transcriptomic molecular module of PD-R.
Fig. 5 | Identified repurposable drug candidates for preventing PD progression by targeting subtype-specific molecular changes. a Gene set enrichment analysis (GSEA) based on subtype-specific gene modules, combining bulk RNA-seq data of individuals with transcriptomics-based drug-gene signature data in human cell lines, identified repurposable drug candidates for the different PD pace subtypes. Treatment effect estimation using the INSIGHT data within the broad PD population (b) and the probable PD-R population (c). Treatment effect estimation using the OneFlorida+ data within the broad PD population (d) and the probable PD-R population (e). Footnote a: the drug does not have sufficient patient data (<100) for analysis. Footnote b: the drug does not have sufficient balanced emulated trials (<10). NT indicates the number of eligible PD patients who received the tested drug after PD initiation.
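The two footnote rules in this caption amount to a simple screening step before a treatment effect is reported. The Python sketch below encodes only the two thresholds stated in the caption (fewer than 100 treated patients, fewer than 10 balanced emulated trials); the function name, the order of the checks, and the candidate entries are hypothetical and carry no study data.

```python
MIN_PATIENTS = 100        # footnote a: fewer treated patients means no analysis
MIN_BALANCED_TRIALS = 10  # footnote b: fewer balanced emulated trials means no estimate

def analysis_status(n_treated, n_balanced_trials):
    """Classify a drug candidate by the two footnote rules of the caption.

    Checking the patient count first is an assumption of this sketch.
    """
    if n_treated < MIN_PATIENTS:
        return "a: insufficient patient data"
    if n_balanced_trials < MIN_BALANCED_TRIALS:
        return "b: insufficient balanced emulated trials"
    return "analyzable"

# Hypothetical candidates: (NT, number of balanced emulated trials).
candidates = {"drug_A": (250, 40), "drug_B": (60, 0), "drug_C": (180, 4)}

statuses = {name: analysis_status(*counts) for name, counts in candidates.items()}
for name, status in statuses.items():
    print(name, "->", status)
```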
Fig. 6 | Comparisons of the identified pace subtypes with conventional motor subtypes and prior data-driven subtypes. Notably, our subtyping algorithm was completely data-driven and hypothesis-free. In addition, since our method modeled individuals’ phenotypic progression profiles for PD subtyping, the identified subtypes demonstrated unique progression patterns and, importantly, were stable over time.
Table
Table 1 | Demographics and baseline clinical characteristics by subtypes within the PPMI cohort
Table 2 | Annual progression rates in clinical manifestations and CSF biomarkers by subtypes, assessed by linear mixed-effects models within the PPMI cohort
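For readers unfamiliar with how an annual progression rate comes out of a linear mixed-effects model, one common specification is sketched below. It is an assumption for illustration: this excerpt does not state the covariates or the random-effects structure the authors used. Here y_{ij} is the measurement of individual i at visit j, t_{ij} is the time since baseline in years, and the model is fitted separately within each subtype.

```latex
y_{ij} = \beta_0 + \beta_1 t_{ij} + b_{0i} + b_{1i} t_{ij} + \varepsilon_{ij},
\qquad
(b_{0i}, b_{1i})^{\top} \sim \mathcal{N}(\mathbf{0}, \mathbf{G}),
\quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

With time measured in years, the fixed-effect slope β1 is the estimated annual progression rate, while b_{0i} and b_{1i} allow each individual to deviate from the subtype-level baseline and slope.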