Li's Radiomics Video Study Notes (29): Computing the ICC
强璐 Here are my code and raw data:
import pingouin as pg
import pandas as pd
import numpy as np
import os
folderPath = r'C:\Users\ltp-0810\Desktop\Radiomics\ICC'
data1 = pd.read_csv(os.path.join(folderPath, 'reader1.csv'))
data2 = pd.read_csv(os.path.join(folderPath, 'reader2.csv'))
data1.insert(0, 'reader', np.ones(data1.shape[0]))
data2.insert(0, 'reader', np.ones(data2.shape[0])*2)
data1.insert(0, 'patient', range(data1.shape[0]))
data2.insert(0, 'patient', range(data2.shape[0]))
data_inter = pd.concat([data1, data2])  # inter-reader (between-group) data
# Loop over the features and compute the agreement for each one
ICC_inter = []  # inter-reader ICC
for colName in data_inter.columns[3:]:
    ICC = pg.intraclass_corr(data=data_inter, targets='patient', raters='reader', ratings=colName)
    ICC = ICC.iloc[2, 2]  # select ICC3
    ICC_inter.append(ICC)
min(ICC_inter)  # the smallest ICC turns out to be below 0
Below are my code and raw data; running it raises errors.
Error:
  File "<stdin>", line 2
    ICC = pg.intraclass_corr(data=data, targets='R', raters='PAT', ratings=colName)
    ^
IndentationError: expected an indented block
ICC.apend(ICC)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'list' object has no attribute 'apend'
Code:
import pingouin as pg
import pandas as pd
import numpy as np
import xlrd
import os
fPath = '/Users/caiqian890307/Downloads'
data = pd.read_excel(os.path.join(fPath, 'phantom1_6.xlsx'))
ICC = []
for colName in data.columns[3:]:
ICC = pg.intraclass_corr(data=data, targets='R', raters='PAT', ratings=colName)
ICC.apend(ICC)
print(ICC)
Could Teacher Li please take a look at what is causing the errors in this code? Thank you.
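[Unknown user] I also get values below zero in SPSS. This may have to do with the notion of positive versus negative correlation, although some people online say you can simply look at the absolute value and use the absolute value of the ICC to judge stability.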
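To see how a negative ICC can arise at all, it may help to compute ICC3 by hand. The sketch below assumes the usual two-way ANOVA form, ICC(3,1) = (MSR - MSE) / (MSR + (k - 1) * MSE), for n targets and k raters. The function name icc3_single and the toy ratings are made up for illustration and are not the data from this thread. The estimate falls below zero whenever the between-patient mean square (MSR) is smaller than the residual mean square (MSE).

```python
import numpy as np

def icc3_single(ratings):
    """ICC(3,1) for an (n_targets, k_raters) array via two-way ANOVA."""
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)  # between targets
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)  # between raters
    ss_total = np.sum((x - grand) ** 2)
    ss_err = ss_total - ss_rows - ss_cols                # residual
    msr = ss_rows / (n - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse)

# Toy ratings (rows = patients, columns = readers), invented for illustration.
agree = [[1, 1], [2, 2], [3, 3]]      # both readers give the same values
disagree = [[1, 3], [2, 2], [3, 1]]   # readers rank the patients in opposite order

icc_agree = icc3_single(agree)
icc_disagree = icc3_single(disagree)
print(icc_agree, icc_disagree)
```

In this toy case the second ICC is strongly negative although its absolute value is as large as the first, and the two readers order the patients in opposite directions, so the sign seems worth keeping in view rather than discarding.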
import pingouin as pg
import pandas as pd
import numpy as np
import xlrd
import os
fPath = '/Users/caiqian890307/Downloads'
data = pd.read_excel(os.path.join(fPath, 'phantom1_6.xlsx'))
ICC = []
for colName in data.columns[3:]:
ICC = pg.intraclass_corr(data=data, targets='R', raters='PAT', ratings=colName)
ICC.append(ICC)
print(ICC)
Error:
  File "<stdin>", line 2
    ICC = pg.intraclass_corr(data=data, targets='R', raters='PAT', ratings=colName)
    ^
IndentationError: expected an indented block
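Hukui In your data, some of the ratings from raters G and m are exactly identical, so a warning is raised at run time. Here is the code I put together:
ICC_ = []  # create an empty list named ICC_, a different name from ICC below, so the results are not overwritten
for colName in data.columns[2:]:  # in your data, the columns to compare start at the third column
    ICC = pg.intraclass_corr(data=data, targets='R', raters='PAT', ratings=colName)
    ICC = ICC.iloc[0, 2]  # [0, 2] selects ICC1; use [1, 2] for ICC2, and so on
    ICC_.append(ICC)
print(ICC_)
If that still does not work, check whether comparing a single column on its own also raises an error.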
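The renaming matters because the earlier loop used the name ICC both for the results list and for the value computed in each iteration. A minimal sketch of that effect follows; fake_icc is a hypothetical stand-in for pg.intraclass_corr(...).iloc[0, 2], and the feature names and numbers are invented.

```python
def fake_icc(col_name):
    # Hypothetical stand-in for an ICC lookup; returns a made-up number.
    return len(col_name) / 10

columns = ["Contrast", "Coarseness", "Idmn"]

# Shadowed name: ICC is rebound to a float, so the list is lost.
ICC = []
shadow_error = None
for col in columns:
    ICC = fake_icc(col)
    try:
        ICC.append(ICC)  # a float has no append method
    except AttributeError as exc:
        shadow_error = exc
        break

# Separate names: the list survives every iteration.
icc_values = []
for col in columns:
    icc = fake_icc(col)
    icc_values.append(icc)

print(repr(shadow_error))
print(icc_values)
```

Keeping one name for the accumulator and another for the per-feature value avoids the rebinding entirely.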
ltp0810 I modified the code as follows:
import pingouin as pg
import pandas as pd
import numpy as np
import os
data_inter = pd.read_excel('/Users/caiqian890307/Downloads/pythonforicc/phantom1_6.xlsx')
icc_inter = []
for colName in data_inter.columns[2:]:
    ICC = pg.intraclass_corr(data=data_inter, targets="R", raters="PAT", ratings=colName)
    ICC = ICC.iloc[2, 2]
    icc_inter.append(ICC)
print(icc_inter)
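It still fails when run. The error is raised mainly at this line:
ICC = pg.intraclass_corr(data = data_inter, targets = "R", raters = "PAT", ratings = colName)
When I run it for a single feature, changing the line to
ICC = pg.intraclass_corr(data = data_inter, targets = "R", raters = "PAT", ratings = "LongRunLowGrayLevelEmphasis.3")
it runs successfully. When I print colName and its type, I get:
"LowGrayLevelEmphasis.3"
<class 'str'>
"JointAverage.3"
<class 'str'>
"SumAverage.3"
<class 'str'>
"JointEntropy.3"
<class 'str'>
"ClusterShade.3"
<class 'str'>
"MaximumProbability.3"
<class 'str'>
"Idmn.3"
<class 'str'>
"JointEnergy.3"
<class 'str'>
"Contrast.6"
<class 'str'>
"DifferenceEntropy.3"
I suspect that some type conversion happened when the column names were read, and that a "." was automatically inserted between the letters and the digits in the feature names, so that the loop hits the invalid character "." and fails. I do not have permission to download the attachment you uploaded, so I cannot tell whether the feature names in your file end in digits. I will keep narrowing it down myself.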
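ltp0810 After renaming every feature to "num", print(colName) shows:
num.1
num.2
num.3
num.4
num.5
num.6
num.7
num.8
num.9
num.10
and so on through the last feature, num.419.
So I wonder whether colName is not a plain string but also carries index information, in which case passing colName directly would not match the string form that pg.intraclass_corr() expects.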
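[Unknown user] I wrote the file I had read back out to a CSV file and, on opening it, found that some of the feature names had a suffix such as ".2" appended, for example "ZoneEntropy.3 SmallAreaLowGrayLevelEmphasis.3 Coarseness.3 Complexity.3 Strength.3 Contrast.7 Busyness.3". Could you print the column names you read in and check whether similar suffixes have been appended to them?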
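The suffixes described here match how pandas handles repeated header names: when a file contains the same column name more than once, the reader keeps the first and renames the later copies to name.1, name.2, and so on. A small sketch with an in-memory CSV (the feature names and numbers are invented, not taken from the spreadsheet in this thread) suggests that the renamed labels are still ordinary strings:

```python
import io
import pandas as pd

# Invented header with "Contrast" appearing twice.
csv_text = (
    "R,PAT,Contrast,Strength,Contrast\n"
    "G,1,0.5,1.2,0.7\n"
    "m,1,0.6,1.1,0.8\n"
)
df = pd.read_csv(io.StringIO(csv_text))

names = list(df.columns)
print(names)

# Every label, including the renamed duplicate, is a plain str.
all_str = all(isinstance(name, str) for name in names)

# Labels that look like "<existing name>.<suffix>" point to duplicated headers.
dupes = [n for n in names if "." in n and n.rsplit(".", 1)[0] in names]
print(dupes)
```

If the same holds for the real file, the suffix alone should not stop the loop. It more likely signals that the same feature name occurs in several column groups of the spreadsheet, which may be worth checking in the header row.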
ltp0810 Thanks, it works. The quotation marks need to be changed to ASCII (English) ones. Building on your code, here is my version:
import pingouin as pg
import pandas as pd
import numpy as np
import os
folderPath = r"E:\CT_ICGR15\feeature_selection\delete"
data1 = pd.read_csv(os.path.join(folderPath, "10.csv"))
data2 = pd.read_csv(os.path.join(folderPath, "11.csv"))
data1.insert(0, "reader", np.ones(data1.shape[0]))
data2.insert(0, "reader", np.ones(data2.shape[0])*2)
data1.insert(0, "patient", range(data1.shape[0]))
data2.insert(0, "patient", range(data2.shape[0]))
data_inter = pd.concat([data1, data2])  # inter-reader (between-group) data
print(data_inter.columns)
ICC_inter = []  # inter-reader ICC
for colName in data_inter.columns[3:]:
    ICC = pg.intraclass_corr(data=data_inter, targets="patient", raters="reader", ratings=colName)
    ICC = ICC.iloc[2, 2]  # select ICC3; use [0, 2] for ICC1, [1, 2] for ICC2, and so on
    ICC_inter.append(ICC)
print(ICC_inter)
df = pd.DataFrame(ICC_inter)
df.to_csv('E:\\ICC_inter.csv')
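The CSV written by this version holds only the ICC values against a numeric index. One possible refinement, sketched here with invented numbers rather than real results, is to keep the feature names as the index and then filter by a cutoff. The 0.75 threshold and the feature names are assumptions for illustration; the thread does not state a cutoff.

```python
import pandas as pd

feature_names = ["Contrast", "Strength", "Busyness"]  # hypothetical names
icc_values = [0.91, 0.42, 0.80]                       # hypothetical ICCs

# Label each ICC with its feature name.
icc_series = pd.Series(icc_values, index=feature_names, name="ICC3")

# Keep features at or above an assumed cutoff of 0.75.
stable = icc_series[icc_series >= 0.75]
print(stable.index.tolist())

# With no path, to_csv returns the CSV text; pass a path to write a file.
csv_text = icc_series.to_csv()
print(csv_text)
```

In the loop above, the matching labels would be data_inter.columns[3:], since that is the slice the loop iterates over.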
Hello, original poster. After using the for loop I only get a single result. Is that because each iteration overwrites the result of the previous one? How can I fix it? My code is below. Thank you for your help!
import pingouin as pg
import pandas as pd
import numpy as np
import os
folderPath = "C:\\Users\\HP15-ab006TX\\Desktop\\ICC"
data1 = pd.read_csv(os.path.join(folderPath, "AA.csv"))
data2 = pd.read_csv(os.path.join(folderPath, "BB.csv"))
data1.insert(0, "reader", np.ones(data1.shape[0]))
data2.insert(0, "reader", np.ones(data2.shape[0])*2)
data1.insert(0, "patient", range(data1.shape[0]))
data2.insert(0, "patient", range(data2.shape[0]))
data_inter = pd.concat([data1, data2])
print(data_inter.columns)
icc_inter = []
for colName in data_inter.columns[2:]:
    ICC = pg.intraclass_corr(data=data_inter, targets="patient", raters="reader", ratings=colName)
    ICC = ICC.iloc[0, 2]
    icc_inter.append(ICC)
print(icc_inter)
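One possible cause, assuming the indentation of the posted code was lost when it was pasted, is that the append sits outside the loop body: a statement placed after the loop runs once and sees only the last value. A sketch of the difference, with a hypothetical fake_icc standing in for the pingouin call and invented feature names:

```python
def fake_icc(col_name):
    # Hypothetical stand-in for pg.intraclass_corr(...).iloc[0, 2].
    return len(col_name) / 10

columns = ["Contrast", "Coarseness", "Idmn"]

# append dedented: it runs once, after the loop, with the last value only.
only_last = []
for col in columns:
    icc = fake_icc(col)
only_last.append(icc)

# append inside the loop body: one entry per column.
every_column = []
for col in columns:
    icc = fake_icc(col)
    every_column.append(icc)

print(only_last)
print(every_column)
```

If the real script already has the append indented under the for line, the cause probably lies elsewhere, for example in how many columns data_inter.columns[2:] actually contains.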