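强璐 Hello OP. When I use pingouin to compute the ICC of each feature, some of the resulting ICC values are below 0. I thought ICC could only fall in the range 0 to 1, so I am confused and would like to ask for your advice.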

    强璐 Here are my code and raw data:
    import pingouin as pg
    import pandas as pd
    import numpy as np
    import os
    # Raw string so the backslashes in the Windows path are not read as escapes
    folderPath = r'C:\Users\ltp-0810\Desktop\Radiomics\ICC'
    data1 = pd.read_csv(os.path.join(folderPath, 'reader1.csv'))
    data2 = pd.read_csv(os.path.join(folderPath, 'reader2.csv'))

    data1.insert(0, 'reader', np.ones(data1.shape[0]))
    data2.insert(0, 'reader', np.ones(data2.shape[0])*2)

    data1.insert(0, 'patient', range(data1.shape[0]))
    data2.insert(0, 'patient', range(data2.shape[0]))

    data_inter = pd.concat([data1, data2])  # inter-reader data

    # Loop over the features and compute the agreement for each one
    ICC_inter = []  # inter-reader ICC
    for colName in data_inter.columns[3:]:
        ICC = pg.intraclass_corr(data=data_inter, targets='patient', raters='reader', ratings=colName)
        ICC = ICC.iloc[2, 2]  # select ICC3
        ICC_inter.append(ICC)
    min(ICC_inter)  # the minimum ICC turns out to be below 0

    Attachments: reader1.txt (1 MB), reader2.txt (1 MB)

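      ltp0810 I can't work it out; I suspect a bug. You could pick out the problematic groups and run them through other software, such as SPSS, to compare.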

        Here are my code and raw data; running it raises errors.
        Errors:
        File "<stdin>", line 2
        ICC = pg.intraclass_corr(data=data, targets='R', raters='PAT', ratings=colName)
        ^
        IndentationError: expected an indented block

        ICC.apend(ICC)
        Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        AttributeError: 'list' object has no attribute 'apend'

        Code:
        import pingouin as pg
        import pandas as pd
        import numpy as np
        import xlrd
        import os
        fPath = '/Users/caiqian890307/Downloads'
        data = pd.read_excel(os.path.join(fPath, 'phantom1_6.xlsx'))
        ICC = []
        for colName in data.columns[3:]:
        ICC = pg.intraclass_corr(data=data, targets='R', raters='PAT', ratings=colName)
        ICC.apend(ICC)
        print(ICC)

        Attachment: phantom1-6.xlsx (70 kB)

        Could Teacher Li please take a look at what is causing the errors in this code? Thank you.


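        [unknown user] I also get values below zero with SPSS. This may have to do with positive versus negative correlation, but some sources online say you can simply look at the absolute value and use the absolute ICC to judge stability.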
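On the negative values discussed above: under the usual ANOVA-based definition, the ICC estimate is not bounded below by 0. For the single-rater, two-way mixed form (the ICC3 row selected by the code in this thread), the estimate is (MS_rows - MS_error) / (MS_rows + (k - 1) * MS_error), which turns negative whenever the residual mean square exceeds the between-subject mean square. The sketch below implements that textbook formula with NumPy only. `icc3_single` and the two toy arrays are made up for illustration, and this is not pingouin's own code.

```python
import numpy as np

def icc3_single(ratings):
    """ICC(3,1) from an (n_subjects, k_raters) array via two-way ANOVA mean squares.

    Illustrative helper (hypothetical name), not pingouin's implementation.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_total = ((x - grand) ** 2).sum()
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()  # between subjects
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()  # between raters
    ss_err = ss_total - ss_rows - ss_cols                # residual
    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)

# Toy data: two readers who agree closely on four patients
agree = [[1.0, 1.1], [2.0, 2.1], [3.0, 2.9], [4.0, 4.2]]
# Toy data: two readers who rank the same patients in opposite order
disagree = [[1.0, 4.0], [2.0, 3.0], [3.0, 2.0], [4.0, 1.0]]

high = icc3_single(agree)
low = icc3_single(disagree)
print(high, low)
```

With these toy arrays `high` is above 0.99 and `low` is -1. A negative estimate is commonly read as a lack of agreement rather than as strong agreement with a flipped sign, so taking the absolute value may overstate how stable a feature is.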


        import pingouin as pg
        import pandas as pd
        import numpy as np
        import xlrd
        import os
        fPath = '/Users/caiqian890307/Downloads'
        data = pd.read_excel(os.path.join(fPath, 'phantom1_6.xlsx'))
        ICC = []
        for colName in data.columns[3:]:
        ICC = pg.intraclass_corr(data=data, targets='R', raters='PAT', ratings=colName)
        ICC.append(ICC)
        print(ICC)

        Error:
        File "<stdin>", line 2
        ICC = pg.intraclass_corr(data=data, targets='R', raters='PAT', ratings=colName)
        ^
        IndentationError: expected an indented block

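          Hukui In your data, the ratings from raters G and m include values that are exactly identical, so a warning is raised at run time. Here is the code I put together:

          ICC_ = []  # empty list named ICC_, distinct from ICC below, so the results are not overwritten
          for colName in data.columns[2:]:  # in your data the columns to compare start at the 3rd column
              ICC = pg.intraclass_corr(data=data, targets='R', raters='PAT', ratings=colName)
              ICC = ICC.iloc[0, 2]  # [0, 2] selects ICC1; use [1, 2] for ICC2, [2, 2] for ICC3
              ICC_.append(ICC)
          print(ICC_)
          If that still doesn't work, check whether comparing a single column on its own raises an error.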
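One way to act on the suggestion of checking columns one at a time is to wrap each call in try/except and record which columns fail, instead of stopping at the first error. The sketch below is generic: `find_failing_columns` and `demo_compute` are hypothetical names, and `demo_compute` is only a stand-in so the example runs without pingouin. In real use you would pass a function that calls `pg.intraclass_corr` for the given column.

```python
import pandas as pd

def find_failing_columns(data, feature_columns, compute):
    """Run compute(data, column) per column; collect results and error messages."""
    results, failures = {}, {}
    for col in feature_columns:
        try:
            results[col] = compute(data, col)
        except Exception as exc:  # deliberately broad: this is a diagnostic pass
            failures[col] = f"{type(exc).__name__}: {exc}"
    return results, failures

def demo_compute(data, col):
    """Stand-in computation (assumption): fails on non-numeric columns."""
    values = pd.to_numeric(data[col])  # raises ValueError for text values
    return float(values.mean())

# Made-up table: f1 is numeric, f2 is text and will fail
demo = pd.DataFrame({
    "R": [1, 2, 1, 2],
    "PAT": ["a", "a", "b", "b"],
    "f1": [1.0, 2.0, 3.0, 4.0],
    "f2": ["x", "y", "z", "w"],
})

results, failures = find_failing_columns(demo, demo.columns[2:], demo_compute)
print("ok:", results)
print("failed:", failures)
```

Printing `failures` gives the offending column names together with the exception text, which is usually enough to tell a data problem from a code problem.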

            ltp0810 I changed the code as follows:
            import pingouin as pg
            import pandas as pd
            import numpy as np
            import os

            data_inter = pd.read_excel('/Users/caiqian890307/Downloads/pythonforicc/phantom1_6.xlsx')

            icc_inter = []
            for colName in data_inter.columns[2:]:
                ICC = pg.intraclass_corr(data=data_inter, targets="R", raters="PAT", ratings=colName)
                ICC = ICC.iloc[2, 2]
                icc_inter.append(ICC)
            print(icc_inter)

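            It still fails when run. The error comes mainly from the line "ICC = pg.intraclass_corr(data=data_inter, targets="R", raters="PAT", ratings=colName)".

            When I run it for a single column, changing the call to "ICC = pg.intraclass_corr(data=data_inter, targets="R", raters="PAT", ratings="LongRunLowGrayLevelEmphasis.3")", it succeeds. When I print colName and its type, I get:
              "LowGrayLevelEmphasis.3"
              <class 'str'>
              "JointAverage.3"
              <class 'str'>
              "SumAverage.3"
              <class 'str'>
              "JointEntropy.3"
              <class 'str'>
              "ClusterShade.3"
              <class 'str'>
              "MaximumProbability.3"
              <class 'str'>
              "Idmn.3"
              <class 'str'>
              "JointEnergy.3"
              <class 'str'>
              "Contrast.6"
              <class 'str'>
              "DifferenceEntropy.3"
              I wonder whether something was converted when the column names were read, so that a "." was inserted automatically between the letters and digits of the feature names, and the loop then fails on the invalid character ".". I don't have permission to download the attachment you uploaded, so I can't tell whether the feature names in your file end in digits. I'll keep narrowing it down myself.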

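            ltp0810 After renaming every feature to "num", print(colName) shows:
            num.1
            num.2
            num.3
            num.4
            num.5
            num.6
            num.7
            num.8
            num.9
            num.10
            and so on through the last feature, num.419.
            So I wonder whether colName is not a plain string but also carries index information, so that passing colName directly does not give pg.intraclass_corr() the kind of string it expects.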

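            [unknown user] I wrote the file I had read back out to a CSV file and opened it. Some of the feature names in it have had a suffix such as ".2" appended, for example "ZoneEntropy.3 SmallAreaLowGrayLevelEmphasis.3 Coarseness.3 Complexity.3 Strength.3 Contrast.7 Busyness.3". Could you print the column names you read and check whether similar suffixes have been added to them?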
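The `.3` and `.7` suffixes described above look like what pandas does when a file has repeated column headers: by default its readers keep the first occurrence and rename later ones `name.1`, `name.2`, and so on, which also fits the `num.1` to `num.419` experiment. Whether this is what actually broke the loop in this thread is not confirmed here. A minimal sketch, using a made-up in-memory CSV, of how to see the renaming and list the headers that were duplicated in the raw file:

```python
import io
import pandas as pd

# Made-up CSV text with the header "Contrast" appearing twice
csv_text = "R,PAT,Contrast,Contrast,Strength\n1,a,0.1,0.2,0.3\n2,a,0.4,0.5,0.6\n"

# pandas renames the second "Contrast" so that column labels stay unique
df = pd.read_csv(io.StringIO(csv_text))
print(list(df.columns))

# Read the raw header line ourselves to find names that occur more than once
raw_header = csv_text.splitlines()[0].split(",")
duplicated = sorted({name for name in raw_header if raw_header.count(name) > 1})
print("duplicated headers:", duplicated)
```

If duplicated headers turn up, giving them distinct names in the source file (for example by prefixing the filter or image type they belong to) keeps the feature names meaningful in the ICC output.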

              Hukui Add me on QQ: 3043663828, and let's work through this together.

                The problem is solved. Thank you both for the guidance, and special thanks to ltp0810.

                  6 months later

                  ltp0810 Thanks, it works. The quotation marks have to be changed to plain ASCII quotes. Building on your code, here is my version:

                  import pingouin as pg
                  import pandas as pd
                  import numpy as np
                  import os

                  folderPath = r"E:\CT_ICGR15\feeature_selection\delete"
                  data1 = pd.read_csv(os.path.join(folderPath, "10.csv"))
                  data2 = pd.read_csv(os.path.join(folderPath, "11.csv"))

                  data1.insert(0, "reader", np.ones(data1.shape[0]))
                  data2.insert(0, "reader", np.ones(data2.shape[0])*2)
                  data1.insert(0, "patient", range(data1.shape[0]))
                  data2.insert(0, "patient", range(data2.shape[0]))

                  data_inter = pd.concat([data1, data2])  # inter-reader data
                  print(data_inter.columns)

                  ICC_inter = []  # inter-reader ICC
                  for colName in data_inter.columns[3:]:
                      ICC = pg.intraclass_corr(data=data_inter, targets="patient", raters="reader", ratings=colName)
                      ICC = ICC.iloc[2, 2]  # selects ICC3; use [0, 2] for ICC1, [1, 2] for ICC2
                      ICC_inter.append(ICC)
                  print(ICC_inter)

                  df = pd.DataFrame(ICC_inter)
                  df.to_csv('E:\\ICC_inter.csv')

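                  2 months later

                  Hello OP. After my for loop only one result is output. Is that because each iteration overwrites the previous result? How can I fix it? My code is below; thanks in advance for your help.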

                  import pingouin as pg
                  import pandas as pd
                  import numpy as np
                  import os

                  folderPath = "C:\\Users\\HP15-ab006TX\\Desktop\\ICC"
                  data1 = pd.read_csv(os.path.join(folderPath, "AA.csv"))
                  data2 = pd.read_csv(os.path.join(folderPath, "BB.csv"))

                  data1.insert(0, "reader", np.ones(data1.shape[0]))
                  data2.insert(0, "reader", np.ones(data2.shape[0])*2)
                  data1.insert(0, "patient", range(data1.shape[0]))
                  data2.insert(0, "patient", range(data2.shape[0]))

                  data_inter = pd.concat([data1, data2])
                  print(data_inter.columns)

                  icc_inter = []
                  for colName in data_inter.columns[2:]:
                      ICC = pg.intraclass_corr(data=data_inter, targets="patient", raters="reader", ratings=colName)
                      ICC = ICC.iloc[0, 2]
                      icc_inter.append(ICC)
                  print(icc_inter)
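A common cause of ending up with a single value is that the `append` call sits outside the loop body, so only the result of the last iteration is stored. Storing each result under its feature name inside the loop avoids that and also makes it obvious which ICC belongs to which column. The sketch below shows the pattern on made-up data. `icc3_from_long` is a hypothetical NumPy stand-in for `pg.intraclass_corr` (it computes only the textbook ICC(3,1)) so that the example runs without pingouin, and the `feat_a`/`feat_b` columns are invented.

```python
import numpy as np
import pandas as pd

def icc3_from_long(data, targets, raters, ratings):
    """Textbook ICC(3,1) from long-format data; stand-in for pg.intraclass_corr."""
    wide = data.pivot(index=targets, columns=raters, values=ratings)
    x = wide.to_numpy(dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)

# Made-up feature tables for two readers (one row per patient)
reader1 = pd.DataFrame({"feat_a": [1.0, 2.0, 3.0, 4.0], "feat_b": [1.0, 2.0, 3.0, 4.0]})
reader2 = pd.DataFrame({"feat_a": [1.1, 2.1, 2.9, 4.2], "feat_b": [4.0, 3.0, 2.0, 1.0]})

frames = []
for reader_id, frame in enumerate([reader1, reader2], start=1):
    frame = frame.copy()
    frame.insert(0, "reader", reader_id)
    frame.insert(0, "patient", range(len(frame)))
    frames.append(frame)
data_inter = pd.concat(frames, ignore_index=True)

icc_by_feature = {}
for col in data_inter.columns[2:]:  # skip the "patient" and "reader" columns
    # The assignment is inside the loop body, so every feature is kept
    icc_by_feature[col] = icc3_from_long(data_inter, "patient", "reader", col)

icc_table = pd.Series(icc_by_feature, name="ICC3")
print(icc_table)
```

The resulting Series can be written out with `icc_table.to_csv(...)`, which keeps the feature names next to the values.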
