Chinese Journal of Medical Ultrasound (Electronic Edition) ›› 2022, Vol. 19 ›› Issue (06): 554-560. doi: 10.3877/cma.j.issn.1672-6448.2022.06.011

Superficial Organ Ultrasound Imaging

Diagnostic performance of ultrasound-based radiomics models for predicting breast cancer

Panpan Zhang1, Qingling Zhang1, Yin Hou1, Liangxi Zhang1, Zhongbing Shen1, Kebing Lu1, Zhe Hang1, Xiangming Zhu1

  1. Department of Ultrasonography, The First Affiliated Hospital of Wannan Medical College, Wuhu 241001, Anhui, China
  • Received: 2020-11-09 Published: 2022-06-01
  • Corresponding author: Qingling Zhang
  • Funding:
    Central Government Guided Local Science and Technology Development Special Project (2017070802D152)
Cite this article:

Panpan Zhang, Qingling Zhang, Yin Hou, Liangxi Zhang, Zhongbing Shen, Kebing Lu, Zhe Hang, Xiangming Zhu. Diagnostic performance of ultrasound-based radiomics models for predicting breast cancer[J]. Chinese Journal of Medical Ultrasound (Electronic Edition), 2022, 19(06): 554-560.

Objective

To assess and compare the diagnostic performance of ultrasound-based radiomics models constructed by different machine learning (ML) algorithms for identifying breast cancer.

Methods

This retrospective study included 828 consecutive patients with pathologically confirmed breast masses who presented to the First Affiliated Hospital of Wannan Medical College between January 2017 and April 2019. Using August 31, 2018 as the cutoff date, the patients were divided into a training set (n=526) and a validation set (n=302). Radiomics features were extracted from the gray-scale ultrasound images of the masses. After feature selection, five prediction models were built on the training set with five ML algorithms: k-nearest neighbor (kNN), logistic regression (LR), naive Bayes (NB), random forest (RF), and support vector machine (SVM). Internal validation used repeated k-fold cross-validation, and sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated and compared across the models. In external validation, discrimination was assessed with receiver operating characteristic (ROC) curves and the area under the curve (AUC), and calibration was assessed with calibration curves.
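The date-based split and the repeated k-fold scheme described in the Methods can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code. The record field `exam_date`, the choice to assign the cutoff day itself to the training set, and the values of `k` and `repeats` are all assumptions, since the abstract does not report them.

```python
import random
from datetime import date

CUTOFF = date(2018, 8, 31)  # cutoff date stated in the abstract


def temporal_split(cases, cutoff=CUTOFF):
    """Split cases into a training set and a validation set by date.

    Assumption: cases dated on or before the cutoff go to training;
    the abstract does not say which side the cutoff day falls on.
    """
    train = [c for c in cases if c["exam_date"] <= cutoff]
    valid = [c for c in cases if c["exam_date"] > cutoff]
    return train, valid


def repeated_kfold(n_samples, k=5, repeats=3, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated k-fold cross-validation.

    k and repeats are illustrative values, not taken from the study.
    """
    rng = random.Random(seed)
    for _ in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh random partition for every repeat
        folds = [order[i::k] for i in range(k)]
        for i in range(k):
            test_idx = sorted(folds[i])
            train_idx = sorted(
                j for f in folds[:i] + folds[i + 1:] for j in f
            )
            yield train_idx, test_idx


if __name__ == "__main__":
    # Hypothetical records, not study data.
    cases = [
        {"id": 1, "exam_date": date(2017, 3, 14)},
        {"id": 2, "exam_date": date(2018, 8, 31)},
        {"id": 3, "exam_date": date(2018, 9, 1)},
        {"id": 4, "exam_date": date(2019, 4, 2)},
    ]
    train, valid = temporal_split(cases)
    print(len(train), "training cases;", len(valid), "validation cases")
    print(len(list(repeated_kfold(10, k=5, repeats=3))), "train/test splits")
```

Splitting by date rather than at random keeps every validation case later in time than the training cases, which is what allows the validation set to serve as a temporal check on the models.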

Results

Nineteen of the 109 extracted radiomics features were selected, and five ML prediction models were established. In internal validation, sensitivity, specificity, PPV, and NPV each differed significantly across the five models (all P<0.001). The median specificity, PPV, and NPV of the LR model were 0.769, 0.816, and 0.778, respectively, all higher than those of the other four models; its median sensitivity was 0.824, higher than that of the kNN, RF, and SVM models. The median specificity, PPV, and NPV of the SVM model were 0.706, 0.774, and 0.759, respectively, which were lower than those of the LR model but higher than those of the other three models. In external validation, the AUCs of the LR, SVM, RF, kNN, and NB models were 0.890, 0.832, 0.821, 0.746, and 0.703, respectively, and the difference in AUC between the LR and SVM models was significant (P=0.012). Calibration performance varied across the models; the calibration curves of the LR and SVM models showed good agreement between the predicted and actual probability of breast cancer.
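For readers less familiar with the metrics reported above, the sketch below computes sensitivity, specificity, PPV, and NPV from binary labels, and AUC as the fraction of malignant/benign pairs that the model ranks correctly (the Mann-Whitney formulation). The labels and scores are made up for illustration and are not study data. The function names are hypothetical, and the statistical comparison between two AUCs is not covered here.

```python
def _ratio(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def diagnostic_metrics(y_true, y_pred):
    """Compute the four metrics from binary labels (1 = malignant, 0 = benign)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "sensitivity": _ratio(tp, tp + fn),  # malignant cases detected
        "specificity": _ratio(tn, tn + fp),  # benign cases correctly cleared
        "ppv": _ratio(tp, tp + fp),          # positive calls that are malignant
        "npv": _ratio(tn, tn + fn),          # negative calls that are benign
    }


def auc(y_true, scores):
    """AUC as the probability that a malignant case outscores a benign one."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case of each class")
    # Ties between a malignant and a benign score count as half a win.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    y_true = [1, 1, 1, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption
    print(diagnostic_metrics(y_true, y_pred))
    print(round(auc(y_true, scores), 3))
```

The first four metrics depend on the probability threshold used to call a mass malignant, whereas AUC summarizes ranking quality across all thresholds, which is why the two kinds of measure are reported separately.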

Conclusion

All five ultrasound-based radiomics models built with different ML algorithms showed high diagnostic performance, with the LR model performing best. Choosing an appropriate ML algorithm can further improve the diagnostic performance of the prediction model and yield more accurate quantitative predictions.

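Figure 1 Ultrasound images used for radiomics feature extraction from a breast mass. Panel a: conventional two-dimensional ultrasound showing the section with the maximum diameter of the breast mass; panel b: contour of the mass delineated with 3D Slicer software
Table 1 Comparison of baseline characteristics between the training and validation sets [n (%)]
Figure 2 Selection of breast mass radiomics features using LASSO regression. Panel a: selection of the optimal parameter by 10-fold cross-validation; panel b: selection process for the 109 features
Table 2 Diagnostic performance of the five prediction models for breast masses in internal validation [M (P25, P75)]
Figure 3 ROC curves of the five prediction models for diagnosing breast masses in external validation. Note: LR, SVM, RF, kNN, and NB are the five prediction models; AUC is the area under the ROC curve
Figure 4 Calibration curves of the five prediction models in external validation. Panels a-e: calibration curves of the LR, SVM, RF, kNN, and NB models, respectively (the calibration curve shows the actual value corresponding to each predicted value; if predicted value = actual value, the curve coincides with the diagonal; if predicted value > actual value, i.e. risk is overestimated, the curve lies above the diagonal; if predicted value < actual value, i.e. risk is underestimated, the curve lies below the diagonal). Note: Predicted Probability is the risk predicted by the model; Actual Probability is the observed risk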
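The points behind a calibration curve such as those in Figure 4 can be obtained by grouping cases by predicted probability and comparing the mean prediction in each group with the observed fraction of malignant cases. The sketch below is one common way to do this, using equal-width bins. The bin count and function name are assumptions, and the abstract does not state how the authors' curves were computed.

```python
def calibration_points(y_true, y_prob, n_bins=5):
    """Return (mean predicted probability, observed event rate) per bin.

    Uses equal-width bins on [0, 1]; empty bins are skipped.
    n_bins is an illustrative choice, not taken from the study.
    """
    bins = [[] for _ in range(n_bins)]
    for label, prob in zip(y_true, y_prob):
        # A probability of exactly 1.0 is placed in the last bin.
        idx = min(int(prob * n_bins), n_bins - 1)
        bins[idx].append((prob, label))

    points = []
    for members in bins:
        if members:
            mean_pred = sum(p for p, _ in members) / len(members)
            observed = sum(t for _, t in members) / len(members)
            points.append((mean_pred, observed))
    return points


if __name__ == "__main__":
    # Made-up predictions, not study data.
    y_true = [0, 0, 1, 0, 1, 1, 1, 0]
    y_prob = [0.05, 0.15, 0.35, 0.45, 0.55, 0.75, 0.95, 0.65]
    for mean_pred, observed in calibration_points(y_true, y_prob, n_bins=4):
        print(f"predicted {mean_pred:.2f}  observed {observed:.2f}")
```

A well-calibrated model yields points where the two numbers are close to each other in every bin.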