An automatic deep learning-based recognition model for liver trauma ultrasound images

Yanjie Wang; Yukun Luo; Xuelei He; Qing Song; Kun Wang; Jun Ma; Peng Han; Shuoshuo Li; Linli Kang

doi:10.3877/cma.j.issn.1672-6448.2022.03.002

Chinese Journal of Medical Ultrasound (Electronic Edition), 2022, Vol. 19, Issue 03: 195-199


Abdominal Ultrasound


Yanjie Wang ¹, Yukun Luo ²,†, Xuelei He ³, Qing Song ², Kun Wang ⁴, Jun Ma ¹, Peng Han ², Shuoshuo Li ², Linli Kang ²


1. Department of Ultrasound, the First Medical Center of Chinese PLA General Hospital, Beijing 100853, China; Chinese PLA Medical School, Beijing 100853, China
2. Department of Ultrasound, the First Medical Center of Chinese PLA General Hospital, Beijing 100853, China
3. CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; School of Information Sciences and Technology, Northwest University, Xi'an 710127, China
4. CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China

Corresponding author: Luo Yukun, Email: lyk301@163.com

Received date: 2021-12-08

Online published: 2022-04-15

Copyright

No content published by the journals of Chinese Medical Association may be reproduced or abridged without authorization. Please do not use or copy the layout and design of the journals without permission.

All articles published represent the opinions of the authors, and do not reflect the official policy of the Chinese Medical Association or the Editorial Board, unless this is clearly specified.


Abstract

Objective

To establish a convolutional neural network-based liver injury diagnostic model (CNLDM) and evaluate its diagnostic value for liver trauma.

Methods

A total of 2009 ultrasound images of liver trauma and 1302 ultrasound images of normal liver were obtained through animal experiments and used as the training and validation sets of the model. As an external test set, 153 ultrasound images of liver trauma and 81 liver ultrasound images without liver trauma were retrospectively collected at the First Medical Center of Chinese PLA General Hospital from January 2015 to April 2021. Six physicians of different seniority each interpreted the external test set independently. Receiver operating characteristic (ROC) curve analysis and decision curve analysis (DCA) were used to test the performance of the model, and the six physicians and the CNLDM were compared in terms of sensitivity, specificity, accuracy, negative predictive value, and positive predictive value for liver trauma.

Results

The diagnostic performance of the CNLDM (sensitivity, 80%; specificity, 77%; positive predictive value, 87%; negative predictive value, 66%) was better than that of the junior physician group (sensitivity, 61%; specificity, 75%; positive predictive value, 82%; negative predictive value, 51%) (H=15.306, P<0.001), inferior to that of the senior physician group (sensitivity, 84%; specificity, 86%; positive predictive value, 92%; negative predictive value, 75%) (H=3.289, P<0.001), and close to that of the mid-level physician group (P>0.05). DCA showed that the model had good net benefit on the test set when the threshold was between 0.4 and 0.6.

Conclusion

The artificial intelligence-based ultrasound model can accurately distinguish normal liver from abnormal liver with trauma, which is of great significance for further guiding clinical diagnosis and treatment.

Key words: Artificial intelligence; Deep learning; Hepatic trauma; Diagnosis; Ultrasonography

Cite this article

Yanjie Wang, Yukun Luo, Xuelei He, Qing Song, Kun Wang, Jun Ma, Peng Han, Shuoshuo Li, Linli Kang. An automatic deep learning-based recognition model for liver trauma ultrasound images [J]. Chinese Journal of Medical Ultrasound (Electronic Edition), 2022, 19(03): 195-199. DOI: 10.3877/cma.j.issn.1672-6448.2022.03.002

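The liver is an important organ of the human body and is prone to parenchymal contusion and laceration under external force [1]. Liver injury accounts for about 15% of all abdominal injuries, and some patients also develop complications such as secondary hemorrhage, biliary injury, bile leakage, and peritonitis syndrome [2]. Ultrasound is real-time, dynamic, convenient, and repeatable, and is one of the main tools for the initial screening of trauma patients. In the acute phase (within 1 week), hepatic parenchymal contusion usually appears on ultrasound as heterogeneous hyperechogenicity [3]; over time, the echogenicity decreases and the lesion shrinks. Studies have shown that ultrasound has high specificity but low sensitivity for identifying blunt and penetrating hepatic parenchymal injury, so some parenchymal injuries are still missed [4, 5].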

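Deep learning is a branch of artificial intelligence in which multilayer neural networks automatically extract image features, build a programmatic analysis model, and output lesion predictions [6]. The value of deep learning in assisting ultrasound diagnosis of various diseases has been confirmed by multiple studies [7, 8]. It can detect subtle image texture features that the naked eye cannot recognize, which may help in detecting hepatic parenchymal contusion and laceration on ultrasound images. This study combines deep learning with ultrasound imaging and proposes a convolutional neural network liver damage model (CNLDM) to identify whether hepatic parenchymal contusion and laceration is present in trauma patients. The model's diagnostic performance is assessed by comparing it with readings from physicians of different seniority.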

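Materials and Methods

I. Subjects

The subjects comprised experimental animals and clinical patients. The study was approved by the Ethics Committee of Chinese PLA General Hospital (approval No. S2020-323-01). The animal experiments respected basic animal welfare; the clinical part was a retrospective study, and informed consent was waived.

The data consisted of two parts. Part 1 was image data obtained from animal experiments, used as the training and validation sets. The experimental animals were 18 Bama miniature pigs weighing 30-35 kg, of either sex. The animals were anesthetized, immobilized, and shaved; venous access was established through the marginal ear vein, and anesthesia was maintained with continuous monitoring of vital signs. Three sonographers with more than 4 years of ultrasound experience performed the experiments after dedicated training in ultrasound assessment of liver trauma. A hepatic parenchymal contusion and laceration model was created with a custom-made impactor, and the location of the injury was confirmed by direct visual inspection and contrast-enhanced ultrasound. In total, 2009 ultrasound images containing trauma lesions (in different planes) and 1302 ultrasound images of normal liver parenchyma were obtained. Part 2 was clinical data, used as the test set. We retrospectively selected 37 patients with confirmed hepatic parenchymal contusion and laceration who presented to the First Medical Center of Chinese PLA General Hospital from January 2015 to April 2021, yielding 153 ultrasound images. Inclusion criteria: (1) emergency ultrasound examination at our hospital for trauma, with the injury occurring within 1 week; (2) hepatic parenchymal contusion and laceration confirmed clinically or by other imaging; (3) complete case records. Exclusion criteria: (1) poor image quality; (2) diffuse parenchymal disease caused by other pathology (such as cirrhosis), or parenchymal injury from non-traumatic causes. In addition, 81 liver ultrasound images from 81 patients without hepatic parenchymal contusion and laceration were selected as control data.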

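II. Instruments and Methods

1. Instruments: Philips EPIQ 7, GE Logiq E9, and Mindray M9 color Doppler ultrasound systems were used with low-frequency convex array probes at 3.5-5.0 MHz. Two-dimensional ultrasound images were stored and exported in JPG format. All images were resized to 224×224 pixels.

2. Model construction: Building on a conventional convolutional neural network [9], the CNLDM was constructed by transfer learning from a ResNet50 pretrained on ImageNet. Convolutional layers 1-4 of the network were frozen and layer 5 was trainable (Figure 1). Overfitting was mitigated by enlarging the dataset, augmenting the data, lowering the learning rate, and applying regularization. The animal data were divided into training and validation sets at a 2:1 ratio and trained with three-fold cross-validation: the data were split into three equal parts, two parts were used for training and one for validation in turn, and the three rounds of training produced three models, which were combined into the final animal-data model. Given the differences between animal and clinical ultrasound images, 93 clinical images were used to fine-tune the animal model, yielding the CNLDM (Figure 2).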

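Figure 1. Schematic of the construction of the convolutional neural network liver damage model

Figure 2. Data allocation and overall framework of the convolutional neural network liver damage model (CNLDM)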
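The three-fold procedure described in the model construction step can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the random seed are hypothetical, and averaging the three fold probabilities is only one plausible way to combine the models, since the paper does not state how the three models were assembled.

```python
import random

def three_fold_splits(n_items, seed=0):
    """Yield (train_idx, val_idx) pairs: two folds train, one fold validates."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Three folds of (nearly) equal size, giving a 2:1 train/validation ratio.
    folds = [indices[i::3] for i in range(3)]
    for k in range(3):
        val_idx = folds[k]
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train_idx, val_idx

def ensemble_probability(fold_probs):
    """Assumed combination rule: mean trauma probability over the fold models."""
    return sum(fold_probs) / len(fold_probs)

n_images = 2009 + 1302  # animal images reported in the paper
splits = list(three_fold_splits(n_images))
for train_idx, val_idx in splits:
    print(len(train_idx), len(val_idx))

# Hypothetical per-fold outputs for one image.
print(ensemble_probability([0.62, 0.71, 0.55]))
```

Each image appears in exactly one validation fold, so every animal image contributes to both training and validation across the three rounds.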

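3. Model performance testing: The CNLDM's readings were compared with those of physician groups of different seniority. Six sonographers were divided into junior, mid-level, and senior groups of two each, with 4, 8, and 15 years of ultrasound experience, respectively. Each of the six sonographers read the clinical test set images. All physicians made their diagnoses independently and were blinded to the true results.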

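III. Statistical Analysis

Statistical analysis was performed with R 21 software. The validation sets from three-fold cross-validation were used to evaluate the animal model. The test set was used to evaluate the predictive performance of the CNLDM and of the physicians. Model test results are presented as receiver operating characteristic (ROC) curves and decision curve analysis (DCA). Sensitivity, specificity, positive predictive value, negative predictive value, and accuracy were compared between the CNLDM and the physician groups, and 95% confidence intervals for the model were computed by the bootstrap method. To represent the mean diagnostic level of each physician group, physician results are expressed as $\bar{x} \pm s$. The Kruskal-Wallis test was used to compare overall differences between the model and the physicians. All tests were two-sided, and P < 0.05 was considered statistically significant.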
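The bootstrap confidence interval mentioned above can be illustrated with a percentile bootstrap. This is a sketch under stated assumptions: the paper does not say which bootstrap variant or how many resamples were used, and the labels, predictions, function names, and `n_boot` value below are hypothetical toy inputs rather than study data.

```python
import random

def sensitivity(labels, preds):
    """TP / (TP + FN); NaN when the sample contains no positive case."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else float("nan")

def bootstrap_ci(labels, preds, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap: resample cases with replacement, recompute metric."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        value = metric([labels[i] for i in idx], [preds[i] for i in idx])
        if value == value:  # drop NaN from resamples without positives
            stats.append(value)
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lower, upper

# Toy data: 40 trauma images (32 detected) and 20 normal images (5 false alarms).
labels = [1] * 40 + [0] * 20
preds = [1] * 32 + [0] * 8 + [0] * 15 + [1] * 5

point = sensitivity(labels, preds)
lower, upper = bootstrap_ci(labels, preds, sensitivity)
print(f"sensitivity {point:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

The same `bootstrap_ci` call works for any metric function with the `(labels, preds)` signature, such as specificity or accuracy.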

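Results

On the clinical test set, the CNLDM identified hepatic parenchymal contusion and laceration with a sensitivity, specificity, and accuracy of 80%, 77%, and 79%, respectively, and an AUC of 0.86 (95% confidence interval: 0.82-0.89), indicating fairly reliable identification (Figure 3). DCA showed that the model had good net benefit on the test set at thresholds between 0.4 and 0.6 (Figure 4). The junior, mid-level, and senior physician groups differed in sensitivity (61%, 72%, 84%), specificity (75%, 89%, 86%), and accuracy (66%, 78%, 85%). The CNLDM performed better than the junior group and slightly worse than the senior group, with statistically significant differences (P < 0.05), and performed similarly to the mid-level group, with no statistically significant difference (P > 0.05; Table 1). A heat map was used for visual interpretation of the model; red areas mark the suspected regions of hepatic parenchymal contusion and laceration on which the model focuses (Figure 5).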
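For readers who want to see how the five reported measures relate to one another, the sketch below computes them from a 2×2 confusion table. The counts are hypothetical: the paper does not report them, and they were chosen only so that they sum to 153 trauma and 81 non-trauma images and land close to the reported percentages.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic measures from confusion-table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # detected trauma / all trauma
        "specificity": tn / (tn + fp),  # correct normals / all normals
        "ppv": tp / (tp + fp),          # true trauma among positive calls
        "npv": tn / (tn + fn),          # true normals among negative calls
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }

# Hypothetical counts (not from the paper): 153 trauma, 81 non-trauma images.
metrics = diagnostic_metrics(tp=122, fn=31, tn=62, fp=19)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
# sensitivity: 0.80
# specificity: 0.77
# ppv: 0.87
# npv: 0.67
# accuracy: 0.79
```

Because positive and negative predictive values depend on the share of trauma images in the test set, they would change if the same model were applied to a population with a different trauma prevalence.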

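Figure 3. Performance of the convolutional neural network liver damage model (CNLDM) and the physician groups of different seniority on the test set. The blue curve is the receiver operating characteristic curve of the CNLDM. The three points are the mean performance of the junior, mid-level, and senior physician groups ($\bar{x} \pm s$)

Figure 4. Decision curve analysis of the convolutional neural network liver damage model. On the test-set curve, at thresholds between 0.4 and 0.6, the model has a net benefit of about 0.6 or above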
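Decision curve analysis plots net benefit against the probability threshold. A minimal sketch of the standard net-benefit formula, $\mathrm{NB} = \mathrm{TP}/N - (\mathrm{FP}/N)\cdot p_t/(1-p_t)$, follows; the labels and probabilities are hypothetical toy values, not study data, and the function names are illustrative.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of calling an image positive when prob >= threshold."""
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

def net_benefit_treat_all(labels, threshold):
    """Reference strategy: call every image positive."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Toy data: 6 trauma and 4 normal images with hypothetical model probabilities.
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
probs = [0.9, 0.8, 0.75, 0.6, 0.55, 0.3, 0.7, 0.45, 0.2, 0.1]

for t in (0.4, 0.5, 0.6):
    print(t, round(net_benefit(labels, probs, t), 3),
          round(net_benefit_treat_all(labels, t), 3))
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all reference and zero (the treat-none strategy).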

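Table 1. Performance of the CNLDM and physicians of different seniority in identifying hepatic parenchymal contusion and laceration

Group	Sensitivity	Specificity	Positive predictive value	Negative predictive value	Accuracy
CNLDM	0.80 (0.61-0.91)	0.77 (0.65-0.95)	0.87 (0.82-0.96)	0.66 (0.53-0.80)	0.79 (0.71-0.84)
Junior^a	0.61±0.14	0.75±0.03	0.82±0.02	0.51±0.08	0.66±0.08
Mid-level^b	0.72±0.02	0.89±0.03	0.92±0.02	0.63±0.03	0.78±0.03
Senior^c	0.84±0.08	0.86±0.06	0.92±0.03	0.75±0.09	0.85±0.03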

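Note: CNLDM, convolutional neural network liver damage model. Model performance is given with 95% confidence intervals; physician group performance is given as $\bar{x} \pm s$. Compared with the CNLDM: ^a H=15.306, P<0.001; ^b H=3.698, P=0.157; ^c H=3.289, P<0.001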
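The H values in the table note come from the Kruskal-Wallis test. As a rough illustration of how H is computed, the sketch below implements the rank-based statistic with the usual tie correction. The three groups of scores are hypothetical, and the paper does not describe which observations were entered into each comparison. The closed-form P-value used here applies only to three groups (2 degrees of freedom).

```python
import math

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H with tie correction (assumes not all values are equal)."""
    pooled = sorted(value for group in groups for value in group)
    n_total = len(pooled)
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sum_term = sum(
        sum(rank_of[v] for v in group) ** 2 / len(group) for group in groups
    )
    h = 12 / (n_total * (n_total + 1)) * rank_sum_term - 3 * (n_total + 1)
    return h / (1 - tie_term / (n_total ** 3 - n_total))

# Hypothetical accuracy-like scores for three groups.
groups = [[0.61, 0.63, 0.66], [0.76, 0.78, 0.80], [0.83, 0.85, 0.88]]
h = kruskal_wallis_h(groups)
p = math.exp(-h / 2)  # chi-square upper tail, valid for 2 degrees of freedom
print(round(h, 3), round(p, 4))
```

For other numbers of groups the chi-square tail has no such simple form, and a statistics package would normally be used instead.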

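Figure 5. Visualization of model output for example cases. The heat map is obtained by overlaying the last feature layer computed by the model, that is, the high-dimensional features, on the input image. Red areas are the regions on which the model focuses, that is, the regions the trained model predicts as having a high probability of liver trauma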
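The caption describes the heat map as the last feature layer overlaid on the input image. One simple way to build such an overlay is sketched below, assuming channel averaging, min-max scaling, nearest-neighbor upsampling, and alpha blending into the red channel. The paper does not state the exact method (for example, whether class-weighted maps were used), so the function names, the `alpha` value, and the toy 7×7 feature maps are all hypothetical.

```python
def mean_activation_map(feature_maps):
    """Average the channels and scale the result to the range [0, 1]."""
    c = len(feature_maps)
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    mean_map = [[sum(fm[r][k] for fm in feature_maps) / c for k in range(w)]
                for r in range(h)]
    lo = min(min(row) for row in mean_map)
    hi = max(max(row) for row in mean_map)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat map
    return [[(v - lo) / span for v in row] for row in mean_map]

def upsample_nearest(small, out_h, out_w):
    """Nearest-neighbor upsampling to the input image size."""
    in_h, in_w = len(small), len(small[0])
    return [[small[r * in_h // out_h][k * in_w // out_w] for k in range(out_w)]
            for r in range(out_h)]

def overlay_red(gray_image, heat, alpha=0.4):
    """Blend the heat map into the red channel; returns rows of (R, G, B)."""
    return [[(round((1 - alpha) * g + alpha * 255 * hv),
              round((1 - alpha) * g),
              round((1 - alpha) * g))
             for g, hv in zip(img_row, heat_row)]
            for img_row, heat_row in zip(gray_image, heat)]

# Toy input: two 7x7 feature maps with one active cell, and a flat gray image.
fm1 = [[0.0] * 7 for _ in range(7)]
fm2 = [[0.0] * 7 for _ in range(7)]
fm1[3][3], fm2[3][3] = 4.0, 2.0
image = [[100] * 224 for _ in range(224)]

heat = upsample_nearest(mean_activation_map([fm1, fm2]), 224, 224)
blended = overlay_red(image, heat)
print(blended[100][100], blended[0][0])  # → (162, 60, 60) (60, 60, 60)
```

The 7×7 grid matches the spatial size of the final ResNet50 feature map for a 224×224 input, so each feature-map cell covers a 32×32 pixel block of the image.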

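Discussion

In recent years, deep learning has been increasingly applied to support decision-making in ultrasound, including image classification [10, 11], localization and segmentation [12, 13], lesion characterization [14, 15], functional assessment [16], and prognosis prediction [17, 18], which facilitates clinical diagnosis and treatment. Most of the liver lies in the right upper abdomen; it is large and fragile, so it is easily injured when the abdomen is struck by an external force. After trauma, the patient's vital signs and organ status, including blood pressure, heart rate, mental status, and organ damage, must be assessed urgently to help clinicians make the most appropriate treatment decisions [19, 20, 21]. Under emergency conditions such as the battlefield, ultrasound is the most convenient, feasible, and effective imaging modality for tactical care: it can assess organ damage and internal bleeding and can guide procedures under direct visualization, playing an irreplaceable role at the front line of emergency care [22]. Based on a deep learning network, this study innovatively applies artificial intelligence to the identification of hepatic parenchymal contusion and laceration on ultrasound images, aiming to help improve its clinical detection rate.

This study proposes a transfer learning model based on the ResNet50 architecture with ImageNet-pretrained weights to detect hepatic parenchymal contusion and laceration on ultrasound images. Because clinical images of such injuries were too few to meet training requirements, the model was first trained on the animal cohort, retaining the low-level features in the pretrained weights, and was then fine-tuned on clinical images to learn higher-level features specific to clinical images of hepatic parenchymal contusion and laceration. To evaluate the model, a test cohort was built from clinical patient data, and the reading performance of sonographer groups of different seniority was compared with that of the CNLDM. The results show that the model predicts hepatic parenchymal contusion and laceration on ultrasound images well: it outperforms junior physicians, is slightly inferior to senior physicians, and performs similarly to mid-level physicians with 8 years of ultrasound experience. This matters for primary hospitals with limited medical resources and for urgent field conditions, where the model could support timely assessment and guide further treatment planning.

This study has some limitations. First, the model predicts whether parenchymal contusion and laceration is present in an ultrasound image and can only output "present" or "absent". Although the visualization helps explain the model, its results are not fully accurate. Identifying the location and size of the injury, that is, the localization problem, will be addressed in future studies. In addition, although the dataset was enlarged, the data were augmented, the learning rate was lowered, and regularization was applied, the model still shows some overfitting; future studies should increase the sample size and further optimize the model [23].

In summary, the ultrasound image-based deep learning model proposed here is a preliminary exploration of automatic identification of hepatic parenchymal contusion and laceration. Its diagnostic performance exceeds that of junior physicians, and it has some practical value, particularly for primary hospitals with limited medical conditions and for field units: it can assist ultrasound diagnosis, improve the detection rate of hepatic parenchymal contusion and laceration, and provide reference information for clinical assessment.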

References


1	van As AB, Millar AJW. Management of paediatric liver trauma [J]. Pediatr Surg Int, 2017, 33(4): 445-453.

2	Buci S, Torba M, Gjata A, et al. The rate of success of the conservative management of liver trauma in a developing country [J]. World J Emerg Surg, 2017, 12: 24.

3	Padalino P, Bomben F, Chiara O, et al. Healing of blunt liver injury after non-operative management: role of ultrasonography follow-up [J]. Eur J Trauma Emerg Surg, 2009, 35(4): 364-370.

4	Badger SA, Barclay R, Campbell P, et al. Management of liver trauma [J]. World J Surg, 2009, 33(12): 2522-2537.

5	Stengel D, Bauwens K, Rademacher G, et al. Emergency ultrasound-based algorithms for diagnosing blunt abdominal trauma [J]. Cochrane Database Syst Rev, 2015, 2015(9): CD004446.

6	Chan HP, Samala RK, Hadjiiski LM, et al. Deep learning in medical image analysis [J]. Adv Exp Med Biol, 2020, 1213: 3-21.

7	Pehrson LM, Lauridsen C, Nielsen MB. Machine learning and deep learning applied in ultrasound [J]. Ultraschall Med, 2018, 39(4): 379-381.

8	Brattain LJ, Telfer BA, Dhyani M, et al. Machine learning for medical ultrasound: status, methods, and future opportunities [J]. Abdom Radiol (NY), 2018, 43(4): 786-799.

9	Lai S, Xu L, Liu K, et al. Recurrent convolutional neural networks for text classification [C]. 29th AAAI Conference on Artificial Intelligence, Austin, Texas, USA, 2015.

10	Xie HN, Wang N, He M, et al. Using deep-learning algorithms to classify fetal brain ultrasound images as normal or abnormal [J]. Ultrasound Obstet Gynecol, 2020, 56(4): 579-587.

11	Xue L, Jiang Z, Fu T, et al. Transfer learning radiomics based on multimodal ultrasound imaging for staging liver fibrosis [J]. Eur Radiol, 2020, 30(5): 2973-2983.

12	Zhou R, Fenster A, Xia Y, et al. Deep learning‐based carotid media‐adventitia and lumen‐intima boundary segmentation from three‐dimensional ultrasound images [J]. Med Phys, 2019, 46(7): 3180-3193.

13	Karimi D, Zeng Q, Mathur P, et al. Accurate and robust deep learning-based segmentation of the prostate clinical target volume in ultrasound images [J]. Med Image Anal, 2019, 57: 186-196.

14	Jin Z, Zhu Y, Zhang S, et al. Ultrasound computer-aided diagnosis (CAD) based on the thyroid imaging reporting and data system (TI-RADS) to distinguish benign from malignant thyroid nodules and the diagnostic performance of radiologists with different diagnostic experience [J]. Med Sci Monit, 2020, 26: e918452.

15	Shia W, Lin L, Chen D. Classification of malignant tumours in breast ultrasound using unsupervised machine learning approaches [J]. Sci Rep, 2021, 11(1): 1-11.

16	Kuo C, Chang C, Liu K, et al. Automation of the kidney function prediction and classification through ultrasound-based kidney imaging using deep learning [J]. NPJ Digit Med, 2019, 2: 29.

17	Liu F, Liu D, Wang K, et al. Deep learning radiomics based on contrast-enhanced ultrasound might optimize curative treatments for very-early or early-stage hepatocellular carcinoma patients [J]. Liver Cancer, 2020, 9(4): 397-413.

18	Liu D, Liu F, Xie X, et al. Accurate prediction of responses to transarterial chemoembolization for patients with hepatocellular carcinoma by using artificial intelligence in contrast-enhanced ultrasound [J]. Eur Radiol, 2020, 30(4): 2365-2376.

19	Gondek S, Schroeder ME, Sarani B. Assessment and resuscitation in trauma management [J]. Surg Clin North Am, 2017, 97(5): 985-998.

20	Schembari E, Sofia M, Latteri S, et al. Blunt liver trauma: effectiveness and evolution of non-operative management (NOM) in 145 consecutive cases [J]. Updates Surg, 2020, 72(4): 1065-1071.

21	Parra-Romero G, Contreras-Cantero G, Orozco-Guibaldo D, et al. Trauma abdominal: experiencia de 4961 casos en el occidente de México [J]. Cir Cir, 2019, 87(2): 183-189.

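22	Expert consensus on the application of ultrasound technology in tactical combat casualty care [J/CD]. Chinese Journal of Medical Ultrasound (Electronic Edition), 2019, 16(12): 892-898.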

23	Recht MP, Dewey M, Dreyer K, et al. Integrating artificial intelligence into the clinical practice of radiology: challenges and recommendations [J]. Eur Radiol, 2020, 30(6): 3576-3584.
