Wu, An Ni, Kulbay, Merve, Cheng, Phillip M., Cadrin-Chênevert, Alexandre, Létourneau-Guillon, Laurent, Chartrand, Gabriel, Chong, Jaron, Montagnon, Emmanuel, Ben Ayed, Ismail, and Tang, An. 2025. "Deep learning models connecting images and text: A primer for radiologists." Radiographics, vol. 45, no. 9.
PDF: BenAyed-I-2025-31945.pdf (published version, 2 MB). License: Creative Commons CC BY.
Abstract
In radiology practice, medical images are described and interpreted by radiologists in text reports. Recent technical developments enabling deep learning models to connect images and text may facilitate the radiologic workflow. These developments include advances in data embedding, self-supervised learning, zero-shot learning, and transformer-based model architectures. Models connecting images and text can be divided into four categories: (a) Text-image alignment models associate text descriptions with corresponding images. (b) Image-to-text models create text descriptions from images. (c) Text-to-image models generate images from text descriptions. (d) Multimodal models integrate and interpret multiple types of data such as images, videos, text, and numbers simultaneously. Potential clinical applications of these models include automated captioning of medical images, generation of the preliminary radiology report, and creation of educational images. These advances may enable case prioritization, streamlining of clinical workflows, and improvements in diagnostic accuracy.
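To make the text-image alignment and zero-shot learning ideas concrete, here is a minimal sketch (not taken from the article) that scores one image against candidate text descriptions using a publicly available, general-domain CLIP checkpoint. The image path and candidate captions are hypothetical, and a general-domain model like this is illustrative only, not validated for clinical use.

```python
# Minimal sketch of zero-shot text-image alignment with a CLIP-style model.
# The checkpoint name is a real public general-domain model; the image path
# and candidate captions below are hypothetical, not from the article.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
model.eval()

image = Image.open("example_radiograph.png")  # hypothetical input image
candidate_texts = [
    "a chest radiograph showing pleural effusion",  # hypothetical prompts
    "a normal chest radiograph",
]

# The processor tokenizes the texts and preprocesses the image so both can
# be embedded into the same shared vector space.
inputs = processor(
    text=candidate_texts, images=image, return_tensors="pt", padding=True
)

with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds image-text similarity scores; softmax turns them
# into a probability over the candidate descriptions (zero-shot labeling).
probs = outputs.logits_per_image.softmax(dim=-1)
for text, p in zip(candidate_texts, probs[0].tolist()):
    print(f"{p:.3f}  {text}")
```

Because the image and text encoders were pretrained contrastively to embed matching pairs close together, the highest-scoring description can be read as the model's zero-shot label without any task-specific fine-tuning.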
Document type: Peer-reviewed journal article
Professor: Ben Ayed, Ismail
Affiliation: Systems Engineering
Deposited: 19 Sept. 2025 13:35
Last modified: 24 Sept. 2025 23:48
URI: https://espace2.etsmtl.ca/id/eprint/31945