PaliGemma-CXR: Adapting an Open-Weight Vision-Language Model for Chest X-ray Interpretation

Recent advances in vision-language models (VLMs) have demonstrated remarkable capabilities across diverse domains. In this talk, we explore the effectiveness of VLMs in a transfer learning setting, where a pre-trained model is fine-tuned on domain-specific data. We first introduce PaliGemma 2, a state-of-the-art, open-weight VLM from Google with detection and segmentation capabilities. We then present its application to chest X-ray (CXR) interpretation, detailing the adaptation process that achieved state-of-the-art performance on radiology report generation. The talk highlights the potential of VLMs to democratize access to advanced medical image analysis tools and offers practical guidance on how to leverage them.

About the speakers
Sahar Kazemzadeh

Software Engineer, Health AI at Google

Andreas Steiner

Research Engineer at Google

NLP-Summit

When

Online Event | April 1-2, 2025

Contact

nlpsummit@johnsnowlabs.com

Presented by

John Snow Labs