PaliGemma-CXR: Adapting an Open-Weight Vision-Language Model for Chest X-ray Interpretation
Vision-language models (VLMs) have recently demonstrated remarkable capabilities across diverse domains. In this talk, we explore the effectiveness of VLMs in a transfer learning setting, where a pre-trained model is fine-tuned on domain-specific data. We first introduce PaliGemma 2, a state-of-the-art, open-weight VLM from Google with detection and segmentation capabilities. We then present its application to chest X-ray (CXR) interpretation, detailing the adaptation process that achieved state-of-the-art performance on radiology report generation. The talk highlights the potential of VLMs to democratize access to advanced medical image analysis tools and offers practical guidance on how to leverage them.

Sahar Kazemzadeh
Software Engineer, Health AI
at Google

Andreas Steiner
Research Engineer
at Google