Visual Foundation Models
Image encoders are often trained for project-specific tasks and, as a result, fail to capture an optimal representation of the medical scan. We aim to develop powerful encoders, tuned to the medical domain, that generate high-quality embeddings capturing the complex relationships between pixels. We then apply these embeddings to downstream tasks for clinical use cases, such as:
- Automatic Radiology Report Generation with Image Grounding
- Promptable Segmentation via Textual Queries