Large Vision Language Models (LVLMs) have recently revolutionized computational pathology. LVLMs transform pathology image embeddings into tokens recognizable by large language models, facilitating zero-shot image classification, description generation, question answering, and interactive diagnostics. In clinical practice, pathological assessments often require the analysis of entire tissue slides, integrating information from multiple sub-regions and magnification levels. However, existing LVLM frameworks have been restricted to the analysis of small, predefined regions of interest and cannot analyze pyramidal, gigapixel-scale whole-slide images (WSIs). In this work, we introduce ALPaCA (Adapting Llama for Pathology Context Analysis) and train the first general-purpose slide-level LVLM, leveraging 35,913 WSIs with curated descriptions alongside 341,051 question and answer pairs encompassing diverse diagnoses, procedures, and tissue types. We develop LongFormer, a vision-text interactive slide-level adaptor, integrate it with a Gaussian mixture model-based prototyping adaptor, and then train with Llama3.1. ALPaCA achieves superior performance in slide-level question answering, with over 90% accuracy in close-ended tests and high accuracy on open-ended questions as evaluated by expert pathologists, highlighting its potential for slide-level computer-aided diagnosis systems. Additionally, we show that ALPaCA can be readily fine-tuned on in-depth, organ-specific, or disease-specific datasets, underscoring its adaptability and utility for specialized pathology tasks.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
We acknowledge funding and support from Cancer Research UK and the Cancer Research UK Cambridge Centre [CTRQQR-2021-100012], The Mark Foundation for Cancer Research [RG95043], GE HealthCare, and the CRUK National Cancer Imaging Translational Accelerator (NCITA) [A27066]. Additional support was also provided by the National Institute of Health Research (NIHR) Cambridge Biomedical Research Centre [NIHR203312] and EPSRC Tier-2 capital grant [EP/P020259/1]. The computations described in this research were performed in part using the Sulis Tier 2 HPC platform, hosted by the Scientific Computing Research Technology Platform at the University of Warwick. Sulis is funded by EPSRC Grant EP/T022108/1 and the HPC Midlands+ consortium. Additional computations were carried out using the Baskerville Tier 2 HPC service (https://www.baskerville.ac.uk/), which is operated by Advanced Research Computing at the University of Birmingham. Baskerville was funded by the EPSRC and UKRI through the World Class Labs scheme (EP/T022221/1) and the Digital Research Infrastructure programme (EP/W032244/1). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. The results shown here are in whole or part based upon data generated by the TCGA Research Network (https://www.cancer.gov/tcga). Part of the data used for the analyses described in this manuscript were obtained from the GTEx Portal (accessed on May 31, 2024).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Only existing public datasets were used.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
All data produced in the present work are contained in the manuscript.