With the rise of generative AI, there is a growing need for automatic metrics to evaluate the factual accuracy of clinical text, yet no such tool exists for echocardiography, and existing evaluation tools often underperform in this domain. To address this, we developed EchoGraph, a BERT-based model trained on densely annotated echocardiography reports using a schema tailored to this subspecialty and a dedicated F1-style reward that emphasizes clinically important components. EchoGraph demonstrated strong performance in predicting entities (micro F1 0.85) and relations (micro F1 0.70). Its F1 reward was more sensitive to corrupted report content than the RadGraph F1 reward (a 50%-60% vs. 6%-12% drop). EchoGraph thus offers an effective solution for evaluating and advancing language model-based applications in echocardiography, supporting the development of more accurate and clinically meaningful AI-generated reports.
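For readers unfamiliar with the reported metrics, the following is a minimal sketch of how a micro-averaged F1 over extracted entities, and the relative drop in score after corrupting a report, could be computed. It is not the EchoGraph implementation: the function names, the `(text, label)` entity representation, and the label strings are hypothetical, and the sketch is unweighted, whereas the abstract describes a reward that emphasizes clinically important components.

```python
from typing import Iterable, Set, Tuple

# Hypothetical entity representation: (text span, label). The actual
# EchoGraph schema is not described in this abstract.
Entity = Tuple[str, str]


def micro_f1(pairs: Iterable[Tuple[Set[Entity], Set[Entity]]]) -> float:
    """Micro-averaged F1: pool TP/FP/FN over all reports, then compute F1."""
    tp = fp = fn = 0
    for gold, pred in pairs:
        tp += len(gold & pred)   # entities found in both sets
        fp += len(pred - gold)   # predicted but not in the reference
        fn += len(gold - pred)   # in the reference but missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def relative_drop(original: float, corrupted: float) -> float:
    """Fractional decrease of a score after the report text is corrupted."""
    return (original - corrupted) / original if original else 0.0


# Toy data, invented for illustration only.
reports = [
    (
        {("left ventricle", "anatomy"), ("ejection fraction", "measurement"),
         ("normal", "observation")},
        {("left ventricle", "anatomy"), ("normal", "observation"),
         ("mild", "observation")},
    ),
    (
        {("mitral valve", "anatomy"), ("regurgitation", "observation")},
        {("mitral valve", "anatomy"), ("regurgitation", "observation")},
    ),
]

score = micro_f1(reports)
drop = relative_drop(0.8, 0.4)
```

Pooling counts across reports before computing precision and recall is what distinguishes micro-averaging from averaging per-report F1 values; a metric that is sensitive to corrupted content should show a large `relative_drop` when the report text is altered.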
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
Chieh-Ju Chao, MD, is supported by research funding from the CV Prospective award from the Mayo Clinic Department of Cardiovascular Medicine and the AI/ML Enablement award from the Center for Digital Health at the Mayo Clinic.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The institutional review board of the Mayo Clinic gave approval for this work (protocol #22-010944).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
We released our annotations on the external MIMIC-EchoNote dataset, as well as EchoGraph-annotated MIMIC-EchoNote data, to facilitate future research. The Mayo Clinic dataset is not publicly available due to patient privacy considerations; however, it can be obtained from the corresponding author upon reasonable request.
Abbreviations
AI: Artificial Intelligence
DL: Deep Learning
Echo: Echocardiography
FT: Fine-Tuning
GPT: Generative Pre-trained Transformer
NLP: Natural Language Processing
Seq2seq: Sequence-to-sequence
TTE: Transthoracic Echocardiography
TEE: Transesophageal Echocardiography
LLM: Large Language Model