Background Chatbots powered by large language models (LLMs) have recently emerged as prominent sources of information. However, their potential to propagate misinformation, particularly in specialized fields like audiology and otolaryngology, remains underexplored. This study aimed to evaluate the accuracy of six popular chatbots – ChatGPT, Gemini, Claude, DeepSeek, Grok, and Mistral – in response to questions framed around a range of unproven methods in audiological and otolaryngological care.
Methods A set of 50 questions was developed based on common conversations between patients and clinicians. Each question was posed to each chatbot 10 times to account for variable responses, producing a total of 3,000 responses (50 questions × 6 chatbots × 10 runs). The responses were compared with correct answers based on the general opinion of 11 professionals. The consistency of the responses was evaluated using Cohen’s kappa.
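As an illustration of the two metrics named above, the following minimal sketch computes accuracy against reference answers and Cohen's kappa between two sets of labels. The abstract does not specify how responses were coded or which pairs of response sets kappa was applied to, so the yes/no labels, the comparison of two repeated runs, the toy data, and the function names `accuracy` and `cohen_kappa` are all assumptions made for illustration, not the authors' procedure.

```python
from collections import Counter


def accuracy(responses, reference):
    """Fraction of responses that match the reference answers."""
    matches = sum(r == ref for r, ref in zip(responses, reference))
    return matches / len(reference)


def cohen_kappa(a, b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    n = len(a)
    # Observed agreement: share of positions where the two labels match.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Expected chance agreement from each sequence's label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    labels = set(a) | set(b)
    p_e = sum(count_a[label] * count_b[label] for label in labels) / (n * n)
    if p_e == 1.0:
        # Both sequences contain one identical label; treat as full agreement.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy data (not from the study): "yes" means the response
# endorses the unproven method, "no" means it does not.
reference = ["no", "no", "yes", "no", "yes", "no"]
run_1 = ["no", "no", "yes", "no", "yes", "yes"]
run_2 = ["no", "no", "yes", "no", "no", "yes"]

# Response count implied by the design: questions x chatbots x repetitions.
total_responses = 50 * 6 * 10

print(total_responses)
print(accuracy(run_1, reference))
print(cohen_kappa(run_1, run_2))
```

On this toy data, `run_1` matches the reference in 5 of 6 cases, and the two runs agree in 5 of 6 positions with an expected chance agreement of 0.5, which gives a kappa of 2/3. Kappa is used instead of raw agreement because it discounts the agreement two sets of labels would reach by chance alone.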
Results Most chatbot responses to the majority of questions were deemed accurate. Grok consistently performed best, with answers that aligned perfectly with the opinions of the experts. DeepSeek exhibited the lowest accuracy, scoring 95.8%. Mistral exhibited the lowest consistency, scoring 0.96.
Conclusions Although the evaluated chatbots generally avoided endorsing scientifically unsupported methods, some of their answers could mislead users and facilitate misinformation. The best performer among the group was Grok, which provided consistently accurate responses, suggesting that it has potential for use in clinical and educational settings.
Competing Interest Statement The authors have declared no competing interest.
Funding Statement This study did not receive any funding.
Author Declarations I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability All data produced are attached as a supplementary file.