Large Language Models’ Responses to Spinal Cord Injury: A Comparative Study of Performance

Spinal Cord Injury (SCI) 2016 Facts and Figures at a Glance (2016). J Spinal Cord Med 39 (4):493–494. https://doi.org/10.1080/10790268.2016.1210925

Priebe MM, Chiodo AE, Scelza WM, Kirshblum SC, Wuermser LA, Ho CH (2007) Spinal cord injury medicine. 6. Economic and societal issues in spinal cord injury. Arch Phys Med Rehabil 88 (3 Suppl 1):S84-88. https://doi.org/10.1016/j.apmr.2006.12.005

Courtine G, Sofroniew MV (2019) Spinal cord repair: advances in biology and technology. Nat Med 25 (6):898–908. https://doi.org/10.1038/s41591-019-0475-6

Li J, Luo W, Xiao C, Zhao J, Xiang C, Liu W, Gu R (2023) Recent advances in endogenous neural stem/progenitor cell manipulation for spinal cord injury repair. Theranostics 13 (12):3966–3987. https://doi.org/10.7150/thno.84133

Tian T, Zhang S, Yang M (2023) Recent progress and challenges in the treatment of spinal cord injury. Protein Cell 14 (9):635–652. https://doi.org/10.1093/procel/pwad003

Gonzalez-Hernandez G, Sarker A, O’Connor K, Savova G (2017) Capturing the Patient’s Perspective: a Review of Advances in Natural Language Processing of Health-Related Text. Yearb Med Inform 26 (1):214–227. https://doi.org/10.15265/iy-2017-029

Feghali J, Jimenez AE, Schilling AT, Azad TD (2022) Overview of Algorithms for Natural Language Processing and Time Series Analyses. Acta Neurochir Suppl 134:221–242. https://doi.org/10.1007/978-3-030-85292-4_26

Courant R, Edberg M, Dufour N, Kalogeiton V (2023) Transformers and Visual Transformers. In: Colliot O (ed) Machine Learning for Brain Disorders. Humana, New York, NY, pp 193–229. https://doi.org/10.1007/978-1-0716-3195-9_6

De Angelis L, Baglivo F, Arzilli G, Privitera GP, Ferragina P, Tozzi AE, Rizzo C (2023) ChatGPT and the rise of large language models: the new AI-driven infodemic threat in public health. Front Public Health 11:1166120. https://doi.org/10.3389/fpubh.2023.1166120

Lee P, Bubeck S, Petro J (2023) Benefits, Limits, and Risks of GPT-4 as an AI Chatbot for Medicine. N Engl J Med 388 (13):1233–1239. https://doi.org/10.1056/NEJMsr2214184

Thirunavukarasu AJ, Ting DSJ, Elangovan K, Gutierrez L, Tan TF, Ting DSW (2023) Large language models in medicine. Nat Med 29 (8):1930–1940. https://doi.org/10.1038/s41591-023-02448-8

Hung YC, Chaker SC, Sigel M, Saad M, Slater ED (2023) Comparison of Patient Education Materials Generated by Chat Generative Pre-Trained Transformer Versus Experts: An Innovative Way to Increase Readability of Patient Education Materials. Ann Plast Surg 91 (4):409–412. https://doi.org/10.1097/sap.0000000000003634

Özcan F, Örücü Atar M, Köroğlu Ö, Yılmaz B (2024) Assessment of the reliability and usability of ChatGPT in response to spinal cord injury questions. J Spinal Cord Med:1–6. https://doi.org/10.1080/10790268.2024.2361551

Temel MH, Erden Y, Bağcıer F (2024) Information Quality and Readability: ChatGPT’s Responses to the Most Common Questions About Spinal Cord Injury. World Neurosurg 181:e1138-e1144. https://doi.org/10.1016/j.wneu.2023.11.062

van Dis EAM, Bollen J, Zuidema W, van Rooij R, Bockting CL (2023) ChatGPT: five priorities for research. Nature 614 (7947):224–226. https://doi.org/10.1038/d41586-023-00288-7

Azamfirei R, Kudchadkar SR, Fackler J (2023) Large language models and the perils of their hallucinations. Crit Care 27 (1):120. https://doi.org/10.1186/s13054-023-04393-x

Stokel-Walker C, Van Noorden R (2023) What ChatGPT and generative AI mean for science. Nature 614 (7947):214–216. https://doi.org/10.1038/d41586-023-00340-6

Wang L, Wan Z, Ni C, Song Q, Li Y, Clayton E, Malin B, Yin Z (2024) Applications and Concerns of ChatGPT and Other Conversational Large Language Models in Health Care: Systematic Review. J Med Internet Res 26:e22769. https://doi.org/10.2196/22769

Wang L, Wan Z, Ni C, Song Q, Li Y, Clayton EW, Malin BA, Yin Z (2024) A Systematic Review of ChatGPT and Other Conversational Large Language Models in Healthcare. medRxiv. https://doi.org/10.1101/2024.04.26.24306390

Abbasian M, Khatibi E, Azimi I, Oniani D, Shakeri Hossein Abad Z, Thieme A, Sriram R, Yang Z, Wang Y, Lin B, Gevaert O, Li LJ, Jain R, Rahmani AM (2024) Foundation metrics for evaluating effectiveness of healthcare conversations powered by generative AI. NPJ Digit Med 7 (1):82. https://doi.org/10.1038/s41746-024-01074-z

Zhui L, Yhap N, Liping L, Zhengjie W, Zhonghao X, Xiaoshu Y, Hong C, Xuexiu L, Wei R (2024) Impact of Large Language Models on Medical Education and Teaching Adaptations. JMIR Med Inform 12:e55933. https://doi.org/10.2196/55933

Bedi S, Liu Y, Orr-Ewing L, Dash D, Koyejo S, Callahan A, Fries JA, Wornow M, Swaminathan A, Lehmann LS, Hong HJ, Kashyap M, Chaurasia AR, Shah NR, Singh K, Tazbaz T, Milstein A, Pfeffer MA, Shah NH (2024) Testing and Evaluation of Health Care Applications of Large Language Models: A Systematic Review. JAMA. https://doi.org/10.1001/jama.2024.21700

Ghanbari Haez S, Segala M, Bellan P, Magnolini S, Sanna L, Consolandi M, Dragoni M (2024) A Retrieval-Augmented Generation Strategy to Enhance Medical Chatbot Reliability. In: Artificial Intelligence in Medicine. Springer Nature Switzerland, Cham, pp 213–223

Jahan I, Laskar MTR, Peng C, Huang JX (2024) A comprehensive evaluation of large Language models on benchmark biomedical text processing tasks. Comput Biol Med 171:108189. https://doi.org/10.1016/j.compbiomed.2024.108189

Lim ZW, Pushpanathan K, Yew SME, Lai Y, Sun CH, Lam JSH, Chen DZ, Goh JHL, Tan MCJ, Sheng B, Cheng CY, Koh VTC, Tham YC (2023) Benchmarking large language models’ performances for myopia care: a comparative analysis of ChatGPT-3.5, ChatGPT-4.0, and Google Bard. EBioMedicine 95:104770. https://doi.org/10.1016/j.ebiom.2023.104770

Moult B, Franck LS, Brady H (2004) Ensuring quality information for patients: development and preliminary validation of a new instrument to improve the quality of written health care information. Health Expect 7 (2):165–175. https://doi.org/10.1111/j.1369-7625.2004.00273.x

Zhao FF, He HJ, Liang JJ, Cen J, Wang Y, Lin H, Chen F, Li TP, Yang JF, Chen L, Cen LP (2024) Benchmarking the performance of large language models in uveitis: a comparative analysis of ChatGPT-3.5, ChatGPT-4.0, Google Gemini, and Anthropic Claude3. Eye (Lond). https://doi.org/10.1038/s41433-024-03545-9

Zhang B, Du Y, Duan W, Chen Z (2024) Benchmarking Large Language Models for Cervical Spondylosis. JMIR Form Res 8:e55577. https://doi.org/10.2196/55577

Hancı V, Ergün B, Gül Ş, Uzun Ö, Erdemir İ, Hancı FB (2024) Assessment of readability, reliability, and quality of ChatGPT®, BARD®, Gemini®, Copilot®, Perplexity® responses on palliative care. Medicine (Baltimore) 103 (33):e39305. https://doi.org/10.1097/md.0000000000039305

Wilhelm TI, Roos J, Kaczmarczyk R (2023) Large Language Models for Therapy Recommendations Across 3 Clinical Specialties: Comparative Study. J Med Internet Res 25:e49324. https://doi.org/10.2196/49324

Lee Y, Tessier L, Brar K, Malone S, Jin D, McKechnie T, Jung JJ, Kroh M, Dang JT (2024) Performance of artificial intelligence in bariatric surgery: comparative analysis of ChatGPT-4, Bing, and Bard in the American Society for Metabolic and Bariatric Surgery textbook of bariatric surgery questions. Surg Obes Relat Dis 20 (7):609–613. https://doi.org/10.1016/j.soard.2024.04.014

Huang C, Chen L, Huang H, Cai Q, Lin R, Wu X, Zhuang Y, Jiang Z (2023) Evaluate the accuracy of ChatGPT’s responses to diabetes questions and misconceptions. J Transl Med 21 (1):502. https://doi.org/10.1186/s12967-023-04354-6
