ChatGPT vs. web search for patient questions: what does ChatGPT do better?

12 March, 2024

This is a rather obscure/specialised journal but the question (and findings) are interesting.

CITATION: ChatGPT vs. web search for patient questions: what does ChatGPT do better?

Sarek A Shen et al

Eur Arch Otorhinolaryngol 2024 Feb 28. doi: 10.1007/s00405-024-08524-0.

PMID: 38416195 DOI: 10.1007/s00405-024-08524-0


Purpose: Chat generative pretrained transformer (ChatGPT) has the potential to significantly impact how patients acquire medical information online. Here, we characterize the readability and appropriateness of ChatGPT responses to a range of patient questions compared to results from traditional web searches.

Methods: Patient questions related to the published Clinical Practice Guidelines by the American Academy of Otolaryngology-Head and Neck Surgery were sourced from existing online posts. Questions were categorized using a modified Rothwell classification system into (1) fact, (2) policy, and (3) diagnosis and recommendations. These were queried using ChatGPT and traditional web search. All results were evaluated on readability (Flesch Reading Ease and Flesch-Kinkaid Grade Level) and understandability (Patient Education Materials Assessment Tool). Accuracy was assessed by two blinded clinical evaluators using a three-point ordinal scale.

Results: 54 questions were organized into fact (37.0%), policy (37.0%), and diagnosis (25.8%). The average readability for ChatGPT responses was lower than traditional web search (FRE: 42.3 13.1 vs. 55.6 10.5, p < 0.001), while the PEMAT understandability was equivalent (93.8% vs. 93.5%, p = 0.17). ChatGPT scored higher than web search for questions the 'Diagnosis' category (p < 0.01); there was no difference in questions categorized as 'Fact' (p = 0.15) or 'Policy' (p = 0.22). Additional prompting improved ChatGPT response readability (FRE 55.6 13.6, p < 0.01).

Conclusions: ChatGPT outperforms web search in answering patient questions related to symptom-based diagnoses and is equivalent in providing medical facts and established policy. Appropriate prompting can further improve readability while maintaining accuracy. Further patient education is needed to relay the benefits and limitations of this technology as a source of medial information.

HIFA profile: Neil Pakenham-Walsh is coordinator of HIFA (Healthcare Information For All), a global health community that brings all stakeholders together around the shared goal of universal access to reliable healthcare information. HIFA has 20,000 members in 180 countries, interacting in four languages and representing all parts of the global evidence ecosystem. HIFA is administered by Global Healthcare Information Network, a UK-based nonprofit in official relations with the World Health Organization. Email: