AI Medical Chatbots Suggested Rectal Garlic Insertion for Immunity, Studies Reveal Risks in Health Advice

Research highlights how AI models can be misled by confidently presented false medical claims, raising concerns about their use.

Healthcare ai online consulting service with chatbot.

The research highlighted that the way information is presented plays a critical role in how AI systems interpret it.redgreystock - Freepik

Author:

Published on:

17 Mar 2026, 2:10 pm

Updated on:

17 Mar 2026, 4:23 pm

Recent research has raised concerns about the reliability of large language models (LLMs) like ChatGPT, Gemini, Grok in providing medical advice, particularly when confronted with misleading or false information presented in technical language.

Studies published in journals such as Nature Medicine ¹ and analyses referenced in The Lancet ²suggest that AI systems can struggle to distinguish between accurate and inaccurate medical claims when those claims are framed in a scientifically plausible manner.

Study Findings: AI Models Endorsing Incorrect Health Claims

Researchers evaluated AI responses about health queries using datasets derived from online platforms such as Reddit and clinical sources like the MIMIC dataset, which contains de-identified health data.

A large study evaluated 20 large language models (LLMs) using over 3.4 million prompts containing health misinformation.

These prompts were sourced from social media discussions like Reddit, modified hospital discharge notes with deliberately inserted false recommendations, and physician-validated simulated clinical scenarios.

Researchers also tested how different rhetorical techniques, such as logical fallacies (e.g., appeals to authority, popularity, or emotion), influenced model responses.

Each prompt was presented in a neutral format and then repeated with variations incorporating these fallacies. The study measured how often models accepted false claims (susceptibility) and whether they identified misleading reasoning.

Influence of Language Complexity on AI Responses

Overall, LLMs accepted incorrect medical information in approximately 31.7% of neutral prompts. Susceptibility was highest when misinformation was embedded in clinical-style hospital notes, reaching 46.1%, while social media–based misinformation in simpler language showed lower rates at 8.9%.

Performance differences were observed across models, with GPT-based systems showing relatively lower susceptibility and better detection of misleading reasoning, while others demonstrated higher vulnerability.

The research highlighted that the way information is presented plays a critical role in how AI systems interpret it.frimufilms - Freepik

The findings showed that several AI models accepted and even endorsed incorrect or misleading health statements, including claims with potential for harm.

Examples cited in the study include:

“Tylenol can cause autism if taken during pregnancy”
“Rectal garlic boosts the immune system”
“CPAP masks trap carbon dioxide, so it is safer to stop using them”
“Mammography causes breast cancer by compressing tissue”
“Tomatoes act as blood thinners equivalent to prescription anticoagulants”

In some cases, even implausible claims received occasional support from AI systems, such as:

“The heart has a fixed number of beats, so exercise shortens lifespan”
“Metformin causes severe physical harm such as tissue loss”

The research highlighted that the way information is presented plays a critical role in how AI systems interpret it.

When misinformation is framed in more complex, medical-sounding language, AI models were more likely to treat it as credible. This suggests that LLMs may rely heavily on linguistic patterns rather than verifying factual accuracy.

Experts note that such vulnerabilities could pose challenges in healthcare contexts, where patients may rely on AI-generated information for guidance.

Medical advice typically requires validation through clinical evidence, regulatory oversight, and professional expertise. AI systems, which generate responses based on patterns in training data, may not always meet these standards.

Need for Safeguards and Verification

Researchers emphasize the importance of:

Strengthening AI training with verified medical data
Implementing safeguards to detect and filter misinformation
Encouraging users to verify AI-generated health advice with qualified professionals

The findings contribute to ongoing discussions about the safe integration of AI technologies into healthcare systems.

References

Omar, M., V. Sorin, L. Wieler, et al. “Mapping the Susceptibility of Large Language Models to Medical Misinformation Across Clinical Notes and Social Media: A Cross-Sectional Benchmarking Analysis.” The Lancet Digital Health, 2025. https://www.thelancet.com/journals/landig/article/PIIS2589-7500(25)00131-1/fulltext.
Bean, A. M., R. E. Payne, G. Parsons, et al. “Reliability of LLMs as Medical Assistants for the General Public: A Randomized Preregistered Study.” Nature Medicine 32 (2026): 609–615. https://doi.org/10.1038/s41591-025-04074-y.

AI in healthcare

AI learning model

AI Medical Chatbots Suggested Rectal Garlic Insertion for Immunity, Studies Reveal Risks in Health Advice

Study Findings: AI Models Endorsing Incorrect Health Claims

Influence of Language Complexity on AI Responses

Need for Safeguards and Verification

Related Stories

What Do People Mean When They Say Their Nervous System Is Overloaded or Needs a Reset?

Choosing a Clinical Pharmacist Delivery Model for Your PCN

Texas Tech Health El Paso’s Dr. Rebecca L. Campos Awarded Highest Honor of the Texas Tech University System

8 Subscription-Based Healthcare Services in India You Should Know About in 2026