To find out more about the podcast, go to "The fake disease that fooled AI."
Below is a short summary and detailed review of this podcast written by FutureFactual:
AI Safety in Health: How a Fake Disease Exposed Risks of Large Language Models
In this Scientific American episode, host Rachel Feltman chats with Almira Osmanovic Tunstrom about how AI chatbots influence medical information. The discussion centers on a deliberately fabricated disease to reveal how large language models extract, interpret, and propagate health data from web sources, including preprints and blogs. The conversation highlights the importance of ethics, human oversight, and careful vetting when AI tools touch health information.
- A fake disease seeded into online data sources shows how AI can normalize nonexistent conditions when credible and non-credible sources are mixed uncritically.
- Large language models rely on publicly crawled data, where blogs and preprints can be treated as credible medical sources by the model.
- The episode emphasizes the need for human-in-the-loop review, physician input, and ethical safeguards in AI-driven health information.
- The episode discusses misinformation risks and the necessity of source verification for AI used by patients and clinicians.
Overview
The podcast presents a discussion with Almira Osmanovic Tunstrom, a researcher at the University of Gothenburg and associated centers, about a provocative AI ethics project. The aim is to illuminate how commercial large language models are trained and how they cope with health information. Host Rachel Feltman frames the issue around AI chatbots that people consult for medical advice and the potential dangers when these systems misinterpret or misrepresent data. The interview reveals how a deliberately constructed case can reveal gaps in how data flows from internet sources into AI training pipelines.
The Pseudo Experiment and Its Rationale
Osmanovic Tunstrom explains that she built a coherent narrative so that she could trace data through the entire AI development pipeline. She started from the observation that most models are trained on data aggregated from public sources, including Common Crawl. To create a credible but fake scientific persona, she invented a nonexistent university in a fictitious city and a researcher connected to health and medicine. After sprinkling invented terms and references across blogs, social media, and preprint servers, she observed how these signals could be picked up by data crawlers and, eventually, by language models. The goal was to leave breadcrumbs that would allow the AI to construct a credible but entirely fictitious condition, which she dubbed bixonomania, a term that appears under similar spellings in the literature.
How Data Shapes AI Health Advice
The conversation details how the AI began by listing possible causes for eye irritation and screen fatigue, then gradually brought in blue light exposure and related conditions, and finally arrived at the invented disease as an erroneous suggestion. The project underscored that preprints, blogs, and other open sources were not reliably filtered by some training data pipelines, creating a risk that AI could treat a made-up condition as real when later researchers cited it. Osmanovic Tunstrom notes that the fake paper, with its fabricated acknowledgments and funding, could be mistaken for a legitimate source by an AI, particularly when the model is asked questions about symptoms and differential diagnoses.
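To make the filtering gap concrete, the following is a minimal sketch of a provenance check that a data pipeline might run before admitting web documents into a medical training set. It is not taken from the podcast or from any real pipeline: the tier names, the host lists, the `Document` type, and both functions are hypothetical, and a real system would need a curated, maintained registry of sources rather than a hard-coded set.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical provenance tiers (illustrative labels only).
PEER_REVIEWED = "peer_reviewed"
PREPRINT = "preprint"
UNVETTED = "unvetted"

# Illustrative host lists; assumptions for this sketch, not a real registry.
PREPRINT_HOSTS = {"medrxiv.org", "biorxiv.org", "arxiv.org"}
PEER_REVIEWED_HOSTS = {"journal.example.org"}  # placeholder host


@dataclass
class Document:
    url: str
    text: str


def _host(url: str) -> str:
    """Return the lowercased hostname without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def provenance_tier(doc: Document) -> str:
    """Classify a document by where it was published, not by how it reads."""
    host = _host(doc.url)
    if host in PEER_REVIEWED_HOSTS:
        return PEER_REVIEWED
    if host in PREPRINT_HOSTS:
        return PREPRINT
    return UNVETTED  # blogs, social media, unknown sites


def filter_for_medical_training(docs, allowed=(PEER_REVIEWED,)):
    """Keep only documents whose provenance tier is explicitly allowed."""
    return [d for d in docs if provenance_tier(d) in allowed]


docs = [
    Document("https://www.medrxiv.org/content/some-preprint", "fabricated study"),
    Document("https://blog.example.com/new-eye-condition", "blog post"),
    Document("https://journal.example.org/article/1", "reviewed article"),
]
kept = filter_for_medical_training(docs)
print([d.url for d in kept])
```

Under these assumptions, a preprint that merely looks like a scientific paper is held back because of where it was published, which is the kind of check the episode suggests some pipelines did not apply reliably. A host-based check cannot judge whether a peer-reviewed source is itself correct, so it complements human review rather than replacing it.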
Ethical and Practical Lessons
Osmanovic Tunstrom emphasizes that the experiment demonstrates the necessity of keeping humans in the loop when health information is produced or disseminated by AI systems. Medical professionals, patients, and researchers should scrutinize AI outputs and verify sources rather than rely on the appearance of credibility in machine-generated responses. The discussion also touches on the risk of AI being exploited for malicious purposes within academia and beyond, highlighting the need for responsible data practices, transparency, and ongoing ethics dialogue in digital health contexts.
Implications for Health Information and Policy
Key takeaways include the importance of robust source verification, improved data curation, and mechanisms to detect and mitigate misinformation in AI health tools. The episode argues for more human oversight, ethical guidelines for data distribution, and collaboration with clinicians to ensure that AI assists rather than undermines medical decision-making. The conversation closes with reflections on how the field might reduce harm while continuing to harness AI to improve health outcomes.



