Academic Papers

Cognitive Biases and Artificial Intelligence

Published: 2024-12-03

CASE STUDIES

Cognitive Biases and Artificial Intelligence

J. Wang and D.A. Redelmeier

Abstract

Generative artificial intelligence (AI) models are increasingly utilized for medical applications. We tested whether such models are prone to human-like cognitive biases when offering medical recommendations. We explored the performance of OpenAI generative pretrained transformer (GPT)-4 and Google Gemini-1.0-Pro with clinical cases that involved 10 cognitive biases and system prompts that created synthetic clinician respondents. Medical recommendations from generative AI were compared with strict axioms of rationality and prior results from clinicians. We found that significant discrepancies were apparent for most biases. For example, surgery was recommended more frequently for lung cancer when framed in survival rather than mortality statistics (framing effect: 75% vs. 12%; P<0.001). Similarly, pulmonary embolism was more likely to be listed in the differential diagnoses if the opening sentence mentioned hemoptysis rather than chronic obstructive pulmonary disease (primacy effect: 100% vs. 26%; P<0.001). In addition, the same emergency department treatment was more likely to be rated as inappropriate if the patient subsequently died rather than recovered (hindsight bias: 85% vs. 0%; P<0.001). One exception was base-rate neglect that showed no bias when interpreting a positive viral screening test (correction for false positives: 94% vs. 93%; P=0.431). The extent of these biases varied minimally with the characteristics of synthetic respondents, was generally larger than observed in prior research with practicing clinicians, and differed between generative AI models. We suggest that generative AI models display human-like cognitive biases and that the magnitude of bias can be larger than observed in practicing clinicians.
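The base-rate item in the abstract turns on correcting a positive screening result for false positives. A minimal sketch of that correction with Bayes' theorem follows. The function name and every number in it (prevalence, sensitivity, specificity) are hypothetical illustrations and are not taken from the paper or its clinical vignettes.

```python
def positive_predictive_value(prevalence: float,
                              sensitivity: float,
                              specificity: float) -> float:
    """Return P(disease | positive test) using Bayes' theorem.

    All three inputs are probabilities in the range [0, 1].
    """
    # Fraction of the screened population that is infected AND tests positive.
    true_positives = prevalence * sensitivity
    # Fraction that is NOT infected but still tests positive.
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    return true_positives / (true_positives + false_positives)


# Hypothetical screening scenario (illustrative values only, not from the paper):
# 1% prevalence, 99% sensitivity, 95% specificity.
ppv = positive_predictive_value(prevalence=0.01,
                                sensitivity=0.99,
                                specificity=0.95)
print(f"Probability of infection after a positive test: {ppv:.3f}")
```

Under these assumed values, most positive results are false positives even though the test looks accurate. Base-rate neglect means overlooking that effect of low prevalence.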

DOI: 10.1056/AIcs2400639

Full text: https://ai.nejm.org/doi/abs/10.1056/AIcs2400639


NEJM AI, Volume 1 No. 12 December 2024