AI Grand Rounds: Reclaiming Voice with AI

Published: 2024-12-03

PERSPECTIVES

F.N. Mirza and Others

Abstract

Voice impairments affect millions of Americans, with personalized text-to-speech technology offering limited solutions due to the need for extensive voice banking. Here, we report the case of Alexis Bogan, a 20-year-old patient who acutely lost her voice after surgery to resect her brain stem hemangioblastoma. In a world-first application, OpenAI’s Voice Engine was used to clone Ms. Bogan’s voice from just 15 seconds of preexisting audio, sourced from a school project she had filmed a few years prior. This enabled her to use a personalized text-to-speech app for daily communication while rehabilitating her speech. This case was highlighted on a recent episode of the NEJM AI Grand Rounds podcast,1 framing a broader discussion on voice cloning technology. While AI is often dual use and concerns about voice cloning often center on potential misuse, such as “deepfakes” and misinformation, we argue that suppressing this technology may inflict tangible harm on patients by denying them the chance to reclaim their voice. Inspired by Ms. Bogan’s journey, we urge researchers, clinicians, ethicists, policymakers, and tech companies to collaborate swiftly yet responsibly in advancing AI voice cloning in health care. By doing so, we can empower patients to recover not just their voice, but also a fundamental aspect of their identity and quality of life.

DOI: 10.1056/AIp2401000

Full text: https://ai.nejm.org/doi/abs/10.1056/AIp2401000

NEJM AI, Volume 1, No. 12, December 2024