Abstract
This paper reviews Retrieval-Augmented Generation (RAG) in medical artificial intelligence, covering its core value, technical evolution, core architecture design, and application practice across multiple scenarios. It also constructs a multi-dimensional evaluation framework and examines current challenges and directions for future optimization. Drawing on application cases in clinical decision support, medical research and education, and patient health management, the analysis shows that RAG is markedly effective in mitigating three key problems of large language models (LLMs): outdated knowledge, hallucination, and poor explainability. By dynamically retrieving from external authoritative knowledge bases, RAG substantially improves the factual accuracy, traceability, and clinical reliability of generated medical content. The review finds that RAG is a core path toward the trustworthy deployment of medical AI, but that further breakthroughs are still needed in multimodal information fusion, medical data privacy protection, dynamic knowledge base updating, and computational cost optimization. This paper offers theoretical support and a practical reference for the safe application of RAG in high-risk medical scenarios, and supports the high-quality development of the intelligent healthcare ecosystem.
Keywords: Large language models; Retrieval-augmented generation; Medical AI; Clinical decision-making; Evaluation