Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP
Source: arXiv
2022-12-30 02:01:32
Retrieval-augmented in-context learning has emerged as a powerful approach for addressing knowledge-intensive tasks using frozen language models (LM) and retrieval models (RM). Existing work has combined these in simple "retrieve-then-read" pipelines in which the RM retrieves passages that are inserted into the LM prompt. To begin to fully realize the potential of frozen LMs and RMs, we propose Demonstrate-Search-Predict (DSP), a framework that relies on passing natural language texts in sophisticated pipelines between an LM and an RM. DSP can express high-level programs that bootstrap pipeline-aware demonstrations, search for relevant passages, and generate grounded predictions, systematically breaking down problems into small transformations that the LM and RM can handle more reliably. We have written novel DSP programs for answering questions in open-domain, multi-hop, and conversational settings, establishing in early evaluations new state-of-the-art in-context learning results and delivering 37-200%, 8-40%, and 80-290% relative gains against vanilla LMs, a standard retrieve-then-read pipeline, and a contemporaneous self-ask pipeline, respectively.
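To make the pipeline structure concrete, below is a minimal Python sketch of the three stages the abstract names: bootstrapping demonstrations, multi-hop search, and grounded prediction. The `lm`, `rm`, and `format_prompt` helpers and the prompt formats are illustrative assumptions for this sketch, not the DSP framework's actual API.

```python
# Illustrative sketch of a Demonstrate-Search-Predict control flow.
# `lm` (text -> text) and `rm` (query, k -> passages) are hypothetical
# stand-ins for a frozen language model and retrieval model.

from typing import Callable, List

def demonstrate(train_examples: List[dict], lm: Callable[[str], str],
                rm: Callable[[str, int], List[str]]) -> List[dict]:
    """Bootstrap pipeline-aware demonstrations: annotate a few training
    questions with the intermediate queries and passages the pipeline
    itself would produce."""
    demos = []
    for ex in train_examples:
        query = lm(f"Write a search query for: {ex['question']}")
        passages = rm(query, 2)
        demos.append({**ex, "query": query, "passages": passages})
    return demos

def search(question: str, demos: List[dict], lm: Callable[[str], str],
           rm: Callable[[str, int], List[str]], hops: int = 2) -> List[str]:
    """Multi-hop search: alternate LM-generated queries with RM
    retrieval, accumulating passages across hops."""
    context: List[str] = []
    for _ in range(hops):
        prompt = format_prompt(demos, question, context,
                               instruction="Write the next search query.")
        context.extend(rm(lm(prompt), 2))
    return context

def predict(question: str, demos: List[dict], context: List[str],
            lm: Callable[[str], str]) -> str:
    """Generate an answer grounded in the demonstrations and the
    retrieved passages."""
    prompt = format_prompt(demos, question, context,
                           instruction="Answer the question.")
    return lm(prompt)

def format_prompt(demos, question, context, instruction):
    # Hypothetical prompt assembly: demonstrations, then retrieved
    # passages, then the current question and instruction.
    demo_text = "\n\n".join(
        f"Q: {d['question']}\nPassages: {d.get('passages')}\nA: {d.get('answer')}"
        for d in demos)
    return (f"{demo_text}\n\nPassages: {context}\n"
            f"Q: {question}\n{instruction}\n")
```

The point of the sketch is the division of labor the abstract describes: each stage is a small transformation that the LM and RM can handle more reliably than a single monolithic prompt.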
Article link: https://flyai.com/paper_detail/14163