Ring Attention with Blockwise Transformers for Near-Infinite Context-FlyAI

Author: Hao Liu

Source: paperswithcode, 2023-10-16 10:30:02

Transformers have emerged as the architecture of choice for many state-of-the-art AI models, showcasing exceptional performance across a wide range of AI applications. However, the memory demands imposed by Transformers limit their ability to handle long sequences, thereby creating challenges for tasks involving extended sequences or long-term dependencies. We present a distinct approach, Ring Attention, which leverages blockwise computation of self-attention to distribute long sequences across multiple devices while overlapping the communication of key-value blocks with the computation of blockwise attention. Ring Attention enables training and inference of sequences that are up to device-count times longer than those of prior memory-efficient Transformers, effectively eliminating the memory constraints imposed by individual devices. Extensive experiments on language modeling tasks demonstrate the effectiveness of Ring Attention in allowing large sequence input size and improving performance.
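The blockwise idea in the abstract can be illustrated with a small single-host simulation. The sketch below is not the authors' implementation: the names `full_attention` and `ring_attention_sim` are hypothetical, the attention is single-head and non-causal, "devices" are plain list slots, and the key-value rotation runs sequentially rather than overlapping with computation as the paper describes. It only shows that accumulating softmax statistics block by block, while key-value blocks travel around a ring, reproduces ordinary attention.

```python
import math
import random


def full_attention(q, k, v):
    """Reference: ordinary softmax attention over the whole sequence."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for qi in q:
        scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        m = max(scores)
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        out.append([sum(wj * vj[d] for wj, vj in zip(w, v)) / z
                    for d in range(len(v[0]))])
    return out


def ring_attention_sim(q, k, v, n_devices):
    """Simulate the ring schedule on one host (assumption: no real devices)."""
    n, dim = len(q), len(v[0])
    assert n % n_devices == 0, "sequence length must divide evenly"
    blk = n // n_devices
    scale = 1.0 / math.sqrt(len(q[0]))

    def split(x):
        return [x[i * blk:(i + 1) * blk] for i in range(n_devices)]

    q_blocks = split(q)                        # query block stays on its device
    kv_blocks = list(zip(split(k), split(v)))  # kv_blocks[d]: block now on device d

    # Per-query running max, softmax denominator, and weighted value sum.
    run_max = [[-math.inf] * blk for _ in range(n_devices)]
    run_den = [[0.0] * blk for _ in range(n_devices)]
    run_num = [[[0.0] * dim for _ in range(blk)] for _ in range(n_devices)]

    for _ in range(n_devices):                 # one ring step per device
        for dev in range(n_devices):
            k_blk, v_blk = kv_blocks[dev]
            for i, qi in enumerate(q_blocks[dev]):
                scores = [scale * sum(a * b for a, b in zip(qi, kj))
                          for kj in k_blk]
                new_max = max(run_max[dev][i], max(scores))
                # Rescale old statistics to the new max (exp(-inf) == 0.0).
                corr = math.exp(run_max[dev][i] - new_max)
                w = [math.exp(s - new_max) for s in scores]
                run_den[dev][i] = run_den[dev][i] * corr + sum(w)
                run_num[dev][i] = [
                    run_num[dev][i][d] * corr
                    + sum(wj * vj[d] for wj, vj in zip(w, v_blk))
                    for d in range(dim)
                ]
                run_max[dev][i] = new_max
        # Rotate: device d hands its key-value block to device d + 1.
        kv_blocks = kv_blocks[-1:] + kv_blocks[:-1]

    return [[x / run_den[dev][i] for x in run_num[dev][i]]
            for dev in range(n_devices) for i in range(blk)]


if __name__ == "__main__":
    random.seed(0)

    def rand(n, d):
        return [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]

    q, k, v = rand(8, 4), rand(8, 4), rand(8, 4)
    ref = full_attention(q, k, v)
    out = ring_attention_sim(q, k, v, n_devices=4)
    # Largest elementwise gap between the two results.
    print(max(abs(a - b) for r, o in zip(ref, out) for a, b in zip(r, o)))
```

In this simulation each device only ever holds one query block and one key-value block at a time, which is why the sequence length a fixed per-device memory can handle grows with the number of devices.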

Paper code

https://github.com/lhao499/llm_large_context

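This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. When reposting, please include a link to the original source and this notice.
Link to this article: https://flyai.com/paper_detail/76771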

