Zhangzhe's Blog
The projection of my life.
Category: Transformer
2024
07-30 KV Cache Transformer
07-18 FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
07-17 FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
07-12 GQA: Grouped-Query Attention
07-11 MQA: Multi-Query Attention
2023
07-29 MOTR: End-to-End Multiple-Object Tracking with Transformer
07-29 Deformable DETR: Deformable Transformers for End-to-End Object Detection
2022
09-29 Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
01-20 An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
2021
12-06 Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks