Benchmark Explorer

Interactive benchmark results measured on an NVIDIA T4 GPU (Kaggle). Entries marked [P] are projected from single-device measurements, assuming 95% linear scaling efficiency.

Attention Latency (ms) vs Sequence Length

Latency Data
seq_len  Dense (ms)  HADS (ms)  Reduction  BigBird (ms)  Sliding (ms)
512      12.4        8.1        34.7%      5.6           3.8
1024     47.3        28.9       38.9%      19.4          11.2
2048     181.2       105.6      41.7%      84.2          64.3
4096     713.8       412.4      42.2%      298.3         219.7
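
The Reduction column is not defined on the page. A minimal sketch, assuming it is the relative latency saving of HADS over Dense, (Dense - HADS) / Dense, reproduces the listed percentages from the two latency columns. The names latency_ms and reduction_pct are hypothetical and used only for illustration.

```python
# Latency values (ms) copied from the table above, keyed by sequence length.
latency_ms = {
    512: {"dense": 12.4, "hads": 8.1},
    1024: {"dense": 47.3, "hads": 28.9},
    2048: {"dense": 181.2, "hads": 105.6},
    4096: {"dense": 713.8, "hads": 412.4},
}


def reduction_pct(dense_ms, hads_ms):
    """Relative latency saving of HADS over Dense, in percent (assumed definition)."""
    return round(100.0 * (dense_ms - hads_ms) / dense_ms, 1)


# Recompute the Reduction column for every sequence length.
reductions = {
    seq_len: reduction_pct(row["dense"], row["hads"])
    for seq_len, row in latency_ms.items()
}

for seq_len, pct in reductions.items():
    print(f"{seq_len:>5}  {pct:.1f}%")
```

Under this assumed definition the recomputed values agree with the Reduction column to one decimal place for all four sequence lengths.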