TriBench.

The standard for Triton kernel benchmarking: fast, reproducible, and extensible.

🚀

Zero Overhead

Compilation and warmup run in their own phases before measurement begins, so JIT compile time never leaks into the reported timings.
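To illustrate the idea of keeping warmup outside the timed region, here is a minimal pure-Python sketch. It is not TriBench's API: `measure` and `FakeJitKernel` are hypothetical names, and the "compile" step is simulated with a sleep on the first call.

```python
import statistics
import time


def measure(fn, warmup=3, repeats=10):
    """Hypothetical helper: run untimed warmup calls, then collect timed samples."""
    # Warmup phase: these calls absorb one-time costs such as JIT compilation.
    for _ in range(warmup):
        fn()
    # Measurement phase: only these calls contribute samples.
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return samples


class FakeJitKernel:
    """Stand-in for a JIT-compiled kernel: the first call pays a compile cost."""

    def __init__(self):
        self.compiled = False
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if not self.compiled:
            time.sleep(0.05)  # simulated one-time compilation
            self.compiled = True
        return sum(range(1000))  # the steady-state work being measured


kernel = FakeJitKernel()
samples = measure(kernel, warmup=2, repeats=5)
median_seconds = statistics.median(samples)
```

Because the simulated compile happens during the first warmup call, none of the five samples includes it. On a GPU, kernel launches are asynchronous, so a real harness would also need to synchronize the device (for example with `torch.cuda.synchronize()`) before reading the clock.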

🔬

Reproducible

Each run records its environment and fixes its random seeds, so results stay comparable from one run to the next.
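A rough sketch of what seed control and environment capture can look like, using only the Python standard library. The function names and the record layout are assumptions made for illustration, not TriBench's actual format.

```python
import json
import platform
import random


def capture_environment():
    """Hypothetical helper: collect facts about the host that ran the benchmark."""
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "os": platform.system(),
        "machine": platform.machine(),
    }


def make_inputs(seed, n=4):
    """Generate benchmark inputs from a private, explicitly seeded generator."""
    rng = random.Random(seed)  # leaves the global random state untouched
    return [rng.random() for _ in range(n)]


# Store the seed next to the environment so a later run can reproduce the inputs.
record = {"seed": 1234, "environment": capture_environment()}
inputs_a = make_inputs(record["seed"])
inputs_b = make_inputs(record["seed"])  # identical to inputs_a
report = json.dumps(record, sort_keys=True)
```

For GPU work the same pattern would presumably extend to seeding the tensor library (for example `torch.manual_seed`) and recording the device name, driver, and Triton version alongside the fields above.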

🧩

Composable

Easily benchmark and compare multiple kernel variants alongside your main implementation.
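As a sketch of comparing variants against a main implementation, the example below times three plain-Python functions that compute the same result and reports each one's speedup over the baseline. The names (`best_time`, `compare`, the variant functions) are hypothetical and stand in for real kernels; this is not TriBench's interface.

```python
import time


def sum_squares_loop(n):
    total = 0
    for i in range(n):
        total += i * i
    return total


def sum_squares_builtin(n):
    return sum(i * i for i in range(n))


def sum_squares_closed_form(n):
    # Sum of i^2 for i in 0..n-1 equals (n-1) * n * (2n-1) / 6.
    return (n - 1) * n * (2 * n - 1) // 6


def best_time(fn, arg, warmup=2, repeats=5, inner=100):
    """Best-of-N wall time for `inner` back-to-back calls, after untimed warmup."""
    for _ in range(warmup):
        fn(arg)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for _ in range(inner):
            fn(arg)
        best = min(best, time.perf_counter() - start)
    return best


def compare(baseline, variants, arg):
    """Check every variant against the baseline's output, then time each one."""
    expected = variants[baseline](arg)
    timings = {}
    for name, fn in variants.items():
        if fn(arg) != expected:
            raise ValueError(f"variant {name!r} disagrees with {baseline!r}")
        timings[name] = best_time(fn, arg)
    base = timings[baseline]
    return {
        name: {"seconds": t, "speedup": base / t if t > 0 else float("inf")}
        for name, t in timings.items()
    }


variants = {
    "loop": sum_squares_loop,
    "builtin": sum_squares_builtin,
    "closed_form": sum_squares_closed_form,
}
report = compare("loop", variants, 2000)
```

Checking outputs before timing keeps a fast but incorrect variant from showing up as a win in the comparison.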