DAVE: Distribution-aware Attribution via ViT Gradient Decomposition
Pre-print, Feb. 2026
arxiv / code / presentation / bibtex
DAVE is a distribution-aware attribution method for Vision Transformers that produces stable, high-resolution pixel-level explanations while reducing the patch/grid artifacts common in ViT saliency maps. It decomposes ViT input gradients to suppress operator-variation noise and enforces local equivariance by averaging attributions over small spatial transforms and input perturbations, improving localization and faithfulness across multiple ViT backbones.
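The averaging step can be sketched in miniature. The snippet below is a hedged illustration, not DAVE's actual implementation: it uses a toy scalar function with an analytic input gradient in place of a ViT backbone, and averages gradients over small shifts and Gaussian perturbations (in the spirit of SmoothGrad-style smoothing). All names here (`toy_model_grad`, `smoothed_attribution`) are hypothetical.

```python
import numpy as np

def toy_model_grad(x):
    # Analytic input gradient of a toy scalar "model" f(x) = sum(sin(x)),
    # so grad f = cos(x). Stands in for backprop through a real ViT.
    return np.cos(x)

def smoothed_attribution(x, n_samples=32, sigma=0.1, max_shift=1, seed=0):
    """Average input gradients over small spatial shifts and Gaussian noise,
    a hypothetical stand-in for DAVE's equivariance-enforcing averaging."""
    rng = np.random.default_rng(seed)
    acc = np.zeros_like(x)
    for _ in range(n_samples):
        # Sample a small integer shift and apply it to the input.
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        shifted = np.roll(x, (dy, dx), axis=(0, 1))
        # Add a small Gaussian perturbation before taking the gradient.
        noisy = shifted + rng.normal(0.0, sigma, size=x.shape)
        g = toy_model_grad(noisy)
        # Undo the shift so each gradient aligns with the original pixel grid.
        acc += np.roll(g, (-dy, -dx), axis=(0, 1))
    return acc / n_samples

img = np.linspace(0, np.pi, 16 * 16).reshape(16, 16)
attr = smoothed_attribution(img)
```

With a real ViT, `toy_model_grad` would be replaced by backpropagation of the target class score to the input pixels; the shift-and-unshift averaging is what discourages attribution mass from snapping to the patch grid.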