AnomalyVFM: Transforming Vision Foundation Models into Zero-Shot Anomaly Detectors

AnomalyVFM is a general and effective framework that transforms pretrained Vision Foundation Models (VFMs), such as RADIO, DINOv2, or SigLIP2, into strong zero-shot anomaly detectors. This model was presented at CVPR 2026.

Overview

Zero-shot anomaly detection aims to detect and localise abnormal regions in images without access to any in-domain training images. AnomalyVFM addresses this by combining a robust three-stage synthetic dataset generation scheme with a parameter-efficient adaptation mechanism using low-rank feature adapters (LoRA/DoRA) and a confidence-weighted pixel loss.
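For orientation, the low-rank adapters mentioned above can be written out using the standard LoRA and DoRA formulations for a frozen pretrained weight matrix. This is only a sketch of the general technique: which layers AnomalyVFM adapts, the rank r, and the scaling factor α are not stated in this card and are left as symbols.

```latex
% LoRA: the frozen weight W_0 is augmented with a trainable low-rank update.
W' = W_0 + \frac{\alpha}{r}\, B A,
\qquad B \in \mathbb{R}^{d \times r},\quad A \in \mathbb{R}^{r \times k},\quad r \ll \min(d, k)

% DoRA: the adapted weight is re-normalised column-wise and rescaled
% by a trainable magnitude vector m.
W' = m \odot \frac{W_0 + B A}{\lVert W_0 + B A \rVert_c},
\qquad m \in \mathbb{R}^{1 \times k}
```

In both cases only A, B (and m for DoRA) are trained, which is why the adaptation is parameter-efficient: the number of trainable parameters grows with r(d + k) rather than with dk.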

With RADIO as a backbone, AnomalyVFM achieves an average image-level AUROC of 94.1% across 9 diverse datasets, substantially outperforming previous methods.
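As a reminder of what the reported metric measures, here is a minimal pure-Python sketch of image-level AUROC and its average over several datasets. The function names, the toy labels and scores, and the use of an unweighted mean across datasets are assumptions made for illustration; they are not taken from the AnomalyVFM evaluation code.

```python
from typing import Dict, Sequence, Tuple


def image_auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random anomalous image scores higher than a random normal one.

    labels: 1 for anomalous, 0 for normal. Ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one normal and one anomalous image")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auroc(per_dataset: Dict[str, Tuple[Sequence[int], Sequence[float]]]) -> float:
    """Unweighted mean of per-dataset AUROCs (assumed averaging scheme)."""
    values = [image_auroc(y, s) for y, s in per_dataset.values()]
    return sum(values) / len(values)


# Hypothetical toy data: (labels, anomaly scores) per dataset.
results = {
    "dataset_a": ([0, 0, 1, 1], [0.10, 0.40, 0.35, 0.80]),
    "dataset_b": ([0, 1, 1], [0.20, 0.90, 0.70]),
}
print(f"{mean_auroc(results):.3f}")  # prints 0.875
```

The pairwise loop is quadratic in the number of images and is meant for clarity only; for real evaluations a rank-based implementation from an established library is preferable.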

Usage

To test the model on a single image, use the following script (requires torch, torchvision, transformers, and Pillow):

import torch
from PIL import Image
import torchvision

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Available entry points: "anomalyvfm_radio", "anomalyvfm_dinov2", "anomalyvfm_clip" and "anomalyvfm_siglip2" (more to be added)
anomalyvfm = torch.hub.load("MaticFuc/AnomalyVFM", "anomalyvfm_siglip2", trust_remote_code=True, force_reload=True).to(device)
img_trf = anomalyvfm.model.get_img_transform()

image = Image.open("test.png").convert("RGB")
image = img_trf(image).unsqueeze(0).to(device)  # apply the model's transform and add a batch dimension
with torch.no_grad():
    score, mask = anomalyvfm(image)

print(f"Anomaly Score: {score.item():.4f}")
torchvision.utils.save_image(mask.float(), "pred.png")

The script prints the image-level anomaly score and saves the predicted anomaly mask as pred.png.

Citation

If this work contributes to your research, please consider citing:

@InProceedings{fucka2026anomaly_vfm,
    title={AnomalyVFM -- Transforming Vision Foundation Models into Zero-Shot Anomaly Detectors},
    author={Fučka, Matic and Zavrtanik, Vitjan and Skočaj, Danijel},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026}
}