# ☁️ FDR4VGT-CLOUD

Official model release accompanying the paper:

> **A multisensor deep learning framework for robust cloud segmentation in SPOT-VGT and Proba-V**
> Julio Contreras, Cesar Aybar, Luis Gómez-Chova
> IEEE Geoscience and Remote Sensing Letters, 2026.
This model is the operational cloud masking algorithm selected for the ESA FDR4VGT archive reprocessing, delivering consistent cloud detection across the full SPOT-VGT (VGT1 1998–2003, VGT2 2002–2014) and Proba-V (2013–2020) record — a single sensor-agnostic model for the three missions.
## ✨ Overview

- Architecture: Hybrid DeepLabV3+ (MobileNetV2 backbone) + pixel-wise MLP (PW-DL3+)
- Input: 4 Top-of-Atmosphere reflectance bands (Blue, Red, NIR, SWIR), sensor-agnostic
- Supported sensors: SPOT-VGT1, SPOT-VGT2, Proba-V
- Input shape: `[B, 4, 512, 512]`
- Parameters: 12.65M (57.29 MB)
- Training: Weak-to-strong supervision, with large-scale pre-training on 3,647 weakly labeled scenes followed by fine-tuning on 109 hand-annotated hard-example scenes.
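For intuition, the hybrid architecture can be sketched as a convolutional branch (capturing spatial context, standing in for the full DeepLabV3+ encoder-decoder) fused with a pixel-wise MLP branch that classifies each pixel's spectrum independently via 1×1 convolutions. The sketch below is illustrative only: layer widths are tiny placeholders, not the released 12.65M-parameter model.

```python
import torch
import torch.nn as nn

class PWDL3PlusSketch(nn.Module):
    """Illustrative stand-in for the PW-DL3+ hybrid: a small CNN branch
    (spatial context) summed with a pixel-wise MLP branch (per-pixel
    spectrum), producing a single-channel cloud probability map."""
    def __init__(self, in_ch: int = 4):
        super().__init__()
        # Stand-in for the DeepLabV3+ encoder-decoder (3x3 convs see context)
        self.cnn = nn.Sequential(
            nn.Conv2d(in_ch, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, kernel_size=3, padding=1),
        )
        # Pixel-wise MLP: 1x1 convs act on each pixel's 4 bands independently
        self.pw = nn.Sequential(
            nn.Conv2d(in_ch, 16, kernel_size=1), nn.ReLU(),
            nn.Conv2d(16, 1, kernel_size=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Fuse the two branches, then map logits to [0, 1] probabilities
        return torch.sigmoid(self.cnn(x) + self.pw(x))

x = torch.rand(1, 4, 512, 512)      # [B, 4, H, W] TOA reflectance
prob = PWDL3PlusSketch()(x)         # [1, 1, 512, 512] cloud probability
```

The pixel-wise branch is what keeps the model sensor-agnostic at the spectral level, while the convolutional branch supplies the spatial context a per-pixel classifier lacks.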
## 🚀 Quick start

### Installation

```bash
pip install mlstac rasterio torch==2.5.1
```
### Inference

```python
import torch
import mlstac
import rasterio as rio

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# 1. Load the model
framework = mlstac.download(
    file="https://huggingface.co/isp-uv-es/FDR4VGT-CLOUD/resolve/main/single/multisensor_single_1dpwdeeplabv3.json",
    output_dir="FDR4VGT/single",
)
model = framework.model

# 2. Load a 4-band image (Blue, Red, NIR, SWIR)
with rio.open("https://huggingface.co/isp-uv-es/FDR4VGT-CLOUD/resolve/main/ensemble/rgb.tif") as src:
    image = src.read()

# 3. Run large-scene inference (sliding window + Hann blending)
prob = framework.predict_large(
    image=image,
    model=model,
    device=device,
    batch_size=8,   # increase on GPU to speed up; lower on CPU
    num_workers=8,
    nodata=0,       # pixel value treated as invalid/padding
)

# 4. Binarize with the operational threshold
cloud_mask = (prob.squeeze() > 0.5).astype("uint8")
```
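The "sliding window + Hann blending" that `predict_large` performs can be sketched as follows. This is a minimal illustration of the technique, not the framework's actual implementation; `predict_tile`, the tile size, and the overlap are assumptions for the example.

```python
import numpy as np

def hann2d(size: int) -> np.ndarray:
    """Separable 2-D Hann window: heavy weight at the tile center,
    fading to ~0 at the edges."""
    w = np.hanning(size)
    return np.outer(w, w)

def predict_large_sketch(image, predict_tile, tile=512, overlap=256):
    """Weighted sliding-window inference over a [C, H, W] image.
    Each tile prediction is multiplied by a Hann window and accumulated;
    dividing by the accumulated weights blends overlapping tiles smoothly,
    suppressing seam artifacts at tile borders."""
    _, H, W = image.shape
    acc = np.zeros((H, W))
    wsum = np.zeros((H, W))
    win = hann2d(tile) + 1e-8   # epsilon avoids zero weight at tile edges
    step = tile - overlap
    for y in range(0, max(H - tile, 0) + 1, step):
        for x in range(0, max(W - tile, 0) + 1, step):
            p = predict_tile(image[:, y:y + tile, x:x + tile])
            acc[y:y + tile, x:x + tile] += p * win
            wsum[y:y + tile, x:x + tile] += win
    return acc / wsum
```

Because the Hann weight is largest at the tile center, pixels near tile borders (where CNN predictions are least reliable) are dominated by the overlapping neighbor tile that sees them closer to its center.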
The binarization threshold (default `0.5`) can be tuned per use case; the paper uses the F₂-optimal threshold on the validation set.
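An F₂-optimal threshold can be found with a simple sweep over candidate values on labeled validation pixels. The helper names and grid below are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def f2_score(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """F-beta with beta=2, weighting recall 4x over precision:
    F2 = 5*TP / (5*TP + 4*FN + FP)."""
    tp = np.sum(y_pred & y_true)
    fp = np.sum(y_pred & ~y_true)
    fn = np.sum(~y_pred & y_true)
    return 5 * tp / (5 * tp + 4 * fn + fp + 1e-12)

def best_threshold(prob, y_true, grid=np.linspace(0.05, 0.95, 19)):
    """Return the grid threshold maximizing F2 on the validation labels."""
    scores = [f2_score(y_true, prob > t) for t in grid]
    return float(grid[int(np.argmax(scores))])
```

F₂ favors recall, so the selected threshold is typically lower than 0.5: for cloud masking, missing a cloud (false negative) usually costs more downstream than flagging a clear pixel.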
## 📊 Performance

Results on the manually annotated test set (PW-DL3+, Multi-FT strategy), averaged over scenes:
| Sensor | F₂ | IoU | κ |
|---|---|---|---|
| Proba-V | 0.891 | 0.842 | 0.808 |
| SPOT-VGT | 0.949 | 0.898 | 0.829 |
The model substantially outperforms the legacy BS1 (physical thresholds) and BS2 (pixel-wise MLP) baselines on both sensors, with the largest gain on SPOT-VGT (ΔF₂ = +0.090 over BS1). Temporal analysis across the 1998–2020 archive shows no statistically significant discontinuity at the VGT→Proba-V transition (Mann-Whitney U, p > 0.05), in contrast to the legacy record.
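A continuity check of this kind can be reproduced in spirit with `scipy.stats.mannwhitneyu`, comparing a per-scene statistic (e.g. cloud fraction) before and after the sensor transition. The series below are synthetic placeholders, not the archive statistics.

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Hypothetical monthly cloud fractions on either side of the 2013
# VGT -> Proba-V transition (illustrative values, not the paper's data)
rng = np.random.default_rng(0)
vgt_cloud_frac = rng.normal(0.62, 0.05, size=120)     # pre-transition
probav_cloud_frac = rng.normal(0.62, 0.05, size=84)   # post-transition

# Two-sided rank test: does either side tend to larger values?
stat, p = mannwhitneyu(vgt_cloud_frac, probav_cloud_frac,
                       alternative="two-sided")
consistent = p > 0.05   # no significant discontinuity at the transition
```

The Mann-Whitney U test is non-parametric (rank-based), so it makes no normality assumption about the cloud-fraction distributions, which is why it suits heterogeneous multi-mission time series.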
## 📁 Repository layout

| Path | Description |
|---|---|
| `single/multisensor_single_1dpwdeeplabv3.json` | Operational single-model weights (PW-DL3+) |
| `ensemble/rgb.tif` | Example test scene (4-band TOA reflectance) |
## 📄 Citation

If you use this model, please cite:

```bibtex
@article{contreras2026fdr4vgt,
  title   = {A multisensor deep learning framework for robust cloud segmentation in SPOT-VGT and Proba-V},
  author  = {Contreras, Julio and Aybar, Cesar and G{\'o}mez-Chova, Luis},
  journal = {IEEE Geoscience and Remote Sensing Letters},
  year    = {2026},
}
```
## 🙏 Acknowledgements
This work was supported by the European Space Agency (ESA) within the FDR4VGT: Fundamental Data Record for VGT project.
Developed at the Image Processing Laboratory (IPL), University of Valencia.