Implementing Fmask and s2cloudless in Python

Implementing Fmask and s2cloudless in Python requires combining a physics-based radiative transfer classifier with a machine-learning probabilistic detector to achieve production-grade accuracy for optical satellite imagery. The standard workflow reads Sentinel-2 surface reflectance bands, computes pixel-level cloud probabilities using s2cloudless, ingests Fmask quality assurance (QA) bands, and merges both outputs into a single boolean mask using rasterio and numpy. This hybrid architecture compensates for s2cloudless’s occasional thin-cloud misses while leveraging Fmask’s robust geometric shadow detection, which is critical for time-series analysis and surface reflectance normalization.

Hybrid Architecture & Data Flow

Modern Cloud and Shadow Masking Strategies increasingly favor ensemble masking over single-model approaches. Fmask relies on physical thresholds (brightness temperature, NDVI, and spectral ratios) to isolate clouds and shadows, making it highly reliable for thick cloud cover and topographic shading. Conversely, s2cloudless uses a gradient-boosted decision tree classifier (LightGBM) trained on Sentinel-2 top-of-atmosphere (TOA) reflectance, excelling at detecting semi-transparent cirrus and high-altitude aerosols that bypass thermal thresholds.

By merging both outputs with a logical OR operation, you retain the highest sensitivity across spectral domains. The pipeline typically follows this sequence:

Preprocessing: Scale TOA bands to [0, 1] reflectance using the Sentinel-2 L1C quantification value of 10000.
ML Detection: Pass the band stack to S2PixelCloudDetector to generate a probability raster.
Physics QA: Execute Fmask via CLI or load pre-generated QA bands.
Fusion: Apply probability thresholds (typically ≥0.45) and decode Fmask categorical values, then union the masks.
Export: Write the final boolean mask to a Cloud-Optimized GeoTIFF (COG) with matching geospatial metadata.

For teams building automated Satellite Processing Workflows & Index Pipelines, this hybrid step should run immediately after atmospheric correction and before spectral index calculation to prevent cloud-contaminated statistics from skewing downstream analytics.

Environment & Compatibility Matrix

Component	Supported Versions	Notes
Python	3.10–3.13	rasterio and GDAL now ship binary wheels through 3.13
`s2cloudless`	≥1.5.0	Depends on `lightgbm`, `numpy`, `opencv-python-headless`
`rasterio`	≥1.3.0	Must align with GDAL 3.4+ for COG/STAC compliance
Fmask	4.2–5.3 (CLI)	Official Python bindings are deprecated; use CLI → QA ingestion
Sentinel-2	L1C (TOA)	`s2cloudless` expects L1C TOA reflectance normalized to `[0, 1]`
Landsat	8/9 (SR/TOA)	Fmask is optimized for TIRS thermal bands; adjust band mappings

Critical deployment note: Fmask is distributed as a compiled C++/MATLAB toolchain with no stable Python API. Production Python pipelines should invoke Fmask via subprocess, then parse the resulting *_fmask.tif QA band. Direct in-memory execution via ctypes or pybind11 is unstable outside conda-managed environments.

Production Implementation

The following script demonstrates a complete, memory-safe pipeline. It assumes Fmask has already been executed externally and that aligned Sentinel-2 L1C bands are available on disk. Note that s2cloudless expects input with shape (n_samples, n_rows, n_cols, n_bands) where n_bands is the number of spectral bands in the order specified by the detector (10 bands for all_bands=True).

import numpy as np
import rasterio
from rasterio.windows import Window
from s2cloudless import S2PixelCloudDetector
from pathlib import Path

def load_and_stack_bands(band_paths: list[str]) -> tuple[np.ndarray, dict]:
    """Load Sentinel-2 L1C bands into a normalized [0, 1] numpy stack."""
    with rasterio.open(band_paths[0]) as src:
        profile = src.profile.copy()
        height, width = src.height, src.width

    # Sentinel-2 L1C quantification value: DNs / 10000 = TOA reflectance [0, 1]
    SCALE_FACTOR = 10000.0
    stack = np.zeros((len(band_paths), height, width), dtype=np.float32)

    for i, path in enumerate(band_paths):
        with rasterio.open(path) as src:
            band = src.read(1, out_dtype=np.float32)
            stack[i] = np.clip(band / SCALE_FACTOR, 0.0, 1.0)

    return stack, profile

def decode_fmask_qa(qa_path: str) -> np.ndarray:
    """Load Fmask QA band and extract cloud/shadow pixels."""
    with rasterio.open(qa_path) as src:
        qa = src.read(1)

    # Fmask 4.x categorical encoding:
    # 0=clear, 1=water, 2=cloud_shadow, 3=snow/ice, 4=cloud, 255=fill/nodata
    fmask_cloud = (qa == 4)
    fmask_shadow = (qa == 2)
    return fmask_cloud | fmask_shadow

def generate_hybrid_mask(
    band_paths: list[str],
    fmask_qa_path: str,
    output_path: str,
    cloud_threshold: float = 0.45,
    chunk_size: int = 1024
) -> None:
    """Merge s2cloudless probabilities with Fmask QA into a boolean mask."""
    stack, profile = load_and_stack_bands(band_paths)
    height, width = stack.shape[1], stack.shape[2]

    # s2cloudless detector — all_bands=True expects 10 specific S2 bands in order:
    # B01, B02, B04, B05, B08, B8A, B09, B10, B11, B12
    detector = S2PixelCloudDetector(threshold=cloud_threshold, all_bands=True)

    # Pre-load Fmask mask for the full scene (fast; header-only then single read)
    fmask_full = decode_fmask_qa(fmask_qa_path)

    # Prepare output profile for boolean mask (uint8, 0=clear, 1=cloud/shadow)
    out_profile = profile.copy()
    out_profile.update(dtype="uint8", count=1, compress="deflate", nodata=None)

    with rasterio.open(output_path, "w", **out_profile) as dst:
        for row in range(0, height, chunk_size):
            for col in range(0, width, chunk_size):
                win = Window(
                    col, row,
                    min(chunk_size, width - col),
                    min(chunk_size, height - row)
                )

                # 1. s2cloudless: expects shape (n_data, n_rows, n_cols, n_bands)
                chunk_stack = stack[
                    :,
                    win.row_off:win.row_off + win.height,
                    win.col_off:win.col_off + win.width
                ]
                # Transpose from (bands, H, W) to (H, W, bands), then add batch dim
                chunk_input = chunk_stack.transpose(1, 2, 0)[np.newaxis, ...]
                # get_cloud_probability_maps returns shape (n_data, H, W)
                prob_maps = detector.get_cloud_probability_maps(chunk_input)
                ml_cloud = prob_maps[0] >= cloud_threshold

                # 2. Fmask QA slice
                fmask_chunk = fmask_full[
                    win.row_off:win.row_off + win.height,
                    win.col_off:win.col_off + win.width
                ]

                # 3. Hybrid union: flag if either detector marks as cloud/shadow
                hybrid_mask = ml_cloud | fmask_chunk
                dst.write(hybrid_mask.astype(np.uint8), 1, window=win)

Performance Optimization & Validation

Memory Management

Loading full-scene 10m/20m bands into RAM frequently triggers MemoryError on standard cloud instances. The chunked rasterio window approach above keeps peak memory manageable for typical tile sizes. For larger footprints, integrate dask.array or rioxarray to parallelize tile processing across CPU cores.

Threshold Calibration

The default cloud_threshold=0.45 in s2cloudless balances precision and recall for mid-latitude Sentinel-2 data. In tropical or high-aerosol regions, lower the threshold to 0.35 to capture thin haze, then validate against manual QA samples. Always cross-check against the official s2cloudless GitHub repository for model updates and band-order requirements — the expected band order is documented there and must be followed exactly.

Fmask QA Decoding

Fmask 5.x introduced bit-packed outputs for compact storage. If your QA band uses bitwise encoding instead of categorical values, replace the equality checks with bitwise operations:

# Bit-packed Fmask 5.x (check the algorithm reference for your sensor version)
fmask_cloud = (qa & 0b00000001) > 0
fmask_shadow = (qa & 0b00000010) > 0

Refer to the Fmask algorithm reference for exact bit definitions per sensor version.

Validation Workflow

Before deploying to production, run a stratified random sample (n=500 pixels) against high-resolution reference imagery (e.g., PlanetScope or NAIP). Calculate precision, recall, and F1-score. A well-tuned hybrid mask typically achieves F1 ≥ 0.92 for clouds and F1 ≥ 0.88 for shadows across diverse biomes.