---
license: apache-2.0
tags:
- finance
- settlement-fails
# add
model_type: lightgbm
library_name: lightgbm
---

## Settlement “Stress” Flagging with LightGBM

**Objective**
Quickly flag days where a given CUSIP’s settlement fails are in the top 10% of historic fail values, so ops can investigate and remediate before T+1.

**Data & Features**

- **Raw inputs**: daily “fails-to-deliver” count (`QUANTITY (FAILS)`) and price
- **Engineered signals** (all lagged or historical, no leakage):
  - 1-day lags: `qty_lag1`, `price_lag1`, `fail_value_lag1`
  - Rolling stats per CUSIP: 7-day mean/std of quantity, 30-day mean/std of fail value
  - Momentum: `qty_pct_change`, `price_pct_change`
  - Cumulative counts: days since last fail, # of days with any fail, cum qty
  - Event timing: `day_of_week`, `is_month_end`, `is_quarter_end`, `is_year_end`
  - Text flags: `is_foreign`, `is_adr`, `is_etf`, `is_reit`
  - Heavy-tail transforms: `log_qty`, `log_val`, extreme spikes

**Model**

- **Algorithm**: LightGBM Classifier (handles missing values out-of-the-box, extremely fast)
- **Training**
  - Split by date: train = all data before `2025-01-01`, test = after
  - Positive class = `fail_value` > 90th percentile (train)
  - Early-stop on AUC & `binary_error` on the hold-out
  - Best iteration: ~20 boosting rounds

**Performance on Test Set**

- **Threshold** (train 90th pctile of `fail_value`): 445 122.29
- **ROC-AUC**: 1.000
- **Precision**: 0.99
- **Recall**: 1.00
- **F1-Score**: 1.00

*Figure 1 – Confusion matrix on the 2025 test slice.*

| Class | True Neg | False Pos | False Neg | True Pos |
|-------|---------:|----------:|----------:|---------:|
| Count | 279 712 | 226 | 49 | 30 991 |

**Top Features (gain)**

| Feature | Importance |
|---------------------|-----------:|
| `price_pct_change` | 142 |
| `price_lag1` | 129 |
| `log_qty` | 115 |
| `qty_pct_change` | 83 |
| `fail_value_lag1` | 48 |
| `log_val` | 40 |
| (…plus smaller contributions…) | |

**Next Steps**

1. **Calibrate** probability threshold for ops SLAs.
2. **Monitor** drift in AUC/precision-recall over time.

### Quick start

```python
import joblib

model = joblib.load("lgb_settlement_stress_flag.pkl")
proba = model.predict_proba(X)[:, 1]  # P(stress)
flag = proba > 0.5
```

## Citation

> Musodza, K. (2025). Bond Settlement Automated Exception Handling and Reconciliation. Zenodo. https://doi.org/10.5281/zenodo.16828730
>
> ➡️ Technical white-paper & notebooks: https://github.com/Coreledger-tech/Exception-handling-reconciliation.git
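
### Checking the reported metrics

The precision, recall, and F1 listed under **Performance on Test Set** can be recomputed from the confusion-matrix counts in this card. The snippet below is a minimal sketch in plain Python. The helper name `metrics_from_confusion` is hypothetical and is not part of the released notebooks; only the four counts come from the card.

```python
def metrics_from_confusion(tn, fp, fn, tp):
    """Return precision, recall, F1 and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)          # share of flagged days that were truly stressed
    recall = tp / (tp + fn)             # share of stressed days that were flagged
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tn + fp + fn + tp)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Counts copied from the confusion-matrix table in this card.
metrics = metrics_from_confusion(tn=279_712, fp=226, fn=49, tp=30_991)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Rounded to two decimals, precision, recall, and F1 come out as 0.99, 1.00, and 1.00, matching the values reported above. The same helper could be reused for the drift monitoring mentioned under **Next Steps**, by feeding it the counts from each new scoring window.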