# nspam: on-device Nostr spam classifier
A tiny linear model that scores bundles of Nostr `kind:1` notes from a single
author as real or bot. Designed for mobile clients (Kotlin / Swift / JS).
- Model type: logistic regression over hashed character + word n-grams and hand-crafted structural features.
- Input: a list of 1–10 recent `kind:1` events from one pubkey.
- Output: a calibrated probability ∈ [0, 1] that the author is a bot, given the supplied notes.
- Size: ~1024 KB (float32 weights; the compressed `.npz` is smaller).
- Runtime: MurmurHash3 + dot product + sigmoid + isotonic lookup. No ML framework needed.
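The runtime described above (hashed features → dot product → sigmoid → isotonic lookup) can be sketched in a few lines. This is an illustrative reimplementation, not the shipped code: the MurmurHash3 step is omitted (tokens are assumed to already be hashed to feature indices), and the weights, intercept, and calibration knots below are made up.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic link: maps a linear score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def calibrate(p: float, knots: list) -> float:
    """Piecewise-linear interpolation over sorted (raw, calibrated) knots --
    the usual way an isotonic-regression fit is applied at inference time."""
    if p <= knots[0][0]:
        return knots[0][1]
    for (x0, y0), (x1, y1) in zip(knots, knots[1:]):
        if p <= x1:
            t = (p - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)
    return knots[-1][1]

def score_bundle(features: dict, weights: dict, intercept: float, knots: list) -> float:
    """Sparse dot product + sigmoid + calibration lookup.
    `features` maps hashed feature index -> value."""
    z = intercept + sum(v * weights.get(i, 0.0) for i, v in features.items())
    return calibrate(sigmoid(z), knots)

# Toy example with invented numbers, purely to show the shapes involved.
knots = [(0.0, 0.0), (0.5, 0.3), (1.0, 1.0)]
p = score_bundle({3: 1.0, 17: 2.0}, {3: 0.8, 17: -0.1}, -0.5, knots)
```

No ML framework is needed at this point: the whole inference path is a sparse dot product plus two scalar transforms.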
## Intended use
Client-side filtering/ranking in Nostr apps. The score is directional, not a verdict: apps should combine it with user mutes, the follow graph, and NIP-56 reports rather than hard-blocking on a single score.
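One way a client might combine the score with those other signals. The function name, thresholds, and tiers below are invented for illustration and are not part of nspam:

```python
def filter_decision(bot_prob: float, author: str,
                    follows: set, mutes: set,
                    report_count: int) -> str:
    """Illustrative ranking policy: the model score is one signal among
    several, never a hard block on its own. All thresholds are made up."""
    if author in mutes:
        return "hide"        # explicit user intent always wins
    if author in follows:
        return "show"        # follow graph overrides a borderline score
    if bot_prob > 0.95 and report_count > 0:
        return "collapse"    # high score corroborated by NIP-56 reports
    if bot_prob > 0.8:
        return "downrank"    # score alone only demotes, never hides
    return "show"
```

The key property is that the model alone can at most downrank; hiding requires corroboration from user action or network reports.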
## Not for
- High-stakes moderation decisions.
- Replacing human review for appeals.
- Scoring notes that aren't `kind:1` (no training data for kinds 0/3/5/7/…).
## Training data
- Authors: labeled via a manual UI at the pubkey level. Not released (the labels belong to the maintainer and contain private judgement calls).
- Notes: pulled from public relays (`damus.io`, `nos.lol`, `primal.net`, `nostr.wine`, others) using the public NIP-01 filter protocol.
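For reference, fetching an author's recent `kind:1` notes under NIP-01 is a single `REQ` message sent over the relay's websocket. A minimal sketch (the subscription id is arbitrary; the transport itself is omitted):

```python
import json

def kind1_request(pubkey_hex: str, limit: int = 10) -> str:
    """Build the NIP-01 REQ message asking a relay for an author's most
    recent kind:1 notes. The returned string is sent as one websocket frame."""
    sub_id = "nspam-fetch"  # arbitrary, client-chosen subscription id
    filt = {"authors": [pubkey_hex], "kinds": [1], "limit": limit}
    return json.dumps(["REQ", sub_id, filt])

msg = kind1_request("ab" * 32)
```

The relay answers with `["EVENT", sub_id, event]` frames followed by `["EOSE", sub_id]`, which is what the training pipeline consumes.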
## Holdout metrics (v0.9)
| metric | value |
|---|---|
| average precision | 0.9787 |
| ROC AUC | 0.9888 |
| precision @ recall 0.9 | 0.9368 |
| per-author accuracy (majority) | 0.9059 |
Scores are best on bundles of 5+ notes. Single-note inference works but is
less reliable; see `errors.jsonl` in the training repo for failure modes.
## Limitations & known failure modes
- English-heavy training data. Non-Latin-script feeds are underrepresented.
- Adversarial drift. Spammers can and do adopt new templates. Re-train periodically.
- Mixed-content bot accounts. Accounts that post mostly innocuous content with occasional spam will score low, since per-note labels don't exist.
- Cold start. Accounts with <3 notes have limited signal; model emits a less-confident score.
## Files
- `weights.npz`: folded LogReg weights, intercept, calibration knots.
- `config.json`: feature layout, hashing convention, bundle sizes.
- `parity_fixtures.jsonl`: 50 bundles + expected scores for port validation.
- `hash_fixtures.jsonl`: per-token hash outputs for unit-testing the hash port.
- `README.md`: quickstart.
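A port can be validated by replaying every bundle in `parity_fixtures.jsonl` through the new scorer and comparing against the expected score. A sketch, assuming each JSONL line carries a `"notes"` field and an `"expected"` score (hypothetical field names; the real schema is defined by the fixtures file itself):

```python
import json

def check_parity(fixtures_path: str, score_fn, tol: float = 1e-6) -> list:
    """Return the line indices of fixture bundles whose ported score
    diverges from the expected score by more than tol."""
    failures = []
    with open(fixtures_path) as fh:
        for i, line in enumerate(fh):
            fixture = json.loads(line)
            got = score_fn(fixture["notes"])
            if abs(got - fixture["expected"]) > tol:
                failures.append(i)
    return failures
```

An empty return list means the port is bit-for-bit consistent within `tol`; a tight tolerance matters here because float32 weights plus a calibration lookup leave little room for platform-dependent drift.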
## License
MIT.