ae-314 committed · Commit 56bd9a1 · verified · 1 parent: 248bae8

Update README.md

Files changed (1): README.md (+17 −1)

README.md CHANGED
@@ -40,7 +40,23 @@ This is the model card of a 🤗 transformers model that has been pushed on the
  ### Model Sources [optional]

  - **Repository:** https://huggingface.co/ae-314/promoter-gpt-ft-tata
- - **Paper Adele de Hoffer: (https://huggingface.co/blog/hugging-science/promoter-gpt)
+ - **Paper:** Adele de Hoffer (https://huggingface.co/blog/hugging-science/promoter-gpt)
+
+ ## Evaluation
+
+ - **Test (balanced human promoters, 300 bp):** loss = **1.2884** · perplexity = **3.63**
+ - **Generation (N=50):** GC% ≈ **60.6 ± 15.2**; **TATA** 4-mer ≈ **26%**; **TATAWA** ≈ **10%**; unique 6-mer ratio ≈ **0.815**; ≥6-bp homopolymer ≈ **74%**
+ - **Notes:** Perplexity is measured on a mixed TATA + no-TATA domain (not directly comparable to an AT-rich 200 bp setup). Generation stats are unconditional (no control tokens).
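The reported perplexity is simply the exponential of the mean cross-entropy loss (in nats), so the two numbers above can be cross-checked directly:

```python
import math

def perplexity(loss: float) -> float:
    """Perplexity is exp(mean cross-entropy loss in nats)."""
    return math.exp(loss)

# The reported test loss of 1.2884 corresponds to perplexity ~3.63.
print(round(perplexity(1.2884), 2))  # 3.63
```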
+
+ ## Training Details
+ - **Base:** custom Promoter-GPT (GPT-2, ~0.43M params)
+ - **Data:** human promoters (300 bp), mixed `promoter_tata` + `promoter_no_tata` (positives), balanced
+ - **Tokenization & context:** 3-mer WordLevel; **298 tokens** (full 300 bp; positional embeddings expanded to `n_positions=298`)
+ - **Optimizer:** AdamW (weight_decay=0.01), **LR** = 1e-4, cosine schedule, warmup ≈ 10%
+ - **Batch / Accum:** 128 / 8 · **Epochs:** 3 · **Precision:** fp32
+ - **Hardware:** Google Colab **T4 GPU**
+
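The 298-token context follows from the overlapping 3-mer scheme: a sequence of length L yields L − k + 1 stride-1 k-mers, so 300 bp maps onto 298 tokens. A minimal sketch (the helper name is illustrative, not the model's actual tokenizer code):

```python
def kmer_tokenize(seq: str, k: int = 3) -> list[str]:
    """Split a DNA sequence into overlapping k-mers (stride 1)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

# A 300 bp sequence maps onto 298 tokens, matching n_positions=298.
print(len(kmer_tokenize("A" * 300)))  # 298
```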

  ### Direct Use
  - Unconditional generation of **synthetic human promoter-like sequences** from a short seed (research/education only).
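The generation statistics reported in the Evaluation section (GC%, homopolymer runs, unique 6-mer ratio) are straightforward to recompute for any generated sequence; a sketch with illustrative helper names (these are not functions from the repository):

```python
import re

def gc_percent(seq: str) -> float:
    """GC content as a percentage."""
    return 100 * sum(base in "GC" for base in seq) / len(seq)

def has_homopolymer(seq: str, n: int = 6) -> bool:
    """True if the sequence contains a run of >= n identical bases."""
    return re.search(r"(.)\1{%d,}" % (n - 1), seq) is not None

def unique_kmer_ratio(seq: str, k: int = 6) -> float:
    """Fraction of k-mer positions that carry a distinct k-mer."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    return len(set(kmers)) / len(kmers)

seq = "ATGCGCGCTATAAAAAAGGC"
print(gc_percent(seq))       # 45.0
print(has_homopolymer(seq))  # True (the AAAAAA run)
print(unique_kmer_ratio("AAAAAAA"))  # 0.5
```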