Gated access: accept the Apache-2.0 license and the NVIDIA Nemotron-3 base license to use this repository's files. The model embeds Lorenzo Bernardini's Fractal-RL / THESIA research; cite arXiv:2503.01307 for the cognitive-behaviors methodology.
Lorenzob/synch-2-merged
Full merged model: nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 (120B MoE · 12B active) +
Lorenzob/synch-2 (v11 APOGEO LoRA · best reward +0.200).
Drop-in compatible with: HF Dedicated Endpoints, TGI, vLLM, HF Inference API, Together AI, Modal, RunPod, Replicate.
Quickstart · Dedicated Endpoint
from huggingface_hub import InferenceClient
client = InferenceClient("Lorenzob/synch-2-merged", token="<HF_TOKEN>")
out = client.chat_completion(
    messages=[
        {"role": "user",
         "content": "Compute the Ollivier-Ricci curvature of K_5."},
    ],
    max_tokens=512, temperature=0.7,
)
print(out.choices[0].message.content)
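For interactive use, the same call supports streaming; a minimal sketch, assuming the endpoint runs the standard TGI container:
from huggingface_hub import InferenceClient
client = InferenceClient("Lorenzob/synch-2-merged", token="<HF_TOKEN>")
# stream=True yields incremental chunks instead of one full response
for chunk in client.chat_completion(
    messages=[{"role": "user", "content": "Compute the Ollivier-Ricci curvature of K_5."}],
    max_tokens=512, temperature=0.7, stream=True,
):
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)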
Quickstart · TGI (Text Generation Inference)
docker run --gpus all --shm-size 1g -p 8080:80 \
  -v $PWD/data:/data \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id Lorenzob/synch-2-merged --trust-remote-code \
  --num-shard 4 --max-input-length 4096 --max-total-tokens 8192
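Once the container is up, it can be queried from Python; a minimal sketch, assuming the port mapping above (host port 8080):
from huggingface_hub import InferenceClient
# Point the client at the local TGI server instead of the Hub
client = InferenceClient("http://localhost:8080")
print(client.text_generation(
    "Compute the Ollivier-Ricci curvature of K_5.",
    max_new_tokens=512, temperature=0.7,
))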
Quickstart · vLLM
python -m vllm.entrypoints.openai.api_server \
  --model Lorenzob/synch-2-merged --trust-remote-code \
  --dtype bfloat16 --tensor-parallel-size 4 \
  --max-model-len 8192
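vLLM serves an OpenAI-compatible API, so any OpenAI SDK client can talk to it; a minimal sketch, assuming the default port 8000 (the "EMPTY" api_key is the usual placeholder for local servers):
from openai import OpenAI
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
resp = client.chat.completions.create(
    model="Lorenzob/synch-2-merged",
    messages=[{"role": "user", "content": "Compute the Ollivier-Ricci curvature of K_5."}],
    max_tokens=512, temperature=0.7,
)
print(resp.choices[0].message.content)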
Quickstart · Local (transformers, full weights)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
tok = AutoTokenizer.from_pretrained(
    "Lorenzob/synch-2-merged", trust_remote_code=True,
)
model = AutoModelForCausalLM.from_pretrained(
    "Lorenzob/synch-2-merged",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
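To generate with the loaded weights, a minimal continuation of the block above (sampling parameters are illustrative):
messages = [{"role": "user", "content": "Compute the Ollivier-Ricci curvature of K_5."}]
# Apply the model's chat template and move inputs to the model's device
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt",
).to(model.device)
out_ids = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens
print(tok.decode(out_ids[0, inputs.shape[-1]:], skip_special_tokens=True))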
Suggested Hardware
AWS · 4x H100 80GB (recommended); 8x A100 80GB also works. Total BF16 weights ~245 GB,
in 50 shards (model-XXXXX-of-00050.safetensors).
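The sizing follows from simple arithmetic; a sketch (BF16 stores 2 bytes per parameter; KV cache and activations add overhead on top of the weights):
params = 120e9                 # total parameters (MoE; ~12B active per token)
weights_gb = params * 2 / 1e9  # BF16 = 2 bytes/param -> ~240 GB, matching ~245 GB on disk
per_gpu_gb = weights_gb / 4    # ~60 GB per GPU on 4x H100 80GB, before KV cache
print(f"~{weights_gb:.0f} GB total, ~{per_gpu_gb:.0f} GB per GPU")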
Merge Details
- Base: NVIDIA Nemotron-3-Super-120B-A12B-BF16 (120B MoE, 12B active)
- Adapter: Lorenzob/synch-2 (v11 APOGEO, best reward +0.200)
- LoRA rank: 64 · LoRA alpha: 32
- Merge type: weight addition (peft merge_and_unload)
- Attention impl: SDPA (default), FlashAttention-2 supported
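For reference, a merge of this shape is typically reproduced with PEFT roughly as follows (a sketch; the output path is an assumption, and the repo's actual merge script may differ):
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16",
    torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True,
)
# Attach the LoRA adapter, then fold W + (alpha/rank) * B @ A into the base weights
merged = PeftModel.from_pretrained(base, "Lorenzob/synch-2").merge_and_unload()
merged.save_pretrained("synch-2-merged")  # hypothetical output path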
Governance
See the dedicated documents in the repo:
- bias.md — bias analysis
- safety.md — safety considerations
- privacy.md — privacy implications
- explainability.md — interpretability notes
- accuracy_chart.png — eval results visual
Attribution
- Cognitive behaviors: Gandhi et al. 2025 (arXiv:2503.01307)
- Self-improving reasoner: karpathy/nanochat
- Fractal RL · LCTR · THESIA: Lorenzo Bernardini publications.
License
Apache-2.0 (this merged model) · the NVIDIA Nemotron-3 license applies to the underlying base weights.
Evaluation results
- Best Fractal-RL Composite Reward on Synch2 Internal Eval (private): 0.200 (self-reported)