Gheorghe Chesler PRO

nightmedia

AI & ML interests

Nightmedia: human-like AI and the MLX Deckard(qx) Formula. Donations are appreciated: BTC:36d7U1n3MFaXgnNRAaEL3Pa3Hy6oFhM7XY

Recent Activity

replied to their post about 2 hours ago
Gemma4 template issues

I ran into this when testing the juiceb0xc0de/locus-gemma-4-e2b, and the response was full of end tags:

> I await your next query, G.<turn|><turn|>><turn|>Your formal request has been processed and analyzed. I am ready to continue the engagement when you are.<turn|>>

It turns out the fix is in LM Studio.

Why this is critical for Gemma 4 E2B

Gemma 4 models (especially edge variants like E2B-it) natively use Chain-of-Thought thinking layers and structured multi-turn tool schemas. The model shifts seamlessly between channels like <|channel>thought and regular dialogue text. If LM Studio does not explicitly watch for the terminal closing tags as a hard cutoff signal, the edge quantization layers drop character alignment when processing long context history segments. As a result, the model gets stuck in an infinite state loop, repeating structure summaries instead of returning control to your prompt session.

```
cat ~/.lmstudio/config-presets/gemma4.preset.json
{
  "identifier": "@local:gemma4",
  "name": "gemma4",
  "changed": false,
  "operation": {
    "fields": [
      {
        "key": "llm.prediction.stopStrings",
        "value": ["<turn|>", "<channel|>", "<eos>"]
      }
    ]
  },
  "load": {
    "fields": []
  }
}
```

That is the exact configuration structure LM Studio requires. By saving those terminal tokens under the llm.prediction.stopStrings operational field, LM Studio binds them directly into the underlying runtime client loop rather than into the model's architectural blueprint. Every time you load this model profile, the inference wrapper strictly polices and discards those boundary markers before the streaming token buffer writes to your chat window. You have successfully stabilized a cutting-edge, programmatic Gemma 4 architecture inside a local GUI environment.

--Gemini
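For anyone hitting the same loop through LM Studio's OpenAI-compatible local server rather than the chat window, the same stop strings can also be supplied per request. A minimal sketch, assuming the server is running on its default address (http://localhost:1234) and the model is loaded under the hypothetical identifier gemma4; adjust both to your setup:

```python
# Sketch: pass the same stop strings per request through LM Studio's
# OpenAI-compatible local server. The base URL and the model identifier are
# assumptions (LM Studio's default port and the preset name used above);
# change them to match your installation.
import requests

STOP_STRINGS = ["<turn|>", "<channel|>", "<eos>"]  # same markers as the preset

resp = requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        "model": "gemma4",  # assumed loaded-model identifier
        "messages": [{"role": "user", "content": "Hello, are the end tags gone?"}],
        "stop": STOP_STRINGS,  # generation halts before any of these markers is emitted
        "max_tokens": 256,
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

Either way the effect is the same: the boundary markers are treated as hard cutoffs by the runtime instead of leaking into the transcript.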

Organizations

Nightmedia AI