Open to Collab

Shihan Qu

zenmagnets

AI & ML interests

None yet

Recent Activity

new activity about 9 hours ago

Qwen/Qwen3.5-397B-A17B:Qwen3.6 397b

new activity about 1 month ago

mmangkad/Qwen3.6-27B-NVFP4:31gb NVFP4 Model?

liked a model about 1 month ago

NinjaBoffin/MiniMax-M2.7-NVFP4

Organizations

None yet

New activity in Qwen/Qwen3.5-397B-A17B about 9 hours ago

Qwen3.6 397b

#75 opened about 1 month ago by

New activity in mmangkad/Qwen3.6-27B-NVFP4 about 1 month ago

31gb NVFP4 Model?

#1 opened about 1 month ago by

New activity in MiniMaxAI/MiniMax-M2.7 about 1 month ago

license

#5 opened about 1 month ago by

New activity in varjosoft/GLM-5.1-Open-TQ3 about 1 month ago

Pending GPU & vLLM validation

#1 opened about 2 months ago by

New activity in MiniMaxAI/MiniMax-M2.7 about 1 month ago

No commercial use allowed in License?

#6 opened about 1 month ago by

New activity in lukealonso/Qwen3.5-397B-A17B-NVFP4 3 months ago

How to run on vLLM for 4xSM120

#1 opened 3 months ago by

New activity in lukealonso/MiniMax-M2.5-NVFP4 3 months ago

Here's the vLLM recipe I'm using with 2x RTX Pro 6000

#1 opened 3 months ago by

New activity in Ex0bit/Kimi-K2.5-PRISM-REAP-530B-A32B 3 months ago

Anyone get this working on 4x RTX 6000 Pro?

#1 opened 3 months ago by

New activity in GadflyII/MiniMax-M2.1-NVFP4 3 months ago

Throughput NVFP4 on Dual 6000 Blackwells

#2 opened 3 months ago by

New activity in vincentzed-hf/Qwen3.5-397B-A17B-NVFP4 3 months ago

Anyone try this on 4x RTX 6000 Pro yet?

#1 opened 3 months ago by

New activity in mlx-community/Qwen3.5-397B-A17B-nvfp4 3 months ago

I wish it would fit in 2x6000 PRO!

#2 opened 3 months ago by

New activity in lukealonso/MiniMax-M2.5-NVFP4 3 months ago

"w1_weight_scale_2 must match w3_weight_scale_2. Accuracy may be affected."

#2 opened 3 months ago by

New activity in GadflyII/GLM-4.7-Flash-NVFP4 4 months ago

Wasn't able to recreate MMLU-Pro benchmarks

#5 opened 4 months ago by

New activity in zai-org/GLM-4.7-Flash 4 months ago

Enormous KV-cache size?

#3 opened 4 months ago by

New activity in GadflyII/GLM-4.7-Flash-NVFP4 4 months ago

Really appreciate that you ran performance comparison tests with BF16!

#2 opened 4 months ago by

New activity in marksverdhei/GLM-4.7-Flash-FP8 4 months ago

Performance comps with BF16?

#3 opened 4 months ago by

New activity in cyankiwi/GLM-4.7-Flash-AWQ-4bit 4 months ago

Any plans for a 6bit or 8bit version?

#3 opened 4 months ago by

New activity in marksverdhei/GLM-4.7-Flash-FP8 4 months ago

If 8bit, why shaped like 16 bit

#2 opened 4 months ago by

New activity in nvidia/Qwen3-30B-A3B-NVFP4 6 months ago

6 months since intro of NVFP4, and it's basically still a myth

#4 opened 6 months ago by

New activity in RESMP-DEV/Qwen3-Next-80B-A3B-Instruct-NVFP4 6 months ago

Works with vllm? Any recommendations or howtos?

#1 opened 7 months ago by