Ronen Zyroff

BigBlueWhale

AI & ML interests

None yet

Recent Activity

new activity 1 day ago

mistralai/Devstral-Small-2-24B-Instruct-2512: No community 4-bit quantization of Devstral Small 2 24B works on vLLM v0.15.1 — Ministral3ForCausalLM missing from registry

new activity 1 day ago

androiddrew/Devstral-Small-2-24B-Instruct-2512-AWQ-4bit: AWQ 4-bit produces repetitive gibberish on long outputs with vLLM v0.15.1 — same bug as cyankiwi variant

new activity 1 day ago

cyankiwi/Devstral-Small-2-24B-Instruct-2512-AWQ-4bit: AWQ 4-bit produces repetitive gibberish on long outputs with vLLM v0.15.1 — root cause identified

Organizations

None yet

New activity in mistralai/Devstral-Small-2-24B-Instruct-2512 1 day ago

No community 4-bit quantization of Devstral Small 2 24B works on vLLM v0.15.1 — Ministral3ForCausalLM missing from registry

#26 opened 1 day ago by

BigBlueWhale

New activity in androiddrew/Devstral-Small-2-24B-Instruct-2512-AWQ-4bit 1 day ago

AWQ 4-bit produces repetitive gibberish on long outputs with vLLM v0.15.1 — same bug as cyankiwi variant

#1 opened 1 day ago by

BigBlueWhale

New activity in cyankiwi/Devstral-Small-2-24B-Instruct-2512-AWQ-4bit 1 day ago

AWQ 4-bit produces repetitive gibberish on long outputs with vLLM v0.15.1 — root cause identified

#5 opened 1 day ago by

BigBlueWhale

New activity in Qwen/Qwen-Image-2512 about 1 month ago

Extremely slow on 5090

#1 opened about 1 month ago by

STTrife

New activity in Qwen/Qwen3-VL-32B-Instruct 2 months ago

Best non-thinking model qwen ever released

👍 1

#7 opened 2 months ago by

BigBlueWhale

New activity in Qwen/Qwen3-VL-32B-Thinking-GGUF 2 months ago

Disappointment in text performance

#1 opened 2 months ago by

BigBlueWhale

New activity in Qwen/Qwen3-VL-32B-Thinking 2 months ago

Qwen3-32B (April 2025) is superior

#2 opened 2 months ago by

BigBlueWhale

New activity in Qwen/Qwen3-VL-32B-Instruct 3 months ago

Recommended model parameters

#5 opened 3 months ago by

BigBlueWhale

How about running by llama.cpp

#1 opened 4 months ago by

rosspanda0

New activity in Qwen/Qwen3-VL-32B-Thinking 4 months ago

Best open source model ever, period.

🤝 2

#1 opened 4 months ago by

BigBlueWhale

New activity in TheBloke/Airoboros-c34B-2.1-GGUF over 2 years ago

Fix prompt format in llama.cpp command

#2 opened over 2 years ago by

nacs

Best open source model for coding (August 2023)

#1 opened over 2 years ago by

BigBlueWhale

New activity in jondurbin/airoboros-c34b-2.1 over 2 years ago

wizardcoder-python-34b sucks. Is this any better?

#1 opened over 2 years ago by

BigBlueWhale

New activity in TheBloke/WizardCoder-Python-34B-V1.0-GGUF over 2 years ago

This model looks insanely good for coding ( 73.2 for humanEval )!

🤯 👍 2

#1 opened over 2 years ago by

mirek190

New activity in TheBloke/llama2_70b_chat_uncensored-GGML over 2 years ago

Uncensored my ass ....

#2 opened over 2 years ago by

mirek190

New activity in jondurbin/airoboros-l2-70b-gpt4-1.4.1 over 2 years ago

Works perfectly in CPU mode with oobabooga

👍 2

#4 opened over 2 years ago by

BigBlueWhale

New activity in TheBloke/StableBeluga2-70B-GPTQ over 2 years ago

Why so few 8 bit capable models?

#13 opened over 2 years ago by

ibivibiv
