feat: replace gpt-oss-20b-uncensored with HauhauCS MXFP4 GGUF

The aoxo model shipped unquantized (BF16, ~40GB, OOMs on load). The HauhauCS
model uses the MXFP4 GGUF format and loads at 11.9GB via the llamacpp backend.
All three reasoning levels (Low/Medium/High) work.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Author: tlg
Date: 2026-04-06 16:41:41 +02:00
parent 7c4bbe0b29
commit 61308703dc

@@ -36,8 +36,9 @@ physical_models:
   gpt-oss-20b-uncensored:
     type: llm
-    backend: transformers
-    model_id: "aoxo/gpt-oss-20b-uncensored"
+    backend: llamacpp
+    model_id: "HauhauCS/GPT-OSS-20B-Uncensored-HauhauCS-Aggressive"
+    model_file: "GPT-OSS-20B-Uncensored-HauhauCS-MXFP4-Aggressive.gguf"
     estimated_vram_gb: 13
     supports_vision: false
     supports_tools: true
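
The size figures in the message can be sanity-checked from the MXFP4 layout: 4-bit weights in blocks of 32 sharing one 8-bit (E8M0) scale, i.e. 4.25 bits per weight versus 16 for BF16. The sketch below is a rough back-of-envelope only; the ~20.9B parameter count is an assumption, and a real GGUF also carries unquantized tensors (embeddings, norms) plus metadata, which is why the actual file loads at 11.9GB rather than the bare-weights estimate.

```python
# Back-of-envelope footprint check for an MXFP4-quantized ~20B model.
# MXFP4 (OCP MX spec): blocks of 32 FP4 values share one 8-bit E8M0 scale,
# so the effective rate is 4 + 8/32 = 4.25 bits per weight.

PARAMS = 20.9e9                 # assumed parameter count for a gpt-oss-20b class model
BITS_MXFP4 = 4 + 8 / 32         # 4.25 bits/weight
BITS_BF16 = 16                  # unquantized baseline

def weights_gigabytes(n_params: float, bits_per_weight: float) -> float:
    """Size of the weight tensors alone, in GB (ignores metadata and
    any tensors kept at higher precision)."""
    return n_params * bits_per_weight / 8 / 1e9

bf16_gb = weights_gigabytes(PARAMS, BITS_BF16)    # ~41.8 GB -> the "~40GB OOM"
mxfp4_gb = weights_gigabytes(PARAMS, BITS_MXFP4)  # ~11.1 GB -> close to the 11.9GB load

print(f"BF16:  ~{bf16_gb:.1f} GB")
print(f"MXFP4: ~{mxfp4_gb:.1f} GB")
```

The ratio 16 / 4.25 ≈ 3.76x is why the same checkpoint drops from an OOM at ~40GB to fitting in the configured 13GB VRAM budget.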