- Fix async generator streaming: _stream_generate yields directly instead of returning nested _iter(); route handler awaits generate(), then passes the async generator to StreamingResponse
- Replace aoxo/gpt-oss-20b-uncensored (no quant, OOM) with HauhauCS MXFP4 GGUF via llama-cpp backend

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>