Clarify VRAM eviction rule for cross-priority edge case

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 13:20:53 +02:00
parent e6be9dcb85
commit bd0ed74d32
1 changed file with 1 addition and 1 deletion
--- a/kischdle/llmux/docs/superpowers/specs/2026-04-03-llmux-design.md
+++ b/kischdle/llmux/docs/superpowers/specs/2026-04-03-llmux-design.md
@@ -93,7 +93,7 @@ When a request arrives for a model whose physical model is not loaded:
   - Evict LLM first
   - Evict TTS second
   - Evict ASR only as last resort
-   - Never evict a higher-priority model to load a lower-priority one
+   - Never evict a higher-priority model to load a lower-priority one (e.g., never evict ASR to make room for TTS; in that case, evict the LLM instead)
 4. Load the requested model.

 ### Concurrency
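
The eviction order and the cross-priority rule in the hunk above can be illustrated with a minimal sketch. It is not taken from the llmux code base: the function name `choose_evictions`, the `PRIORITY` table, the model names, and the megabyte sizes are all hypothetical. It also assumes two things the spec excerpt does not state: that a model of equal priority may be evicted, and that a request which cannot fit without breaking the rule is refused (returned as `None`).

```python
# Hypothetical priority ranks: a higher number means higher priority,
# i.e. the model is evicted later (LLM first, TTS second, ASR last).
PRIORITY = {"llm": 0, "tts": 1, "asr": 2}


def choose_evictions(loaded, requested_kind, needed_mb, free_mb):
    """Pick models to evict so that `needed_mb` of VRAM becomes available.

    loaded: list of (name, kind, size_mb) tuples for models currently in VRAM.
    Returns the names to evict, in order, or None if the request cannot be
    satisfied without evicting a higher-priority model.
    """
    evictions = []
    # Walk candidates from lowest to highest priority.
    for name, kind, size_mb in sorted(loaded, key=lambda m: PRIORITY[m[1]]):
        if free_mb >= needed_mb:
            break
        # Never evict a higher-priority model to load a lower-priority one.
        if PRIORITY[kind] > PRIORITY[requested_kind]:
            continue
        evictions.append(name)
        free_mb += size_mb
    return evictions if free_mb >= needed_mb else None


# The edge case from the commit: ASR and an LLM are loaded, TTS is requested.
loaded = [("asr-a", "asr", 3000), ("llm-a", "llm", 8000)]
print(choose_evictions(loaded, "tts", needed_mb=4000, free_mb=1000))
```

In this sketch the TTS request evicts `llm-a` and leaves `asr-a` in place, matching the example added to the spec. If the TTS model needed more VRAM than evicting the LLM could free, the function would return `None` rather than touch ASR.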