tlg d7a091df8c feat: VRAM manager with priority-based model eviction
Tracks GPU VRAM usage (16GB) and handles model loading/unloading with
priority-based eviction: LLM (lowest) -> TTS -> ASR (highest, protected).
Uses asyncio Lock for concurrency safety.
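
The eviction policy described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the names `VRAMManager`, `LoadedModel`, `PRIORITY` and `PROTECTED` are hypothetical, model sizes are supplied by the caller rather than measured on the GPU, and it assumes any unprotected model may be evicted, lowest priority first.

```python
import asyncio
from dataclasses import dataclass

# Hypothetical priority table: lower value is evicted first.
PRIORITY = {"llm": 0, "tts": 1, "asr": 2}
# Assumption: "protected" means the model kind is never evicted.
PROTECTED = {"asr"}


@dataclass
class LoadedModel:
    name: str
    kind: str
    size_gb: float


class VRAMManager:
    """Bookkeeping-only sketch; real loading/unloading would happen where noted."""

    def __init__(self, total_gb: float = 16.0) -> None:
        self.total_gb = total_gb
        self._loaded = {}  # model name -> LoadedModel
        self._lock = asyncio.Lock()

    @property
    def used_gb(self) -> float:
        return sum(m.size_gb for m in self._loaded.values())

    async def load(self, name: str, kind: str, size_gb: float) -> list:
        """Register a model, evicting others if needed. Returns evicted names."""
        async with self._lock:  # serialize concurrent load requests
            if name in self._loaded:
                return []
            free = self.total_gb - self.used_gb
            # Eviction candidates: unprotected models, lowest priority first.
            candidates = sorted(
                (m for m in self._loaded.values() if m.kind not in PROTECTED),
                key=lambda m: PRIORITY[m.kind],
            )
            # Refuse up front so a failed request evicts nothing.
            if free + sum(m.size_gb for m in candidates) < size_gb:
                raise MemoryError(f"cannot free {size_gb} GB for {name}")
            evicted = []
            for m in candidates:
                if free >= size_gb:
                    break
                del self._loaded[m.name]  # a real manager would unload here
                free += m.size_gb
                evicted.append(m.name)
            self._loaded[name] = LoadedModel(name, kind, size_gb)
            return evicted


async def main() -> None:
    manager = VRAMManager(total_gb=16.0)
    await manager.load("asr", "asr", 4.0)
    await manager.load("tts", "tts", 4.0)
    await manager.load("llm", "llm", 8.0)
    # VRAM is now full; a further request evicts the LLM first.
    print(await manager.load("tts2", "tts", 6.0))


if __name__ == "__main__":
    asyncio.run(main())
```

Checking feasibility before evicting anything keeps a rejected request from unloading models for no benefit; whether the actual manager behaves this way is not stated in the commit message.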

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 09:14:41 +02:00
2026-03-31 17:58:54 +02:00
Description
No description provided
144 KiB
Languages
Python 91.2%
Shell 6%
Dockerfile 2.8%