- Multi-stage build: the devel image compiles llama-cpp-python with CUDA,
  and the runtime image gets the compiled library via COPY
- chatterbox-tts installed with --no-deps to prevent the torch 2.6 downgrade
- librosa and diskcache added as explicit chatterbox/llama-cpp deps
- All imports verified with GPU passthrough
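
A minimal sketch of how the import check above could be scripted, not the
exact procedure used for this commit. The import names in MODULES are
assumptions inferred from the package names (for example, llama_cpp for
llama-cpp-python), and check_imports is a hypothetical helper.

```python
import importlib

# Import names are assumptions inferred from the package names in this commit.
MODULES = ["torch", "llama_cpp", "chatterbox", "librosa", "diskcache"]


def check_imports(names):
    """Map each module name to None on success, or to the error text on failure."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # a broken shared library can raise more than ImportError
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    for name, error in check_imports(MODULES).items():
        print(f"{name}: {'ok' if error is None else error}")
```

Run inside a container started with GPU access (for example
`docker run --gpus all`), this reports one line per module; adding a
`torch.cuda.is_available()` check would also confirm the device is visible.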

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>