# Hardware details search for AI model benchmark values
## Motivation
Benchmark values for speed are provided in tokens per second for
a certain AI model, but the hardware used in the benchmark is not
described; therefore, research is requested to find combinations
of AI model speed values and hardware details.
## AI model
The AI model is Qwen3 VL 30B A3B Instruct.
The model details are described at
https://huggingface.co/Qwen/Qwen3-VL-30B-A3B-Instruct
## Benchmark speed values
An example of provided benchmark speed values is at
https://artificialanalysis.ai/models/qwen3-vl-30b-a3b-instruct/providers

There, three relevant speed values are provided:
- Fireworks: 141.7 t/s
- Novita: 105.3 t/s
- Alibaba Cloud: 104.3 t/s
The fourth speed value is not relevant because it is for a quantized
version of the model (FP8, 8 bits per parameter).
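To make the distinction concrete, a minimal sketch of the weight-memory footprint at each precision, assuming the footprint is simply total parameters times bytes per parameter (activations, KV cache, and runtime overhead are ignored):

```python
# Rough weight-memory footprint of a 30B-parameter model.
# Assumption: footprint = total parameters x bytes per parameter;
# activations, KV cache, and runtime overhead are not counted.
TOTAL_PARAMS = 30e9  # Qwen3 VL 30B A3B has ~30B total parameters

def weight_footprint_gb(bits_per_param: float) -> float:
    """Weight memory in GB for a given parameter precision."""
    return TOTAL_PARAMS * bits_per_param / 8 / 1e9

print(weight_footprint_gb(16))  # original 16-bit weights: 60.0 GB
print(weight_footprint_gb(8))   # FP8 quantized weights: 30.0 GB
```

This is why the quantization matters for the task: the 8-bit version fits in roughly half the VRAM and streams half the bytes per token, so its speed values are not comparable to the 16-bit original.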
## Tasks
Your tasks are:
1. Find out which hardware was used for the three speed values of the example
   and which VRAM throughput in GB/s this hardware had.
2. If you were not able to solve 1., then search for at least two
   combinations of speed values and hardware details for the AI model
   with the original model parameter size of 16 bits per parameter.
   If you succeed, end here.
3. If you were not able to solve 1. and not able to solve 2., then
   search for at least two combinations of speed values and hardware
   details for the AI model with the quantized model parameter size
   of 8 bits per parameter.
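As a sanity check for candidate answers to task 1, the decode speed of a memory-bandwidth-bound MoE model can be upper-bounded by VRAM throughput divided by the bytes streamed per token (active parameters times bytes per parameter). A minimal sketch, assuming 3e9 active parameters (the "A3B" in the model name), 16 bits per parameter, and a hypothetical GPU with 3350 GB/s VRAM throughput (an illustrative figure, not a claim about the benchmarked hardware):

```python
# Rough upper bound on decode speed for a bandwidth-bound MoE model:
# each generated token must stream the active parameters from VRAM once.
# All three figures below are assumptions for illustration only.
ACTIVE_PARAMS = 3e9    # "A3B": ~3B active parameters per token
BITS_PER_PARAM = 16    # original (non-quantized) precision
VRAM_GBPS = 3350       # hypothetical VRAM throughput in GB/s

bytes_per_token = ACTIVE_PARAMS * BITS_PER_PARAM / 8   # 6.0e9 bytes
max_tokens_per_s = VRAM_GBPS * 1e9 / bytes_per_token

print(round(max_tokens_per_s, 1))  # -> 558.3
```

Measured speeds such as 141.7 t/s sitting well below this bound is expected (compute, communication, and batching overheads); the estimate only helps judge whether a reported hardware/speed combination is physically plausible.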