New: Agile-Environment.md, Search_Requests.md
# Hardware details search for AI model benchmark values

## Motivation

Benchmark values for speed are provided in tokens per second for
a certain AI model, but the hardware used in the benchmark is not
described; therefore, research is requested to find combinations
of AI model speed values and hardware details.

## AI model

The AI model is Qwen3 VL 30B A3B Instruct.
The model details are described at
https://huggingface.co/Qwen/Qwen3-VL-30B-A3B-Instruct

## Benchmark speed values

An example of provided benchmark speed values is at
https://artificialanalysis.ai/models/qwen3-vl-30b-a3b-instruct/providers

There, three relevant speed values are provided:

- Fireworks: 141.7 t/s
- Novita: 105.3 t/s
- Alibaba Cloud: 104.3 t/s

The fourth speed value is not relevant because it is for a quantized
version of the model (FP8, 8 bits per parameter).

## Tasks

Your tasks are:

1. Find out which hardware was used for the three speed values of the
   example and which VRAM throughput in GB/s this hardware had.
2. If you were not able to solve 1., then search for at least two
   combinations of speed values and hardware details for the AI model
   with the original model parameter size of 16 bits per parameter.
   If successful, stop here.
3. If you were not able to solve 1. or 2., then search for at least two
   combinations of speed values and hardware details for the AI model
   with the quantized model parameter size of 8 bits per parameter.
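
As a rough plausibility check for task 1 (an illustration, not part of the original request): single-stream token generation is often memory-bandwidth-bound, so a lower bound on the required VRAM throughput is roughly tokens/s × active parameters × bytes per parameter. The model name "A3B" suggests about 3 billion active parameters per token; that figure and the 2-byte (BF16) parameter width below are assumptions.

```python
def min_vram_bandwidth_gbs(tokens_per_s, active_params_billion=3.0, bytes_per_param=2.0):
    """Lower-bound VRAM throughput (GB/s) assuming decoding is
    memory-bandwidth-bound, i.e. each generated token reads all
    active weights once. Ignores KV-cache traffic, batching, and
    speculative decoding, so real requirements differ.
    The defaults (3B active params, 2 bytes/param for BF16) are
    assumptions inferred from the model name, not measured values."""
    return tokens_per_s * active_params_billion * bytes_per_param

# The three benchmark values from the example page:
for provider, tps in [("Fireworks", 141.7), ("Novita", 105.3), ("Alibaba Cloud", 104.3)]:
    print(f"{provider}: >= {min_vram_bandwidth_gbs(tps):.0f} GB/s")
```

Note that serving providers batch many requests, which amortizes weight reads across tokens, so measured provider throughput cannot be converted directly into per-GPU bandwidth; the bound is only indicative.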