DeepSeek V3-0324 is far from a minor upgrade:
- MMLU-Pro: 75.9 → 81.2 (+5.3)
- GPQA: 59.1 → 68.4 (+9.3)
- AIME: 39.6 → 59.4 (+19.8)
- LiveCodeBench: 39.2 → 49.2 (+10.0)
If this is the non-reasoning model, it might even be better than GPT-4.5.
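
For anyone who wants to double-check the deltas in the list, here is a quick sketch that recomputes them from the before/after scores. It also prints the relative gain, which is not in the original list and is only added for perspective. The names `scores`, `gains`, and `results` are made up for this example.

```python
# Scores copied from the list above: (previous V3, V3-0324).
scores = {
    "MMLU-Pro": (75.9, 81.2),
    "GPQA": (59.1, 68.4),
    "AIME": (39.6, 59.4),
    "LiveCodeBench": (39.2, 49.2),
}


def gains(old, new):
    """Return (absolute gain in points, relative gain in percent)."""
    absolute = round(new - old, 1)
    relative = round(100 * (new - old) / old, 1)
    return absolute, relative


# Hypothetical helper table: benchmark name -> (absolute, relative) gain.
results = {name: gains(old, new) for name, (old, new) in scores.items()}

for name, (absolute, relative) in results.items():
    print(f"{name}: +{absolute} points ({relative:+.1f}% relative)")
```

The absolute gains should match the numbers in parentheses in the list; the relative gains depend heavily on the starting score, so they are not comparable across benchmarks.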