As large language models (LLMs) continue to improve at coding, the benchmarks used to evaluate their performance are steadily becoming less useful. That's because, although many LLMs have similar high ...
Adam Tornhill is a programmer who combines degrees in engineering and psychology. Adam is the founder and CTO of CodeScene. Back in 2023, I discussed the risks and promises of emerging AI coding ...
Gemini 2.5 Deep Think scores competitive coding gold in ‘profound leap’ for abstract problem-solving
After a mathematics win in July, Gemini 2.5 Deep Think has now earned a gold-medal level performance in competitive coding. The International Collegiate Programming Contest (ICPC) is the “oldest, ...