As large language models (LLMs) continue to improve at coding, the benchmarks used to evaluate their performance are steadily becoming less useful. That's because though many LLMs have similar high ...
Nous Research's NousCoder-14B is an open-source coding model landing right in the Claude Code moment
B, an open-source AI coding model trained in four days on Nvidia B200 GPUs, publishing its full reinforcement-learning stack ...
Adam Tornhill is a programmer who combines degrees in engineering and psychology. Adam is the founder and CTO of CodeScene. Back in 2023, I discussed the risks and promises of emerging AI coding ...
Gemini 2.5 Deep Think scores competitive coding gold in ‘profound leap’ for abstract problem-solving
After a mathematics win in July, Gemini 2.5 Deep Think has now earned a gold-medal level performance in competitive coding. The International Collegiate Programming Contest (ICPC) is the “oldest, ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results