As large language models (LLMs) continue to improve at coding, the benchmarks used to evaluate their performance are steadily becoming less useful. That's because though many LLMs have similar high ...
B, an open-source AI coding model trained in four days on Nvidia B200 GPUs, publishing its full reinforcement-learning stack ...
Adam Tornhill is a programmer who combines degrees in engineering and psychology. Adam is the founder and CTO of CodeScene. Back in 2023, I discussed the risks and promises of emerging AI coding ...
After a mathematics win in July, Gemini 2.5 Deep Think has now earned a gold-medal level performance in competitive coding. The International Collegiate Programming Contest (ICPC) is the “oldest, ...