Large language models have shown promise across specialized domains, but their performance limits in disaster risk reduction remain poorly understood. We conduct a version-specific evaluation of ...