Abstract: This paper presents our Cityscape Aerial image Dataset for Object deTection (CADOT), a new benchmark dataset for object detection, based on raw aerial images sourced from the IGN website and ...
See an AMD laptop with a Ryzen AI chip and 128 GB of memory run GPT OSS at 40 tokens a second, for fast offline work and tighter ...
Overview: Python and SQL form the core data science foundation, enabling fast analysis, smooth cloud integration, and ...
OpenWorldSAM pushes the boundaries of SAM2 by enabling open-vocabulary segmentation with flexible language prompts. [2026-1-4]: Demo release: we’ve added simple demos to run OpenWorldSAM on images ...
(a) SA-Radar (i.e., Ctrl-RS) enables controllable and realistic radar simulation by conditioning on customizable radar attributes. It supports flexible scene editing such as attribute modification, ...
Abstract: In knowledge-based visual question answering (KB-VQA), the answer can be naturally represented by translating the visual object embedding referred to by the question according to the cross-modality ...