Abstract: Vision-Language (VL) alignment across image and text modalities is a challenging task due to the inherent semantic ambiguity of data with multiple possible meanings. Existing methods ...
Abstract: Single-frame infrared small target (SIRST) detection is crucial for both military and civilian applications, but remains challenging due to low resolution and small target sizes. Most ...