Abstract: As a challenging task in visual information retrieval, open-ended long-form video question answering automatically generates a natural-language answer from the referenced video content ...