arxiv preprint – CinePile: A Long Video Question Answering Dataset and Benchmark

In this episode, we discuss CinePile: A Long Video Question Answering Dataset and Benchmark by Ruchit Rawal, Khalid Saifullah, Ronen Basri, David Jacobs, Gowthami Somepalli, Tom Goldstein. CinePile is a new dataset and benchmark designed for authentic long-form video understanding, addressing the limitations of current datasets. It comprises 305,000 multiple-choice questions (MCQs) spanning various visual and multimodal aspects. The evaluation of recent state-of-the-art video-centric language models (LLMs) shows a significant gap between machine and human performance in these complex tasks.


Posted

in

by

Tags: