Database

LLM

Medical

Human Multimodal

SceneWalk Dataset

from SALOVA: Segment-Augmented Long Video Assistant for Targeted Retrieval and Routing in Long-Form Video Analysis

CVPR 2025

from SPARK: Multi-Vision Sensor Perception and Reasoning Benchmark for Large-scale Vision-Language Models

arXiv 2024

from TroL: Traversal of Layers for Large Language and Vision Models

EMNLP 2024

from Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models

NeurIPS 2024
