Chen, Zhao, Wang, Han, Patwardhan & Cohan (2026). Introduces SciMDR, a training dataset of 300K QA pairs built from 20K scientific papers using a two-stage synthesize-and-reground pipeline. Models fine-tuned on SciMDR show significant gains on complex document-level scientific reasoning benchmarks, making the dataset a major infrastructure contribution for multimodal science AI.
Comments on "SciMDR: Benchmarking and Advancing Scientific Multimodal Document Reasoning"