Visual chain of thought: Bridging logical gaps with multimodal infillings D Rose, V Himakunthala, A Ouyang, R He, A Mei, Y Lu, M Saxon, ... arXiv preprint arXiv:2305.02317, 2023 | 5 | 2023 |
Visual chain of thought: Bridging logical gaps with multimodal infillings D Rose, V Himakunthala, A Ouyang, R He, A Mei, Y Lu, M Saxon, Chinmay Sonar, Diba Mirza, and William Yang Wang 3, 2023 | 5 | 2023 |
Let’s think frame by frame with VIP: A video infilling and prediction dataset for evaluating video chain-of-thought V Himakunthala, A Ouyang, D Rose, R He, A Mei, Y Lu, C Sonar, ... Proceedings of the 2023 Conference on Empirical Methods in Natural Language …, 2023 | 4 | 2023 |
Let's Think Frame by Frame: Evaluating Video Chain of Thought with Video Infilling and Prediction V Himakunthala, A Ouyang, D Rose, R He, A Mei, Y Lu, C Sonar, ... arXiv preprint arXiv:2305.13903, 2023 | 1 | 2023 |