Lima: Less is more for alignment. C Zhou, P Liu, P Xu, S Iyer, J Sun, Y Mao, X Ma, A Efrat, P Yu, L Yu, et al. Advances in Neural Information Processing Systems 36, 2024. Cited by 671.
Videoclip: Contrastive pre-training for zero-shot video-text understanding. H Xu, G Ghosh, PY Huang, D Okhonko, A Aghajanyan, F Metze, et al. arXiv preprint arXiv:2109.14084, 2021. Cited by 501.
Pre-training via paraphrasing. M Lewis, M Ghazvininejad, G Ghosh, A Aghajanyan, S Wang, et al. Advances in Neural Information Processing Systems 33, 18470-18481, 2020. Cited by 159.
Cm3: A causal masked multimodal model of the internet. A Aghajanyan, B Huang, C Ross, V Karpukhin, H Xu, N Goyal, D Okhonko, et al. arXiv preprint arXiv:2201.07520, 2022. Cited by 144.
Vlm: Task-agnostic video-language model pre-training for video understanding. H Xu, G Ghosh, PY Huang, P Arora, M Aminzadeh, C Feichtenhofer, et al. arXiv preprint arXiv:2105.09996, 2021. Cited by 133.
Exploring deep multimodal fusion of text and photo for hate speech classification. F Yang, X Peng, G Ghosh, R Shilon, H Ma, E Moore, G Predovic. Proceedings of the Third Workshop on Abusive Language Online, 11-18, 2019. Cited by 95.
Scaling autoregressive multi-modal models: Pretraining and instruction tuning. L Yu, B Shi, R Pasunuru, B Muller, O Golovneva, T Wang, A Babu, B Tang, et al. arXiv preprint arXiv:2309.02591, 2023. Cited by 90.
Demystifying clip data. H Xu, S Xie, XE Tan, PY Huang, R Howes, V Sharma, SW Li, G Ghosh, et al. arXiv preprint arXiv:2309.16671, 2023. Cited by 83.
Htlm: Hyper-text pre-training and prompting of language models. A Aghajanyan, D Okhonko, M Lewis, M Joshi, H Xu, G Ghosh, et al. arXiv preprint arXiv:2107.06955, 2021. Cited by 70.
Multi-task retrieval for knowledge-intensive tasks. J Maillard, V Karpukhin, F Petroni, W Yih, B Oğuz, V Stoyanov, G Ghosh. arXiv preprint arXiv:2101.00117, 2021. Cited by 59.
Mavil: Masked audio-video learners. PY Huang, V Sharma, H Xu, C Ryali, Y Li, SW Li, G Ghosh, J Malik, et al. Advances in Neural Information Processing Systems 36, 2024. Cited by 52.
Chameleon: Mixed-modal early-fusion foundation models. C Team. arXiv preprint arXiv:2405.09818, 2024. Cited by 35.
Optimizing query evaluations using reinforcement learning for web search. C Rosset, D Jose, G Ghosh, B Mitra, S Tiwary. The 41st International ACM SIGIR Conference on Research & Development in …, 2018. Cited by 33.
Cit: Curation in training for effective vision-language data. H Xu, S Xie, PY Huang, L Yu, R Howes, G Ghosh, L Zettlemoyer, et al. Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023. Cited by 21.
Alert: Adapting language models to reasoning tasks. P Yu, T Wang, O Golovneva, B AlKhamissi, S Verma, Z Jin, G Ghosh, et al. arXiv preprint arXiv:2212.08286, 2022. Cited by 21*.
Relaxed filter set. Y Wang, TK Dohzen, D Qi, R Majumder, G Ghosh, NR Wijaya. US Patent App. 12/328,450, 2010. Cited by 5.
Text Quality-Based Pruning for Efficient Training of Language Models. V Sharma, K Padthe, N Ardalani, K Tirumala, R Howes, H Xu, PY Huang, et al. arXiv preprint arXiv:2405.01582, 2024. Cited by 3.
MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts. XV Lin, A Shrivastava, L Luo, S Iyer, M Lewis, G Ghosh, L Zettlemoyer, et al. arXiv preprint arXiv:2407.21770, 2024. Cited by 2.
User-SERP Interaction Prediction through Deep Multi-task Learning. W Jiang, D Jose, G Ghosh. 2018. Cited by 1.