Vedanuj Goswami
Research Engineer, Meta AI
12-in-1: Multi-task vision and language representation learning
J Lu*, V Goswami*, M Rohrbach, D Parikh, S Lee
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2020
The hateful memes challenge: Detecting hate speech in multimodal memes
D Kiela, H Firooz, A Mohan, V Goswami, A Singh, P Ringshia, ...
Advances in neural information processing systems 33, 2611-2624, 2020
MMF: A multimodal framework for vision and language research
A Singh, V Goswami, V Natarajan, Y Jiang, X Chen, M Shah, M Rohrbach, ...
URL: https://github.com/facebookresearch/mmf
Flava: A foundational language and vision alignment model
A Singh*, R Hu*, V Goswami*, G Couairon, W Galuba, M Rohrbach, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Llama 2: Open foundation and fine-tuned chat models
H Touvron, L Martin, K Stone, P Albert, A Almahairi, Y Babaei, ...
arXiv preprint arXiv:2307.09288, 2023
No language left behind: Scaling human-centered machine translation
NLLB, MR Costa-jussà, J Cross, O Çelebi, M Elbayad, K Heafield, ...
arXiv preprint arXiv:2207.04672, 2022
Only time can tell: Discovering temporal data for temporal modeling
L Sevilla-Lara, S Zha, Z Yan, V Goswami, M Feiszli, L Torresani
Proceedings of the IEEE/CVF winter conference on applications of computer …, 2021
Creative sketch generation
S Ge, V Goswami, CL Zitnick, D Parikh
arXiv preprint arXiv:2011.10039, 2020
Are we pretraining it right? Digging deeper into visio-linguistic pretraining
A Singh, V Goswami, D Parikh
arXiv preprint arXiv:2004.08744, 2020
The hateful memes challenge: Competition report
D Kiela, H Firooz, A Mohan, V Goswami, A Singh, CA Fitzpatrick, P Bull, ...
NeurIPS 2020 Competition and Demonstration Track, 344-360, 2021
Human-adversarial visual question answering
S Sheng, A Singh, V Goswami, J Magana, T Thrush, W Galuba, D Parikh, ...
Advances in Neural Information Processing Systems 34, 20346-20359, 2021
Movie: Revisiting modulated convolutions for visual counting and beyond
DK Nguyen, V Goswami, X Chen
arXiv preprint arXiv:2004.11883, 2020
Tricks for training sparse translation models
D Dua, S Bhosale, V Goswami, J Cross, M Lewis, A Fan
arXiv preprint arXiv:2110.08246, 2021
Knowledge extraction and annotation for cross-domain textual case-based reasoning in biologically inspired design
S Rugaber, S Bhati, V Goswami, E Spiliopoulou, S Azad, S Koushik, ...
Case-Based Reasoning Research and Development: 24th International Conference …, 2016
Unsupervised image-to-video clothing transfer
A Pumarola, V Goswami, F Vicente, F De la Torre, F Moreno-Noguer
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2019
SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations
PA Duquenne, H Gong, N Dong, J Du, A Lee, V Goswami, C Wang, J Pino, ...
arXiv preprint arXiv:2211.04508, 2022
Building recommender systems with PyTorch
D Mudigere, M Naumov, J Spisak, G Chauhan, N Kokhlikyan, A Singh, ...
Proceedings of the 26th ACM SIGKDD International Conference on Knowledge …, 2020
Causes and cures for interference in multilingual translation
U Shaham, M Elbayad, V Goswami, O Levy, S Bhosale
arXiv preprint arXiv:2212.07530, 2022
MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation
M Anwar, B Shi, V Goswami, WN Hsu, J Pino, C Wang
arXiv preprint arXiv:2303.00628, 2023
Multilingual Speech-to-Speech Translation into Multiple Target Languages
H Gong, N Dong, S Popuri, V Goswami, A Lee, J Pino
arXiv preprint arXiv:2307.08655, 2023