Andrew Rouditchenko
PhD Student at MIT CSAIL
Verified email at mit.edu - Homepage
Title
Cited by
Year
The sound of pixels
H Zhao, C Gan, A Rouditchenko, C Vondrick, J McDermott, A Torralba
Proceedings of the European conference on computer vision (ECCV), 570-586, 2018
Cited by: 615, Year: 2018
Everything at once - multi-modal fusion transformer for video retrieval
N Shvetsova, B Chen, A Rouditchenko, S Thomas, B Kingsbury, RS Feris, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by: 156, Year: 2022
AVLnet: Learning audio-visual language representations from instructional videos
A Rouditchenko, A Boggust, D Harwath, B Chen, D Joshi, S Thomas, ...
Proc. Interspeech 2021, 1584-1588, 2021
Cited by: 151, Year: 2021
Self-supervised audio-visual co-segmentation
A Rouditchenko, H Zhao, C Gan, J McDermott, A Torralba
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Cited by: 134, Year: 2019
Contrastive audio-visual masked autoencoder
Y Gong, A Rouditchenko, AH Liu, D Harwath, L Karlinsky, H Kuehne, ...
arXiv preprint arXiv:2210.07839, 2022
Cited by: 133, Year: 2022
Multimodal clustering networks for self-supervised learning from unlabeled videos
B Chen, A Rouditchenko, K Duarte, H Kuehne, S Thomas, A Boggust, ...
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021
Cited by: 99, Year: 2021
Cross-modal discrete representation learning
AH Liu, SY Jin, CIJ Lai, A Rouditchenko, A Oliva, J Glass
arXiv preprint arXiv:2106.05438, 2021
Cited by: 43, Year: 2021
CMKD: CNN/transformer-based cross-model knowledge distillation for audio classification
Y Gong, S Khurana, A Rouditchenko, J Glass
arXiv preprint arXiv:2203.06760, 2022
Cited by: 35, Year: 2022
UAVM: Towards unifying audio and visual models
Y Gong, AH Liu, A Rouditchenko, J Glass
IEEE Signal Processing Letters 29, 2437-2441, 2022
Cited by: 22, Year: 2022
Comparison of multilingual self-supervised and weakly-supervised speech pre-training for adaptation to unseen languages
A Rouditchenko, S Khurana, S Thomas, R Feris, L Karlinsky, H Kuehne, ...
arXiv preprint arXiv:2305.12606, 2023
Cited by: 15, Year: 2023
Label-efficient audio classification through multitask learning and self-supervision
T Lee, T Gong, S Padhy, A Rouditchenko, A Ndirango
arXiv preprint arXiv:1910.12587, 2019
Cited by: 7, Year: 2019
C2KD: Cross-lingual cross-modal knowledge distillation for multilingual text-video retrieval
A Rouditchenko, YS Chuang, N Shvetsova, S Thomas, R Feris, ...
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 6, Year: 2023
Cascaded Multilingual Audio-Visual Learning from Videos
A Rouditchenko, A Boggust, D Harwath, S Thomas, H Kuehne, B Chen, ...
Proc. Interspeech 2021, 3006-3010, 2021
Cited by: 6, Year: 2021
Spoken ObjectNet: A Bias-Controlled Spoken Caption Dataset
I Palmer, A Rouditchenko, A Barbu, B Katz, J Glass
Proc. Interspeech 2021, 3650-3654, 2021
Cited by: 5, Year: 2021
What When and Where? Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions
B Chen, N Shvetsova, A Rouditchenko, D Kondermann, S Thomas, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Cited by: 4, Year: 2024
AV-CPL: Continuous pseudo-labeling for audio-visual speech recognition
A Rouditchenko, R Collobert, T Likhomanenko
arXiv preprint arXiv:2309.17395, 2023
Cited by: 3, Year: 2023
Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation
A Rouditchenko, Y Gong, S Thomas, L Karlinsky, H Kuehne, R Feris, ...
arXiv preprint arXiv:2406.10082, 2024
Cited by: 1, Year: 2024
Learning Audio-Video Language Representations
A Rouditchenko
Massachusetts Institute of Technology, 2021
Year: 2021