Soham De
DeepMind
Verified email at google.com - Homepage
Title · Cited by · Year
High-Performance Large-Scale Image Recognition Without Normalization
A Brock, S De, SL Smith, K Simonyan
International Conference on Machine Learning, 2021
Cited by 601
Gemma: Open Models Based on Gemini Research and Technology
G Team, T Mesnard, C Hardin, R Dadashi, S Bhupatiraju, S Pathak, ...
arXiv preprint arXiv:2403.08295, 2024
Cited by 425
Adversarial robustness through local linearization
C Qin, J Martens, S Gowal, D Krishnan, K Dvijotham, A Fawzi, S De, ...
Advances in Neural Information Processing Systems, 13847-13856, 2019
Cited by 345
Training quantized nets: A deeper understanding
H Li*, S De*, Z Xu, C Studer, H Samet, T Goldstein
Advances in Neural Information Processing Systems, 5813-5823, 2017
Cited by 245
On the Origin of Implicit Regularization in Stochastic Gradient Descent
SL Smith, B Dherin, DGT Barrett, S De
International Conference on Learning Representations, 2021
Cited by 200
Unlocking High-Accuracy Differentially Private Image Classification through Scale
S De, L Berrada, J Hayes, SL Smith, B Balle
ICML Workshop on Theory and Practice of Differential Privacy, 2022
Cited by 191*
Resurrecting Recurrent Neural Networks for Long Sequences
A Orvieto, SL Smith, A Gu, A Fernando, C Gulcehre, R Pascanu, S De
arXiv preprint arXiv:2303.06349, 2023
Cited by 189
Batch normalization biases residual blocks towards the identity function in deep networks
S De, S Smith
Advances in Neural Information Processing Systems 33, 2020
Cited by 172*
The loosening of American culture over 200 years is associated with a creativity–order trade-off
JC Jackson, M Gelfand, S De, A Fox
Nature human behaviour 3 (3), 244-250, 2019
Cited by 166*
Convergence guarantees for RMSProp and ADAM in non-convex optimization and an empirical comparison to Nesterov acceleration
S De, A Mukherjee, E Ullah
ICML Workshop on Modern Trends in Nonconvex Optimization for Machine Learning, 2018
Cited by 157*
Automated inference with adaptive batches
S De, A Yadav, D Jacobs, T Goldstein
Artificial Intelligence and Statistics, 1504-1513, 2017
Cited by 154*
Characterizing signal propagation to close the performance gap in unnormalized ResNets
A Brock, S De, SL Smith
International Conference on Learning Representations, 2021
Cited by 134
BYOL works even without batch statistics
PH Richemond, JB Grill, F Altché, C Tallec, F Strub, A Brock, S Smith, ...
NeurIPS Workshop on Self-Supervised Learning: Theory and Practice, 2020
Cited by 130*
On the Generalization Benefit of Noise in Stochastic Gradient Descent
S Smith, E Elsen, S De
International Conference on Machine Learning, 9058-9067, 2020
Cited by 122*
The impact of neural network overparameterization on gradient confusion and stochastic gradient descent
KA Sankararaman*, S De*, Z Xu, WR Huang, T Goldstein
International Conference on Machine Learning, 8469-8479, 2020
Cited by 113
Understanding norm change: An evolutionary game-theoretic approach
S De, DS Nau, MJ Gelfand
Proceedings of the 16th Conference on Autonomous Agents and MultiAgent …, 2017
Cited by 76*
Layer-specific adaptive learning rates for deep networks
B Singh, S De, Y Zhang, T Goldstein, G Taylor
2015 IEEE 14th International Conference on Machine Learning and Applications …, 2015
Cited by 69
Efficient distributed SGD with variance reduction
S De, T Goldstein
2016 IEEE International Conference on Data Mining (ICDM), 2016
Cited by 55*
Differentially Private Diffusion Models Generate Useful Synthetic Images
S Ghalebikesabi, L Berrada, S Gowal, I Ktena, R Stanforth, J Hayes, S De, ...
arXiv preprint arXiv:2302.13861, 2023
Cited by 53
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
S De, SL Smith, A Fernando, A Botev, G Cristian-Muraru, A Gu, R Haroun, ...
arXiv preprint arXiv:2402.19427, 2024
Cited by 50*
Articles 1–20