Prototypical cross-domain self-supervised learning for few-shot unsupervised domain adaptation X Yue, Z Zheng, S Zhang, Y Gao, T Darrell, K Keutzer, AS Vincentelli CVPR 2021, 2021 | 173 | 2021 |
Prompt vision transformer for domain generalization Z Zheng, X Yue, K Wang, Y You arXiv preprint arXiv:2208.08914, 2022 | 44 | 2022 |
Preventing Zero-Shot Transfer Degradation in Continual Learning of Vision-Language Models Z Zheng, M Ma, K Wang, Z Qin, X Yue, Y You ICCV 2023, 2023 | 40 | 2023 |
To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis F Xue, Y Fu, W Zhou, Z Zheng, Y You NeurIPS 2023, 2023 | 38 | 2023 |
OpenMoE: An early effort on open mixture-of-experts language models F Xue, Z Zheng, Y Fu, J Ni, Z Zheng, W Zhou, Y You ICML 2024, 2024 | 29 | 2024 |
Cross-token modeling with conditional computation Y Lou, F Xue, Z Zheng, Y You arXiv preprint arXiv:2109.02008, 2021 | 26* | 2021 |
InfoBatch: Lossless Training Speed Up by Unbiased Dynamic Data Pruning Z Qin, K Wang, Z Zheng, J Gu, X Peng, D Zhou, Y You ICLR 2024, 2023 | 22 | 2023 |
Instruction in the wild: A user-based instruction dataset J Ni, F Xue, Y Deng, J Phang, K Jain, MH Shah, Z Zheng, Y You GitHub repository, 2023 | 22* | 2023 |
Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline Z Zheng, X Ren, F Xue, Y Luo, X Jiang, Y You NeurIPS 2023, 2023 | 21 | 2023 |
Multi-source few-shot domain adaptation X Yue, Z Zheng, HP Das, K Keutzer, AS Vincentelli arXiv preprint arXiv:2109.12391, 2021 | 14 | 2021 |
Scene-aware learning network for radar object detection Z Zheng, X Yue, K Keutzer, A Sangiovanni Vincentelli Proceedings of the 2021 International Conference on Multimedia Retrieval …, 2021 | 10 | 2021 |
A Study on Transformer Configuration and Training Objective F Xue, J Chen, A Sun, X Ren, Z Zheng, X He, Y Chen, X Jiang, Y You ICML 2023, 2023 | 9* | 2023 |
CAME: Confidence-guided Adaptive Memory Efficient Optimization Y Luo, X Ren, Z Zheng, Z Jiang, X Jiang, Y You ACL 2023, 2023 | 6 | 2023 |
CowClip: Reducing CTR Prediction Model Training Time from 12 hours to 10 minutes on 1 GPU Z Zheng, P Xu, X Zou, D Tang, Z Li, C Xi, P Wu, L Zou, Y Zhu, M Chen, ... AAAI 2023, 2023 | 4 | 2023 |
Open-Sora: Democratizing Efficient Video Production for All Z Zheng, X Peng, Y You | 3 | 2024 |
DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers X Zhao, S Cheng, Z Zheng, Z Yang, Z Liu, Y You arXiv preprint arXiv:2403.10266, 2024 | 2 | 2024 |
How Does the Textual Information Affect the Retrieval of Multimodal In-Context Learning? Y Luo, Z Zheng, Z Zhu, Y You arXiv preprint arXiv:2404.12866, 2024 | 1 | 2024 |
Dataset Growth Z Qin, Z Xu, Y Zhou, Z Zheng, Z Cheng, H Tang, L Shang, B Sun, X Peng, ... ECCV 2024, 2024 | | 2024 |
Helen: Optimizing CTR Prediction Models with Frequency-wise Hessian Eigenvalue Regularization Z Zhu, Y Liu, Z Zheng, H Guo, Y You WebConf 2024, 3485-3496, 2024 | | 2024 |
FSL-QuickBoost: Minimal-Cost Ensemble for Few-Shot Learning Y Bai, B Cai, YK Tan, Z Zheng, S Chen, T Chen MM 2024, 2024 | | 2024 |