Xiaoteng Ma(马骁腾)
Center for Intelligent and Networked Systems, Department of Automation, Tsinghua University, Beijing, China
Verified email at mails.tsinghua.edu.cn
Title
Cited by
Year
DSAC: Distributional Soft Actor Critic for Risk-Sensitive Reinforcement Learning
X Ma, L Xia, Z Zhou, J Yang, Q Zhao
Reinforcement Learning for Real Life Workshop at ICML 2019
Cited by: 91*
Believe what you see: Implicit constraint approach for offline multi-agent reinforcement learning
Y Yang, X Ma, C Li, Z Zheng, Q Zhang, G Huang, J Yang, Q Zhao
Advances in Neural Information Processing Systems 34, 10299-10312, 2021
Cited by: 65; Year: 2021
Mildly conservative Q-learning for offline reinforcement learning
J Lyu, X Ma, X Li, Z Lu
Advances in Neural Information Processing Systems 35, 1711-1724, 2022
Cited by: 63; Year: 2022
Air-combat strategy using deep Q-learning
X Ma, L Xia, Q Zhao
2018 Chinese Automation Congress (CAC), 3952-3957, 2018
Cited by: 48; Year: 2018
RORL: Robust offline reinforcement learning via conservative smoothing
R Yang, C Bai, X Ma, Z Wang, C Zhang, L Han
Advances in Neural Information Processing Systems 35, 23851-23866, 2022
Cited by: 42; Year: 2022
Efficient continuous control with double actors and regularized critics
J Lyu, X Ma, J Yan, X Li
Proceedings of the AAAI Conference on Artificial Intelligence 36 (7), 7655-7663, 2022
Cited by: 39; Year: 2022
Offline reinforcement learning with value-based episodic memory
X Ma, Y Yang, H Hu, Q Liu, J Yang, C Zhang, Q Zhao, B Liang
arXiv preprint arXiv:2110.09796, 2021
Cited by: 34; Year: 2021
Distributionally robust offline reinforcement learning with linear function approximation
X Ma, Z Liang, J Blanchet, M Liu, L Xia, J Zhang, Q Zhao, Z Zhou
arXiv preprint arXiv:2209.06620, 2022
Cited by: 22; Year: 2022
Wasserstein distance guided adversarial imitation learning with reward shape exploration
M Zhang, Y Wang, X Ma, L Xia, J Yang, Z Li, X Li
2020 IEEE 9th Data Driven Control and Learning Systems Conference (DDCLS …, 2020
Cited by: 18; Year: 2020
Exploit reward shifting in value-based deep-RL: Optimistic curiosity-based exploration and conservative exploitation via linear reward shaping
H Sun, L Han, R Yang, X Ma, J Guo, B Zhou
Advances in Neural Information Processing Systems 35, 37719-37734, 2022
Cited by: 17; Year: 2022
Modeling the interaction between agents in cooperative multi-agent reinforcement learning
X Ma, Y Yang, C Li, Y Lu, Q Zhao, Y Jun
arXiv preprint arXiv:2102.06042, 2021
Cited by: 17; Year: 2021
Fairness control of traffic light via deep reinforcement learning
C Li, X Ma, L Xia, Q Zhao, J Yang
2020 IEEE 16th International Conference on Automation Science and …, 2020
Cited by: 17; Year: 2020
Attendance and security system based on building video surveillance
K Sun, Q Zhao, J Zou, X Ma
Advancements in Smart City and Intelligent Building: Proceedings of the …, 2019
Cited by: 11; Year: 2019
What is essential for unseen goal generalization of offline goal-conditioned RL?
R Yang, L Yong, X Ma, H Hu, C Zhang, T Zhang
International Conference on Machine Learning, 39543-39571, 2023
Cited by: 10; Year: 2023
Uncertainty-driven trajectory truncation for model-based offline reinforcement learning
J Zhang, J Lyu, X Ma, J Yan, J Yang, L Wan, X Li
arXiv preprint arXiv:2304.04660, 2023
Cited by: 9; Year: 2023
SOAC: The soft option actor-critic architecture
C Li, X Ma, C Zhang, J Yang, L Xia, Q Zhao
arXiv preprint arXiv:2006.14363, 2020
Cited by: 9; Year: 2020
Bi-level proximal policy optimization for stochastic coordination of EV charging load with uncertain wind power
T Long, XT Ma, QS Jia
2019 IEEE Conference on Control Technology and Applications (CCTA), 302-307, 2019
Cited by: 9; Year: 2019
MPSN: Motion-aware Pseudo-Siamese Network for indoor video head detection in buildings
K Sun, X Ma, P Liu, Q Zhao
Building and Environment 222, 109354, 2022
Cited by: 8; Year: 2022
Reinforcement learning for fluctuation reduction of wind power with energy storage
Z Yang, X Ma, L Xia, Q Zhao, X Guan
Results in Control and Optimization 4, 100030, 2021
Cited by: 8; Year: 2021
Average-reward reinforcement learning with trust region methods
X Ma, X Tang, L Xia, J Yang, Q Zhao
arXiv preprint arXiv:2106.03442, 2021
Cited by: 8; Year: 2021