Xiaotian Han
TikTok (Bytedance)
Verified email at bytedance.com
Title
Cited by
Year
Real-time micro-scale temperature imaging at low cost based on fluorescent intensity ratio
J Xiong, M Zhao, X Han, Z Cao, X Wei, Y Chen, C Duan, M Yin
Scientific Reports 7 (1), 41311, 2017
Cited by 34, 2017
Image scene graph generation (sgg) benchmark
X Han, J Yang, H Hu, L Zhang, J Gao, P Zhang
arXiv preprint arXiv:2107.12604, 2021
Cited by 28, 2021
Mmptrack: Large-scale densely annotated multi-camera multiple people tracking benchmark
X Han, Q You, C Wang, Z Zhang, P Chu, H Hu, J Wang, Z Liu
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer …, 2023
Cited by 21*, 2023
Exploring the reasoning abilities of multimodal large language models (mllms): A comprehensive survey on emerging trends in multimodal reasoning
Y Wang, W Chen, X Han, X Lin, H Zhao, Y Liu, B Zhai, J Yuan, Q You, ...
arXiv preprint arXiv:2401.06805, 2024
Cited by 4, 2024
ViTAR: Vision Transformer with Any Resolution
Q Fan, Q You, X Han, Y Liu, Y Tao, H Huang, R He, H Yang
arXiv preprint arXiv:2403.18361, 2024
2024
InfiMM-HD: A Leap Forward in High-Resolution Multimodal Understanding
H Liu, Q You, X Han, Y Wang, B Zhai, Y Liu, Y Tao, H Huang, R He, ...
arXiv preprint arXiv:2403.01487, 2024
2024
COCO is "ALL" You Need for Visual Instruction Fine-tuning
X Han, Y Wang, B Zhai, Q You, H Yang
arXiv preprint arXiv:2401.08968, 2024
2024
CORE-MM: Complex Open-Ended Reasoning Evaluation For Multi-Modal Large Language Models
X Han, Q You, Y Liu, W Chen, H Zheng, K Mrini, X Lin, Y Wang, B Zhai, ...
arXiv preprint arXiv:2311.11567, 2023
2023
InfiMM-Eval: Complex Open-Ended Reasoning Evaluation For Multi-Modal Large Language Models
X Han, Q You, Y Liu, W Chen, H Zheng, K Mrini, X Lin, Y Wang, B Zhai, ...
arXiv e-prints, arXiv: 2311.11567, 2023
2023