My research focuses primarily on the intersection of vision and language. Currently, I am exploring tasks that combine vision, language, and robotics, such as language-driven video understanding, open-vocabulary image/video understanding, and interactive robots. Previously, my work centered on hand detection, hand pose estimation, face recognition, and person re-identification.

Students interested in vision-and-language research and intelligent robots are welcome to join us!

You can contact me via e-mail: yangshuo@smbu.edu.cn; yangshuo129@gmail.com.

🔥 News

📝 Publications

  ($\ast$ denotes equal contribution, $\dagger$ denotes corresponding author)

IJCAI 2025

METOR: A Unified Framework for Mutual Enhancement of Objects and Relationships in Open-vocabulary Video Visual Relationship Detection

  • Yongqi Wang, Xinxiao Wu, Shuo Yang$\dagger$
  • The 34th International Joint Conference on Artificial Intelligence (IJCAI), 2025.

    [Paper] [BibTex] [Code]

TPAMI 2025

End-to-end Open-vocabulary Video Visual Relationship Detection using Multi-modal Prompting

  • Yongqi Wang, Xinxiao Wu, Shuo Yang, Jiebo Luo
  • IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025.

    [Paper] [BibTex] [Code]

AAAI 2025

Video Summarization using Denoising Diffusion Probabilistic Model

  • Zirui Shang, Yubo Zhu, Hongxi Li, Shuo Yang, Xinxiao Wu
  • The 39th Annual AAAI Conference on Artificial Intelligence (AAAI), 2025.

    [Paper] [BibTex]

IEEE TMM 2024

Dynamic Pathway for Query-Aware Feature Learning in Language-Driven Action Localization

  • Shuo Yang, Xinxiao Wu, Zirui Shang, Jiebo Luo
  • IEEE Transactions on Multimedia (TMM), 2024.

    [Paper] [BibTex]

AAAI 2024

Multi-Modal Prompting for Open-Vocabulary Video Visual Relationship Detection

  • Shuo Yang$\ast$, Yongqi Wang$\ast$, Xiaofeng Ji, Xinxiao Wu
  • The 38th Annual AAAI Conference on Artificial Intelligence (AAAI), 2024.

    [Paper] [BibTex] [Code]

ACM MM 2023

Probability Distribution Based Frame-supervised Language-driven Action Localization

  • Shuo Yang, Zirui Shang, Xinxiao Wu
  • The 31st ACM International Conference on Multimedia (ACM MM), 2023.

    [Paper] [BibTex] [Code]

IJCAI 2022

Entity-aware and Motion-aware Transformers for Language-driven Action Localization

  • Shuo Yang, Xinxiao Wu
  • The 31st International Joint Conference on Artificial Intelligence (IJCAI), 2022.

    [Paper] [BibTex] [Code]

CVPR 2020

High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification

  • Guan’an Wang$\ast$, Shuo Yang$\ast$, Huanyu Liu, Zhicheng Wang, Yang Yang, Shuliang Wang, Gang Yu, Jian Sun
  • The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020.

    [Paper] [BibTex] [Code]

TIP 2018

Joint Hand Detection and Rotation Estimation Using CNN

  • Xiaoming Deng, Yinda Zhang, Shuo Yang, Ping Tan, Liang Chang, Ye Yuan, Hongan Wang
  • IEEE Transactions on Image Processing (TIP), 27(4):1888-1900, 2018.

    [Paper] [Project Page]

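  • Journal of Computer Research and Development (计算机研究与发展) 2025, Open-Domain Multi-Label Action Recognition Guided by Large Language Model Knowledge, Rongjiang Zhu, Yuheng Shi, Shuo Yang, Ziyi Wang, Xinxiao Wu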
  • SPL 2024, Source-free Image-text Matching via Uncertainty-aware Learning, Mengxiao Tian, Shuo Yang$\dagger$, Xinxiao Wu, Yunde Jia
  • PRCV 2024, Efficient Language-Driven Action Localization by Feature Aggregation and Prediction Adjustment, Zirui Shang, Shuo Yang$\dagger$, Xinxiao Wu
  • arXiv 2024, Data-free Multi-label Image Recognition via LLM-powered Prompt Tuning, Shuo Yang, Zirui Shang, Yongqi Wang, Derong Deng, Hongwei Chen, Qiyuan Cheng, Xinxiao Wu
  • arXiv 2017, Hand3D: Hand Pose Estimation using 3D Neural Network, Xiaoming Deng$\ast$, Shuo Yang$\ast$, Yinda Zhang$\ast$, Ping Tan, Liang Chang, Hongan Wang
  • Acta Automatica Sinica 2016, Convolutional neural networks in image understanding, Liang Chang, Xiaoming Deng, Mingquan Zhou, Zhongke Wu, Ye Yuan, Shuo Yang, Hongan Wang
📖 Education

    • 2018.09 - 2024.06, Ph.D. in Computer Science, School of Computer Science & Technology, Beijing Institute of Technology.

      Advisors: Shuliang Wang (2018.09 - 2021.06) and Xinxiao Wu (from 2021.06).

    • 2014.09 - 2017.07, M.S. in Computer Science, Institute of Software, Chinese Academy of Sciences.

      Advisor: Xiaoming Deng.

    • 2010.09 - 2014.07, B.S. in Computer Science, School of Information, Beijing Union University.

💻 Experience

    • 2024.06 - present, Tenure-Track Associate Professor (Pre-Tenure) at Shenzhen MSU-BIT University, Shenzhen, China.
    • 2019.05 - 2020.02, Research intern at Megvii-inc, Beijing, China.
    • 2017.07 - 2018.08, Algorithm engineer at JD Finance, Beijing, China.