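My research focuses on the intersection of vision and language. I currently work on tasks that combine vision, language, and robotics, such as language-driven video understanding, open-vocabulary image/video understanding, and interactive robots. My earlier work covered hand detection, hand pose estimation, face recognition, and person re-identification.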

Students interested in vision-and-language intelligent robots and related directions are welcome to join our group!

Email: yangshuo@smbu.edu.cn; yangshuo129@gmail.com

🔥 News

📝 Publications

Google Scholar Citations   ($\ast$ denotes co-first authors, $\dagger$ denotes the corresponding author)

PR 2026

Image-free Multi-label Image Recognition via LLM-powered Hierarchical Prompt Tuning

  • Shuo Yang$\dagger$, Zirui Shang, Yongqi Wang, Derong Deng, Hongwei Chen, Xinxiao Wu, Qiyuan Cheng
  • Pattern Recognition (PR), 2026.

    [Paper] [BibTex] [Code]

ICCV 2025

LLM-enhanced Action-aware Multi-modal Prompt Tuning for Image-Text Matching

  • Mengxiao Tian, Xinxiao Wu, Shuo Yang$\dagger$
  • International Conference on Computer Vision (ICCV), 2025.

    [Paper] [BibTex] [Code]

IJCAI 2025

METOR: A Unified Framework for Mutual Enhancement of Objects and Relationships in Open-vocabulary Video Visual Relationship Detection

  • Yongqi Wang, Xinxiao Wu, Shuo Yang$\dagger$
  • The 34th International Joint Conference on Artificial Intelligence (IJCAI), 2025.

    [Paper] [BibTex] [Code]

IEEE TPAMI 2025

End-to-end Open-vocabulary Video Visual Relationship Detection using Multi-modal Prompting

  • Yongqi Wang, Xinxiao Wu, Shuo Yang, Jiebo Luo
  • IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025.

    [Paper] [BibTex] [Code]

AAAI 2025

Video Summarization using Denoising Diffusion Probabilistic Model

  • Zirui Shang, Yubo Zhu, Hongxi Li, Shuo Yang, Xinxiao Wu
  • The 39th Annual AAAI Conference on Artificial Intelligence (AAAI), 2025.

    [Paper] [BibTex]

IEEE TMM 2024

Dynamic Pathway for Query-Aware Feature Learning in Language-Driven Action Localization

  • Shuo Yang, Xinxiao Wu, Zirui Shang, Jiebo Luo
  • IEEE Transactions on Multimedia (TMM), 2024.

    [Paper] [BibTex]

📖 Education

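  • 2018.09 - 2024.06, Ph.D. in Computer Science, School of Computer Science, Beijing Institute of Technology.
    Advisors: Shuliang Wang (王树良), 2018.09 - 2021.06; Xinxiao Wu (吴心筱), from 2021.06
  • 2014.09 - 2017.07, M.S. in Computer Science, Institute of Software, Chinese Academy of Sciences.
    Advisor: Xiaoming Deng (邓小明)
  • 2010.09 - 2014.07, B.S. in Computer Science, School of Information, Beijing Union University.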

💻 Work Experience