Soolab Sibei Yang

Sibei Yang has been an Assistant Professor in SIST, ShanghaiTech University, since Fall 2021. Before that, she was a Research Assistant Professor in Computing at The Hong Kong Polytechnic University. She received her Ph.D. degree from the University of Hong Kong in 2020, advised by Prof. Yizhou Yu. Her Ph.D. study was supported by the Hong Kong PhD Fellowship. She obtained her B.S. degree in computer science from Chu Kochen Honors College at Zhejiang University in 2016.

Her general research interests span computer vision, natural language processing, and their intersection.

Research Lab

░██████╗░█████╗░░█████╗░██╗░░░░░░█████╗░██████╗░
██╔════╝██╔══██╗██╔══██╗██║░░░░░██╔══██╗██╔══██╗
╚█████╗░██║░░██║██║░░██║██║░░░░░███████║██████╦╝
░╚═══██╗██║░░██║██║░░██║██║░░░░░██╔══██║██╔══██╗         
██████╔╝╚█████╔╝╚█████╔╝███████╗██║░░██║██████╦╝       # Welcome to SIST.1C313.
╚═════╝░░╚════╝░░╚════╝░╚══════╝╚═╝░░╚═╝╚═════╝░       # Have Fun!

Our current research interests primarily focus on 1) Open-world Visual Understanding, 2) Neural Generation and Editing, 3) Vision-Language Joint Understanding, 4) Large Language and Vision Models, and 5) Embodied AI. Our mission is to facilitate the learning of unified and universal perception, understanding, reasoning, and generation within the realm of an open world. We believe that learning from multimodal information (especially vision and language) in a general and unified manner holds the key to a deeper understanding of our world.

We are always looking for undergraduate and graduate students!

News

Jul 5, 2024	Three papers are accepted by ECCV 2024 🎉🎉🎉
Feb 27, 2024	Two papers are accepted by CVPR 2024 🎊🎊🎊
Jan 15, 2024	One paper is accepted by ICLR 2024 🐲🐲🐲
Dec 30, 2023	Congratulations to Cheng Shi for receiving the National Scholarship, and to Jiajin Tang for winning the Outstanding Student Award. 👏👏👏
Sep 23, 2023	Two papers, Free-Bloom (Zero-Shot Text-to-Video Generation) and DDCoT (CoT Prompting for Multimodal Reasoning in LMs), are accepted by NeurIPS 2023 🎉🎉🎉

Recent Publications

* equal contribution; † corresponding author.

2024

Part2Object: Hierarchical Unsupervised 3D Instance Segmentation

Cheng Shi*, Yulin Zhang*, Bin Yang, Jiajin Tang, Yuexin Ma, Sibei Yang†

Accepted by ECCV, 2024

Plain-Det: A Plain Multi-Dataset Object Detector

Cheng Shi*, Yuchen Zhu*, and Sibei Yang†

Accepted by ECCV, 2024

WildRefer: 3D Object Localization in Large-scale Dynamic Scenes with Multi-modal Visual Data and Natural Language

Zhenxiang Lin, Xidong Peng, Peishan Cong, Ge Zheng, Yujing Sun, Yuenan Hou, Xinge Zhu, Sibei Yang, Yuexin Ma

Accepted by ECCV, 2024

arXiv
CVPR2024

Curriculum Point Prompting for Weakly-Supervised Referring Image Segmentation

Qiyuan Dai, and Sibei Yang†

Accepted by CVPR, 2024

arXiv

The Devil is in the Object Boundary: Towards Annotation-free Instance Segmentation Using Foundation Models

Cheng Shi, and Sibei Yang†

Accepted by ICLR, 2024

arXiv Code

OMG: Towards Open-vocabulary Motion Generation via Mixture of Controllers

Han Liang, Jiacheng Bao, Ruichi Zhang, Sihan Ren, Yuecheng Xu, Sibei Yang, Xin Chen, Jingyi Yu, Lan Xu

Accepted by CVPR, 2024

arXiv Code Video

RealDex: Towards Human-like Grasping for Robotic Dexterous Hand

Yumeng Liu*, Yaxun Yang*, Youzhuo Wang*, Xiaofei Wu, Jiamin Wang, Yichen Yao, Sören Schwertfeger, Sibei Yang, Wenping Wang, Jingyi Yu, Xuming He, Yuexin Ma

Accepted by IJCAI, 2024

arXiv

2023

DDCoT: Duty-Distinct Chain-of-Thought Prompting for Multimodal Reasoning in Language Models

Ge Zheng*, Bin Yang*, Jiajin Tang*, Hong-Yu Zhou, Sibei Yang†

Accepted by NeurIPS, 2023

arXiv Code

Free-Bloom: Zero-Shot Text-to-Video Generator with LLM Director and LDM Animator

Hanzhuo Huang*, Yufan Feng*, Cheng Shi, Lan Xu, Jingyi Yu, Sibei Yang†

Accepted by NeurIPS, 2023

arXiv Code
ICCV2023

LoGoPrompt: Synthetic Text Images Can Be Good Visual Prompts for Vision-Language Models

Cheng Shi, and Sibei Yang†

Accepted by ICCV, 2023

arXiv HTML PDF
ICCV2023

EdaDet: Open-Vocabulary Object Detection Using Early Dense Alignment

Cheng Shi, and Sibei Yang†

Accepted by ICCV, 2023

arXiv HTML PDF
ICCV2023

CoTDet: Affordance Knowledge Prompting for Task Driven Object Detection

Jiajin Tang*, Ge Zheng*, Jingyi Yu, Sibei Yang†

Accepted by ICCV, 2023

arXiv HTML PDF
ICCV2023

Temporal Collection and Distribution for Referring Video Object Segmentation

Jiajin Tang, Ge Zheng, and Sibei Yang†

Accepted by ICCV, 2023

HTML PDF
ICCV2023

Grounded Image Text Matching with Mismatched Relation Reasoning

Yu Wu*, Yana Wei*, Haozhe Wang, Yongfei Liu, Sibei Yang, Xuming He†

Accepted by ICCV, 2023

arXiv
CVPR2023

Contrastive Grouping with Transformer for Referring Image Segmentation

Jiajin Tang, Ge Zheng, Cheng Shi, Sibei Yang†

Accepted by CVPR, 2023

PDF Code
AAAI2023

CCQ: Cross-Class Query Network for Partially Labeled Organ Segmentation

Xuyang Liu, Bingbing Wen, and Sibei Yang†

Accepted by AAAI, 2023

HTML Code
SIGGRAPH2023

DreamFace: Progressive Generation of Animatable 3D Faces under Text Guidance

Longwen Zhang*, Qiwei Qiu*, Hongyang Lin*, Qixuan Zhang, Cheng Shi, Wei Yang, Ye Shi, Sibei Yang†, Lan Xu†, Jingyi Yu†

Accepted by SIGGRAPH, 2023

arXiv HTML PDF Video
TPAMI2023

A Unified Visual Information Preservation Framework for Self-supervised Pre-training in Medical Image Analysis

Hong-Yu Zhou*, Chixiang Lu*, Chaoqi Chen, Sibei Yang, Yizhou Yu†

Accepted by TPAMI, 2023

arXiv
TPAMI2023

A Survey on Graph Neural Networks and Graph Transformers in Computer Vision: A Task-oriented Perspective

Chaoqi Chen*, Yushuang Wu*, Qiyuan Dai*, Hong-Yu Zhou*, Mutian Xu, Sibei Yang†, Xiaoguang Han†, Yizhou Yu†

Submitted to TPAMI, 2023

arXiv

2022

ECCV2022

Spatial and Visual Perspective-Taking via View Rotation and Relation Reasoning for Embodied Reference Understanding

Cheng Shi, and Sibei Yang†

Accepted by ECCV, 2022

PDF Code

Teaching

CS181: Artificial Intelligence I (Spring 2022, Spring 2023, Spring 2024)

CS282: Machine Learning (Fall 2021, Fall 2022, Fall 2023)