About Me
Hello! I am an undergraduate student in my final year of study. My research interests lie in embodied AI, with a particular emphasis on understanding and aligning with the intent of human instructions using Vision-Language Models (VLMs). At present, my work centers on using multiple AI agents for hierarchical decision-making in robots, aiming to address long-horizon tasks such as preparing a complete dinner.
I am currently supervised by Prof. Huiping Zhuang for my graduation project. Previously, I collaborated with Dr. Yongquan Chen and Junliang Li at AIRS, where our work focused on an object grasping system guided by visual instructions. This system has been covered by multiple media outlets and has attracted extensive attention within the industry.
Beyond my research pursuits, I consider myself a hardcore tech geek. I have a passion for constructing mechanical structures, electronic systems, and visual algorithms with my own hands to solve problems in the physical world, rather than merely dwelling on theories. I also keep a close eye on the forefront of entrepreneurship, embodying the Silicon Valley spirit. I have always aspired to become a technology leader akin to Elon Musk or Steve Jobs. My life's goal is to pursue what I truly love, rather than what is commonly regarded as the right thing to do.
I am looking for a summer 2025 internship and a PhD position starting in 2026. Please feel free to reach out to me to discuss any topic.
Education
-
University of Hong Kong
MSc(Eng), major in Robotics and Intelligent Systems (Sep. 2025 - Jul. 2026)
Incoming student, fall 2025
-
South China University of Technology
B.Eng, major in Intelligent Manufacturing (Sep. 2021 - Jun. 2025)
GPA: 3.6/4.0
- Core Curriculum:
- C++ Programming Foundations: (4.0/4.0)
- Practice of Introduction to Engineering: (4.0/4.0)
- Linear Algebra: (4.0/4.0)
- Circuits Practice: (4.0/4.0)
- Engineering Innovation Training: (4.0/4.0)
- Embedded System and Design: (4.0/4.0)
- Modeling, Analysis and Control of Dynamic System: (3.7/4.0)
-
University of Pennsylvania
Winter exchange student (Jan. 2024 - Feb. 2024)
Focused on robotics and intelligent systems, as well as essential skills in Leadership, Persuasive Speaking & Writing, and Innovation & Technology.
Certification
Publications
-
Robotic Visual Instruction
Authors: Yanbang Li, Ziyang Gong, Haoyang Li, Xiaoqi Huang, Haolan Kang, Guangping Bai, Xianzheng Ma
Natural language is commonly used for human-robot interaction but lacks spatial precision, leading to ambiguity. To address this, we introduce Robotic Visual Instruction (RoVI), which uses 2D sketches to guide robotic tasks. RoVI encodes spatial-temporal information for 3D manipulation. We also present the Visual Instruction Embodied Workflow (VIEW), which uses Vision-Language Models (VLMs) to interpret RoVI and generate 3D actions. A dataset of 15K instances fine-tunes VLMs for edge deployment. VIEW shows strong generalization across 11 tasks, achieving an 87.5% success rate in real-world scenarios. Code and datasets will be released soon.
Conference: CVPR 2025 (accepted, 3 positive reviews)
Paper
-
Grasp What You Want: Embodied Dexterous Grasping System Driven by Your Voice
Authors: Junliang Li, Kai Ye, Haolan Kang, Mingxuan Liang, Yuhang Wu, Zhenhua Liu, Huiping Zhuang, Rui Huang, Yongquan Chen
In recent years, human-robot collaboration has become crucial, but robots struggle to interpret voice commands accurately. Traditional systems lack advanced manipulation and adaptability, especially in unstructured environments. This paper introduces the Embodied Dexterous Grasping System (EDGS), which uses a Vision-Language Model (VLM) to align voice and visual information, improving object handling. Inspired by human interactions, it employs a precise grasping strategy. Experiments show EDGS effectively manages complex tasks, demonstrating its potential in Embodied AI.
Journal: JFR 2025 (under review)
Projects
-
- A low-cost 6-DoF hand exoskeleton that uses a linkage and tendon mechanism and runs on the ESP32 platform. The total cost of the project is approximately 500 yuan, making it an economical option for those interested in hand exoskeleton technology.
- Responsible for the ESP32 software and the servo control program.
A low-cost 6DoF hand exoskeleton using linkage and tendon mechanism under ESP32
Technical Partner
Mar. 2024 - Jun. 2024
-
- Developed a smartwatch integrating TinyML and deep learning algorithms that efficiently and accurately identifies the user's daily activities and can work together with smart home devices.
- Led the implementation and optimisation of the TinyML model for deployment on ultra-low-power microcontrollers, which enabled on-device data analysis and processing, significantly decreased latency, and improved user privacy and data security.
A TinyML-based Human Activity Recognition Smart Watch
Team leader
Sep. 2023 - Jan. 2024
-
- Acted as a team leader for the design and implementation of a robotic ball collector based on the FreeRTOS real-time operating system and OpenMV camera module.
- Spearheaded the design and implementation of a PID controller for precise control of the speed and direction of the motors for accurate capture and transport of the balls.
- Won the Second Prize in the Multifunctional Robot Competition of the Engineering Innovation Training course, demonstrating skills in robot design and control system integration.
Design of Robotic Ball Collector with FreeRTOS and OpenMV
Team leader
Jan. 2023 - Jun. 2023
-
- Participated in the research and development of a safety monitoring system for wireless charging of electric vehicles, which uses millimetre-wave radar to monitor the electromagnetic field distribution in real time during charging, reducing the impact of electromagnetic radiation on humans and animals.
- Worked on the design and implementation of the millimetre-wave radar sensor board, as well as data communication and processing with the STM32.
- Won the Project Innovation Award in the 18th “Winning in Guangzhou” and Guangdong-Hong Kong-Macao Greater Bay Area Entrepreneurship Competition for College Students, and a Winning Prize in the 8th China College Students' “Internet+” Innovation and Entrepreneurship Competition at South China University of Technology.
Development of a Millimeter Wave Radar-Based Living Detection Device for Electric Vehicle Wireless Charging Safety
Team leader
Dec. 2022 - Jun. 2023
Experience
-
- Used a RealSense L515 for RGB-D data acquisition. Performed point cloud densification and denoising to improve object depth-estimation accuracy. Used DINO-X, Grounded-SAM, and CLIP for target object detection and segmentation in image data. Estimated the dexterous hand's optimal grasping pose from object geometry, and used an IK solver to filter out collision-prone poses.
- Leveraged VLMs such as GPT-4o to extract intent from human instructions. Through prompt tuning, devised a method to understand intent, extract target objects, and perform semantic enrichment using multimodal information to produce better prompts for downstream segmentation.
- Operated Franka, UR5, and RealMan robotic arms with Inspire, Ruiyan, and BrainCo dexterous hands as end effectors to reach target joint-space or Cartesian-space poses. Imported URDF models of these robots into the SAPIEN simulation environment for algorithm testing and optimization. Used motion planners for trajectory planning and accomplished tasks such as pick-and-place, object rotation, and drawer opening.
Shenzhen Institute of Artificial Intelligence and Robotics for Society
Research Assistant
Mar. 2024 - Jan. 2025
-
- Learned and practised Design Thinking, First Principles Thinking, and Maslow's Hierarchy of Needs during the two-week summer camp.
- Participated in the research and development of smart hardware solutions, turning innovative ideas into actual products and working through the whole product management process from conceptualisation to prototyping, which exercised my problem-solving and innovation skills.
InnoX Summer Camp 2024, Shenzhen InnoX Academy
Junior Project Manager
Aug. 2024
-
- Collaborated with Mingyuan Xiao, under the supervision of Prof. Tan Boon Huan of Nanyang Technological University, on the research and development of memory-assisted smart glasses for the elderly. The project received a research grant of RMB 100,000 and was valued at RMB 10 million.
- Undertook the hardware integration and software design of smart glasses, including the selection and integration of modules such as camera, microphone, and loudspeaker, in addition to the implementation of data processing algorithms based on microcontrollers.
Forever tech: Development of Smart Glasses for Assisting Memory Recall in Elderly
Technical Partner
Jan. 2023 - Dec. 2023