Command-driven semantic robotic grasping towards user-specified tasks
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Springer, 2025-06-01 |
| Series: | Complex & Intelligent Systems |
| Subjects: | |
| Online Access: | https://doi.org/10.1007/s40747-025-01981-y |
| Summary: | Abstract: Despite significant advancements in robotic grasping driven by visual perception, deploying robots in unstructured environments to perform user-specified tasks still poses considerable challenges. Natural language offers an intuitive means of specifying task objectives while reducing ambiguity. In this study, we introduce natural language into a vision-guided grasping system by employing visual attributes as a mediating bridge between language instructions and visual observations. We propose a command-driven semantic grasping architecture that integrates pixel attention within the visual attribute recognition module and includes a modified grasp pose estimation network to enhance prediction accuracy. Our experimental results show that our approach improves the performance of the submodules, including visual attribute recognition and grasp pose estimation, compared to baseline models. Furthermore, we demonstrate that the proposed model is notably effective in real-world user-specified grasping experiments. |
| ISSN: | 2199-4536; 2198-6053 |
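
The summary describes visual attributes acting as a bridge between a language command and the grasp decision. The following minimal Python sketch illustrates that general idea only; it is not code from the paper, and every component (`parse_attributes`, `Candidate`, `select_grasp`) is a hypothetical placeholder rather than the authors' pixel-attention recognition module or grasp pose network.

```python
"""Illustrative sketch only: shows how parsed visual attributes could mediate
between a user command and a grasp choice. All names and structures here are
assumptions for illustration, not the paper's implementation."""

from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    """A detected object with predicted visual attributes and a grasp pose."""
    attributes: Dict[str, str]   # e.g. {"color": "red", "shape": "cup"}
    grasp_pose: List[float]      # e.g. [x, y, z, roll, pitch, yaw, gripper_width]


def parse_attributes(command: str) -> Dict[str, str]:
    """Toy command parser: keyword-match a few colors and object types.

    A real system would use a learned language module; this stands in for it.
    """
    colors = {"red", "green", "blue", "yellow"}
    shapes = {"cup", "box", "cylinder", "ball"}
    target: Dict[str, str] = {}
    for word in command.lower().split():
        if word in colors:
            target["color"] = word
        if word in shapes:
            target["shape"] = word
    return target


def select_grasp(command: str, candidates: List[Candidate]) -> List[float]:
    """Return the grasp pose of the candidate whose attributes best match the command."""
    target = parse_attributes(command)

    def score(c: Candidate) -> int:
        # Count how many requested attributes this candidate satisfies.
        return sum(1 for key, value in target.items() if c.attributes.get(key) == value)

    return max(candidates, key=score).grasp_pose


if __name__ == "__main__":
    scene = [
        Candidate({"color": "red", "shape": "cup"}, [0.42, 0.10, 0.05, 0.0, 0.0, 1.57, 0.06]),
        Candidate({"color": "blue", "shape": "box"}, [0.30, -0.20, 0.04, 0.0, 0.0, 0.0, 0.08]),
    ]
    print(select_grasp("pick up the red cup", scene))
```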