Recognition of Grasping Patterns Using Deep Learning for Human-Robot Collaboration

Sensors (Basel). 2023 Nov 5;23(21):8989. doi: 10.3390/s23218989.

Abstract

Recent advances in the field of collaborative robotics aim to endow industrial robots with prediction and anticipation abilities. In many shared tasks, the robot's ability to accurately perceive and recognize the objects being manipulated by the human operator is crucial for predicting the operator's intentions. In this context, this paper proposes a novel learning-based framework that enables an assistive robot to recognize the object grasped by the human operator from the pattern of the hand and finger joints. The framework combines the strengths of the widely available MediaPipe framework for detecting hand landmarks in an RGB image with a deep multi-class classifier that predicts the manipulated object from the extracted keypoints. The study compares two deep architectures, a convolutional neural network and a transformer, in terms of prediction accuracy, precision, recall, and F1-score. We evaluate the recognition system on a new dataset collected from multiple users across multiple sessions. The results demonstrate the effectiveness of the proposed methods while providing valuable insights into the factors that limit the generalization ability of the models.
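
To make the described pipeline concrete, the following is a minimal sketch of its two stages: hand landmark extraction with the MediaPipe Hands solution, followed by a multi-class classifier over the flattened keypoints. The small PyTorch MLP is a placeholder standing in for the paper's CNN and transformer architectures, and NUM_OBJECTS is a hypothetical class count, not a value taken from the study.

import cv2
import mediapipe as mp
import torch
import torch.nn as nn

NUM_OBJECTS = 10  # hypothetical number of graspable object classes

def extract_keypoints(bgr_image):
    """Return a flat tensor of 21 (x, y, z) hand landmarks, or None."""
    rgb = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2RGB)
    with mp.solutions.hands.Hands(static_image_mode=True,
                                  max_num_hands=1) as hands:
        result = hands.process(rgb)
    if not result.multi_hand_landmarks:
        return None  # no hand detected in the frame
    landmarks = result.multi_hand_landmarks[0].landmark
    coords = [c for p in landmarks for c in (p.x, p.y, p.z)]  # 63 values
    return torch.tensor(coords, dtype=torch.float32)

# Placeholder classifier over the 63-dimensional keypoint vector;
# the paper evaluates a CNN and a transformer in this role instead.
classifier = nn.Sequential(
    nn.Linear(63, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, NUM_OBJECTS),
)

def predict_object(bgr_image):
    """Predict the class index of the grasped object from one frame."""
    keypoints = extract_keypoints(bgr_image)
    if keypoints is None:
        return None
    with torch.no_grad():
        logits = classifier(keypoints.unsqueeze(0))
    return int(logits.argmax(dim=1).item())

In use, predict_object would be called on each camera frame, and the predicted class index mapped to an object label; the decoupling of landmark extraction from classification is what lets the two deep architectures be swapped and compared on the same keypoint features.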

Keywords: collaborative robotics; grasping posture; hand–object interaction; keypoints classification; object recognition.

MeSH terms

  • Deep Learning*
  • Hand
  • Humans
  • Neural Networks, Computer
  • Robotics* / methods
  • Upper Extremity

Grants and funding

The present study was developed in the scope of the Project Augmented Humanity (POCI-01-0247-FEDER-046103), financed by Portugal 2020 under the Competitiveness and Internationalization Operational Program and the Lisbon Regional Operational Program, and by the European Regional Development Fund.