An Efficient Framework For Human Action Recognition Based On Graph Convolutional Networks
Nikolaos Kilis, Christos Papaioannidis, Ioannis Mademlis, Ioannis Pitas
SPS
Surgical simulators provide hands-on training of the necessary psychomotor skills. Automated skill evaluation of trainee doctors from videos of the tasks they perform is a key step toward the optimal utilization of such simulators. However, current skill-evaluation techniques require accurate tracking information for the instruments, which restricts their applicability to robot-assisted surgeries. In this paper, we propose a novel neural network architecture that performs skill evaluation from video data alone, without any tracking information. Given the small dataset available for training such a system, a network trained with an ℓ2 regression loss easily overfits the training data. We propose a novel rank loss that helps the network learn robust representations, yielding a 5% improvement in skill-score prediction on the benchmark JIGSAWS dataset. To demonstrate the applicability of our method to non-robotic surgeries, we contribute a new Neuro-Endoscopic Technical Skills (NETS) training dataset comprising 100 short videos of 12 subjects. Our method achieves a 27% improvement over the state of the art on the NETS dataset. The project page, with source code and data, is available at nets-iitd.github.io/nets-v1.
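To illustrate the idea of combining a regression loss with a rank loss, the following is a minimal sketch in plain Python. The function names, the margin value, and the weighting factor `alpha` are illustrative assumptions, not the authors' exact formulation: the abstract only states that a rank loss is added to the ℓ2 regression objective to reduce overfitting.

```python
# Hedged sketch: per-sample l2 regression terms plus a pairwise margin
# rank loss over predicted skill scores. All names and hyperparameters
# (margin, alpha) are assumptions for illustration only.

def l2_loss(pred, target):
    # Squared error between predicted and ground-truth skill score.
    return (pred - target) ** 2

def pairwise_rank_loss(pred_i, pred_j, score_i, score_j, margin=0.1):
    # If sample i has a higher ground-truth score than sample j, penalize
    # the model unless pred_i exceeds pred_j by at least `margin`.
    if score_i == score_j:
        return 0.0
    hi, lo = (pred_i, pred_j) if score_i > score_j else (pred_j, pred_i)
    return max(0.0, margin - (hi - lo))

def combined_loss(preds, scores, alpha=0.5, margin=0.1):
    # Mean l2 term over samples plus mean rank term over all pairs,
    # weighted by alpha.
    n = len(preds)
    reg = sum(l2_loss(p, s) for p, s in zip(preds, scores)) / n
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    rank = sum(
        pairwise_rank_loss(preds[i], preds[j], scores[i], scores[j], margin)
        for i, j in pairs
    ) / max(len(pairs), 1)
    return reg + alpha * rank
```

When predictions match the targets and preserve their ordering by at least the margin, both terms vanish; mis-ordered pairs are penalized even if their individual squared errors are small, which is what encourages ranking-consistent representations.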