Non-Autoregressive Sign Language Production via Knowledge Distillation

2022-08-12 09:17:11

Eui Jun Hwang, Jung Ho Kim, Suk min Cho, Jong C. Park

arXiv_CL

Abstract

Sign Language Production (SLP) aims to translate expressions in spoken language into corresponding ones in sign language, such as skeleton-based sign poses or videos. Existing SLP models are either AutoRegressive (AR) or Non-Autoregressive (NAR). However, AR-SLP models suffer from regression to the mean and error propagation during decoding. NSLP-G, a NAR-based model, resolves these issues to some extent but engenders other problems. For example, it does not consider target sign lengths and suffers from false decoding initiation. We propose a novel NAR-SLP model via Knowledge Distillation (KD) to address these problems. First, we devise a length regulator to predict the end of the generated sign pose sequence. We then adopt KD, which distills spatial-linguistic features from a pre-trained pose encoder to alleviate false decoding initiation. Extensive experiments show that the proposed approach significantly outperforms existing SLP models in both Fréchet Gesture Distance and Back-Translation evaluation.
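
As a rough illustration of the two ideas named in the abstract (cutting a fixed-size non-autoregressive output at a predicted length, and matching a student's features to those of a frozen, pre-trained teacher), a minimal sketch might look like the following. This is not the authors' implementation: the function names, the use of mean squared error as the distillation objective, and the plain-list data layout are all assumptions made here for illustration only.

```python
from typing import List

Vector = List[float]


def feature_distillation_loss(student: List[Vector], teacher: List[Vector]) -> float:
    """Mean squared error between student features and frozen teacher features.

    Assumption: MSE is used only as a simple stand-in for a feature-level
    distillation objective; the abstract does not state which loss is used.
    """
    if len(student) != len(teacher):
        raise ValueError("student and teacher must have the same number of frames")
    total, count = 0.0, 0
    for s_vec, t_vec in zip(student, teacher):
        if len(s_vec) != len(t_vec):
            raise ValueError("feature dimensions must match")
        for s, t in zip(s_vec, t_vec):
            total += (s - t) ** 2
            count += 1
    # An empty input contributes no loss.
    return total / count if count else 0.0


def truncate_to_length(poses: List[Vector], predicted_length: int) -> List[Vector]:
    """Keep only the first `predicted_length` frames of a fixed-size output.

    Assumption: the predicted length is given as a frame count; it is clamped
    to the range [0, len(poses)].
    """
    n = max(0, min(predicted_length, len(poses)))
    return poses[:n]
```

For example, `truncate_to_length(poses, 2)` on a three-frame sequence keeps the first two frames, and `feature_distillation_loss` returns 0.0 when student and teacher features coincide.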

URL

https://arxiv.org/abs/2208.06183

PDF

https://arxiv.org/pdf/2208.06183.pdf