Abstract
In the constant updates of product dialogue systems, the natural language understanding (NLU) model must be retrained as new data from real users is merged into the data accumulated during previous updates. Within the newly added data, new intents emerge and may be semantically entangled with existing intents: for example, new intents that are semantically too specific or too generic are in fact subsets or supersets of existing intents in the semantic space, which impairs the robustness of the NLU model. As a first attempt to solve this problem, we set up a new benchmark consisting of 4 Dialogue Version Control dataSets (DialogVCS). We formulate intent detection with imperfect data under system updates as a multi-label classification task with positive but unlabeled intents, which requires the model to recognize all proper intents at inference time, including the semantically entangled ones. We also propose comprehensive baseline models and conduct in-depth analyses on the benchmark, showing that semantically entangled intents can be effectively recognized with an automatic workflow.
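The multi-label formulation above implies that a single utterance may carry several valid intents at once, e.g. both a generic intent and its more specific subset. A minimal sketch of such a multi-label decision rule, with hypothetical intent names and threshold that are not taken from the paper:

```python
import math

# Hypothetical intent inventory: "play_music" could be a semantic
# superset of "play_jazz" after a system update adds the latter.
INTENTS = ["play_music", "play_jazz", "check_weather"]

def predict_intents(logits, threshold=0.5):
    """Multi-label decision: return every intent whose sigmoid
    probability exceeds the threshold, so semantically entangled
    intents (a generic intent and its specific subset) can both
    be predicted for the same utterance."""
    probs = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    return [name for name, p in zip(INTENTS, probs) if p > threshold]

# An utterance like "play some jazz" may legitimately activate
# both the generic and the specific intent.
print(predict_intents([2.0, 1.5, -3.0]))  # ['play_music', 'play_jazz']
```

Unlike single-label softmax classification, the per-intent sigmoid makes no mutual-exclusivity assumption, which is what allows subset/superset intents to co-occur in the prediction.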
URL
https://arxiv.org/abs/2305.14751