Multi_Task

Determining the severity of Parkinson's disease in patients using a multi task neural network

2024-02-08 08:55:34

María Teresa García-Ordás, José Alberto Benítez-Andrades, Jose Aveleira-Mata, José-Manuel Alija-Pérez, Carmen Benavides

arXiv_SD

arXiv_SD Deep_Learning Classification Multi_Task Prediction Pose
Abstract

Parkinson's disease is easy to diagnose when it is advanced, but it is very difficult to diagnose in its early stages. Early diagnosis is essential to be able to treat the symptoms. It impacts on daily activities and reduces the quality of life of both the patients and their families and it is also the second most prevalent neurodegenerative disorder after Alzheimer in people over the age of 60. Most current studies on the prediction of Parkinson's severity are carried out in advanced stages of the disease. In this work, the study analyzes a set of variables that can be easily extracted from voice analysis, making it a very non-intrusive technique. In this paper, a method based on different deep learning techniques is proposed with two purposes. On the one hand, to find out if a person has severe or non-severe Parkinson's disease, and on the other hand, to determine by means of regression techniques the degree of evolution of the disease in a given patient. The UPDRS (Unified Parkinson's Disease Rating Scale) has been used by taking into account both the motor and total labels, and the best results have been obtained using a mixed multi-layer perceptron (MLP) that classifies and regresses at the same time and the most important features of the data obtained are taken as input, using an autoencoder. A success rate of 99.15% has been achieved in the problem of predicting whether a person suffers from severe Parkinson's disease or non-severe Parkinson's disease. In the degree of disease involvement prediction problem case, a MSE (Mean Squared Error) of 0.15 has been obtained. Using a full deep learning pipeline for data preprocessing and classification has proven to be very promising in the field Parkinson's outperforming the state-of-the-art proposals.

Abstract (translated)

帕金森病在病情较晚的时候容易诊断，但在早期阶段很难诊断。早期的诊断对治疗症状非常重要。这种疾病会影响患者的日常生活，降低他们的生活质量，也是60岁以上人群中最常见的神经退行性疾病。目前，关于预测帕金森病严重程度的研究大部分都是在疾病进展到较晚阶段时进行的。在这项工作中，研究分析了一系列可以轻松从语音分析中提取的变量，使得这是一种非常非侵入性的技术。在这篇论文中，提出了一种基于不同深度学习技术的两种目的的方法。一方面，通过确定一个人是否患有严重或非严重的帕金森病，另一方面，通过回归分析方法确定患者疾病在一定程度上的发展程度。在考虑了运动和总标签的情况下，使用了统一帕金森病评分表（UPDRS），并且最佳结果是通过一种混合多层感知器（MLP）进行分类和回归得到的，输入数据的最重要特征使用自动编码器。在预测一个人是否患有严重帕金森病或非严重帕金森病的问题上，取得了99.15%的成功率。在疾病程度预测问题中，获得了0.15的均方误差（MSE）。使用完整的深度学习数据预处理和分类方法在帕金森病领域表现出了与现有最佳建议相媲美的效果。

URL

https://arxiv.org/abs/2402.05491

PDF

https://arxiv.org/pdf/2402.05491.pdf
Read All
Multi Task Consistency Guided Source-Free Test-Time Domain Adaptation Medical Image Segmentation

2023-10-18 07:49:24

Yanyu Ye, Zhenxi Zhang, Wei Wei, Chunna Tian

arXiv_CV

arXiv_CV Segmentation Relation Multi_Task Prediction Pose Medical
Abstract

Source-free test-time adaptation for medical image segmentation aims to enhance the adaptability of segmentation models to diverse and previously unseen test sets of the target domain, which contributes to the generalizability and robustness of medical image segmentation models without access to the source domain. Ensuring consistency between target edges and paired inputs is crucial for test-time adaptation. To improve the performance of test-time domain adaptation, we propose a multi task consistency guided source-free test-time domain adaptation medical image segmentation method which ensures the consistency of the local boundary predictions and the global prototype representation. Specifically, we introduce a local boundary consistency constraint method that explores the relationship between tissue region segmentation and tissue boundary localization tasks. Additionally, we propose a global feature consistency constraint toto enhance the intra-class compactness. We conduct extensive experiments on the segmentation of benchmark fundus images. Compared to prediction directly by the source domain model, the segmentation Dice score is improved by 6.27\% and 0.96\% in RIM-ONE-r3 and Drishti GS datasets, respectively. Additionally, the results of experiments demonstrate that our proposed method outperforms existing competitive domain adaptation segmentation algorithms.

Abstract (translated)

源自由测试时间适应医疗图像分割旨在增强分割模型的适应性，使其能适应目标域中的多样化和之前未见过的测试集。这有助于实现没有访问源域的医疗图像分割模型的泛化能力和稳健性。确保目标边缘与成对输入之间的一致性对测试时间适应至关重要。为了提高测试时间域适应的性能，我们提出了一个多任务一致性引导的源自由测试时间域适应医疗图像分割方法，确保局部边界预测和全局原型表示的一致性。具体来说，我们引入了组织区域分割与组织边界定位任务之间的关系。此外，我们还提出了全局特征一致性约束以增强类内压缩性。我们对基准基金图像进行了广泛的实验。与直接由源域模型预测的分割结果相比，RIM-ONE-r3和Drishti GS数据集中的分割Dice分数分别提高了6.27%和0.96%。此外，实验结果表明，与现有的竞争域适应分割算法相比，我们提出的方法具有优异的性能。

URL

https://arxiv.org/abs/2310.11766

PDF

https://arxiv.org/pdf/2310.11766.pdf
Read All
Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models

2023-10-02 01:00:50

Man Luo, Shrinidhi Kumbhar, Ming shen, Mihir Parmar, Neeraj Varshney, Pratyay Banerjee, Somak Aditya, Chitta Baral

arXiv_AI

arXiv_AI Review Survey Multi_Task Knowledge Language_Model
Abstract

Logical reasoning is fundamental for humans yet presents a substantial challenge in the domain of Artificial Intelligence. Initially, researchers used Knowledge Representation and Reasoning (KR) systems that did not scale and required non trivial manual effort. Recently, the emergence of large language models (LLMs) has demonstrated the ability to overcome various limitations of formal Knowledge Representation (KR) systems. Consequently, there is a growing interest in using LLMs for logical reasoning via natural language. This work strives to understand the proficiency of LLMs in logical reasoning by offering a brief review of the latest progress in this area; with a focus on the logical reasoning datasets, tasks, and the methods adopted to utilize LLMs for reasoning. To offer a thorough analysis, we have compiled a benchmark titled LogiGLUE. This includes 24 varied datasets encompassing deductive, abductive, and inductive reasoning. We have standardized these datasets into Seq2Seq tasks to facilitate straightforward training and evaluation for future research. Utilizing LogiGLUE as a foundation, we have trained an instruction fine tuned language model, resulting in LogiT5. We study single task training, multi task training, and a chain of thought knowledge distillation fine tuning technique to assess the performance of model across the different logical reasoning categories. By this comprehensive process, we aim to shed light on the capabilities and potential pathways for enhancing logical reasoning proficiency in LLMs, paving the way for more advanced and nuanced developments in this critical field.

Abstract (translated)

逻辑推理是人类基本的思维活动,但在人工智能领域中却面临巨大的挑战。起初,研究人员使用无法扩展且需要大量手动努力的知识表示和推理(KR)系统。最近,大型语言模型(LLM)的出现已经证明了能够克服正式知识表示(KR)系统的各种限制的能力。因此,越来越多的人开始使用LLM来进行自然语言逻辑推理。这项工作旨在通过简要回顾该领域的最新进展,理解LLM在逻辑推理方面的熟练程度。我们焦点关注逻辑推理数据集、任务和利用LLM进行推理的方法。为了进行全面分析,我们汇编了一个基准名为LogiGLUE。该基准包括24个不同的数据集,涵盖了从演绎、归纳和推断推理的各种类型。我们将这些数据集标准化为Seq2Seq任务,以便于未来的研究和 straightforward的训练和评估。利用LogiGLUE作为基础,我们训练了一个优化的语言模型,结果为LogiT5。我们研究单一任务训练、多任务训练和思维知识蒸馏优化技术,以评估模型在不同逻辑推理类别中的表现。通过这种方式,我们旨在阐明LLM在逻辑推理方面的能力和潜在路径,为这个关键领域的更高级、精细的发展铺平道路。

URL

https://arxiv.org/abs/2310.00836

PDF

https://arxiv.org/pdf/2310.00836.pdf
Read All
Instruction Tuned Models are Quick Learners

2023-05-17 22:30:01

Himanshu Gupta, Saurabh Arjun Sawant, Swaroop Mishra, Mutsumi Nakamura, Arindam Mitra, Santosh Mashetty, Chitta Baral

arXiv_CL

arXiv_CL Transfer_Learning Multi_Task Inference Language_Model
Abstract

Instruction tuning of language models has demonstrated the ability to enhance model generalization to unseen tasks via in-context learning using a few examples. However, typical supervised learning still requires a plethora of downstream training data for finetuning. Often in real-world situations, there is a scarcity of data available for finetuning, falling somewhere between few shot inference and fully supervised finetuning. In this work, we demonstrate the sample efficiency of instruction tuned models over various tasks by estimating the minimal downstream training data required by them to perform transfer learning and match the performance of state-of-the-art (SOTA) supervised models. We conduct experiments on 119 tasks from Super Natural Instructions (SuperNI) in both the single task learning (STL) and multi task learning (MTL) settings. Our findings reveal that, in the STL setting, instruction tuned models equipped with 25% of the downstream train data surpass the SOTA performance on the downstream tasks. In the MTL setting, an instruction tuned model trained on only 6% of downstream training data achieve SOTA, while using 100% of the training data results in a 3.69% points improvement (ROUGE-L 74.68) over the previous SOTA. We conduct an analysis on T5 vs Tk-Instruct by developing several baselines to demonstrate that instruction tuning aids in increasing both sample efficiency and transfer learning. Additionally, we observe a consistent ~4% performance increase in both settings when pre-finetuning is performed with instructions. Finally, we conduct a categorical study and find that contrary to previous results, tasks in the question rewriting and title generation categories suffer from instruction tuning.

Abstract (translated)

指令优化语言模型已经证明了通过使用几个例子来提高模型对未完成任务的泛化能力的能力。然而,典型的监督学习仍然需要大量的后续训练数据来进行微调。通常,在现实世界的情况下,微调数据的资源非常有限,处于几个Shot推断和完全监督微调之间的中间位置。在本文中,我们使用Super Natural Instructions(超级指令)中的119个任务进行了实验,同时在单任务学习和多任务学习环境中进行了测试。我们的发现表明,在STL环境中,指令优化模型所拥有的25%的后续训练数据超过了后续任务的性能表现(与SOTA相比)。在MTL环境中,只使用后续训练数据训练的指令优化模型达到了SOTA表现,而使用全部训练数据则导致了3.69%点的进步(ROUGE-L 74.68)高于之前的SOTA表现。我们对T5和Tk-Instruct进行了分析,以开发多个基准来表明指令优化有助于增加样本效率和迁移学习。此外,我们在两个环境中观察到一致的 ~4%的性能提升,在执行预微调指令之前。最后,我们进行了分类研究,并发现与之前的结果相反,问题改写和标题生成任务中的任务受到指令优化的影响。

URL

https://arxiv.org/abs/2306.05539

PDF

https://arxiv.org/pdf/2306.05539.pdf
Read All
Regularizing disparity estimation via multi task learning with structured light reconstruction

2023-01-19 15:54:52

Alistair Weld, Joao Cartucho, Chi Xu, Joseph Davids, Stamatia Giannarou

arXiv_CV

arXiv_CV Deep_Learning Multi_Task Pose Medical 3D Reconstruction
Abstract

3D reconstruction is a useful tool for surgical planning and guidance. However, the lack of available medical data stunts research and development in this field, as supervised deep learning methods for accurate disparity estimation rely heavily on large datasets containing ground truth information. Alternative approaches to supervision have been explored, such as self-supervision, which can reduce or remove entirely the need for ground truth. However, no proposed alternatives have demonstrated performance capabilities close to what would be expected from a supervised setup. This work aims to alleviate this issue. In this paper, we investigate the learning of structured light projections to enhance the development of direct disparity estimation networks. We show for the first time that it is possible to accurately learn the projection of structured light on a scene, implicitly learning disparity. Secondly, we \textcolor{black}{explore the use of a multi task learning (MTL) framework for the joint training of structured light and disparity. We present results which show that MTL with structured light improves disparity training; without increasing the number of model parameters. Our MTL setup outperformed the single task learning (STL) network in every validation test. Notably, in the medical generalisation test, the STL error was 1.4 times worse than that of the best MTL performance. The benefit of using MTL is emphasised when the training data is limited.} A dataset containing stereoscopic images, disparity maps and structured light projections on medical phantoms and ex vivo tissue was created for evaluation together with virtual scenes. This dataset will be made publicly available in the future.

Abstract (translated)

URL

https://arxiv.org/abs/2301.08140

PDF

https://arxiv.org/pdf/2301.08140.pdf
Read All
CERBERUS: Simple and Effective All-In-One Automotive Perception Model with Multi Task Learning

2022-10-03 08:17:26

Carmelo Scribano, Giorgia Franchini, Ignacio Sañudo Olmedo, Marko Bertogna

arXiv_CV

arXiv_CV Deep_Learning Multi_Task Inference Pose Autonomous
Abstract

Perceiving the surrounding environment is essential for enabling autonomous or assisted driving functionalities. Common tasks in this domain include detecting road users, as well as determining lane boundaries and classifying driving conditions. Over the last few years, a large variety of powerful Deep Learning models have been proposed to address individual tasks of camera-based automotive perception with astonishing performances. However, the limited capabilities of in-vehicle embedded computing platforms cannot cope with the computational effort required to run a heavy model for each individual task. In this work, we present CERBERUS (CEnteR Based End-to-end peRception Using a Single model), a lightweight model that leverages a multitask-learning approach to enable the execution of multiple perception tasks at the cost of a single inference. The code will be made publicly available at this https URL

Abstract (translated)

URL

https://arxiv.org/abs/2210.00756

PDF

https://arxiv.org/pdf/2210.00756.pdf
Read All
Multi tasks RetinaNet for mitosis detection

2022-08-26 13:06:54

Chen Yang, Wang Ziyue, Fang Zijie, Bian Hao, Zhang Yongbing

arXiv_CV

arXiv_CV Detection Deep_Learning Classification Multi_Task Pose
Abstract

The account of mitotic cells is a key feature in tumor diagnosis. However, due to the variability of mitotic cell morphology, it is a highly challenging task to detect mitotic cells in tumor tissues. At the same time, although advanced deep learning method have achieved great success in cell detection, the performance is often unsatisfactory when tested data from another domain (i.e. the different tumor types and different scanners). Therefore, it is necessary to develop algorithms for detecting mitotic cells with robustness in domain shifts scenarios. Our work further proposes a foreground detection and tumor classification task based on the baseline(Retinanet), and utilizes data augmentation to improve the domain generalization performance of our model. We achieve the state-of-the-art performance (F1 score: 0.5809) on the challenging premilary test dataset.

Abstract (translated)

URL

https://arxiv.org/abs/2208.12657

PDF

https://arxiv.org/pdf/2208.12657.pdf
Read All
Multi Task Learning For Zero Shot Performance Prediction of Multilingual Models

2022-05-12 14:47:03

Kabir Ahuja, Shanu Kumar, Sandipan Dandapat, Monojit Choudhury

arXiv_CL

arXiv_CL Multi_Task Prediction Language_Model Transformer Zero-Shot
Abstract

Massively Multilingual Transformer based Language Models have been observed to be surprisingly effective on zero-shot transfer across languages, though the performance varies from language to language depending on the pivot language(s) used for fine-tuning. In this work, we build upon some of the existing techniques for predicting the zero-shot performance on a task, by modeling it as a multi-task learning problem. We jointly train predictive models for different tasks which helps us build more accurate predictors for tasks where we have test data in very few languages to measure the actual performance of the model. Our approach also lends us the ability to perform a much more robust feature selection and identify a common set of features that influence zero-shot performance across a variety of tasks.

Abstract (translated)

URL

https://arxiv.org/abs/2205.06130

PDF

https://arxiv.org/pdf/2205.06130.pdf
Read All
Multitask Emotion Recognition Model with Knowledge Distillation and Task Discriminator

2022-03-24 13:50:48

Euiseok Jeong, Geesung Oh, Sejoon Lim

arXiv_CV

arXiv_CV Recognition Deep_Learning Face Multi_Task Knowledge Action Emotion
Abstract

Due to the collection of big data and the development of deep learning, research to predict human emotions in the wild is being actively conducted. We designed a multi-task model using ABAW dataset to predict valence-arousal, expression, and action unit through audio data and face images at in real world. We trained model from the incomplete label by applying the knowledge distillation technique. The teacher model was trained as a supervised learning method, and the student model was trained by using the output of the teacher model as a soft label. As a result we achieved 2.40 in Multi Task Learning task validation dataset.

Abstract (translated)

URL

https://arxiv.org/abs/2203.13072

PDF

https://arxiv.org/pdf/2203.13072.pdf
Read All
Multi-Task Triplet Loss for Named Entity Recognition using Supplementary Text

2021-08-31 01:13:33

Ryan Siskind, Shalin Shah

arXiv_CL

arXiv_CL Recognition Review Embedding Multi_Task
Abstract

Retail item data contains many different forms of text like the title of an item, the description of an item, item name and reviews. It is of interest to identify the item name in the other forms of text using a named entity tagger. However, the title of an item and its description are syntactically different (but semantically similar) in that the title is not necessarily a well formed sentence while the description is made up of well formed sentences. In this work, we use a triplet loss to contrast the embeddings of the item title with the description to establish a proof of concept. We find that using the triplet loss in a multi-task NER algorithm improves both the precision and recall by a small percentage. While the improvement is small, we think it is a step in the right direction of using various forms of text in a multi-task algorithm. In addition to precision and recall, the multi task triplet loss method is also found to significantly improve the exact match accuracy i.e. the accuracy of tagging the entire set of tokens in the text with correct tags.

Abstract (translated)

URL

https://arxiv.org/abs/2109.13736

PDF

https://arxiv.org/pdf/2109.13736.pdf
Read All
Dynamic Balancing of Humanoid Robot Walker3 with Proprioceptive Actuation: Systematic Design of Algorithm, Software and Hardware

2021-08-09 06:11:19

Yan Xie, Jiajun Wang, Hao Dong, Xiaoyu Ren, Liqun Huang, Mingguo Zhao

arXiv_RO

arXiv_RO Tracking Multi_Task Knowledge Optimization Action
Abstract

Dynamic balancing under uncertain disturbances is important for a humanoid robot, which requires a good capability of coordinating the entire body redundancy to execute multi tasks. Whole-body control (WBC) based on hierarchical optimization has been generally accepted and utilized in torque-controlled robots. A good hierarchy is the prerequisite for WBC and can be predefined according to prior knowledge. However, the real-time computation would be problematic in the physical applications considering the computational complexity of WBC. For robots with proprioceptive actuation, the joint friction in gear reducer would also degrade the torque tracking performance. In our paper, a reasonable hierarchy of tasks and constraints is first customized for robot dynamic balancing. Then a real-time WBC is implemented via a computationally efficient WBC software. Such a method is solved on a modular master control system UBTMaster characterized by the real-time communication and powerful computing capability. After the joint friction being well covered by the model identification, extensive experiments on various balancing scenarios are conducted on a humanoid Walker3 with proprioceptive actuation. The robot shows an outstanding balance performance even under external impulses as well as the two feet of the robot suffering the inclination and shift disturbances independently. The results demonstrate that with the strict hierarchy, real-time computation and joint friction being handled carefully, the robot with proprioceptive actuation can manage the dynamic physical interactions with the unstructured environments well.

Abstract (translated)

URL

https://arxiv.org/abs/2108.03826

PDF

https://arxiv.org/pdf/2108.03826.pdf
Read All
Learning More for Free - A Multi Task Learning Approach for Improved Pathology Classification in Capsule Endoscopy

2021-06-30 15:55:17

Anuja Vats, Marius Pedersen, Ahmed Mohammed, Øistein Hovde

arXiv_CV

arXiv_CV Classification Multi_Task
Abstract

The progress in Computer Aided Diagnosis (CADx) of Wireless Capsule Endoscopy (WCE) is thwarted by the lack of data. The inadequacy in richly representative healthy and abnormal conditions results in isolated analyses of pathologies, that can not handle realistic multi-pathology scenarios. In this work, we explore how to learn more for free, from limited data through solving a WCE multicentric, multi-pathology classification problem. Learning more implies to learning more than full supervision would allow with the same data. This is done by combining self supervision with full supervision, under multi task learning. Additionally, we draw inspiration from the Human Visual System (HVS) in designing self supervision tasks and investigate if seemingly ineffectual signals within the data itself can be exploited to gain performance, if so, which signals would be better than others. Further, we present our analysis of the high level features as a stepping stone towards more robust multi-pathology CADx in WCE.

Abstract (translated)

URL

https://arxiv.org/abs/2106.16162

PDF

https://arxiv.org/pdf/2106.16162.pdf
Read All
ScoreGAN: A Fraud Review Detector based on Multi Task Learning of Regulated GAN with Data Augmentation

2021-03-17 14:18:15

Saeedreza Shehnepoor, Roberto Togneri, Wei Liu, Mohammed Bennamoun

arXiv_CV

arXiv_CV GAN Detection Object_Detection Review Adversarial Classification Text_Classification Multi_Task Pose
Abstract

The promising performance of Deep Neural Networks (DNNs) in text classification, has attracted researchers to use them for fraud review detection. However, the lack of trusted labeled data has limited the performance of the current solutions in detecting fraud reviews. The Generative Adversarial Network (GAN) as a semi-supervised method has demonstrated to be effective for data augmentation purposes. The state-of-the-art solutions utilize GANs to overcome the data scarcity problem. However, they fail to incorporate the behavioral clues in fraud generation. Additionally, state-of-the-art approaches overlook the possible bot-generated reviews in the dataset. Finally, they also suffer from a common limitation in scalability and stability of the GAN, slowing down the training procedure. In this work, we propose ScoreGAN for fraud review detection that makes use of both review text and review rating scores in the generation and detection process. Scores are incorporated through Information Gain Maximization (IGM) into the loss function for three reasons. One is to generate score-correlated reviews based on the scores given to the generator. Second, the generated reviews are employed to train the discriminator, so the discriminator can correctly label the possible bot-generated reviews through joint representations learned from the concatenation of GLobal Vector for Word representation (GLoVe) extracted from the text and the score. Finally, it can be used to improve the stability and scalability of the GAN. Results show that the proposed framework outperformed the existing state-of-the-art framework, namely FakeGAN, in terms of AP by 7\%, and 5\% on the Yelp and TripAdvisor datasets, respectively.

Abstract (translated)

URL

https://arxiv.org/abs/2006.06561

PDF

https://arxiv.org/pdf/2006.06561.pdf
Read All
Double Meta-Learning for Data Efficient Policy Optimization in Non-Stationary Environments

2020-11-21 03:19:35

Elahe Aghapour, Nora Ayanian

arXiv_AI

arXiv_AI Reinforcement_Learning Multi_Task Optimization Pose
Abstract

We are interested in learning models of non-stationary environments, which can be framed as a multi-task learning problem. Model-free reinforcement learning algorithms can achieve good asymptotic performance in multi-task learning at a cost of extensive sampling, due to their approach, which requires learning from scratch. While model-based approaches are among the most data efficient learning algorithms, they still struggle with complex tasks and model uncertainties. Meta-reinforcement learning addresses the efficiency and generalization challenges on multi task learning by quickly leveraging the meta-prior policy for a new task. In this paper, we propose a meta-reinforcement learning approach to learn the dynamic model of a non-stationary environment to be used for meta-policy optimization later. Due to the sample efficiency of model-based learning methods, we are able to simultaneously train both the meta-model of the non-stationary environment and the meta-policy until dynamic model convergence. Then, the meta-learned dynamic model of the environment will generate simulated data for meta-policy optimization. Our experiment demonstrates that our proposed method can meta-learn the policy in a non-stationary environment with the data efficiency of model-based learning approaches while achieving the high asymptotic performance of model-free meta-reinforcement learning.

Abstract (translated)

URL

https://arxiv.org/abs/2011.10714

PDF

https://arxiv.org/pdf/2011.10714.pdf
Read All
Leveraging speaker attribute information using multi task learning for speaker verification and diarization

2020-10-27 13:10:51

Chau Luu, Peter Bell, Steve Renals

arXiv_SD

arXiv_SD Recognition Embedding Multi_Task
Abstract

Deep speaker embeddings have become the leading method for encoding speaker identity in speaker recognition tasks. The embedding space should ideally capture the variations between all possible speakers, encoding the multiple aspects that make up speaker identity. In this work, utilizing speaker age as an auxiliary variable in US Supreme Court recordings and speaker nationality with VoxCeleb, we show that by leveraging additional speaker attribute information in a multi task learning setting, deep speaker embedding performance can be increased for verification and diarization tasks, achieving a relative improvement of 17.8% in DER and 8.9% in EER for Supreme Court audio compared to omitting the auxiliary task. Experimental code has been made publicly available.

Abstract (translated)

URL

https://arxiv.org/abs/2010.14269

PDF

https://arxiv.org/pdf/2010.14269.pdf
Read All
KSM: Fast Multiple Task Adaption via Kernel-wise Soft Mask Learning

2020-09-11 21:48:39

Li Yang, Zhezhi He, Junshan Zhang, Deliang Fan

arXiv_AI

arXiv_AI Multi_Task Knowledge Pose
Abstract

Deep Neural Networks (DNN) could forget the knowledge about earlier tasks when learning new tasks, and this is known as \textit{catastrophic forgetting}. While recent continual learning methods are capable of alleviating the catastrophic problem on toy-sized datasets, some issues still remain to be tackled when applying them in real-world problems. Recently, the fast mask-based learning method (e.g. piggyback \cite{mallya2018piggyback}) is proposed to address these issues by learning only a binary element-wise mask in a fast manner, while keeping the backbone model fixed. However, the binary mask has limited modeling capacity for new tasks. A more recent work \cite{hung2019compacting} proposes a compress-grow-based method (CPG) to achieve better accuracy for new tasks by partially training backbone model, but with order-higher training cost, which makes it infeasible to be deployed into popular state-of-the-art edge-/mobile-learning. The primary goal of this work is to simultaneously achieve fast and high-accuracy multi task adaption in continual learning setting. Thus motivated, we propose a new training method called \textit{kernel-wise Soft Mask} (KSM), which learns a kernel-wise hybrid binary and real-value soft mask for each task, while using the same backbone model. Such a soft mask can be viewed as a superposition of a binary mask and a properly scaled real-value tensor, which offers a richer representation capability without low-level kernel support to meet the objective of low hardware overhead. We validate KSM on multiple benchmark datasets against recent state-of-the-art methods (e.g. Piggyback, Packnet, CPG, etc.), which shows good improvement in both accuracy and training cost.

Abstract (translated)

URL

https://arxiv.org/abs/2009.05668

PDF

https://arxiv.org/pdf/2009.05668.pdf
Read All
Mask Point R-CNN

2020-08-02 11:11:28

Wenchao Zhang, Chong Fu, Mai Zhu

arXiv_CV

arXiv_CV Segmentation Detection Attention Multi_Task Pose Contour
Abstract

The attributes of object contours has great significance for instance segmentation task. However, most of the current popular deep neural networks do not pay much attention to the target edge information. Inspired by the human annotation process when making instance segmentation datasets, in this paper, we propose Mask Point RCNN aiming at promoting the neural networks attention to the target edge information, which can heighten the information propagates between multiple tasks by using different attributes features. Specifically, we present an auxiliary task to Mask RCNN, including utilizing keypoint detection technology to construct the target edge contour, and enhancing the sensitivity of the network to the object edge through multi task learning and feature fusion. These improvements are easy to implement and have a small amount of additional computing overhead. By extensive evaluations on the Cityscapes dataset, the results show that our approach outperforms vanilla Mask RCNN by 5.4 on the validation subset and 5.0 on the test subset.

Abstract (translated)

URL

https://arxiv.org/abs/2008.00460

PDF

https://arxiv.org/pdf/2008.00460.pdf
Read All
Hierarchical Multi Task Learning with Subword Contextual Embeddings for Languages with Rich Morphology

2020-04-25 22:55:56

Arda Akdemir, Tetsuo Shibuya, Tunga Güngör

arXiv_CL

arXiv_CL Recognition Embedding Multi_Task Knowledge Pose
Abstract

Morphological information is important for many sequence labeling tasks in Natural Language Processing (NLP). Yet, existing approaches rely heavily on manual annotations or external software to capture this information. In this study, we propose using subword contextual embeddings to capture the morphological information for languages with rich morphology. In addition, we incorporate these embeddings in a hierarchical multi-task setting which is not employed before, to the best of our knowledge. Evaluated on Dependency Parsing (DEP) and Named Entity Recognition (NER) tasks, which are shown to benefit greatly from morphological information, our final model outperforms previous state-of-the-art models on both tasks for the Turkish language. Besides, we show a net improvement of 18.86% and 4.61% F-1 over the previously proposed multi-task learner in the same setting for the DEP and the NER tasks, respectively. Empirical results for five different MTL settings show that incorporating subword contextual embeddings brings significant improvements for both tasks. In addition, we observed that multi-task learning consistently improves the performance of the DEP component.

Abstract (translated)

URL

https://arxiv.org/abs/2004.12247

PDF

https://arxiv.org/pdf/2004.12247.pdf
Read All
Multi-task Learning Based Neural Bridging Reference Resolution

2020-03-07 21:21:29

Juntao Yu, Massimo Poesio

arXiv_CL

arXiv_CL Face Multi_Task Pose
Abstract

We propose a multi task learning-based neural model for bridging reference resolution tackling two key challenges faced by bridging reference resolution. The first challenge is the lack of large corpora annotated with bridging references. To address this, we use multi-task learning to help bridging reference resolution with coreference resolution. We show that substantial improvements of up to 8 p.p. can be achieved on full bridging resolution with this architecture. The second challenge is the different definitions of bridging used in different corpora, meaning that hand-coded systems or systems using special features designed for one corpus do not work well with other corpora. Our neural model only uses a small number of corpus independent features, thus can be applied easily to different corpora. Evaluations with very different bridging corpora (ARRAU, ISNOTES, BASHI and SCICORP) suggest that our architecture works equally well on all corpora, and achieves the SoTA results on full bridging resolution for all corpora, outperforming the best reported results by up to 34.9 percentage points.

Abstract (translated)

URL

https://arxiv.org/abs/2003.03666

PDF

https://arxiv.org/pdf/2003.03666.pdf
Read All
Data Distillation, Face-Related Tasks, Multi Task Learning, Semi-Supervised Learning

2019-07-08 04:36:32

Sepidehsadat Hosseini, Mohammad Amin Shabani, Nam Ik Cho

arXiv_CV

arXiv_CV Face Multi_Task Prediction Optimization Pose
Abstract

We propose a new semi-supervised learning method on face-related tasks based on Multi-Task Learning (MTL) and data distillation. The proposed method exploits multiple datasets with different labels for different-but-related tasks such as simultaneous age, gender, race, facial expression estimation. Specifically, when there are only a few well-labeled data for a specific task among the multiple related ones, we exploit the labels of other related tasks in different domains. Our approach is composed of (1) a new MTL method which can deal with weakly labeled datasets and perform several tasks simultaneously, and (2) an MTL-based data distillation framework which enables network generalization for the training and test data from different domains. Experiments show that the proposed multi-task system performs each task better than the baseline single task. It is also demonstrated that using different domain datasets along with the main dataset can enhance network generalization and overcome the domain differences between datasets. Also, comparing data distillation both on the baseline and MTL framework, the latter shows more accurate predictions on unlabeled data from different domains. Furthermore, by proposing a new learning-rate optimization method, our proposed network is able to dynamically tune its learning rate.

Abstract (translated)

URL

https://arxiv.org/abs/1907.03402

PDF

https://arxiv.org/pdf/1907.03402.pdf
Read All

Content

Multi_Task (20)

Multi_Task

Abstract

Abstract (translated)

URL

PDF

Abstract

Abstract (translated)

URL

PDF

Abstract

Abstract (translated)

URL

PDF

Abstract

Abstract (translated)

URL

PDF

Abstract

Abstract (translated)

URL

PDF

Abstract

Abstract (translated)

URL

PDF

Abstract

Abstract (translated)

URL

PDF

Abstract

Abstract (translated)

URL

PDF

Abstract

Abstract (translated)

URL

PDF

Abstract

Abstract (translated)

URL

PDF

Abstract

Abstract (translated)

URL

PDF

Abstract

Abstract (translated)

URL

PDF

Abstract

Abstract (translated)

URL

PDF

Abstract

Abstract (translated)

URL

PDF

Abstract

Abstract (translated)

URL

PDF

Abstract

Abstract (translated)

URL

PDF

Abstract

Abstract (translated)

URL

PDF

Abstract

Abstract (translated)

URL

PDF

Abstract

Abstract (translated)

URL

PDF

Abstract

Abstract (translated)

URL