Abstract
The surge in rich multimodal content on social media platforms has greatly advanced Multimodal Sentiment Analysis (MSA), with Large Language Models (LLMs) further accelerating progress in this field. Current approaches primarily leverage the knowledge and reasoning capabilities of parameter-heavy (Multimodal) LLMs for sentiment classification, overlooking autonomous multimodal sentiment reasoning generation in resource-constrained environments. We therefore focus on the Resource-Limited Joint Multimodal Sentiment Reasoning and Classification task (JMSRC), which performs multimodal sentiment reasoning chain generation and sentiment classification simultaneously, using only a lightweight model. We propose MulCoT-RD, a Multimodal Chain-of-Thought Reasoning Distillation model designed for JMSRC, which employs a "Teacher-Assistant-Student" distillation paradigm to address deployment constraints in resource-limited environments. We first leverage a high-performance Multimodal Large Language Model (MLLM) to generate an initial reasoning dataset and train a medium-sized assistant model with a multi-task learning mechanism. A lightweight student model is then jointly trained to perform efficient multimodal sentiment reasoning generation and classification. Extensive experiments on four datasets demonstrate that MulCoT-RD, with only 3B parameters, achieves strong performance on JMSRC while exhibiting robust generalization and enhanced interpretability.
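The joint objective described above (reasoning chain generation plus sentiment classification in one lightweight model) can be sketched as a weighted multi-task loss. The function names and the balancing weight `alpha` below are illustrative assumptions, not the paper's actual formulation:

```python
import math

def cross_entropy(logits, target):
    # Numerically stable softmax cross-entropy for one prediction:
    # log-sum-exp of the logits minus the logit of the target class.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[target]

def joint_loss(reasoning_token_logits, reasoning_targets,
               class_logits, class_target, alpha=0.5):
    # Multi-task objective (hypothetical): average token-level loss
    # over the generated reasoning chain, plus a single sentiment
    # classification loss, mixed with an assumed weight alpha.
    gen_loss = sum(cross_entropy(step, tgt)
                   for step, tgt in zip(reasoning_token_logits,
                                        reasoning_targets))
    gen_loss /= len(reasoning_targets)
    cls_loss = cross_entropy(class_logits, class_target)
    return alpha * gen_loss + (1 - alpha) * cls_loss
```

In a real distillation setup, the reasoning targets for the student would come from the assistant model's generated chains, and the classification targets from the dataset labels; this sketch only shows how the two tasks share one scalar objective.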
URL
https://arxiv.org/abs/2508.05234