Paper Reading AI Learner

Human-computer interactions predict mental health

2025-12-15 16:47:48
Veith Weilnhammer, Jefferson Ortega, David Whitney

Abstract

Scalable assessments of mental illness, the leading driver of disability worldwide, remain a critical roadblock toward accessible and equitable care. Here, we show that human-computer interactions encode mental health with state-of-the-art biomarker precision. We introduce MAILA, a MAchine-learning framework for Inferring Latent mental states from digital Activity. We trained MAILA to predict 1.3 million mental-health self-reports from 20,000 cursor and touchscreen recordings recorded in 9,000 online participants. The dataset includes 2,000 individuals assessed longitudinally, 1,500 diagnosed with depression, and 500 with obsessive-compulsive disorder. MAILA tracks dynamic mental states along three orthogonal dimensions, identifies individuals living with mental illness, and achieves near-ceiling accuracy when predicting group-level mental health. By extracting non-verbal signatures of psychological function that have so far remained untapped, MAILA represents a key step toward foundation models for mental health. The ability to decode mental states at zero marginal cost creates new opportunities in neuroscience, medicine, and public health, while raising urgent questions about privacy, agency, and autonomy online.

URL

https://arxiv.org/abs/2511.20179

PDF

https://arxiv.org/pdf/2511.20179.pdf

