
MUSEFood: Multi-sensor-based Food Volume Estimation on Smartphones

2019-03-18 13:40:23
Junyi Gao, Weihao Tan, Liantao Ma, Yasha Wang, Wen Tang

Abstract

Research has shown that diet recording can help people increase their awareness of food intake and improve nutrition management, and thereby maintain a healthier life. Recently, researchers have been working on smartphone-based diet recording methods and applications that help users accomplish two tasks: recording what they eat and how much they eat. Although the former task has advanced greatly through the adoption of image recognition technology, estimating the volume of food accurately and conveniently remains a challenge. In this paper, we propose a novel method, named MUSEFood, for food volume estimation. MUSEFood uses the camera to capture photos of the food, but unlike existing volume measurement methods, MUSEFood requires neither training images with volume information nor a reference object of known size placed in the scene while taking photos. In addition, considering the impact of different containers on the contour shape of foods, MUSEFood uses a multi-task learning framework to improve the accuracy of food segmentation, and uses a differential model applicable to various containers to further reduce the negative impact of container differences on volume estimation accuracy. Furthermore, MUSEFood uses the microphone and the speaker to accurately measure the vertical distance from the camera to the food in a noisy environment, and uses this distance to scale the size of the food in the image to its actual size. Experiments on real foods indicate that MUSEFood outperforms state-of-the-art approaches and significantly improves the speed of food volume estimation.
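
The abstract does not spell out how the acoustic distance is turned into a scale factor, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the distance comes from the round-trip delay of a sound emitted by the speaker and picked up by the microphone, and that image lengths are scaled with a simple pinhole camera model. The function names, the 343 m/s speed of sound, and the example numbers are all illustrative.

```python
# Illustrative sketch only: names and values are assumptions, not from the paper.

SPEED_OF_SOUND_M_PER_S = 343.0  # approximate value for dry air near 20 °C


def echo_distance_m(round_trip_s: float,
                    speed_m_per_s: float = SPEED_OF_SOUND_M_PER_S) -> float:
    """One-way distance from a round-trip echo delay (sound travels there and back)."""
    return speed_m_per_s * round_trip_s / 2.0


def pixels_to_metres(length_px: float, distance_m: float,
                     focal_length_px: float) -> float:
    """Pinhole model: real length = pixel length * distance / focal length in pixels."""
    return length_px * distance_m / focal_length_px


if __name__ == "__main__":
    # Hypothetical measurement: a 2 ms echo delay and a 600-pixel-wide food region
    # seen by a camera whose focal length is 3000 pixels.
    distance = echo_distance_m(0.002)
    width = pixels_to_metres(600, distance, 3000)
    print(f"distance: {distance:.3f} m, food width: {width:.4f} m")
```

The sketch treats the food surface as a flat plane perpendicular to the camera axis. A real system would also need to detect the echo robustly in noise and account for the offset between the speaker, microphone, and camera, which the abstract mentions only at a high level.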


URL

https://arxiv.org/abs/1903.07437

PDF

https://arxiv.org/pdf/1903.07437.pdf

