NeFF-BioNet: Crop Biomass Prediction from Point Cloud to Drone Imagery

2024-10-30 04:53:11
Xuesong Li, Zeeshan Hayder, Ali Zia, Connor Cassidy, Shiming Liu, Warwick Stiller, Eric Stone, Warren Conaty, Lars Petersson, Vivien Rolland

Abstract

Crop biomass offers crucial insights into plant health and yield, making it essential for crop science, farming systems, and agricultural research. However, current measurement methods, which are labor-intensive, destructive, and imprecise, hinder large-scale quantification of this trait. To address this limitation, we present a biomass prediction network (BioNet), designed for adaptation across different data modalities, including point clouds and drone imagery. Our BioNet, utilizing a sparse 3D convolutional neural network (CNN) and a transformer-based prediction module, processes point clouds and other 3D data representations to predict biomass. To further extend BioNet for drone imagery, we integrate a neural feature field (NeFF) module, enabling 3D structure reconstruction and the transformation of 2D semantic features from vision foundation models into the corresponding 3D surfaces. For the point cloud modality, BioNet demonstrates superior performance on two public datasets, with an approximate 6.1% relative improvement (RI) over the state-of-the-art. In the RGB image modality, the combination of BioNet and NeFF achieves a 7.9% RI. Additionally, the NeFF-based approach utilizes inexpensive, portable drone-mounted cameras, providing a scalable solution for large field applications.
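To make the described architecture more concrete, below is a minimal, illustrative sketch of a BioNet-style pipeline, not the authors' implementation: a 3D convolutional backbone over a voxelized point cloud, followed by a transformer encoder over voxel tokens and a regression head producing a single biomass value. The paper uses sparse 3D convolutions; dense Conv3d layers are used here only to keep the example self-contained, and all layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class BioNetSketch(nn.Module):
    """Hypothetical BioNet-like model: 3D CNN backbone + transformer prediction module."""

    def __init__(self, in_channels: int = 1, embed_dim: int = 128):
        super().__init__()
        # 3D CNN backbone: downsamples the voxel grid and extracts local features.
        # (The paper uses a *sparse* 3D CNN; dense Conv3d is a stand-in here.)
        self.backbone = nn.Sequential(
            nn.Conv3d(in_channels, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(32, 64, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(64, embed_dim, kernel_size=3, stride=2, padding=1),
        )
        # Transformer-based prediction module over the flattened voxel tokens.
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=2)
        self.head = nn.Linear(embed_dim, 1)  # scalar biomass prediction

    def forward(self, voxels: torch.Tensor) -> torch.Tensor:
        # voxels: (B, C, D, H, W) occupancy/feature grid built from a point cloud
        feat = self.backbone(voxels)              # (B, E, d, h, w)
        tokens = feat.flatten(2).transpose(1, 2)  # (B, N, E) voxel tokens
        tokens = self.transformer(tokens)
        pooled = tokens.mean(dim=1)               # global plot-level descriptor
        return self.head(pooled).squeeze(-1)      # (B,) biomass estimates

# Example usage: predict biomass for a batch of two 64^3 occupancy grids.
model = BioNetSketch()
dummy = torch.rand(2, 1, 64, 64, 64)
print(model(dummy).shape)  # torch.Size([2])
```

For the drone-imagery pathway, the NeFF module would sit upstream of such a network, reconstructing a 3D representation from multi-view RGB images and lifting 2D foundation-model features onto it before the same backbone and prediction module are applied.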


URL

https://arxiv.org/abs/2410.23901

PDF

https://arxiv.org/pdf/2410.23901.pdf

