Paper Reading AI Learner

Particle Trajectory Representation Learning with Masked Point Modeling

2025-02-04 18:31:56
Sam Young, Yeon-jae Jwa, Kazuhiro Terao

Abstract

Effective self-supervised learning (SSL) techniques have been key to unlocking large datasets for representation learning. While many promising methods have been developed using online corpora and captioned photographs, their application to scientific domains, where data encodes highly specialized knowledge, remains in its early stages. We present a self-supervised masked modeling framework for 3D particle trajectory analysis in Time Projection Chambers (TPCs). These detectors produce globally sparse (<1% occupancy) but locally dense point clouds, capturing meter-scale particle trajectories at millimeter resolution. Starting with PointMAE, this work proposes volumetric tokenization to group sparse ionization points into resolution-agnostic patches, as well as an auxiliary energy infilling task to improve trajectory semantics. This approach -- which we call Point-based Liquid Argon Masked Autoencoder (PoLAr-MAE) -- achieves 99.4% track and 97.7% shower classification F-scores, matching those of supervised baselines without any labeled data. While the model learns rich particle trajectory representations, it struggles with sub-token phenomena such as overlapping or short-lived particle trajectories. To support further research, we release PILArNet-M -- the largest open LArTPC dataset (1M+ events, 5.2B labeled points) -- to advance SSL in high energy physics (HEP). Project site: this https URL
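
The abstract names two ingredients, grouping sparse points into patches and masking some of them for reconstruction. The sketch below is only a toy illustration of that general idea and is not the authors' volumetric tokenization or the PoLAr-MAE pipeline. It assumes a simple fixed voxel grid as a stand-in for the patching step, and every name in it (`group_into_voxel_patches`, `split_masked_visible`, `occupancy`, `patch_size`, `mask_ratio`) is hypothetical.

```python
import random
from collections import defaultdict


def group_into_voxel_patches(points, patch_size):
    """Group (x, y, z, energy) points by the coarse voxel that contains them.

    Assumption: a fixed cubic grid stands in for the paper's tokenizer.
    """
    patches = defaultdict(list)
    for x, y, z, energy in points:
        key = (int(x // patch_size), int(y // patch_size), int(z // patch_size))
        patches[key].append((x, y, z, energy))
    return dict(patches)


def split_masked_visible(patches, mask_ratio, seed=0):
    """Randomly hide a fraction of patches; return (visible, masked) dicts."""
    keys = sorted(patches)
    rng = random.Random(seed)
    rng.shuffle(keys)
    n_masked = int(round(mask_ratio * len(keys)))
    masked_keys = set(keys[:n_masked])
    visible = {k: v for k, v in patches.items() if k not in masked_keys}
    masked = {k: v for k, v in patches.items() if k in masked_keys}
    return visible, masked


def occupancy(patches, grid_shape):
    """Fraction of coarse voxels in a grid that contain at least one point."""
    total_voxels = grid_shape[0] * grid_shape[1] * grid_shape[2]
    return len(patches) / total_voxels


if __name__ == "__main__":
    # Toy "track": 40 points along a diagonal, each with unit energy.
    toy_points = [(float(i), float(i), float(i), 1.0) for i in range(40)]
    toy_patches = group_into_voxel_patches(toy_points, patch_size=10.0)
    visible, masked = split_masked_visible(toy_patches, mask_ratio=0.5)
    print(len(visible), "visible patches;", len(masked), "masked patches")
    print("occupancy:", occupancy(toy_patches, (4, 4, 4)))
```

In a masked-modeling setup of this kind, an encoder would see only the visible patches, and a decoder would be trained to reconstruct the point positions (and, for an energy infilling task, the energies) of the masked ones. The model and losses are omitted here because the abstract does not specify them.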

URL

https://arxiv.org/abs/2502.02558

PDF

https://arxiv.org/pdf/2502.02558.pdf
