Memory_Networks

Mamba-360: Survey of State Space Models as Transformer Alternative for Long Sequence Modelling: Methods, Applications, and Challenges

2024-04-24 18:10:31

Badri Narayana Patro, Vijay Srinivas Agneeswaran

arXiv_AI

arXiv_AI Speech_Recognition RNN Recognition Memory_Networks Survey Attention Recommendation Transformer Pose Medical Speech
Abstract

Sequence modeling is a crucial area across various domains, including Natural Language Processing (NLP), speech recognition, time series forecasting, music generation, and bioinformatics. Recurrent Neural Networks (RNNs) and Long Short-Term Memory networks (LSTMs) have historically dominated sequence modeling tasks such as machine translation and Named Entity Recognition (NER). However, the advancement of transformers has shifted this paradigm, given their superior performance. Yet transformers suffer from $O(N^2)$ attention complexity and challenges in handling inductive bias. Several variations that use spectral networks or convolutions have been proposed to address these issues and have performed well on a range of tasks. However, they still have difficulty in dealing with long sequences. State Space Models (SSMs) have emerged as promising alternatives for sequence modeling in this context, especially with the advent of S4 and its variants, such as S4nd, Hippo, Hyena, Diagonal State Spaces (DSS), Gated State Spaces (GSS), Linear Recurrent Unit (LRU), Liquid-S4, Mamba, etc. In this survey, we categorize the foundational SSMs based on three paradigms, namely gating architectures, structural architectures, and recurrent architectures. This survey also highlights diverse applications of SSMs across domains such as vision, video, audio, speech, language (especially long sequence modeling), medical (including genomics), chemical (like drug design), recommendation systems, and time series analysis, including tabular data. Moreover, we consolidate the performance of SSMs on benchmark datasets like Long Range Arena (LRA), WikiText, GLUE, Pile, ImageNet, Kinetics-400, sstv2, as well as video datasets such as Breakfast, COIN, LVU, and various time series datasets. The project page for the Mamba-360 work is available at this https URL.
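To make the contrast with $O(N^2)$ attention concrete, the following is a minimal sketch of a discretized linear state space recurrence, x_k = A x_{k-1} + B u_k and y_k = C x_k, with a diagonal A. It is an illustration only: the function name `ssm_scan` and the toy parameter values are assumptions, not taken from the survey or from any specific model it covers. The point is that one pass over the sequence costs time linear in its length.

```python
def ssm_scan(a_diag, b, c, inputs):
    """Run a diagonal linear state space model over a scalar input sequence.

    x_k = a_diag * x_{k-1} + b * u_k   (elementwise, because A is diagonal)
    y_k = sum(c * x_k)
    The loop touches each input once, so the cost is O(N) in sequence length.
    """
    state = [0.0] * len(a_diag)
    outputs = []
    for u in inputs:
        # State update: decay the previous state and inject the new input.
        state = [a * x + bi * u for a, x, bi in zip(a_diag, state, b)]
        # Readout: project the hidden state to a scalar output.
        outputs.append(sum(ci * x for ci, x in zip(c, state)))
    return outputs


# Hypothetical toy parameters with two state dimensions.
ys = ssm_scan(a_diag=[0.5, 0.9], b=[1.0, 1.0], c=[1.0, 0.5],
              inputs=[1.0, 0.0, 0.0])
```

The impulse input shows the memory effect: the output decays over later steps even though the later inputs are zero.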

URL

https://arxiv.org/abs/2404.16112

PDF

https://arxiv.org/pdf/2404.16112.pdf
Beyond Gait: Learning Knee Angle for Seamless Prosthesis Control in Multiple Scenarios

2024-04-10 06:28:19

Pengwei Wang, Yilong Chen, Wan Su, Jie Wang, Teng Ma, Haoyong Yu

arXiv_RO

arXiv_RO RNN CNN Memory_Networks Deep_Learning Prediction Transformer Pose
Abstract

Deep learning models have become a powerful tool in knee angle estimation for lower limb prostheses, owing to their adaptability across various gait phases and locomotion modes. Current methods utilize Multi-Layer Perceptrons (MLP), Long Short-Term Memory networks (LSTM), and Convolutional Neural Networks (CNN), predominantly analyzing motion information from the thigh. In contrast to these approaches, our study takes a holistic perspective by integrating whole-body movements as inputs. We propose a transformer-based probabilistic framework, termed the Angle Estimation Probabilistic Model (AEPM), that offers precise angle estimations across extensive scenarios beyond walking. AEPM achieves an overall RMSE of 6.70 degrees, with an RMSE of 3.45 degrees in walking scenarios. Compared to the state of the art, AEPM improves the prediction accuracy for walking by 11.31%. Our method can achieve seamless adaptation between different locomotion modes. This model can also be used to analyze the synergy between the knee and other joints. We reveal that whole-body movement carries valuable information for knee movement, which can provide insights into designing sensors for prostheses. The code is available at this https URL.

URL

https://arxiv.org/abs/2404.06772

PDF

https://arxiv.org/pdf/2404.06772.pdf
Dual Memory Networks: A Versatile Adaptation Approach for Vision-Language Models

2024-03-26 10:54:07

Yabin Zhang, Wenjie Zhu, Hui Tang, Zhiyuan Ma, Kaiyang Zhou, Lei Zhang

arXiv_AI

arXiv_AI Memory_Networks Classification Attention Knowledge Language_Model Transformer Pose Few-Shot Zero-Shot
Abstract

With the emergence of pre-trained vision-language models like CLIP, how to adapt them to various downstream classification tasks has garnered significant attention in recent research. The adaptation strategies can typically be categorized into three paradigms: zero-shot adaptation, few-shot adaptation, and the recently proposed training-free few-shot adaptation. Most existing approaches are tailored for a specific setting and can only cater to one or two of these paradigms. In this paper, we introduce a versatile adaptation approach that can effectively work under all three settings. Specifically, we propose the dual memory networks that comprise dynamic and static memory components. The static memory caches training data knowledge, enabling training-free few-shot adaptation, while the dynamic memory preserves historical test features online during the testing process, allowing for the exploration of additional data insights beyond the training set. This novel capability enhances model performance in the few-shot setting and enables model usability in the absence of training data. The two memory networks employ the same flexible memory interactive strategy, which can operate in a training-free mode and can be further enhanced by incorporating learnable projection layers. Our approach is tested across 11 datasets under the three task settings. Remarkably, in the zero-shot scenario, it outperforms existing methods by over 3% and even shows superior results against methods utilizing external training data. Additionally, our method exhibits robust performance against natural distribution shifts. Code is available at this https URL.
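The idea of a static memory that caches training knowledge for training-free few-shot adaptation can be sketched as a similarity-weighted vote over cached (feature, label) pairs. This is a generic cache-style readout written under assumptions, not the authors' memory interactive strategy: the names `memory_readout` and `sharpness`, the cosine similarity, and the toy features are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def memory_readout(query, memory_keys, memory_labels, num_classes, sharpness=5.0):
    """Classify a query by a similarity-weighted vote over cached samples.

    No parameters are trained: the memory only stores features and labels.
    `sharpness` (an assumed hyperparameter) controls how strongly the most
    similar cached entries dominate the vote.
    """
    if not memory_keys:
        raise ValueError("memory is empty")
    scores = [0.0] * num_classes
    for key, label in zip(memory_keys, memory_labels):
        scores[label] += math.exp(sharpness * cosine(query, key))
    total = sum(scores)
    return [s / total for s in scores]


# Hypothetical cached few-shot features (2-D for readability) and labels.
keys = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
labels = [0, 0, 1]
probs = memory_readout([0.8, 0.2], keys, labels, num_classes=2)
```

A dynamic memory in the same spirit would append test-time features with predicted labels to `keys` and `labels` as they arrive; that extension is left out here.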

URL

https://arxiv.org/abs/2403.17589

PDF

https://arxiv.org/pdf/2403.17589.pdf
Multi-perspective Memory Enhanced Network for Identifying Key Nodes in Social Networks

2024-03-22 14:29:03

Qiang Zhang, Jiawei Liu, Fanrui Zhang, Xiaoling Zhu, Zheng-Jun Zha

arXiv_AI

arXiv_AI Memory_Networks Attention Pose
Abstract

Identifying key nodes in social networks plays a crucial role in blocking false information in a timely manner. Existing key node identification methods usually consider node influence only from the propagation structure perspective and generalize poorly to unknown scenarios. In this paper, we propose a novel Multi-perspective Memory Enhanced Network (MMEN) for identifying key nodes in social networks, which mines key nodes from multiple perspectives and utilizes memory networks to store historical information. Specifically, MMEN first constructs two propagation networks from the perspectives of user attributes and propagation structure and updates node feature representations using graph attention networks. Meanwhile, the memory network is employed to store information of similar subgraphs, enhancing the model's generalization performance in unknown scenarios. Finally, MMEN applies adaptive weights to combine the node influence of the two propagation networks to select the ultimate key nodes. Extensive experiments demonstrate that our method significantly outperforms previous methods.

URL

https://arxiv.org/abs/2403.15235

PDF

https://arxiv.org/pdf/2403.15235.pdf
Preventing Catastrophic Forgetting through Memory Networks in Continuous Detection

2024-03-21 19:20:29

Gaurav Bhatt, James Ross, Leonid Sigal

arXiv_CV

arXiv_CV Segmentation Detection Object_Detection Memory_Networks Classification Knowledge Optimization Transformer Pose
Abstract

Modern pre-trained architectures struggle to retain previous information while undergoing continuous fine-tuning on new tasks. Despite notable progress in continual classification, systems designed for complex vision tasks such as detection or segmentation still struggle to attain satisfactory performance. In this work, we introduce a memory-based detection transformer architecture to adapt a pre-trained DETR-style detector to new tasks while preserving knowledge from previous tasks. We propose a novel localized query function for efficient information retrieval from memory units, aiming to minimize forgetting. Furthermore, we identify a fundamental challenge in continual detection referred to as background relegation. This arises when object categories from earlier tasks reappear in future tasks, potentially without labels, leading them to be implicitly treated as background. This is an inevitable issue in continual detection or segmentation. The introduced continual optimization technique effectively tackles this challenge. Finally, we assess the performance of our proposed system on continual detection benchmarks and demonstrate that our approach surpasses existing state-of-the-art methods, with 5-7% improvements on MS-COCO and PASCAL-VOC on the task of continual detection.

URL

https://arxiv.org/abs/2403.14797

PDF

https://arxiv.org/pdf/2403.14797.pdf
Spatial-temporal Memories Enhanced Graph Autoencoder for Anomaly Detection in Dynamic Graphs

2024-03-14 02:26:10

Jie Liu, Xuequn Shang, Xiaolin Han, Wentao Zhang, Hongzhi Yin

arXiv_AI

arXiv_AI Detection Memory_Networks Face Attention Embedding Relation Unsupervised Reconstruction
Abstract

Anomaly detection in dynamic graphs presents a significant challenge due to the temporal evolution of graph structures and attributes. The conventional approaches that tackle this problem typically employ an unsupervised learning framework, capturing normality patterns using exclusively normal data during training and identifying deviations as anomalies during testing. However, these methods face critical drawbacks: they either only depend on proxy tasks for general representation without directly pinpointing normal patterns, or they neglect to differentiate between spatial and temporal normality patterns, leading to diminished efficacy in anomaly detection. To address these challenges, we introduce a novel Spatial-Temporal memories-enhanced graph autoencoder (STRIPE). Initially, STRIPE employs Graph Neural Networks (GNNs) and gated temporal convolution layers to extract spatial features and temporal features, respectively. Then STRIPE incorporates separate spatial and temporal memory networks, which capture and store prototypes of normal patterns, thereby preserving the uniqueness of spatial and temporal normality. Next, through a mutual attention mechanism, these stored patterns are retrieved and integrated with encoded graph embeddings. Finally, the integrated features are fed into the decoder to reconstruct the graph streams, which serves as the proxy task for anomaly detection. This comprehensive approach not only minimizes reconstruction errors but also refines the model by emphasizing the compactness and distinctiveness of the embeddings in relation to the nearest memory prototypes. Through extensive testing, STRIPE has demonstrated a superior capability to discern anomalies by effectively leveraging the distinct spatial and temporal dynamics of dynamic graphs, significantly outperforming existing methodologies, with an average improvement of 15.39% on AUC values.
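The role of "nearest memory prototypes" can be illustrated with a very small sketch: if a memory holds prototypes of normal patterns, the distance from an embedding to its nearest prototype can serve as a simple anomaly score. This is a simplification under assumptions and not the STRIPE scoring rule, which also involves reconstruction error. The function name and the 2-D toy vectors are hypothetical.

```python
import math


def nearest_prototype_distance(embedding, prototypes):
    """Euclidean distance from an embedding to its closest memory prototype.

    Small values mean the embedding sits close to a stored normal pattern;
    large values suggest it deviates from everything the memory has seen.
    """
    return min(math.dist(embedding, proto) for proto in prototypes)


# Hypothetical prototypes of normal patterns and two query embeddings.
normal_prototypes = [[0.0, 0.0], [1.0, 1.0]]
score_normal = nearest_prototype_distance([0.1, 0.0], normal_prototypes)
score_outlier = nearest_prototype_distance([3.0, 4.0], normal_prototypes)
```

Keeping separate prototype lists for spatial and temporal embeddings, as the abstract describes, would mean calling this once per memory and combining the two scores.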

URL

https://arxiv.org/abs/2403.09039

PDF

https://arxiv.org/pdf/2403.09039.pdf
Predicting Outcomes in Video Games with Long Short Term Memory Networks

2024-02-24 22:36:23

Kittimate Chulajata, Sean Wu, Fabien Scalzo, Eun Sang Cha

arXiv_AI

arXiv_AI RNN Memory_Networks Prediction Language_Model Transformer
Abstract

Forecasting winners in E-sports with real-time analytics has the potential to further engage audiences watching major tournament events. However, making such real-time predictions is challenging due to unpredictable variables within the game involving diverse player strategies and decision-making. Our work attempts to enhance audience engagement within video game tournaments by introducing a real-time method of predicting wins. Our Long Short-Term Memory (LSTM) network-based approach enables efficient predictions of win-lose outcomes by using only the health indicator of each player as a time series. As a proof of concept, we evaluate our model's performance within a classic, two-player arcade game, Super Street Fighter II Turbo. We also benchmark our method against state-of-the-art methods for time series forecasting, i.e., Transformer models found in large language models (LLMs). Finally, we open-source our data set and code in hopes of furthering work in predictive analysis for arcade games.

URL

https://arxiv.org/abs/2402.15923

PDF

https://arxiv.org/pdf/2402.15923.pdf
A comparative study of zero-shot inference with large language models and supervised modeling in breast cancer pathology classification

2024-01-25 02:05:31

Madhumita Sushil, Travis Zack, Divneet Mandair, Zhiwei Zheng, Ahmed Wali, Yan-Ning Yu, Yuwei Quan, Atul J. Butte

arXiv_CL

arXiv_CL RNN Memory_Networks Classification Attention Transfer_Learning Inference Language_Model Bert Transformer Action Zero-Shot Chat
Abstract

Although supervised machine learning is popular for information extraction from clinical notes, creating large annotated datasets requires extensive domain expertise and is time-consuming. Meanwhile, large language models (LLMs) have demonstrated promising transfer learning capability. In this study, we explored whether recent LLMs can reduce the need for large-scale data annotations. We curated a manually-labeled dataset of 769 breast cancer pathology reports, labeled with 13 categories, to compare zero-shot classification capability of the GPT-4 model and the GPT-3.5 model with supervised classification performance of three model architectures: random forests classifier, long short-term memory networks with attention (LSTM-Att), and the UCSF-BERT model. Across all 13 tasks, the GPT-4 model performed either significantly better than or as well as the best supervised model, the LSTM-Att model (average macro F1 score of 0.83 vs. 0.75). On tasks with high imbalance between labels, the differences were more prominent. Frequent sources of GPT-4 errors included inferences from multiple samples and complex task design. On complex tasks where large annotated datasets cannot be easily collected, LLMs can reduce the burden of large-scale data labeling. However, if the use of LLMs is prohibitive, the use of simpler supervised models with large annotated datasets can provide comparable results. LLMs demonstrated the potential to speed up the execution of clinical NLP studies by reducing the need for curating large annotated datasets. This may result in an increase in the utilization of NLP-based variables and outcomes in observational clinical studies.
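Because the comparison above is reported as an average macro F1 score, and the abstract notes larger differences on tasks with imbalanced labels, a short sketch of the metric may help. Macro F1 averages the per-class F1 scores with equal weight, so rare classes count as much as frequent ones. The label names and predictions below are made up for illustration and are not data from the study.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores.

    Per-class F1 = 2*TP / (2*TP + FP + FN). Every class that appears in
    either the gold labels or the predictions contributes equally.
    """
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical imbalanced example: two "pos" reports among six.
gold = ["pos", "pos", "neg", "neg", "neg", "neg"]
pred = ["pos", "neg", "neg", "neg", "neg", "neg"]
score = macro_f1(gold, pred)
```

Here accuracy is 5/6, but the missed minority-class report pulls the "pos" F1 down to 2/3, and the macro average reflects that.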

URL

https://arxiv.org/abs/2401.13887

PDF

https://arxiv.org/pdf/2401.13887.pdf
RELIANCE: Reliable Ensemble Learning for Information and News Credibility Evaluation

2024-01-17 13:11:09

Majid Ramezani, Hamed Mohammad-Shahi, Mahshid Daliry, Soroor Rahmani, Amir-Hosein Asghari

arXiv_CL

arXiv_CL RNN Memory_Networks Pose
Abstract

In the era of information proliferation, discerning the credibility of news content poses an ever-growing challenge. This paper introduces RELIANCE, a pioneering ensemble learning system designed for robust information and fake news credibility evaluation. Comprising five diverse base models, namely Support Vector Machine (SVM), naive Bayes, logistic regression, random forest, and Bidirectional Long Short-Term Memory networks (BiLSTMs), RELIANCE employs an innovative approach to integrate their strengths, harnessing the collective intelligence of the ensemble for enhanced accuracy. Experiments demonstrate the superiority of RELIANCE over individual models, indicating its efficacy in distinguishing between credible and non-credible information sources. RELIANCE also surpasses baseline models in information and news credibility assessment, establishing itself as an effective solution for evaluating the reliability of information sources.
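The abstract does not say how RELIANCE integrates its five base models, so the sketch below uses plain majority voting as a stand-in to show the general shape of an ensemble decision. The combination rule, the function name, and the predictions are assumptions for illustration, not the paper's method.

```python
from collections import Counter


def majority_vote(per_model_predictions):
    """Combine label predictions from several models by majority vote.

    `per_model_predictions` is a list with one entry per model; each entry
    is that model's list of predicted labels, all of the same length.
    """
    ensemble = []
    for votes in zip(*per_model_predictions):
        # most_common(1) returns the label with the highest vote count.
        ensemble.append(Counter(votes).most_common(1)[0][0])
    return ensemble


# Hypothetical outputs of five base classifiers on three news items.
predictions = [
    ["credible", "fake", "fake"],
    ["credible", "fake", "credible"],
    ["fake", "fake", "credible"],
    ["credible", "credible", "credible"],
    ["credible", "fake", "fake"],
]
combined = majority_vote(predictions)
```

With an odd number of models and two labels, ties cannot occur, which is one practical reason to use five voters.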

URL

https://arxiv.org/abs/2401.10940

PDF

https://arxiv.org/pdf/2401.10940.pdf
Natural Language Processing and Multimodal Stock Price Prediction

2024-01-03 01:21:30

Kevin Taylor, Jerry Ng

arXiv_CL

arXiv_CL RNN Memory_Networks Prediction Bert
Abstract

In the realm of financial decision-making, predicting stock prices is pivotal. Artificial intelligence techniques such as long short-term memory networks (LSTMs), support-vector machines (SVMs), and natural language processing (NLP) models are commonly employed to predict said prices. This paper utilizes stock percentage change as training data, in contrast to the traditional use of raw currency values, with a focus on analyzing publicly released news articles. The choice of percentage change aims to provide models with context regarding the significance of price fluctuations and overall price change impact on a given stock. The study employs specialized BERT natural language processing models to predict stock price trends, with a particular emphasis on various data modalities. The results showcase the capabilities of such strategies with a small natural language processing model to accurately predict overall stock trends, and highlight the effectiveness of certain data features and sector-specific data.
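The choice of percentage change over raw currency values can be shown with a small worked example: the same absolute move means very different things for a low-priced and a high-priced stock, and the percentage form makes that explicit. The function name and the prices are hypothetical.

```python
def percentage_changes(prices):
    """Convert a series of positive prices into period-over-period % changes."""
    return [100.0 * (curr - prev) / prev for prev, curr in zip(prices, prices[1:])]


# Hypothetical closing prices for one stock.
changes = percentage_changes([100.0, 102.0, 99.96])

# The same 2-unit move on two differently priced stocks.
small_cap_move = percentage_changes([100.0, 102.0])[0]
large_cap_move = percentage_changes([1000.0, 1002.0])[0]
```

A rise of 2 currency units is a 2% move on the first stock but only a 0.2% move on the second, which is the context the raw values would hide from a model.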

URL

https://arxiv.org/abs/2401.01487

PDF

https://arxiv.org/pdf/2401.01487.pdf
Bird Movement Prediction Using Long Short-Term Memory Networks to Prevent Bird Strikes with Low Altitude Aircraft

2023-12-17 20:12:39

Elaheh Sabziyan Varnousfaderani, Syed A. M. Shihab

arXiv_CV

arXiv_CV RNN Memory_Networks Prediction
Abstract

The number of collisions between aircraft and birds in the airspace has been increasing at an alarming rate over the past decade due to increasing bird population, air traffic, and usage of quieter aircraft. Bird strikes with aircraft are anticipated to increase dramatically when emerging Advanced Air Mobility aircraft start operating in the low altitude airspace where the probability of bird strikes is the highest. Not only can such bird strikes result in human and bird fatalities, but they also cost the aviation industry millions of dollars in damages to aircraft annually. To better understand the causes and effects of bird strikes, research to date has mainly focused on analyzing factors which increase the probability of bird strikes, identifying high risk birds in different locations, predicting the future number of bird strike incidents, and estimating the cost of bird strike damages. However, research on bird movement prediction for use in flight planning algorithms to minimize the probability of bird strikes is very limited. To address this gap in research, we implement four different types of Long Short-Term Memory (LSTM) models to predict bird movement latitudes and longitudes. A publicly available data set on the movement of pigeons is utilized to train the models and evaluate their performance. Using the bird flight track predictions, aircraft departures from Cleveland Hopkins airport are simulated to be delayed by varying amounts to avoid potential bird strikes with aircraft during takeoff. Results demonstrate that the LSTM models can predict bird movement with high accuracy, achieving a Mean Absolute Error of less than 100 meters, outperforming linear and nonlinear regression models. Our findings indicate that incorporating bird movement prediction into flight planning can be highly beneficial.
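Since the models predict latitudes and longitudes while the error is reported in meters, some conversion from coordinate error to ground distance is implied. The abstract does not state how this was done; the sketch below assumes the haversine great-circle distance on a spherical Earth, averaged over predictions. The function names, the Earth radius constant, and the sample coordinates are assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def mean_absolute_error_m(predicted, actual):
    """Average ground distance between predicted and actual (lat, lon) pairs."""
    distances = [haversine_m(*p, *a) for p, a in zip(predicted, actual)]
    return sum(distances) / len(distances)


# Hypothetical track points: the prediction is off by 0.001 degrees of latitude.
actual_track = [(41.4100, -81.8500), (41.4110, -81.8500)]
predicted_track = [(41.4110, -81.8500), (41.4120, -81.8500)]
mae = mean_absolute_error_m(predicted_track, actual_track)
```

An offset of 0.001 degrees of latitude is roughly 111 meters, which gives a sense of how tight a sub-100-meter error is in coordinate terms.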

URL

https://arxiv.org/abs/2312.12461

PDF

https://arxiv.org/pdf/2312.12461.pdf
ALGNet: Attention Light Graph Memory Network for Medical Recommendation System

2023-12-09 00:46:37

Minh-Van Nguyen, Duy-Thinh Nguyen, Quoc-Huy Trinh, Bac-Hoai Le

arXiv_CV

arXiv_CV CNN Memory_Networks Attention Embedding Recommendation Relation Knowledge Pose Action Medical
Abstract

Medication recommendation is a vital task for improving patient care and reducing adverse events. However, existing methods often fail to capture the complex and dynamic relationships among patient medical records, drug efficacy and safety, and drug-drug interactions (DDI). In this paper, we propose ALGNet, a novel model that leverages light graph convolutional networks (LGCN) and augmentation memory networks (AMN) to enhance medication recommendation. LGCN can efficiently encode the patient records and the DDI graph into low-dimensional embeddings, while AMN can augment the patient representation with external knowledge from a memory module. We evaluate our model on the MIMIC-III dataset and show that it outperforms several baselines in terms of recommendation accuracy and DDI avoidance. We also conduct an ablation study to analyze the effects of different components of our model. Our results demonstrate that ALGNet can achieve superior performance with less computation and more interpretability. The implementation of this paper can be found at: this https URL.

URL

https://arxiv.org/abs/2312.08377

PDF

https://arxiv.org/pdf/2312.08377.pdf
User-Aware Prefix-Tuning is a Good Learner for Personalized Image Captioning

2023-12-08 02:08:00

Xuan Wang, Guanhong Wang, Wenhao Chai, Jiayu Zhou, Gaoang Wang

arXiv_CV

arXiv_CV Image_Caption Caption Memory_Networks Knowledge Language_Model Transformer Pose Chat
Abstract

Image captioning bridges the gap between vision and language by automatically generating natural language descriptions for images. Traditional image captioning methods often overlook the preferences and characteristics of users. Personalized image captioning solves this problem by incorporating user prior knowledge into the model, such as writing styles and preferred vocabularies. Most existing methods emphasize the user context fusion process by memory networks or transformers. However, these methods ignore the distinct domains of each dataset. Therefore, they need to update the entire caption model parameters when meeting new samples, which is time-consuming and calculation-intensive. To address this challenge, we propose a novel personalized image captioning framework that leverages user context to consider personality factors. Additionally, our framework utilizes the prefix-tuning paradigm to extract knowledge from a frozen large language model, reducing the gap between different language domains. Specifically, we employ CLIP to extract the visual features of an image and align the semantic space using a query-guided mapping network. By incorporating the transformer layer, we merge the visual features with the user's contextual prior knowledge to generate informative prefixes. Moreover, we employ GPT-2 as the frozen large language model. With a small number of parameters to be trained, our model performs efficiently and effectively. Our model outperforms existing baseline models on Instagram and YFCC100M datasets across five evaluation metrics, demonstrating its superiority, including twofold improvements in metrics such as BLEU-4 and CIDEr.

URL

https://arxiv.org/abs/2312.04793

PDF

https://arxiv.org/pdf/2312.04793.pdf
Predictive Modeling of Coronal Hole Areas Using Long Short-Term Memory Networks

2023-11-25 03:03:21

Juyoung Yun

arXiv_CV

arXiv_CV RNN Memory_Networks Deep_Learning
Abstract

In the era of space exploration, the implications of space weather have become increasingly evident. Central to this is the phenomenon of coronal holes, which can significantly influence the functioning of satellites and aircraft. These coronal holes, present on the sun, are distinguished by their open magnetic field lines and comparatively cooler temperatures, leading to the emission of solar winds at heightened rates. To anticipate the effects of these coronal holes on Earth, our study harnesses computer vision to pinpoint the coronal hole regions and estimate their dimensions using imagery from the Solar Dynamics Observatory (SDO). Further, we deploy deep learning methodologies, specifically the Long Short-Term Memory (LSTM) approach, to analyze the trends in the data related to the area of the coronal holes and predict their dimensions across various solar regions over a span of seven days. By evaluating the time series data concerning the area of the coronal holes, our research seeks to uncover patterns in the behavior of coronal holes and comprehend their potential influence on space weather occurrences. This investigation marks a pivotal stride towards bolstering our capacity to anticipate and brace for space weather events that could have ramifications for Earth and its technological apparatuses.

URL

https://arxiv.org/abs/2301.06732

PDF

https://arxiv.org/pdf/2301.06732.pdf
Automated 3D Tumor Segmentation using Temporal Cubic PatchGAN

2023-11-23 18:37:26

Kameswara Bharadwaj Mantha, Ramanakumar Sankar, Lucy Fortson

arXiv_CV

arXiv_CV Segmentation RNN GAN CNN Memory_Networks Deep_Learning Pose Medical 3D
Abstract

Development of robust general purpose 3D segmentation frameworks using the latest deep learning techniques is one of the active topics in various bio-medical domains. In this work, we introduce Temporal Cubic PatchGAN (TCuP-GAN), a volume-to-volume translational model that marries the concepts of a generative feature learning framework with Convolutional Long Short-Term Memory Networks (LSTMs), for the task of 3D segmentation. We demonstrate the capabilities of our TCuP-GAN on the data from four segmentation challenges (Adult Glioma, Meningioma, Pediatric Tumors, and Sub-Saharan Africa subset) featured within the 2023 Brain Tumor Segmentation (BraTS) Challenge and quantify its performance using LesionWise Dice similarity and $95\%$ Hausdorff Distance metrics. We demonstrate the successful learning of our framework to predict robust multi-class segmentation masks across all the challenges. This benchmarking work serves as a stepping stone for future efforts towards applying TCuP-GAN on other multi-class tasks such as multi-organelle segmentation in electron microscopy imaging.
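For readers unfamiliar with the overlap metric named above, here is a minimal sketch of the plain Dice similarity coefficient between two binary 3D masks, each represented as a set of voxel coordinates. This is the basic Dice score only; the lesion-wise variant used in the BraTS evaluation scores lesions individually and is not reproduced here. The mask contents are hypothetical.

```python
def dice(mask_a, mask_b):
    """Dice similarity 2|A ∩ B| / (|A| + |B|) for two sets of voxel coordinates."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0  # convention: two empty masks agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical predicted and ground-truth tumor voxels, as (z, y, x) tuples.
predicted_mask = {(0, 0, 0), (0, 0, 1), (0, 1, 1)}
truth_mask = {(0, 0, 1), (0, 1, 1), (1, 1, 1)}
overlap = dice(predicted_mask, truth_mask)
```

For a multi-class segmentation, the same function would be applied once per class label and the scores reported separately or averaged.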

Abstract (translated)

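Developing robust, general-purpose 3D segmentation frameworks with the latest deep learning techniques is an active topic across biomedical domains. In this work we introduce Temporal Cubic PatchGAN (TCuP-GAN), a volume-to-volume translation model that combines a generative feature learning framework with Convolutional Long Short-Term Memory Networks (LSTMs) for 3D segmentation. We demonstrate the capabilities of TCuP-GAN on data from the 2023 Brain Tumor Segmentation (BraTS) Challenge and quantify its performance using LesionWise Dice similarity and $95\%$ Hausdorff Distance metrics. We show that our framework successfully learns to predict robust multi-class segmentation masks across all the challenges. This benchmarking work lays the foundation for applying TCuP-GAN to further multi-class tasks such as electron microscopy imaging.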

URL

https://arxiv.org/abs/2311.14148

PDF

https://arxiv.org/pdf/2311.14148.pdf
Temporal Performance Prediction for Deep Convolutional Long Short-Term Memory Networks

2023-11-13 17:11:35

Laura Fieback (1), Bidya Dash (1), Jakob Spiegelberg (1), Hanno Gottschalk (2) ((1) Volkswagen AG, (2) TU Berlin)

arXiv_CV

arXiv_CV Segmentation Semantic_Segmentation CNN Memory_Networks Prediction Pose Autonomous
Abstract

Quantifying predictive uncertainty of deep semantic segmentation networks is essential in safety-critical tasks. In applications like autonomous driving, where video data is available, convolutional long short-term memory networks are capable of not only providing semantic segmentations but also predicting the segmentations of the next timesteps. These models use cell states to broadcast information from previous data by taking a time series of inputs to predict one or even further steps into the future. We present a temporal postprocessing method which estimates the prediction performance of convolutional long short-term memory networks by either predicting the intersection over union of predicted and ground truth segments or classifying between intersection over union being equal to zero or greater than zero. To this end, we create temporal cell state-based input metrics per segment and investigate different models for the estimation of the predictive quality based on these metrics. We further study the influence of the number of considered cell states for the proposed metrics.
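The two targets named in the abstract above (the intersection over union of a predicted and a ground truth segment, and the binary question of whether that value is zero or greater than zero) can be illustrated with a minimal sketch. It assumes segments are given as equal-shaped binary masks stored as nested lists; the function names `segment_iou` and `iou_is_positive` and the toy masks are hypothetical and are not taken from the paper.

```python
def segment_iou(pred, truth):
    """Intersection over union of two binary masks (nested lists of 0/1).

    Assumption: both masks have the same shape. An empty union is
    reported as 0.0 here; other conventions exist.
    """
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    return inter / union if union else 0.0


def iou_is_positive(pred, truth):
    """Binary target: True when the IoU is greater than zero."""
    return segment_iou(pred, truth) > 0.0


# Toy masks (hypothetical data): 2 overlapping pixels, 4 pixels in the union.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]

iou = segment_iou(pred, truth)
positive = iou_is_positive(pred, truth)
```

A performance estimator of the kind the abstract describes would be trained to predict `iou` (regression) or `positive` (classification) from per-segment input metrics, without access to `truth` at inference time.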

Abstract (translated)

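Quantifying the predictive uncertainty of deep semantic segmentation networks is essential in safety-critical tasks. In applications such as autonomous driving, where video data is available, convolutional long short-term memory networks can provide semantic segmentations and also predict the segmentations of the next timesteps. These models use cell states to propagate information from earlier data, taking a time series of inputs to predict one or more steps into the future. We present a temporal postprocessing method that estimates the prediction performance of convolutional long short-term memory networks by predicting the intersection over union of predicted and ground truth segments. To this end, we create temporal cell-state-based input metrics for each segment and study different models for estimating predictive quality from these metrics. We further study how the number of cell states considered in the proposed metrics affects the result.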

URL

https://arxiv.org/abs/2311.07477

PDF

https://arxiv.org/pdf/2311.07477.pdf
Generation Of Colors using Bidirectional Long Short Term Memory Networks

2023-11-11 11:35:37

A. Sinha

arXiv_CV

arXiv_CV RNN Memory_Networks
Abstract

Human vision can distinguish between a vast spectrum of colours, estimated to be between 2 to 7 million discernible shades. However, this impressive range does not inherently imply that all these colours have been precisely named and described within our lexicon. We often associate colours with familiar objects and concepts in our daily lives. This research endeavors to bridge the gap between our visual perception of countless shades and our ability to articulate and name them accurately. A novel model has been developed to achieve this goal, leveraging Bidirectional Long Short-Term Memory (BiLSTM) networks with Active learning. This model operates on a proprietary dataset meticulously curated for this study. The primary objective of this research is to create a versatile tool for categorizing and naming previously unnamed colours or identifying intermediate shades that elude traditional colour terminology. The findings underscore the potential of this innovative approach in revolutionizing our understanding of colour perception and language. Through rigorous experimentation and analysis, this study illuminates a promising avenue for Natural Language Processing (NLP) applications in diverse industries. By facilitating the exploration of the vast colour spectrum the potential applications of NLP are extended beyond conventional boundaries.

Abstract (translated)

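Human vision can distinguish a vast range of colours, estimated at between 2 and 7 million discernible shades. This impressive range does not imply, however, that all of these colours have been precisely named and described in our vocabulary. We usually associate colours with familiar objects and concepts from daily life. This research aims to bridge the gap between our visual perception of countless shades and our ability to articulate and name them accurately. To this end, a new model was developed using Bidirectional Long Short-Term Memory (BiLSTM) networks with active learning. The model operates on a proprietary dataset curated for this study. The main objective is to provide a practical tool for categorizing and naming previously unnamed colours or identifying intermediate shades. The findings underscore the potential of this approach to change our understanding of colour perception and language. Through rigorous experimentation and analysis, the study reveals a promising avenue for Natural Language Processing (NLP) applications in various industries. By facilitating exploration of the colour spectrum, it extends the applications of NLP beyond conventional boundaries.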

URL

https://arxiv.org/abs/2311.06542

PDF

https://arxiv.org/pdf/2311.06542.pdf
Building a Safer Maritime Environment Through Multi-Path Long-Term Vessel Trajectory Forecasting

2023-10-29 09:15:22

Gabriel Spadon, Jay Kumar, Matthew Smith, Sarah Vela, Romina Gehrmann, Derek Eden, Joshua van Berkel, Amilcar Soares, Ronan Fablet, Ronald Pelot, Stan Matwin

arXiv_AI

arXiv_AI RNN CNN Memory_Networks Attention Surveillance
Abstract

Maritime transport is paramount to global economic growth and environmental sustainability. In this regard, the Automatic Identification System (AIS) data plays a significant role by offering real-time streaming data on vessel movement, which allows for enhanced traffic surveillance, assisting in vessel safety by avoiding vessel-to-vessel collisions and proactively preventing vessel-to-whale ones. This paper tackles an intrinsic problem to trajectory forecasting: the effective multi-path long-term vessel trajectory forecasting on engineered sequences of AIS data. We utilize an encoder-decoder model with Bidirectional Long Short-Term Memory Networks (Bi-LSTM) to predict the next 12 hours of vessel trajectories using 1 to 3 hours of AIS data. We feed the model with probabilistic features engineered from the AIS data that refer to the potential route and destination of each trajectory so that the model, leveraging convolutional layers for spatial feature learning and a position-aware attention mechanism that increases the importance of recent timesteps of a sequence during temporal feature learning, forecasts the vessel trajectory taking the potential route and destination into account. The F1 Score of these features is approximately 85% and 75%, indicating their efficiency in supplementing the neural network. We trialed our model in the Gulf of St. Lawrence, one of the North Atlantic Right Whales (NARW) habitats, achieving an R2 score exceeding 98% with varying techniques and features. Despite the high R2 score being attributed to well-defined shipping lanes, our model demonstrates superior complex decision-making during path selection. In addition, our model shows enhanced accuracy, with average and median forecasting errors of 11km and 6km, respectively. Our study confirms the potential of geographical data engineering and trajectory forecasting models for preserving marine life species.

Abstract (translated)

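Maritime transport is vital to global economic growth and environmental sustainability. Automatic Identification System (AIS) data plays an important role here by providing real-time streaming data on vessel movement, which improves traffic surveillance and supports vessel safety by avoiding vessel-to-vessel collisions and proactively preventing vessel-to-whale collisions. This paper addresses an inherent problem of trajectory forecasting: multi-path long-term forecasting of vessel trajectories from engineered sequences of AIS data, using Bidirectional Long Short-Term Memory Networks (Bi-LSTM) to predict the next 12 hours of vessel trajectories. We feed the model probabilistic features derived from the AIS data that describe the potential route and destination of each trajectory, so that the model, which uses convolutional layers for spatial feature learning and a position-aware attention mechanism that increases the weight of recent timesteps during temporal feature learning, takes the potential route and destination into account when forecasting. The F1 scores of these features are approximately 85% and 75%, indicating their efficiency in supplementing the neural network. We trialed the model in the Gulf of St. Lawrence, a habitat of the North Atlantic Right Whale (NARW), and achieved an R2 score above 98% with varying techniques and features. Although the high R2 score is attributed to well-defined shipping lanes, our model shows superior complex decision-making during path selection. The model also shows improved accuracy, with an average forecasting error of 11 km and a median forecasting error of 6 km. Our study confirms the potential of geographical data engineering and trajectory forecasting models for protecting marine species.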

URL

https://arxiv.org/abs/2310.18948

PDF

https://arxiv.org/pdf/2310.18948.pdf
In search of dispersed memories: Generative diffusion models are associative memory networks

2023-09-29 14:48:24

Luca Ambrogioni

arXiv_CV

arXiv_CV Memory_Networks Diffusion
Abstract

Hopfield networks are widely used in neuroscience as simplified theoretical models of biological associative memory. The original Hopfield networks store memories by encoding patterns of binary associations, which results in a synaptic learning mechanism known as the Hebbian learning rule. Modern Hopfield networks can achieve exponential capacity scaling by using highly non-linear energy functions. However, the energy function of these newer models cannot be straightforwardly compressed into binary synaptic couplings and it does not directly provide new synaptic learning rules. In this work we show that generative diffusion models can be interpreted as energy-based models and that, when trained on discrete patterns, their energy function is equivalent to that of modern Hopfield networks. This equivalence allows us to interpret the supervised training of diffusion models as a synaptic learning process that encodes the associative dynamics of a modern Hopfield network in the weight structure of a deep neural network. Accordingly, in our experiments we show that the storage capacity of a continuous modern Hopfield network is identical to the capacity of a diffusion model. Our results establish a strong link between generative modeling and the theoretical neuroscience of memory, which provides a powerful computational foundation for the reconstructive theory of memory, where creative generation and memory recall can be seen as parts of a unified continuum.
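As a rough illustration of the associative recall that the abstract attributes to modern Hopfield networks, the sketch below uses a commonly cited continuous update rule, in which the new state is a softmax-weighted average of the stored patterns. This particular rule, the inverse temperature `beta`, the function name `hopfield_retrieve`, and the toy patterns are all assumptions made for illustration; the abstract itself does not specify an update rule, and this is not the paper's diffusion-model construction.

```python
import math


def hopfield_retrieve(patterns, query, beta=4.0, steps=1):
    """Recall a stored pattern from a noisy query.

    Assumed update (continuous modern Hopfield form):
        state <- sum_i softmax(beta * <pattern_i, state>)_i * pattern_i
    """
    state = list(query)
    for _ in range(steps):
        # Similarity of the current state to every stored pattern.
        scores = [beta * sum(p_d * s_d for p_d, s_d in zip(p, state))
                  for p in patterns]
        # Numerically stable softmax over the similarities.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # New state: weighted average of the stored patterns.
        state = [sum(w * p[d] for w, p in zip(weights, patterns))
                 for d in range(len(state))]
    return state


# Three stored patterns and a corrupted copy of the first (hypothetical data).
patterns = [[1.0, 1.0, -1.0, -1.0],
            [-1.0, 1.0, -1.0, 1.0],
            [1.0, -1.0, 1.0, -1.0]]
query = [0.9, 0.8, -1.0, -0.2]

recalled = hopfield_retrieve(patterns, query)
nearest = [1.0 if v > 0 else -1.0 for v in recalled]
```

With a larger `beta` the softmax concentrates on the single closest pattern, which is the sharp, highly non-linear regime associated with large storage capacity; a small `beta` blends several patterns together.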

Abstract (translated)

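Hopfield networks are widely used in neuroscience as simplified theoretical models of biological associative memory. The original Hopfield networks store memories by encoding patterns of binary associations, leading to a synaptic learning mechanism known as the Hebbian learning rule. Modern Hopfield networks can achieve exponential capacity scaling by using highly non-linear energy functions. However, the energy functions of these newer models cannot be directly compressed into binary synaptic couplings, and they do not directly provide new synaptic learning rules. In this paper we show that generative diffusion models can be regarded as energy-based models and that, when trained on discrete patterns, their energy function is equivalent to that of modern Hopfield networks. This equivalence lets us interpret the supervised training of diffusion models as a synaptic learning process that encodes the associative dynamics of a modern Hopfield network in the weight structure of a deep neural network. Accordingly, our experiments show that the storage capacity of a continuous modern Hopfield network is the same as the capacity of a diffusion model. Our results establish a strong link between generative modeling and the theoretical neuroscience of memory, providing a powerful computational foundation for the reconstructive theory of memory, in which creative generation and memory recall can be seen as parts of a unified continuum.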

URL

https://arxiv.org/abs/2309.17290

PDF

https://arxiv.org/pdf/2309.17290.pdf
Urdu Poetry Generated by Using Deep Learning Techniques

2023-09-25 15:44:24

Muhammad Shoaib Farooq, Ali Abbas

arXiv_CL

arXiv_CL RNN Memory_Networks Deep_Learning Pose
Abstract

This study presents Urdu poetry generated using different deep-learning techniques and algorithms. The data was collected from the Rekhta website and comprises 1341 text files, each containing several couplets. The poetry was not drawn from any specific genre or poet; instead, it was a mixed collection of Urdu poems and Ghazals. Different deep learning techniques, such as Long Short-Term Memory networks (LSTM) and Gated Recurrent Units (GRU), have been used. Natural Language Processing (NLP) may be used in machine learning to understand, analyze, and generate language that humans use and understand. Much work has been done on generating poetry in different languages using different techniques, and the collection and use of data has also differed between researchers. The primary purpose of this project is to provide a model that generates Urdu poems using the complete dataset rather than a sample of it. The model may also generate poems in pure Urdu, not Roman Urdu as in the base paper. The results show good accuracy in the poems generated by the model.

Abstract (translated)

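This study presents Urdu poetry generated using different deep learning techniques and algorithms. The data was collected from the Rekhta website and includes 1341 text files, each containing several couplets. The poetry data does not come from any specific genre or poet; it is a mixed collection of Urdu poems and Ghazals. Different deep learning techniques were used, such as models applying Long Short-term Memory Networks (LSTM) and Gated Recurrent Unit (GRU). Natural Language Processing (NLP) can be used in machine learning to understand, analyze, and generate language that humans can use and understand. Much work has been done on generating poetry in different languages with different techniques, and researchers have also differed in how they collect and use data. The main purpose of this project is to provide a model that generates Urdu poems using the data in full rather than by sampling it. It may also generate poems in pure Urdu rather than the Roman Urdu used in the base paper. The results show good accuracy in the poems generated by the model.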

URL

https://arxiv.org/abs/2309.14233

PDF

https://arxiv.org/pdf/2309.14233.pdf
