Abstract
Many tasks can be described as compositions over subroutines. Though modern neural networks have achieved impressive performance on both vision and language tasks, we know little about the functions that they implement. One possibility is that neural networks implicitly break down complex tasks into subroutines, implement modular solutions to these subroutines, and compose them into an overall solution to a task -- a property we term structural compositionality. Or they may simply learn to match new inputs to memorized representations, eliding task decomposition entirely. Here, we leverage model pruning techniques to investigate this question in both vision and language, across a variety of architectures, tasks, and pretraining regimens. Our results demonstrate that models often implement solutions to subroutines via modular subnetworks, which can be ablated while maintaining the functionality of other subroutines. This suggests that neural networks may be able to learn to exhibit compositionality, obviating the need for specialized symbolic mechanisms.
URL
https://arxiv.org/abs/2301.10884