Is Attention All What You Need? -- An Empirical Investigation on Convolution-Based Active Memory and Self-Attention

This paper is a preprint and has not been certified by peer review.

Authors

Thomas Dowdell, Hongyu Zhang

Abstract

The key to a Transformer model is the self-attention mechanism, which allows the model to analyze an entire sequence in a computationally efficient manner. Recent work has suggested the possibility that general attention mechanisms used by RNNs could be replaced by active-memory mechanisms. In this work, we evaluate whether various active-memory mechanisms could replace self-attention in a Transformer. Our experiments suggest that active-memory alone achieves comparable results to the self-attention mechanism for language modelling, but optimal results are mostly achieved by using both active-memory and self-attention mechanisms together. We also note that, for some specific algorithmic tasks, active-memory mechanisms alone outperform both self-attention and a combination of the two.
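To make the comparison concrete, the sketch below (a rough PyTorch illustration, not the authors' code; the `ConvActiveMemory` layer, kernel size, and block wiring are assumptions) shows how a convolution-based active-memory layer can be swapped for, or summed with, self-attention inside a Transformer-style block while keeping the input/output shapes unchanged. Causal masking and causal padding, which language modelling would require, are omitted for brevity.

```python
import torch
import torch.nn as nn


class ConvActiveMemory(nn.Module):
    """Depthwise-separable 1D convolution over the sequence dimension
    (an assumed stand-in for a convolution-based active-memory layer),
    returning the same shape a self-attention layer would."""

    def __init__(self, d_model: int, kernel_size: int = 3):
        super().__init__()
        # Depthwise conv mixes information across nearby positions;
        # pointwise conv mixes channels.
        self.depthwise = nn.Conv1d(
            d_model, d_model, kernel_size,
            padding=kernel_size // 2, groups=d_model,
        )
        self.pointwise = nn.Conv1d(d_model, d_model, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) -> Conv1d expects (batch, d_model, seq_len)
        y = x.transpose(1, 2)
        y = self.pointwise(torch.relu(self.depthwise(y)))
        return y.transpose(1, 2)


class Block(nn.Module):
    """Transformer-style block whose sequence mixer is self-attention,
    the convolutional active-memory layer above, or the sum of both."""

    def __init__(self, d_model: int, use_attention: bool, use_active_memory: bool):
        super().__init__()
        self.attn = (
            nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
            if use_attention else None
        )
        self.memory = ConvActiveMemory(d_model) if use_active_memory else None
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.ReLU(), nn.Linear(4 * d_model, d_model)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.norm1(x)
        mixed = torch.zeros_like(x)
        if self.attn is not None:
            mixed = mixed + self.attn(h, h, h, need_weights=False)[0]
        if self.memory is not None:
            mixed = mixed + self.memory(h)
        x = x + mixed
        return x + self.ffn(self.norm2(x))


x = torch.randn(2, 16, 64)             # (batch, seq_len, d_model)
print(Block(64, True, True)(x).shape)  # torch.Size([2, 16, 64])
```

Because both mixers preserve the `(batch, seq_len, d_model)` shape, the three configurations the abstract compares (attention only, active memory only, and the two combined) differ only in which branches are instantiated.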
