Conv-basis: A new paradigm for efficient attention inference and gradient computation in transformers (arXiv 2024)

Publication
arXiv