Tensor attention training: Provably efficient learning of higher-order transformers (arXiv 2024)

Publication
arXiv