Jiuxiang Gu
Zhenmei Shi
Latest
Conv-basis: A new paradigm for efficient attention inference and gradient computation in transformers (arXiv 2024)
Exploring the Frontiers of Softmax: Provable Optimization, Applications in Diffusion Model, and Beyond (arXiv 2024)
Fourier Circuits in Neural Networks: Unlocking the Potential of Large Language Models in Mathematical Reasoning and Modular Arithmetic (arXiv 2024)
Tensor Attention Training: Provably Efficient Learning of Higher-order Transformers (arXiv 2024)
Unraveling the Smoothness Properties of Diffusion Models: A Gaussian Mixture Perspective (arXiv 2024)