Jiuxiang Gu
Zhenmei Shi
Latest
Conv-basis: A new paradigm for efficient attention inference and gradient computation in transformers (arXiv 2024)
Exploring the frontiers of softmax: Provable optimization, applications in diffusion model, and beyond (arXiv 2024)
Fast John Ellipsoid Computation with Differential Privacy Optimization (arXiv 2024)
Fourier circuits in neural networks: Unlocking the potential of large language models in mathematical reasoning and modular arithmetic (arXiv 2024)
Tensor attention training: Provably efficient learning of higher-order transformers (arXiv 2024)
Toward Infinite-Long Prefix in Transformer (arXiv 2024)
Unraveling the Smoothness Properties of Diffusion Models: A Gaussian Mixture Perspective (arXiv 2024)