MTML Lab Seminar: "Tensor Attention and Manifold-Constrained Hyper-Connections"
This Friday (05.02.2026), the speakers will be Alexander Molozhavenko and Nikolay Yudin (HSE University). The seminar starts at 14:40 and will be held in person in room G407.
At the seminar we will discuss two papers:
1. "Tensor Product Attention Is All You Need" (https://arxiv.org/pdf/2501.06425), NeurIPS 2025 (spotlight).
TL;DR: Tensor Product Attention (TPA) is a novel mechanism that factorizes queries, keys, and values into compact, low-rank tensor components to drastically reduce the memory overhead of Key-Value caches during inference.
2. "mHC: Manifold-Constrained Hyper-Connections" (https://arxiv.org/pdf/2512.24880) by the DeepSeek team.
TL;DR: Manifold-Constrained Hyper-Connections (mHC) is a general framework that projects the residual connection space of Hyper-Connections onto a specific manifold to restore the identity mapping property. Enforcing this constraint mitigates training instability and lets training scale better than with unconstrained Hyper-Connections.
