Linear Attention Mechanism
Bridging the Divide: Reconsidering Softmax and Linear Attention
Published: 12/9/2024
Linear Attention Mechanism · Limitations of Softmax Attention · Vision Transformer Architecture · Long-Range Information Modeling · Attention Weight Confusion
The paper compares Softmax and linear attention, revealing that the performance gap stems from linear attention's non-injective nature and its lack of local modeling. The findings suggest that adding these traits lets linear attention outperform Softmax attention while reducing computational cost.
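As a rough illustration of the contrast the paper draws, the sketch below compares standard softmax attention with a kernel-based linear attention. The elu(x) + 1 feature map, shapes, and scaling are common illustrative choices, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def softmax_attention(q, k, v):
    # q, k, v: (batch, n, d); cost is O(n^2) in sequence length n.
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    return F.softmax(scores, dim=-1) @ v

def linear_attention(q, k, v, eps=1e-6):
    # Replace exp(q·k) with phi(q)·phi(k), phi(x) = elu(x) + 1 (an
    # illustrative positive feature map). Reassociating the matmuls
    # drops the cost to O(n·d^2). The paper attributes part of the gap
    # to this kernelized form being non-injective: distinct queries can
    # end up with identical attention weights after normalization.
    q, k = F.elu(q) + 1, F.elu(k) + 1
    kv = k.transpose(-2, -1) @ v                                # (batch, d, d)
    z = q @ k.sum(dim=1, keepdim=True).transpose(-2, -1) + eps  # (batch, n, 1)
    return (q @ kv) / z

# q = k = v = torch.randn(2, 1024, 64)  # both functions return (2, 1024, 64)
```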
Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search
Published: 8/22/2025
Post Neural Architecture Search · Hybrid-Architecture Language Models · Efficient Generation Inference · Linear Attention Mechanism · Hardware-Aware Hyperparameter Search
Jet-Nemotron uses PostNAS to freeze pretrained MLP weights and optimize the attention blocks, creating efficient hybrid-architecture language models that match or surpass the accuracy of leading models while boosting generation throughput by up to 53.6×.
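A minimal sketch of the freezing step described above, assuming a PyTorch model whose feed-forward parameters are identifiable by an "mlp" substring in their names; the naming convention and helper are hypothetical, not Jet-Nemotron's code.

```python
import torch.nn as nn

def freeze_mlp_train_attention(model: nn.Module):
    # Freeze every parameter whose name marks it as part of a feed-forward
    # (MLP) block; leave the attention parameters trainable. The "mlp"
    # naming convention is an assumption about the model.
    for name, param in model.named_parameters():
        param.requires_grad = "mlp" not in name
    return [p for p in model.parameters() if p.requires_grad]

# trainable = freeze_mlp_train_attention(model)
# optimizer = torch.optim.AdamW(trainable, lr=1e-4)  # train attention only
```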
LinearSR: Unlocking Linear Attention for Stable and Efficient Image Super-Resolution
Published: 10/10/2025
Image Super-Resolution · Linear Attention Mechanism · Perception-Distortion Trade-off Optimization · Early-Stopping Guided Fine-tuning · SNR-based Mixture of Experts
LinearSR enables stable, efficient image super-resolution by addressing training instability, the perception-distortion trade-off, and guidance efficiency through a novel fine-tuning scheme, SNR-based experts, and a lightweight guidance strategy.
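As one hedged reading of the SNR-based mixture-of-experts idea, the sketch below routes an input to one of several experts by the signal-to-noise ratio of the current diffusion step. The thresholds, expert count, and SNR formula are assumptions for illustration, not LinearSR's actual design.

```python
import torch.nn as nn

class SNRMoE(nn.Module):
    # Route each input to one expert chosen by the signal-to-noise ratio
    # of the current step; SNR = alpha_t^2 / sigma_t^2 is the standard
    # diffusion definition, the band thresholds are illustrative.
    def __init__(self, make_expert, snr_thresholds=(0.1, 1.0, 10.0)):
        super().__init__()
        self.thresholds = snr_thresholds
        self.experts = nn.ModuleList(
            [make_expert() for _ in range(len(snr_thresholds) + 1)]
        )

    def forward(self, x, alpha_t: float, sigma_t: float):
        snr = alpha_t ** 2 / sigma_t ** 2            # noise level of this step
        idx = sum(snr > t for t in self.thresholds)  # index of the SNR band
        return self.experts[idx](x)

# moe = SNRMoE(lambda: nn.Linear(64, 64))
# y = moe(x, alpha_t=0.9, sigma_t=0.44)  # SNR ≈ 4.2 selects the third expert
```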
LinVideo: A Post-Training Framework towards O(n) Attention in Efficient Video Generation
Published: 10/9/2025
Video Diffusion Models · Linear Attention Mechanism · Post-Training Sparse Attention Optimization · Efficient Video Generation · Distribution Matching Objective
LinVideo is a data-free post-training framework that selectively replaces self-attention with linear attention in video diffusion models, using anytime distribution matching to maintain performance while achieving up to a 15.92× latency reduction and a 1.25–2× speedup.
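A minimal sketch of the selective-replacement idea, assuming a model that exposes a `blocks` list with an `attn` attribute per block; the attribute names and the `make_linear_attn` factory are hypothetical, not LinVideo's code.

```python
import torch.nn as nn

def replace_attention(model: nn.Module, layer_ids, make_linear_attn):
    # Swap self-attention for a linear-attention drop-in only in the
    # selected transformer blocks; the remaining blocks keep quadratic
    # self-attention. `model.blocks` / `block.attn` are assumed names.
    for i, block in enumerate(model.blocks):
        if i in layer_ids:
            block.attn = make_linear_attn(block.attn)  # may reuse q/k/v weights
    return model

# model = replace_attention(model, layer_ids={2, 5, 8}, make_linear_attn=to_linear)
```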