Reading Market
Intent-Guided Reasoning for Sequential Recommendation
The IGRSR framework addresses reasoning instability and shallow reasoning in sequential recommendation by anchoring to high-level user intents, using a dual-attention architecture. It improves robustness, showing a 7.13% average gain even under noisy data.
On the Memorization Behavior of LLMs in Generative Recommendation: Observations, Implications, and Training Strategies
This study examines the memorization behavior of LLMs in generative recommendation, revealing their reliance on one-hop memorization, which limits knowledge utilization. The proposed training strategy IIRG enhances performance by capturing richer item relations, especially for und
FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence
FixMatch is a simplified semi-supervised learning algorithm that generates high-confidence pseudo-labels from weakly-augmented unlabeled images and trains on strongly-augmented versions. It demonstrates state-of-the-art performance on several benchmarks.
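The pseudo-labeling step summarized above can be sketched as follows. This is a minimal illustration of the unlabeled-data loss only, not the authors' implementation: the function names, the plain-list representation of logits, and the 0.95 confidence threshold are assumptions made for the example, and the augmentations and the model itself are left out.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fixmatch_unlabeled_loss(weak_logits, strong_logits, threshold=0.95):
    """Masked cross-entropy over a batch of unlabeled examples.

    weak_logits[i]   : model output for the weakly augmented view of example i
    strong_logits[i] : model output for the strongly augmented view of example i
    threshold        : assumed confidence cutoff for keeping a pseudo-label
    """
    if not weak_logits:
        return 0.0
    total = 0.0
    for weak, strong in zip(weak_logits, strong_logits):
        probs = softmax(weak)
        confidence = max(probs)
        if confidence < threshold:
            continue  # low-confidence prediction contributes zero loss
        pseudo_label = probs.index(confidence)  # hard label from the weak view
        strong_probs = softmax(strong)
        total += -math.log(strong_probs[pseudo_label])
    # Average over the whole batch, including masked-out examples.
    return total / len(weak_logits)

# Hypothetical logits: the first example is confident, the second is not.
loss = fixmatch_unlabeled_loss(
    weak_logits=[[10.0, 0.0, 0.0], [0.1, 0.0, 0.0]],
    strong_logits=[[5.0, 0.0, 0.0], [0.0, 5.0, 0.0]],
)
```

Only the first example passes the threshold here, so the second contributes nothing to the sum but still counts in the batch average. In a full training loop this term would be added to the ordinary supervised loss on labeled data.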
Rethinking Shrinkage Bias in LLM FP4 Pretraining: Geometric Origin, Systemic Impact, and UFP4 Recipe
Published:6/18/2026
This study uncovers shrinkage bias in the E2M1 format for LLM FP4 pretraining, leading to systematic rounding errors and training instability. The proposed UFP4 recipe mitigates geometric asymmetry, enhancing quantization quality and demonstrating lower BF16 relative loss degrada
Selective Synergistic Learning for Video Object-Centric Learning
Published:6/14/2026
The study introduces Selective Synergistic Learning (SSync) to address mismatches between encoder and decoder in video object-centric learning, enhancing boundary refinement and denoising while reducing computational complexity, improving model decomposition quality and robustnes
HumanScale: Egocentric Human Video Can Outperform Real-Robot Data for Embodied Pretraining
Published:6/19/2026
This study provides a controlled comparison of egocentric human videos and real-robot data for pretraining embodied models, finding that well-processed human videos lead to better performance, reducing validation loss by 24% and increasing success rates by 52.5% and 90% respectiv
FreeStyle: Free Control of Style-Content Dual-Reference Generation from Community LoRA Mining
Published:6/19/2026
FreeStyle is a framework for style-content dual-reference generation that prevents semantic leakage while allowing flexible separation of style and content. It leverages community LoRA mining to create a large triplet dataset and employs a two-stage training approach to address d
Configurable Clinical Information Extraction with Agentic RAG: What Works, What Breaks, and Why
Published:6/18/2026
This study presents the ACIE system, which uses Agentic RAG for configurable clinical information extraction. In a validation covering 7,326 clinician judgments, it achieved a 96.5% acceptance rate, highlighting its efficacy and accuracy in structured data extraction.
LegalHalluLens: Typed Hallucination Auditing and Calibrated Multi-Agent Debate for Trustworthy Legal AI
Published:6/16/2026
The paper introduces LegalHalluLens, an auditing framework addressing the 52% hallucination rate in legal AI. It includes typed hallucination profiles, a Risk Direction Index, and a multi-agent debate, revealing specific error types and directions for actionable compliance signal
LooseControlVideo: Directorial Video Control using Spatial Blocking
Published:6/18/2026
The study introduces the LooseControlVideo framework, enabling directorial control in text-to-video generation using sparse 3D boxes, eliminating the need for dense per-frame signals. Users define high-level layouts while the model generates realistic occlusions and dynamics, out
LedgerAgent: Structured State for Policy-Adherent Tool-Calling Agents
Published:6/19/2026
LedgerAgent introduces a structured state management approach for policy-compliant tool-calling agents, mitigating risks of state grounding and policy violations. It significantly improves tool-calling success rates in four customer service domains.
FAPO: Fully Autonomous Prompt Optimization of Multi-Step LLM Pipelines
Published:6/18/2026
The FAPO framework automatically optimizes prompts and structures in multi-step LLM pipelines, addressing failures from complex interactions. It outperforms baselines in 15 of 18 comparisons with an average gain of 14.1 percentage points, also excelling in security tasks.
Context-Aware RL for Agentic and Multimodal LLMs
Published:6/16/2026
The study introduces ContextRL, a reinforcement learning method that enhances large language models' performance in long-horizon reasoning and multimodal tasks by training models to select supportive contexts. ContextRL showed 2.2% and 1.8% improvements in benchmarks for coding
ENPIRE: Agentic Robot Policy Self-Improvement in the Real World
Published:6/18/2026
The ENPIRE framework enables autonomous improvement of robot policies in real-world settings, minimizing human intervention. It consists of four core modules: automatic environment reset, policy execution, outcome verification, and iterative optimization, achieving a 99% success
Beyond Static Leaderboards: Predictive Validity for the Evaluation of LLM Agents
Published:6/18/2026
The paper introduces a novel evaluation method for LLM agents focusing on predictive validity over traditional aggregate scores. By integrating fourteen studies with seven benchmarks, it establishes a twelve-dimensional measurement framework, highlighting the shortcomings of curr
Playful Agentic Robot Learning
Published:6/18/2026
This paper explores how robots can continuously learn reusable skills through self-directed play without explicit instructions. The introduced Robotics Agent Teams (RATs) significantly improve performance on downstream tasks, achieving gains of 20.6% and 17.0% over baselines.
Getting Serious about Humor: Crafting Humor Datasets with Unfunny Large Language Models
Published:2/23/2024
This study explores using Large Language Models (LLMs) to edit humorous texts for synthetic humor detection datasets. Benchmark tests show LLMs excel at 'unfunning' jokes, particularly with GPT-4's data, which received high ratings from bilingual annotators and presented challeng
Harmonizing Semantic and Collaborative in LLMs: Reasoning-based Embedding Generator for Sequential Recommendation
This study presents ReaEmb, a novel framework addressing long-tail issues in sequential recommendation systems by utilizing latent reasoning-enhanced contrastive learning and collaborative reward reinforcement learning, demonstrating its superiority across multiple models through
MetaRCA: A Generalizable Root Cause Analysis Framework for Cloud-Native Systems Powered by Meta Causal Knowledge
Published:3/3/2026
MetaRCA is a generalizable root cause analysis framework for cloud-native systems that constructs a Meta Causal Graph, integrating knowledge from Large Language Models and observational data. It demonstrates improved scalability and generalization, outperforming existing methods
ViT-Up: Faithful Feature Upsampling for Vision Transformers
Published:6/12/2026
ViT-Up is an implicit feature upsampling framework for Vision Transformers, utilizing layer-wise queries instead of external image guidance. It enables feature prediction at any image coordinate while maintaining alignment with the backbone, demonstrating superior performance in
…