Sample-Efficient Diffusion-based Reinforcement Learning with Critic Guidance
- Recent advances in reinforcement learning (RL) have achieved great success by leveraging the multimodality and exploration capability of diffusion policies. Among these approaches, one...
Fewer Steps, Better Performance: Efficient Cross-Modal Clip Trimming for Video Moment Retrieval Using Language
- Given an untrimmed video and a sentence query, video moment retrieval using language (VMR) aims to locate a target query-relevant moment. Since the untrimmed video is overlong, almost all existing...
When Do Graph Foundation Models Transfer? A Data-Centric Theory
- Graph foundation models (GFMs) aim to reuse a single backbone across diverse graph domains, yet their transfer is often uneven and can exhibit negative transfer. While most prior work improves...
Not All Inputs Are Valid: Towards Open-Set Video Moment Retrieval Using Language
- Video Moment Retrieval (VMR) aims to retrieve the specific moment corresponding to a sentence query from an untrimmed video. Although recent works have made remarkable progress in this task, they...
NICE: A Theory-Grounded Diagnostic Benchmark for Social Intelligence of LLMs
- As large language models (LLMs) are increasingly applied in social contexts such as emotional companionship and customer service, measuring their social intelligence has become critical to the...
GRASP: Gated Regression-Aware Skill Proposer for Self-Improving LLM Agents
- LLM agents acting in structured environments fail in operational rather than conversational ways, and reliability depends on procedural knowledge of the environment. Prior self-improvement methods...
State-Anchored Complete-View Distillation for Robust Conversational Multimodal Emotion Recognition
- Conversational multimodal emotion recognition (MER) requires reliable prediction when language, acoustic, or visual observations are missing or unreliable. Many missing-modality methods reconstruct...
Source-Grounded Semantic Reinforcement Learning for Low-Resource Target-Language Generation
- Low-resource target-language generation is often limited by scarce parallel data, while high-resource source-language monolingual data is abundant but difficult to use with standard supervised...
ConMoE: Expert-Pool Consolidation via Prototype Reassignment for MoE Compression
- Mixture-of-Experts (MoE) language models reduce per-token computation but still require storing and serving all experts, making deployment memory-intensive. Existing post-training compression methods...
OISD: On-Policy Internal Self-Distillation of Language Models
- Recent reinforcement learning (RL) post-training approaches primarily optimize the final output policy using sparse outcome-level rewards, while largely overlooking predictive signals encoded in...
Beyond Binary: Sim-to-Real Dexterous Manipulation with Physics-Grounded Contact Representation
- A primary bottleneck in contact-rich manipulation is the difficulty of collecting real-world data. Sim-to-real reinforcement learning offers a scalable alternative, but the simulation-reality gap...
Not All Uncertainty Is Equal: How Uncertainty Granularity Shapes Human Verification in LLM-Assisted Decision Making
- Despite warnings that LLMs can make mistakes, users often develop inappropriate trust and accept incorrect answers without critical evaluation. Uncertainty quantification (UQ), displaying LLMs'...
Bandwidth-Efficient and Privacy-Preserving Edge-Cloud Many-to-Many Speech Translation
- Multimodal large language models (MLLMs) have demonstrated significant potential for speech-to-text translation (S2TT). However, existing deployment paradigms face critical challenges: pure on-device...
SPRINT: Efficient Spectral Priors for Humanoid Athletic Sprints
- The pursuit of humanoid athletic sprints is hindered by a scarcity of humanoid-viable kinematic reference data and the inability of existing frameworks to maintain stability during sprints. To...
FABSVer: Faster Training and Better Self-Verification for LLM Mathematical Reasoning
- While large language models have made significant progress in mathematical reasoning, they remain unreliable at judging the correctness of their own solutions. Existing approaches that equip models...
Do LLMs Build World Models From Text? A Multilingual Diagnostic of Spatial Reasoning
- Whether large language models (LLMs) construct internal spatial world models from pure-text descriptions remains contested, and whether such capabilities transfer across languages has not been...
AgentGuard: An Attribute-Based Access Control Framework for Tool-Use LLM-Based Agent
- LLM-based agents have recently attracted significant attention due to their ability to autonomously invoke relevant tools to accomplish complex tasks. However, recent studies have shown that these...
ConvMemory: A Lightweight Learned Memory Reranker, a Negative Attribution Result, and a Research-Preview Conflict Editor
- We describe ConvMemory, a small 3.6M-parameter learned reranker for conversational long-term memory retrieval, trained with cross-encoder teacher supervision over fused dense and lexical features. On...
Confidence-Orchestrated Self-Evolution against Uncertain LLM Feedback
- Self-evolving large language models (LLMs) learn by generating their own training tasks and solutions, reducing reliance on human-curated supervision. However, in many reasoning domains, the model...
SPAR: Support-Preserving Action Rectification
- Offline policy improvement faces an inherent conflict between maximizing value and fitting the data distribution. While in-sample weighted regression is stable, it suffers from over-conservatism that...
ClothTransformer: Unified Latent-Space Transformers for Scalable Cloth Simulation
- Unified and scalable Transformers have recently achieved remarkable success in modeling diverse phenomena traditionally associated with computer graphics, such as 3D visual effects, rendering...