Artificial Intelligence

Entropy Is Not Enough: Unlocking Effective Reinforcement Learning for Visual Reasoning via Vision-Anchored Token Selection
Senjie Jin
1 view
Reasoning Structure of Large Language Models
librarian
4 views
Imaginative Perception Tokens Enhance Spatial Reasoning in Multimodal Language Models
librarian
5 views
Gender-Dependent Diagnostic Substitution in LLM Medical Triage: Same Symptoms, Unequal Urgency
Qi Han Wong
5 views
Diagnosing Knowledge Gaps in LLM Tool Use: An Agentic Benchmark for Novel API Acquisition
Jinnuo Liu
5 views
From Answers to States: Verifiable Process-Level Evaluation of Chemical Reasoning in Large Language Models
librarian
6 views
MCP-Persona: Benchmarking LLM Agents on Real-World Personal Applications via Environment Simulation
librarian
6 views
Iteris: Agentic Research Loops for Computational Mathematics
librarian
8 views
AGENTCL: Toward Rigorous Evaluation of Continual Learning in Language Agents
Yiheng Shu
9 views
ClinEnv: An Interactive Multi-Stage Long Horizon EHR Environment for Agents
librarian
9 views
eMoT: evolving Memory-of-Thought via Symbolic Anchoring and Memory Corrosion
librarian
6 views
Property Prediction of Stacked Bilayer Materials: A Multimodal Learning Approach
librarian
7 views
Can AI Review Improve Paper Drafting? An Empirical Study on 20 Computer Architecture Submissions
Di Wu
6 views
Towards Understanding Modality Interaction in Multimodal Language Models via Partial Information Decomposition
librarian
9 views
Subliminal Learning Is Steering Vector Distillation
librarian
5 views
Physics Is All You Need? A Case Study in Physicist-Supervised AI Development of Scientific Software
librarian
29 views
Tiny but Trusted: Efficient Vision-Language Reasoning for Time-Series Anomaly Detection
librarian
30 views
Demystifying Data Organization for Enhanced LLM Training
librarian
35 views
SchGen: PCB Schematic Generation with Semantic-Grounded Code Representations
Qinpei Luo
37 views
ProjectionBench: Evaluating Scientific Hypothesis Generation in LLMs Under Progressive Information Disclosure
Andrew Lew
38 views
MIRA: Mid-training Rubric Anchoring for Source-Aware Data Selection
librarian
34 views
Locally Coherent, Globally Incoherent: Bounding Compositional Incoherence in Multi-Component LLM Agents
librarian
34 views
SwarmHarness: Skill-Based Task Routing via Decentralized Incentive-Aligned AI Agent Networks
Edwin Jose
36 views
CaMBRAIN: Real-time, Continuous EEG Inference with Causal State Space Models
librarian
45 views
AutoScientists: Self-Organizing Agent Teams for Long-Running Scientific Experimentation
librarian
49 views
CORE: Contrastive Reflection Enables Rapid Improvements in Reasoning
librarian
42 views
Calibrating Conservatism for Scalable Oversight
librarian
45 views
Detecting Is Not Resolving: The Monitoring Control Gap in Retrieval Augmented LLMs
librarian
43 views
VitaBench 2.0: Evaluating Personalized and Proactive Agents in Long-Term User Interactions
librarian
40 views
SIA: Self Improving AI with Harness & Weight Updates
librarian
34 views
Alignment Tampering: How Reinforcement Learning from Human Feedback Is Exploited to Optimize Misaligned Biases
librarian
40 views
MUSE-Autoskill: Self-Evolving Agents via Skill Creation, Memory, Management, and Evaluation
Tieying Zhang
47 views