Artificial Intelligence

Reasoning Structure of Large Language Models
librarian
0 views
Imaginative Perception Tokens Enhance Spatial Reasoning in Multimodal Language Models
librarian
0 views
Gender-Dependent Diagnostic Substitution in LLM Medical Triage: Same Symptoms, Unequal Urgency
Qi Han Wong
2 views
Diagnosing Knowledge Gaps in LLM Tool Use: An Agentic Benchmark for Novel API Acquisition
Jinnuo Liu
2 views
From Answers to States: Verifiable Process-Level Evaluation of Chemical Reasoning in Large Language Models
librarian
2 views
MCP-Persona: Benchmarking LLM Agents on Real-World Personal Applications via Environment Simulation
librarian
4 views
Iteris: Agentic Research Loops for Computational Mathematics
librarian
4 views
AGENTCL: Toward Rigorous Evaluation of Continual Learning in Language Agents
Yiheng Shu
5 views
ClinEnv: An Interactive Multi-Stage Long Horizon EHR Environment for Agents
librarian
6 views
eMoT: evolving Memory-of-Thought via Symbolic Anchoring and Memory Corrosion
librarian
4 views
Property Prediction of Stacked Bilayer Materials: A Multimodal Learning Approach
librarian
4 views
Can AI Review Improve Paper Drafting? An Empirical Study on 20 Computer Architecture Submissions
Di Wu
4 views
Towards Understanding Modality Interaction in Multimodal Language Models via Partial Information Decomposition
librarian
7 views
Subliminal Learning Is Steering Vector Distillation
librarian
4 views
Physics Is All You Need? A Case Study in Physicist-Supervised AI Development of Scientific Software
librarian
24 views
Tiny but Trusted: Efficient Vision-Language Reasoning for Time-Series Anomaly Detection
librarian
26 views
Demystifying Data Organization for Enhanced LLM Training
librarian
32 views
SchGen: PCB Schematic Generation with Semantic-Grounded Code Representations
Qinpei Luo
35 views
ProjectionBench: Evaluating Scientific Hypothesis Generation in LLMs Under Progressive Information Disclosure
Andrew Lew
35 views
MIRA: Mid-training Rubric Anchoring for Source-Aware Data Selection
librarian
32 views
Locally Coherent, Globally Incoherent: Bounding Compositional Incoherence in Multi-Component LLM Agents
librarian
32 views
SwarmHarness: Skill-Based Task Routing via Decentralized Incentive-Aligned AI Agent Networks
Edwin Jose
36 views
CaMBRAIN: Real-time, Continuous EEG Inference with Causal State Space Models
librarian
45 views
AutoScientists: Self-Organizing Agent Teams for Long-Running Scientific Experimentation
librarian
47 views
CORE: Contrastive Reflection Enables Rapid Improvements in Reasoning
librarian
41 views
Calibrating Conservatism for Scalable Oversight
librarian
44 views
Detecting Is Not Resolving: The Monitoring Control Gap in Retrieval Augmented LLMs
librarian
43 views
VitaBench 2.0: Evaluating Personalized and Proactive Agents in Long-Term User Interactions
librarian
39 views
SIA: Self Improving AI with Harness & Weight Updates
librarian
33 views
Alignment Tampering: How Reinforcement Learning from Human Feedback Is Exploited to Optimize Misaligned Biases
librarian
38 views
MUSE-Autoskill: Self-Evolving Agents via Skill Creation, Memory, Management, and Evaluation
Tieying Zhang
46 views
The Attribution Blind Spot: Detecting When Language Models Rely on Memory Rather Than Retrieved Context
librarian
37 views