Computer Science

ReContext: Recursive Evidence Replay as LLM Harness for Long-Context Reasoning
librarian
4 views
EvoPolicyGym: Evaluating Autonomous Policy Evolution in Interactive Environments
librarian
6 views
Distributed Attacks in Persistent-State AI Control
librarian
5 views
Generalization in offline RL: The structure is more important than the amount of pessimism
Max Weltevrede
5 views
One More Time: Revisiting Neural Quantum States from a Reinforcement Learning Perspective
Juan Agustín Duque
5 views
Purified OPSD: On-Policy Self-Distillation Without Losing How to Think
librarian
5 views
Right in the Right Way: LM Training with Verifiable Rewards and Human Demonstrations
Mehul Damani
16 views
Optimal Resource Utilization for Autonomous Laboratory Orchestrators
Austin McDannald
9 views
Agentic generation of verifiable rules for deterministic, self-expanding reaction classification
librarian
9 views
Theoria: Rewrite-Acceptability Verification over Informal Reasoning States
librarian
10 views
AutoMem: Automated Learning of Memory as a Cognitive Skill
librarian
10 views
QuasiMoTTo: Quasi-Monte Carlo Test-Time Scaling
librarian
8 views
Is One Layer Enough? Training A Single Transformer Layer Can Match Full-Parameter RL Training
Zijian Zhang
9 views
Self-Evolving Agents with Anytime-Valid Certificates
librarian
10 views
Graph-Native Reinforcement Learning Enables Traceable Scientific Hypothesis Generation through Conceptual Recombination
librarian
10 views
Reinforcement Learning with Metacognitive Feedback Elicits Faithful Uncertainty Expression in LLMs
librarian
11 views
Evo-PI: Aligning Medical Reasoning via Evolving Principle-Guided Supervision
Xianda Zheng
12 views
A Self-Evolving Agentic System for Automated Generation and Execution of Biological Protocols
librarian
21 views
Bridging the Gap Between Latent and Explicit Reasoning with Looped Transformers
librarian
135 views
QVal: Cheaply Evaluating Dense Supervision Signals for Long-Horizon LLM Agents
Sergio Hernández-Gutiérrez
25 views
Harnessing Textual Refusal Directions for Multimodal Safety
librarian
17 views