Orchard: An Open-Source Agentic Modeling Framework
librarian
27 views
APWA: A Distributed Architecture for Parallelizable Agentic Workflows
librarian
25 views
OpenDeepThink: Parallel Reasoning via Bradley-Terry Aggregation
librarian
29 views
Senses Wide Shut: A Representation-Action Gap in Omnimodal LLMs
librarian
26 views
Harnessing Agentic Evolution

Artificial Intelligence
librarian
22 views
History Anchors: How Prior Behavior Steers LLM Decisions Toward Unsafe Actions
librarian
31 views
D-VLA: A High-Concurrency Distributed Asynchronous Reinforcement Learning Framework for Vision-Language-Action Models
Yucheng Guo
25 views
Differentiable Learning of Lifted Action Schemas for Classical Planning
Jonas Reiter
25 views
Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling
librarian
30 views
CAAFC: Chronological Actionable Automated Fact-Checker for misinformation / non-factual hallucination detection and correction
Islam Eldifrawi
31 views
Formalize, Don't Optimize: The Heuristic Trap in LLM-Generated Combinatorial Solvers
librarian
31 views
Semantic Reward Collapse and the Preservation of Epistemic Integrity in Adaptive AI Systems
librarian
32 views
ToolCUA: Towards Optimal GUI-Tool Path Orchestration for Computer Use Agents
Xuhao Hu
72 views
δ-mem: Efficient Online Memory for Large Language Models
librarian
60 views
Classifier Context Rot: Monitor Performance Degrades with Context Length
librarian
36 views
Reward Hacking in Rubric-Based Reinforcement Learning
Anas Mahmoud
31 views
On-Policy Self-Evolution via Failure Trajectories for Agentic Safety Alignment
librarian
31 views
When Simulation Lies: A Sim-to-Real Benchmark and Domain-Randomized RL Recipe for Tool-Use Agents
Xiaolin Zhou
26 views
From Noise to Diversity: Random Embedding Injection in LLM Reasoning
librarian
28 views
BenchCAD: A Comprehensive, Industry-Standard Benchmark for Programmatic CAD
librarian
33 views
The Generalized Turing Test: A Foundation for Comparing Intelligence
librarian
33 views
NanoResearch: Co-Evolving Skills, Memory, and Policy for Personalized Research Automation
librarian
66 views
From Controlled to the Wild: Evaluation of Pentesting Agents for the Real-World
librarian
21 views
Remember the Decision, Not the Description: A Rate-Distortion Framework for Agent Memory
Lizhen Qu
21 views
Shepherd: A Runtime Substrate Empowering Meta-Agents with a Formalized Execution Trace
librarian
32 views
SkillOS: Learning Skill Curation for Self-Evolving Agents
librarian
45 views
AI Co-Mathematician: Accelerating Mathematicians with Agentic AI
Daniel Zheng
42 views
On-line Learning in Tree MDPs by Treating Policies as Bandit Arms
Anvay Shah
32 views
Executable World Models for ARC-AGI-3 in the Era of Coding Agents
Sergey Rodionov
47 views
Position: Embodied AI Requires a Privacy-Utility Trade-off
librarian
34 views
LongSeeker: Elastic Context Orchestration for Long-Horizon Search Agents
librarian
34 views
A Foundation Model for Zero-Shot Logical Rule Induction
librarian
36 views