Artificial Intelligence

Nonslop: A Gamified Experiment in Human-AI Collaborative Writing
Maria Edwards
0 views
Towards Responsibly Non-Compliant Machines
librarian
2 views
The Impossibility of Eliciting Latent Knowledge
librarian
2 views
A Five-Plane Reference Architecture for Runtime Governance of Production AI Agents
Krti Tallam
2 views
PROJECTMEM: A Local-First, Event-Sourced Memory and Judgment Layer for AI Coding Agents
librarian
2 views
StatefulDiscovery: Evidence-Calibrated Claim Formation in Open-Ended Scientific Discovery
12531182
3 views
Embodied-BenchClaw: An Autonomous Multi-Agent System for Embodied Spatial Intelligence Benchmark Construction
librarian
3 views
ABC-Bench: An Agentic Bio-Capabilities Benchmark for Biosecurity
librarian
8 views
CIAware-Bench: Benchmarking Control Intervention Awareness Across Frontier LLMs
librarian
8 views
Null-Space Constrained Low-Rank Adaptation for Response-Specified Large Language Model Unlearning
librarian
8 views
Workflow-GYM: Towards Long-Horizon Evaluation of Computer-use Agentic tasks in Real-World Professional Fields
librarian
9 views
ReasonAlloc: Hierarchical Decoding-Time KV Cache Budget Allocation for Reasoning Models
librarian
8 views
AutoPDE: Reliable Agentic PDE Solving via Explicitly Represented Solver Strategies
librarian
6 views
Frontier Coding Agents Use Metaprogramming to Adapt to Unfamiliar Programming Languages
librarian
7 views
Moonshine: An Autonomous Mathematical Research Agent Centered on Conjecture Generation
librarian
5 views
WorldKernel: A World Model is the Coupling Kernel of Admissible Possible Worlds
librarian
5 views
Recalling Too Well: Sycophancy Evaluation and Mitigation in Memory-Augmented Models
librarian
5 views
(Auto)formalization is supposed to be easy: Trellis process semantics for spelling out rigorous proofs
librarian
15 views
SIGA: Self-Evolving Coding-Agent Adapters for Scientific Simulation
librarian
14 views
Proxy Reward Internalization and Mechanistic Exploitation: A Learned Precursor to Reward Hacking and Its Generalization
Mohammad Beigi
13 views
SearchSwarm: Towards Delegation Intelligence in Agentic LLMs for Long-Horizon Deep Research
librarian
15 views
Evaluation Cards: An Interpretive Layer for AI Evaluation Reporting
librarian
18 views
From 0-to-1 to 1-to-N: Reproducible Engineering Evidence for MetaAI Recursive Self-Design
librarian
11 views
Optical Reasoning: Rethinking Images as an Expressive Reasoning Medium Beyond Text
Yutong Bian
17 views
TokenMizer: Graph-Structured Session Memory for Long-Horizon LLM Context Management
Shweta Mishra
83 views
Vortex: Efficient and Programmable Sparse Attention Serving for AI Agents
Zhuoming Chen
31 views
Benchmark Everything Everywhere All at Once
librarian
23 views
Goedel-Architect: Streamlining Formal Theorem Proving with Blueprint Generation and Refinement
librarian
22 views
MLEvolve: A Self-Evolving Framework for Automated Machine Learning Algorithm Discovery
Xiangchao Yan
23 views
Beyond Objective Equivalence: Constraint Injection for LLM-Based Optimization Modeling on Vehicle Routing Problems
librarian
25 views
R-APS: Compositional Reasoning and In-Context Meta-Learning for Constrained Design via Reflective Adversarial Pareto Search
librarian
28 views
AutoLab: Can Frontier Models Solve Long-Horizon Auto Research and Engineering Tasks?
librarian
29 views