Artificial Intelligence

Evo-PI: Aligning Medical Reasoning via Evolving Principle-Guided Supervision
Xianda Zheng
3 views
A Self-Evolving Agentic System for Automated Generation and Execution of Biological Protocols
librarian
5 views
Harnessing Textual Refusal Directions for Multimodal Safety
librarian
7 views
An Agentic AI Framework to Accelerate Scientific Discovery in Plant Phenotyping
Renan Souza
4 views
RAISE: LLM-based Automated Heuristic Design with Robust Adversary Instance Search
librarian
5 views
TreeAgent: A Generalizable Multi-Agent Framework for Automated Bias Labeling in Forestry via Compiled Expert Rules and Vision-Language Models
librarian
5 views
AxDafny: Agentic Verified Code Generation in Dafny
librarian
13 views
The FIL Hypothesis: Inductive Biases Help with Kernel Engineering
librarian
5 views
Whose Side Is Your Agent On? Multi-Party Principal Loyalty in LLM Agents
librarian
5 views
Linguistic Firewall: Geometry as Defense in Multi-Agent Systems Routing
Dvir Alsheich
7 views
The Human Creativity Benchmark
librarian
5 views
DOPD: Dual On-policy Distillation
librarian
5 views
Self-Evolving World Models for LLM Agent Planning
Xuan Zhang
7 views
MirrorCode: AI can rebuild entire programs from behavior alone
Tom Adamczewski
7 views
Clarus: Coordinating Autonomous Research Agents toward Web-Scale Scientific Collaboration
librarian
8 views
PromptGNN-sim: Deep Fusion and Alignment of GNN and LLMs for Text-Attributed Graph Learning
librarian
9 views
Evidence-Informed LLM Beliefs for Continual Scientific Discovery
librarian
6 views
Hierarchical Experimentalist Agents
librarian
5 views
PHF: Privileged Hidden Flow for On-Policy Self-Distillation
librarian
5 views
Adaptive Utility driven Resource Orchestration for Resilient AI (AURORA-AI)
librarian
21 views
Ask, Don't Judge: Binary Questions for Interpretable LLM Evaluation and Self-Improvement
Sangwoo Cho
27 views
EO-WM: A Physically Informed World Model for Probabilistic Earth Observation Forecasting
Junwei Luo
28 views
Simulation-based inference for rapid Bayesian parameter estimation in epidemiological models: a comparison with MCMC
Alina Bazarova
26 views
Language-Based Digital Twins for Elderly Cognitive Assistance
librarian
29 views
OpenRCA 2.0: From Outcome Labels to Causal Process Supervision
librarian
15 views
When Does Combining Language Models Help? A Co-Failure Ceiling on Routing, Voting, and Mixture-of-Agents Across 67 Frontier Models
librarian
18 views
InvestPhilBench: A Multi-Layer Dynamic Benchmark for Evaluating Large Language Model Procedural Reasoning in Expert Investment Philosophy
librarian
29 views
Autodata: An agentic data scientist to create high quality synthetic data
Ilia Kulikov
29 views
The Unfireable Safety Kernel: Execution-Time AI Alignment for AI Agents and Other Escapable AI Systems
librarian
31 views
Cliff Tokens: Identifying Single-Token Failure Triggers in LLM Mathematical Reasoning
librarian
35 views
AI Snitches Get Glitches: Towards Evading Agentic Surveillance
Hyejun Jeong
32 views
Confidence Sequences for Online Statistical Model Checking of Markov Decision Processes
librarian
34 views