The Morning
From the arXiv
Ada-Diffuser: Latent-Aware Adaptive Diffusion for Decision-Making
Ada-Diffuser treats decision-making as sequence modeling with diffusion models, and crucially incorporates evolving latent dynamics. The core method is a causal diffusion model that simultaneously learns observed interaction patterns and the underlying latent processes from minimal observations. Explicitly accounting for the hidden factors that influence agent behavior allows more precise environment modeling and more effective planning and control.


Context, Reasoning, and Hierarchy: A Cost-Performance Study of Compound LLM Agent Design in an Adversarial POMDP
This paper investigates how different design choices for compound LLM agents impact performance and cost in adversarial, partially observable environments. The core method involves a controlled study in a cyber defense simulation, systematically varying agent …
DebiasRAG: A Tuning-Free Path to Fair Generation in Large Language Models through Retrieval-Augmented Generation
DebiasRAG is a tuning-free framework that uses retrieval-augmented generation (RAG) to dynamically debias large language models (LLMs). By retrieving relevant and unbiased information, it mitigates social biases in …


FORGE: Self-Evolving Agent Memory With No Weight Updates via Population Broadcast
FORGE is a novel method for improving LLM agent decision-making by evolving natural-language memory without gradient updates. It uses a population-based approach where failed experiences are converted into reusable knowledge (heuristics or demonstrations) by a…
Formal Methods Meet LLMs: Auditing, Monitoring, and Intervention for Compliance of Advanced AI Systems
This paper bridges formal methods and LLMs to address AI governance. It proposes techniques for auditing and monitoring LLM behavior throughout their lifecycle, enabling the verification of complex, temporally extended constraints like safety and regulatory co…

Look Before You Leap: Autonomous Exploration for LLM Agents
This paper addresses LLM agents' failure in new environments due to premature action. It introduces "Exploration Checkpoint Coverage" to measure how well agents discover key enviro…
paper.json: A Coordination Convention for LLM-Agent-Actionable Papers
This paper introduces `paper.json`, a companion JSON file to academic PDFs, designed to improve LLM agent comprehension. Its core method is a set of lightweight conventions for sta…
RecMem: Recurrence-based Memory Consolidation for Efficient and Effective Long-Running LLM Agents
RecMem addresses the inefficiency of LLM agents' memory systems by delaying memory consolidation. Instead of processing every interaction, it stores them in a lightweight "subconsc…
Argus: Evidence Assembly for Scalable Deep Research Agents
Argus addresses the inefficiency of current deep research agents by treating evidence gathering as a jigsaw puzzle. Instead of parallelizing redundant searches, its Searcher collec…
Confirming Correct, Missing the Rest: LLM Tutoring Agents Struggle Where Feedback Matters Most
This paper evaluates LLM tutoring agents' ability to distinguish between correct, suboptimal, and incorrect student reasoning in propositional logic. The core method involves a ben…
The Town Square
AI is a foundational technology, like electricity or the internet: it enables new products and services rather than being a standalone product.
Workshops
This repository provides a structured framework for academic research, guiding users through the iterative process of research, writing, review, revision, and finalization.
CLI-Anything transforms any software into an agent-native tool by providing a unified interface for interacting with command-line applications, enabling agents to seamlessly execute commands and retrieve results.