The Morning
From the arXiv
History Anchors: How Prior Behavior Steers LLM Decisions Toward Unsafe Actions
This paper introduces HistoryAnchor-100, a dataset designed to test LLM safety by examining how prior harmful actions influence future decisions. The core method involves presenting LLMs with scenarios where a harmful past action is followed by a choice between safe and unsafe options. The key contribution is demonstrating that a simple instruction to "stay consistent with the strategy shown in the prior history" dramatically increases LLM unsafe action selection, even for highly aligned models, highlighting a critical vulnerability in current LLM agent design.
Temper and Tilt Lead to SLOP: Reward Hacking Mitigation with Inference-Time Alignment
This paper introduces SLOP, a method for inference-time alignment that generalizes existing techniques by using a sharpened logarithmic opinion pool of generative reward models. By adjusting the "temperature" of reference models and calibrating SLOP weights, t…
AttenA+: Rectifying Action Inequality in Robotic Foundation Models
This paper introduces AttenA+, a framework that addresses the "action inequality" in robotic foundation models. It recognizes that low-velocity actions are often more critical for task success than high-velocity transitions. AttenA+ rectifies this by reweighti…

Beyond Perplexity: A Geometric and Spectral Study of Low-Rank Pre-Training
This paper investigates whether low-rank pre-training methods for large language models generalize as well as full-rank training, a question previously addressed only by limited perplexity metrics. The authors provide a more thorough comparison by analyzing th…
Children's English Reading Story Generation via Supervised Fine-Tuning of Compact LLMs with Controllable Difficulty and Safety
This paper fine-tunes compact LLMs (8B parameters) on expert-designed children's reading curricula and existing generated stories. The core method focuses on controllable difficulty and safety, enabling educators to target specific reading levels. The main con…

EVA-Bench: A New End-to-end Framework for Evaluating Voice Agents
EVA-Bench is an end-to-end framework for evaluating voice agents. Its core method involves generating realistic, multi-turn bot-to-bot audio conversations with automatic validation…
Harnessing Agentic Evolution
This paper introduces AEvo, a meta-editing framework for agentic evolution. AEvo treats the evolutionary process as an interactive environment, using accumulated evidence as its st…
Position: Assistive Agents Need Accessibility Alignment
This paper argues that assistive AI agents for visually impaired users must prioritize "accessibility alignment" as a core design goal, not an afterthought. Current agentic AI fail…
RealICU: Do LLM Agents Understand Long-Context ICU Data? A Benchmark Beyond Behavior Imitation
This paper introduces RealICU, a novel benchmark for evaluating LLMs on long-context ICU data. Unlike previous benchmarks that rely on potentially suboptimal clinician actions, Rea…
ScioMind: Cognitively Grounded Multi-Agent Social Simulation with Anchoring-Based Belief Dynamics and Dynamic Profiles
ScioMind is a novel multi-agent social simulation framework that integrates structured opinion dynamics with LLM-based agent reasoning. Its core method combines a personality-condi…
The Town Square
A user reported losing access to their projects after unsubscribing from Claude Design, highlighting a potential issue with the service's data retention policy.
Workshops
OpenHuman is a personal AI superintelligence designed to be private, simple, and powerful, acting as your dedicated AI assistant.
This repository provides persistent memory for AI coding agents, enabling them to retain and recall information effectively, which is crucial for complex coding tasks.