2026-05-19 — Linnet Daily

There is nothing better than being a parent. It is the most challenging job one could ever ask for. I love being a mom and I love being a friend to my children as well. — Marlee Matlin

39 items · 3 sections

§I arXiv Papers (20) §II Hacker News (10) §III GitHub Trending (9)

§ 0

The Morning

Local weather 1

This morning in London: Overcast

Today's range: 17.4° ↓ 11.1° (currently 15.0°)

Feels: 13.0°
Rain: 100%
Wind: 17 km/h
Humid: 80%
Rise: 05:02
Set: 20:51

§ I

From the arXiv

arXiv preprints 10 of 20

cs.AI · arxiv:2605.18661v1 · Lead article

AI for Auto-Research: Roadmap & User Guide

Lingdong Kong, Xian Sun, Wei Chow, Linfeng Li, Kevin Qinghong Lin

This paper analyzes the AI research lifecycle, from idea generation to dissemination, identifying a critical boundary between reliable AI assistance and unreliable autonomy. While AI excels at structured tasks like literature review and data generation, it remains unreliable in more nuanced areas: it may fabricate results, and it struggles to identify errors and assess novelty, particularly under scientific pressure. The authors provide a roadmap and user guide to navigate these capabilities and limitations.

Read abstract → · Full PDF

AI auto-research across the complete lifecycle. We organize AI assistance into four phases and eight stages: (1) Creation spans idea generation, literature review, coding & experiments, and tables & figures; (2) Writing centers on paper writing; (3) Validation includes peer review and rebuttal & revision; and (4) Dissemination transforms papers into posters, slides, videos, social media, project pages, and interactive paper agents.

cs.AI · arxiv:2605.18747v1

Code as Agent Harness

Xuying Ning, Katherine Tieu et al.

This paper introduces "code as agent harness," a new perspective on how large language models (LLMs) are used in agentic systems. The core method is to view code not just as an output, but as the fundamental infrastructure for agent reasoning, action, and envi…

abstract pdf

cs.AI · arxiv:2605.18672v1

Position: A Three-Layer Probabilistic Assume-Guarantee Architecture Is Structurally Required for Safe LLM Agent Deployment

S. Bensalem, Y. Dong et al.

This paper argues that LLM agent safety requires a three-layer probabilistic architecture, not a single one. Each layer enforces a distinct safety dimension (intent, environment, dynamics) using independently certified probabilistic guarantees, which then form…

abstract pdf

Overview of SkillGenBench. Skill-generation pipelines transform repository- and document-grounded sources into standardized skill packages, which are evaluated under task-conditioned and task-agnostic tracks with fixed execution checks and artifact-level diagnostics.

cs.AI · arxiv:2605.18693v1

SkillGenBench: Benchmarking Skill Generation Pipelines for LLM Agents

Yifan Zhou, Zhentao Zhang et al.

This paper introduces SkillGenBench, a novel benchmark designed to evaluate the crucial ability of LLM agents to generate correct and reusable skills from raw data. Unlike previous benchmarks, SkillGenBench specifically isolates and assesses the skill generati…

abstract pdf

cs.LG · arxiv:2605.18703v1

EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL

Minrui Xu, Zilin Wang et al.

EnvFactory addresses the challenges of scaling tool-use LLM agents by automatically synthesizing realistic, stateful execution environments from authentic resources. It then generates robust, multi-turn training data by sampling and refining trajectories to ca…

abstract pdf

The left figure presents an overview of EnvGen: the Search Agent autonomously proposes and searches for authentic sources; the Code Agent implements the database and code using feedback from the Test Agent; and the Test Agent generates test cases and error reports. The collaboration among the three agents constructs diverse, verified environments. The right figure displays a sunburst plot of environments, with the inner ring indicating the share of each domain they belong to and the outer ring showing the number of tools for each environment.

General Preference Reinforcement Learning

Muhammad Umer, Muhammad Ahmed Mohsin et al.

This paper introduces General Preference Reinforcement Learning (GPRL) to bridge the gap between online RL and preference optimization for LLMs. GPRL uses a General Preference Mode…

abstract pdf

MA$^{2}$P: A Meta-Cognitive Autonomous Intelligent Agents Framework for Complex Persuasion

Dingyi Zhang, Ziqing Zhuang et al.

MA$^{2}$P is a novel framework for complex persuasive dialogue generation that addresses limitations in current approaches. It employs a meta-cognitive, multi-agent architecture to…

abstract pdf

AMR-SD: Asymmetric Meta-Reflective Self-Distillation for Token-Level Credit Assignment

Zhenlin Wei, Pu Jian et al.

This paper introduces Asymmetric Meta-Reflective Self-Distillation (AMR-SD) to address the credit-assignment problem in aligning LLMs for complex reasoning. Instead of directly usi…

abstract pdf

CrossView Suite: Harnessing Cross-view Spatial Intelligence of MLLMs with Dataset, Model and Benchmark

Wei Wang, Yuqian Yuan et al.

This paper introduces CrossView Suite, a comprehensive framework to enhance multimodal large language models' (MLLMs) spatial reasoning across multiple viewpoints. It addresses dat…

abstract pdf

DashAttention: Differentiable and Adaptive Sparse Hierarchical Attention

Yuxiang Huang, Nuno M. T. Gonçalves et al.

DashAttention introduces a novel hierarchical attention mechanism that addresses limitations of prior methods. Its core innovation is using an adaptive sparse $α$-entmax transforma…

abstract pdf

See all 20 papers →

§ II

The Town Square

Hacker News 10

470 pts · Top story

The last six months in LLMs in five minutes

The past six months have seen rapid LLM advancements, including improved reasoning, multimodal capabilities, and the emergence of smaller, more efficient models.

simonwillison.net · 19 May · discuss on HN →

469

We stopped AI bot spam in our GitHub repo using Git's --author flag

archestra.ai · 18 May

355

Eric Schmidt speech about AI booed during graduation

nbcnews.com · 18 May

270

We let AIs run radio stations

andonlabs.com · 18 May

235

AI eats the world (Spring 26) [pdf]