First-hand information for everyone
This paper shows how to combine modern score-based generative models with a popular optimization algorithm, the alternating direction method
This paper introduces Questions-of-Thoughts, or QoT, a method that steers large language models (LLMs) toward higher-quality software design
This paper looks at how large language models (LLMs) can turn a researcher’s plain-language request into running code and tools, while still
This paper introduces LieCraft, a new evaluation framework and sandbox for measuring deception in large language models (LLMs). In plain ter
This paper introduces SAHOO, a practical framework to watch and control subtle shifts in behavior when machine learning systems update thems
Researchers tested whether dense numerical summaries of DNA sequences — called embeddings — can be turned back into the original DNA. Embedd
Researchers introduced MEMO, a method that makes long, multi-turn games played by large language model (LLM) agents both stronger and more c
Researchers propose AgentOS, a new way to design personal computers where people talk or type naturally and a central “Agent Kernel” carries
This paper looks at security risks that arise when many autonomous AI agents work together. These “multi‑agent systems” share memory, call e
Large language model (LLM) agents often claim they called a tool or read a webpage when they did not. This paper introduces NabaOS, a practi