This paper looks at how we check the checkers for long-form question-answering systems. The authors focus on ScholarQA-CS2, a benchmark for
Large language models can reason in impressive ways, but they also make systematic reasoning mistakes that are hard to fix with broad retraining.
This paper introduces LieCraft, a new evaluation framework and sandbox for measuring deception in large language models (LLMs). In plain ter
This paper introduces SAHOO, a practical framework to watch and control subtle shifts in behavior when machine learning systems update themselves.
Large language model (LLM) agents often claim they called a tool or read a webpage when they did not. This paper introduces NabaOS, a practi