Tool receipts plus Indian epistemology flag AI agent hallucinations in under 15 ms
Large language model (LLM) agents often claim they called a tool or read a webpage when they did not. This paper introduces NabaOS, a practical system that detects many of those “hallucinations” quickly. NabaOS creates signed “tool receipts” that the model cannot forge and then checks each claim in an agent’s reply against those receipts in real time.
The authors borrow categories of knowledge from Nyaya Shastra, a classical Indian theory of how we know things. Every factual claim in an agent response is labeled by its epistemic source (pramāṇa): pratyakṣa (direct tool output), anumāna (inference from data), upamāna (analogy or comparison), śabda (external testimony, like a website), abhāva (an absence claim), or ungrounded opinion. NabaOS issues HMAC-signed receipts (HMAC = Hash-based Message Authentication Code) for each tool call. The system cross-references claims with receipts and runs extra checks — for example, independently re-fetching URLs or replaying computations — when an agent delegates complex web tasks.
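To make the receipt mechanism concrete, here is a minimal sketch of how a runtime might issue and verify HMAC-signed tool receipts. This is not the paper's implementation: the function names, record fields, and key handling are illustrative assumptions. The key point is that the signing key lives with the runtime, not the model, so the model cannot fabricate a valid receipt for a tool call that never happened.

```python
import hashlib
import hmac
import json
import time

# Hypothetical secret held by the agent runtime; the model never sees it.
SECRET_KEY = b"runtime-held-secret-key"


def issue_receipt(tool_name: str, tool_output: str) -> dict:
    """Record a real tool call and sign the record with the runtime's key."""
    record = {
        "tool": tool_name,
        # Digest of the raw output, so claims can be checked against it later.
        "output_digest": hashlib.sha256(tool_output.encode()).hexdigest(),
        "timestamp": time.time(),
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["sig"] = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return record


def verify_receipt(receipt: dict) -> bool:
    """Check that a receipt was signed by the runtime and not altered."""
    record = {k: v for k, v in receipt.items() if k != "sig"}
    payload = json.dumps(record, sort_keys=True).encode()
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison to avoid timing side channels.
    return hmac.compare_digest(expected, receipt.get("sig", ""))
```

At response time, a verifier would look up a receipt for each claimed tool call and run `verify_receipt` on it; a claim with no matching valid receipt is flagged as a fabricated tool reference.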
The paper tests NabaOS on NyayaVerifyBench, a new benchmark of 1,800 agent-response scenarios in four languages with six injected hallucination types. In those tests NabaOS detected 94.2% of fabricated tool references, 87.6% of miscounted outputs, and 91.3% of false absence claims. For multi-step web tasks (deep delegation), the cross-checking protocol caught 78.4% of fabricated URLs. Verification added less than 15 milliseconds per response. The authors also report that responses labeled “Fully Verified” by their system were correct 98.7% of the time in their experiments.
Why this matters: prior proposals for verifiable AI inference use zero-knowledge (ZK) cryptographic proofs that show a computation ran as claimed. Those proofs give strong cryptographic guarantees but are slow (the paper reports examples needing about 180 seconds per query) and require specialized hardware. They also do not guarantee factual correctness: a model can correctly run a computation that nevertheless outputs a false or invented fact. NabaOS focuses on whether a claim is grounded in actual tool outputs. That makes it far faster and more practical for interactive agents, at the cost of different guarantees.