https://www.instapaper.com/read/1992661206
We track how models handle facts so you can deploy them with confidence. Our March 2026 update evaluates the latest LLMs against the FACTS benchmark to measure accuracy. We found that top-tier models now achieve a hallucination rate of just 0