Research: chatbots interpret 'likely' as 80% probability while humans hear 65%

A study published in npj Complexity found significant gaps between how LLMs and humans interpret probability words. When an AI model says "likely," it means a probability of about 80%, but human readers take it to mean closer to 65%. The models were also sensitive to gendered language, becoming more rigid when prompts changed from "he" to "she." The researchers warn that these misalignments could lead to consequential misunderstandings in high-stakes domains such as medicine or finance, where AI-generated probability language influences human decisions.

View full digest for February 25, 2026