Large Language Model Reasoning Failures


Large Language Models (LLMs) have exhibited remarkable reasoning capabilities, achieving impressive results across a wide range of tasks. Despite these advances, significant reasoning failures persist, even in seemingly simple scenarios. To understand and address these shortcomings systematically, the paper's authors present what they describe as the first comprehensive survey dedicated to reasoning failures in LLMs.

The authors introduce a novel categorization framework that divides reasoning into embodied and non-embodied types, with the latter further subdivided into informal (intuitive) and formal (logical) reasoning. Along a complementary axis, they classify reasoning failures into three types: fundamental failures intrinsic to LLM architectures that broadly affect downstream tasks; application-specific limitations that manifest in particular domains; and robustness issues characterized by inconsistent performance across minor variations. For each reasoning failure, they provide a clear definition, analyze existing studies, explore root causes, and present mitigation strategies.

Source: arXiv, Cornell University



