Author: High Dimensional Research

Why Agents Fail
AI-based agents currently struggle with reliability. As the article notes, "they are really bad at doing things. They can't reliably do even basic addition and multiplication."
The Core Problem: Probabilistic Compounding
The fundamental issue is basic probability. When an agent must succeed at every one of several sequential steps, the per-step failure rates compound, and the overall success rate falls exponentially with the number of steps.
The Formula:
P(n) = p^n
Where p is the success rate per step and n is the number of steps. The formula assumes each step succeeds or fails independently of the others.
Example: The probability of a fair coin landing heads five consecutive times is only 0.5^5 = 3.125%.
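The formula is easy to check numerically. The following is a minimal Python sketch, not code from the article; the function name chain_success is an illustrative choice, and the calculation assumes the steps are independent.

```python
def chain_success(p: float, n: int) -> float:
    """Probability that n independent steps all succeed, each with rate p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    return p ** n


# The coin-flip example (p = 0.5, n = 5) and a 90%-accurate agent
# on tasks of 3, 5, and 10 steps.
for p, n in [(0.5, 5), (0.9, 3), (0.9, 5), (0.9, 10)]:
    print(f"p={p}, n={n}: {chain_success(p, n):.3%}")
```

Varying n while holding p fixed shows how quickly a seemingly high per-step accuracy erodes over a longer task.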
Real-World Impact
An agent performing at 90% accuracy per step would achieve:
- 59.05% success on 5-step tasks
- 34.87% success on 10-step tasks
- 72.90% success on 3-click web tasks
The article emphasizes: "Software that only works 72.90%, 95%, or even 99% of the time is bad software."
Production systems are held to a far higher standard. The article points to "eleven nines" (99.999999999%), the durability figure that AWS S3 is designed for.
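Inverting P(n) = p^n makes the gap concrete: for a target overall success rate, one can solve for the per-step accuracy required, or for the longest task a given agent can handle. The sketch below is illustrative only; the function names and the 99% and 50% targets are assumptions chosen for the example, not figures from the article.

```python
import math


def required_step_rate(target: float, n: int) -> float:
    """Per-step success rate needed so that n independent steps reach `target` overall."""
    if not 0.0 < target <= 1.0 or n < 1:
        raise ValueError("need 0 < target <= 1 and n >= 1")
    # Solve p^n = target for p.
    return target ** (1.0 / n)


def max_steps(p: float, target: float) -> int:
    """Largest n such that p^n is still at least `target` (for 0 < p < 1)."""
    if not 0.0 < p < 1.0 or not 0.0 < target <= 1.0:
        raise ValueError("need 0 < p < 1 and 0 < target <= 1")
    # Solve p^n >= target for n; both logarithms are negative, so the ratio is positive.
    return math.floor(math.log(target) / math.log(p))


# Hypothetical targets: 99% overall success on a 10-step task,
# and a 90%-accurate agent that must succeed at least half the time.
print(f"per-step rate for 99% over 10 steps: {required_step_rate(0.99, 10):.4%}")
print(f"longest task at p=0.9 with >=50% success: {max_steps(0.9, 0.5)} steps")
```

Under these assumptions, a 10-step task needs roughly 99.9% accuracy per step just to reach 99% overall, and a 90%-accurate agent drops below even odds after 6 steps.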
When to (Not) Be Agentic
HDR (High Dimensional Research) proposes shrinking the probabilistic action space: AI models handle the reasoning, while the actions themselves are predetermined and verified. The hotel booking example illustrates this: booking websites differ in their details, but their structural flow is similar.
Solution components:
- Collective Memory Index (search over web trajectories)
- Accessibility tools for page structure understanding
- Model-assisted reasoning for state-specific decisions
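The effect of shrinking the probabilistic action space can be illustrated with the same compounding formula. The sketch below is not HDR's implementation. The step names are a hypothetical hotel booking flow, the model is given the same 90% per-step accuracy used earlier, and verified actions are idealized as always succeeding (a rate of 1.0), which real systems would only approximate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    success_rate: float  # 1.0 models a predetermined, verified action (an idealization)


def flow_success(steps: list[Step]) -> float:
    """Overall success rate of a flow whose steps succeed independently."""
    result = 1.0
    for step in steps:
        result *= step.success_rate
    return result


MODEL_RATE = 0.90  # assumed per-step accuracy when the model chooses the action

# Hypothetical booking flow; the names are illustrative only.
names = ["open search page", "enter dates", "choose room",
         "fill guest details", "confirm booking"]

# Fully agentic: the model decides every step.
fully_agentic = [Step(name, MODEL_RATE) for name in names]

# Hybrid: only the state-specific decision is left to the model;
# every other step is a predetermined, verified action.
hybrid = [Step(name, MODEL_RATE if name == "choose room" else 1.0)
          for name in names]

print(f"fully agentic: {flow_success(fully_agentic):.2%}")
print(f"hybrid:        {flow_success(hybrid):.2%}")
```

Under these assumptions the fully agentic flow succeeds about 59% of the time and the hybrid flow 90%, because only one step remains probabilistic.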
Resources mentioned:
- hdr.is/memory
- Nolita (web automation framework)
