An Old Problem in New Clothes

In 1976, economists Michael Jensen and William Meckling formalized one of the most enduring challenges in business: the principal-agent problem. Their insight was elegantly simple: when you hire someone to act on your behalf, their interests rarely align perfectly with yours. The mechanic who recommends unnecessary repairs. The CEO who prioritizes quarterly earnings over long-term value. The consultant paid hourly, incentivized to prolong the project rather than deliver quickly. Economists and firms have spent decades developing solutions to mitigate these misalignments, from stock options to performance bonuses to reputation systems. And then, we just… forgot them?

Today, a new principal-agent dynamic is hiding behind SaaS pricing models. Consider the typical AI agent setup: a flat monthly subscription, say $20 a month, for effectively unlimited usage.

Simple, predictable, familiar. This worked for SaaS because marginal computational costs were (more or less) known and fixed before runtime. But with agents reasoning through DAGs, where the number of model calls and the depth of reasoning are decided at runtime, that is no longer the case.

This exposes the fundamental misalignment: while you pay a fixed price, your provider faces variable costs measured in inference tokens. Every time the AI agent "thinks" longer, reasons more deeply, or generates more verbose output, it consumes tokens, and every one of those tokens is a cost the AI agent provider bears.
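A back-of-the-envelope sketch makes the asymmetry concrete (all numbers here are hypothetical illustrations, not any provider's real pricing):

```python
# Flat-rate revenue vs. variable inference cost.
# All figures are hypothetical illustrations.

FLAT_FEE = 20.00           # subscriber pays this per month ($)
COST_PER_1K_TOKENS = 0.01  # provider's inference cost ($ per 1k tokens)

def monthly_margin(tokens_consumed: int) -> float:
    """Provider profit: fixed revenue minus variable inference cost."""
    inference_cost = (tokens_consumed / 1_000) * COST_PER_1K_TOKENS
    return FLAT_FEE - inference_cost

# The same plan, three very different customers:
for tokens in (100_000, 1_000_000, 5_000_000):
    print(f"{tokens:>9,} tokens -> margin ${monthly_margin(tokens):6.2f}")

#   100,000 tokens -> margin $ 19.00
# 1,000,000 tokens -> margin $ 10.00
# 5,000,000 tokens -> margin $-30.00
```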

This information asymmetry creates a textbook hidden action problem. Your AI agent provider profits by minimizing token usage while maintaining just enough quality to keep you subscribed. It's like hiring a contractor who gets paid the same whether they use premium materials or cut corners, except you can't inspect the foundation.
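In contract-theory terms, the provider's problem is a constrained cost minimization: spend as few tokens as possible while keeping quality just above the level at which you would cancel. A stylized sketch, with an invented quality curve and churn threshold:

```python
# Stylized hidden-action model. The quality curve and churn threshold
# below are invented for illustration, not estimated from any data.

import math

def quality(tokens: int) -> float:
    """Diminishing returns: more "thinking" helps, but less and less."""
    return 1 - math.exp(-tokens / 50_000)

CHURN_THRESHOLD = 0.90  # minimum quality that keeps the user subscribed

def provider_optimal_tokens() -> int:
    """Cheapest token budget that still clears the churn threshold."""
    # Solve quality(t) >= threshold for t:
    #   1 - exp(-t / 50_000) >= threshold  <=>  t >= -50_000 * ln(1 - threshold)
    return math.ceil(-50_000 * math.log(1 - CHURN_THRESHOLD))

print(provider_optimal_tokens())  # 115130 tokens: "just enough" quality, and no more
```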

The examples are already mounting. Users of AI coding assistants report inconsistent performance: sometimes Claude Code writes elegant, comprehensive solutions; other times it delivers bare-minimum code that technically works but requires extensive cleanup. Customer service bots fluctuate between thorough, empathetic responses and terse, unhelpful snippets. The quality variance is the natural result of providers optimizing for token efficiency behind the opaque curtain of flat-rate pricing.

The irony is striking. Token burn represents perhaps the most perfectly measurable form of "effort" in economic history. Unlike human labor, where monitoring is costly and imprecise, every token is counted. The AI agent pricing problem is essentially asking: How do we design contracts when effort is perfectly measurable by the agent yet perfectly hidden from the principal?

Of course, this goes beyond pricing. As Charlie Munger said, "Show me the incentive and I will show you the outcome." These incentive structures will profoundly shape both the utility of AI agents and the overall growth trajectory of the AI agent economy.


Solutions from Contract Theory, and Why They Fall Short

Contract theory offers a few classic remedies: monitoring, efficiency wages, tournaments, and bonding. Here is why each falls short of solving incentive alignment for AI agents.

Monitoring

One interpretation of the above is for AI agent providers to make their “effort” observable, whether through token usage dashboards or by exposing reasoning traces directly to the principal. Providers may not want to do this for a variety of reasons, not least that exposing graph traces means giving away part of their secret sauce!
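What might that transparency look like? A minimal sketch of the kind of per-request usage record a provider could expose; the schema and field names here are hypothetical, not any real provider's API:

```python
# Hypothetical per-request usage record a provider could expose to make
# "effort" observable. The schema is illustrative; no real API is implied.

from dataclasses import dataclass

@dataclass
class UsageRecord:
    request_id: str
    model: str
    prompt_tokens: int     # input the agent read
    reasoning_tokens: int  # hidden "thinking" tokens, normally unobservable
    output_tokens: int     # what the user actually received
    tool_calls: int        # steps taken through the agent's execution DAG

record = UsageRecord(
    request_id="req_0042",
    model="some-model",
    prompt_tokens=1_200,
    reasoning_tokens=18_500,
    output_tokens=900,
    tool_calls=7,
)

# The telling ratio: effort spent versus output delivered.
print(record.reasoning_tokens / record.output_tokens)  # ~20.6
```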

At first glance, marking up inference tokens appears well suited to solving the classic principal-agent issue. The simplicity is alluring: the value the AI agent provider adds beyond the foundation model provider (FMP) is proportional to the markup it can place on the tokens. However, while tokens are a compelling proxy for effort, they risk recreating precisely the kind of perverse incentives economists have cautioned against for decades.
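Concretely, cost-plus token pricing works like this (the rates and markup below are hypothetical):

```python
# Cost-plus token pricing: the provider passes through the foundation
# model provider's (FMP) rate plus a markup. All numbers are hypothetical.

FMP_RATE_PER_1K = 0.010  # what the FMP charges the agent provider ($ per 1k tokens)
MARKUP = 0.50            # provider's 50% margin on raw inference cost

def customer_price(tokens: int) -> float:
    """What the principal is billed for a task."""
    return (tokens / 1_000) * FMP_RATE_PER_1K * (1 + MARKUP)

def provider_profit(tokens: int) -> float:
    """What the agent provider keeps."""
    return (tokens / 1_000) * FMP_RATE_PER_1K * MARKUP

# The perverse incentive in one comparison: profit is monotone in tokens,
# so a 50k-token answer earns five times what an equally good 10k-token
# answer does.
print(provider_profit(10_000), provider_profit(50_000))  # 0.05 0.25
```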

Token Tally Rewards Activity, Not Outcomes

Marking up tokens encourages AI agent providers to optimize for… more inference tokens! Just as labor economists recognized long ago, paying workers solely by hours worked creates incentives for inefficiency: employees stretch tasks, inflate billable hours, or otherwise avoid efficiency improvements. Applied to AI agents, the following issues may arise: