AI features are designed around three cost-cutting habits. The studio refuses to ship an AI feature that's economically broken — a feature that costs more per user than it earns is a liability, not an asset.
Habit 1 — Cache deterministic outputs
The same prompt with the same input produces the same output. The studio caches every deterministic call — workout summaries, programme rewrites, content classifications — in Firestore keyed by an input hash. Hit rate on the largest cache is roughly 60%, which means 60% of "AI calls" never touch a model.
Habit 2 — Route to the cheapest model that meets the bar
A short classification call goes to Gemini Flash or Claude Haiku at ~$0.25 per million tokens. A long-form coaching response goes to Sonnet or GPT-4-class only when the quality difference shows in evaluation. The studio runs a quality eval on every model swap before promoting it to production.
Habit 3 — Prompt caching for stable system prompts
The studio uses prompt caching on the Anthropic API for any prompt with a stable system message, which drops the cost of repeated calls by up to 90%. The same approach works on OpenAI with their cached-input pricing.
