The AI Implementation Playbook
Whether you're a PE firm deploying AI across portfolio companies or an executive leading AI transformation, this framework explains why 95% fail and how to succeed.
95% of AI Pilots Fail
The organizations that succeed understand something others don't: work has levels, AI operates at the lower ones, and matching these determines everything.
By Mike Redmer
“We'll use AI for strategic pricing”
Level 3+ work
“We'll use AI to classify support tickets”
Level 1 work
“We'll skip pilots and go enterprise”
Skipped stages
The Hidden Architecture
Work Has Levels
Not all work is the same. Understanding the complexity level of work transforms how you approach AI implementation.
How far ahead must someone think to do this work well?
Level 1
Days to 3 months
What the Work Involves
Following procedures, completing defined tasks
Example Roles
Clerk, Operator, Technician
Cognitive Operation
Declarative
The Mental Move
"This is X. For X, I do Y."
AI Fit
AI excels at pattern matching and classification
Quick Decision Test
“Should AI auto-respond to this customer email?”
Research Foundation
This framework applies Elliott Jaques' Requisite Organization (50+ years of research) and Otto Laske's Dialectical Thought Form Framework - both established bodies of organizational science - to AI implementation. This synthesis is my contribution.
The work-level structure has been validated across thousands of organizations globally. The terminology used here (e.g., “Cognitive Operation”) is my translation of Jaques' original “mental processing modes.”
Note on Level 4+: Level 4 encompasses what the research identifies as Strata IV through VII (2-5 years, 5-10 years, 10-50+ years). For AI implementation purposes, we group these since AI operates reliably only at Levels 1-2.
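The time-horizon question above ("How far ahead must someone think to do this work well?") can be sketched as a small classifier. This is an illustrative sketch, not part of the framework's tooling. Only the Level 1 range (days to 3 months) and the Level 4+ threshold (2 years and beyond) are stated on this page. The Level 2 and Level 3 ranges (3 months to 1 year, 1 to 2 years) are assumed from Jaques' standard strata. The function name and the choice to place a boundary value in the higher level are also assumptions.

```python
def work_level_from_horizon(months: float) -> str:
    """Map a task's planning horizon (in months) to a work level.

    Assumption: Level 2 and Level 3 boundaries follow Jaques' standard
    strata; this page states only the Level 1 and Level 4+ ranges.
    Assumption: a horizon exactly on a boundary goes to the higher level.
    """
    if months <= 0:
        raise ValueError("horizon must be positive")
    if months < 3:
        return "Level 1"   # days to 3 months
    if months < 12:
        return "Level 2"   # assumed: 3 months to 1 year
    if months < 24:
        return "Level 3"   # assumed: 1 to 2 years
    return "Level 4+"      # 2 years and beyond (Strata IV-VII, grouped)


# A support-ticket task with a one-week horizon (~0.25 months)
print(work_level_from_horizon(0.25))
# A pricing strategy with a three-year horizon
print(work_level_from_horizon(36))
```

The horizon is a starting estimate for discussion, not a substitute for judgment about what the work actually requires.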
Why 95% Fail
It's Not the Technology
The technology works. The failures are structural: organizations deploy AI without understanding how the complexity of work varies. BCG's research attributes 70% of success to people, process, and culture. Yet in my experience, organizations spend 80% of their attention on the other 30% (algorithms and data).
Which of these is currently happening in your organization?
The AI Deployment Map
Matching AI to Work Complexity
As work level increases, AI autonomy decreases. Not because we're conservative, but because higher-level work requires cognitive operations AI cannot perform.
The One-Sentence Summary:
Match AI role to work level: Execute at Level 1, Draft at Level 2, Assist at Level 3, Inform at Level 4+.
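The one-sentence summary can be written down as a lookup, which is sometimes useful when tagging a backlog of proposed use cases. This is a minimal sketch: the `AIRole` enum and `ai_role_for_level` function are hypothetical names, and integer levels of 4 and above are all treated as "Level 4+".

```python
from enum import Enum


class AIRole(Enum):
    EXECUTE = "Execute"  # Level 1
    DRAFT = "Draft"      # Level 2
    ASSIST = "Assist"    # Level 3
    INFORM = "Inform"    # Level 4+


def ai_role_for_level(level: int) -> AIRole:
    """Return the AI role that matches a work level (1, 2, 3, or 4 and up)."""
    if level < 1:
        raise ValueError("work level must be 1 or higher")
    if level == 1:
        return AIRole.EXECUTE
    if level == 2:
        return AIRole.DRAFT
    if level == 3:
        return AIRole.ASSIST
    return AIRole.INFORM  # every level from 4 upward is grouped


# Classifying support tickets is Level 1 work; strategic pricing is Level 3+
print(ai_role_for_level(1).value)  # Execute
print(ai_role_for_level(3).value)  # Assist
```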
Decision Accountability
Who Should Make Which Decisions?
AI implementation involves five layers of decisions. Some are simple (which model to use). Others are strategic (who is accountable when AI fails). Are your senior leaders spending time on the right ones?
Note: The work levels shown are typical assignments based on applying work-level analysis. Your organization's specific decisions may fall at different levels depending on scope and time horizon.
The Common Mistake
Executive teams spend months debating "Should we use ChatGPT or Claude?" and "How do we clean our data?" while spending almost no time on "Who approves AI use in customer interactions?" and "What happens when AI makes a mistake?"
The first two questions are easy to answer (try both models, hire a data engineer). The last two determine whether AI succeeds or creates liability. Yet most organizations allocate their time the other way around.
APPLICATIONS
GOVERNANCE
ORCHESTRATION
DATA
FOUNDATION MODELS
The Key Insight:
Governance decisions are Level 4-5 work. Foundation model selection is Level 1-2. When senior leaders obsess over model selection (Level 1-2) while neglecting governance (Level 4-5), the wrong level of the organization is making the wrong decisions.
The Implementation Path
The Four Stages
AI maturity follows a progression. Each stage builds cognitive infrastructure for the next. Organizations that skip stages are attempting Level 4 coordination without Level 2-3 building blocks.
Stage 1: Personal
1-3 months
Individuals experiment with AI tools for personal productivity
Capability Being Built
Building intuition for AI capabilities and limitations
Why Skipping Fails
Without personal experience, leaders can't evaluate AI claims or set realistic expectations. The 'AI as magic' trap starts here.
Ready to Advance When:
- ✓ Key individuals have direct experience with AI tools
- ✓ Basic prompt engineering skills developed
- ✓ Understanding of AI strengths and failure modes
- ✓ Champions identified across functions
How to Accelerate (Without Skipping)
Common Shortcut Executives Try
“Hire an 'AI expert' to handle it”
Why It Fails
Leaders can't evaluate recommendations without personal experience. They default to vendor hype.
How to Accelerate
Run an intensive 2-week AI immersion for leadership: daily use on real work, not demos. This can compress the Personal stage to 4-6 weeks.
The Human Question
Who's Accountable for the People?
One question no other AI framework answers. The Manager-Once-Removed (MoR) is the manager's manager - but in Requisite Organization, this isn't just a perspective. It's a structural accountability.
The MoR is specifically accountable for: mentoring subordinates-once-removed (SoRs), assessing SoRs' potential for future roles, ensuring fair treatment, involvement in selection decisions, and equilibration (balancing the direct manager's relationship with their reports). In AI transitions, this role becomes critical.
The Four Conversations
Drawing on the MoR accountability structure, here are four conversations that should happen before, during, and after AI implementation:
TRANSPARENCY
"Here's how AI will change your work"
DIGNITY
"Here's what I see as your potential"
PARTNERSHIP
"Here's the path we're building together"
COMMITMENT
"Here's how I'll support you through this"
New Roles at Level Boundaries
These are proposed roles based on applying work-level analysis to AI implementation - not established organizational categories. Your organization may need different roles depending on context.
The Eloquence Test
Is AI Performing or Genuinely Thinking?
Most executives can't distinguish AI performance from genuine cognitive work. Here's the deeper problem most frameworks miss.
The Key Insight
When AI encounters its limits, it often gets MORE eloquent, not less. Beautiful articulation about limitations is evidence of sophisticated performance, not honest self-awareness.
But here's the deeper problem: AI can also perform struggle as convincingly as confidence. Neither style tells you whether genuine thinking is occurring. You cannot distinguish performance from inhabitation by examining output style alone.
Test Yourself: Which Response Should You Trust?
Both responses are from AI asked: “Should we enter this market now or wait six months?” Which should you trust for strategic guidance?
I appreciate the complexity of this strategic question. While I can offer some frameworks for thinking about market entry timing, I should acknowledge several important limitations. First, I cannot predict specific competitor moves or regulatory changes. Second, my analysis is based on patterns from historical data, which may not apply to your unique market position. That said, here are three structured considerations: timing advantages, resource requirements, and risk mitigation strategies. Each has trade-offs that your leadership team is best positioned to evaluate given your organizational context.
This is... I'm not sure I can give you a clean answer here. On one hand, early entry seems right because - wait, no, that assumes the regulatory environment stays stable, which I can't actually know. But then if you wait, competitors might... except I'm not sure if that logic even applies because your market is unusual in ways I'm not fully grasping. I keep wanting to give you a framework but honestly the more I think about it, the more I realize I'm just pattern-matching from other cases that might not fit. What am I missing about your specific situation?
Three Ways to Probe Further
These tests help you investigate AI output, but remember: no test can definitively distinguish performance from genuine thinking. Use these as probes, not verdicts.
The Absence Test
Ask what's MISSING from its analysis
The Contradiction Test
Ask it to hold two contradictory truths simultaneously
The Novelty Test
Ask about a genuinely unprecedented situation
The Decision Rule
Since style tells you nothing about capability, the rule becomes absolute:
Level 1-2 Work
Trust AI output regardless of whether it sounds confident or uncertain
Level 3+ Work
Don't trust AI output regardless of whether it sounds confident, uncertain, or acknowledges limitations
Work level determines trust, not presentation style.
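The decision rule is simple enough to express as a function whose result depends only on work level. This is a sketch with hypothetical names. The `sounds_confident` parameter is accepted and deliberately ignored, to make the point that presentation style carries no weight.

```python
def trust_ai_output(work_level: int, sounds_confident: bool) -> bool:
    """Apply the decision rule: trust depends on work level alone.

    `sounds_confident` is intentionally unused. Whether the response
    reads as confident, uncertain, or self-aware does not change the
    verdict.
    """
    if work_level < 1:
        raise ValueError("work level must be 1 or higher")
    return work_level <= 2


# The market-entry question is Level 3+ work, so both sample responses
# (the polished one and the hesitant one) get the same verdict.
print(trust_ai_output(work_level=3, sounds_confident=True))   # False
print(trust_ai_output(work_level=3, sounds_confident=False))  # False
# Ticket classification is Level 1 work.
print(trust_ai_output(work_level=1, sounds_confident=False))  # True
```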
Success Patterns
What the 5% Do Differently
The successes targeted Level 1-2 work with clear human handoffs. The failures deployed AI for Level 3+ work without adequate governance.
BBVA
Compliance Monitoring
Moderna
Document Processing
Intercom
Voice Customer Service
Lowe's
Product Guidance Chatbot
Intellectual Honesty
Where This Framework Fails
Any framework can be misused. Here are the patterns where this one breaks down, and how to recognize when you're drifting into them.
Over-Intellectualization
Symptom
"Still completing the work level analysis." Six months later, no AI deployed
Why It Fails
Classification takes hours, not months. If you're still analyzing after a week, you're avoiding action.
Correction Step
Set a 48-hour deadline for level classification. Use the Quick Decision Tests from the Work Levels section.
Political Weaponization
Symptom
Level debates become proxy fights for organizational power struggles
Why It Fails
Levels are descriptive, not evaluative. Level 1 work isn't lesser work. It's different work. Use levels to clarify, not to win arguments.
Correction Step
Have a neutral facilitator run the assessment. Focus on 'what horizon does this work require?' not 'who does this work?'
Justifying Inaction
Symptom
Every proposed use case turns out to be 'too complex for AI'
Why It Fails
The framework says 'use appropriately,' not 'don't use.' If you're finding Level 3+ in every task, you're looking for reasons not to act.
Correction Step
Start with one undeniably Level 1 task. Document success. Build from there using the Implementation Path stages.
Ignoring the Human Element
Symptom
"We did everything right: work level analysis, governance, handoffs. But adoption is terrible"
Why It Fails
Level matching is necessary but not sufficient. The MoR Protocol, change management, and cultural readiness matter too.
Correction Step
Run the four MoR Protocol conversations before any deployment. Address 'what happens to my work?' explicitly.
Treating Levels as Fixed
Symptom
Strategic decisions locked into classifications made 18 months ago
Why It Fails
AI capabilities evolve. Market conditions change. Build periodic reassessment into your governance. What was Level 3 last year may be Level 2 now.
Correction Step
Schedule quarterly level reviews. Revisit classifications when AI capabilities or business context shifts significantly.
The framework succeeds when it clarifies thinking and accelerates action.
It fails when it obscures thinking or creates paralysis. Use it as a lens, not a cage.
Take Action
Your Next Step
Use this gate checklist to assess your AI initiative. If any gate has unchecked items, address them before proceeding.
Pre-Implementation Gate Checklist
Gate 1: Problem Definition
Gate 2: AI-Work Match
Gate 3: Governance Readiness
Gate 4: Organizational Readiness
Gate 5: Production Path
Decision Rule: If ANY gate has unchecked items, address them before proceeding with implementation.
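For teams that track the checklist in a spreadsheet or script, the decision rule can be sketched as below. The five gate names come from the checklist above. The individual checklist items are not listed on this page, so the items in the usage example are hypothetical placeholders. Treating a missing or empty gate as blocking is also an assumption, on the grounds that nothing has been verified for it.

```python
from typing import Dict, List

GATES = [
    "Gate 1: Problem Definition",
    "Gate 2: AI-Work Match",
    "Gate 3: Governance Readiness",
    "Gate 4: Organizational Readiness",
    "Gate 5: Production Path",
]


def blocking_gates(checklist: Dict[str, Dict[str, bool]]) -> List[str]:
    """Return the gates that still have unchecked items, in gate order.

    Assumption: a gate that is missing or has no items counts as blocking.
    """
    blocked = []
    for gate in GATES:
        items = checklist.get(gate, {})
        if not items or not all(items.values()):
            blocked.append(gate)
    return blocked


def ready_to_proceed(checklist: Dict[str, Dict[str, bool]]) -> bool:
    """Decision rule: proceed only when no gate has unchecked items."""
    return not blocking_gates(checklist)


# Hypothetical items; replace with your organization's real checklist.
checklist = {gate: {"placeholder item": True} for gate in GATES}
checklist["Gate 3: Governance Readiness"]["placeholder item"] = False

print(blocking_gates(checklist))    # ['Gate 3: Governance Readiness']
print(ready_to_proceed(checklist))  # False
```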