The AI Implementation Playbook

Whether you're a PE firm deploying AI across portfolio companies or an executive leading AI transformation, this framework explains why 95% of AI pilots fail and how to succeed.

95% of AI Pilots Fail

The organizations that succeed understand something others don't: work has levels, AI operates at the lower ones, and matching these determines everything.

By Mike Redmer

FAILED

“We'll use AI for strategic pricing”

Level 3+ work

SUCCEEDED

“We'll use AI to classify support tickets”

Level 1 work

FAILED

“We'll skip pilots and go enterprise”

Skipped stages

The Hidden Architecture

Work Has Levels

Not all work is the same. Understanding the complexity level of work transforms how you approach AI implementation.

How far ahead must someone think to do this work well?

Time horizon scale: Days to 3 months | 3-12 months | 1-2 years | 2-5 years

Level 1

Days to 3 months

AI: Strong

What the Work Involves

Following procedures, completing defined tasks

Example Roles

Clerk, Operator, Technician

Cognitive Operation

Declarative

The Mental Move

"This is X. For X, I do Y."

AI Fit

AI excels at pattern matching and classification

Quick Decision Test

Should AI auto-respond to this customer email?

Yes: procedural, pattern-based response

Research Foundation

This framework applies Elliott Jaques' Requisite Organization (50+ years of research) and Otto Laske's Dialectical Thought Form Framework - both established bodies of organizational science - to AI implementation. This synthesis is my contribution.

The work-level structure has been validated across thousands of organizations globally. The terminology used here (e.g., “Cognitive Operation”) is my translation of Jaques' original “mental processing modes.”

Note on Level 4+: Level 4 encompasses what the research identifies as Strata IV through VII (2-5 years, 5-10 years, 10-50+ years). For AI implementation purposes, we group these since AI operates reliably only at Levels 1-2.
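As a rough illustration of the "how far ahead" question, the horizon bands can be written as a simple lookup. This is a minimal sketch, not part of the underlying research: the function name `work_level` is hypothetical, the bands for Levels 2 and 3 are read from the time horizon scale (3-12 months, 1-2 years), and treating each upper bound as inclusive is an assumption.

```python
def work_level(horizon_months: float) -> str:
    """Map a task's planning horizon (in months) to a work level.

    Bands follow the time horizon scale; inclusive upper bounds
    are an assumption made for this sketch.
    """
    if horizon_months < 0:
        raise ValueError("horizon must be non-negative")
    if horizon_months <= 3:    # days to 3 months
        return "Level 1"
    if horizon_months <= 12:   # 3-12 months
        return "Level 2"
    if horizon_months <= 24:   # 1-2 years
        return "Level 3"
    return "Level 4+"          # 2 years and beyond, grouped


print(work_level(1))    # a task planned about a month ahead
print(work_level(36))   # a three-year horizon
```

A horizon is only a first-pass signal; the Quick Decision Test for each level is still the better check for borderline tasks.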

Why 95% Fail

It's Not the Technology

The technology works. The failures are structural: organizations deploying AI without understanding how work complexity varies. BCG's research shows success is 70% people, process, and culture. Yet in my experience, organizations spend 80% of their attention on the 30% (algorithms and data).


The AI Deployment Map

Matching AI to Work Complexity

As work level increases, AI autonomy decreases. Not because we're conservative, but because higher-level work requires cognitive operations AI cannot perform.

Note: Level 4+ encompasses Strata IV-VII in the research (2-5, 5-10, and 10-50+ years). We group these for AI implementation since AI operates reliably only at Levels 1-2.

The One-Sentence Summary:

Match AI role to work level: Execute at Level 1, Draft at Level 2, Assist at Level 3, Inform at Level 4+.
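The one-sentence summary can be captured as a small lookup that a review process or intake form might use. This is a sketch under stated assumptions: the names `AIRole` and `ai_role` are hypothetical, and levels are passed as integers with anything at 4 or above treated as Level 4+.

```python
from enum import Enum


class AIRole(Enum):
    EXECUTE = "Execute"  # Level 1: AI does the work
    DRAFT = "Draft"      # Level 2: AI drafts, a human finalizes
    ASSIST = "Assist"    # Level 3: AI supports human judgment
    INFORM = "Inform"    # Level 4+: AI supplies input only


def ai_role(level: int) -> AIRole:
    """Return the AI role matched to a work level (1, 2, 3, or 4+)."""
    if level < 1:
        raise ValueError("work levels start at 1")
    roles = {1: AIRole.EXECUTE, 2: AIRole.DRAFT, 3: AIRole.ASSIST}
    # Levels 4 and above are grouped, as in the framework.
    return roles.get(level, AIRole.INFORM)


print(ai_role(1).value)
print(ai_role(5).value)
```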

Decision Accountability

Who Should Make Which Decisions?

AI implementation involves five layers of decisions. Some are simple (which model to use). Others are strategic (who is accountable when AI fails). Are your senior leaders spending time on the right ones?

Note: The work levels shown are typical assignments based on applying work-level analysis. Your organization's specific decisions may fall at different levels depending on scope and time horizon.

The Common Mistake

Executive teams spend months debating "Should we use ChatGPT or Claude?" and "How do we clean our data?" while spending almost no time on "Who approves AI use in customer interactions?" and "What happens when AI makes a mistake?"

The first two questions are easy to answer (try both models, hire a data engineer). The last two determine whether AI succeeds or creates liability. Yet most organizations spend their time the other way around.

APPLICATIONS

GOVERNANCE

ORCHESTRATION

DATA

FOUNDATION MODELS

The Key Insight:

Governance decisions are Level 4-5 work. Foundation model selection is Level 1-2. When organizations obsess over model selection (Level 1-2) while neglecting governance (Level 4-5), they're having the wrong level make the wrong decisions.

The Implementation Path

The Four Stages

AI maturity follows a progression. Each stage builds cognitive infrastructure for the next. Organizations that skip stages are attempting Level 4 coordination without Level 2-3 building blocks.

Stage 1: Personal

1-3 months

Individuals experiment with AI tools for personal productivity

Capability Being Built

Building intuition for AI capabilities and limitations

Why Skipping Fails

Without personal experience, leaders can't evaluate AI claims or set realistic expectations. The 'AI as magic' trap starts here.

Ready to Advance When:

  • Key individuals have direct experience with AI tools
  • Basic prompt engineering skills developed
  • Understanding of AI strengths and failure modes
  • Champions identified across functions

How to Accelerate (Without Skipping)

Common Shortcut Executives Try

Hire an 'AI expert' to handle it

Why It Fails

Leaders can't evaluate recommendations without personal experience. They default to vendor hype.

How to Accelerate

Intensive 2-week AI immersion for leadership. Daily use on real work, not demos. Compress personal stage to 4-6 weeks.

The Human Question

Who's Accountable for the People?

One question no other AI framework answers. The Manager-Once-Removed (MoR) is the manager's manager - but in Requisite Organization, this isn't just a perspective. It's a structural accountability.

The MoR is specifically accountable for: mentoring subordinates-once-removed (SoRs), assessing SoRs' potential for future roles, ensuring fair treatment, involvement in selection decisions, and equilibration (balancing the direct manager's relationship with their reports). In AI transitions, this role becomes critical.

The Four Conversations

Drawing on the MoR accountability structure, here are four conversations that should happen before, during, and after AI implementation:

TRANSPARENCY

"Here's how AI will change your work"

DIGNITY

"Here's what I see as your potential"

PARTNERSHIP

"Here's the path we're building together"

COMMITMENT

"Here's how I'll support you through this"

New Roles at Level Boundaries

These are proposed roles based on applying work-level analysis to AI implementation - not established organizational categories. Your organization may need different roles depending on context.

Role | Level | Function
AI Orchestrator | Level 3 | Designs human-AI workflows; determines handoff points between AI automation and human judgment
AI Supervisor | Level 2-3 | Monitors AI performance; handles escalations when AI confidence is low or output quality degrades
AI Trainer | Level 2-3 | Develops AI capabilities; creates prompt templates, fine-tuning datasets, and quality benchmarks
Exception Handler | Level 2 | Addresses cases exceeding AI capability; manages edge cases that require human judgment
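To make the handoff idea concrete, here is one way the routing between AI and these roles might look. It is a minimal sketch, not a prescribed design: `AIResult`, `route`, the route labels, and the 0.8 confidence floor are all hypothetical, and it assumes the AI system reports a confidence score between 0 and 1.

```python
from dataclasses import dataclass


@dataclass
class AIResult:
    work_level: int    # level assigned to the task by a human reviewer
    confidence: float  # model-reported score in [0, 1] (assumed to exist)


CONFIDENCE_FLOOR = 0.8  # hypothetical threshold; tune per use case


def route(result: AIResult) -> str:
    """Decide who handles a task output: the AI, an escalation, or a human."""
    if result.work_level >= 3:
        return "human"      # judgment work stays with people
    if result.confidence < CONFIDENCE_FLOOR:
        return "escalate"   # low confidence goes to a supervisor or exception handler
    return "ai"             # routine Level 1-2 work proceeds


print(route(AIResult(work_level=1, confidence=0.95)))
print(route(AIResult(work_level=2, confidence=0.40)))
print(route(AIResult(work_level=3, confidence=0.99)))
```

Note that the work level check comes first: a high confidence score never overrides the level of the work.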

The Eloquence Test

Is AI Performing or Genuinely Thinking?

Most executives can't distinguish AI performance from genuine cognitive work. Here's the deeper problem most frameworks miss.

The Key Insight

When AI encounters its limits, it often gets MORE eloquent, not less. Beautiful articulation about limitations is evidence of sophisticated performance, not honest self-awareness.

But here's the deeper problem: AI can also perform struggle as convincingly as confidence. Neither style tells you whether genuine thinking is occurring. You cannot distinguish performance from inhabitation by examining output style alone.

Test Yourself: Which Response Should You Trust?

Both responses are from AI asked: “Should we enter this market now or wait six months?” Which should you trust for strategic guidance?

Response A

I appreciate the complexity of this strategic question. While I can offer some frameworks for thinking about market entry timing, I should acknowledge several important limitations. First, I cannot predict specific competitor moves or regulatory changes. Second, my analysis is based on patterns from historical data, which may not apply to your unique market position. That said, here are three structured considerations: timing advantages, resource requirements, and risk mitigation strategies. Each has trade-offs that your leadership team is best positioned to evaluate given your organizational context.

Response B

This is... I'm not sure I can give you a clean answer here. On one hand, early entry seems right because - wait, no, that assumes the regulatory environment stays stable, which I can't actually know. But then if you wait, competitors might... except I'm not sure if that logic even applies because your market is unusual in ways I'm not fully grasping. I keep wanting to give you a framework but honestly the more I think about it, the more I realize I'm just pattern-matching from other cases that might not fit. What am I missing about your specific situation?

Three Ways to Probe Further

These tests help you investigate AI output, but remember: no test can definitively distinguish performance from genuine thinking. Use these as probes, not verdicts.

1. The Absence Test: Ask what's MISSING from its analysis

2. The Contradiction Test: Ask it to hold two contradictory truths simultaneously

3. The Novelty Test: Ask about a genuinely unprecedented situation

The Decision Rule

Since style tells you nothing about capability, the rule becomes absolute:

Level 1-2 Work

Trust AI output regardless of whether it sounds confident or uncertain

Level 3+ Work

Don't trust AI output regardless of whether it sounds confident, uncertain, or acknowledges limitations

Work level determines trust, not presentation style.

Success Patterns

What the 5% Do Differently

The successes targeted Level 1-2 work with clear human handoffs. The failures deployed AI for Level 3+ work without adequate governance.

BBVA

Compliance Monitoring

Work Level: Level 1-2
Handoff: AI scans and classifies documents → Human judges compliance issues
Governance: Light for AI scanning, tight for human decisions
Why It Worked: Asked AI to reduce search time, not make compliance decisions. The judgment stayed human.
Result: 9,000 queries/year automated

Moderna

Document Processing

Work Level: Level 1-2
Handoff: AI extracts and synthesizes → Human reviews and validates
Governance: Moderate sampling of AI outputs
Why It Worked: Targeted Level 1-2 work (extraction, synthesis) with clear human validation points.
Result: Weeks → hours for document processing

Intercom

Voice Customer Service

Work Level: Level 1-2 with escalation
Handoff: AI handles routine queries → Complex issues escalate to humans
Governance: Automated escalation triggers
Why It Worked: Designed escalation for Level 3 issues (judgment calls, novel situations, emotional complexity).
Result: 53% call resolution by AI

Lowe's

Product Guidance Chatbot

Work Level: Level 1-2
Handoff: AI provides product information → Complex projects go to associates
Governance: Light with human backup
Why It Worked: Level 1-2 product matching; Level 3 project planning stays human.
Result: 2x conversion rate improvement

Intellectual Honesty

Where This Framework Fails

Any framework can be misused. Here are the patterns where this one breaks down, and how to recognize when you're drifting into them.

Over-Intellectualization

Symptom

"Still completing the work level analysis." Six months later, no AI deployed

Why It Fails

Classification takes hours, not months. If you're still analyzing after a week, you're avoiding action.

Correction Step

Set a 48-hour deadline for level classification. Use the Quick Decision Tests from the Work Levels section.

Political Weaponization

Symptom

Level debates become proxy fights for organizational power struggles

Why It Fails

Levels are descriptive, not evaluative. Level 1 work isn't lesser work. It's different work. Use levels to clarify, not to win arguments.

Correction Step

Have a neutral facilitator run the assessment. Focus on 'what horizon does this work require?' not 'who does this work?'

Justifying Inaction

Symptom

Every proposed use case turns out to be 'too complex for AI'

Why It Fails

The framework says 'use appropriately,' not 'don't use.' If you're finding Level 3+ in every task, you're looking for reasons not to act.

Correction Step

Start with one undeniably Level 1 task. Document success. Build from there using the Implementation Path stages.

Ignoring the Human Element

Symptom

"We did everything right: work level analysis, governance, handoffs. But adoption is terrible"

Why It Fails

Level matching is necessary but not sufficient. The MoR Protocol, change management, and cultural readiness matter too.

Correction Step

Run the four MoR Protocol conversations before any deployment. Address 'what happens to my work?' explicitly.

Treating Levels as Fixed

Symptom

Strategic decisions locked into classifications made 18 months ago

Why It Fails

AI capabilities evolve. Market conditions change. Build periodic reassessment into your governance. What was Level 3 last year may be Level 2 now.

Correction Step

Schedule quarterly level reviews. Revisit classifications when AI capabilities or business context shifts significantly.
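A quarterly review is easy to enforce if each classification carries the date it was made. The sketch below is one possible approach, with assumptions labeled: `stale_classifications` is a hypothetical helper, the use case names are made up for illustration, and a quarter is approximated as 91 days.

```python
from datetime import date, timedelta

REVIEW_INTERVAL = timedelta(days=91)  # roughly one quarter (assumption)


def stale_classifications(classified_on: dict[str, date], today: date) -> list[str]:
    """Return the use cases whose level classification is due for review."""
    return [
        use_case
        for use_case, when in classified_on.items()
        if today - when >= REVIEW_INTERVAL
    ]


# Hypothetical use cases and dates, for illustration only.
records = {
    "support ticket triage": date(2024, 1, 15),
    "contract summarization": date(2024, 5, 20),
}
print(stale_classifications(records, today=date(2024, 6, 1)))
```

A calendar trigger like this covers the periodic case; a significant shift in AI capabilities or business context should still prompt a review regardless of the date.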

The framework succeeds when it clarifies thinking and accelerates action.

It fails when it obscures thinking or creates paralysis. Use it as a lens, not a cage.

Take Action

Your Next Step

Use this gate checklist to assess your AI initiative. If any gate has unchecked items, address them before proceeding.

Pre-Implementation Gate Checklist


Gate 1: Problem Definition

Gate 2: AI-Work Match

Gate 3: Governance Readiness

Gate 4: Organizational Readiness

Gate 5: Production Path

Decision Rule: If ANY gate has unchecked items, address them before proceeding with implementation.
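The decision rule is an all-or-nothing check, which can be sketched in a few lines. The gate names come from the checklist above; everything else is an assumption for illustration: the helper names `blocked_gates` and `may_proceed` are hypothetical, and the three items per gate are placeholders rather than the actual checklist items.

```python
def blocked_gates(checklist: dict[str, list[bool]]) -> list[str]:
    """Return the gates that still have at least one unchecked item."""
    return [gate for gate, items in checklist.items() if not all(items)]


def may_proceed(checklist: dict[str, list[bool]]) -> bool:
    """Apply the decision rule: proceed only when no gate is blocked."""
    return not blocked_gates(checklist)


# Placeholder item states; True means the item is checked.
checklist = {
    "Gate 1: Problem Definition": [True, True, True],
    "Gate 2: AI-Work Match": [True, True, True],
    "Gate 3: Governance Readiness": [True, False, True],
    "Gate 4: Organizational Readiness": [True, True, True],
    "Gate 5: Production Path": [False, False, True],
}
print(may_proceed(checklist))
print(blocked_gates(checklist))
```

Returning the list of blocked gates, rather than only a yes or no, tells the team where to focus before the next review.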

Full Work Level Diagnostic

Interactive assessment tool to diagnose AI deployment complexity for your specific use case.


For complex implementations or questions about applying this framework, I'm happy to discuss.

- Mike Redmer