Human-AI Interaction & Decision Quality
How effectively are humans and AI actually working together in your context? Benchmark your decision acceptance rates, automation bias exposure, and collaboration quality against global research from McKinsey, BCG, Stanford HAI, and MIT — adjusted for your industry, role, and decision type.
Decision Acceptance Funnel
Journey from AI recommendation → meaningful review → accepted → acted on.
Automation vs Augmentation Split
Optimal balance for this decision type. The centaur model (human+AI) outperforms either alone.
Decision Quality Dimensions
Five-dimension quality profile vs industry benchmark. Scores reflect improvement vs solo-human baseline.
Acceptance Rate by Industry (BCG / Stanford HAI 2023 · optimal zone 65–85%)
What each metric measures, what the research says, and how to improve your collaboration posture.
Decision Acceptance Rates & Automation Bias
Decision Acceptance Rate tracks the proportion of AI recommendations that humans act on. Automation Bias Index measures how many of those acceptances occurred without meaningful human review — the silent governance failure that most organisations have not yet instrumented. The EEOC and EU AI Act both require documented human oversight for high-risk decisions; automation bias is evidence that oversight is nominal rather than real.
- Healthcare: 79% acceptance, 31% automation bias — radiologists accepting AI diagnostic flags without independent verification (MIT CSAIL 2023)
- Financial Services: 68% acceptance, 42% automation bias — highest bias rate across sectors; credit officers approving AI-scored applications at volume without case review (BCG 2023)
- Legal: 61% acceptance, 23% bias — most conservative sector; liability exposure drives genuine review (Stanford HAI)
- Optimal zone: 65–85% acceptance with <25% automation bias. Below 50% = undertrust; above 85% = over-reliance
- DARPA XAI finding: providing explanations with AI recommendations reduces automation bias by 18% — the single most effective intervention
- Instrument your AI systems to log whether humans accessed the explanation before accepting; the share of acceptances made without opening the explanation is your automation bias rate
- Add mandatory explanation display before acceptance for P1/P0 decisions — interface friction that requires acknowledgement, not just click-through
- Set a review SLA: for high-stakes decisions, require a logged time-on-task of at least 60 seconds before acceptance
- Report acceptance rates by team to leadership monthly — the act of measurement alone reduces automation bias by 12% (MIT Sloan 2022)
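The two metrics above can be computed directly from an acceptance log. The following is a minimal sketch in Python, not a reference implementation: the record fields (`accepted`, `viewed_explanation`, `review_seconds`) and all function names are hypothetical, and "meaningful review" is assumed to mean that the explanation was opened and at least 60 seconds of review time were logged. The thresholds in `posture` are the ones quoted in the bullets above.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    """One AI recommendation shown to a human (hypothetical log schema)."""
    accepted: bool
    viewed_explanation: bool
    review_seconds: float


def acceptance_rate(log):
    """Share of all recommendations that were accepted."""
    if not log:
        return 0.0
    return sum(d.accepted for d in log) / len(log)


def automation_bias_rate(log, min_review_seconds=60.0):
    """Share of acceptances made without meaningful review.

    Assumption: a review is meaningful only if the explanation was
    opened AND at least `min_review_seconds` were logged.
    """
    accepted = [d for d in log if d.accepted]
    if not accepted:
        return 0.0
    unreviewed = [
        d for d in accepted
        if not (d.viewed_explanation and d.review_seconds >= min_review_seconds)
    ]
    return len(unreviewed) / len(accepted)


def posture(acc, bias):
    """Classify a team against the thresholds quoted in the text."""
    if acc < 0.50:
        return "undertrust"
    if acc > 0.85:
        return "over-reliance"
    if acc >= 0.65 and bias < 0.25:
        return "optimal"
    return "needs attention"
```

For example, a log of four recommendations in which three are accepted and one of those three was accepted after five seconds without opening the explanation gives a 75% acceptance rate and a 33% automation bias rate, which `posture` reports as "needs attention" even though the acceptance rate alone sits inside the optimal zone.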
Automation vs Augmentation Spectrum
The automation vs augmentation split defines how AI is deployed across a decision portfolio. Automation means AI decides and acts without human involvement. Augmentation (the "centaur model") means AI advises, humans decide. The optimal split is not fixed — it varies critically by decision type, reversibility, regulatory context, and cognitive stakes. Getting this wrong in either direction destroys value: over-automation creates liability and error propagation; under-automation wastes the tool.
- Centaur model outperformance: Human+AI teams beat solo AI by 23% and solo humans by 31% on complex decisions — BCG 2023 study of 12,000 knowledge workers
- Routine decisions: 65% automation / 35% augmentation optimal — McKinsey Global Institute 2023
- Complex decisions: 25% automation / 75% augmentation — Stanford HAI recommendation
- High-stakes decisions: 8% automation / 92% augmentation; fully automating a high-stakes decision conflicts with the human-oversight practices in the NIST AI RMF, which is a voluntary framework
- Augmentation preference: 71% of knowledge workers prefer AI as thought-partner (McKinsey 2023); this rises to 84% for healthcare workers and 88% for legal professionals
- Map every AI deployment to one of three tiers: Automate (routine, reversible, low-stakes), Augment (complex, consequential, regulated), Advise-only (irreversible, high-liability, safety-critical)
- Calculate your Return on Employee (RoE): measure hours freed from automated tasks + decision quality improvement per person — this is the centaur dividend
- Resist the pull toward automation in system design: the default should be augmentation, with automation requiring explicit justification and governance sign-off
- Survey team augmentation preference quarterly — low preference scores predict adoption failure before it happens
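The three-tier mapping described above can be expressed as a small rule set. This is a sketch under stated assumptions: the `Deployment` fields, the `assign_tier` function, and the exact way the criteria are combined are illustrative choices, not a published standard. In particular, treating "safety-critical" or "high-stakes and irreversible" as sufficient for Advise-only is an assumption, and Augment is the fallback so that automation always has to be earned explicitly.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    AUTOMATE = "Automate"
    AUGMENT = "Augment"
    ADVISE_ONLY = "Advise-only"


@dataclass
class Deployment:
    """Attributes of one AI use case (hypothetical schema)."""
    name: str
    routine: bool
    reversible: bool
    high_stakes: bool      # also stands in for high-liability
    regulated: bool
    safety_critical: bool


def assign_tier(d):
    """Map a deployment to a tier; augmentation is the default."""
    # Assumed rule: safety-critical work, or high-stakes work that
    # cannot be undone, is restricted to advice only.
    if d.safety_critical or (d.high_stakes and not d.reversible):
        return Tier.ADVISE_ONLY
    # Automation requires every low-risk criterion to hold at once.
    if d.routine and d.reversible and not d.high_stakes and not d.regulated:
        return Tier.AUTOMATE
    # Everything else stays with a human decision-maker.
    return Tier.AUGMENT
```

Under these rules a routine, reversible, unregulated task such as invoice routing lands in Automate, a regulated credit decision lands in Augment, and an irreversible high-stakes decision lands in Advise-only. Adjust the rules to your own governance policy before relying on the output.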
Decision Quality & Cognitive Load
Decision quality in human-AI systems is multi-dimensional: accuracy (correctness), speed (time-to-decision), consistency (same decision in the same context), error rate (critical failures), and cognitive load (mental effort required). The HAIS (Human-AI Integration Scale), developed by researchers at MIT and Northeastern, provides a validated 25-item instrument for measuring how well AI integration serves human cognition rather than taxing it.
- Accuracy lift: +18% average improvement in AI-assisted vs solo human decisions; up to +22% in healthcare (BCG/MIT 2023)
- Error reduction: −37% critical errors in healthcare AI with human review; −22% in financial services (Stanford HAI 2023)
- Time-to-decision: −28% faster on average; routine decisions faster by 45%; high-stakes decisions faster by only 12% (appropriate caution)
- Decision consistency: +31% improvement — AI dramatically reduces "decision fatigue" variance; humans make worse decisions in the afternoon, AI does not
- Cognitive load: −24% reduction in perceived mental effort when AI provides structured options vs open-ended assistance (HAIS scale validation studies)
- Trust Score (Edelman 2024): Global average 53/100; healthcare 62; financial services 49; legal 44
- Deploy the HAIS scale as a quarterly 25-item survey — it takes 8 minutes and produces a validated composite score across trust, transparency, control, and explainability dimensions
- Establish baseline accuracy and error rates before AI deployment — you cannot measure lift without a pre-AI baseline captured in the same period
- Track decision consistency by presenting the same case to the same person twice within 4 weeks; the disagreement between the two passes is your "human inconsistency baseline" that AI should reduce
- Monitor cognitive load as a leading indicator of adoption failure — high perceived effort predicts abandonment within 90 days, well before accuracy drops become visible
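The baseline and consistency measurements described above reduce to two small calculations. The sketch below is illustrative only: the function names and input shapes are hypothetical, and lift is computed as a relative change against the pre-AI baseline, which is one possible reading of the percentage figures quoted above (they could also be percentage-point differences).

```python
def relative_lift(baseline, assisted):
    """Relative change of an AI-assisted metric against the pre-AI baseline.

    Assumption: lift is relative, so 0.50 -> 0.75 is a lift of 0.5 (+50%).
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive to compute lift")
    return (assisted - baseline) / baseline


def retest_agreement(first_pass, second_pass):
    """Share of repeated cases on which a reviewer gave the same decision.

    Both arguments map a case id to the decision made on that pass;
    only cases present in both passes are compared.
    """
    shared = first_pass.keys() & second_pass.keys()
    if not shared:
        raise ValueError("no cases were presented in both passes")
    same = sum(first_pass[c] == second_pass[c] for c in shared)
    return same / len(shared)


def inconsistency_baseline(first_pass, second_pass):
    """Share of repeated cases on which the reviewer changed their decision."""
    return 1.0 - retest_agreement(first_pass, second_pass)
```

Run `retest_agreement` per reviewer before deployment and again afterwards with AI assistance; the change in agreement is the consistency effect attributable to the tool, provided the case mix is comparable across the two periods.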
Get the Human-AI Collaboration Assessment Template
The 38-point assessment template used to evaluate human-AI interaction quality across your organisation — covering decision acceptance protocols, automation bias audit, augmentation framework design, and the HAIS survey instrument for measuring cognitive integration.
- Decision acceptance rate tracker with automation bias audit (12 items)
- Automation vs augmentation decision matrix for your use-case portfolio
- Abbreviated HAIS instrument (25-item validated survey, 8 minutes)
- Return on Employee (RoE) measurement framework for AI-assisted roles