No More AI for AI's Sake: Why 74% of Companies Are Getting AI Wrong, and How to Join the 26% Who Get It Right

Most companies misuse AI, chasing hype rather than value. This piece shows why 74% fail and how the 26% succeed, and offers a practical path to real business impact.

I’ve lost count of the pilots that started with a glossy demo and ended as an orphaned dashboard. The noise in this space rewards being loud and wrong over being useful; it’s exhausting and it shows up in the work.

Here’s the sober view. After years of spend and hype, only a minority of organisations are actually turning AI into measurable value. Boston Consulting Group’s global survey says it plainly: just 26% of companies have the capabilities to move beyond proofs of concept and generate tangible value at scale; 74% haven’t cracked it yet. The leaders concentrate value creation in core business processes (not vanity demos) and they invest far more in people and process than in algorithms. Their rule of thumb: ~10% algorithms, ~20% technology and data, ~70% people and process.

Meanwhile, failure rates are rising. S&P Global’s research reports that the share of firms scrapping most of their AI initiatives jumped to 42% this year, with nearly half of proofs of concept abandoned before production. Cost, data privacy and security are the top reasons. McKinsey finds adoption is widespread but risk practices lag; inaccuracy is the most commonly experienced risk, and only a small cohort can attribute meaningful EBIT to gen AI today.

The lesson: adoption is not impact. This isn’t cause for cynicism; it’s an argument for discipline.

What the 74% are getting wrong (patterns we see again and again)

Tech-first, process-second. Teams start with a model, a vendor, or the latest “copilot”, then go hunting for a use. The result is bolt-on assistants living outside the flow of work. Meanwhile, the real value sits in unglamorous handoffs (claims triage, exception management, month-end close, KYC remediation) where latency, rework and error rates are expensive and fixable. Leaders focus on those core flows; that’s where most of AI’s value resides.

Pilot theatre. Pilots run on clean data and goodwill. There’s no plan for permissions, auditability, or routing exceptions. The pilot “works” until it meets the messy tangle of systems of record, access controls and human oversight. No surprise so many POCs never see production.

Data wishful thinking. A workshop assumes unified customer histories, canonical product lists, event streams with timestamps that mean something. Then the team discovers five CRMs, seven Excel kingdoms and IDs that don’t match. Value dies in the join.

Governance bolted on at the end. Legal, risk and compliance are invited to the party after the invitations have been printed. By then, the only honest answer is “not like this.” High performers “shift left,” embedding risk, IP, security and explainability considerations during design, not as a late-stage hurdle.

No resourcing for change. The business case assumes 30% time saved per person, then quietly assumes the org will capture it without redesigning roles, incentives, controls or measurement. It won’t. If you don’t change how performance is managed, savings evaporate into nicer-looking backlogs.

What the 26% do differently

  1. They start where the money moves. Leaders don’t start with a chatbot; they start with the value chain. Map the end-to-end flow, quantify the three biggest drains (waiting, rework, fragmentation) and aim AI at those constraints first. In BCG’s analysis, 62% of realised value shows up in core processes such as operations, sales and R&D, not in the support functions that dominate the demos.
  2. They design for the flow of work, not for a demo. That means integrating with systems of record, enriching with context (IDs, entitlements, SLAs) and handling exceptions with clear human-in-the-loop steps. The test is simple: can a new starter follow the runbook on a Monday and still succeed?
  3. They fund people and process on purpose. The 70-20-10 allocation isn’t a slogan; it’s budget lines (a minimal budget sketch follows this list):
    • People and process (~70%): role redesign, training, new controls, SOPs, adoption support, communication and line manager coaching.
    • Technology and data (~20%): connectors, event logs, feature stores, monitoring and access controls.
    • Algorithms (~10%): model selection, tuning and evaluations.
    Leaders treat change management itself as the product.
  4. They govern by design. Establish a small, empowered Responsible AI forum with authority over policy and exceptions; involve it from day one. High performers “shift left” on risk and implement playbooks for inaccuracy, IP, security and explainability early, not after a breach or a front-page embarrassment.
  5. They scale selectively. Leaders pursue half as many opportunities yet scale more than twice as many solutions. They pick three to five “lighthouse” workflows with line of sight to P&L improvements and run them to standard. Then, and only then, do they expand to adjacent flows.
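
To make the budget lines in item 3 concrete, here is a minimal sketch of the 70-20-10 split applied to a hypothetical programme. The $2M total and the line-item examples are illustrative, not figures from BCG’s report:

```python
TOTAL_BUDGET = 2_000_000  # hypothetical annual programme budget (illustrative)

# The 70-20-10 rule of thumb expressed as budget lines.
ALLOCATION = {
    "people and process": 0.70,   # role redesign, training, SOPs, adoption support
    "technology and data": 0.20,  # connectors, event logs, monitoring, access controls
    "algorithms": 0.10,           # model selection, tuning, evaluations
}

for line, share in ALLOCATION.items():
    print(f"{line}: {share:.0%} -> ${TOTAL_BUDGET * share:,.0f}")
# people and process: 70% -> $1,400,000
# technology and data: 20% -> $400,000
# algorithms: 10% -> $200,000
```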

A practical path (90 days to certainty)

Week 1–2: Where value leaks. Map the end-to-end workflow (one value stream, not a department). Baseline five numbers: cycle time, touch time, first-time-right rate, exception ratio and queue length. Confirm the decision moments where AI can help (classify, extract, route, summarise, predict). Draft the risk profile and controls (data sources, legal basis, human oversight, audit trail).
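
As a minimal sketch of that baseline, assuming you can export one record per completed case with timestamps and outcome flags (the field names below are illustrative, not a standard schema), four of the five numbers fall out of simple arithmetic, and flow efficiency comes free:

```python
from datetime import datetime
from statistics import mean

# One record per completed case; field names are assumptions for illustration.
cases = [
    {"started": datetime(2025, 1, 6, 9), "finished": datetime(2025, 1, 8, 17),
     "touch_minutes": 95, "first_time_right": True, "exception": False},
    {"started": datetime(2025, 1, 6, 10), "finished": datetime(2025, 1, 10, 12),
     "touch_minutes": 140, "first_time_right": False, "exception": True},
]

cycle_h = mean((c["finished"] - c["started"]).total_seconds() / 3600 for c in cases)
touch_h = mean(c["touch_minutes"] for c in cases) / 60
ftr = mean(c["first_time_right"] for c in cases)       # share done right first time
exception_ratio = mean(c["exception"] for c in cases)
# Queue length is the fifth number: a point-in-time count of open items,
# sampled from the live work queue rather than derived from closed cases.

print(f"cycle {cycle_h:.1f}h | touch {touch_h:.1f}h | FTR {ftr:.0%} | "
      f"exceptions {exception_ratio:.0%} | flow efficiency {touch_h / cycle_h:.1%}")
```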

Week 3–4: Build the thin slice. Stand up a single thin end-to-end slice (ingest → reason → action) connected to real systems and real permissions. Instrument quality from day one: establish evaluation sets, accuracy thresholds and error budgets. Define an exception path that’s as well designed as the happy path.
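
A sketch of that instrumentation, assuming a hand-labelled evaluation set; classify() is a hypothetical stand-in for the slice’s reason step, and the thresholds are illustrative:

```python
# Hand-labelled evaluation set: (input, expected label). Contents illustrative.
EVAL_SET = [
    ("Invoice total disputed by customer", "exception"),
    ("Standard renewal, all fields present", "straight_through"),
]

ACCURACY_THRESHOLD = 0.95  # below this, the slice may not auto-action
ERROR_BUDGET = 0.02        # tolerated share of live decisions later reversed

def classify(text: str) -> str:
    """Hypothetical reason step; a keyword stub stands in for the real model call."""
    return "exception" if "disputed" in text.lower() else "straight_through"

def evals_pass() -> bool:
    """Gate: the slice ships only if eval-set accuracy clears the threshold."""
    correct = sum(classify(text) == expected for text, expected in EVAL_SET)
    return correct / len(EVAL_SET) >= ACCURACY_THRESHOLD

def within_error_budget(reversed_count: int, total: int) -> bool:
    # Live guardrail: route to the exception path once reversals exceed the budget.
    return reversed_count / max(total, 1) <= ERROR_BUDGET
```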

Week 5–8: Prove it in production. Ship to a contained user group and run live for four weeks. Track adoption (weekly active users and depth of use), operational impact (cycle-time deltas, rework avoided) and risk events (number, severity, time to mitigate). Fix what reality reveals: edge cases, permissions, model drift, prompt brittleness.
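
A sketch of that weekly scorecard, assuming usage events and risk incidents are logged somewhere queryable (the record shapes below are illustrative):

```python
from collections import Counter
from statistics import median

usage = [  # (ISO week, user id), one row per task completed with the tool
    ("2025-W06", "u1"), ("2025-W06", "u1"), ("2025-W06", "u2"), ("2025-W07", "u1"),
]
risk_events = [  # (severity 1-5, hours to mitigate); illustrative entries
    (2, 6.0), (4, 30.0),
]

def weekly_actives(week: str) -> int:
    """Distinct users who completed at least one task in the week."""
    return len({user for w, user in usage if w == week})

def adoption_depth(week: str) -> float:
    """Median tasks per active user per week: depth, not just reach."""
    per_user = Counter(user for w, user in usage if w == week)
    return float(median(per_user.values())) if per_user else 0.0

high_severity = [(s, h) for s, h in risk_events if s >= 4]
print(weekly_actives("2025-W06"), adoption_depth("2025-W06"), len(high_severity))
```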

Week 9–12: Make it standard. Codify SOPs, update role descriptions, switch incentives to reward the new way of working and add the controls to your audit universe. Decide: scale, iterate, or kill. The bravest decision is “no.” Leaders stop what doesn’t pay back.

A five-point checklist to avoid “AI for AI's sake”

  • If no one can name the P&L line this helps, stop.
  • If the thin slice doesn’t touch a system of record, stop.
  • If Legal hasn’t been asked yet, stop.
  • If success is defined as “a demo that looks cool”, stop.
  • If the business case counts “time saved” but not “time redeployed”, stop.

The metrics that matter (and the ones that don’t)

Do measure:

  • Time to first value (from kick-off to first live decision supported).
  • Flow efficiency (touch time ÷ total time).
  • First-time-right rate and exception ratio.
  • Adoption depth (median tasks per user per week).
  • Defects escaped (issues discovered by auditors/customers vs. caught in flow).

Don’t fetishise:

  • Tokens, prompts, model bragging rights.
  • “Hours saved” without a plan to capture them (role redesign, span-of-control changes).
  • Pilot NPS if the pilot isn’t in the flow of work.

Two honest risks to manage

Scrap-rate risk.
S&P Global’s figures should focus minds: 42% of firms scrapped most initiatives; nearly half of POCs never reached production. The antidote is not fear; it’s a better funnel. Fewer bets, clearer kill criteria, faster “no.”

Inaccuracy and explainability risk.
McKinsey notes inaccuracy is the most commonly experienced risk today and that governance practices remain thin in most organisations. Embed evaluation sets, monitor drift and provide human override with audit trails. Treat explainability to the operator as non-negotiable.
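
One hedged sketch of those two controls; the thresholds, field names and log path are illustrative, not a standard API:

```python
import json
import time

BASELINE_ACCURACY = 0.95  # accuracy on the frozen evaluation set at go-live
DRIFT_TOLERANCE = 0.05    # alert if live accuracy falls this far below baseline

def drifted(live_correct: int, live_total: int) -> bool:
    """True if live accuracy has fallen out of the tolerated band."""
    return live_correct / max(live_total, 1) < BASELINE_ACCURACY - DRIFT_TOLERANCE

def log_override(case_id: str, operator: str, model_answer: str,
                 human_answer: str, reason: str) -> None:
    """Append-only audit trail: one JSON line per human override."""
    record = {"ts": time.time(), "case": case_id, "operator": operator,
              "model": model_answer, "human": human_answer, "reason": reason}
    with open("override_audit.jsonl", "a") as f:
        f.write(json.dumps(record) + "\n")
```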

Where this is heading

The near-term gains are not sci-fi. They are quietly radical: compressing cycle times, shrinking rework, reducing queues and making exception handling sane. The firms that win won’t be those with the most models; they’ll be those with the cleanest flows, the clearest controls and the courage to say no to glamour projects.

If you remember one thing, make it this: start with the work, not the model. The companies in the 26% aren’t smarter; they’re more disciplined. They choose fewer battles, close the loop between design and adoption, and put most of the money into people and process. The rest (vendors, models, frameworks) only matter if the work moves.