
Build vs. Buy in the AI Era: A Framework for Strategic Decision-Making

Foundation models changed the rules. Enterprise AI software spend more than tripled to $37 billion in 2025. The organizations making the wrong build-vs-buy calls are paying for it in ways that don't show up in the original budget.

Key Takeaways

  • Enterprise AI software spend grew from $1.7B in 2023 to $11.5B in 2024 to $37B in 2025 — the fastest category expansion in enterprise software history. In a single year, the buy option went from 53% to 76% of AI use cases.
  • Purchased AI tools and strategic partnerships succeed approximately 67% of the time. Fully internal AI builds succeed at roughly half that rate (~33%).
  • Foundation models are commoditizing rapidly. As model quality converges, competitive differentiation migrates to proprietary data, workflow integration, and orchestration — not to the model itself.
  • The hybrid path — RAG, fine-tuning, and wrapper strategies — is how most sophisticated enterprises are navigating the spectrum. Each has a distinct cost structure and appropriate use case.
  • 67% of organizations aim to avoid high dependency on a single AI provider, yet 45% report that vendor lock-in has already hindered their ability to adopt better tools.

The Old Question With Fundamentally New Stakes

The build-versus-buy question has existed since enterprise software was invented. It was never simple. But the arrival of large language models and the broader generative AI platform wave has made it genuinely harder to reason about — while simultaneously raising the stakes of getting it wrong.

In the traditional software era, the build-vs-buy calculus was relatively tractable: custom development offered maximum control and differentiation but took longer and cost more; packaged software offered faster deployment and lower initial cost but came with feature gaps and vendor dependency. Organizations could model the tradeoffs with reasonable confidence.

The AI era introduces three structural changes that disrupt this calculus:

  1. Foundation models as shared infrastructure. The base intelligence layer is now available to everyone. No organization needs to train a GPT-4-class model from scratch. The competitive question has shifted from "can we build this intelligence?" to "where does building something atop this intelligence create unique value?"
  2. Rapidly shifting vendor landscape. Model quality, pricing, and market share are all in significant flux. OpenAI fell from 50% of enterprise LLM spending in 2023 to 27% in 2025. Anthropic grew from 12% to 40% in the same period. Technology choices made in 2023 may be strategically suboptimal today — and the pace of change is not slowing.
  3. Accelerated time-to-value for buying. GenAI-native SaaS converts at 2× the speed of traditional enterprise software. The competitive advantage of a custom build must now justify a development cycle that may be longer than the vendor's product release cycle.

Foundation Models Changed the Rules

The most important contextual fact in the build-vs-buy analysis is this: the foundational intelligence layer is becoming a commodity. By early 2026, performance gaps among leading foundation models have narrowed to incremental improvements rather than categorical differences. Open-source models like DeepSeek V3.1 and Qwen3 achieve inference costs up to 90% lower than proprietary alternatives at comparable performance levels for many standard tasks.

Enterprise LLM Market Share Shift: 2023 → 2025
% of enterprise LLM API spend. Source: Menlo Ventures State of Generative AI 2025
| Provider  | 2023 | 2025 | Change |
|-----------|------|------|--------|
| OpenAI    | 50%  | 27%  | ↓ 23pp |
| Anthropic | 12%  | 40%  | ↑ 28pp |
| Google    | 7%   | 21%  | ↑ 14pp |
| Others    | 31%  | 12%  | ↓ 19pp |

This convergence has a strategic implication: as foundation model quality normalizes, the value in AI systems migrates up the stack — to workflow integration, proprietary data, domain-specific fine-tuning, and the orchestration layer that connects AI capability to business process. McKinsey confirms: high-performing organizations are nearly 3× as likely to have fundamentally redesigned individual workflows — the variable that drives outcomes is workflow integration, not model selection.

"If today's AI is an intern with the internet in its pocket, tomorrow's competitive edge will come from the walled data gardens that the public internet doesn't offer."

— Cloudera Enterprise AI Strategy Report, 2025

The AI Stack: Where Value Now Lives

To reason clearly about build vs. buy, it helps to think in layers. The AI stack has six distinct levels, each with different build-vs-buy economics:

The Enterprise AI Stack
Where value lives — and where the build vs. buy decision actually matters
6. Enterprise Integration Layer: Proprietary workflows · Unique data · Business process orchestration → BUILD (highest differentiation)
5. Application / Wrapper Layer: Specialized agents · Vertical AI SaaS · Copilots → BUY / FINE-TUNE (depends on use case)
4. Foundation Model Layer: OpenAI · Anthropic · Google · Meta Llama · Mistral → BUY via API or open source (commoditizing rapidly)
3. Infrastructure Orchestration: LangChain · LlamaIndex · Agents · Vector DBs · RAG → BUY / CONFIGURE (use established tooling)
2. Cloud / Infrastructure: AWS · Azure · GCP · Oracle Cloud → BUY, always (no differentiation from building)
1. Hardware: NVIDIA · AMD · Intel · Specialized AI silicon → BUY, always (no use case for custom hardware)

The insight from this layered view is that most organizations are making build-vs-buy decisions about the wrong layers. Buying cloud infrastructure and foundation model access is almost always correct — these layers are commodities, and building them from scratch offers no competitive advantage. The build decision becomes strategically significant at layers 5 and 6: the application layer (where specialized AI agents and vertical products live) and the enterprise integration layer (where proprietary data, unique workflows, and business-specific orchestration create defensible differentiation).

The Case for Buying

The evidence strongly favors purchasing for the majority of enterprise AI use cases in 2025-2026. The 76% buy rate observed by Menlo Ventures (up from 53% just one year prior) is not a statistical anomaly — it reflects a genuine shift in the economics of AI deployment.

Where Buying Is Clearly Right

Standard use cases without differentiation. Email drafting, meeting summarization, code completion, document search, customer FAQ handling — if your competitors can deploy the same vendor solution in 90 days or less, this capability is infrastructure, not advantage. Buying is faster, cheaper, and avoids maintaining a development effort that will produce only parity outcomes.

Speed to market is the constraint. Custom AI development cycles run 12–24 months for production-grade deployments. GenAI-native SaaS can be deployed in weeks. If competitive timing matters — and in most cases it does — the opportunity cost of building must be explicitly modeled against the value at risk during development.

Low-to-medium usage volumes. Below approximately 500,000 tokens per day, cloud APIs are significantly simpler and cheaper than self-hosted alternatives. The break-even point for self-hosting (hardware, maintenance, talent) only emerges at substantial volume.
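
A back-of-envelope way to find that break-even point yourself. Every price below is an illustrative assumption, not a vendor quote; the point is to run the model with your own numbers:

```python
# Back-of-envelope break-even between cloud API and self-hosted inference.
# Every number here is an illustrative assumption -- substitute your own quotes.

def break_even_tokens_per_day(
    api_cost_per_m: float,        # blended API price, $ per 1M tokens
    self_fixed_monthly: float,    # GPU amortization + hosting + ops, $ per month
    self_marginal_per_m: float,   # power/overhead when self-hosting, $ per 1M tokens
) -> float:
    """Daily token volume above which self-hosting is cheaper than the API."""
    saving_per_m = api_cost_per_m - self_marginal_per_m
    if saving_per_m <= 0:
        return float("inf")  # self-hosting never pays off under these assumptions
    return self_fixed_monthly / saving_per_m * 1e6 / 30

# Example inputs: a premium proprietary model vs. a modest self-hosted setup.
print(break_even_tokens_per_day(
    api_cost_per_m=30.0,        # assumption: frontier-model blended pricing
    self_fixed_monthly=2_200,   # assumption: one rented GPU + partial ops time
    self_marginal_per_m=0.50,   # assumption
))  # ~2.5M tokens/day under these particular assumptions
```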

What Purchased Solutions Actually Deliver

The published ROI data on off-the-shelf AI purchases is instructive. Microsoft's own internal research on Copilot at $30/user/month shows users saving an average of 1.2 hours per week, roughly 62 hours a year. Valued at an implied ~$50 per hour, that is approximately $3,120 per user against a $360 annual license cost: an 8.7× return for roles where the tool fits. Salesforce Agentforce deployments have produced 213% ROI for research organization clients. ServiceNow Now Assist, considered the most mature enterprise AI add-on, delivers AI-generated case summaries, intelligent search, and predictive routing with minimal integration work.

Caveat on published ROI figures: Vendor-published ROI data selects for deployments that worked. Real-world averages are lower. The critical variable is use-case fit: Microsoft Copilot ROI approaches zero for roles with minimal document and email work. Evaluate vendor ROI claims against your specific workflow composition, not aggregate figures.

The Case for Building

Building becomes strategically justified — and in some cases, strategically necessary — when specific conditions are present. These conditions are less common than vendor marketing would have you believe, and more common than internal development advocates admit.

When Building Is the Right Call

Competitive differentiation is the core use case. If the AI capability is the product itself, or is central to your competitive position, and a vendor solution would be equally available to your competitors, then buying cannot create advantage: you need to build. The question to ask: if your best competitor deployed the same vendor solution tomorrow, would your advantage evaporate? If yes, build.

Proprietary data advantage. You have training or retrieval data that your competitors cannot access — years of clinical notes, unique customer interaction history, proprietary transaction logs, sensor streams from physical infrastructure. Stanford's Human-Centered AI research confirms that access to organic alignment data is a genuine strategic differentiator. Without the data moat, the argument for building weakens significantly.

Regulatory or data sovereignty requirements. PHI under HIPAA, PII under GDPR or CCPA, financial data under various regulatory frameworks, defense or government classified data — scenarios where sending data to third-party APIs creates unacceptable compliance or sovereignty risk. Self-hosting or building on-premises becomes mandatory, not optional.

Scale economics at volume. At more than 10 million tokens per month, the economics of self-hosting open-source models begin to approach cost parity with APIs. At 100 million tokens monthly and above, organizations can save $5 million to $50 million annually through self-hosting, a figure that justifies substantial engineering investment.

The Hybrid Middle Path

The most sophisticated enterprise AI strategies are not pure build or pure buy — they are layered architectures that deploy different approaches at different levels of the stack. Three techniques define this hybrid space:

Prompt Engineering (Wrapper)

The fastest, cheapest, and most accessible customization approach. System prompts, few-shot examples, and structured output specifications can dramatically change how a foundation model behaves for a specific use case — without any model modification. Appropriate for: task framing, tone and style calibration, output formatting, basic constraint enforcement. Limitations: context window dependent, no persistent learning, token costs scale linearly with injected context.
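
A minimal sketch of the wrapper pattern using the OpenAI Python SDK's chat interface; the model name, prompts, and triage task are illustrative placeholders, not a recommended configuration:

```python
# Minimal wrapper-layer sketch: all customization lives in the prompt,
# with no model modification. Model name and prompt content are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You are a claims-triage assistant for an insurance workflow. "
    "Classify each claim summary as ROUTINE, REVIEW, or ESCALATE. "
    "Respond with a single JSON object: {\"label\": ..., \"reason\": ...}."
)

# Few-shot example calibrates tone, format, and decision boundaries.
FEW_SHOT = [
    {"role": "user", "content": "Windshield chip, repair quote $180."},
    {"role": "assistant",
     "content": '{"label": "ROUTINE", "reason": "Low value, standard repair."}'},
]

def triage(claim_summary: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",                      # placeholder model choice
        messages=[{"role": "system", "content": SYSTEM_PROMPT},
                  *FEW_SHOT,
                  {"role": "user", "content": claim_summary}],
        response_format={"type": "json_object"},  # structured output constraint
        temperature=0,
    )
    return response.choices[0].message.content

print(triage("Multi-vehicle collision, injuries reported, estimate pending."))
```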

Retrieval-Augmented Generation (RAG)

Connect a foundation model to a vector database containing your proprietary documents, records, knowledge base, and reference data. The model retrieves relevant context at inference time and grounds its responses in your verified information. RAG is the dominant enterprise architecture pattern because it solves the most common enterprise AI problem — accuracy and currency of information — without the cost and complexity of model retraining. It is also inherently updatable: add new documents to the knowledge base and the model immediately has access to them, without retraining.

The cost tradeoff to model: each retrieved context chunk substantially inflates prompt size, which drives up per-inference token costs. At high volume, the math needs explicit modeling. At moderate volumes, RAG is almost always more cost-effective and more flexible than fine-tuning.
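
A minimal sketch of the retrieval loop, with placeholder embed() and generate() functions standing in for whichever embedding and chat APIs you actually use (assumptions, not a specific vendor's interface):

```python
# Minimal RAG sketch: retrieve top-k chunks by cosine similarity, then
# ground the model's answer in them.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder: call your embedding model here."""
    raise NotImplementedError

def generate(prompt: str) -> str:
    """Placeholder: call your chat/completion model here."""
    raise NotImplementedError

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

class TinyVectorStore:
    def __init__(self):
        self.chunks: list[tuple[str, np.ndarray]] = []

    def add(self, text: str):
        # A newly added document is immediately retrievable -- no retraining.
        self.chunks.append((text, embed(text)))

    def top_k(self, query: str, k: int = 3) -> list[str]:
        q = embed(query)
        ranked = sorted(self.chunks, key=lambda c: cosine(q, c[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

def answer(store: TinyVectorStore, question: str) -> str:
    context = "\n\n".join(store.top_k(question))
    # The cost tradeoff from the text: every retrieved chunk inflates the
    # prompt, and per-inference token cost scales with prompt size.
    return generate(f"Answer using only this context:\n{context}\n\nQ: {question}")
```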

Fine-Tuning

Modifying model weights by training on domain-specific examples — teaching the model your vocabulary, your task patterns, your preferred output structures. Fine-tuning is appropriate when you have a stable, high-volume, repetitive task type where the standard foundation model output is consistently suboptimal in specific, documentable ways. It is not appropriate for general-purpose knowledge improvement (RAG handles this better) or for frequently changing information (retraining is expensive and slow).

Fine-tuning costs are frequently underestimated. Stanford HAI research finds that 73% of organizations underestimate fine-tuning budgets by 40–60% in their first year. The cost includes: data preparation and cleaning, GPU compute time (40–80 GPU-hours for a 7B parameter model), evaluation infrastructure, and ongoing retraining as requirements evolve. Build the full budget before comparing fine-tuning against alternative approaches.
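
A first-year budget sketch along these lines. The 40–80 GPU-hour range comes from the figures above; every dollar rate is an assumption to replace with your own quotes:

```python
# First-year fine-tuning budget sketch. All line items are illustrative
# assumptions except the GPU-hour range, which is cited in the text above.

GPU_HOURLY_RATE = 4.00    # assumed $/GPU-hour for training-class hardware
GPU_HOURS_PER_RUN = 60    # midpoint of the 40-80 range cited for a 7B model
RETRAINS_PER_YEAR = 4     # assumption: quarterly retraining cadence
DATA_PREP = 40_000        # assumed labeling, cleaning, formatting
EVAL_INFRA = 25_000       # assumed eval harness + regression suites

compute = GPU_HOURLY_RATE * GPU_HOURS_PER_RUN * RETRAINS_PER_YEAR
total = compute + DATA_PREP + EVAL_INFRA
print(f"Compute:        ${compute:,.0f}")   # $960 under these assumptions
print(f"Year-one total: ${total:,.0f}")     # $65,960 under these assumptions
# The punchline: GPU time is rarely the dominant cost. Data preparation
# and evaluation infrastructure are where budgets get underestimated.
```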

Total Cost of Ownership: What the Numbers Actually Show

| Cost Category | Build (Custom) | Buy (Vendor SaaS) | Hybrid (RAG / Fine-Tune) |
|---|---|---|---|
| Initial development | High ($500K–$5M+) | Low (integration only) | Medium ($100K–$800K) |
| Time to first value | 12–24 months | Weeks to 3 months | 3–9 months |
| Ongoing inference cost | High at scale, lower at very high volume | Predictable per-user or per-call pricing | Moderate, optimizable |
| MLOps / maintenance talent | $200K–$400K+ per engineer | Minimal internal talent required | Moderate; engineers needed for data pipeline |
| Feature roadmap control | Full control | Dependent on vendor | Partial; own the customization, buy the base |
| Vendor lock-in risk | None (but talent risk) | High | Moderate; abstract the model layer |
| Data privacy / sovereignty | Full control | Dependent on vendor data agreements | Manageable with self-hosted models |
| Switching cost over time | Low (own the asset) | High; deep workflow integration | Low–Medium with abstraction layer |

The hidden cost categories that most organizations underestimate: inference cost at production volume (OpenAI's 2024 inference spend was $2.3 billion — 15× its GPT-4 training cost), model monitoring and drift detection infrastructure, the cost of evaluating and periodically switching providers as the market evolves, and the organizational cost of switching when a vendor relationship becomes untenable. CIOs are now setting aside 9% of IT budget specifically for price increases on existing AI services.

Vendor Lock-In: The Risk Most Organizations Underestimate

Forty-five percent of enterprises report that vendor lock-in has already hindered their ability to adopt better tools — a striking statistic given that generative AI has only been enterprise-mainstream for three years. The concern is structural: as organizations integrate AI deeply into workflows, switching costs compound. Data formats, API dependencies, fine-tuned model weights, retrieval indexes, and embedded prompt structures all accumulate over time as switching barriers.

The mitigation strategies that sophisticated enterprises are using:

  • AI gateway architecture: Gartner projects that 70% of organizations building multi-LLM applications will use AI gateway capabilities by 2028, up from less than 5% in 2024. Gateways (LiteLLM, custom routers, enterprise API management) allow organizations to route to any foundation model provider behind a common interface, dramatically reducing switching cost. A minimal sketch of the pattern follows this list.
  • Open-source model mix: Maintaining a proportion of workloads on open-source models (Llama, Mistral, Qwen) that can be self-hosted eliminates vendor dependency for those use cases and provides negotiating leverage on the proprietary side.
  • Open data formats: Ensuring that knowledge bases, embedding indexes, and fine-tuning datasets are stored in vendor-agnostic formats preserves portability. Proprietary vector database formats are one of the more insidious lock-in mechanisms.
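
A minimal hand-rolled sketch of the gateway pattern, not any specific product's API; call_openai and call_anthropic are placeholders for real SDK calls:

```python
# Minimal model-abstraction sketch: route requests to interchangeable
# providers behind one interface, so swapping vendors is a config change.
from typing import Callable, Protocol

class ChatProvider(Protocol):
    def complete(self, prompt: str) -> str: ...

class ProviderAdapter:
    def __init__(self, name: str, call: Callable[[str], str]):
        self.name = name
        self._call = call  # wraps a vendor SDK behind the common signature

    def complete(self, prompt: str) -> str:
        return self._call(prompt)

class Gateway:
    """Routes by logical model name; swapping vendors is one dict edit."""
    def __init__(self, routes: dict[str, ChatProvider]):
        self.routes = routes

    def complete(self, route: str, prompt: str) -> str:
        return self.routes[route].complete(prompt)

# Placeholder vendor calls -- replace with real SDK invocations.
def call_openai(prompt: str) -> str: ...
def call_anthropic(prompt: str) -> str: ...

gateway = Gateway({
    "default": ProviderAdapter("anthropic", call_anthropic),
    "cheap-batch": ProviderAdapter("openai", call_openai),
})
# Application code depends only on the gateway, never on a vendor SDK:
# gateway.complete("default", "Summarize this contract ...")
```

The same pattern underlies the "always abstract the model layer" heuristic later in this piece: everything above the gateway survives a provider switch untouched.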

The Decision Framework

Bringing this together into a practical decision framework: the build-vs-buy question is most cleanly resolved along two axes — the strategic value of the AI capability (how central is this to competitive advantage?) and the degree of data differentiation available (do you have proprietary data that creates meaningful model advantage?).

AI Build vs. Buy Decision Matrix
Map your AI use case to determine the appropriate strategy. Most enterprise use cases fall in the left column.
| Strategic Value ↓ / Data Differentiation → | Low | High |
|---|---|---|
| Low | BUY. Commodity infrastructure; standard SaaS tools (email, meeting summaries, FAQ). | BUY + RAG. Leverage data in the retrieval layer (knowledge bases, document Q&A). Buy the model, own the data. |
| High | BUY → HYBRID. Buy now, build the data moat (customer-facing AI features). Plan for fine-tuning as data matures; deploy fast, data investment unlocks a build later. | BUILD. Core competitive differentiation: proprietary data × strategic outcome; unique workflow automation at scale. Only justified when both axes are high. |

Practical Heuristics for 2025-2026

For leadership teams making AI investment decisions in the current environment, these heuristics reflect where the evidence currently points:

  • If competitors can deploy the same solution in 90 days: buy. The competitive moat does not exist if it can be replicated at vendor speed. Deploy the vendor, iterate on workflow design, and move faster than competitors on the next layer.
  • If your data moat represents 2+ years of proprietary accumulation: model the build case. Proprietary data of sufficient depth and quality is the most defensible AI differentiator available. If you have it, evaluate whether it justifies custom development.
  • If regulatory or compliance requirements prohibit third-party data processing: self-host or build. This is not a grey area. The compliance requirement resolves the question.
  • If you process more than 10 million tokens per month and growing: model open-source TCO. The economics of self-hosting become materially competitive at scale. Do the math explicitly before the next contract renewal.
  • If you need it working in less than 6 months: buy and plan for future customization. Build cycles do not compress reliably. Deploy a vendor solution, learn from production usage, and incorporate those learnings into a future build or fine-tune roadmap.
  • Always abstract the model layer. Whether building or buying, design your AI architecture so that the foundation model can be swapped without rewriting everything above it. The market is moving too fast to bet permanently on any single provider.

leapHL's assessment: The organizations generating the highest AI ROI in 2025 are not those who chose the "right" model — they are those who made clear-eyed build-vs-buy decisions, moved quickly on the buy decisions, and invested deliberately in the workflow integration and data layer that creates genuine differentiation. The build-vs-buy decision is the first, most important call. Get it right, and the rest follows.


Sources: Menlo Ventures State of Generative AI in the Enterprise 2025; MarkTechPost Build vs Buy Enterprise AI Framework 2025; NStarX Strategic Framework for Enterprise AI 2025; McKinsey State of AI 2025; Gartner IT Predictions 2026; MIT 2025 Enterprise AI Research; Stanford HAI Proprietary Data Research; Cloudera AI Convergence Report 2025; IBM Proprietary Data in GenAI 2025; Software Pricing Guide Enterprise AI Cost Analysis 2025; Matillion RAG vs Fine-Tuning Guide; Cyfuture Fine-Tuning Cost Analysis 2026; SitePoint Local LLM vs Cloud API TCO 2026; Kai Waehner Enterprise Agentic AI Landscape 2026.
