Hello and welcome back.
The software industry spent the 2010s worshiping at the altar of gross margins. Cloud infrastructure companies with 70% margins commanded premium valuations while services businesses with 40% margins were dismissed as lifestyle companies. This made intuitive sense. Software scaled infinitely with near-zero marginal costs. High margins signaled escape velocity from the gravitational pull of human labor.
By 2020, the margin obsession had calcified into orthodoxy. Investors used gross margin as a proxy for quality. Founders optimized around it religiously. The logic seemed unassailable: margins reflected operational excellence, pricing power, and the distance between your business and something that required warm bodies to scale.
But margins are a lagging indicator. They tell you what happened, not what will happen. They measure efficiency within an existing business model, not the durability of that model against future competition. The conflation of high margins with strong moats created a blind spot that the AI era is now exploiting ruthlessly.
The Software Era’s Margin Illusion
The 2020 consensus emerged from a specific economic reality. SaaS had standardized around predictable unit economics. Customer acquisition cost, lifetime value, payback periods. The playbook was mature. Companies competed on execution, not invention. In that environment, margins became the clearest signal of operational maturity.
Several factors reinforced this belief system. First, the marginal cost of serving an additional software customer approached zero. Each new user represented nearly pure profit after fixed costs were covered. Second, the market had eliminated most sources of friction. Cloud infrastructure was commoditized. Sales and marketing followed repeatable patterns. Product development could be outsourced or automated.
Third, and most important, competition moved slowly. Building a competing SaaS product took 18 to 24 months minimum. Recruiting engineers, designing interfaces, achieving feature parity, establishing credibility. The long gestation period gave incumbents time to compound advantages. High margins in this context actually did signal defensibility because the barriers to entry were structural and time-based.
But this created a conceptual trap. Markets began treating margins as intrinsic indicators of moat strength rather than artifacts of a specific competitive environment. A company with 80% gross margins must be more defensible than one with 50% margins, the thinking went, because higher margins meant more pricing power, better unit economics, and greater distance from commoditization.
This logic broke down when you examined actual competitive dynamics. Companies with identical margin profiles showed wildly different survival rates when faced with well-funded challengers. Slack and Zoom both had exceptional margins but responded very differently to Microsoft’s competitive pressure. HubSpot maintained steady margins while competitors churned. Margins predicted operational efficiency but not strategic resilience.
The error was assuming that what made sense in a mature, slow-moving software market would generalize across all technology businesses. High margins worked as a signal when the underlying business model was stable and competition was execution-driven. But they said nothing about what would happen when the model itself became vulnerable.
The AI Recalibration
The introduction of capable foundation models in late 2022 created a different economic regime. Suddenly, building functional products became faster but operating them became more expensive. The traditional SaaS cost structure inverted. Instead of high fixed costs and low marginal costs, AI-native products had lower fixed costs (less engineering time to launch) but higher marginal costs (inference expenses per user action).
This inversion changed what margins signaled. In the software era, low margins suggested either poor unit economics or a services component that limited scale. In the AI era, low margins in early products often indicate high usage intensity and deep integration of AI capabilities. A company with 40% margins might be processing dozens of model calls per user session, suggesting genuine utility rather than a thin wrapper around an API.
Consider the economics of a modern AI assistant versus a traditional SaaS tool. The SaaS product costs roughly the same to serve the 100th customer as the 1,000th. The AI assistant’s costs scale with usage intensity. A power user generating 500 queries per day costs 50x as much as a casual user with 10 queries. This makes margin optimization a different problem entirely.
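The inversion is easy to see in a toy cost model. All the numbers below are hypothetical, chosen only to show the shape of the two curves: SaaS per-user cost converges toward zero as fixed costs amortize, while AI per-user cost tracks query volume regardless of how many users you have.

```python
# Illustrative comparison of serving costs: traditional SaaS vs. an
# AI-native product whose cost scales with per-user query volume.
# All dollar figures are invented for illustration.

SAAS_FIXED_COST = 50_000          # monthly fixed infrastructure, amortized
SAAS_MARGINAL_COST = 0.05         # per user per month, near zero

INFERENCE_COST_PER_QUERY = 0.002  # assumed blended cost per model call

def saas_cost_per_user(n_users: int) -> float:
    """Classic SaaS: fixed costs dominate, marginal cost is near zero."""
    return SAAS_FIXED_COST / n_users + SAAS_MARGINAL_COST

def ai_cost_per_user(queries_per_day: int, days: int = 30) -> float:
    """AI-native: cost tracks usage intensity, not user count."""
    return queries_per_day * days * INFERENCE_COST_PER_QUERY

# SaaS cost per user collapses with scale...
print(f"SaaS, 100 users:    ${saas_cost_per_user(100):.2f}/user/mo")
print(f"SaaS, 10,000 users: ${saas_cost_per_user(10_000):.2f}/user/mo")

# ...while AI cost diverges with usage: a power user at 500 queries/day
# costs 50x a casual user at 10 queries/day, at any scale.
casual = ai_cost_per_user(10)
power = ai_cost_per_user(500)
print(f"Casual user: ${casual:.2f}/mo, power user: ${power:.2f}/mo "
      f"({power / casual:.0f}x)")
```

The point of the sketch is that in the AI column, adding users does nothing by itself to improve unit economics; only changing the cost per query does.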
The companies that launched AI products in 2023 and 2024 faced a choice. They could artificially constrain usage to preserve margins, or they could accept lower margins to maximize product utility and user lock-in. The correct strategic choice depended on their theory of how moats would be built in this new environment. Companies betting on distribution and user retention accepted margin pressure. Companies betting on maintaining premium positioning kept usage constrained.
By mid-2024, evidence was emerging that usage intensity created its own form of defensibility. Products that became daily habits proved resistant to switching even when cheaper alternatives emerged. Users who integrated AI tools into workflows developed cognitive dependencies that high-margin competitors couldn’t easily replicate. The margin sacrifice bought something more valuable than short-term profitability: behavioral lock-in.
This doesn’t mean margins are irrelevant. A company with 20% margins and no path to improvement has a structural problem. But early margin pressure in pursuit of usage depth and retention can be a strategic choice rather than a failure. The question is whether the margin sacrifice is building something durable.
Computational Moats in Practice
The AI era hasn’t eliminated traditional moats. It has changed how they manifest and which ones matter most. Understanding this requires moving beyond the classic categories into more granular analysis of what actually prevents customers from switching and competitors from catching up.
Operational Gradients
Traditional software improved through deliberate feature development. Engineers identified user needs, wrote code, shipped updates. Improvement was linear and labor-intensive. AI-native products can improve automatically through usage if designed correctly. Every interaction generates data that can refine models, improve relevance, or optimize workflows.
This creates operational gradients: systems that get better continuously without proportional increases in engineering effort. Perplexity’s search quality improves as it sees more queries and learns which sources users find valuable. The improvement isn’t just from retraining models but from understanding query patterns, result quality, and user satisfaction signals at scale.
The key distinction is whether improvement happens automatically or requires manual intervention. A product that collects usage data but requires engineers to analyze it and ship updates hasn’t built an operational gradient. A product that feeds usage data directly into ranking algorithms, recommendation systems, or model fine-tuning has created a compounding advantage.
Waymo represents the clearest example. Each autonomous mile driven generates telemetry that improves routing, object detection, and edge case handling. Competitors can access the same foundation models and sensor hardware, but they can’t replicate 20 million miles of real-world driving data being continuously processed to refine the full stack. The operational gradient is the moat, not any single technical component.
This form of defensibility is difficult to fake. Products claim to use AI to improve over time, but most improvements still come from engineering sprints rather than automated learning. The test is whether the product meaningfully improves for a specific user between month one and month six without any code deployments. If yes, an operational gradient exists. If no, it’s traditional software with AI features.
Cognitive Switching Costs
Users develop mental models of how tools work. In traditional software, these models are relatively stable. You learn Photoshop’s interface and that knowledge transfers across versions. AI products create different switching costs because user behavior shapes model behavior, and that relationship is unique to each product.
A user who has spent six months training Claude or ChatGPT to understand their writing style, domain expertise, and communication preferences has built a cognitive relationship that doesn’t transfer to competitors. The model has adapted to their patterns. They’ve learned how to prompt effectively. That relationship has value independent of the underlying model’s capabilities.
This goes beyond simple preference. It’s closer to learning a language. You can switch to a different language model the same way you can switch to speaking a different human language, but the cognitive overhead is real. Your fluency drops. Your efficiency decreases. The switching cost isn’t financial but neurological.
Companies building collaboration layers on top of models amplify this effect. Notion’s AI knows your document structure. GitHub Copilot learns your coding patterns. These systems aren’t just processing requests; they’re accumulating context about how you work. That context is the moat, not the AI itself.
The practical implication is that the first products to achieve genuine daily usage in a category build cognitive moats that are difficult to overcome with purely technical superiority. Users will tolerate a 10-20% quality disadvantage in exchange for not relearning workflows and losing accumulated context. This wasn’t true in traditional software, where switching between similar tools was relatively friction-free.
Data Feedback Loops
The value of proprietary data has always been recognized as a moat. What’s changed is how data creates defensibility in AI systems. In the software era, data was primarily used for analytics and reporting. In the AI era, data directly improves product functionality through fine-tuning, retrieval augmentation, and model adaptation.
The critical distinction is between data that exists and data that actively improves the product. Many companies collect vast amounts of user data but gain minimal defensibility from it because the data doesn’t feed back into product improvement. The moat exists only when the data loop is closed: usage generates data, data improves models, better models drive more usage.
Flock Safety built this deliberately. Their fixed cameras capture vehicle data across entire neighborhoods. The network effect comes from coverage density allowing them to track vehicles across larger geographic areas than competitors. But the deeper moat is the data loop: more cameras generate more complete datasets, which improve license plate recognition and object detection in challenging conditions, which makes the product more valuable, which drives more installations.
Multi-agent workflows create particularly strong data loops. When multiple AI agents collaborate to complete tasks, the interaction data between agents becomes increasingly valuable. A system that coordinates multiple models to research topics, synthesize information, and generate reports learns from every coordination failure and success. That meta-level data about how to orchestrate AI systems is harder to replicate than individual model performance.
The test for whether data creates a real moat is time-based: how long would it take a competitor with better models to match your product’s quality? If the answer is “immediately,” your data isn’t creating defensibility. If the answer is “18 months of similar usage volume,” you’ve built something real.
Infrastructure Scale Advantages
Training frontier models requires hundreds of millions in capital expenditure. This creates obvious moat-like barriers but not in the traditional sense. Model capabilities tend to converge over 6-12 month periods as multiple labs reach similar performance levels. The defensibility isn’t in having the best model today but in having the infrastructure to continuously train better models and serve them efficiently.
OpenAI’s advantage isn’t GPT-4 itself; it’s the infrastructure that allowed them to train GPT-4, serve it to 100 million users, collect feedback, and train GPT-5. That infrastructure represents thousands of person-years of engineering work in distributed systems, model optimization, and inference efficiency. Competitors can build similar capabilities but not quickly and not cheaply.
This matters more for vertical AI companies than many realize. Running sophisticated AI products at scale requires solving hard infrastructure problems. Prompt caching, batching strategies, model selection, latency optimization, cost management. Companies that internalize these capabilities build durable advantages over competitors using third-party APIs because they can optimize the full stack.
Anduril exemplifies this in defense technology. Their moat isn’t individual hardware products but Lattice, the software platform that connects sensors, analyzes threats, and coordinates responses in real-time. Building that platform required solving problems about data fusion, low-latency processing, and reliable operations in contested environments. Those infrastructure advantages compound over time and are difficult to replicate.
The signal to watch is margin trajectory combined with scale. Companies with infrastructure moats see margins improve as they scale because they’ve internalized expensive operations that others outsource. Companies without infrastructure moats see margins compress at scale because they’re dependent on third-party providers whose costs remain fixed.
Memory and Workflow Moats
The most underestimated form of AI-era defensibility comes from products that embed themselves in users’ daily workflows and maintain persistent context. This isn’t about features or capabilities but about becoming part of how users think about their work.
Traditional CRM systems achieved this by owning customer data. Every sales interaction, every email, every deal stage lived in Salesforce. Switching meant data migration and workflow disruption. AI products create similar lock-in through memory systems that understand user preferences, past decisions, and work patterns. The difference is that this context is continuous rather than transactional.
A product that remembers your last 100 interactions and uses that memory to inform future responses creates switching costs that grow over time. The longer you use it, the more valuable it becomes because it understands your specific needs better. This is distinct from personalization in traditional software, which was typically rules-based and static.
The workflow moat emerges when products slot into existing processes and become difficult to remove without disrupting those processes. This isn’t about adding features but about becoming infrastructure. When a team uses an AI tool for every customer support ticket, every code review, or every research task, removing that tool requires rebuilding processes from scratch.
Companies building for this deliberately design around recurring workflows rather than one-off tasks. They optimize for frequency over breadth. They invest in memory systems that maintain context across sessions. They build integrations that make the product a natural part of existing tool chains rather than a separate destination.
The metric that matters is daily active usage in critical workflows. Products used daily for essential tasks build moats through habit formation and context accumulation. Products used weekly for optional tasks remain vulnerable no matter how good they are technically.
How AI Transforms Traditional Moats
The classical moat categories still apply but manifest differently in AI contexts. Understanding these transformations is essential for evaluating competitive positioning.
Economies of Scale (Inverted Then Restored)
Traditional software had pure economies of scale. Fixed costs spread across more users drove better unit economics in a linear, predictable way. AI products initially inverted this dynamic. The marginal cost per user increased with usage intensity, making scale potentially destructive to margins.
But this inversion is temporary. Companies that reach sufficient scale can reinvest in infrastructure that brings costs down dramatically. Custom silicon, model optimization, efficient serving infrastructure. These investments only make sense at scale, creating a J-curve where costs first increase with usage then decrease as scale enables infrastructure optimization.
The practical signal is the gross margin trajectory. Companies with real economies of scale in AI show declining margins initially, then inflection points where margins begin improving. Companies stuck at scale without margin improvement have failed to achieve genuine infrastructure advantages. They’re just reselling compute at a markup.
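The J-curve falls out of simple arithmetic. The model below is hypothetical: fixed revenue per user, and a choice between a flat third-party inference rate or owned infrastructure with a large fixed cost that amortizes over users while cutting the per-user serving cost. All figures are invented to show the crossover, not to describe any real company.

```python
# Toy model of the gross-margin J-curve: reselling third-party compute
# yields a flat margin at any scale, while owned infrastructure is
# margin-destructive at small scale and compounds past a crossover point.
# All numbers are hypothetical.

REVENUE_PER_USER = 20.0        # $/user/month, assumed fixed
API_COST_PER_USER = 12.0       # flat third-party inference rate
INFRA_FIXED = 2_000_000.0      # monthly cost of owned serving infrastructure
OWNED_COST_PER_USER = 3.0      # per-user cost once infrastructure is internalized

def margin_api(users: int) -> float:
    """Margin when reselling third-party compute: flat regardless of scale."""
    return (REVENUE_PER_USER - API_COST_PER_USER) / REVENUE_PER_USER

def margin_owned(users: int) -> float:
    """Margin with owned infra: deeply negative early, improving with users."""
    cost = OWNED_COST_PER_USER + INFRA_FIXED / users
    return (REVENUE_PER_USER - cost) / REVENUE_PER_USER

for users in (50_000, 150_000, 500_000, 2_000_000):
    print(f"{users:>9,} users: API margin {margin_api(users):5.0%}, "
          f"owned-infra margin {margin_owned(users):6.0%}")
```

In this sketch the API reseller sits at 40% forever; the infrastructure owner starts far underwater, crosses the reseller somewhere in the hundreds of thousands of users, and keeps climbing. That inflection point is the signal the paragraph above describes.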
Google’s advantage in AI isn’t primarily their model capabilities but their infrastructure for serving billions of queries efficiently. That infrastructure took decades and billions in capital to build. New entrants can’t replicate it without reaching comparable scale first, which requires surviving the period of negative unit economics.
Network Effects (Fragmented and Reconstituted)
Classic network effects made products more valuable as more people used them. Social networks, marketplaces, communication platforms. AI products create network effects but they manifest differently.
Single-player AI products don’t have traditional network effects. A better writing assistant doesn’t become more valuable because other people use it. But multi-player AI products, particularly those involving collaboration or coordination, create strong network effects. The value emerges not from other users existing but from the system learning to coordinate between users more effectively.
Notion’s AI becomes more valuable as teams use it because the system learns to surface relevant information across documents and users. The network effect isn’t social but operational. The product gets better at connecting information and coordinating work as it sees more examples of how teams collaborate.
Platform network effects also work differently in AI. GitHub Copilot benefits from training on public code repositories, but the network effect is primarily about model improvement rather than direct user interaction. More users generate more data about coding patterns, which improves suggestions for everyone. This creates a data network effect distinct from traditional social network effects.
The test is whether adding users improves the product for existing users. If yes, network effects exist. If users benefit only from their own usage, the product may be strong but network effects aren’t creating defensibility.
Brand (Redefined as Reliability)
Brand in software meant recognition and trust. Enterprise buyers chose Salesforce over competitors partially because the brand signaled safety. AI is redefining what brand means because the stakes are higher and the evaluation criteria are different.
AI brand is primarily about reliability and predictability. Users need confidence that outputs are accurate, safe, and appropriate. This matters more than features or capabilities because AI systems can fail in unpredictable ways. A model that’s correct 98% of the time but dangerously wrong 2% of the time is worse than a model that’s correct 95% of the time and predictably cautious the other 5%.
OpenAI built brand through consistent performance and thoughtful deployment. Users trust ChatGPT not because it’s always right but because it’s reliably useful and its failure modes are understood. That trust took years to build through careful product decisions and public interaction.
The brand moat in AI manifests as user willingness to adopt new products from trusted sources. When OpenAI launches a new capability, users try it immediately. When an unknown startup launches a competing feature, adoption is measured and cautious. That gap is brand value.
This suggests brand can be built faster in AI than traditional software because performance is immediately measurable. A product that consistently delivers accurate, useful results builds trust rapidly. But it can also be destroyed faster because AI failures are more public and more consequential than software bugs.
Switching Costs (Amplified Through Context)
Software switching costs came from data migration, user training, and integration complexity. AI adds another dimension: context loss. Users who have spent months training a system to understand their needs lose that investment when switching.
This creates switching costs that scale with usage duration and intensity. A user in month one can switch freely. A user in month twelve has accumulated context that would take months to rebuild with a competitor. This is especially true for products with long-term memory or learning components.
The switching cost multiplier is integration depth. Products that sit on top of existing workflows have low switching costs. Products that become infrastructure and touch multiple systems have high switching costs because replacement requires coordination across those systems.
Multi-agent systems create particularly high switching costs because users develop understanding of how agents interact and which agents to use for which tasks. That knowledge is specific to the platform and doesn’t transfer to competitors even if they have superior individual models.
Proprietary Data (Table Stakes, Not Differentiator)
Access to proprietary data was historically a strong moat. Companies with unique datasets could build products competitors couldn’t replicate. AI has both strengthened and weakened this dynamic.
Data is more valuable in AI because it directly improves model performance through fine-tuning and retrieval augmentation. A company with proprietary medical records, legal documents, or financial transactions can build domain-specific models that generic foundation models can’t match.
But data alone is insufficient because foundation models have become so capable that proprietary data advantages erode quickly. A competitor with good synthetic data generation, strong foundation models, and clever prompting can often match 80% of the performance of a proprietary data-trained model. That 20% gap matters in some domains but not all.
The defensibility comes from combining proprietary data with other moats. Data plus network effects plus switching costs creates durability. Data alone just delays commodity status. Companies that treat data as their only moat are vulnerable to model improvements that reduce the value of fine-tuning.
The practical test is model performance trajectory. If your proprietary data advantage is real, the gap between your product and generic alternatives should remain constant or grow as you collect more data. If the gap is shrinking over time, foundation models are catching up and your data moat is eroding.
Differentiated Technology (Temporary Unless Continuously Renewed)
Technical differentiation matters but its half-life has shortened dramatically. In the software era, technical advantages lasted years because reimplementing complex systems took time. In AI, technical advantages can be reproduced in months because the underlying components are increasingly commoditized.
A novel architecture, training technique, or model design provides temporary advantage but not durable moat unless continuously refreshed. The exception is when technical differentiation compounds through usage. If your technology improves automatically as more people use it, the advantage can be durable.
The key question is whether technical differentiation is in the foundation or the application layer. Foundation model improvements tend to be replicable because the research is published and talent is mobile. Application-layer improvements, especially those involving domain-specific integration or user experience optimization, can be more durable because they’re not easily extracted and reproduced.
OpenAI’s technical moat isn’t any single model but their velocity in shipping improvements and their infrastructure for training successively better models. That velocity is the defensible asset, not GPT-4 or GPT-5 individually. Competitors can match model capabilities but matching the pace of improvement requires comparable infrastructure and talent density.
The Distribution Override
Here’s a contrarian insight that founders often miss: in AI products, distribution advantages can be more durable than technical advantages. This inverts the conventional wisdom of the 2010s, when technical superiority was supposed to compound faster than distribution advantages.
The reason is speed of technical convergence. When frontier models improve every six months and capabilities spread across labs within 12 months, maintaining technical leadership requires enormous ongoing investment. Foundation model companies can sustain this. Application-layer companies rarely can.
Distribution advantages, by contrast, compound naturally. Once users adopt your product, you have direct access to usage data, feedback, and attention. You can iterate faster because you’re learning from real usage rather than hypothetical use cases. You can build network effects and switching costs that technical superiority alone can’t overcome.
Microsoft’s AI strategy exemplifies this. Their models aren’t always technically superior to OpenAI’s, but their distribution through Office and GitHub gives them direct access to billions of dollars of incumbent spending. They can be 10-20% worse technically and still capture significant market share because the switching cost from Excel to a new product is enormous.
This creates a challenging dynamic for startups. Building technical differentiation is expensive and has a short half-life. Building distribution is slow but creates compounding advantages. The optimal strategy may be to build enough technical differentiation to achieve product-market fit, then shift investment aggressively toward distribution before technical advantages erode.
The evidence supports this. Successful AI startups in 2024-2025 aren’t the ones with the best models. They’re the ones that achieved distribution velocity in specific niches before competitors recognized the opportunity. Perplexity didn’t have better search technology than Google. They had better distribution strategy for users who wanted AI-native search.
Momentum vs. Defensibility
Another underappreciated distinction: momentum creates growth but doesn’t guarantee defensibility. This matters because AI markets are showing extreme momentum dynamics. Products go from zero to millions of users in months. That growth feels like validation but doesn’t necessarily indicate moat formation.
Momentum comes from being first with a compelling use case. In early 2023, any competent ChatGPT wrapper could acquire users because the alternative was nothing. That user acquisition didn’t build moats because users had no switching costs and no accumulated context. When better alternatives emerged, users switched freely.
Defensibility comes from converting momentum into structural advantages. Users become sticky. Usage patterns create data advantages. Distribution channels deepen. Technology improves through operational gradients. These transformations take time and intentional strategy. They don’t happen automatically just because growth is fast.
The trap is conflating growth with moat formation. Fast growth buys time to build moats, but that time is finite. If you’re growing 20% month-over-month but not building structural advantages, you’re just postponing competition. When funded competitors arrive with comparable products, your early users have no reason to stay.
This explains why some 2023 AI startups that achieved extraordinary early growth have struggled to maintain defensible positions. They captured momentum but didn’t convert it into moats quickly enough. By the time they recognized the problem, competitors had reached feature parity and users had no compelling reason to stay.
The diagnostic question for founders is simple: if a competitor launched tomorrow with an identical product but 20% better performance, what percentage of your users would stay? If the answer is below 80%, you haven’t built real defensibility yet. You have momentum, which is valuable but temporary.
Early Margin Pressure as Signal
Here’s a non-obvious insight that cuts against conventional wisdom: early margin pressure in AI products can be a positive signal rather than a problem, provided it’s the right kind of margin pressure.
Margin pressure from high usage intensity suggests users find genuine value. If users are generating hundreds of requests per session, that’s evidence of product necessity rather than casual experimentation. The cost is real but the underlying usage behavior indicates potential for defensibility through cognitive switching costs and workflow integration.
Margin pressure from inefficient infrastructure is different. If costs are high because you’re using expensive third-party APIs without optimization, that’s a problem. The solution is infrastructure investment, not usage restriction. But many companies optimize the wrong direction: they constrain usage to preserve margins rather than investing in infrastructure to support usage.
The strategic choice matters enormously. Constraining usage to maintain 70% margins might protect short-term unit economics but surrenders the chance to build usage-based moats. Accepting 40% margins while optimizing infrastructure and maximizing usage might build something more valuable: daily habits and cognitive dependencies.
This doesn’t mean burning cash on customer acquisition or subsidizing unsustainable usage. It means distinguishing between margin pressure from value delivery and margin pressure from poor unit economics. The former can be strategic. The latter is always a problem.
The companies that will win long-term are those that achieve high usage intensity in critical workflows and then systematically reduce costs through infrastructure investment. That requires accepting margin pressure in year one and year two to build defensibility that enables margin expansion in year three and beyond.
The key metric is cohort margin trajectory. Are users from older cohorts more or less profitable than newer users? If margins improve with cohort maturity, you’re on the right path. If margins stay flat or decline, you have a structural problem.
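As a concrete illustration of cohort margin trajectory, here is a minimal sketch in plain Python. The records and numbers are hypothetical, invented purely to show the calculation: gross margin per signup cohort at each month of tenure, so you can see whether older cohorts become more profitable as infrastructure investment pays off.

```python
from collections import defaultdict

# Hypothetical records: (cohort, months_since_signup, revenue, serving_cost).
# All figures are illustrative, not real benchmarks.
records = [
    ("2024-01", 0, 100.0, 80.0),  # early cohort at month 0: thin 20% margin
    ("2024-01", 6, 120.0, 60.0),  # same cohort at month 6: margin expanded to 50%
    ("2024-06", 0, 100.0, 75.0),  # newer cohort at month 0: 25% margin
]

def cohort_margins(rows):
    """Return gross margin keyed by (cohort, months_since_signup).

    Margin improving as months_since_signup grows means cohorts get more
    profitable with maturity -- the trajectory the text describes as healthy.
    """
    totals = defaultdict(lambda: [0.0, 0.0])  # (cohort, age) -> [revenue, cost]
    for cohort, age, revenue, cost in rows:
        totals[(cohort, age)][0] += revenue
        totals[(cohort, age)][1] += cost
    return {key: (rev - cost) / rev for key, (rev, cost) in totals.items()}

margins = cohort_margins(records)
```

In this toy data, the 2024-01 cohort moves from a 20% margin at signup to 50% at month six, which is the improving trajectory the paragraph above calls for; flat or declining values across ages would indicate the structural problem.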
The Enterprise Complexity Trap
AI products face a unique challenge in enterprise markets. Traditional enterprise software succeeded by adding features, integrations, and customization to justify high prices. But AI products that follow this playbook often destroy their core value proposition in the process.
The problem is that AI’s value comes from simplicity and automation. Users want products that work instantly without configuration. Enterprise buyers want control, customization, and integration with existing systems. These requirements are in direct tension.
Companies that over-index on enterprise requirements end up building complex products that lose the AI-native advantage. They add permission systems, approval workflows, and customization options that make the product harder to use and slower to improve. The product becomes defensible through switching costs but loses the growth characteristics that made it valuable initially.
The alternative is to maintain product simplicity and accept that some enterprise buyers won’t adopt. This limits short-term revenue but preserves long-term velocity. Products that stay simple can iterate faster, ship improvements more frequently, and serve a broader market. The trade-off is real but often worth making.
Slack’s early strategy exemplifies this. They resisted enterprise feature requests that would have slowed product velocity. They grew through bottom-up adoption in engineering teams rather than top-down enterprise sales. Only after achieving critical scale did they layer in enterprise features. By then, the usage patterns and switching costs were strong enough that the added complexity didn’t destroy the core value proposition.
The diagnostic for founders is product velocity. Are you shipping meaningful improvements every month? If enterprise features are slowing that velocity, you’re trading long-term defensibility for short-term revenue. That trade might be necessary but should be made consciously and strategically.
Diagnostic Questions for Founders
Evaluating moat progress requires specific, grounded questions rather than generic assessment. Here’s a framework for honest evaluation:
Would a competitor with equal model access catch up within 90 days? If yes, you don’t have real defensibility yet. Your advantage is temporary and fragile. This isn’t necessarily wrong in month six, but it should be concerning in month 24. The question forces clarity about whether your moat is technical, operational, or distribution-based.
Is your product improving automatically with increased usage? Not “could it improve theoretically” but “is it measurably better for users today than six months ago without significant code changes?” If improvements require engineering sprints, you’re building traditional software with AI features. If improvements come from usage-driven learning, you’re building something with operational gradients.
What does a user lose if they switch away from your system? Be specific. Not “they would lose their data” because data is exportable. But “they would lose six months of context about their preferences, accumulated shortcuts, and the system’s understanding of their domain.” Quantify the rebuilding time. If it’s less than two weeks, switching costs are minimal.
Does the system generate data that reinforces itself? This is different from collecting data. It’s about whether the data directly improves the product in ways that compound over time. A positive feedback loop means more usage generates better data which improves the product which drives more usage. Absent this loop, you’re accumulating data but not building moats.
Are your users developing product-specific fluency? Watch for evidence that users are learning how to interact with your product more effectively over time. Are they writing better prompts? Using advanced features? Integrating the product deeper into workflows? This fluency represents cognitive switching costs that transfer value from the underlying models to your product layer.
Do users describe your product as essential or optional? Track the language users employ. “I use it daily” is different from “It’s helpful when I remember it exists.” Essential products have high activation energy for removal. Optional products can be turned off without disruption. Essentiality comes from workflow integration, not just utility.
Can you articulate why users wouldn’t switch to a free competitor? If your answer relies primarily on “we have better features,” that’s temporary. If your answer includes “they would lose accumulated context, integrated workflows, and team coordination patterns,” that’s structural. The switch-resistance test reveals whether you’re building features or moats.
Are your best users from month one or month twelve? If month one users are your best users, the product isn’t improving with usage or building retention mechanisms. If month twelve users are significantly more valuable, engaged, and retained, something is working. The cohort comparison reveals whether time in product creates accumulating advantages.
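The cohort comparison above can be made concrete with a small sketch. The session counts below are hypothetical, chosen only to show the mechanics: bucket current engagement by user tenure and check whether long-tenured users are meaningfully more active than new ones.

```python
from statistics import mean

# Hypothetical current-month session counts per user, bucketed by tenure in months.
# Illustrative numbers only.
sessions_by_tenure = {
    1:  [3, 5, 2, 4],      # users in their first month
    12: [22, 18, 25, 19],  # users in their twelfth month
}

def tenure_uplift(buckets, young=1, mature=12):
    """Ratio of mature-cohort engagement to young-cohort engagement.

    A ratio well above 1 suggests time in product creates accumulating
    advantages; a ratio near or below 1 suggests it does not.
    """
    return mean(buckets[mature]) / mean(buckets[young])

uplift = tenure_uplift(sessions_by_tenure)  # 21.0 / 3.5 = 6.0 on this toy data
```

A sixfold uplift like the toy figure here would be strong evidence that month-twelve users are the best users; the interesting real-world question is whether your own ratio is above or below 1.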
What would $10 million in competitor spending accomplish? Be realistic about what funded competition could achieve. Could they match your product in six months? Would users have compelling reasons to stay anyway? This forces honesty about whether your advantages are real or just head starts that capital can overcome.
Does margin structure support your moat thesis? If you believe usage intensity creates defensibility but you’re restricting usage to preserve margins, your actions contradict your strategy. If you believe infrastructure scale creates advantages but you’re outsourcing core operations to third parties, you’re not building what you claim. Margin structure should align with moat strategy.
The Next Five Years
Predicting the future is hazardous but some patterns seem increasingly clear. The next five years of AI defensibility will be defined by several shifts that are already visible in early form.
First, foundation model capabilities will continue converging rapidly. By 2027, the performance gap between leading labs will be minimal for most commercial applications. This doesn’t mean models won’t improve; it means improvement will benefit everyone approximately equally. Defensibility will shift entirely to the application layer, distribution, and data feedback loops.
Second, infrastructure advantages will matter more than model advantages for most companies. The organizations that internalize inference optimization, fine-tuning infrastructure, and evaluation systems will compound advantages over those relying on third-party APIs. This is already evident in companies like Perplexity and Cursor that have built deep technical stacks rather than thin wrappers.
Third, usage intensity will become the primary leading indicator of defensibility. Products that achieve daily, workflow-critical usage will build durable moats through cognitive switching costs and context accumulation. Products that remain occasional-use tools will struggle to defend against competition regardless of technical quality.
Fourth, distribution advantages will compound faster than expected. Companies with existing user relationships, attention, or integration points will leverage AI to deepen those advantages. New entrants will need to find distribution innovation, not just technical innovation, to compete effectively.
Fifth, margins will cease being a primary valuation signal. Markets will focus instead on usage metrics, retention curves, and evidence of structural moats. Companies with 40% margins and strong usage-based defensibility will command premium valuations over companies with 70% margins and weak retention.
Sixth, vertical integration will reemerge as a competitive advantage. Companies that own their infrastructure, train their own models, and control their distribution will outcompete those that assemble commoditized components. This reverses the unbundling trend of the 2010s and returns to integrated stack advantages.
The companies that will win are those that recognize these shifts early and restructure their strategies accordingly. That means accepting margin pressure to maximize usage. Building proprietary infrastructure even when APIs seem sufficient. Optimizing for daily retention over monthly revenue. Investing in data feedback loops over feature development.
Most importantly, it means thinking about moats as dynamic systems rather than static advantages. The question isn’t “what moat do you have today” but “what moats are you building through your daily operations.” Companies that build systems where defensibility accumulates automatically through normal operation will compound advantages faster than those that treat moats as separate initiatives.
The AI era hasn’t eliminated the importance of defensibility. It has changed what defensibility looks like, how quickly it must be built, and what signals indicate its presence. Margins matter but they’re no longer the primary indicator. Usage patterns, infrastructure depth, and cognitive switching costs matter more. Companies that internalize this shift will build durable businesses. Those that don’t will discover that high margins provide no protection against obsolescence.
P.P.S. If you know someone on a product, growth, or finance team who would find this useful, feel free to forward it. And if this was forwarded to you, you can subscribe here for more grounded analysis on how tech companies actually grow.
