How AI changed in five years (and what it means for your business)
From GPT-3 to autonomous agents: a practitioner's account of what actually shifted, what's still hype, and why the window for early advantage is shorter than you think.
I started building AI automations before most people had heard of ChatGPT. The tools were rougher. The outputs were inconsistent. The use cases were narrower and required more duct tape. But something was clearly happening, and most businesses had no idea.
Five years later, the situation is different. The tools are better by orders of magnitude. And most businesses still have no idea.
Here's what actually changed — not the hype version, the practitioner version.
The Three Shifts
I think about the last five years in three phases. They blend into each other, but naming them is useful because each one unlocked a different category of business application.
Shift 1 — Capability (roughly 2020–2022):
The question in this period was: *can it do the thing at all?*
GPT-3 dropped in mid-2020 and it was immediately obvious that something real had happened. You could prompt it to write a marketing email and it would produce a recognizable marketing email. You could ask it to summarize a document and it would summarize the document.
The outputs were imperfect. They required heavy editing. They hallucinated freely and with enthusiasm. But the baseline capability — generating coherent, contextually relevant language — was there.
Business use in this period was mostly experimental. Early adopters were using it for content ideation, rough drafts, basic code assistance. Nobody was putting it near customers or critical workflows without heavy supervision.
Shift 2 — Reliability (roughly 2023–2024):
The question in this period became: *can we trust it with real tasks?*
GPT-4 and the generation of models that followed it closed an enormous gap. Outputs were not just coherent, they were accurate enough to use in production contexts — with the right guardrails. More importantly, the tooling ecosystem grew up. APIs stabilized. Prompt engineering matured from a dark art into something teachable. Vector databases made it possible to give models access to your company's actual documents without fine-tuning.¹
This is when I started building systems I'd let touch customers. Lead qualification agents that could handle 80% of inbound conversations without a human. Document processors that extracted structured data from unstructured PDFs. Automated follow-up sequences that felt personal because they actually referenced specific details from prior conversations.
The failures were still real, but they were predictable. You could design around them.
Shift 3 — Affordability (2025+):
The question now is: *why would you not use it?*
The cost argument against AI automation has largely collapsed. Running GPT-4-class intelligence costs fractions of a cent per interaction. The open-source models that run on commodity hardware are legitimately capable for many business tasks. The no-code and low-code platforms — Make, n8n, Voiceflow, and a dozen others — mean you don't need a software engineering team to build production workflows.
For any business task that involves reading, writing, classifying, summarizing, or responding, the question is no longer whether AI can do it or whether you can afford it. The question is whether you've gotten around to building it yet.
"The cost argument against AI automation has largely collapsed. The barrier left is organizational, not economic."
What's Still Hype
I want to be honest about what hasn't happened, because the hype-to-reality gap is still significant in certain directions.
Fully autonomous agents are not ready for most business workflows. The vision of an AI that independently manages complex multi-step projects — making judgment calls, handling exceptions, escalating appropriately — is real in demos and fiction in production. Current agents are reliable for narrow, well-defined tasks. They fall apart in proportion to how much ambiguity and edge-case judgment is required.²
AI that replaces senior knowledge workers is not the near-term story. AI that replaces the junior, repetitive portions of senior knowledge workers' jobs is very much the near-term story. The leverage is in reclaiming the hours, not replacing the person.
Trend prediction and forecasting remain weak spots. AI is not a crystal ball. It's a pattern matcher, and patterns break at precisely the moments you most need predictions. Anyone selling you "AI-powered demand forecasting" that implies the model will catch the next black swan is selling you something that doesn't exist yet.
The Early Advantage Window
Here's the uncomfortable truth about timing: the window for differentiation is real, but it's not infinite.
In 2022, building an AI-powered customer service workflow was a genuine competitive moat. Almost nobody was doing it. If you were in a market where your competitors were still handling all support manually, the efficiency advantage was enormous.
In 2026, that advantage is shrinking — not because the technology matured, but because adoption is accelerating. The software vendors in your industry are building AI features directly into their platforms. Your competitors are starting to figure it out. The question is shifting from whether to adopt to how fast and how well.
What to Do With This
If you've been watching this space and waiting for the right moment — this is the closest thing to a right moment you're going to get. Not because the technology is done improving (it isn't), but because the tools are mature enough to build reliable things, cheap enough to justify the investment, and new enough that you still have a head start in most markets.
The specific advice: don't start with the flashiest use case. Start with the highest-volume, lowest-judgment task that touches your customers or your team every day. That's where the ROI is fastest and the risk is lowest.³
And when the next capability shift arrives — because it will — you'll have the operational muscle to absorb it. That's the real asset.
¹ Retrieval-augmented generation (RAG) is the technique of giving a model access to a searchable database of your own documents at inference time, rather than baking the knowledge into the model via fine-tuning. It's more flexible, cheaper, and keeps your data out of the model's weights. Most enterprise AI applications use some version of this.
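The shape of that retrieval step can be sketched in a few lines. This is a toy illustration only: it scores chunks by word overlap where a real system would use embedding vectors and a vector database, and the function names and sample documents are mine, not from any particular library.

```python
# Toy sketch of the retrieval step in RAG: score each document chunk
# against the question, then build a prompt from the best matches.
# Word-overlap cosine similarity stands in for real embeddings here.
import math
from collections import Counter

def vectorize(text: str) -> Counter:
    """Bag-of-words term counts (a stand-in for an embedding vector)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = vectorize(question)
    return sorted(chunks, key=lambda c: cosine(q, vectorize(c)), reverse=True)[:k]

def build_prompt(question: str, chunks: list[str]) -> str:
    context = "\n".join(retrieve(question, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

docs = [
    "Refunds are processed within 5 business days of approval.",
    "Our office is closed on public holidays.",
    "Shipping to EU countries takes 3 to 7 business days.",
]
print(build_prompt("How long do refunds take?", docs))
```

The point is the pipeline shape: retrieve at question time, inject into the prompt, and the model answers from your documents without any fine-tuning.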
² The benchmark for agent reliability in my experience: if the task has more than five distinct decision points, test extensively before deploying without human oversight. Edge cases multiply faster than you expect.
³ A useful heuristic — if a human doing the task looks at it as pure processing (no real judgment, just applying rules to inputs), AI can probably handle it with minimal supervision. If the human applies anything they'd describe as "professional judgment," build in a review step.
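That heuristic translates directly into a routing rule. A minimal sketch, assuming a confidence score and a judgment flag are available per task (the field names and threshold are illustrative, not from any specific platform):

```python
# Sketch of a human-review gate: route low-confidence or judgment-flagged
# outputs to a review queue instead of sending them straight out.
from dataclasses import dataclass

@dataclass
class DraftReply:
    text: str
    confidence: float     # scored confidence in the draft
    needs_judgment: bool  # task requires "professional judgment"

def route(reply: DraftReply, threshold: float = 0.85) -> str:
    """Pure processing above the threshold ships automatically;
    everything else waits for a human."""
    if reply.needs_judgment or reply.confidence < threshold:
        return "review_queue"
    return "auto_send"

print(route(DraftReply("Your refund is approved.", 0.95, needs_judgment=False)))
print(route(DraftReply("We may waive the fee.", 0.95, needs_judgment=True)))
```

The design choice worth noting: judgment-flagged tasks go to review regardless of confidence, because the heuristic is about the kind of task, not how sure the model feels about it.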