A conversation with Kevin Sampson, first data hire at Vertex Service Partners, on what it actually takes to make AI useful inside a real business.
A few weeks ago, at a meetup in front of a couple hundred people, a speaker on the main stage said something I could not shake.
“AI does not need a semantic layer.”
It is a clean line. In most real businesses, it is also wrong.
I tested the idea on Kevin Sampson. Kevin built Vertex Service Partners’ data and analytics platform from zero. Before Vertex, he spent four and a half years at Amazon doing business intelligence engineering (BIE) work inside its logistics organization. Vertex itself is a private-equity-backed roll-up of a couple dozen residential and commercial roofing brands, and Kevin was its first data hire. If anyone has felt the shape of this problem from both sides, hyperscaler and traditional industry, it is him.
His answer was more careful than a yes or no. The careful version is the whole story.
The hot take, and where it falls apart
The argument for skipping a semantic layer goes like this. Modern LLMs are good enough that, given raw tables and column names, they can work out what you are asking. You do not need a translation layer in the middle. Point Claude or GPT at the warehouse and let it run.
Kevin’s pushback was quiet and firm. The hot take only works in a business that is trivially simple.
“If you track sales the same way as the industry standard, revenue the same way, leads the same way, then you might get away with it. But if your business is not like that, and most businesses are not, you need a semantic layer.”
The reason is something every data engineer has lived through and rarely names cleanly. Businesses do not share a vocabulary. They share enough vocabulary to talk past each other in meetings, then quietly mean very different things by the same words.
The lingo problem is real, and AI makes it worse
Vertex is a textbook case. Roll up a couple dozen roofing companies, and you get three kinds of language colliding.
The traditional roofing terms each acquired company has used for decades. The business side terms layered on top by people who did not grow up in roofing. And all the gray area in between, where the same metric goes by three names depending on who you ask.
Switzerland is a useful comparison. When valleys do not talk to each other, dialects diverge. Companies inside a roll-up do the same thing. They were isolated, so they developed their own words for the same roof. Without a semantic layer, an LLM has no way to know that what one brand calls a “lead conversion,” another calls a “signed deal,” and finance calls “booked revenue.”
It gets sharper when you hit metrics that only exist because of how this specific business operates. Kevin gave me one example. In residential roofing sales, there is a concept of a two-legged versus a one-legged visit. The belief, backed up in the data, is that a salesperson is much more likely to close when both household decision makers are home, because nobody can stall with “let me talk to my wife and get back to you.” That is a metric you cannot derive from column names. It is a definition that lives in the heads of people who have sold roofs for twenty years. Without a place to write it down formally, with synonyms and conditions, your LLM is guessing. Confidently.
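To make “write it down, formally, with synonyms and conditions” concrete, here is a minimal Python sketch of what such a definition could look like. Everything in it is an assumption for illustration: the field names (`decision_makers_home`, `closed`), the synonyms, and the sample visits are made up and are not Vertex’s schema or data.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Visit:
    decision_makers_home: int  # hypothetical field: decision makers present at the visit
    closed: bool               # hypothetical field: did the visit end in a signed job


@dataclass(frozen=True)
class SegmentDefinition:
    name: str
    synonyms: tuple[str, ...]
    description: str
    condition: Callable[[Visit], bool]  # the rule that column names alone cannot tell you


TWO_LEGGED = SegmentDefinition(
    name="two_legged_visit",
    synonyms=("two legger", "both decision makers home"),  # illustrative synonyms
    description="Sales visit where both household decision makers are present.",
    condition=lambda v: v.decision_makers_home >= 2,
)


def close_rate(visits: list[Visit], segment: SegmentDefinition) -> float:
    """Share of visits in the segment that closed; 0.0 if the segment is empty."""
    matching = [v for v in visits if segment.condition(v)]
    if not matching:
        return 0.0
    return sum(v.closed for v in matching) / len(matching)


# Made-up visits, purely to exercise the definition.
sample_visits = [
    Visit(2, True), Visit(2, True), Visit(2, False),
    Visit(1, False), Visit(1, True), Visit(1, False), Visit(1, False),
]
two_legged_rate = close_rate(sample_visits, TWO_LEGGED)
```

The point of the sketch is that the condition and the synonyms live next to the metric name, in one place, instead of in someone’s head.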
Confidently wrong is the worst failure mode
This is the part that should keep data engineers up at night.
LLMs do not hedge well. They produce confident, well-structured answers regardless of whether the underlying data model agrees with the question. If a stakeholder asks “how did we do on conversion last quarter,” and the model picks the wrong definition of conversion, you do not get an error. You get a clean, plausible answer that is just wrong.
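One way to turn that silent failure into a loud one is to resolve business terms through a glossary before any query is written, and to refuse when a term has more than one meaning. The sketch below assumes a hypothetical glossary; the metric names and the `resolve_metric` helper are invented for illustration and do not come from any particular semantic layer product.

```python
class AmbiguousTermError(ValueError):
    """Raised when a business term maps to more than one metric definition."""


# Hypothetical glossary: business term -> canonical metric names it could mean.
GLOSSARY: dict[str, list[str]] = {
    "conversion": ["lead_to_appointment_rate", "appointment_to_signed_rate"],
    "booked revenue": ["booked_revenue"],
    "signed deal": ["booked_revenue"],
}


def resolve_metric(term: str, glossary: dict[str, list[str]] = GLOSSARY) -> str:
    """Return the single canonical metric for a term, or fail instead of guessing."""
    key = term.strip().lower()
    candidates = glossary.get(key, [])
    if not candidates:
        raise KeyError(f"No definition registered for {term!r}")
    if len(candidates) > 1:
        raise AmbiguousTermError(
            f"{term!r} could mean any of {candidates}; ask which one is intended"
        )
    return candidates[0]
```

Here `resolve_metric("signed deal")` returns `"booked_revenue"`, while `resolve_metric("conversion")` raises `AmbiguousTermError`. An error the stakeholder has to answer is a better outcome than a plausible number built on the wrong definition.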
Increasingly, operators are taking those answers as truth.
Kevin put it this way. There is a real risk we are heading into a world where operators have no idea how the business actually works. They are being served pre-chewed answers. They have stopped doing the twenty-minute drill-down into raw data, the trip back to the CRM, the sanity check against an SOP. The critical thinking lens is quietly disappearing.
A semantic layer does not fix this on its own. Without one, though, you have removed the only mechanism that grounds an LLM in your business’s actual definitions. You have made confidently wrong the default.
Semantic views are the floor, not the ceiling
Here is where the practical bit gets uncomfortable.
dbt’s semantic views, MetricFlow, the various metric layer offerings on the market. They are a good floor. A metric. A few synonyms. A definition. That is the baseline.
But Kevin made a point a lot of teams underestimate.
“It is an entire thirty years of experience being distilled into a README file. That is a lot.”
The institutional knowledge that makes a metric correct is not two lines. It is a paragraph. Sometimes a life story. Why this WHERE clause exists. Why we exclude these job IDs. Why the assumption baked in three years ago is still load-bearing, or is quietly wrong now and nobody has challenged it.
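As a rough illustration of keeping the “why” attached to the SQL, the Python sketch below stores a rationale, an owner, and a review date next to each filter, then renders both the query and a plain-text note from the same object. The table (`jobs`, `excluded_jobs`), columns, and rationale text are placeholders, not a real schema, and this is not how dbt or MetricFlow represent metrics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Filter:
    sql: str            # the predicate as it appears in the WHERE clause
    rationale: str      # why it exists, in plain language
    owner: str          # who to ask before changing it
    last_reviewed: str  # when the assumption was last challenged


@dataclass(frozen=True)
class MetricDoc:
    name: str
    expression: str
    source_table: str
    filters: tuple[Filter, ...] = ()

    def to_sql(self) -> str:
        """Render the query; an empty filter list falls back to WHERE TRUE."""
        where = "\n  AND ".join(f.sql for f in self.filters) or "TRUE"
        return (
            f"SELECT {self.expression} AS {self.name}\n"
            f"FROM {self.source_table}\n"
            f"WHERE {where}"
        )

    def to_context(self) -> str:
        """Plain-text notes a human or an LLM prompt can read alongside the SQL."""
        lines = [f"Metric: {self.name}"]
        for f in self.filters:
            lines.append(
                f"- {f.sql} | why: {f.rationale} "
                f"| owner: {f.owner} | last reviewed: {f.last_reviewed}"
            )
        return "\n".join(lines)


# Placeholder metric: every name and rationale below is hypothetical.
BOOKED_REVENUE = MetricDoc(
    name="booked_revenue",
    expression="SUM(contract_amount)",
    source_table="jobs",
    filters=(
        Filter(
            sql="status = 'signed'",
            rationale="Placeholder: only signed contracts count as booked.",
            owner="finance",
            last_reviewed="YYYY-MM-DD",
        ),
        Filter(
            sql="job_id NOT IN (SELECT job_id FROM excluded_jobs)",
            rationale="Placeholder: record here why these job IDs are excluded.",
            owner="data team",
            last_reviewed="YYYY-MM-DD",
        ),
    ),
)
```

Because `to_sql()` and `to_context()` read from the same filters, a predicate cannot be added without someone also writing down why, and the review date gives a stale assumption somewhere to show up.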
If you are building toward AI being a real participant in your data work, the job is not “ship the semantic layer and move on.” The job is continuously surfacing the assumptions buried in your SQL and writing them down where humans and LLMs can both read them.
The trust component compounds. If your semantic model cannot be pulled cleanly into BI, into ad hoc analysis, and into an LLM, and produce the same number every time, adoption stalls before it starts.
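A simple way to check “the same number every time” is to compare the value each consumer reports for one metric and flag any that drift. The sketch below is only that: a reconciliation helper with made-up consumer names and figures, not output from a real dashboard or model.

```python
import math


def find_disagreements(
    values_by_consumer: dict[str, float], rel_tol: float = 1e-9
) -> list[str]:
    """Return the consumers whose value differs from the first consumer's value."""
    if not values_by_consumer:
        return []
    names = list(values_by_consumer)
    reference = values_by_consumer[names[0]]
    return [
        name
        for name in names[1:]
        if not math.isclose(values_by_consumer[name], reference, rel_tol=rel_tol)
    ]


# Made-up figures for one metric as reported by three hypothetical consumers.
reported = {
    "bi_dashboard": 1_250_000.0,
    "adhoc_notebook": 1_250_000.0,
    "llm_answer": 1_187_500.0,
}
drifted = find_disagreements(reported)
```

With these sample figures, `drifted` contains only `"llm_answer"`. Run as a scheduled check, something like this surfaces a diverging definition before a stakeholder does.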
What this means if you are a team of one, or twenty
Kevin spent his first year at Vertex doing the unglamorous work. Ingestion. Transformations. Building a data model with no hardcoded values, where a change at the top of the funnel flows correctly to every downstream report. That investment is what makes everything he does now (Claude Code, React apps, ad hoc answers in two minutes instead of thirty) actually work.
The shortcut does not exist. AI is a force multiplier on the foundation you already have. If the foundation is a clean, well-documented semantic model, the multiplier is enormous. If it is not, you are multiplying confidently wrong answers and calling it productivity.
So. AI does not need a semantic layer? Sure. If your business has no nuance, no jargon, no unique metrics, no acquired companies, and no institutional knowledge worth preserving.
For the rest of us, the semantic layer just became the most important thing you will build this year.
This conversation is from season four, episode one of the Meltano podcast. Subscribe for the rest of the series. Every guest closes by asking the next one a question they wish someone had asked them.
Dig Deeper 🎧
📺 Watch: https://youtu.be/dqCEep3naIU
👤 Connect with Kevin Sampson on LinkedIn
🌐 More episodes: meltano.com/podcasts