Your customers want to ask questions of their data in your product, in plain language, and get answers they can trust. This guide walks through how to add AI-powered (agentic) analytics inside your own application — for your customers — without shipping a chatbot that hallucinates numbers.
It's written for product and engineering leaders building embedded analytics: the customer-facing kind, where every tenant must see only their own data.
TL;DR
Adding AI analytics to your product is mostly a data-architecture problem, not a prompt problem. The reliable path: model your metrics in a semantic layer (the governed foundation that grounds the AI), enforce multi-tenant access control so each customer sees only their data, pick an embed surface, ground the AI agent in the semantic layer over MCP and governed metrics (not raw text-to-SQL), and cache with pre-aggregations for performance. Brex built its embedded AI financial analyst, Spaces, on Cube this way. You can wire up your own prompts and text-to-SQL stack, but the accuracy, governance, and tenant isolation are the hard 80% — and that's exactly what a semantic-layer-grounded platform gives you.
What teams get wrong about AI analytics in their product
The seductive demo is a chat box wired straight to GPT-class text-to-SQL over the production database. It works in the demo. It falls apart in front of customers for three reasons:
- Correctness. The model doesn't know your join paths, what "active customer" means, or which of three revenue columns is certified. Free-form SQL re-derives business logic on every prompt, so the same question can return different numbers — and you can't explain why.
- Isolation. In a multi-tenant product, the catastrophic failure isn't a wrong number; it's one customer seeing another's data because an LLM-written WHERE clause dropped a tenant filter. Access control cannot live in a prompt.
- Performance and cost. Every AI question can fan out into several warehouse queries. Without caching, latency and bills climb with every tenant you add.
The fix for all three is the same: don't point the AI at raw tables. Point it at a governed semantic layer that already encodes your metrics, your access rules, and a fast query path. The semantic layer is what makes the AI useful — it's the reason Brex evaluated Cube against the dbt Semantic Layer and LookML and chose Cube to build Spaces, an embedded AI financial analyst its customers query directly.
The steps below assume that frame: a semantic layer at the foundation, with the AI grounded on top.
Step 1 — Model your metrics in a semantic layer
Start here, before any AI. A semantic layer is one governed definition of your metrics (revenue, active users, MRR), dimensions, entities, joins, and access policies. Every consumer — dashboards, APIs, and AI agents — reads the same definitions, so the numbers match everywhere.
This is the layer that grounds the AI. Instead of the model inferring table structure, it selects from certified metrics and dimensions, and the semantic layer generates the SQL. That single decision is what converts "impressive but unreliable" into "consistent and explainable."
With Cube, this layer is Cube Core — the open-source (Apache 2.0) semantic layer at the foundation of the platform. A few practical notes:
- It sits on top of your warehouse. Cube connects to Snowflake, BigQuery, Redshift, or Databricks; it does not replace them. You keep your data where it is.
- It can reference your dbt models. If you already transform data with dbt, you don't rebuild it. dbt models the data; the semantic layer governs the metrics on top and serves them to BI, embedded apps, and AI agents. Use dbt for shared persistent logic and the semantic layer for query-time iteration.
- It's SQL-first and extensible at query time. Your governed definitions stay intact while the AI builds ad-hoc calculations on top — so analysts and agents can go beyond the pre-built metrics without forking the model.
Define the handful of metrics and dimensions your customers actually ask about first. You can grow the model later; you cannot ground an agent on metrics that don't exist yet.
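To make "one governed definition, and the layer generates the SQL" concrete, here is a minimal sketch in Python. It is not Cube's data model syntax or API; the `orders` table, the column names, and the `compile_query` helper are all hypothetical, chosen only to show how a consumer selects a metric and dimensions by name while the SQL comes from shared definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str
    sql: str  # aggregate expression, written once and reused by every consumer


# Hypothetical model: table and column names are illustrative assumptions.
TABLE = "orders"
METRICS = {
    "revenue": Metric("revenue", "SUM(orders.amount)"),
    "order_count": Metric("order_count", "COUNT(*)"),
}
DIMENSIONS = {
    "status": "orders.status",
    "region": "orders.region",
}


def compile_query(metric, dimensions):
    """Turn a (metric, dimensions) selection into SQL from the shared definitions."""
    if metric not in METRICS:
        raise KeyError(f"unknown metric: {metric}")
    unknown = [d for d in dimensions if d not in DIMENSIONS]
    if unknown:
        raise KeyError(f"unknown dimensions: {unknown}")
    select = [f"{DIMENSIONS[d]} AS {d}" for d in dimensions]
    select.append(f"{METRICS[metric].sql} AS {metric}")
    sql = f"SELECT {', '.join(select)} FROM {TABLE}"
    if dimensions:
        sql += " GROUP BY " + ", ".join(DIMENSIONS[d] for d in dimensions)
    return sql


print(compile_query("revenue", ["status"]))
```

A dashboard, an API handler, and an agent would all call the same function with the same names, which is why their numbers agree: none of them ever writes the aggregate expression itself.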
Step 2 — Enforce multi-tenant access control
In embedded analytics this is the step that protects the business. Every query a customer makes — and every query their AI agent makes on their behalf — must be scoped to that customer's data, and that scoping has to happen at the data layer, not in application code or a prompt.
The standard pattern:
- Your application authenticates the user and issues a signed token (typically a JWT) carrying a security context — for example, the tenant ID and role.
- The semantic layer reads that context and applies row-level, multi-tenant security to every query it generates, regardless of how the query was phrased or who phrased it.
- Because the rules live in the semantic layer, an AI agent inherits them automatically. The agent can't ask its way around a tenant boundary, because the boundary isn't in the question — it's in the layer that answers it.
This is why grounding the AI in a semantic layer is a security feature, not just an accuracy one. A raw text-to-SQL agent has to be trusted to write the tenant filter correctly every time; a semantic-layer agent never writes that filter at all — the platform does, on every request.
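The pattern above can be sketched with nothing but the Python standard library. This is an illustration under stated assumptions, not Cube's implementation: the shared `SECRET`, the claim names `tenant_id` and `role`, the `orders` table, and every function name here are hypothetical. A production system would use a maintained JWT library, add an expiry claim, and manage keys properly.

```python
import base64
import hashlib
import hmac
import json
import sqlite3

SECRET = b"demo-only-secret"  # assumption: shared by the app and the data layer


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _sign(signing_input: str) -> str:
    return _b64(hmac.new(SECRET, signing_input.encode(), hashlib.sha256).digest())


def issue_token(tenant_id: str, role: str) -> str:
    """Application side: sign a security context as an HS256 JWT."""
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64(json.dumps({"tenant_id": tenant_id, "role": role}).encode())
    return f"{header}.{payload}.{_sign(header + '.' + payload)}"


def read_security_context(token: str) -> dict:
    """Data-layer side: reject any token whose signature does not verify."""
    header, payload, signature = token.split(".")
    expected = _sign(f"{header}.{payload}")
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        raise PermissionError("invalid token signature")
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def tenant_revenue(conn: sqlite3.Connection, token: str) -> float:
    """The tenant filter comes from the verified context, never from the caller's question."""
    context = read_security_context(token)
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE tenant_id = ?",
        (context["tenant_id"],),
    ).fetchone()
    return row[0]
```

The design point is that `tenant_revenue` has no parameter through which a user or an agent could supply a tenant ID. The only source of the filter value is a token the data layer has verified.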
Step 3 — Choose your embed surface
How AI analytics shows up in your product depends on how custom you need the experience to be. Cube offers a few surfaces; pick by goal, not by default.
| Embed surface | Best for | What you get | Effort |
|---|---|---|---|
| Analytics Chat API | Fully custom or agent-to-agent experiences | A governed, conversational endpoint your own UI — or another AI agent — calls to ask questions and receive structured answers | Higher (you build the UI/agent) |
| iframes | The fastest drop-in | A governed analytics experience embedded with a signed security token, minimal frontend work | Lowest |
| Creator Mode | In-app authoring | Let your customers build and save their own views and reports inside your product | Medium |
| Core Data APIs | Maximum control | Direct, governed access over SQL, REST, and GraphQL to build any experience on top | Higher |
A common pattern: start with an iframe to ship something governed quickly, then move the highest-value flows to the Analytics Chat API or Core Data APIs as you invest in a bespoke UI. The security context (Step 2) flows through all of them, so you're not re-solving isolation when you switch surfaces.
If embedded analytics is new territory for your team, our guide to the best embedded analytics platforms in 2026 covers the surrounding decisions.
Step 4 — Ground the AI agent in the semantic layer
This is the difference between a feature you can put in front of customers and one you can't. The agent should query governed metrics, not the warehouse directly and not via free-form SQL.
Concretely:
- Expose the semantic layer to the agent over MCP. The Model Context Protocol is the emerging standard for connecting AI agents to tools and data. Cube exposes governed metrics over an MCP server, so an agent — yours, or your customer's own agent in an agent-to-agent setup — selects from certified definitions instead of re-deriving SQL.
- Let the model pick metrics; let the layer write the query. The agent reasons about which metric, dimension, and filters answer the question. The semantic layer turns that into governed SQL. Because a metric maps to one definition, you can show your customer exactly what produced a number — and it matches the same number on their dashboard.
- Keep it extensible. SQL-first, query-time extensibility means the agent can construct an ad-hoc calculation on top of governed metrics when a customer asks something the model didn't anticipate — without abandoning governance.
The anti-pattern to avoid is raw text-to-SQL against raw tables. It's the fastest thing to prototype and the slowest thing to trust. For a deeper treatment of this pattern, see "Semantic layer for AI agents."
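As a rough sketch of "let the model pick metrics; let the layer write the query", the Python below validates an agent's structured selection against a catalog of certified members. It does not use MCP or any Cube API; the JSON request shape, the `CATALOG` contents, and `plan_from_agent` are hypothetical stand-ins for whatever the agent and the layer actually exchange.

```python
import json

# Hypothetical catalog of certified members; names and SQL are illustrative.
CATALOG = {
    "metrics": {
        "revenue": "SUM(orders.amount)",
        "active_customers": "COUNT(DISTINCT orders.customer_id)",
    },
    "dimensions": {
        "status": "orders.status",
        "region": "orders.region",
    },
}


def plan_from_agent(raw: str) -> dict:
    """Validate the agent's JSON selection; the model never supplies SQL."""
    request = json.loads(raw)
    metric = request.get("metric")
    dimensions = request.get("dimensions", [])
    filters = request.get("filters", {})

    missing = []
    if metric not in CATALOG["metrics"]:
        missing.append(metric)
    # Both grouping dimensions and filter keys must be certified dimensions.
    missing += [d for d in [*dimensions, *filters] if d not in CATALOG["dimensions"]]
    if missing:
        return {"answerable": False, "missing": missing}

    # Returning the definition alongside the selection keeps the answer explainable.
    return {
        "answerable": True,
        "metric": metric,
        "definition": CATALOG["metrics"][metric],
        "dimensions": dimensions,
        "filters": filters,
    }
```

For example, a request for `revenue` by `region` filtered on `status` produces a plan that carries the exact metric definition used, while a request for an undefined `churn_rate` comes back as not answerable rather than as improvised SQL.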
Step 5 — Cache with pre-aggregations for performance
AI analytics is query-hungry. A single natural-language question can become several warehouse queries as the agent explores, and in a multi-tenant product many customers run those questions at once. Without a caching strategy, latency and warehouse spend scale with adoption — the opposite of what you want.
Cube's pre-aggregations materialize common query shapes into optimized rollups, so frequent questions are answered from the cache instead of round-tripping to the warehouse every time. In practice: pre-aggregate the metric and dimension combinations your customers (and their agents) hit most, let the warehouse handle the long tail of novel queries, and watch concurrency — not just single-query latency. Multi-tenant embedded load is about many simultaneous queries, and AI agents amplify that.
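The idea of answering frequent query shapes from rollups and sending the long tail to the warehouse can be illustrated with a simplified matcher. This is a sketch, not Cube's matching algorithm: the `Rollup` type, the rollup names, and `pick_rollup` are hypothetical, and the subset rule shown is only valid for additive measures such as sums and counts (a distinct count cannot be re-aggregated from a coarser rollup this way).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rollup:
    name: str
    metrics: frozenset
    dimensions: frozenset


# Hypothetical rollups covering the hottest metric and dimension combinations.
ROLLUPS = [
    Rollup("revenue_by_month", frozenset({"revenue"}), frozenset({"order_month"})),
    Rollup(
        "orders_by_month_status",
        frozenset({"revenue", "order_count"}),
        frozenset({"order_month", "status"}),
    ),
]


def pick_rollup(metrics, dimensions):
    """Return the narrowest rollup covering the query, or None to use the warehouse."""
    wanted_metrics, wanted_dimensions = set(metrics), set(dimensions)
    candidates = [
        r
        for r in ROLLUPS
        if wanted_metrics <= r.metrics and wanted_dimensions <= r.dimensions
    ]
    # Fewer dimensions means fewer stored rows to scan for the same answer.
    return min(candidates, key=lambda r: len(r.dimensions), default=None)
```

A query for `revenue` by `order_month` would be served by `revenue_by_month`, the same metric by `status` would fall to the wider rollup, and anything grouped by an uncovered dimension would return `None` and go to the warehouse.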
Performance is also a trust signal: an AI analyst that answers in a second feels reliable; one that stalls for fifteen feels broken, no matter how correct it is.
Step 6 — Ship, then iterate
Get a thin slice in front of real customers early: a few governed metrics, strict tenant isolation, one embed surface, the agent grounded over MCP, and pre-aggregations on your hottest queries. Then iterate on what people actually ask.
- Add metrics in response to real questions, not anticipated ones. Your customers' questions are your roadmap for the semantic layer.
- Review agent transcripts for questions that returned "I can't answer that" — those map to missing metrics or dimensions, a model gap you can close rather than a prompt to tweak.
- Tune pre-aggregations against observed query patterns, and expand surfaces from a quick iframe to a custom Analytics Chat experience for your highest-value flows.
Because every surface and the agent all read the same governed model, each metric you add improves the dashboards, the APIs, and the AI at once.
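The transcript review described above can start as a simple tally. The sketch below assumes each logged agent turn records whether it was answerable and which members were missing; that record shape and the `missing_member_report` helper are hypothetical, not a format any particular product emits.

```python
from collections import Counter


def missing_member_report(transcripts):
    """Rank the metrics and dimensions customers asked for that the model lacks."""
    counts = Counter()
    for entry in transcripts:
        if not entry["answerable"]:
            counts.update(entry["missing"])
    return counts.most_common()


# Illustrative log entries; the questions and member names are made up.
log = [
    {"question": "What is our churn rate?", "answerable": False, "missing": ["churn_rate"]},
    {"question": "Revenue by region?", "answerable": True, "missing": []},
    {"question": "Churn by plan tier?", "answerable": False, "missing": ["churn_rate", "plan_tier"]},
]
print(missing_member_report(log))
```

The members at the top of the report are the next candidates to add to the semantic layer.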
Build vs. buy: your own text-to-SQL stack vs. a semantic-layer platform
You can build this yourself. The honest tradeoff:
Building your own prompts and text-to-SQL stack gets you a demo quickly. What you then own forever is the hard part: making answers correct across changing business logic, isolated so no tenant ever sees another's data, fast under concurrent multi-tenant load, and explainable enough to put in front of paying customers. That's not a prompt-engineering project; it's a governed data platform project.
Buying a semantic-layer-grounded platform gives you certified metrics, multi-tenant access control, caching, and embed surfaces out of the box, with the agent grounded on the same governed model your dashboards use, so you spend your engineering time on your product. Note that many point analytics tools were built single-tenant-first or added AI after the fact; neither is what you want underneath a customer-facing AI feature. What you want is an AI-native platform that's multi-tenant by construction.
Most teams that build customer-facing AI analytics themselves underestimate the governance and isolation work, often by an order of magnitude. Buy the foundation; build the experience.
Implementation checklist
- Metrics modeled in a semantic layer (start with the metrics customers actually ask about; reference existing dbt models rather than rebuilding them).
- Warehouse connected (Snowflake / BigQuery / Redshift / Databricks) — semantic layer on top, not a replacement.
- Multi-tenant security enforced at the data layer via a signed security context (e.g., a tenant ID in a JWT); verified that a tenant cannot retrieve another tenant's rows.
- Embed surface chosen by goal: Analytics Chat API (custom / agent-to-agent), iframe (fastest), Creator Mode (in-app authoring), or Core Data APIs (max control).
- Agent grounded in the semantic layer over MCP / governed metrics — no raw text-to-SQL against raw tables.
- Explainability confirmed: every answer can be traced to a metric and filters, and matches the corresponding dashboard.
- Pre-aggregations configured for the highest-traffic queries; concurrency tested under multi-tenant load.
- Iteration loop in place: review agent transcripts, add metrics for unanswered questions, retune caching.
When building it yourself is still the right choice
Be honest about your situation. Rolling your own can make sense if AI analytics is a throwaway internal prototype, if you have a single tenant with no isolation requirements, or if "analytics" means one or two static charts that don't need natural-language querying or governance. In those cases the overhead of a platform may not pay off yet.
The calculus flips the moment the feature is customer-facing, multi-tenant, and expected to be correct. That's when the semantic layer stops being optional infrastructure and becomes the thing that lets you ship.
Our verdict
To add AI analytics to your product in a way that survives contact with real customers: model your metrics in a semantic layer, enforce multi-tenant access control at the data layer, pick an embed surface for your goal, ground the AI agent in the semantic layer over MCP and governed metrics, and cache with pre-aggregations. That's the architecture behind Brex Spaces, built on Cube, the agentic analytics platform whose open-source core, Cube Core, is the semantic layer that makes the AI useful. Build the experience your customers love; buy the governed foundation that makes the answers trustworthy.
Methodology
This guide reflects the architecture we see work for customer-facing AI analytics as of 2026: a semantic layer at the foundation, multi-tenant access control enforced at the data layer, an embed surface matched to the use case, an agent grounded in governed metrics (increasingly over MCP), and pre-aggregation caching for performance. Product capabilities and standards like MCP evolve quickly, so confirm specifics against current documentation. As the publisher, Cube has an obvious interest here; we've aimed to describe the build-vs-buy tradeoff fairly and to be explicit about when building it yourself is the better call.