"Chat with your data" – the new reality?
LLMs in Practice: How Intelligent Is the Data Dialogue – and Why Human Expertise Remains Indispensable
Azure DevOps, Data & Analytics, Microsoft Power BI

What if you could simply ask your data model: "Why is the contribution margin of product group XYZ so low?" - and get a well-founded answer? GenAI and tools like Claude, Copilot & Co. promise exactly that. But how well does it really work? And what does this mean for data experts, power users and governance? An experience report between fascination and responsibility by data & analytics expert Marco Nielinger.
Motivation
A few weeks ago I saw a video by Kurt Buhler on the subject of "Talk to Power BI using Claude". Normally, I'm not that easy to get excited about. I've experienced too many hypes in my career that were cooked hotter than they were eaten. But I was immediately fascinated by this video. Never before had I seen such an impressive demo for handling data models and data in Power BI. I immediately had to try it out for myself.
What we mean by "turning data into impact"
Our claim in the area of data & analytics is: "We bring data and people into effect." Data engineering is at the heart of this: we prepare data from various source systems for analysis - by integrating, harmonising and applying business logic using ETL/ELT tools, lakehouses or data warehouses. Good data engineering is the basis of successful projects - and requires not only technical expertise but also strong communication skills, problem-solving ability, abstraction skills and common sense.
»LLMs don't replace data literacy – they make it indispensable.«
From standard reports to self-service BI
We ultimately bring these people "into effect with the data" through the results of our work that are visible to them. The use of these artefacts requires different levels of data literacy, depending on their complexity.
- Standard reports - minimal data literacy required
  Open the report, fill in the prompt, read the result. Often provided via PDF subscription.
  → Consumption instead of interaction
- Interactive dashboards - moderate data expertise required
  Users must be able to understand and use filters, drill-throughs, navigation, etc.
  → Targeted interaction with pre-structured data
- OLAP models - high data competence for individual analyses
  Own analyses and reports require an understanding of model structure, relationships, granularities, etc.
  → Creative use and answering of previously unknown questions
- Self-service BI & data science - expert skills required
  Data enrichment, modelling, use of Python or ML. Usually takes place in specialised teams.
  → Maximum freedom and technical depth
The reality of data analysis
Specialist departments are increasingly building up data expertise - but in practice, the use of standard reports and dashboards still dominates. This is rarely due to a lack of interest, but rather to day-to-day business: financial statements, system operation, ad hoc queries - there is little time to familiarise oneself with tools such as SQL, DAX, Python or ETL processes. The result: the majority consumes ready-made reports, only a few actively analyse on the basis of OLAP models. Genuine self-service BI or data science have so far been the exception.
"Chat with your data" in Power BI – history and current status
With the advent of LLMs such as ChatGPT, people are increasingly asking: "Do we still need power users at all - won't ChatGPT, Copilot etc. take over?" The answer: No - LLMs do not replace human expertise, but support it. They open up exciting fields of application, especially in the area of "chat with your data": questions such as "What is the contribution margin for product XYZ in 2019?" can be asked directly in natural language - without knowledge of model structures or data sources.
A fascinating use case that can make it much easier for users to access data. There is also an excellent article on this by Kurt Buhler.
Precursor "Q&A"
The option to ask Power BI questions in natural language has existed since the introduction of "Q&A" (Microsoft, 2013). With the Q&A visual, the function also became available in Power BI Desktop at the end of 2017. I personally never used it intensively - partly due to limitations such as English-only support, the long-standing lack of live connections to Analysis Services, and the very simple syntax required (e.g. "Show Revenue by Customer Name in 2019"). As a reporting tool, Q&A was also usually no more efficient than building the visual manually.
Copilot for Power BI
Copilot for Power BI with LLM functions has been available since 2023 (initially only with Fabric F64; since April 2025 it can be used from Fabric F2 upwards). Copilot supports three central functions: generating DAX code, creating reports/pages, and answering questions about data ("Chat with your data").
In May 2025, the preview of the "Standalone Copilot Experience" was launched, a standalone chat interface with cross-workspace search, report summaries and data queries.
The new AI instructions for controlling Copilot behaviour per model also significantly expand the possibilities. Development is rapid - Copilot features currently dominate the Power BI updates.
Not quite a real conversation yet: Copilot remains terse
Nevertheless, the results of Copilot for Power BI in the "Chat with your data" area still feel very wooden: The answers are short, you have to ask very specific questions, and there is a lack of creative interpretations. Questions such as "Show me an overview of my data" or "What could be the cause of a low contribution margin?" are not answered satisfactorily. This is probably a deliberate decision by Microsoft (e.g. due to uncertainties in AI answers, number of tokens, etc.). In addition, only English is officially supported as a language. Questions in German sometimes produce a wild mixture of German and English in Copilot's answers.
Taking "Chat with your data" literally
I think: "Chat with your data" should feel like a real conversation with an intelligent partner - not just a shortcut for analyses that I could also implement in Excel or classic Power BI. Of course, this also leads to mistakes. But this is precisely how new perspectives can become visible that you might never have come across without LLMs. The responsibility remains with the human being: AI results must always be critically scrutinised and categorised.
"Chat with your data" with Claude and MCP server/tools for Power BI
Kurt Buhler's video showing how to talk to Power BI data using Claude and a Model Context Protocol (MCP) server immediately got me excited - not just because you don't need Power BI Premium or Fabric, but because the results were genuinely impressive. Even before Kurt published his code on GitHub, I had a working version for my own test models. Claude (Sonnet 4.0) provided precise summaries, surprising insights and comprehensible explanations - e.g. on the discount structure as the cause of low contribution margins. No hallucinations, no mistakes. A real "aha" moment - I have rarely experienced so concretely what "data in effect" can look like.
LLMs become more powerful with tools
An MCP server provides an LLM with so-called "tools" via a standardised protocol - similar to the internet search in ChatGPT, which provides information beyond the knowledge cut-off. Python interpreters and file uploads are also tools for LLMs. AI agents (e.g. for planning, booking, automation) are based on such tools.
My metaphor: tools are to LLMs what tools are to us humans - they make real intelligence possible in the first place.
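The tool mechanism described above can be sketched in a few lines. This is a conceptual illustration only, not the real MCP protocol: actual MCP servers use the official SDK and a standardised JSON-RPC exchange, and the tool name and demo model here are made up.

```python
import json

def list_tables(model: str) -> list:
    """Pretend tool: return the tables of a semantic model."""
    demo_models = {"Sales Demo": ["Date", "Customer", "Product", "Sales"]}
    return demo_models.get(model, [])

# The server advertises its tools to the LLM by name.
TOOLS = {"list_tables": list_tables}

def handle_tool_call(request_json: str) -> str:
    """Dispatch a tool call: the LLM sends a tool name plus arguments,
    the server executes the function and returns the result as text."""
    request = json.loads(request_json)
    result = TOOLS[request["tool"]](**request["arguments"])
    return json.dumps({"result": result})
```

The LLM never touches the data source directly - it only emits a structured request such as `{"tool": "list_tables", "arguments": {"model": "Sales Demo"}}`, and the server hands the result back as context for the next answer.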
Chat, tool, response: How Claude understands and uses data
In the "Chat with your data" use case, the MCP tools access the Power BI and Fabric REST APIs in the background. With the appropriate authorisation, Claude can read model metadata, generate meaningful DAX queries, execute them and interpret the results. What emerges is a genuine analysis dialogue - even for open questions such as "Show me an overview of my database".
The "magic" lies in the combination of modelling knowledge and tool logic. And it works surprisingly well.
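Under the hood, a "run DAX" tool can target the Power BI REST API's executeQueries endpoint. The sketch below only builds the request; the dataset ID and DAX query are placeholders, and authentication (an Azure AD bearer token) plus the actual HTTP call are deliberately omitted.

```python
import json

API_BASE = "https://api.powerbi.com/v1.0/myorg"

def build_dax_request(dataset_id: str, dax_query: str):
    """Return the URL and JSON body for a POST to the
    Power BI executeQueries endpoint."""
    url = f"{API_BASE}/datasets/{dataset_id}/executeQueries"
    body = json.dumps({
        "queries": [{"query": dax_query}],
        "serializerSettings": {"includeNulls": True},
    })
    return url, body
```

A call like `build_dax_request("<dataset-id>", "EVALUATE VALUES('Product'[Product Group])")` yields exactly the request an MCP tool would send on Claude's behalf; the JSON result then flows back into the chat for interpretation.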

Requirements and challenges
For an LLM like Claude to work reliably, it needs a clearly structured, comprehensible data model: descriptive names, meaningful descriptions, a consistent star schema, irrelevant objects hidden, and synonyms in the linguistic schema. Everything that helps people use an OLAP model efficiently is all the more important for LLMs.
More complex model logic – and what LLMs get wrong without context
In more complex models, an LLM without context quickly reaches its limits. Example: Our central fact table in the HR analytics model is a headcount table that shows the number of employees per month. It is not possible to add up these numbers over months. Without a hint, the LLM makes the same mistake as an inexperienced person. However, if the context is provided - for example in the chat or via descriptions in the model - Claude creates correct DAX queries.
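The headcount pitfall can be made concrete with toy numbers. Monthly snapshots are semi-additive: the same people appear in every month, so summing across months is meaningless. The usual correction (in DAX often a LASTNONBLANK / last-period pattern) is to take the last snapshot of the selected period - the figures below are invented for illustration.

```python
# Monthly headcount snapshots (semi-additive fact).
headcount = {"2024-01": 100, "2024-02": 102, "2024-03": 101}

# Wrong: summing snapshots double-counts employees across months.
naive_total = sum(headcount.values())

# Right: report the last available snapshot of the period.
period_headcount = headcount[max(headcount)]
```

Without context, an LLM will happily produce the naive sum; a one-line hint in the measure description is enough to steer it to the last-snapshot pattern.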
Context window is often still overestimated
Another obstacle is the so-called context window - the "working memory" of the model.
It includes all input and output tokens of a chat. If this limit is exceeded, the LLM forgets earlier parts of the conversation - and thus loses important context.
Many people overestimate how much information an LLM can process simultaneously - especially when uploading files or complex model definitions. The GPT-4o example shows this: With a Plus or Team licence, only 32K tokens are available, while Enterprise users receive up to 128K. That sounds like a lot - but it's not enough for complete PDFs or model structures. Instead, only specific sections are loaded. Questions about the overall context quickly reach their limits.
For comparison: the full text of "Romeo and Juliet" is approx. 30K-40K tokens; Noah Gordon's "The Physician" is approx. 300K-400K tokens.
My example model (4 tables, 70 columns, 20 measures incl. TMSL, annotations etc.) is already around 20K tokens. A more realistic model such as our HR analytics model easily exceeds 1 million tokens - unadjusted.
Only a few LLMs (e.g. Gemini 1.5 Pro with 2M tokens) can currently keep up with this.
So it's no wonder that Microsoft's "AI Instructions" in Copilot enable targeted hiding of model areas - an important step towards greater efficiency.
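For a quick feasibility check before pasting a model definition into a chat, a back-of-the-envelope estimate helps. The one-token-per-four-characters ratio below is a common heuristic for English text, not a real tokenizer; actual counts depend on the model and the language, so treat the result as an order of magnitude only.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate: ~4 characters per token for English text."""
    return round(len(text) / chars_per_token)
```

By this heuristic, a 120 KB TMSL export already lands around 30K tokens - enough to fill the entire context window of a GPT-4o Plus licence on its own.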
Limitations, risks and dangers
LLMs will not replace data & analytics experts or power users in the foreseeable future - but they will jeopardise those who ignore their opportunities. Because those who use LLMs competently will gain a real advantage. At the same time, LLM answers need to be checked. Especially in complex models, the risk of hallucinations, misinterpretations or incorrect queries is still high.
Another problem arises with large, complex models: Even if the entire model fits into the context window, LLMs often struggle with open questions such as "Summarise the dataset for me" - simply because they cannot make a sensible selection from thousands of measures, columns and calculation groups. Intuition and prioritisation are (still) human strengths.
Powerful tools, powerful risks: Why governance is crucial
LLMs in combination with tools such as the Power BI and Fabric REST APIs can be powerful - but also risky. One wrong prompt and the model could, for example, delete all workspaces. The risk of errors and misuse increases massively if there are no clear rules. That's why clean governance, graded authorisations and clear responsibilities are needed. An LLM uses what it is given - it has no moral compass of its own.
Data protection is also critical: with "Chat with your data" with Claude Desktop, all content - workspaces, models, DAX results - flows to the LLM provider, in this specific case to Anthropic. What is to be processed therefore leaves the company's own environment - a risk that should not be underestimated.
Secure data with open-source LLMs?
With large providers such as OpenAI, Microsoft, Anthropic or Google, the data leaves the company's own system environment - a risk for many companies. Only locally hosted open-source LLMs - run, for example, via tools such as Ollama - keep the data in-house.
Especially in Europe, the topic of data protection is rightly highly sensitive. The use of personal data in accordance with the GDPR must be checked particularly carefully - this alone is often a showstopper for tools such as "Chat with your data".
Challenges for data and analytics experts
The development of LLMs is fascinating - and at the same time worrying. We are at a point where it would be wise to consciously slow down the pace and take a closer look at the risks and ethical responsibility.
Like any disruptive technology, this one brings enormous opportunities - but equally great dangers.
It is up to us to utilise them responsibly - not driven by greed, hunger for power or excess, but by reason, attitude and a sense of proportion.
Chat instead of Code
For the first time, it is possible to analyse data using natural language - and identify patterns that might have remained hidden manually. What Power BI once promised with "Q&A", and which I ridiculed for a long time, is now a reality.
Despite all my reservations, I am convinced that this development can no longer be stopped. "Chat with your data" fulfils the needs of many data professionals: analysing data using natural language without having to dive deep into SQL, DAX or BI tools.
New tasks for data and analytics teams
The role of data & analytics experts is changing: instead of just delivering dashboards, in future it will be about designing semantic models in such a way that they are equally understandable and usable for people and LLMs.
I see three key success factors in the short term for the use of LLMs for "Chat with your data":
- Simple and self-explanatory semantic models
Clear structures are required for an LLM to interact meaningfully with a model:
- Descriptive names, meaningful descriptions and synonyms
  (e.g. "turnover" instead of cryptic "UMZ_SUM")
- Use-case-related perspectives
  Instead of one central DWH model, we recommend splitting it into smaller models such as "Sales", "Finance" or "HR" - adapted to the user's perspective
- Instructions for use directly in the model
  e.g. via AI instructions for Copilot or through descriptive texts on tables and measures
- Clean governance and structured authorisation processes
Technical protection is essential to minimise risks:
- Clear access restrictions via tenant settings and security groups
- Transparent authorisation assignment - especially for decentralised tools such as Power BI
- Tool-supported control and documentation, e.g. through Purview integration
- LLM expertise among users
Not only the model, but also the person must be "ready":
- Good prompting and context engineering are crucial for helpful answers
- Critical questioning of results remains mandatory - even with seemingly plausible answers
- Understand the LLM as a partner, not a magician - and don't get lazy
Without expertise, LLMs remain a blunt tool
Paradoxically, despite LLMs, data expertise is becoming more important than ever in the short term. Anyone who uses LLMs without reflection risks incorrect analyses and wrong decisions. LLMs are not intended to replace us, but to support us. That's why I would currently only recommend "Chat with your data" to people who are able to critically review answers - through their own analyses, DAX validation or a sound understanding of the model.
Only when this experienced group of users has optimised the model for AI use can the tool be gradually released for less experienced users - always with the note: "AI-generated content may be incorrect."
Because we are still(!) talking about "Copilot", not "Pilot".

noventum consulting GmbH
Münsterstraße 111
48155 Münster