There's still no free lunch in information retrieval

Summary

AI models are powerful, but as is they are often useless in a business context. They don't know your specific compliance rules, your internal knowledge base, or your customer's context.

The solution is Retrieval-Augmented Generation (RAG). Instead of just answering from its pre-trained knowledge, the system first retrieves relevant knowledge from your data and then appends that context to the user's prompt, giving the LLM the source material it needs.

RAG has shed new light on the field of information retrieval. But while LLMs might challenge the old "garbage in, garbage out" reality (they can often produce seemingly OK output from messy input, but can also hallucinate garbage from good input), there is one thing they haven't fundamentally changed: every act of information retrieval costs someone time and thought. The only question is who pays, and when.

In this post, we'll explore how this cost has historically been paid in traditional information systems and what LLMs truly change about this economy.

The human economy of retrieval

By “cost,” we mean more than money or computation. The real price is human effort: the time, design, and judgment it takes to make information findable, trustworthy, and usable. This human economy, deciding who pays this effort and when, is what LLMs are now changing.

This cost can be paid in different ways. A classic framework to approach it is schema-on-write vs. schema-on-read. As Hrishi Olickel elegantly put it, think of schema-on-write as putting clothes straight into the wardrobe and schema-on-read as sorting a messy pile when you need something. The choice decides whether you pay upfront or at retrieval.

This illustrates that a system distributes its cost between upfront effort and usage effort. We can break this down into three categories:

  • Indexing cost: This is the upfront effort to prepare, structure, and design information so it can be found later. This includes everything from designing a database schema to building the user interface of an application.

  • Querying cost: This is the user's effort at the moment of retrieval. This includes navigating file folders, writing an SQL query, or filtering records in an application.

  • Quality: Compromised retrieval quality is a cost in itself. Search experts describe this as recall (finding all relevant information) and precision (finding only relevant information). Any tradeoff that decreases precision or recall lowers quality.

Let's make this concrete. Imagine someone wants to find all recipes using carrots in their favorite cookbook. The "cost" to retrieve this information can be paid in three ways:

  • The high querying cost way: Go through every single page of the cookbook. The indexing cost is zero, but the querying cost (the user's effort) is massive.

  • The quality cost way: Skim through the cookbook, relying on memory or chance. This has a very low querying cost, but the resulting quality is low: they will almost certainly miss recipes.

  • The high indexing cost way: The author creates an ingredient index at the back. This index costs the author time to build, but in return, every reader benefits from near-zero querying cost and high quality.

The latter is the most common, because it simply makes sense for this use case: the author pays the price once, and every reader benefits.

The cost of retrieval in traditional systems

Just as the cookbook author made a strategic choice about cost, every system that stores and retrieves information decides where the human effort will sit. Some invest in structure upfront so retrieval is effortless later, while others defer that effort to the user.

Here’s how this tradeoff traditionally looks across the various layers of systems a user relies on to access data in a business context:

File system

This is the most basic approach. A human creates folders and names files based on their own memory and conventions (e.g., Recipes/CarrotSoup.txt). Retrieval means manually browsing these folders and opening files.

  • Indexing cost: Very low.

  • Querying cost: Very high. You browse manually, open each file, and scan for "carrot."

  • Quality: Low. Success depends on memory, luck, and discipline.
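To make that querying cost concrete, here is a minimal sketch of what retrieval looks like at this layer: walking a hypothetical Recipes/ folder and scanning every file for the word. The folder name and file layout are illustrative; nothing is paid upfront, so the full cost lands on the query.

```python
import os

def find_recipes_with(ingredient: str, root: str = "Recipes") -> list[str]:
    """Brute-force retrieval: open every file and scan it for the ingredient.
    Zero indexing cost, but the querying cost grows with every file added."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8") as f:
                    if ingredient.lower() in f.read().lower():
                        matches.append(path)
            except (OSError, UnicodeDecodeError):
                continue  # skip unreadable or binary files
    return matches

print(find_recipes_with("carrot"))
```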

Databases

The relational database approach demands upfront structure. A developer first designs a rigid schema (e.g., recipes and ingredients tables) that defines how data must be stored. Information is retrieved by writing a formal query in a language like SQL.

  • Indexing cost: High. Requires schema design and data preparation.

  • Querying cost: Moderate. You must learn SQL and the database schema.

  • Quality: High. The system is deterministic and logical, returning a precise and complete answer.
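Here is the same tradeoff as a sketch in code, using SQLite and an illustrative recipes/ingredients schema (the table and column names are ours, not a prescribed design): the structure is paid for before any question is asked, and the query then returns a precise, complete answer.

```python
import sqlite3

# Indexing cost: the schema is designed and the data is structured upfront.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE recipes (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE ingredients (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE recipe_ingredients (recipe_id INTEGER, ingredient_id INTEGER);
    INSERT INTO recipes VALUES (1, 'Carrot Soup'), (2, 'Pumpkin Soup');
    INSERT INTO ingredients VALUES (1, 'Carrot'), (2, 'Pumpkin');
    INSERT INTO recipe_ingredients VALUES (1, 1), (2, 2);
""")

# Querying cost: the user must know SQL and the schema,
# but the answer is deterministic and complete.
rows = conn.execute("""
    SELECT r.name
    FROM recipes r
    JOIN recipe_ingredients ri ON ri.recipe_id = r.id
    JOIN ingredients i ON i.id = ri.ingredient_id
    WHERE i.name = 'Carrot'
""").fetchall()
print(rows)  # [('Carrot Soup',)]
```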

Keyword search

Each recipe is a document. The system scans the text, breaks it into words (tokens like "carrot"), and builds an inverted index: a map that links each token back to all the recipes that contain it. When a user searches, the system uses this index to find all matching documents and then ranks them with a statistical algorithm like TF-IDF or BM25 to determine which documents are most relevant.

  • Indexing cost: Moderate. The main cost is the continuous human effort to configure, tune, and maintain the index schema, analyzers, and relevance models.

  • Querying cost: Low. The user just types a keyword, like "carrot."

  • Quality: Low. Results are ranked by statistical relevance, not real-world understanding: the system matches word statistics, not meaning or user context.
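Here is a toy version of that mechanism, deliberately simplified: real engines add analyzers, stemming, and TF-IDF/BM25 scoring, which is exactly the tuning effort counted as indexing cost above.

```python
import re
from collections import defaultdict

docs = {
    "carrot_soup": "Peel the carrots. Simmer the carrot pieces with onion and stock.",
    "pumpkin_soup": "Roast the pumpkin, then blend it with cream.",
}

# Indexing: tokenize each document once and build an inverted index
# mapping every token to the documents (and counts) that contain it.
index: dict[str, dict[str, int]] = defaultdict(dict)
for doc_id, text in docs.items():
    for token in re.findall(r"[a-z]+", text.lower()):
        index[token][doc_id] = index[token].get(doc_id, 0) + 1

def search(query: str) -> list[tuple[str, int]]:
    """Querying: look tokens up in the index and rank by raw term frequency."""
    scores: dict[str, int] = defaultdict(int)
    for token in re.findall(r"[a-z]+", query.lower()):
        for doc_id, count in index.get(token, {}).items():
            scores[doc_id] += count
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Quality is statistical: "carrot" matches, but "carrots" would need stemming.
print(search("carrot"))  # [('carrot_soup', 1)]
```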

Vector search

Instead of indexing words, this method captures meaning. Each recipe is fed into an AI embedding model, which converts its semantic meaning into a numerical list called a vector. Recipes with similar meanings (like "Carrot Soup" and "Pumpkin Soup") are stored close to each other in this "vector space."

  • Indexing cost: Moderate. Requires choosing an embedding model, computing embeddings, and maintaining a specialized vector index.

  • Querying cost: Low. The user's query is also converted into a vector to find the closest semantic matches.

  • Quality: Low. This method moves retrieval from matching exact words to matching semantic meaning. It allows for richer discovery but offers less precision.
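Below is a sketch of the idea, with one loud caveat: the embed() function is a placeholder for a real embedding model (an embeddings API or a local model), not how such models actually work. The retrieval mechanics around it are the point: embed once at indexing time, embed the query, rank by cosine similarity.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder for a real embedding model. This toy version hashes
    character trigrams into a fixed-size vector just to keep the sketch runnable."""
    vec = np.zeros(64)
    for i in range(len(text) - 2):
        vec[hash(text[i:i + 3].lower()) % 64] += 1.0
    return vec

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

# Indexing: embed every recipe once and keep the vectors.
recipes = ["Carrot Soup", "Pumpkin Soup", "Chocolate Cake"]
vectors = [embed(r) for r in recipes]

# Querying: embed the query and rank recipes by semantic closeness.
query = embed("a soup made with carrots")
ranked = sorted(zip(recipes, (cosine(query, v) for v in vectors)),
                key=lambda kv: kv[1], reverse=True)
print(ranked)  # closest matches first; relevance is probabilistic, not exact
```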

Ontology-defined knowledge base

This is the most structured approach, sitting at the "schema-on-write" extreme. It's the core of systems built for absolute, business-critical, and highly complex use cases (think of data platforms like Palantir) where the relationships between data points are just as important as the data itself.

The "schema" for this system is an ontology: a formal model that defines the domain's logic. It models the concepts, relationships, and rules that give the raw data its context and meaning.  It explicitly defines the "objects" (entities like Recipe or Ingredient), the "relationships" that connect them (like hasIngredient), and the "rules" that govern them . These relationships are treated as first-class, queryable parts of the model, not just as structural links. 

Information is stored by instantiating these formal concepts. You don't just index the text "Carrot Soup"; you explicitly define:

  • An object: Carrot Soup (Type: Recipe)

  • An object: Carrot (Type: Ingredient, Category: Root Vegetable)

  • A relationship: Carrot Soup → hasIngredient → Carrot

  • A rule: a Recipe must have at least one Ingredient via hasIngredient

  • Indexing cost: Very high. Requires deep collaboration between domain experts and data engineers to define the ontology and map data.

  • Querying cost: Moderate. The user no longer searches for text; they explore relationships (e.g., "Find all Recipes that hasIngredient Carrot").

  • Quality: Very high. Results are precise, deterministic, and explainable because the "meaning" is the index itself.
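The sketch below captures the core idea without assuming any particular platform or RDF library: facts are stored as explicit subject-predicate-object triples, and retrieval is a traversal of those relationships rather than a text match. A real ontology layer adds formal rule checking, inference, and governance on top.

```python
# Every fact is an explicit, queryable triple: (subject, predicate, object).
triples = {
    ("CarrotSoup", "type", "Recipe"),
    ("Carrot", "type", "Ingredient"),
    ("Carrot", "category", "RootVegetable"),
    ("CarrotSoup", "hasIngredient", "Carrot"),
}

def subjects(predicate: str, obj: str) -> set[str]:
    """Return every subject linked to `obj` through `predicate`."""
    return {s for (s, p, o) in triples if p == predicate and o == obj}

# "Find all Recipes that hasIngredient Carrot" is relationship traversal:
print(subjects("hasIngredient", "Carrot") & subjects("type", "Recipe"))  # {'CarrotSoup'}

# The rule "a Recipe must have at least one Ingredient" is checkable against
# the same triples, which is what keeps results deterministic and explainable.
for recipe in subjects("type", "Recipe"):
    assert any(s == recipe and p == "hasIngredient" for (s, p, o) in triples)
```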

Applications (CRM, Recipe App, etc.)

This is the "peak" indexing cost approach. Developers encode structure and logic directly into product features. The retrieval mechanism is abstracted away entirely by the user interface: a search box, a filter, or a button.

  • Indexing cost: Very high. The schema, backend logic, and interface are all built upfront.

  • Querying cost: Very low. Retrieval is abstracted away by a button or filter.

  • Quality: Very high. The application is a deterministic system; retrieval logic is pre-defined and trusted, guaranteeing the result quality.

Recap of Costs

| System | Indexing cost | Querying cost | Quality |
| --- | --- | --- | --- |
| File management system | 🟩 Very low | 🟥 Very high | 🟥 Very low |
| Databases | 🟥 High | 🟨 Moderate | 🟩 High |
| Keyword search | 🟨 Moderate | 🟩 Low | 🟥 Low |
| Vector search | 🟨 Moderate | 🟩 Low | 🟥 Low |
| Knowledge Base / Ontology | 🟥 Very high | 🟨 Moderate | 🟩 Very high |
| Applications | 🟥 Very high | 🟩 Very low | 🟩 Very high |

Deciding where to pay the retrieval cost

This trade-off is the core design choice for any information system. A system's design is ultimately an economic bet on its expected usage patterns: how flexible it needs to be, how frequently it will be queried, and how reliable its answers must be.

Paying a high upfront indexing cost only makes sense if it can be amortized over many future retrievals. The real choice is who pays the effort (the builder or the user) and when (upfront or at the moment of retrieval).

Choose rigidity or flexibility

This is a direct consequence of your bet on frequency.

  • Choose rigidity when retrieval patterns are known, shared, and frequent (e.g., "Show me this customer's order history"). You pay the high upfront indexing cost to build a reliable system, betting that this cost is worth the low-cost experience for thousands of future retrievals.

  • Choose flexibility when retrieval patterns are unknown or exploratory (e.g., "What's our most unusual sales trend this quarter?"). You pay a lower indexing cost and shift the effort to the user, betting it's cheaper than building a rigid system for every possible question.

Choose deterministic truth or probabilistic discovery

This choice is about the fidelity of the answer.

  • Choose deterministic truth when results must be verifiably "true" and complete (e.g., an accounting ledger). This requires a high indexing cost to build a system with perfect data integrity.

  • Choose probabilistic discovery when you need a "good" or "relevant" answer, not a single, perfect one. Systems like keyword and vector search trade logical precision for speed and discovery.

Remember "pay upfront" means "pay to maintain"

The "pay once" model is a myth. A high upfront indexing cost, like a database schema, isn't a one-time payment. It commits you to a high ongoing cost for maintenance, governance, and updates every time business needs change. The true cost is upfront design + continuous governance.

What LLMs (and RAG) change

LLMs do not remove the economics of information retrieval. They simply change where and how the costs occur. They can be applied directly to traditional systems to augment them, and they also require, enable, and benefit from new architectures like RAG.

LLMs' impact on traditional retrieval systems

Let's first look at how LLMs augment the traditional systems themselves:

  • File system: An LLM can scan documents to suggest tags or summaries. This adds a new (but small) indexing cost, which in turn slightly reduces the querying cost by making files more findable. The fundamental problem (high-cost manual browsing) largely remains.

  • Databases: LLMs can help lower the indexing cost by transforming unstructured data into structured rows. Text-to-SQL drastically lowers the querying cost: the user no longer needs to learn SQL, but the cost doesn't disappear. It shifts to the lower quality of a plausible-sounding but potentially flawed answer (a sketch of this pattern follows this list).

  • Keyword Search: LLMs can act as powerful "assistant indexers." By pre-reading documents to extract entities, add summaries, and generate synonyms, they enrich the index. This requires a higher upfront indexing cost (to configure and manage these enrichment pipelines) but significantly improves the quality for the same low-effort user query.

  • Vector Search: This system is already an LLM-native approach. Its costs are a moderate human indexing cost (to choose a model and manage the embedding pipeline) and a rather low quality (due to the probabilistic nature of the results).

  • Ontology-defined knowledge: LLMs can help lower the very high indexing cost by transforming unstructured sources (raw text, unstructured documents) into structured entities. However, defining the core ontology still requires human domain experts. They also drastically lower the querying cost. Because the ontology provides a rich semantic layer, it is uniquely accessible to natural language, giving the LLM a map of meaning and context that a raw database schema lacks. This shifts the user's effort from "learning the ontology" to "phrasing the question perfectly".

  • Applications (CRM, etc.): LLMs can marginally reduce the indexing cost (the human cost of development) by using coding agents to help build application logic faster. On the user side, conversational interfaces can replace fixed filters, lowering the querying cost. However, this introduces a significant risk: it trades the application's "very high" (deterministic) quality for the "rather low" (probabilistic) quality of a conversational interface.
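As a sketch of the text-to-SQL pattern mentioned for databases above: the call_llm function is a stand-in for whatever model API you use, and its response is hard-coded so the example stays self-contained. Note where the costs sit: a human still designed the schema, and the generated SQL still has to be validated before anyone trusts its answer.

```python
SCHEMA = (
    "recipes(id, name)\n"
    "ingredients(id, name)\n"
    "recipe_ingredients(recipe_id, ingredient_id)"
)

def build_prompt(question: str, schema: str) -> str:
    # The indexing cost is still real: a human designed the schema the model sees.
    return (
        "You translate questions into SQL.\n"
        f"Schema:\n{schema}\n"
        f"Question: {question}\n"
        "Return a single SELECT statement."
    )

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call; hard-coded so the sketch runs offline."""
    return (
        "SELECT r.name FROM recipes r "
        "JOIN recipe_ingredients ri ON ri.recipe_id = r.id "
        "JOIN ingredients i ON i.id = ri.ingredient_id "
        "WHERE i.name = 'Carrot';"
    )

def validate(sql: str) -> str:
    # The quality cost resurfaces here: generated SQL is plausible, not guaranteed
    # correct, so it must be checked (and ideally sandboxed) before execution.
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError(f"Refusing to run non-SELECT statement: {sql!r}")
    return sql

print(validate(call_llm(build_prompt("Which recipes use carrots?", SCHEMA))))
```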

RAG simply redistributes the retrieval cost

Retrieval-Augmented Generation (RAG) is not a new retrieval system, but a new architecture built on top of the traditional ones.

A RAG pipeline is a two-step process:

  1. Retrieve: Use a system (like vector search or keyword search) to find relevant context.

  2. Generate: Feed that context, along with the user's prompt, to an LLM to synthesize a new answer.
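Here is a minimal sketch of those two steps. The retriever is a toy word-overlap ranker standing in for any of the systems above, and call_llm is again a placeholder for a real model API.

```python
import re

def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(query: str, docs: dict[str, str], k: int = 2) -> list[str]:
    """Step 1 - Retrieve: rank documents by word overlap with the query.
    Any retrieval system (keyword, vector, ontology) could sit behind this call."""
    q = tokens(query)
    ranked = sorted(docs.items(), key=lambda kv: len(q & tokens(kv[1])), reverse=True)
    return [text for _doc_id, text in ranked[:k]]

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call."""
    return "(model-generated answer based on the supplied context)"

def rag_answer(query: str, docs: dict[str, str]) -> str:
    """Step 2 - Generate: append the retrieved context to the user's prompt
    and let the model synthesize an answer from that source material."""
    context = "\n\n".join(retrieve(query, docs))
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
    return call_llm(prompt)

docs = {
    "carrot_soup": "Carrot Soup: simmer carrots with onion and stock, then blend.",
    "pumpkin_soup": "Pumpkin Soup: roast pumpkin and blend with cream.",
}
print(rag_answer("Which recipes use carrots?", docs))
```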

This new architecture doesn't eliminate the human cost. It redistributes it:

  • Indexing cost remains: You still pay the full human indexing cost for whichever retrieval system you use (e.g., the human effort of collaborating with domain experts to define the ontology).

  • Querying cost is transformed: The cost shifts from technical skill (like writing SQL) to cognitive skill (effective prompt engineering). The user no longer just finds information. They must guide the LLM to use its retrieval tools correctly and validate the results. This is a very different, and often still high, form of human effort.

  • Quality cost is transformed (and often hidden): This is the most critical change. The risk is no longer just "Did I miss a document?" (recall). The risk is now synthetic and semantic: "Did the LLM use the right facts, ignore the wrong ones, and synthesize them correctly without a plausible-sounding hallucination?"

There is no "best" RAG, only the "right cost"

The current debate over different RAG approaches, such as vector RAG, agentic RAG, or ontology RAG, isn't a question of which technology is "best." These are simply different economic strategies for distributing the human cost of retrieval. Each approach directly inherits the cost structure (indexing, querying, and quality) of the underlying retrieval system it's built on.

There is still no free lunch in information retrieval. The human cost is always paid. It’s just a matter of aligning its distribution with the job:

  • Deploying a "catch-all" RAG pipeline over a single vector space is an economic bet. It trades a low indexing cost for a very high, often hidden, quality cost (probabilistic answers, hallucinations) in situations that may require deterministic truth.

  • Conversely, building a massive, perfect company-wide ontology just to answer simple, low-value queries (like "What is our time off policy?") is also a bet. It represents the opposite mismatch: paying an enormous, unnecessary indexing cost for a task a simple keyword search could have solved.

The right architecture is an economic bet based on what the system is meant to solve. This understanding of the expected query patterns, retrieval frequency, and acceptable level of quality determines the correct tradeoff and dictates what indexing cost is the right one to pay.


Matthieu Blandineau

Head of Marketing at Blue Morpho
