RAG is not intelligence.
RAG is numerator management.
That distinction matters because the AI industry often talks about retrieval-augmented generation as though it solves the hallucination problem by “grounding” the model. That language is useful, but it is not precise enough for the advanced student.
In the Reality Equation, Reality is Actual over Expectation.
Reality = Actual / Expectation
Actual is the numerator.
Expectation is the denominator.
Expectation is complex. The real component is subconscious prediction. The imaginary component is ideas.
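As a notation sketch (the symbols A, P, and I are shorthand introduced here, not part of the original formulation), the equation with its complex denominator can be written:

```latex
\mathrm{Reality} \;=\; \frac{A}{E},
\qquad
E \;=\; \underbrace{P}_{\substack{\text{subconscious}\\ \text{prediction (real)}}}
\;+\; i\,\underbrace{I}_{\text{ideas (imaginary)}}
```

Here A stands for Actual, and the denominator E carries prediction as its real component and ideas as its imaginary component.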
A large language model belongs primarily in the real component of the denominator. It is a synthetic subconscious prediction machine. It predicts words, images, code, arguments, structures, examples, and answers.
That is the part everyone has been celebrating.
And rightly so.
The prediction machine is extraordinary.
But prediction is not Actual.
Prediction is not Reality.
Prediction is one component of the denominator.
So when people say, “The model hallucinated,” they are often describing a numerator problem. The prediction machine produced an output, but the output was not disciplined by Actual.
The citation did not exist.
The quotation did not appear in the source.
The date was wrong.
The person never said that.
The contract did not contain that clause.
The medical record did not support that summary.
The financial transaction did not occur.
The model predicted something plausible, but the numerator did not agree.
That is not mysterious.
That is what happens when prediction is asked to behave as though it were Actual.
RAG is one attempt to correct this.
Retrieval-augmented generation brings external material into the system. It retrieves documents, records, passages, files, facts, snippets, or database entries and gives the prediction machine something to work with besides its own learned patterns.
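A minimal sketch of that retrieval step, using toy bag-of-words similarity in place of learned embeddings. The corpus, function names, and prompt wording here are hypothetical illustrations, not a reference implementation:

```python
import math
from collections import Counter

# Toy corpus standing in for the "declared Actual" of a bounded system.
CORPUS = {
    "returns": "Items may be returned within 30 days with a receipt.",
    "shipping": "Standard shipping takes five business days.",
    "warranty": "The warranty covers manufacturing defects for one year.",
}

def _vector(text):
    # Bag-of-words term counts; real systems use learned embeddings.
    return Counter(text.lower().split())

def _cosine(a, b):
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query, k=1):
    """Return the k corpus passages most similar to the query."""
    q = _vector(query)
    ranked = sorted(CORPUS.items(),
                    key=lambda kv: _cosine(q, _vector(kv[1])),
                    reverse=True)
    return ranked[:k]

def build_prompt(query):
    # The retrieved passage becomes the declared Actual handed to the model.
    passages = "\n".join(text for _, text in retrieve(query))
    return (f"Answer using ONLY the material below.\n\n"
            f"{passages}\n\nQuestion: {query}")
```

Everything the model will be allowed to treat as Actual flows through `retrieve`; the prompt merely declares it.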
In ordinary AI language, this is called grounding.
In the Reality Equation, it is numerator management.
That phrase is more exact.
RAG tries to supply something that can serve as Actual inside the artificial laboratory.
Not cosmic Actual.
Not the Immutable Past.
Not the universal condition as She gives it.
Laboratory Actual.
Declared Actual.
This is the crucial distinction.
In lived human Reality, Actual is not something we manage. Actual is what happened. Actual is past tense. Actual is immutable. We do not alter it, substitute it, retrieve it, relabel it, chunk it, embed it, or store it in a vector database.
Actual is given.
Human beings never encounter pure Actual directly. We never step outside the eternal now and inspect the past as it is in itself. We receive Reality as the quotient. Actual has already participated in the ratio before consciousness begins.
That is our luxury.
We do not have to build the numerator.
But in artificial systems, especially in the laboratory, we do.
We have to decide what will count as Actual for the experiment.
That may be a dataset.
It may be a set of labeled images.
It may be a database of customer records.
It may be a corpus of contracts.
It may be a library of research papers.
It may be a folder of invoices.
It may be a knowledge base.
It may be a medical chart.
It may be the accepted answer key.
It may be a set of historical outcomes.
Whatever it is, we are placing something in the numerator and declaring, “For this bounded artificial system, this is what actually happened.”
That is numerator management.
And the numerator is never innocent.
A dataset can be mislabeled.
A document can be stale.
A citation database can be incomplete.
A customer record can be out of date.
A medical chart can contain an error.
A contract can be missing an amendment.
A transcript can omit context.
A retrieved passage can be relevant but insufficient.
A search result can be highly ranked and still wrong.
A vector database can retrieve something semantically similar but factually inappropriate.
So when we say RAG grounds the model, we have to be careful.
RAG does not hand the system Actual in the full metaphysical sense.
RAG hands the system a managed numerator.
Sometimes that numerator is excellent.
Sometimes it is polluted.
Sometimes it is too narrow.
Sometimes it is too broad.
Sometimes it is outdated.
Sometimes it is mislabeled.
Sometimes it is retrieved correctly but interpreted poorly.
Sometimes the correct passage is present but the model’s prediction still overwhelms it.
That is why RAG does not eliminate hallucination by itself.
It only changes the architecture of the error.
Without RAG, the model may hallucinate because it is relying too heavily on prediction.
With RAG, the system may fail because the numerator was wrong, incomplete, stale, or poorly connected to the prediction machine.
The problem has moved.
It has not disappeared.
This is why the phrase “declared Actual” is so important.
In the artificial laboratory, Actual is not simply Actual. It is what the system has been given permission to treat as Actual.
That declared Actual must be inspected.
Who created it?
Who labeled it?
What does it omit?
What time period does it cover?
What category system does it assume?
What errors does it contain?
What was excluded?
What was overrepresented?
What has changed since it was stored?
What is being retrieved?
What is not being retrieved?
What is being treated as authoritative?
What is merely nearby?
These are numerator questions.
They are not agent questions.
They are not prompt questions.
They are not merely model questions.
They are numerator questions.
This matters because the current AI conversation keeps trying to solve too much on the left-hand side of the equation. We build agents and expect them to compensate for a malformed right-hand side.
But the agent is just a function.
The agent acts on what it receives.
If the numerator is weak, the agent cannot magically turn prediction into Reality.
If the declared Actual is polluted, the quotient is polluted.
If the quotient is polluted, the function acts on polluted Reality.
That is how errors enter history.
A research agent that retrieves the wrong paper may still write a beautiful summary.
A legal assistant that retrieves an outdated clause may still produce confident analysis.
A medical summarizer that retrieves the wrong chart note may still sound clinically fluent.
A customer service system that retrieves an obsolete policy may still answer politely.
A financial assistant that retrieves an incomplete transaction history may still produce a clean report.
The fluency is not the issue.
The numerator is.
This is why RAG systems should be judged less by how impressive the final answer sounds and more by how well the numerator was managed.
What did the system retrieve?
Was it the right material?
Was it complete enough?
Was it current?
Was it authoritative?
Was it actually used?
Did the model distinguish retrieved Actual from predicted filler?
Did the final output preserve the boundary between what was retrieved and what was inferred?
Did the system know when the numerator was insufficient?
That last question is especially important.
A mature RAG system should not merely retrieve and answer.
It should know when the declared Actual is inadequate.
It should say, in effect:
The numerator is too weak.
The retrieved material does not support the requested conclusion.
The source base is incomplete.
The evidence is insufficient.
The Actual supplied to this system does not permit that answer.
That is a higher standard than ordinary retrieval.
It is numerator discipline.
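One way to sketch that discipline in code. The scoring floor and the response wording are hypothetical; the point is the shape: a wrapper that abstains when the best retrieved passage scores below a threshold, instead of letting prediction fill the gap.

```python
# Numerator discipline as a guard: refuse to answer when the declared
# Actual is too weak. Threshold and message text are placeholders.

SUFFICIENCY_FLOOR = 0.35  # minimum retrieval score we will act on

def answer_or_abstain(query, retrieved, floor=SUFFICIENCY_FLOOR):
    """retrieved: list of (passage, score) pairs, best first."""
    if not retrieved or retrieved[0][1] < floor:
        return {
            "status": "abstain",
            "reason": ("The retrieved material does not support "
                       "the requested conclusion."),
        }
    passage, score = retrieved[0]
    return {"status": "answer", "evidence": passage, "score": score}
```

The guard runs before generation, so a weak numerator stops the pipeline rather than decorating a prediction.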
This also explains why RAG can be powerful in narrow domains and fragile in broad ones.
If the domain is narrow, the numerator can be managed well. A company’s current return policy. A restaurant’s menu. A product catalog. A set of installation manuals. A defined contract library. A known body of invoices.
In those cases, the declared Actual can be bounded, updated, checked, and retrieved with some confidence.
The quotient becomes more stable.
The agent can act more safely.
But if the domain is broad, open-ended, contested, rapidly changing, or conceptually ambiguous, numerator management becomes much harder.
What counts as Actual?
Which sources are authoritative?
Which documents are current?
Which facts are settled?
Which claims are disputed?
Which labels are reliable?
Which passages matter?
Which context is missing?
The larger the world, the harder it is to manage the numerator.
This is why the dream of a universal agent is so difficult. It is not merely that the agent needs more tools. It is that the numerator becomes impossible to manage cleanly across all domains.
Human experience hides this difficulty because Reality is given to us. We do not consciously build the numerator every time we act.
Artificial systems do not have that luxury.
They must be supplied with something to treat as Actual.
And whatever they are supplied with will shape the quotient.
This is also why data quality is not a boring operational issue.
Data quality is numerator quality.
Bad data does not merely make AI less accurate. It distorts the artificial Reality on which the agent acts.
A mislabeled X-ray is not a small clerical problem. It is a false numerator.
A stale policy document is not merely outdated content. It is a false numerator.
A hallucinated citation is not merely a bad sentence. It is a missing numerator masquerading as Actual.
An incomplete customer record is not merely an inconvenience. It is a partial numerator.
A poorly chunked document is not merely a technical artifact. It can break the system’s access to declared Actual.
This is why serious AI systems require numerator governance.
Not just model governance.
Not just prompt engineering.
Not just agent design.
Numerator governance.
What is allowed to count as Actual?
How is it updated?
How is it labeled?
How is it retrieved?
How is it verified?
How is uncertainty represented?
How are conflicts handled?
How are stale records removed?
How are source hierarchies maintained?
How does the model know when the numerator is insufficient?
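These governance questions can be made concrete as metadata carried by every record admitted to the numerator. A sketch, with hypothetical field names and a hypothetical 90-day freshness window:

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Numerator governance as data: every record allowed to count as
# Actual carries provenance and freshness metadata.

@dataclass
class DeclaredActual:
    text: str
    source: str          # who created it
    labeled_by: str      # who labeled it
    as_of: date          # what time period it covers
    authority_rank: int  # position in the source hierarchy (lower = stronger)

    def is_stale(self, today, max_age=timedelta(days=90)):
        return today - self.as_of > max_age

def admissible(records, today):
    """Keep only fresh records, strongest authority first."""
    fresh = [r for r in records if not r.is_stale(today)]
    return sorted(fresh, key=lambda r: r.authority_rank)
```

A retrieval layer that queries only `admissible` records has answered, in code, who created the numerator, when, and with what standing.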
These questions are not optional.
They determine whether the system is constructing anything close to synthetic Reality or merely decorating prediction with retrieved text.
That is the weakness of many current RAG systems. They retrieve material, paste it into the context window, and hope the prediction machine behaves.
Sometimes it does.
Sometimes it does not.
But the deeper issue is that retrieval alone is not the same as numerator management.
Numerator management requires structure.
It requires authority.
It requires boundaries.
It requires freshness.
It requires conflict resolution.
It requires the system to know the difference between declared Actual and prediction.
Without that, RAG becomes theater.
The model quotes a passage.
The answer sounds grounded.
The user relaxes.
But the quotient may still be malformed.
The advanced student should therefore learn to see RAG as a laboratory imitation of the numerator.
It is not the numerator in the cosmic sense.
It is not the Immutable Past.
It is a constructed attempt to give the artificial system something that can function like Actual for a defined purpose.
That is both powerful and dangerous.
Powerful because it allows prediction to be disciplined by source material.
Dangerous because the source material may be treated as more complete than it is.
In creative workflows, this may not matter much. If the model is generating fictional stories, brand images, slogans, or mood boards, the human may be able to accept prediction as Actual. The numerator can be light because the domain permits acceptance.
But in truth-bound workflows, the numerator is everything.
Research.
Law.
Medicine.
Finance.
Engineering.
Compliance.
Insurance.
Contracts.
Public records.
In these domains, prediction cannot simply become Actual by acceptance.
Actual must discipline prediction.
RAG is one of the main ways artificial systems try to do that.
But RAG must be understood as numerator management, not as a magic cure for hallucination.
Once we see this, the design priorities become clearer.
First, define the declared Actual.
Second, inspect its quality.
Third, understand its limits.
Fourth, retrieve from it carefully.
Fifth, force the model to distinguish retrieved Actual from predicted completion.
Sixth, prevent the agent from acting when the numerator is insufficient.
That is the mature sequence.
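The six steps above can be sketched as a pipeline skeleton. All names and the sufficiency floor are hypothetical; a real system would replace each stub with a domain-specific implementation:

```python
# Skeleton of the mature sequence. The retrieve and generate callables
# are supplied by the caller; this function only enforces the order.

def run_pipeline(query, corpus, retrieve, generate, floor=0.5):
    # 1. Define the declared Actual: the bounded corpus passed in.
    # 2. Inspect its quality (here: reject records missing text or source).
    declared = [doc for doc in corpus if doc.get("text") and doc.get("source")]
    # 3. Understand its limits: record which sources it actually covers.
    coverage = {doc["source"] for doc in declared}
    # 4. Retrieve from it carefully.
    hits = retrieve(query, declared)
    # 5. Keep retrieved Actual separate from predicted completion, and
    # 6. prevent action when the numerator is insufficient.
    if not hits or hits[0]["score"] < floor:
        return {"status": "abstain", "coverage": sorted(coverage)}
    answer = generate(query, evidence=[h["text"] for h in hits])
    return {"status": "answer", "answer": answer, "evidence": hits}
```

Returning the evidence alongside the answer is what preserves the boundary between what was retrieved and what was inferred.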
Prediction comes from the real component of the denominator.
Ideas appear as the imaginary component.
Declared Actual sits in the numerator.
Synthetic Reality is the quotient.
The agent is the function applied afterward.
If the numerator is poorly managed, everything downstream is compromised.
So the next time someone says, “We solved the hallucination problem with RAG,” the advanced student should pause.
RAG does not solve hallucination.
RAG manages the numerator.
And the quality of the numerator determines the quality of the artificial Reality the system can construct.
