Cited Answers, Not Just Search Results

How Context Repo's reason tool turns retrieved document evidence into synthesized answers with citations, gaps, and conflicts, and why that costs more than returning ranked chunks.

By Context Repo Team

A search result is not an answer. It is a set of places where an answer might live.

That distinction matters when an AI agent is working from a knowledge base. A ranked list of chunks can be useful, but the user often asked a direct question: "What does the policy say?", "Which setup steps are required?", "Do these documents disagree?", "What is missing?"

If the agent receives only search results, it still has to read, compare, cite, and decide whether the evidence is enough. If it does that work loosely, the final response can look confident while resting on weak or incomplete context.

Context Repo's reason flow is built for the next step after retrieval. It turns retrieved document evidence into a single answer with sources, gaps, and conflicts. It is not a replacement for Deep Search. It is a higher-cost layer on top of retrieval for moments when the user needs a grounded answer, not just a hit list.

What reason returns

At the product level, reason returns five things:

  • answer: synthesized prose answering the user's question from the gathered evidence.
  • citations: source chunks the answer relies on, with chunkId, documentId, documentTitle, and similarity score.
  • gaps: specific things the repository does not establish about the question.
  • conflicts: contradictions or disagreements surfaced across gathered sources.
  • meta: operational metadata such as how many chunks were gathered, how many citation objects were dropped during validation, and latency.

The shape is deliberately different from deep_search. Deep Search gives an agent passages to inspect. reason gives the agent or user a composed answer plus the evidence trail needed to verify it.

{
  answer: string,
  citations: Array<{
    chunkId: string,
    documentId: string,
    documentTitle: string,
    score: number,
  }>,
  gaps: string[],
  conflicts: string[],
  meta: {
    chunksGathered: number,
    citationsDropped: number,
    latencyMs: number,
  },
}

reason is read-only. It does not create documents, update prompts, write summaries back to the repository, or persist a reasoning trace. It operates on document content, not prompt content.
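To make the response shape above concrete from a client's side, here is a minimal TypeScript sketch that names the shape and renders it as plain text. The type names ReasonCitation and ReasonResponse and the helper formatReasonResponse are hypothetical, chosen for illustration; only the field names come from the shape shown above.

```typescript
// Hypothetical type names; the fields mirror the response shape described above.
interface ReasonCitation {
  chunkId: string;
  documentId: string;
  documentTitle: string;
  score: number;
}

interface ReasonResponse {
  answer: string;
  citations: ReasonCitation[];
  gaps: string[];
  conflicts: string[];
  meta: { chunksGathered: number; citationsDropped: number; latencyMs: number };
}

// Render the answer followed by its evidence trail as plain text.
// Gap and conflict sections are printed only when they are non-empty.
function formatReasonResponse(res: ReasonResponse): string {
  const lines: string[] = [res.answer, "", "Sources:"];
  for (const c of res.citations) {
    lines.push(`  - ${c.documentTitle} (chunk ${c.chunkId}, score ${c.score.toFixed(2)})`);
  }
  if (res.gaps.length > 0) {
    lines.push("Not established by the repository:");
    for (const gap of res.gaps) lines.push(`  - ${gap}`);
  }
  if (res.conflicts.length > 0) {
    lines.push("Conflicting sources:");
    for (const conflict of res.conflicts) lines.push(`  - ${conflict}`);
  }
  return lines.join("\n");
}
```

A client could call formatReasonResponse on any parsed response and show the result next to the answer, so the sources and gaps are never separated from the prose they qualify.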

The three-stage flow

At the architecture level, the implementation is simple to describe:

  1. Gather evidence from the document chunk retrieval system.
  2. Synthesize an answer from that evidence in one server-managed model call.
  3. Validate citations before returning sources to the client.

That flow is what separates cited answering from ordinary retrieval. The system does not ask the model to search the whole repository by itself. Retrieval happens first. The model receives a bounded evidence set. Then the server checks the citation objects the model emits against that same evidence set.
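The ordering of the three stages can be sketched as follows. This is an illustration under stated assumptions, not Context Repo's implementation: the names Chunk, Draft, Stages, and answerWithCitations are hypothetical, and the stages are injected as plain synchronous functions to keep the sketch short, whereas a real retrieval call and model call would be asynchronous.

```typescript
// Hypothetical types for the sketch.
interface Chunk {
  chunkId: string;
  documentId: string;
  documentTitle: string;
  score: number;
  text: string;
}

interface Draft {
  answer: string;
  citedChunkIds: string[];
}

interface Stages {
  gather: (question: string) => Chunk[]; // stage 1: retrieval
  synthesize: (question: string, evidence: Chunk[]) => Draft; // stage 2: one model call
}

function answerWithCitations(question: string, stages: Stages) {
  // Stage 1: retrieval runs first, so the model never searches on its own.
  const evidence = stages.gather(question);

  // Stage 2: the model sees the question plus the bounded evidence set.
  const draft = stages.synthesize(question, evidence);

  // Stage 3: keep only cited IDs that belong to the set gathered in stage 1.
  const gatheredIds = new Set(evidence.map((chunk) => chunk.chunkId));
  const citedChunkIds = draft.citedChunkIds.filter((id) => gatheredIds.has(id));

  return { answer: draft.answer, citedChunkIds, chunksGathered: evidence.length };
}
```

Injecting the stages keeps the control flow visible: the same evidence array feeds both synthesis and validation, which is what ties the returned sources to the retrieval set.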

Stage 1: gather evidence before answering

reason starts with the same progressive disclosure retrieval layer that powers Deep Search. The query is embedded, the document chunk index is searched for the authenticated user, and the results are scoped before synthesis.

The caller can optionally restrict the answer to:

  • a single document with documentId;
  • a single collection's documents with collectionId;
  • a caller-selected gather breadth with limit;
  • a maximum evidence budget with maxTokens on the REST surface.

The important point is that the answer begins from document chunks the user is allowed to access. The retrieval layer enforces user isolation, applies document and collection scope, and returns chunks with document titles and similarity scores.

This matters because a synthesis model should not be asked to "remember" the user's repository. It should be asked to answer from the evidence the system has already gathered.
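As a rough sketch of how those scoping options might combine, the TypeScript below filters ranked chunks by owner, then by optional document or collection scope, then by gather breadth. Several things here are assumptions: the request field name query, the ownerId field, the DEFAULT_LIMIT value, and the helper scopeChunks are all hypothetical, and a real service would likely apply these filters inside the index query rather than in memory.

```typescript
// Hypothetical request shape built from the options listed above.
interface ReasonRequest {
  query: string;
  documentId?: string; // restrict to a single document
  collectionId?: string; // restrict to one collection's documents
  limit?: number; // gather breadth
  maxTokens?: number; // evidence budget on the REST surface (not used in this sketch)
}

interface StoredChunk {
  chunkId: string;
  ownerId: string;
  documentId: string;
  collectionId: string | null;
  score: number;
}

// Assumed default breadth; the source does not state a value.
const DEFAULT_LIMIT = 10;

// Expects `ranked` to be sorted by similarity, best match first.
function scopeChunks(userId: string, req: ReasonRequest, ranked: StoredChunk[]): StoredChunk[] {
  const scoped = ranked
    // User isolation comes first: never consider another user's chunks.
    .filter((c) => c.ownerId === userId)
    // Optional document and collection scope.
    .filter((c) => req.documentId === undefined || c.documentId === req.documentId)
    .filter((c) => req.collectionId === undefined || c.collectionId === req.collectionId);
  return scoped.slice(0, req.limit ?? DEFAULT_LIMIT);
}
```

Whatever survives this step is the only material the synthesis stage gets to see, which is why isolation and scope are applied before any model call.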

Stage 2: synthesize under evidence constraints

After gathering, reason builds a bounded evidence block from the retrieved chunks and sends the user's actual question with that evidence. The question matters. Without it, the model can only summarize nearby text. It cannot judge whether the retrieved chunks answer the thing the user asked.

The synthesis step is asked to produce structured output:

  • an answer;
  • cited chunk references;
  • gaps;
  • conflicts;
  • an internal sufficiency judgment about whether the evidence addresses the question.

Put simply, the answer is supposed to be evidence-bound, not creative. If the gathered chunks do not answer the question, the correct result is a gap, not a confident summary of adjacent facts.

The synthesis model is managed server-side. Callers do not choose a model, temperature, or reasoning settings for this endpoint. That keeps the public contract focused on the result: cited answers over repository documents.
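One way to picture a bounded evidence block is a loop that adds chunks in rank order until a token budget is spent, then pairs the block with the user's question. The sketch below is illustrative only: the four-characters-per-token estimate, the prompt wording, and the names estimateTokens, buildEvidenceBlock, and buildPrompt are all assumptions, not the server's actual behavior.

```typescript
interface EvidenceChunk {
  chunkId: string;
  documentTitle: string;
  text: string;
}

// Rough estimate (about 4 characters per token); an assumption for this sketch.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Add chunks in rank order until the next one would exceed the budget.
function buildEvidenceBlock(
  chunks: EvidenceChunk[],
  maxTokens: number,
): { block: string; used: EvidenceChunk[] } {
  const used: EvidenceChunk[] = [];
  const parts: string[] = [];
  let budget = maxTokens;
  for (const chunk of chunks) {
    // Each entry carries its chunk ID so the model can cite it.
    const entry = `[${chunk.chunkId}] ${chunk.documentTitle}\n${chunk.text}`;
    const cost = estimateTokens(entry);
    if (cost > budget) break; // stop at the first chunk that does not fit
    parts.push(entry);
    used.push(chunk);
    budget -= cost;
  }
  return { block: parts.join("\n\n"), used };
}

// Hypothetical prompt wording: the question always travels with the evidence.
function buildPrompt(question: string, evidenceBlock: string): string {
  return [
    "Answer the question using only the evidence below.",
    "Cite chunk IDs for claims, and list gaps and conflicts.",
    "",
    `Question: ${question}`,
    "",
    "Evidence:",
    evidenceBlock,
  ].join("\n");
}
```

Returning the used chunks alongside the block matters for the next stage: citations can then be checked against exactly the chunks the model was shown.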

Stage 3: validate citation objects server-side

Citation validation is the most important reliability step.

The synthesis step emits citation objects containing chunk IDs. Before returning them, the server compares those IDs against the chunks that were actually gathered. A citation object survives only if it points to a gathered chunk. Valid citation objects are deduplicated and enriched with document metadata. Invalid citation objects are dropped, and the count is exposed through meta.citationsDropped.

That does not make reason magic. It does not prove that every sentence is perfect, and it does not create evidence that was not retrieved. It does give clients a concrete source list that is tied back to the retrieval set, rather than trusting arbitrary citation text.

This is the difference between "the model wrote a source-looking token" and "the server accepted this chunk as one of the gathered sources."
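The validation step described above can be sketched in a few lines of TypeScript. This is a minimal illustration, not the server's code: the names GatheredChunk, EmittedCitation, and validateCitations are hypothetical, and the choice to leave duplicates out of the dropped count is an assumption, since the description above only says that invalid citation objects are counted.

```typescript
interface GatheredChunk {
  chunkId: string;
  documentId: string;
  documentTitle: string;
  score: number;
}

// What the synthesis step emits: a chunk ID, nothing the server trusts yet.
interface EmittedCitation {
  chunkId: string;
}

function validateCitations(
  emitted: EmittedCitation[],
  gathered: GatheredChunk[],
): { citations: GatheredChunk[]; citationsDropped: number } {
  const byId = new Map<string, GatheredChunk>();
  for (const chunk of gathered) byId.set(chunk.chunkId, chunk);

  const seen = new Set<string>();
  const citations: GatheredChunk[] = [];
  let citationsDropped = 0;

  for (const { chunkId } of emitted) {
    const source = byId.get(chunkId);
    if (source === undefined) {
      citationsDropped += 1; // points outside the gathered set
      continue;
    }
    if (seen.has(chunkId)) continue; // duplicate of an accepted citation (assumed not counted as dropped)
    seen.add(chunkId);
    // Enrich from the gathered chunk, never from model-written text.
    citations.push({
      chunkId,
      documentId: source.documentId,
      documentTitle: source.documentTitle,
      score: source.score,
    });
  }
  return { citations, citationsDropped };
}
```

Note that the title and score in the result come from the gathered chunk, so a model cannot alter source metadata by emitting different text.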

How reason abstains

Good reasoning systems need a way to say "the repository does not establish that."

reason has two abstention paths.

First, if retrieval returns no useful evidence, the system skips synthesis and returns a gap. That saves cost and avoids asking the model to answer from weak context.

Second, if synthesis reports that the gathered evidence does not address the question, the server ensures the response includes a gap even if the model did not fill one in. This prevents an unrelated summary from being presented as if it answered the user.

That behavior is intentionally conservative. The goal is not to answer every question. The goal is to answer when the repository supports an answer and to show the missing pieces when it does not.
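The two abstention paths can be sketched as one small function. The names Synthesis, Outcome, and answerOrAbstain are hypothetical, as are the gap messages and the empty answer string on the first path. The sketch also treats "no useful evidence" as zero gathered chunks, which is a simplifying assumption; the actual threshold is not stated here.

```typescript
interface Synthesis {
  answer: string;
  gaps: string[];
  addressesQuestion: boolean; // the internal sufficiency judgment
}

interface Outcome {
  answer: string;
  gaps: string[];
  synthesisRan: boolean;
}

function answerOrAbstain(
  question: string,
  evidenceCount: number,
  synthesize: () => Synthesis,
): Outcome {
  // Path 1: nothing was gathered, so skip the model call and return a gap.
  if (evidenceCount === 0) {
    return {
      answer: "",
      gaps: [`No stored document evidence was found for: ${question}`],
      synthesisRan: false,
    };
  }

  const result = synthesize();

  // Path 2: the model judged the evidence insufficient but listed no gap,
  // so the server adds one rather than presenting the summary as an answer.
  const gaps = result.gaps.slice();
  if (!result.addressesQuestion && gaps.length === 0) {
    gaps.push(`The gathered evidence does not address: ${question}`);
  }
  return { answer: result.answer, gaps, synthesisRan: true };
}
```

Passing synthesize as a callback makes the cost saving on the first path explicit: when there is no evidence, the function is never invoked.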

Why cited answers are better than a hit list

Ranked chunks are useful for exploration. They are less useful when the user wants a conclusion.

A hit list leaves four jobs for the agent:

  1. Decide which chunks matter.
  2. Compose the actual answer.
  3. Attach sources to claims.
  4. Notice missing or conflicting evidence.

reason moves those jobs into a controlled server flow. The agent receives a single answer plus a structured source list and explicit uncertainty fields. That improves the workflow in several practical ways.

The answer is easier to verify

The citations array points back to exact document chunks. A client can show the sources, inspect the chunk IDs, or route a follow-up through deep_read if it wants to display the source passage in more detail.

The output is not just "trust me." It is "here is the answer, and here are the chunks that were accepted as sources."

Missing evidence is first-class

Many retrieval systems treat missing evidence as an awkward edge case. Context Repo treats it as part of the answer shape.

The gaps array is useful when a document library is incomplete, stale, or ambiguous. It tells the user what the repository did not prove, which is often as important as the answer itself.

Conflicts are not silently hidden

If two gathered sources disagree, the answer should not quietly pick one and bury the other. The conflicts field gives the synthesis layer a place to surface disagreement instead of smoothing it away.

That is valuable for policies, specs, internal runbooks, and evolving documentation where two sources may both exist but only one is current.

Scope is explicit

Because reason can be scoped to one document or one collection, users can ask questions inside a clear boundary. That makes it useful for project collections, client-specific documentation, or one manual where the answer should not pull from the whole repository.

Why it costs more

reason costs more than returning search results because it does more work.

Deep Search performs retrieval and returns chunks. reason performs retrieval, builds evidence, runs a synthesis pass, parses structured output, validates citation objects, applies gap safeguards, and formats the final response.

That extra work is worth paying for when the desired output is an answer with sources. It is not always worth paying for when the user only needs to find a document or inspect the top passages.

The product trade-off is straightforward:

| Task | Better surface | Why |
| --- | --- | --- |
| Find a prompt, document, or collection | find_items | Returns catalog-level matches across saved artifacts |
| Inspect passages inside documents | deep_search, then deep_read or deep_expand | Returns chunks and lets an agent navigate surrounding context |
| Answer a question with sources | reason | Retrieves evidence, synthesizes an answer, and returns citations, gaps, and conflicts |

The higher-cost path is not the default answer to every retrieval problem. It is the path for answer synthesis.

Where reason fits in an agent workflow

A practical agent workflow often looks like this:

  1. Use find_items when the agent does not know which artifact matters.
  2. Use deep_search when the agent needs candidate passages inside documents.
  3. Use deep_read or deep_expand when the agent wants to inspect source context manually.
  4. Use reason when the agent needs a concise, cited answer from document evidence.

Those tools are complementary. reason does not remove the need for search. It packages a common search-plus-synthesis workflow into one consistent endpoint, so that clients do not each have to rebuild it in their own way.
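An agent could encode that workflow as a simple lookup from what it needs to which tool to call. The tool names below come from this article; the Need labels and the toolsFor helper are hypothetical and exist only for illustration.

```typescript
// Hypothetical labels for the four situations in the workflow above.
type Need = "locate-artifact" | "candidate-passages" | "inspect-context" | "cited-answer";

// Tool names as used in this article.
type Tool = "find_items" | "deep_search" | "deep_read" | "deep_expand" | "reason";

function toolsFor(need: Need): Tool[] {
  switch (need) {
    case "locate-artifact":
      return ["find_items"]; // the agent does not know which artifact matters
    case "candidate-passages":
      return ["deep_search"]; // passages inside documents
    case "inspect-context":
      return ["deep_read", "deep_expand"]; // manual inspection of source context
    case "cited-answer":
      return ["reason"]; // concise answer with citations, gaps, and conflicts
  }
}
```

Keeping reason as one option among several reflects the cost trade-off: the agent reaches for it only when the need is an answer, not a passage or a pointer.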

Honest boundaries

reason is only as grounded as the evidence available to it. If the relevant document has not been stored, if the question is outside the repository, or if retrieval does not gather the right chunk, the answer cannot recover that missing context.

The response shape is designed to make those limits visible:

  • empty or weak evidence becomes a gap instead of an answer;
  • unsupported areas are listed in gaps;
  • disagreements can be listed in conflicts;
  • citation objects are accepted only when they match gathered chunks;
  • metadata shows how much evidence was gathered and whether citation objects were dropped.

That is the right standard for a context repository. The system should not pretend to know more than the stored documents establish.
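A client can turn those visible limits into explicit flags before showing an answer. The sketch below is one possible approach, not part of the product: the Caution labels and the cautionsFor helper are hypothetical, and the type repeats only the response fields the checks need.

```typescript
// Subset of the response shape, limited to the fields inspected here.
interface ReasonResult {
  answer: string;
  citations: { chunkId: string }[];
  gaps: string[];
  conflicts: string[];
  meta: { chunksGathered: number; citationsDropped: number; latencyMs: number };
}

// Hypothetical labels a UI or agent could act on.
type Caution =
  | "no-evidence"
  | "no-citations"
  | "has-gaps"
  | "has-conflicts"
  | "citations-dropped";

function cautionsFor(res: ReasonResult): Caution[] {
  const cautions: Caution[] = [];
  if (res.meta.chunksGathered === 0) cautions.push("no-evidence");
  if (res.citations.length === 0) cautions.push("no-citations");
  if (res.gaps.length > 0) cautions.push("has-gaps");
  if (res.conflicts.length > 0) cautions.push("has-conflicts");
  if (res.meta.citationsDropped > 0) cautions.push("citations-dropped");
  return cautions;
}
```

An empty result means none of these signals fired. It does not prove the answer is correct, only that the response carries no visible warning.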

What users can depend on

The public contract is simple:

  • reason answers questions over document content only.
  • It is read-only and does not persist the answer.
  • It reuses Context Repo's document retrieval layer for evidence gathering.
  • It can be scoped to a document or collection.
  • It returns an answer, citations, gaps, conflicts, and metadata.
  • It validates returned citation objects against the gathered evidence set.
  • It costs more than search because it adds synthesis and validation work.

That is why cited answers are different from search results. Search finds possible evidence. reason turns gathered evidence into an answer and leaves the source trail attached.