You wrote the perfect prompt in Cursor. Two days later you need it in Claude, and your AI does not know where to look. You drop a 200-page PDF into ChatGPT, ask a single question, and the model spends its context window re-reading the entire document. Both of these are retrieval problems. They are not the same retrieval problem.

Context Repo is a [context repository](/resources/what-is-an-ai-context-repo-for-agents) for AI agents and humans. Inside it, retrieval splits into two surfaces on purpose: one for finding the right artifact, one for finding the right passage inside that artifact. Under the hood both run on Convex's native [vector search](https://docs.convex.dev/search/vector-search), using OpenAI `text-embedding-3-small` for embeddings, but they answer different questions and return different shapes.

```mermaid
flowchart LR
  Q(("Agent query"))

  subgraph Catalog["find_items (catalog)"]
    direction TB
    FI["Vector or literal<br/>match on titles,<br/>descriptions, content<br/>previews"]
  end

  subgraph Content["deep_search (content)"]
    direction TB
    DS["Vector match on<br/>document chunks<br/>(document, section,<br/>paragraph)"]
  end

  subgraph CatalogOut["Item-level results"]
    direction TB
    P["Prompts"]
    D["Documents"]
    C["Collections"]
  end

  subgraph ContentOut["Chunk-level results"]
    direction TB
    CH["chunkId, content,<br/>level, chunkIndex,<br/>parentId, score"]
  end

  Q --> FI
  Q --> DS
  FI --> P
  FI --> D
  FI --> C
  DS --> CH

  classDef client fill:#27272a,stroke:#52525b,stroke-width:1px,color:#fafafa,rx:10,ry:10
  classDef item   fill:#18181b,stroke:#06b6d4,stroke-width:1.5px,color:#fafafa,rx:10,ry:10
  class FI,DS client
  class P,D,C,CH item
  style Q fill:#3b82f6,stroke:#60a5fa,stroke-width:2.5px,color:#ffffff,font-weight:600
  linkStyle default stroke:#52525b,stroke-width:1.5px
```

This article walks through both surfaces, when each one earns its place, and the small set of facts (1536-dim vectors, a 256-result vector-store cap, a document-only chunk index) that shape what they can do.

## What is the difference between find_items and deep_search?

Every retrieval question in a context repository falls into one of two shapes:

1. **"Which artifact is relevant here?"** This is catalog-level. The unit of return is a prompt, a document, or a collection. An agent uses it when it knows the topic it is after but does not know which item in your repository holds the answer.
2. **"What does this artifact say about X?"** This is content-level. The unit of return is a chunk inside one document. The agent uses it once it has identified the right document (or a small candidate set) and needs the specific passage.

`find_items` answers the first question. `deep_search` answers the second. They share an embedding model and a vector store, but they are separate tools with separate inputs because they are separate jobs.

## How does find_items search the catalog?

`find_items` takes a natural-language query and returns matches across prompts, documents, and collections. The MCP tool signature, simplified:

```ts
find_items({
  query: string,
  type?: 'prompts' | 'documents' | 'collections' | 'all',
  semantic?: boolean,   // default: true
})
```

### Returns

Each result is item-level: title, ID, type, similarity score, and a short highlight (the first ~150 characters of matching content with the query terms bolded). The agent uses the list to decide which artifact to read in full. Of the two surfaces, only `find_items` returns prompts in semantic results; `deep_search` operates on document chunks only.

### When to reach for it

Two specific reasons to use this surface:

- **The agent does not know the title yet.** "Find the document about HIPAA compliance" can match a document called "Compliance Notes 2026" because the body mentions HIPAA. Semantic search bridges the title-to-content gap.
- **The agent wants to restrict results to prompts, documents, or collections.** Pass `type: 'prompts'` to retrieve only matching templates. Pass `type: 'documents'` to retrieve only matching reference material. Pass `type: 'all'` (the default) when you want the full catalog.

Set `semantic: false` when you remember the exact phrase. Literal substring match across title, description, and the first ~4 KiB of each document's content preview is faster, has zero embedding cost, and beats vector search when the query is a token the document literally contains ("error code E2049" does not need a vector index).
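To make the literal path concrete, here is a minimal sketch of a substring match over the three fields named above. The `CatalogItem` shape, the `literalMatch` name, the case-insensitive comparison, and the approximation of ~4 KiB as 4096 characters are all assumptions for illustration, not Context Repo's actual implementation.

```ts
// Hypothetical item shape; the field names are assumptions for this sketch.
interface CatalogItem {
  id: string;
  title: string;
  description: string;
  content: string;
}

// Approximates the "first ~4 KiB" content preview as 4096 characters (assumption).
const PREVIEW_CHARS = 4096;

// Returns items whose title, description, or content preview contains the query.
// Case-insensitive matching is an assumption of this sketch.
function literalMatch(items: CatalogItem[], query: string): CatalogItem[] {
  const needle = query.toLowerCase();
  return items.filter((item) => {
    const fields = [item.title, item.description, item.content.slice(0, PREVIEW_CHARS)];
    return fields.some((field) => field.toLowerCase().includes(needle));
  });
}

// Made-up catalog entries, for illustration only.
const catalog: CatalogItem[] = [
  { id: "a", title: "Runbook", description: "On-call notes", content: "Steps for error code E2049." },
  { id: "b", title: "Compliance Notes 2026", description: "Policies", content: "Notes that mention HIPAA." },
];

const literalHits = literalMatch(catalog, "error code E2049");
console.log(literalHits.map((item) => item.id)); // prints [ 'a' ]
```

No embedding is computed anywhere in this path, which is where the zero embedding cost comes from.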

The matching REST surface is `GET /v1/search` with the same `q`, `type`, and `semantic` query parameters; the MCP tool is a thin wrapper over it.
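A rough sketch of how a caller might assemble that request URL. Only the path and the `q`, `type`, and `semantic` parameter names come from this article; the base URL is a placeholder, serializing the boolean as `true`/`false` is an assumption, and authentication is left out entirely.

```ts
type ItemType = "prompts" | "documents" | "collections" | "all";

interface SearchParams {
  q: string;
  type?: ItemType;
  semantic?: boolean;
}

// Builds the URL for GET /v1/search. Optional parameters are omitted when unset
// so the server-side defaults apply.
function buildSearchUrl(baseUrl: string, params: SearchParams): string {
  const parts: string[] = [`q=${encodeURIComponent(params.q)}`];
  if (params.type !== undefined) parts.push(`type=${params.type}`);
  if (params.semantic !== undefined) parts.push(`semantic=${String(params.semantic)}`);
  return `${baseUrl}/v1/search?${parts.join("&")}`;
}

// "https://api.example.com" is a placeholder host, not a real endpoint.
const searchUrl = buildSearchUrl("https://api.example.com", {
  q: "error code E2049",
  type: "documents",
  semantic: false,
});
console.log(searchUrl);
// prints https://api.example.com/v1/search?q=error%20code%20E2049&type=documents&semantic=false
```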

## How does deep_search read inside documents?

`deep_search` is what makes Context Repo a real retrieval system, not a search bar with a vector index bolted on. Signature:

```ts
deep_search({
  query: string,
  documentId?: string,    // search inside one document
  collectionId?: string,  // search inside one collection's documents
  limit?: number,         // server default: 10
  sessionId?: string,     // dedup across paginated reads
})
```

Scope: documents only. Prompts are stored as single embeddings in a separate index and are reachable only through `find_items` or `search_prompts`. That is a deliberate split: a prompt is a unit you read whole; a document is a unit you navigate.

### Returns

Each match is chunk-level. The fields on every result:

- A stable `chunkId`.
- The chunk's `content` (the text).
- A `level`: one of `"document"`, `"section"`, or `"paragraph"`.
- A `chunkIndex` for sibling ordering under the same parent.
- A `parentId` (or null at the root).
- A similarity `score`.

The chunks are pieces of the document hierarchy Context Repo builds at ingest time. A textbook chapter becomes a `section` chunk; a subsection under it becomes a nested `section`; a paragraph under that becomes a `paragraph` chunk. Every chunk below the root points to its parent through `parentId`. (Once an agent has a `chunkId`, calling `deep_read` returns the same chunk with richer navigation metadata: `sectionPath`, `prevSiblingId`, `nextSiblingId`, `headingText`, and `wordCount`.)
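The result fields above can be written down as a type, and `parentId` plus `chunkIndex` are enough to put a flat result list back into reading order. This is a sketch under assumptions: the field names come from the list above, but the exact TypeScript types and the `groupByParent` helper are hypothetical.

```ts
type ChunkLevel = "document" | "section" | "paragraph";

// Field names follow the article; the types are assumptions.
interface ChunkResult {
  chunkId: string;
  content: string;
  level: ChunkLevel;
  chunkIndex: number;
  parentId: string | null;
  score: number;
}

// Groups results by parent and orders siblings by chunkIndex.
function groupByParent(results: ChunkResult[]): Map<string | null, ChunkResult[]> {
  const groups = new Map<string | null, ChunkResult[]>();
  for (const result of results) {
    const siblings = groups.get(result.parentId) ?? [];
    siblings.push(result);
    groups.set(result.parentId, siblings);
  }
  groups.forEach((siblings) => siblings.sort((a, b) => a.chunkIndex - b.chunkIndex));
  return groups;
}

// Made-up results, ordered by score rather than by position in the document.
const sampleResults: ChunkResult[] = [
  { chunkId: "p2", content: "Second paragraph.", level: "paragraph", chunkIndex: 1, parentId: "s1", score: 0.7 },
  { chunkId: "p1", content: "First paragraph.", level: "paragraph", chunkIndex: 0, parentId: "s1", score: 0.6 },
  { chunkId: "s1", content: "Section heading.", level: "section", chunkIndex: 0, parentId: null, score: 0.5 },
];

const grouped = groupByParent(sampleResults);
console.log((grouped.get("s1") ?? []).map((c) => c.chunkId)); // prints [ 'p1', 'p2' ]
```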

### When to reach for it

This is the right shape for retrieval-augmented generation (RAG) inside large documents. Instead of dropping a 50,000-token PDF into the model's context, the agent runs `deep_search`, picks the top three chunks by score, reads them, and decides whether it needs more.
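The "top three chunks by score" step is small enough to sketch in full. The `ScoredChunk` shape and both helper names are assumptions, and the sample data is invented; in practice the input would be a `deep_search` response.

```ts
interface ScoredChunk {
  chunkId: string;
  content: string;
  score: number;
}

// Keeps the k highest-scoring chunks without mutating the input.
function topChunks(results: ScoredChunk[], k = 3): ScoredChunk[] {
  return results.slice().sort((a, b) => b.score - a.score).slice(0, k);
}

// Joins the selected chunks into one context string for the model.
function toContext(chunks: ScoredChunk[]): string {
  return chunks.map((c) => `[${c.chunkId}] ${c.content}`).join("\n\n");
}

// Made-up results standing in for a deep_search response.
const searchResults: ScoredChunk[] = [
  { chunkId: "c1", content: "Introductory paragraph.", score: 0.41 },
  { chunkId: "c2", content: "Paragraph about audit logs.", score: 0.66 },
  { chunkId: "c3", content: "Paragraph about retention.", score: 0.52 },
  { chunkId: "c4", content: "Paragraph about access reviews.", score: 0.58 },
];

const selected = topChunks(searchResults);
console.log(selected.map((c) => c.chunkId)); // prints [ 'c2', 'c4', 'c3' ]
console.log(toContext(selected));
```

The agent reads only the selected text; if the answer is not there, it can search again or navigate outward from one of the returned `chunkId` values.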

`sessionId` is the dedup ledger. Pass one back on the next call and Context Repo will not re-return chunks it already gave you in this session. On the MCP transport an auto-session is created and reused per caller, so iterative exploration just works; on REST you mint a sessionId via `POST /v1/pd/session` and pass it forward yourself.
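One way to picture the ledger is as a per-session set of already-delivered chunk IDs. The class below is an in-memory model of that behaviour for illustration only; the name `DedupLedger` is hypothetical and nothing here describes how the server actually stores sessions.

```ts
// In-memory model of a per-session dedup ledger (illustrative, not the real implementation).
class DedupLedger {
  private seen = new Map<string, Set<string>>();

  // Returns only the chunk IDs not yet delivered in this session, then records them.
  filterNew(sessionId: string, chunkIds: string[]): string[] {
    const delivered = this.seen.get(sessionId) ?? new Set<string>();
    const fresh = chunkIds.filter((id) => !delivered.has(id));
    fresh.forEach((id) => delivered.add(id));
    this.seen.set(sessionId, delivered);
    return fresh;
  }
}

const ledger = new DedupLedger();
const firstCall = ledger.filterNew("session-1", ["c1", "c2"]);
const secondCall = ledger.filterNew("session-1", ["c2", "c3"]);
const otherSession = ledger.filterNew("session-2", ["c2"]);

console.log(firstCall);    // prints [ 'c1', 'c2' ]
console.log(secondCall);   // prints [ 'c3' ]
console.log(otherSession); // prints [ 'c2' ]
```

The second call in the same session drops `c2` because it was already returned, while a different session starts with an empty ledger.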

## How does deep_expand navigate document chunks?

A chunk on its own is often not enough. The matching paragraph might assume context from its parent section ("As mentioned in §3, the policy applies to..."). The agent needs to walk the tree.

`deep_expand` is the navigator:

```ts
deep_expand({
  chunkId: string,
  direction: 'up' | 'down' | 'next' | 'previous' | 'surrounding',
  count?: number,   // for 'surrounding' direction
})
```

The five directions:

- **`up`**: read the parent chunk. Useful when the matching paragraph needs the surrounding section's setup.
- **`down`**: read the child chunks. Useful when the match landed on a section heading and the agent needs the section body.
- **`next`** and **`previous`**: read sibling chunks under the same parent. Useful for continuing a list or a sequential narrative.
- **`surrounding`**: read a window of same-parent siblings around the target. On a sparse hierarchy where the target is the only child under its parent, `surrounding` automatically falls back to the last/first chunks of the parent's neighbouring sections, so the agent still gets meaningful context.

The agent picks the direction based on what it needs. The cost is one tool call per navigation step, but each step returns only the chunks that direction asks for, so context-window cost stays bounded by intent.
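The five directions can be modelled over a flat, parent-pointer chunk list. This is a simplified in-memory sketch, not the server's logic: the `Chunk` shape and `expand` function are hypothetical, `next` and `previous` are assumed to return a single sibling, the `surrounding` window is assumed to include the target, and the sparse-hierarchy fallback is left out.

```ts
type Direction = "up" | "down" | "next" | "previous" | "surrounding";

interface Chunk {
  chunkId: string;
  parentId: string | null;
  chunkIndex: number;
  content: string;
}

// Simplified model of deep_expand over an in-memory chunk list.
function expand(chunks: Chunk[], chunkId: string, direction: Direction, count = 1): Chunk[] {
  const target = chunks.find((c) => c.chunkId === chunkId);
  if (!target) return [];
  const byIndex = (a: Chunk, b: Chunk) => a.chunkIndex - b.chunkIndex;
  const siblings = chunks.filter((c) => c.parentId === target.parentId).sort(byIndex);
  const pos = siblings.findIndex((c) => c.chunkId === chunkId);
  switch (direction) {
    case "up":
      return chunks.filter((c) => c.chunkId === target.parentId);
    case "down":
      return chunks.filter((c) => c.parentId === chunkId).sort(byIndex);
    case "next":
      return siblings.slice(pos + 1, pos + 2);
    case "previous":
      return siblings.slice(Math.max(pos - 1, 0), pos);
    case "surrounding":
      return siblings.slice(Math.max(pos - count, 0), pos + count + 1);
  }
}

// Made-up hierarchy: one document, one section, three paragraphs.
const tree: Chunk[] = [
  { chunkId: "doc", parentId: null, chunkIndex: 0, content: "Document root" },
  { chunkId: "s1", parentId: "doc", chunkIndex: 0, content: "Section 1" },
  { chunkId: "p1", parentId: "s1", chunkIndex: 0, content: "Paragraph 1" },
  { chunkId: "p2", parentId: "s1", chunkIndex: 1, content: "Paragraph 2" },
  { chunkId: "p3", parentId: "s1", chunkIndex: 2, content: "Paragraph 3" },
];

const ids = (result: Chunk[]) => result.map((c) => c.chunkId);
console.log(ids(expand(tree, "p2", "up")));          // prints [ 's1' ]
console.log(ids(expand(tree, "s1", "down")));        // prints [ 'p1', 'p2', 'p3' ]
console.log(ids(expand(tree, "p2", "next")));        // prints [ 'p3' ]
console.log(ids(expand(tree, "p2", "surrounding"))); // prints [ 'p1', 'p2', 'p3' ]
```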

## Why two retrieval surfaces, not one

Concretely, here is the failure mode we avoided by splitting them:

- A single combined endpoint returns "document X scored 0.71, document Y scored 0.69, chunk inside X scored 0.66." The agent has to reason about whether to read X in full or just the matching chunk, and ranking does not really sort that out.
- Or the endpoint returns only chunks, and the agent loses the catalog view it needs to choose between candidate documents.
- Or the endpoint returns only documents, and the agent has to pull the entire candidate document just to confirm the match.

By keeping `find_items` catalog-level and `deep_search` chunk-level, an agent can run a two-step retrieval pattern that mirrors how humans search:

1. Find the right artifact.
2. Find the right passage within it.

It maps to the way the agent already thinks about RAG, and it keeps the model's context window honest.
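The two-step pattern can be sketched as a small orchestration function. The `RetrievalClient` interface is hypothetical: it stands in for whatever wrapper an agent uses over the `find_items` and `deep_search` tools, and it is injected so the sketch runs offline against a stub with canned data. Taking the single best-scoring document is a simplifying assumption; a real agent might keep a small candidate set.

```ts
interface ItemHit {
  id: string;
  type: "prompts" | "documents" | "collections";
  title: string;
  score: number;
}

interface ChunkHit {
  chunkId: string;
  content: string;
  score: number;
}

// Hypothetical client: thin wrappers over the two tools.
interface RetrievalClient {
  findItems(args: { query: string; type?: "prompts" | "documents" | "collections" | "all" }): Promise<ItemHit[]>;
  deepSearch(args: { query: string; documentId?: string; limit?: number }): Promise<ChunkHit[]>;
}

async function twoStepRetrieve(client: RetrievalClient, topic: string, question: string): Promise<ChunkHit[]> {
  // Step 1: find the right artifact.
  const items = await client.findItems({ query: topic, type: "documents" });
  const best = items.slice().sort((a, b) => b.score - a.score)[0];
  if (best === undefined) return [];
  // Step 2: find the right passage within it.
  return client.deepSearch({ query: question, documentId: best.id, limit: 3 });
}

// Stub client with canned data, standing in for real MCP or REST calls.
const stubClient: RetrievalClient = {
  findItems: async (): Promise<ItemHit[]> => [
    { id: "doc-2", type: "documents", title: "Onboarding", score: 0.4 },
    { id: "doc-1", type: "documents", title: "Compliance Notes 2026", score: 0.71 },
  ],
  deepSearch: async ({ documentId }): Promise<ChunkHit[]> => [
    { chunkId: `${documentId}-p7`, content: "Paragraph about audit logs.", score: 0.66 },
  ],
};

twoStepRetrieve(stubClient, "HIPAA policy", "audit logs").then((chunks) => {
  console.log(chunks.map((c) => c.chunkId)); // prints [ 'doc-1-p7' ]
});
```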

## What embedding model and vector store does Context Repo use?

We are pre-launch and have not published benchmark numbers; we will not invent them here. The shape that matters:

- **Vectors are 1536-dimensional**, computed with OpenAI `text-embedding-3-small`. The Convex vector index pins the dimension; mismatched embeddings fail silently, which is why we treat it as a hard contract.
- **Embeddings live in Convex's native vector index**, in the same backend that holds prompts and documents. No second store to keep in sync.
- **Embedding generation happens at ingest time** for stored content. Query embeddings are computed at call time.
- **The vector store has a Convex-imposed cap of 256 results per query.** Context Repo caps user-visible limits below that, so callers cannot trip the underlying bound by accident.
- **The chunk hierarchy is three levels**: `document`, `section`, `paragraph`. Section chunks can nest; paragraphs are the leaves.
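Two of these constraints are easy to enforce on the caller side. The sketch below checks vector dimensions loudly and clamps a requested limit under the store's cap. The constants 1536, 256, and 10 come from this article; the helper names and the `maxVisible` value of 50 are hypothetical, since the actual user-visible limit is not stated here.

```ts
const EMBEDDING_DIM = 1536;   // text-embedding-3-small, per the article
const VECTOR_STORE_CAP = 256; // Convex per-query result cap, per the article
const DEFAULT_LIMIT = 10;     // deep_search server default, per the article

// Throws on a dimension mismatch instead of letting it pass unnoticed.
function assertDimension(vector: number[]): void {
  if (vector.length !== EMBEDDING_DIM) {
    throw new Error(`expected ${EMBEDDING_DIM} dimensions, got ${vector.length}`);
  }
}

// Clamps a caller-supplied limit into [1, cap]. maxVisible = 50 is a hypothetical value.
function clampLimit(requested?: number, maxVisible = 50): number {
  const cap = Math.min(maxVisible, VECTOR_STORE_CAP);
  const value = requested !== undefined && Number.isFinite(requested) ? requested : DEFAULT_LIMIT;
  return Math.min(Math.max(Math.floor(value), 1), cap);
}

assertDimension(new Array<number>(EMBEDDING_DIM).fill(0)); // passes silently

console.log(clampLimit());     // prints 10
console.log(clampLimit(1000)); // prints 50
console.log(clampLimit(0));    // prints 1
```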

## When to use each surface

| Question | Surface |
|---|---|
| Which document covers our HIPAA policy? | `find_items` (catalog) |
| What does the HIPAA policy say about audit logs? | `deep_search` (content) inside the doc found above |
| Show me the paragraph after that one | `deep_expand` with `direction: 'next'` |
| What is the section heading this paragraph lives under? | `deep_expand` with `direction: 'up'` |
| Find prompts about code review | `find_items` with `type: 'prompts'` |
| Search inside one collection only | `deep_search` with `collectionId` |
| Search the literal phrase "error code E2049" | `find_items` with `semantic: false` |

Internalize that table and the rest is mechanical.

## How this connects to the rest of Context Repo

Both surfaces are exposed as [MCP tools](/resources/how-mcp-servers-connect-ai-agents-to-knowledge-bases) and as REST endpoints. The MCP server at [contextrepo.com/mcp](https://contextrepo.com/mcp) advertises both tools (plus `deep_read` and `deep_expand` for chunk navigation). The REST API at `/v1` exposes:

- `GET /v1/search` for catalog retrieval (matches the `find_items` tool).
- `POST /v1/pd/search` for chunk-level retrieval (matches `deep_search`).
- `GET /v1/pd/read/:chunkId` for chunk inspection (matches `deep_read`).
- `POST /v1/pd/expand` for hierarchy navigation (matches `deep_expand`).
- `POST /v1/pd/session` to mint a sessionId for cross-call dedup.
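For client code, the tool-to-endpoint mapping above fits in a small lookup table. The methods and paths are taken from the list; the `ROUTES` table and `resolvePath` helper are illustrative names, and the base URL and authentication are deliberately left out.

```ts
type ToolName = "find_items" | "deep_search" | "deep_read" | "deep_expand";

interface Route {
  method: "GET" | "POST";
  path: string;
}

// Methods and paths as listed in the article.
const ROUTES: Record<ToolName, Route> = {
  find_items: { method: "GET", path: "/v1/search" },
  deep_search: { method: "POST", path: "/v1/pd/search" },
  deep_read: { method: "GET", path: "/v1/pd/read/:chunkId" },
  deep_expand: { method: "POST", path: "/v1/pd/expand" },
};

// Session minting is an endpoint without a matching tool in the list above.
const SESSION_ROUTE: Route = { method: "POST", path: "/v1/pd/session" };

// Fills :param placeholders in a route path and URL-encodes the values.
function resolvePath(tool: ToolName, params: Record<string, string> = {}): string {
  return ROUTES[tool].path.replace(/:(\w+)/g, (_match, name: string) => {
    const value = params[name];
    if (value === undefined) throw new Error(`missing path parameter: ${name}`);
    return encodeURIComponent(value);
  });
}

console.log(ROUTES.deep_search.method, resolvePath("deep_search"));        // prints POST /v1/pd/search
console.log(resolvePath("deep_read", { chunkId: "abc 1" }));               // prints /v1/pd/read/abc%201
console.log(SESSION_ROUTE.method, SESSION_ROUTE.path);                     // prints POST /v1/pd/session
```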

The dashboard search bar at [contextrepo.com/dashboard/search](https://contextrepo.com/dashboard/search) routes user queries through the same underlying endpoints, so a search you ran from Cursor and a search you ran from the browser return the same ranked results (modulo your auth scope).

There is no separate vector database to keep in sync with the row store. One fewer moving part, one fewer reason for retrieval to drift from truth.

## Where to read next

- [What Is an AI Context Repo for Agents?](/resources/what-is-an-ai-context-repo-for-agents): category framing
- [Prompt and Document Management for AI Agents](/resources/prompt-and-document-management-for-ai-agents): what we are searching across
- [How MCP Servers Connect AI Agents to Knowledge Bases](/resources/how-mcp-servers-connect-ai-agents-to-knowledge-bases): the protocol layer
- [Using Context Repo with Claude, Cursor, and ChatGPT](/resources/using-context-repo-with-claude-cursor-and-chatgpt): real workflows that lean on both retrieval surfaces
- [Deep search documentation](/docs/mcp/tools-reference): full MCP tools reference
- [REST API endpoints](/docs/api/endpoints): programmatic access
