What embedding model does Context Repo use?

OpenAI `text-embedding-3-small` at 1536 dimensions. Both queries and stored content use the same model so the vector spaces line up. The dimension is pinned in the Convex vector index; a mismatch fails silently, so we treat it as a hard contract.

Is search keyword-based or semantic?

Both, depending on the surface. `find_items` accepts a `semantic` boolean and defaults to true (vector similarity). Set it to false for a literal substring match across titles, descriptions, and indexed document body text. Body text is matched through hierarchical chunks, so full-body coverage is eventually consistent with chunking. `search_prompts` is literal-only across title, description, and content. `deep_search` is always vector similarity inside document chunks.

Can I scope retrieval to a single collection?

`deep_search` accepts a `collectionId` (and a `documentId`) parameter, so an agent can run content-level retrieval inside one project's documents. `find_items` is workspace-wide today and filters by `type` (prompts, documents, collections) rather than by collection.

Semantic Search and Deep Search: Two Retrieval Layers

Q: What is the difference between find_items and deep_search?

`find_items` is a catalog-level search across your prompts, documents, and collections. It returns item-level matches with titles, IDs, and short highlights. `deep_search` searches inside document content and returns ranked chunks with hierarchy metadata, so an agent can jump to the exact passage that answers a question.

Q: How does deep_expand navigate document chunks?

Five directions from any chunk: `up` (parent), `down` (children), `next` and `previous` (siblings under the same parent), and `surrounding` (a window of same-parent siblings, with an automatic fallback to the neighbour section when the target is the only child).

You wrote the perfect prompt in Cursor. Two days later you need it in Claude, and your AI does not know where to look. You drop a 200-page PDF into ChatGPT, ask a single question, and the model spends its entire context window re-reading the document. Both of these are retrieval problems. They are not the same retrieval problem.

Context Repo is a context repository for AI agents and humans. Inside it, retrieval splits into two surfaces on purpose: one for finding the right artifact, one for finding the right passage inside that artifact. Under the hood both run on the same Convex vector index, using OpenAI text-embedding-3-small for embeddings, but they answer different questions and they return different shapes.


This article walks through both surfaces, when each one earns its place, and the small set of facts (1536-dim vectors, a 256-result vector-store cap, a document-only chunk index) that shape what they can do.

What is the difference between find_items and deep_search?

Every retrieval question in a context repository falls into one of two shapes:

"Which artifact is relevant here?" This is catalog-level. The unit of return is a prompt, a document, or a collection. An agent uses it when it knows the topic it is after but does not know which item in your repository holds the answer.
"What does this artifact say about X?" This is content-level. The unit of return is a chunk inside one document. The agent uses it once it has identified the right document (or a small candidate set) and needs the specific passage.

find_items answers the first question. deep_search answers the second. They share an embedding model and an index, but they are separate tools with separate inputs because they are separate jobs.

How does find_items search the catalog?

find_items takes a natural-language query and returns matches across prompts, documents, and collections. The MCP tool signature, simplified:

find_items({
  query: string,
  type?: 'prompts' | 'documents' | 'collections' | 'all',
  semantic?: boolean,   // default: true
})

Returns

Each result is item-level: title, ID, type, similarity score, and a short highlight (the first ~150 characters of matching content with the query terms bolded). The agent uses the list to decide which artifact to read in full. find_items is the only tool that surfaces prompts in semantic results; deep_search operates on document chunks only.
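To make the highlight shape concrete, here is a minimal sketch of how such a snippet could be produced. It is an illustration, not Context Repo's implementation: the function names are hypothetical, and rendering "bolded" as Markdown-style `**term**` is an assumption, since the article does not specify the markup.

```typescript
// Assumption: bolding is rendered as **term**; the real markup is not specified.
const HIGHLIGHT_LENGTH = 150; // "first ~150 characters" from the description above

// Escapes characters that have special meaning inside a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Truncates the content, then wraps each query term (case-insensitive) in ** **.
function makeHighlight(content: string, query: string): string {
  const snippet = content.slice(0, HIGHLIGHT_LENGTH);
  const terms = query
    .split(/\s+/)
    .filter((term) => term.length > 0)
    .map(escapeRegExp);
  if (terms.length === 0) return snippet;
  const pattern = new RegExp(`(${terms.join("|")})`, "gi");
  return snippet.replace(pattern, "**$1**");
}

const exampleHighlight = makeHighlight(
  "HIPAA audit logs are retained for six years.",
  "hipaa logs",
);
// exampleHighlight === "**HIPAA** audit **logs** are retained for six years."
```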

When to reach for it

Two specific reasons to use this surface:

The agent does not know the title yet. "Find the document about HIPAA compliance" can match a document called "Compliance Notes 2026" because the body mentions HIPAA. Semantic search bridges the title-to-content gap.
The agent wants to restrict results to prompts, documents, or collections. Pass type: 'prompts' to retrieve only matching templates. Pass type: 'documents' to retrieve only matching reference material. Pass 'all' (the default) when you want the full catalog.

Set semantic: false when you remember the exact phrase. Literal mode runs a substring match across title, description, and indexed document body text. Body text is matched through the same hierarchical chunks deep search reads, so body coverage is eventually consistent with chunking. Literal matching is faster, has zero embedding cost, and beats vector search when the query is a token the document literally contains ("error code E2049" does not need a vector index).

The matching REST surface is GET /v1/search with the same q, type, and semantic query parameters; the MCP tool is a thin wrapper over it.
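As a rough sketch of what a REST call looks like, the helper below builds the path and query string for GET /v1/search from the same three inputs. The helper name is hypothetical, and serializing the boolean as `true`/`false` is an assumption; the host and authentication are left out because the article does not specify them.

```typescript
type ItemType = "prompts" | "documents" | "collections" | "all";

interface FindItemsArgs {
  query: string;
  type?: ItemType;
  semantic?: boolean;
}

// Builds "/v1/search?q=...&type=...&semantic=..." and omits unset parameters
// so the server-side defaults apply.
function buildSearchPath(args: FindItemsArgs): string {
  const params: string[] = [`q=${encodeURIComponent(args.query)}`];
  if (args.type !== undefined) params.push(`type=${args.type}`);
  // Assumption: booleans travel as the strings "true" / "false".
  if (args.semantic !== undefined) params.push(`semantic=${String(args.semantic)}`);
  return `/v1/search?${params.join("&")}`;
}

// Literal lookup for an exact token, as in the "error code E2049" example.
const literalLookup = buildSearchPath({ query: "error code E2049", semantic: false });
// literalLookup === "/v1/search?q=error%20code%20E2049&semantic=false"
```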

How does deep_search read inside documents?

deep_search is what makes Context Repo a real retrieval system, not a search bar with a vector index bolted on. Signature:

deep_search({
  query: string,
  documentId?: string,    // search inside one document
  collectionId?: string,  // search inside one collection's documents
  limit?: number,         // server default: 10
  sessionId?: string,     // dedup across paginated reads
})

Scope: documents only. Prompts are stored as single embeddings in a separate index and are reachable only through find_items or search_prompts. That is a deliberate split: a prompt is a unit you read whole; a document is a unit you navigate.

Returns

Each match is chunk-level. The fields on every result:

A stable chunkId.
The chunk's content (the text).
A level: one of "document", "section", or "paragraph".
A chunkIndex for sibling ordering under the same parent.
A parentId (or null at the root).
A similarity score.

The chunks are pieces of the document hierarchy Context Repo builds at ingest time. A textbook chapter becomes a section chunk; the section under that becomes a deeper section; the paragraph under that becomes a paragraph chunk. Every chunk points to its parent through parentId, so the hierarchy can always be walked upward from any match. (Once an agent has a chunkId, calling deep_read returns the same chunk with richer navigation metadata: sectionPath, prevSiblingId, nextSiblingId, headingText, and wordCount.)
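The result fields listed above can be modelled as a small type, and the parent pointers are enough to rebuild a chunk's ancestry on the client. This is a sketch under assumptions: the field name `score` for the similarity value is a guess, and `ancestorChain` is a hypothetical helper, not part of the Context Repo API.

```typescript
type ChunkLevel = "document" | "section" | "paragraph";

interface Chunk {
  chunkId: string;
  content: string;
  level: ChunkLevel;
  chunkIndex: number;      // ordering among siblings under the same parent
  parentId: string | null; // null at the root
  score?: number;          // assumed field name for the similarity score
}

// Follows parentId links from a chunk to the root; returns the chain root-first.
// The `seen` set guards against malformed data that contains a cycle.
function ancestorChain(chunks: Map<string, Chunk>, chunkId: string): Chunk[] {
  const chain: Chunk[] = [];
  const seen = new Set<string>();
  let current: Chunk | undefined = chunks.get(chunkId);
  while (current !== undefined && !seen.has(current.chunkId)) {
    seen.add(current.chunkId);
    chain.unshift(current);
    current = current.parentId === null ? undefined : chunks.get(current.parentId);
  }
  return chain;
}

const sampleChunks: Chunk[] = [
  { chunkId: "d1", content: "Policy manual", level: "document", chunkIndex: 0, parentId: null },
  { chunkId: "s1", content: "Audit logs", level: "section", chunkIndex: 0, parentId: "d1" },
  { chunkId: "p1", content: "Logs are kept...", level: "paragraph", chunkIndex: 0, parentId: "s1" },
];
const chunkMap = new Map<string, Chunk>(
  sampleChunks.map((c): [string, Chunk] => [c.chunkId, c]),
);
const pathToParagraph = ancestorChain(chunkMap, "p1").map((c) => c.chunkId);
// pathToParagraph is ["d1", "s1", "p1"]
```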

When to reach for it

This is the right shape for retrieval-augmented generation (RAG) inside large documents. Instead of dropping a 50,000-token PDF into the model's context, the agent runs deep_search, picks the top three chunks by score, reads them, and decides whether it needs more.
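The "pick the top three chunks" step is small enough to sketch. The type and function names here are hypothetical, and `score` is an assumed field name for the similarity value; results may already arrive ranked, in which case the sort is only a safeguard.

```typescript
interface ScoredChunk {
  chunkId: string;
  content: string;
  score: number; // assumed field name for the similarity score
}

// Returns the k highest-scoring chunks without mutating the input array.
function topChunks(results: ScoredChunk[], k: number = 3): ScoredChunk[] {
  return results
    .slice()
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

// Joins the selected chunks into a compact context block for the model.
function toContext(chunks: ScoredChunk[]): string {
  return chunks.map((c) => `[${c.chunkId}] ${c.content}`).join("\n\n");
}
```

Only the selected passages reach the model, so the context cost scales with k rather than with the size of the document.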

sessionId is the dedup ledger. Pass one back on the next call and Context Repo will not re-return chunks it already gave you in this session. On the MCP transport an auto-session is created and reused per caller, so iterative exploration just works; on REST you mint a sessionId via POST /v1/pd/session and pass it forward yourself.
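The dedup behaviour can be pictured as a set of chunk IDs kept per session. The class below is a client-side model for illustration only; the name `SessionLedger` is hypothetical, and how the server actually stores session state is not described in this article.

```typescript
// Illustrative model of per-session dedup; not the server's implementation.
class SessionLedger {
  private readonly seen = new Set<string>();

  // Returns only chunks not yet handed out in this session, and records them.
  filterNew<T extends { chunkId: string }>(results: T[]): T[] {
    const fresh: T[] = [];
    for (const result of results) {
      if (!this.seen.has(result.chunkId)) {
        this.seen.add(result.chunkId);
        fresh.push(result);
      }
    }
    return fresh;
  }
}

const ledger = new SessionLedger();
const firstPage = ledger.filterNew([{ chunkId: "a" }, { chunkId: "b" }]);
const secondPage = ledger.filterNew([{ chunkId: "b" }, { chunkId: "c" }]);
// firstPage holds a and b; secondPage holds only c, because b was already returned.
```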

How does deep_expand navigate document chunks?

A chunk on its own is often not enough. The matching paragraph might assume context from its parent section ("As mentioned in §3, the policy applies to..."). The agent needs to walk the tree.

deep_expand is the navigator:

deep_expand({
  chunkId: string,
  direction: 'up' | 'down' | 'next' | 'previous' | 'surrounding',
  count?: number,   // for 'surrounding' direction
})

The five directions:

up: read the parent chunk. Useful when the matching paragraph needs the surrounding section's setup.
down: read the child chunks. Useful when the match landed on a section heading and the agent needs the section body.
next and previous: read sibling chunks under the same parent. Useful for continuing a list or a sequential narrative.
surrounding: read a window of same-parent siblings around the target. On a sparse hierarchy where the target is the only child under its parent, surrounding automatically falls back to the last/first chunks of the parent's neighbouring sections, so the agent still gets meaningful context.
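The directions are easier to reason about against a small in-memory model. The sketch below is a simplification, not the server's logic: it omits the sparse-hierarchy fallback for surrounding, it assumes count means "siblings on each side" and that the target is included in the window, and the names `TreeChunk` and `expand` are hypothetical.

```typescript
type Direction = "up" | "down" | "next" | "previous" | "surrounding";

interface TreeChunk {
  chunkId: string;
  parentId: string | null;
  chunkIndex: number;
}

const byIndex = (a: TreeChunk, b: TreeChunk): number => a.chunkIndex - b.chunkIndex;

// Simplified navigation over parent pointers (no neighbouring-section fallback).
function expand(
  all: TreeChunk[],
  chunkId: string,
  direction: Direction,
  count: number = 1,
): TreeChunk[] {
  const target = all.find((c) => c.chunkId === chunkId);
  if (target === undefined) return [];
  if (direction === "up") {
    return all.filter((c) => c.chunkId === target.parentId);
  }
  if (direction === "down") {
    return all.filter((c) => c.parentId === chunkId).sort(byIndex);
  }
  // Remaining directions operate on siblings under the same parent.
  const siblings = all.filter((c) => c.parentId === target.parentId).sort(byIndex);
  const position = siblings.findIndex((c) => c.chunkId === chunkId);
  if (direction === "next") return siblings.slice(position + 1, position + 2);
  if (direction === "previous") return siblings.slice(Math.max(position - 1, 0), position);
  // "surrounding": assumed to mean up to `count` siblings on each side, target included.
  return siblings.slice(Math.max(position - count, 0), position + count + 1);
}

const tree: TreeChunk[] = [
  { chunkId: "doc", parentId: null, chunkIndex: 0 },
  { chunkId: "sec", parentId: "doc", chunkIndex: 0 },
  { chunkId: "p1", parentId: "sec", chunkIndex: 0 },
  { chunkId: "p2", parentId: "sec", chunkIndex: 1 },
  { chunkId: "p3", parentId: "sec", chunkIndex: 2 },
];
const parentOfP2 = expand(tree, "p2", "up");              // [sec]
const windowAroundP2 = expand(tree, "p2", "surrounding"); // [p1, p2, p3]
```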

The agent picks the direction based on what it needs. The cost is one tool call per navigation step, but each step returns only the requested chunks. Context-window cost stays bounded by intent.

Why two retrieval surfaces, not one

Concretely, here is the failure mode we avoided by splitting them:

A single combined endpoint returns "document X scored 0.71, document Y scored 0.69, chunk inside X scored 0.66." The agent has to reason about whether to read X in full or just the matching chunk, and ranking does not really sort that out.
Or the endpoint returns only chunks, and the agent loses the catalog view it needs to choose between candidate documents.
Or the endpoint returns only documents, and the agent has to pull the entire candidate document just to confirm the match.

By keeping find_items catalog-level and deep_search chunk-level, an agent can run a two-step retrieval pattern that mirrors how humans search:

Find the right artifact.
Find the right passage within it.

It maps to the way the agent already thinks about RAG, and it keeps the model's context window honest.
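The hand-off between the two steps can be sketched as a pure function: take the catalog matches from step one and turn the best documents into deep_search arguments for step two. Everything named here is illustrative. The shape of a catalog match (including the plural `type` values and the `score` field) is an assumption, and `planDeepSearches` is a hypothetical helper rather than anything Context Repo ships.

```typescript
// Assumed shape of a find_items result; field names are illustrative.
interface CatalogMatch {
  id: string;
  type: "prompts" | "documents" | "collections";
  title: string;
  score: number;
}

// Mirrors the deep_search inputs shown earlier in the article.
interface DeepSearchArgs {
  query: string;
  documentId?: string;
  collectionId?: string;
  limit?: number;
}

// Step 1 output in, step 2 input out: one scoped deep_search per top document.
function planDeepSearches(
  matches: CatalogMatch[],
  question: string,
  maxDocs: number = 2,
  chunksPerDoc: number = 3,
): DeepSearchArgs[] {
  return matches
    .filter((m) => m.type === "documents") // prompts are read whole, not chunk-searched
    .sort((a, b) => b.score - a.score)
    .slice(0, maxDocs)
    .map((m) => ({ query: question, documentId: m.id, limit: chunksPerDoc }));
}
```

Keeping the plan separate from the network calls makes the agent's choice (how many documents, how many chunks each) explicit and easy to tune.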

What embedding model and vector store does Context Repo use?

We are pre-launch and have not published benchmark numbers; we will not invent them here. The shape that matters:

Vectors are 1536-dimensional, computed with OpenAI text-embedding-3-small. The Convex vector index pins the dimension; mismatched embeddings fail silently, which is why we treat it as a hard contract.
Embeddings live in Convex's native vector index, in the same backend that holds prompts and documents. No second store to keep in sync.
Embedding generation happens at ingest time for stored content. Query embeddings are computed at call time.
The vector store has a Convex-imposed cap of 256 results per query. Context Repo caps user-visible limits below that, so callers cannot trip the underlying bound by accident.
The chunk hierarchy is three levels: document, section, paragraph. Section chunks can nest; paragraphs are the leaves.
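Two of these constraints translate directly into guards. The sketch below shows a loud dimension check (the opposite of the silent failure described above) and a limit clamp. The value of `USER_LIMIT_MAX` is hypothetical: the article only says the user-visible cap sits below 256. The function names are also illustrative.

```typescript
const EMBEDDING_DIM = 1536;   // text-embedding-3-small, as stated above
const VECTOR_STORE_MAX = 256; // Convex-imposed cap per query
const USER_LIMIT_MAX = 50;    // hypothetical value; the article only says "below 256"
const DEFAULT_LIMIT = 10;     // deep_search server default

// Fails loudly on a dimension mismatch instead of letting it pass silently.
function assertEmbeddingDim(vector: number[]): void {
  if (vector.length !== EMBEDDING_DIM) {
    throw new Error(`expected ${EMBEDDING_DIM} dimensions, got ${vector.length}`);
  }
}

// Clamps a caller-supplied limit into [1, ceiling], staying under the store cap.
function clampLimit(requested?: number): number {
  if (requested === undefined || !Number.isFinite(requested)) return DEFAULT_LIMIT;
  const ceiling = Math.min(USER_LIMIT_MAX, VECTOR_STORE_MAX - 1);
  return Math.min(Math.max(Math.floor(requested), 1), ceiling);
}
```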

When to use each surface

Question	Surface
Which document covers our HIPAA policy?	`find_items` (catalog)
What does the HIPAA policy say about audit logs?	`deep_search` (content) inside the doc found above
Show me the paragraph after that one	`deep_expand` with `direction: 'next'`
What is the section heading this paragraph lives under?	`deep_expand` with `direction: 'up'`
Find prompts about code review	`find_items` with `type: 'prompts'`
Search inside one collection only	`deep_search` with `collectionId`
Search the literal phrase "error code E2049"	`find_items` with `semantic: false`

Internalize that table and the rest is mechanical.

How this connects to the rest of Context Repo

Both surfaces are exposed as MCP tools and as REST endpoints. The MCP server at contextrepo.com/mcp advertises both tools (plus deep_read and deep_expand for chunk navigation). The REST API at /v1 exposes:

GET /v1/search for catalog retrieval (matches the find_items tool).
POST /v1/pd/search for chunk-level retrieval (matches deep_search).
GET /v1/pd/read/:chunkId for chunk inspection (matches deep_read).
POST /v1/pd/expand for hierarchy navigation (matches deep_expand).
POST /v1/pd/session to mint a sessionId for cross-call dedup.
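For orientation, the chunk-level endpoints above can be written as a small table of request descriptors. This is a sketch under a loud assumption: the POST body field names are taken to mirror the MCP tool parameters, which the article does not confirm. The `rest` object and `RestRequest` type are hypothetical, and host and authentication are omitted.

```typescript
interface RestRequest {
  method: "GET" | "POST";
  path: string;
  body?: Record<string, unknown>;
}

type ExpandDirection = "up" | "down" | "next" | "previous" | "surrounding";

// Assumption: JSON bodies use the same field names as the MCP tool parameters.
const rest = {
  deepSearch: (args: {
    query: string;
    documentId?: string;
    collectionId?: string;
    limit?: number;
    sessionId?: string;
  }): RestRequest => ({ method: "POST", path: "/v1/pd/search", body: { ...args } }),

  deepRead: (chunkId: string): RestRequest => ({
    method: "GET",
    path: `/v1/pd/read/${encodeURIComponent(chunkId)}`,
  }),

  deepExpand: (args: {
    chunkId: string;
    direction: ExpandDirection;
    count?: number;
  }): RestRequest => ({ method: "POST", path: "/v1/pd/expand", body: { ...args } }),

  createSession: (): RestRequest => ({ method: "POST", path: "/v1/pd/session" }),
};

const expandRequest = rest.deepExpand({ chunkId: "abc", direction: "next" });
// expandRequest.path === "/v1/pd/expand"
```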

The dashboard search bar at contextrepo.com/dashboard/search routes user queries through the same underlying endpoints, so a search you ran from Cursor and a search you ran from the browser return the same ranked results (modulo your auth scope).

There is no separate vector database to keep in sync with the row store. One fewer moving part, one fewer reason for retrieval to drift from truth.

Where to read next

What Is an AI Context Repo for Agents?: category framing
Prompt and Document Management for AI Agents: what we are searching across
How MCP Servers Connect AI Agents to Knowledge Bases: the protocol layer
Using Context Repo with Claude, Cursor, and ChatGPT: real workflows that lean on both retrieval surfaces
Deep search documentation: full MCP tools reference
REST API endpoints: programmatic access