ToolUp.RAG
Retrieval-augmented generation runtime for apps built on ToolUp Platform. It provides a vector store, a retrieval pipeline, background ingestion and reembedding services, and a prompt builder for injecting retrieved chunks into AI system prompts.
Wraps ToolUp.AI — RAGServerApp is a flat superset of AIServerApp, which is itself a flat superset of ServerApp. Pick the tier that matches your needs.
When to use this companion
- AI assistant should ground its answers in user-uploaded documents, notes, narrative-committed content, or other text data.
- "What does our team's data say about X?" — answering questions with citations from the team's corpus.
- Semantic search over user-tenant data without operator access to the content.
- Continuous re-indexing as new documents arrive.
When NOT to use this companion
- Exact-match keyword search: use SQL LIKE, full-text search (Postgres tsvector, etc.), or a dedicated search engine. RAG is for semantic relevance, not exact match.
- Cross-tenant search: the SDK enforces per-scope isolation. Building a global search across all tenants requires an IRetrievalPipeline impl that intentionally widens scope access, defeating the security property.
- High-throughput log analytics: vector stores aren't optimised for billion-row time-series. Use a proper log store (Loki, ClickHouse, Splunk).
- Single-shot embedding without retrieval: depend on IEmbeddingProvider from ToolUp.Platform.Server directly.
What's in the box
Two packages:
| Package | What it is |
|---|---|
| ToolUp.RAG.Core | Shared types: IngestionTypes (IngestionQueue, IIngestionStatusObserver). The minimum surface for a downstream companion (ToolUp.KnowledgeBase) to plug into the pipeline. |
| ToolUp.RAG.Server | Chunking, default in-memory vector store + BM25 index, retrieval pipeline, ingestion + reembedding services, RAG prompt builder, RAGCompose. |
Plus embedding-provider sub-companions:
- ToolUp.EmbeddingProviders.Local: in-process TF-IDF; dev / CI / offline.
- ToolUp.EmbeddingProviders.OpenAI: text-embedding-3-small against the OpenAI API.
And vector-store sub-companions:
- ToolUp.VectorStores.Hnsw: HNSW approximate-nearest-neighbour; lifts the ~50K-chunk ceiling of InMemoryVectorStore.
Quick start
Add the packages:
<PackageReference Include="ToolUp.RAG.Server" />
<PackageReference Include="ToolUp.EmbeddingProviders.OpenAI" />
<PackageReference Include="ToolUp.AIProviders.Claude" />
Wire the server composition root:
open ToolUp.RAG
open ToolUp.EmbeddingProviders.OpenAI

// secretStore, aiProviderFactory, aiConfigStore, serverConfig, authProvider and
// modules are assumed to be defined elsewhere in your composition root.
let embedder = OpenAIEmbeddingProvider.create secretStore :> IEmbeddingProvider

RAGServerApp.create (aiProviderFactory, aiConfigStore, embedder)
|> RAGServerApp.withConfig serverConfig
|> RAGServerApp.withAuth authProvider
|> RAGServerApp.addModules modules
|> RAGServerApp.withAITools AITools.allTools
|> RAGServerApp.run
The agent loop now augments its system prompt with retrieved chunks from any registered scope. Documents uploaded via KnowledgeBase (or any VectorisationHandler-registered DataType) get auto-indexed on the post-save hook.
See getting-started.md for the full walkthrough.
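To make the prompt augmentation described above more concrete, here is a minimal, self-contained sketch of how retrieved chunks could be appended to a system prompt. RetrievedChunk and buildSystemPrompt are hypothetical names invented for this illustration; they are not the RAGPromptBuilder API, and the real prompt layout may differ.

```fsharp
// Hypothetical type for illustration only; not a ToolUp.RAG type.
type RetrievedChunk = { Source: string; Text: string }

/// Append retrieved chunks to a base system prompt. Each chunk gets a
/// numbered label plus its source so the model can cite it. With no
/// chunks, the base prompt is returned unchanged.
let buildSystemPrompt (basePrompt: string) (chunks: RetrievedChunk list) : string =
    match chunks with
    | [] -> basePrompt
    | _ ->
        let context =
            chunks
            |> List.mapi (fun i c -> sprintf "[%d] (%s)\n%s" (i + 1) c.Source c.Text)
            |> String.concat "\n\n"
        sprintf "%s\n\nUse the following retrieved context when relevant:\n\n%s" basePrompt context

// Example usage with made-up data.
let prompt =
    buildSystemPrompt
        "You are a helpful assistant."
        [ { Source = "notes.md"; Text = "Release is planned for Q3." } ]
```

Returning the base prompt untouched for an empty chunk list means a retrieval that finds nothing needs no special handling by the caller.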
Concepts
See concepts.md for the vector store, retrieval pipeline, chunking, ingestion + reembedding services, scope isolation, and prompt-builder composition.
API reference
See api-reference.md for RAGServerApp, IRetrievalPipeline, IVectorStore, IEmbeddingProvider, VectorisationHandler, RAGPromptBuilder, and the per-deployment tuning knobs (withTopK, withMinScore, withMergeStrategy, etc.).
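As a rough illustration of what knobs such as withTopK and withMinScore plausibly control, the sketch below scores chunks by cosine similarity with a brute-force scan, drops those under a minimum score, and keeps the K best. ScoredChunk, cosine, and search are hypothetical names for this example and are not the ToolUp API; the actual semantics and defaults are documented in api-reference.md.

```fsharp
// Hypothetical type for illustration only; not a ToolUp.RAG type.
type ScoredChunk = { Text: string; Score: float }

/// Cosine similarity between two equal-length vectors.
/// Returns 0.0 when either vector is all zeros.
let cosine (a: float[]) (b: float[]) : float =
    let dot = Array.fold2 (fun acc x y -> acc + x * y) 0.0 a b
    let norm (v: float[]) = sqrt (Array.sumBy (fun x -> x * x) v)
    let denom = norm a * norm b
    if denom = 0.0 then 0.0 else dot / denom

/// Linear scan: score every chunk against the query, drop chunks scoring
/// below minScore, then keep at most topK results, best first.
let search (topK: int) (minScore: float) (query: float[]) (chunks: (string * float[]) list) =
    chunks
    |> List.map (fun (text, vec) -> { Text = text; Score = cosine query vec })
    |> List.filter (fun c -> c.Score >= minScore)
    |> List.sortByDescending (fun c -> c.Score)
    |> List.truncate topK

// Example usage with made-up two-dimensional embeddings.
let corpus =
    [ "a", [| 1.0; 0.0 |]
      "b", [| 0.0; 1.0 |]
      "c", [| 1.0; 1.0 |] ]

let hits = search 2 0.5 [| 1.0; 0.0 |] corpus |> List.map (fun c -> c.Text)
// hits = ["a"; "c"]; "b" scores 0.0 and is filtered out by minScore
```

Because every query touches every stored vector, the cost of a scan like this grows linearly with corpus size, which is the usual motivation for an approximate index such as HNSW.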
Extending
See extending.md for writing a new embedding provider, vector store, or retrieval tracer.
Scope isolation
The retrieval pipeline filters requested scopes against the caller's AccessContext.TeamId before any call into IVectorStore. A mismatched Team teamId returns an empty result rather than an error — prompt builders compose without fault handling. Platform and Deployment scopes are universally readable for authenticated callers (when PlatformKnowledgeBase is enabled).
There's no API path to retrieve across team boundaries. Custom IRetrievalPipeline impls that bypass scope filtering should not exist — the safe default is structural.
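The filtering rule described in this section can be sketched as follows. This is a simplified model under stated assumptions: Scope, AccessContext, readableScopes, and retrieve are hypothetical stand-ins written for this example, the PlatformKnowledgeBase toggle is left out, and the real ToolUp types will differ.

```fsharp
// Hypothetical types for illustration only; the real ToolUp definitions may differ.
type Scope =
    | Platform
    | Deployment
    | Team of teamId: string

type AccessContext = { TeamId: string; IsAuthenticated: bool }

/// Keep only the scopes the caller may read. A Team scope survives only
/// when its id matches the caller's TeamId; Platform and Deployment are
/// readable by any authenticated caller.
let readableScopes (ctx: AccessContext) (requested: Scope list) : Scope list =
    requested
    |> List.filter (fun scope ->
        match scope with
        | Team teamId -> teamId = ctx.TeamId
        | Platform | Deployment -> ctx.IsAuthenticated)

/// 'search' stands in for the vector-store query. It only ever receives
/// scopes that passed the filter, and is not called at all when none did.
let retrieve (search: Scope list -> string list) (ctx: AccessContext) (requested: Scope list) : string list =
    match readableScopes ctx requested with
    | [] -> []                   // mismatched scope: empty result, no exception
    | allowed -> search allowed

// Example usage with a fake store that echoes the scopes it was asked about.
let ctx = { TeamId = "team-a"; IsAuthenticated = true }
let fakeSearch (scopes: Scope list) = scopes |> List.map (sprintf "chunk from %A")

let ownTeam = retrieve fakeSearch ctx [ Team "team-a" ]    // one chunk
let otherTeam = retrieve fakeSearch ctx [ Team "team-b" ]  // []
```

Filtering before the store is called, rather than inside it, means a store implementation cannot leak another team's chunks by forgetting a check.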