Every company wants "ChatGPT for our docs". Here's the actual build, with the choices we'd make today.
The 5-step pipeline
- Ingest docs (PDF/HTML/Markdown/Notion) into raw text
- Chunk into 500-token pieces with 50-token overlap
- Embed with text-embedding-3-small ($0.02 / 1M tokens)
- Store in pgvector (Supabase, free tier handles 10K docs)
- Retrieve top-k + send to GPT-4o-mini with a "cite your sources" prompt
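The chunking step above can be sketched as a sliding window over tokens. This is a minimal sketch, assuming the text has already been tokenized into a list (a real pipeline would use the embedding model's tokenizer); `chunk_tokens` is a hypothetical helper, not a library function.

```python
from typing import List, Sequence


def chunk_tokens(tokens: Sequence[str], size: int = 500, overlap: int = 50) -> List[List[str]]:
    """Split tokens into windows of `size`, each sharing `overlap` tokens with the previous window."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # this window already reached the end of the text
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding roughly 10% more tokens.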
Tools to use
- LangChain for the orchestration (or LlamaIndex if you prefer)
- Unstructured.io for PDF/DOCX parsing
- text-embedding-3-small (cheap, 1536 dims)
- pgvector + HNSW index
- GPT-4o-mini for the answer ($0.15 / 1M input tokens)
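For the pgvector + HNSW item, the schema can be sketched as a few SQL statements generated from Python. This is an illustration under assumptions: the table name `doc_chunks` and the helper `pgvector_hnsw_ddl` are hypothetical, and LangChain's PGVector store manages its own tables, so you would only need this when handling the schema yourself. The `m` and `ef_construction` values shown are pgvector's defaults.

```python
from typing import List


def pgvector_hnsw_ddl(table: str = "doc_chunks", dims: int = 1536,
                      m: int = 16, ef_construction: int = 64) -> List[str]:
    """Return SQL statements that create a pgvector table with an HNSW cosine index."""
    if not table.isidentifier():
        raise ValueError("table must be a plain identifier")
    return [
        "CREATE EXTENSION IF NOT EXISTS vector;",
        f"CREATE TABLE IF NOT EXISTS {table} ("
        f"id bigserial PRIMARY KEY, content text, embedding vector({dims}));",
        f"CREATE INDEX IF NOT EXISTS {table}_embedding_hnsw ON {table} "
        f"USING hnsw (embedding vector_cosine_ops) "
        f"WITH (m = {m}, ef_construction = {ef_construction});",
    ]


for statement in pgvector_hnsw_ddl():
    print(statement)
```

The `dims` value has to match the embedding model (1536 for text-embedding-3-small), and `vector_cosine_ops` pairs with the `<=>` cosine-distance operator at query time.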
Cost at 1K docs / 10K queries/month
- One-time ingestion: ~$2
- Storage: $0 (Supabase free tier)
- Per query: ~$0.0015 → $15/month at 10K queries
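The arithmetic behind these numbers can be sketched as below. The embedding and input prices come from the lists above; the output price, tokens per document, and tokens per question and answer are assumptions chosen for illustration, so substitute your own measurements.

```python
EMBED_PRICE_PER_M = 0.02   # text-embedding-3-small, USD per 1M tokens
INPUT_PRICE_PER_M = 0.15   # GPT-4o-mini input, USD per 1M tokens
OUTPUT_PRICE_PER_M = 0.60  # assumption: GPT-4o-mini output price, check current pricing


def ingestion_cost(num_docs: int, tokens_per_doc: int) -> float:
    """One-time cost in USD of embedding every document once."""
    return num_docs * tokens_per_doc * EMBED_PRICE_PER_M / 1_000_000


def query_cost(k: int = 5, chunk_tokens: int = 500,
               question_tokens: int = 100, answer_tokens: int = 300) -> float:
    """Cost in USD of one answer: k retrieved chunks plus the question in, one answer out."""
    input_tokens = k * chunk_tokens + question_tokens
    return (input_tokens * INPUT_PRICE_PER_M + answer_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


print(f"ingestion: ${ingestion_cost(1_000, 5_000):.2f}")
print(f"monthly queries: ${query_cost() * 10_000:.2f}")
```

With these hypothetical inputs both results come in under the figures above, which leaves headroom for longer documents, prompts, or answers.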
That's it: about $15/month in API costs for a production RAG system. Most "enterprise RAG vendors" charge $500-5,000/mo for this.
The 80% you can build in a weekend
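```python
from langchain_community.document_loaders import DirectoryLoader
from langchain_community.vectorstores import PGVector
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# PG_URL (Postgres connection string) and question (str) are assumed to be defined.

# Ingest (DirectoryLoader parses files with the `unstructured` package by default)
docs = DirectoryLoader("company_docs/").load()
# from_tiktoken_encoder measures chunk_size in tokens rather than characters
splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    encoding_name="cl100k_base", chunk_size=500, chunk_overlap=50
)
chunks = splitter.split_documents(docs)

# Embed + store
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vector_store = PGVector.from_documents(chunks, embeddings, connection_string=PG_URL)

# Query
results = vector_store.similarity_search(question, k=5)
context = "\n\n".join(d.page_content for d in results)
prompt = f"Answer using only this:\n{context}\n\nQuestion: {question}"
answer = ChatOpenAI(model="gpt-4o-mini").invoke(prompt).content
```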
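The query step above joins the chunks without saying where each one came from, so the model has nothing to cite. One way to build the "cite your sources" prompt is to number each chunk and label it with its source. This is a minimal sketch; `build_cited_prompt` and the prompt wording are hypothetical, not part of LangChain.

```python
from typing import List, Tuple


def build_cited_prompt(question: str, chunks: List[Tuple[str, str]]) -> str:
    """Build a prompt from (source, text) pairs, numbering each so the model can cite [n]."""
    numbered = [
        f"[{i}] ({source})\n{text}"
        for i, (source, text) in enumerate(chunks, start=1)
    ]
    context = "\n\n".join(numbered)
    return (
        "Answer using only the numbered sources below. "
        "Cite the sources you use as [n] after each claim. "
        "If the sources do not contain the answer, say so.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

With LangChain results, the pairs could be built as `(d.metadata.get("source", "unknown"), d.page_content)`, assuming the loader recorded a `source` entry in each document's metadata.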
The 20% that takes you to production
- Reranking (Cohere Rerank, $0.001/query) — 10-15% accuracy boost
- Hybrid search (BM25 + vector) — best for technical docs
- Eval harness (RAGAS metrics) — catch regressions
- Source citations in the answer
- Streaming responses (better UX)
- Caching (Redis for repeated questions)
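Hybrid search needs some way to merge the BM25 ranking with the vector ranking. Reciprocal rank fusion (RRF) is one common choice because it only looks at ranks, so the two scoring scales never have to be normalized. The sketch below assumes each retriever returns an ordered list of chunk IDs; `reciprocal_rank_fusion` is a hypothetical helper, and `k=60` is the conventional default rather than a tuned value.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge ranked ID lists: each list contributes 1 / (k + rank) to an ID's score."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_hits = ["a", "b", "c"]    # keyword ranking (illustrative IDs)
vector_hits = ["b", "d", "a"]  # embedding ranking (illustrative IDs)
print(reciprocal_rank_fusion([bm25_hits, vector_hits]))  # ['b', 'a', 'd', 'c']
```

Chunks that both retrievers rank highly rise to the top, while a chunk found by only one retriever can still make the final list.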