About
This site is an educational demo showing how to chat with documents using Retrieval-Augmented Generation (RAG).

1. Components
| Layer | Technology | Role |
|---|---|---|
| Frontend | Next.js + Tailwind CSS | React app on Vercel; renders UI and calls API endpoints |
| Backend | FastAPI (Docker) | Exposes /ingest and /chat endpoints |
| Vector DB | PostgreSQL + PGVector (via Supabase) | Stores document embeddings for similarity search |
| Embeddings | Google Gemini text-embedding-004 | Converts text chunks into high-dimensional vectors |
| LLM | Google Gemini Flash-Lite | Generates answers given the user query + retrieved context |
2. Data Flow
- Ingestion (/ingest):
  - FastAPI receives a PDF upload
  - Splits it into text chunks → requests embeddings from Gemini
  - Stores the resulting vectors in Supabase (Postgres + PGVector)
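The chunking step above can be sketched as a simple sliding window. This is an illustrative sketch only: the chunk size, overlap, and function name are assumptions, not the demo's actual settings, and the embedding/storage calls are left as comments.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks so each fits the embedding model's input.

    Overlap preserves context that would otherwise be cut at chunk boundaries.
    """
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# In the real pipeline, each chunk would then be sent to the Gemini
# text-embedding-004 endpoint and the resulting vector inserted into
# a PGVector column in Supabase.
```

For example, a 1,000-character document with these settings yields three overlapping chunks; the last 50 characters of one chunk repeat as the first 50 of the next.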
- Chat (/chat):
  - FastAPI receives the user query
  - Runs a similarity search in PGVector → retrieves the top-k chunks
  - Formats those chunks into prompt context
  - Sends query + context to Gemini Flash-Lite → gets an answer
  - Returns a JSON response to the Next.js UI
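The retrieval and prompt-assembly steps might look like the sketch below. The table and column names (`documents`, `content`, `embedding`) are hypothetical placeholders, not the demo's actual schema; the `<=>` operator is pgvector's cosine-distance ordering.

```python
# Hypothetical schema; pgvector's <=> operator orders rows by cosine distance,
# so the closest (most similar) chunks come first.
TOP_K_SQL = """
SELECT content
FROM documents
ORDER BY embedding <=> %(query_embedding)s
LIMIT %(k)s;
"""

def build_prompt(query: str, chunks: list[str]) -> str:
    """Assemble the retrieved chunks and the user query into one LLM prompt."""
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
```

The numbered `[1]`, `[2]` markers are one common convention that lets the model (and the UI) attribute parts of the answer to specific retrieved chunks.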
3. Deployment
- Frontend on Vercel (auto-deploy from GitHub)
- Backend on Render Web Service (Docker)
- Database managed by Supabase (Postgres + PGVector)
- Models via Google Cloud’s Gemini APIs (free tier for demos)
4. Links
Educational prototype only.