About

This site is an educational demo showing how to chat with documents using Retrieval-Augmented Generation (RAG).

System Architecture Diagram

1. Components

Layer      | Technology                           | Role
---------- | ------------------------------------ | ----
Frontend   | Next.js + Tailwind CSS               | React app on Vercel; renders the UI and calls the API endpoints
Backend    | FastAPI (Docker)                     | Exposes the /ingest and /chat endpoints
Vector DB  | PostgreSQL + pgvector (via Supabase) | Stores document embeddings for similarity search
Embeddings | Google Gemini text-embedding-004     | Converts text chunks into high-dimensional vectors
LLM        | Google Gemini Flash-Lite             | Generates answers from the user query plus retrieved context
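
The Embeddings row assumes the PDF text has already been split into chunks. A minimal chunker might look like the sketch below; the function name, chunk size, and overlap are illustrative choices, not the exact values this demo uses:

```python
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks with a small overlap,
    so a sentence cut at a chunk boundary also appears in the next chunk."""
    if size <= overlap:
        raise ValueError("chunk size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap  # step forward, keeping `overlap` chars of context
    return chunks
```

Each returned chunk is then sent to the embedding model; production chunkers usually split on sentence or token boundaries rather than raw characters.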

2. Data Flow

  1. Ingestion (/ingest):
    • FastAPI receives a PDF upload
    • Splits into text chunks → requests embeddings from Gemini
    • Stores resulting vectors in Supabase (Postgres+PGVector)
  2. Chat (/chat):
    • FastAPI receives user query
    • Runs similarity search in PGVector → retrieves top-k chunks
    • Formats those chunks into prompt context
    • Sends the query + context to Gemini Flash-Lite → gets an answer
    • Returns JSON response back to Next.js UI
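
The similarity-search step can be illustrated without a database: the snippet below does in plain Python what pgvector's cosine-distance operator does server-side. The example vectors and `k` are made up for the demonstration:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec: list[float],
          stored: list[tuple[str, list[float]]],
          k: int = 2) -> list[str]:
    """stored: list of (chunk_text, embedding) pairs.
    Returns the k chunks most similar to the query embedding, best first."""
    ranked = sorted(stored,
                    key=lambda item: cosine_similarity(query_vec, item[1]),
                    reverse=True)
    return [chunk for chunk, _ in ranked[:k]]
```

In the real backend this ranking happens inside Postgres (so embeddings never leave the database); the retrieved chunks are then pasted into the prompt as context for the LLM.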

3. Deployment

  • Frontend on Vercel (auto-deploy from GitHub)
  • Backend on Render Web Service (Docker)
  • Database managed by Supabase (Postgres + pgvector)
  • Models via Google Cloud’s Gemini APIs (free tier for demos)

4. Links

Educational prototype only.