Quick Start
Get CodePilot running and connect your first repository in five minutes.
This guide takes you from zero to chatting with a codebase by the shortest path. For detailed setup, see Installation.
Five-Minute Quickstart
Clone and Install
git clone https://github.com/lovesinghal31/codepilot.git
cd codepilot
pnpm install
Start Services
Start the database and cache infrastructure:
docker compose up -d
Start Ollama and pull the embedding and chat models:
ollama pull nomic-embed-text
ollama pull qwen2.5-coder:3b
Configure Environment
Copy the environment files and set minimum required values:
cp .env.example .env
cp .env.api.example .env.api
cp .env.worker.example .env.worker
cp .env.web.example .env.web
At minimum, ensure DATABASE_URL and REDIS_HOST are set correctly in .env.api and .env.worker.
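Before starting the services, it can help to confirm that the required keys are actually filled in. The following is a minimal sketch of such a check, not part of CodePilot: `missingEnvKeys` is a hypothetical helper, and the connection string in the sample is a placeholder rather than a project default. It takes the text of a dotenv-style file and returns the required keys that are absent or empty.

```typescript
// Hypothetical helper (not part of CodePilot): returns the required keys that
// are missing or empty in the text of a dotenv-style file.
function missingEnvKeys(envText: string, required: string[]): string[] {
  const values = new Map<string, string>();
  for (const rawLine of envText.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue; // skip blanks and comments
    const eq = line.indexOf("=");
    if (eq === -1) continue; // not a KEY=value line
    const key = line.slice(0, eq).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const value = line.slice(eq + 1).trim().replace(/^(["'])(.*)\1$/, "$2");
    values.set(key, value);
  }
  return required.filter((key) => (values.get(key) ?? "") === "");
}

// Placeholder file contents for illustration only.
const sampleEnvApi = [
  "# .env.api",
  "DATABASE_URL=postgresql://user:pass@localhost:5432/app",
  "REDIS_HOST=",
].join("\n");

// REDIS_HOST is reported because its value is empty.
console.log(missingEnvKeys(sampleEnvApi, ["DATABASE_URL", "REDIS_HOST"]));
```

In practice you would read `.env.api` and `.env.worker` from disk and run the same check on each before `pnpm dev`.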
Run Migrations and Start
pnpm --filter @repo/db run db:generate
pnpm --filter @repo/db run db:migrate
pnpm dev
The dashboard is now available at http://localhost:3000.
Connect a Repository
- Log in via GitHub OAuth on the dashboard
- Install the CodePilot GitHub App on your account
- Select a repository to connect
- CodePilot will automatically begin ingesting the repository
The worker will clone the repository, parse files, generate embeddings, and store vectors. You can monitor progress in the dashboard.
Chat with Your Code
Once ingestion is complete, navigate to the repository page and start asking questions:
- "What does the main API handler do?"
- "How is authentication implemented?"
- "Show me the database schema"
- "What design patterns are used in this codebase?"
CodePilot retrieves relevant code chunks via semantic search and generates context-aware answers using the local LLM.
What Happens Under the Hood
When you connect a repository, CodePilot runs the following pipeline:
- Clone: The worker clones the repo using a GitHub installation token via simple-git
- Scan: Files are scanned and filtered by language (TypeScript, JavaScript, Markdown, JSON)
- Parse: ts-morph performs AST-aware analysis to identify functions, classes, and components
- Chunk: Code is split into meaningful chunks (function bodies, class definitions, component trees)
- Embed: Each chunk is embedded using nomic-embed-text (768-dimensional vectors)
- Store: Vectors are stored in PostgreSQL via pgvector with HNSW indexing for fast retrieval
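To make the Scan and Store steps concrete, here is a rough sketch under stated assumptions. The extension list, the `CodeChunk` shape, and both function names are hypothetical; the source names only the four languages and the 768-dimensional vector size, so CodePilot's actual filter and schema may differ.

```typescript
// Assumed extension allow-list; the guide only names the four languages.
const INGESTIBLE_EXTENSIONS = [".ts", ".tsx", ".js", ".jsx", ".md", ".json"];

// Vector size produced by nomic-embed-text, as stated in the pipeline above.
const EMBEDDING_DIMENSIONS = 768;

// Hypothetical shape of one chunk after the Embed step.
interface CodeChunk {
  filePath: string;
  kind: "function" | "class" | "component";
  text: string;
  embedding: number[];
}

// Scan step: keep only paths whose extension is in the allow-list.
function filterIngestible(paths: string[]): string[] {
  return paths.filter((p) =>
    INGESTIBLE_EXTENSIONS.some((ext) => p.toLowerCase().endsWith(ext)),
  );
}

// Guard before the Store step: reject vectors of the wrong size, since a
// fixed-dimension vector column would refuse them anyway.
function assertEmbeddingSize(chunk: CodeChunk): void {
  if (chunk.embedding.length !== EMBEDDING_DIMENSIONS) {
    throw new Error(
      `${chunk.filePath}: expected ${EMBEDDING_DIMENSIONS} dimensions, got ${chunk.embedding.length}`,
    );
  }
}

const scannedPaths = filterIngestible([
  "src/index.ts",
  "README.md",
  "logo.png",
  "package.json",
  "Makefile",
]);
// logo.png and Makefile are dropped; the other three paths are kept.
console.log(scannedPaths);
```

Checking the vector length before writing is a cheap way to catch a mismatched embedding model early, for example if a different model was pulled in Ollama.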
When you ask a question:
- Your query is embedded using the same model
- pgvector performs cosine similarity search to find relevant chunks
- The top-matching chunks are fed to qwen2.5-coder:3b as context
- The LLM generates a response grounded in your actual code
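The retrieval steps above can be sketched in a few lines. This is an in-memory illustration only: it ranks chunks by brute-force cosine similarity, whereas CodePilot delegates the search to pgvector and its HNSW index. The `StoredChunk` shape, the function names, the prompt wording, and the three-dimensional toy vectors are all assumptions made for the example.

```typescript
// Hypothetical in-memory stand-in for a row in the vector table.
interface StoredChunk {
  id: string;
  text: string;
  embedding: number[];
}

// Cosine similarity: dot product divided by the product of the vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // treat a zero vector as unrelated
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Exhaustive top-k ranking; an index such as HNSW approximates this result
// without scoring every row.
function topKChunks(queryVector: number[], chunks: StoredChunk[], k: number): StoredChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(queryVector, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Assemble the retrieved chunks and the question into one prompt string.
function buildPrompt(question: string, retrieved: StoredChunk[]): string {
  const blocks = retrieved.map((c) => `// ${c.id}\n${c.text}`).join("\n\n");
  return `Answer using only the code below.\n\n${blocks}\n\nQuestion: ${question}`;
}

// Toy 3-dimensional vectors; real embeddings have 768 dimensions.
const toyChunks: StoredChunk[] = [
  { id: "auth.ts#login", text: "function login() {}", embedding: [0.9, 0.1, 0] },
  { id: "db.ts#connect", text: "function connect() {}", embedding: [0, 1, 0] },
  { id: "ui.tsx#Button", text: "function Button() {}", embedding: [0.5, 0.5, 0] },
];

const bestMatches = topKChunks([1, 0, 0], toyChunks, 2);
console.log(buildPrompt("How is authentication implemented?", bestMatches));
```

With these toy vectors, the query points almost exactly along the `auth.ts#login` embedding, so that chunk ranks first, followed by `ui.tsx#Button`.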