CodePilot

Introduction

CodePilot is an AI-powered GitHub repository analysis platform that uses embeddings, semantic search, and Retrieval-Augmented Generation to let you chat with any codebase.

Connect a GitHub repository and CodePilot will clone it, parse the source code, chunk files along meaningful boundaries, generate vector embeddings, and store them in PostgreSQL using pgvector. This enables AI-powered chat over any codebase through Retrieval-Augmented Generation (RAG).

What CodePilot Does

Connect a Repository

A developer connects a GitHub repository through the dashboard. CodePilot uses GitHub OAuth and GitHub App installations to securely access repository contents.

Ingest and Index

CodePilot clones the repository, parses source files using AST analysis (ts-morph), chunks code into meaningful segments (functions, classes, components), generates 768-dimensional vector embeddings via Ollama's nomic-embed-text model, and stores everything in PostgreSQL with pgvector using HNSW indexing.
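The storage step can be illustrated with a minimal sketch. It assumes a hypothetical `code_chunks` table and column names (CodePilot's actual Prisma schema is not shown here) and relies only on standard pgvector syntax: the `vector(n)` column type, the `hnsw` index method, and the `vector_cosine_ops` operator class. The `toPgVector` helper is likewise illustrative, not part of CodePilot.

```typescript
// Dimension of nomic-embed-text embeddings, as stated above.
const EMBEDDING_DIM = 768;

// Hypothetical schema: table and column names are assumptions for illustration.
const CREATE_TABLE_SQL = `
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS code_chunks (
  id BIGSERIAL PRIMARY KEY,
  file_path TEXT NOT NULL,
  content TEXT NOT NULL,
  embedding vector(${EMBEDDING_DIM}) NOT NULL
);`;

// HNSW index using cosine distance, matching the cosine search used at query time.
const CREATE_INDEX_SQL = `
CREATE INDEX IF NOT EXISTS code_chunks_embedding_idx
  ON code_chunks USING hnsw (embedding vector_cosine_ops);`;

// Serialize a numeric embedding into pgvector's text literal, e.g. "[0.1,0.2,...]".
// Rejects vectors of the wrong length or with non-finite values before they reach the database.
function toPgVector(embedding: number[]): string {
  if (embedding.length !== EMBEDDING_DIM) {
    throw new Error(
      `expected ${EMBEDDING_DIM} dimensions, got ${embedding.length}`
    );
  }
  if (!embedding.every((x) => Number.isFinite(x))) {
    throw new Error("embedding contains a non-finite value");
  }
  return `[${embedding.join(",")}]`;
}
```

The resulting string could be passed as a query parameter and cast with `::vector` in an `INSERT` statement. Validating the dimension up front gives a clearer error than the one PostgreSQL would raise on a mismatched vector.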

Chat with Your Code

Ask natural-language questions about the codebase. CodePilot embeds your query, performs cosine similarity search against stored vectors, retrieves the most relevant code chunks, and feeds them to a local LLM (Ollama) for context-aware answers.
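The retrieval step can be sketched in memory to show what the similarity search computes. In CodePilot the ranking happens inside PostgreSQL (pgvector exposes cosine distance through its `<=>` operator), so the code below is an illustration of the idea rather than the project's implementation. The `Chunk` shape, `topK`, and `buildPrompt` are hypothetical names introduced for this example.

```typescript
// Hypothetical shape of a stored code chunk.
interface Chunk {
  id: string;
  text: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("dimension mismatch");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank chunks by similarity to the query embedding and keep the k best.
function topK(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Assemble a prompt that places the retrieved chunks ahead of the question.
function buildPrompt(question: string, context: Chunk[]): string {
  const snippets = context.map((c) => `--- ${c.id} ---\n${c.text}`).join("\n\n");
  return `Use the following code to answer.\n\n${snippets}\n\nQuestion: ${question}`;
}
```

A brute-force scan like this is linear in the number of chunks. An HNSW index trades exact ranking for approximate nearest-neighbor lookups that stay fast as the repository grows.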

Key Features

Tech Stack

Layer             Technology
Frontend          Next.js 16, React 19, Tailwind CSS, shadcn/ui
API Server        Express 5, TypeScript, Zod validation
Database          PostgreSQL 18 + pgvector, Prisma 7 ORM
Queue             BullMQ + Redis
AI / Embeddings   Ollama (qwen2.5-coder, nomic-embed-text)
GitHub            Octokit, GitHub App (webhooks, OAuth, REST API)
Worker            BullMQ consumers, ts-morph (AST parsing)
Monorepo          Turborepo, pnpm workspaces
Containerization  Docker, Docker Compose

Next Steps
