MCP server for codebase intelligence — patterns, conventions, architecture, and rationale for AI coding agents

PatrickSys/codebase-context

codebase-context

A second brain for AI coding agents. MCP server that remembers team decisions, tracks pattern evolution, and guides every edit with evidence.

Quick Start

Claude Desktop

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "codebase-context": {
      "command": "npx",
      "args": ["-y", "codebase-context", "/path/to/your/project"]
    }
  }
}

VS Code (Copilot)

Add .vscode/mcp.json to your project root:

{
  "servers": {
    "codebase-context": {
      "command": "npx",
      "args": ["-y", "codebase-context", "${workspaceFolder}"]
    }
  }
}

Cursor

Add to .cursor/mcp.json in your project:

{
  "mcpServers": {
    "codebase-context": {
      "command": "npx",
      "args": ["-y", "codebase-context", "/path/to/your/project"]
    }
  }
}

Windsurf

Open Settings > MCP and add:

{
  "mcpServers": {
    "codebase-context": {
      "command": "npx",
      "args": ["-y", "codebase-context", "/path/to/your/project"]
    }
  }
}

Claude Code

Add the server to .claude/settings.json, or skip the config file and run:

claude mcp add codebase-context -- npx -y codebase-context /path/to/your/project

What Makes It a Second Brain

Other tools help AI find code. This one helps AI make the right decisions — by remembering what your team does, tracking how patterns evolve, and warning before mistakes repeat.

Remembers

Decisions, rationale, and past failures persist across sessions. Not just what the team does — why.

  • Internal library usage: @mycompany/ui-toolkit (847 uses) vs primeng (3 uses) — and why the wrapper exists
  • "Tried direct PrimeNG toast, broke event system" — recorded as a failure memory, surfaced before the next agent repeats it
  • Conventions from git history auto-extracted: refactor:, migrate:, fix:, revert: commits become memories with zero manual effort

Reasons

Quantified pattern analysis with trend direction. Not "use inject()" — "97% of the team uses inject(), and it's rising."

  • inject(): 97% adoption vs constructor(): 3% — with trend direction (rising/declining)
  • Signals: rising (last used 2 days ago) vs RxJS BehaviorSubject: declining (180+ days)
  • Golden files: real implementations scoring highest on modern pattern density — canonical examples to follow
  • Pattern conflicts detected: when two approaches in the same category both exceed 20% adoption

Protects

Before an edit happens, the agent gets a preflight briefing: what to use, what to avoid, what broke last time.

  • Preflight card on search_codebase with intent: "edit" — risk level, preferred/avoid patterns, failure warnings, golden files, impact candidates
  • Failure memories bump risk level and surface as explicit warnings
  • Confidence decay: memories age (90-day or 180-day half-life). Stale guidance gets flagged, not blindly trusted
  • Epistemic stress detection: when evidence is contradictory, stale, or too thin, the preflight card says "insufficient evidence" instead of guessing
  • Search quality transparency: search_codebase includes searchQuality (ok/low_confidence, signals, confidence, next steps) so ambiguous retrieval is explicit instead of hidden

Discovers

Hybrid search (BM25 keyword 30% + vector embeddings 70%) with structured filters across 30+ languages:

  • Framework: Angular, React, Vue
  • Language: TypeScript, JavaScript, Python, Go, Rust, and 25+ more
  • Component type: component, service, directive, guard, interceptor, pipe
  • Architectural layer: presentation, business, data, state, core, shared
  • Circular dependency detection, style guide auto-detection, architectural layer classification
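
The 30/70 weighting mentioned above can be pictured as a weighted sum of two scores. The following is a minimal sketch, assuming both scores are already normalized to the range 0 to 1; the names `ScoredHit`, `hybridScore`, and `rankHits` are hypothetical and are not taken from the project's source.

```typescript
// Hypothetical shape for illustration; not the server's real internals.
interface ScoredHit {
  file: string;
  bm25: number;   // keyword score, assumed normalized to [0, 1]
  vector: number; // embedding similarity, assumed normalized to [0, 1]
}

const KEYWORD_WEIGHT = 0.3; // BM25 share stated in this README
const VECTOR_WEIGHT = 0.7;  // vector share stated in this README

// Weighted sum of the two signals.
function hybridScore(hit: ScoredHit): number {
  return KEYWORD_WEIGHT * hit.bm25 + VECTOR_WEIGHT * hit.vector;
}

// Highest combined score first; copies the array so the input is not mutated.
function rankHits(hits: ScoredHit[]): ScoredHit[] {
  return [...hits].sort((a, b) => hybridScore(b) - hybridScore(a));
}

const ranked = rankHits([
  { file: "a.ts", bm25: 0.9, vector: 0.2 }, // strong keyword match only
  { file: "b.ts", bm25: 0.3, vector: 0.8 }, // strong semantic match
]);
```

With these made-up scores, the semantically closer file ranks first because the vector signal carries most of the weight.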

Measured Results

Tested against a real enterprise Angular codebase (~30k files):

| What was measured | Result |
| --- | --- |
| Internal library detection | 336 uses of @company/ui-toolkit vs 3 direct PrimeNG |
| DI pattern consensus | 98% inject() adoption detected, constructor DI flagged |
| Test framework detection | 74% Jest, 26% Jasmine/Karma, per-module awareness |
| Wrapper discovery | ToastEventService, DialogComponent surfaced over raw |
| Golden file identification | Top 5 files scoring 4-6 modern patterns each |

Without this context, AI agents default to generic patterns: raw PrimeNG imports, constructor injection, Jasmine syntax. With the second brain active, generated code matches the existing codebase on first attempt.

How It Works

The difference in practice:

| Without second brain | With second brain |
| --- | --- |
| Uses constructor(private svc: Service) | Uses inject() (97% team adoption) |
| Suggests primeng/button directly | Uses @mycompany/ui-toolkit wrapper |
| Generic Jest setup | Your team's actual test utilities |

Preflight Card

When using search_codebase with intent: "edit", "refactor", or "migrate", the response includes a preflight card alongside search results:

{
  "preflight": {
    "intent": "refactor",
    "riskLevel": "medium",
    "confidence": "fresh",
    "evidenceLock": {
      "mode": "triangulated",
      "status": "pass",
      "readyToEdit": true,
      "score": 100,
      "sources": [
        { "source": "code", "strength": "strong", "count": 5 },
        { "source": "patterns", "strength": "strong", "count": 3 },
        { "source": "memories", "strength": "strong", "count": 2 }
      ]
    },
    "preferredPatterns": [
      { "pattern": "inject() function", "category": "dependencyInjection", "adoption": "98%", "trend": "Rising" }
    ],
    "avoidPatterns": [
      { "pattern": "Constructor injection", "category": "dependencyInjection", "adoption": "2%", "trend": "Declining" }
    ],
    "goldenFiles": [
      { "file": "src/features/auth/auth.service.ts", "score": 6 }
    ],
    "failureWarnings": [
      { "memory": "Direct PrimeNG toast broke event system", "reason": "Must use ToastEventService" }
    ]
  },
  "results": [...]
}

One call. The second brain composes patterns, memories, failures, and risk into a single response.
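
For agents or scripts that consume this response, it can help to type the card. The sketch below infers the types from the example above only, so the real schema may contain more fields or different value sets. The `briefing` helper is hypothetical and simply turns a card into short human-readable lines.

```typescript
// Types inferred from the example response; the actual schema may differ.
interface PatternStat {
  pattern: string;
  category: string;
  adoption: string;
  trend: string;
}

interface Preflight {
  intent: string;
  riskLevel: string;
  confidence: string;
  evidenceLock: {
    mode: string;
    status: string;
    readyToEdit: boolean;
    score: number;
    sources: { source: string; strength: string; count: number }[];
  };
  preferredPatterns: PatternStat[];
  avoidPatterns: PatternStat[];
  goldenFiles: { file: string; score: number }[];
  failureWarnings: { memory: string; reason: string }[];
}

// Hypothetical helper: condense a card into lines an agent could log or display.
function briefing(p: Preflight): string[] {
  const lines: string[] = [];
  lines.push(
    p.evidenceLock.readyToEdit
      ? `OK to edit (risk: ${p.riskLevel})`
      : `Hold: evidence lock status is "${p.evidenceLock.status}"`
  );
  for (const x of p.preferredPatterns) lines.push(`Prefer: ${x.pattern} (${x.adoption}, ${x.trend})`);
  for (const x of p.avoidPatterns) lines.push(`Avoid: ${x.pattern} (${x.adoption}, ${x.trend})`);
  for (const w of p.failureWarnings) lines.push(`Warning: ${w.memory}. ${w.reason}`);
  return lines;
}

// The card from the example response, trimmed to one evidence source.
const card: Preflight = {
  intent: "refactor",
  riskLevel: "medium",
  confidence: "fresh",
  evidenceLock: {
    mode: "triangulated",
    status: "pass",
    readyToEdit: true,
    score: 100,
    sources: [{ source: "code", strength: "strong", count: 5 }],
  },
  preferredPatterns: [
    { pattern: "inject() function", category: "dependencyInjection", adoption: "98%", trend: "Rising" },
  ],
  avoidPatterns: [
    { pattern: "Constructor injection", category: "dependencyInjection", adoption: "2%", trend: "Declining" },
  ],
  goldenFiles: [{ file: "src/features/auth/auth.service.ts", score: 6 }],
  failureWarnings: [
    { memory: "Direct PrimeNG toast broke event system", reason: "Must use ToastEventService" },
  ],
};

const summary = briefing(card);
```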

Tip: Auto-invoke in your rules

Add this to your .cursorrules, CLAUDE.md, or AGENTS.md:

## Codebase Context

**At start of each task:** Call `get_memory` to load team conventions.

**CRITICAL:** When user says "remember this" or "record this":
- STOP immediately and call `remember` tool FIRST
- DO NOT proceed with other actions until memory is recorded
- This is a blocking requirement, not optional

Now the agent checks patterns automatically instead of waiting for you to ask.

Tools

| Tool | Purpose |
| --- | --- |
| search_codebase | Hybrid search with filters. Pass intent: "edit" for preflight card |
| get_component_usage | Find where a library/component is used |
| get_team_patterns | Pattern frequencies, golden files, conflict detection |
| get_codebase_metadata | Project structure overview |
| get_indexing_status | Indexing progress + last stats |
| get_style_guide | Query style guide rules |
| detect_circular_dependencies | Find import cycles between files |
| remember | Record memory (conventions/decisions/gotchas/failures) |
| get_memory | Query memory with confidence decay scoring |
| refresh_index | Re-index the codebase + extract git memories |

Language Support

The Angular analyzer provides deep framework-specific analysis (signals, standalone components, control flow syntax, lifecycle hooks, DI patterns). A generic analyzer covers 30+ languages and file types as a fallback: JavaScript, TypeScript, Python, Java, Kotlin, C/C++, C#, Go, Rust, PHP, Ruby, Swift, Scala, Shell, and common config/markup formats.

File Structure

The MCP creates the following structure in your project:

.codebase-context/
  ├── memory.json         # Team knowledge (commit this)
  ├── intelligence.json   # Pattern analysis (generated)
  ├── index.json          # Keyword index (generated)
  └── index/              # Vector database (generated)

Recommended .gitignore: The vector database and generated files can be large. Add this to your .gitignore to keep them local while sharing team memory:

# Codebase Context MCP - ignore generated files, keep memory
.codebase-context/*
!.codebase-context/memory.json

Memory System

Patterns tell you what the team does ("97% use inject"), but not why ("standalone compatibility"). Use remember to capture rationale that prevents repeated mistakes:

remember({
  type: 'decision',
  category: 'dependencies',
  memory: 'Use node-linker: hoisted, not isolated',
  reason: "Some packages don't declare transitive deps."
});

Memory types: convention (style rules), decision (architecture choices), gotcha (things that break), failure (tried X, failed because Y).

Confidence decay: Memories age. Conventions never decay. Decisions have a 180-day half-life. Gotchas and failures have a 90-day half-life. Memories below 30% confidence are flagged as stale in get_memory responses.
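
To make the half-life figures concrete, here is a minimal sketch that assumes plain exponential decay from an initial confidence of 1.0. The README states the half-lives and the 30% threshold; the exact formula and the names `confidence` and `isStale` are assumptions for illustration.

```typescript
type MemoryType = "convention" | "decision" | "gotcha" | "failure";

// Half-lives from this README; null means the memory never decays.
const HALF_LIFE_DAYS: Record<MemoryType, number | null> = {
  convention: null,
  decision: 180,
  gotcha: 90,
  failure: 90,
};

const STALE_THRESHOLD = 0.3; // below 30% confidence is flagged as stale

// Assumed formula: confidence halves every `halfLife` days.
function confidence(type: MemoryType, ageDays: number): number {
  const halfLife = HALF_LIFE_DAYS[type];
  if (halfLife === null) return 1;
  return Math.pow(0.5, Math.max(0, ageDays) / halfLife);
}

function isStale(type: MemoryType, ageDays: number): boolean {
  return confidence(type, ageDays) < STALE_THRESHOLD;
}

const decisionAtHalfLife = confidence("decision", 180); // one half-life elapsed
const oldFailureIsStale = isStale("failure", 200);      // more than two half-lives
```

Under this assumed formula, a failure memory drops below the threshold after roughly 156 days and a decision after roughly 313 days.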

Git auto-extraction: During indexing, conventional commits (refactor:, migrate:, fix:, revert:) from the last 90 days are auto-recorded as memories. Zero manual effort.
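
The filtering step can be sketched as a prefix match plus a date window. This is an illustration under assumptions, not the project's extraction code: the `Commit` and `ExtractedMemory` shapes are hypothetical, and accepting an optional scope such as `fix(auth):` follows the Conventional Commits format rather than anything this README confirms.

```typescript
// Hypothetical input and output shapes for illustration.
interface Commit {
  subject: string;
  date: Date;
}

interface ExtractedMemory {
  prefix: string;
  summary: string;
  date: Date;
}

// The four prefixes named in this README; optional "(scope)" and "!" are an assumption.
const PREFIX_RE = /^(refactor|migrate|fix|revert)(?:\([^)]*\))?!?:\s*(.+)$/;
const WINDOW_DAYS = 90;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Keep only recent commits whose subject starts with a tracked prefix.
function extractMemories(commits: Commit[], now: Date): ExtractedMemory[] {
  const cutoff = now.getTime() - WINDOW_DAYS * MS_PER_DAY;
  const memories: ExtractedMemory[] = [];
  for (const commit of commits) {
    if (commit.date.getTime() < cutoff) continue; // outside the 90-day window
    const match = PREFIX_RE.exec(commit.subject);
    if (match) {
      memories.push({ prefix: match[1], summary: match[2], date: commit.date });
    }
  }
  return memories;
}

const now = new Date("2025-06-01T00:00:00Z");
const extracted = extractMemories(
  [
    { subject: "refactor: move toasts to ToastEventService", date: new Date("2025-05-20T00:00:00Z") },
    { subject: "feat: add login page", date: new Date("2025-05-21T00:00:00Z") },           // untracked prefix
    { subject: "fix(auth): handle expired token", date: new Date("2025-01-01T00:00:00Z") }, // too old
  ],
  now
);
```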

Pattern conflicts: get_team_patterns detects when two patterns in the same category are both above 20% adoption with different trends, and surfaces them as conflicts with both sides.
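
The conflict rule can be expressed as a pairwise check. The sketch below is a minimal illustration, assuming adoption is available as a percentage number; the `PatternStat` shape, the `detectConflicts` name, and the state-management figures are made up for the example.

```typescript
// Hypothetical shape; adoption is a percentage between 0 and 100.
interface PatternStat {
  name: string;
  category: string;
  adoption: number;
  trend: string;
}

interface Conflict {
  category: string;
  sides: [PatternStat, PatternStat];
}

const CONFLICT_THRESHOLD = 20; // both sides must be above 20% adoption

// Report every pair in the same category that clears the threshold with different trends.
function detectConflicts(patterns: PatternStat[]): Conflict[] {
  const conflicts: Conflict[] = [];
  for (let i = 0; i < patterns.length; i++) {
    for (let j = i + 1; j < patterns.length; j++) {
      const a = patterns[i];
      const b = patterns[j];
      if (
        a.category === b.category &&
        a.adoption > CONFLICT_THRESHOLD &&
        b.adoption > CONFLICT_THRESHOLD &&
        a.trend !== b.trend
      ) {
        conflicts.push({ category: a.category, sides: [a, b] });
      }
    }
  }
  return conflicts;
}

const conflicts = detectConflicts([
  { name: "inject()", category: "dependencyInjection", adoption: 97, trend: "Rising" },
  { name: "constructor()", category: "dependencyInjection", adoption: 3, trend: "Declining" },
  { name: "Signals", category: "stateManagement", adoption: 60, trend: "Rising" },            // made-up figure
  { name: "BehaviorSubject", category: "stateManagement", adoption: 40, trend: "Declining" }, // made-up figure
]);
```

Here the dependency injection pair is not a conflict because constructor injection sits at 3%, while the state-management pair is reported with both sides.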

Memories surface automatically in search_codebase results, get_team_patterns responses, and preflight cards.

Known quirks:

  • Agents may bundle several unrelated items into one memory entry. Be explicit when dictating: "Remember this: use X not Y"
  • To clean up bundled or incorrect entries, edit .codebase-context/memory.json directly

Configuration

| Variable | Default | Description |
| --- | --- | --- |
| EMBEDDING_PROVIDER | transformers | openai (fast, cloud) or transformers (local, private) |
| OPENAI_API_KEY | - | Required if provider is openai |
| CODEBASE_ROOT | - | Project root to index (CLI arg takes precedence) |
| CODEBASE_CONTEXT_DEBUG | - | Set to 1 to enable verbose logging (startup messages, analyzer registration) |
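
The precedence and defaults in this table can be read as the following resolution logic. It is a sketch under assumptions: `resolveConfig` and `ResolvedConfig` are hypothetical names, and throwing on a missing key or an unknown provider is an assumed behavior, not a documented one.

```typescript
type Provider = "openai" | "transformers";

interface ResolvedConfig {
  root: string | undefined; // undefined when neither the CLI arg nor CODEBASE_ROOT is set
  provider: Provider;
  debug: boolean;
}

// Hypothetical resolver: takes CLI args and an env map so it is easy to test.
function resolveConfig(
  cliArgs: string[],
  env: Record<string, string | undefined>
): ResolvedConfig {
  const raw = env.EMBEDDING_PROVIDER ?? "transformers"; // documented default
  let provider: Provider;
  if (raw === "openai" || raw === "transformers") {
    provider = raw;
  } else {
    throw new Error(`Unknown EMBEDDING_PROVIDER: ${raw}`); // assumed behavior
  }
  if (provider === "openai" && !env.OPENAI_API_KEY) {
    throw new Error("OPENAI_API_KEY is required when EMBEDDING_PROVIDER is openai");
  }
  return {
    root: cliArgs[0] ?? env.CODEBASE_ROOT, // CLI arg takes precedence
    provider,
    debug: env.CODEBASE_CONTEXT_DEBUG === "1",
  };
}

const resolved = resolveConfig(["/repo/from/cli"], {
  CODEBASE_ROOT: "/repo/from/env",
  CODEBASE_CONTEXT_DEBUG: "1",
});
```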

Performance

This tool runs locally on your machine.

  • Initial indexing: First run may take several minutes (e.g., 2-5 min for 30k files) to compute embeddings.
  • Subsequent queries: Served from the cached index in milliseconds.
  • Updates: refresh_index supports full or incremental mode (incrementalOnly: true) to process only changed files.

License

MIT
