context-mode

Your Claude sessions degrade as context fills up. context-mode sandboxes raw output, compresses what comes back with a self-learning pipeline, and routes tools automatically — so Claude stays sharp for the entire conversation. 30–60% less context consumed in typical sessions, more in research-heavy ones. Run ctx_stats to see your savings in tokens and dollars.

Context reduction (typical sessions): 30–60%
MCP tools: 9
Languages: 11
User prompt ──► PreToolUse hook ──► sandbox / block / nudge
                     │
                MCP Server ──► execute ──► compress (3 stages) ──► context
                     │                         │
                     │                      Learner ◄── retrieval signals
                     │
                PostToolUse hook ──► event capture + signal files
                     │
                PreCompact hook ──► ≤2KB XML snapshot ──► resume

Install

Option 1 — npx (any machine):
npx --yes --package=github:scottconverse/context-mode context-mode
Option 2 — Marketplace (Claude Code / Cowork):
/plugin marketplace add scottconverse/context-mode
/plugin install context-mode@scottconverse-context-mode
Start a new session. Done.

Six Core Capabilities

Sandbox Execution

Run code in isolated subprocesses. Only stdout returns to context. Raw file contents, command output, and web pages never bloat your context window. Supports JavaScript, TypeScript, Python, Shell, Ruby, Go, Rust, PHP, Perl, R, and Elixir.
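As a rough illustration of the idea, the sketch below runs a command in a child process and hands back only its stdout. It is not the project's executor: `runSandboxed` and its option names are hypothetical, and the 30-second timeout and 100MB cap are simply the figures quoted under Technical Details, used here as defaults.

```javascript
// Hypothetical helper, not the project's PolyglotExecutor.
// Runs a command in a child process; only stdout is returned to the caller.
async function runSandboxed(command, args, options = {}) {
  const { timeoutMs = 30_000, maxBytes = 100 * 1024 * 1024 } = options;
  const { spawn } = await import("node:child_process");

  return new Promise((resolve, reject) => {
    const child = spawn(command, args, { stdio: ["ignore", "pipe", "pipe"] });
    const chunks = [];
    let size = 0;
    let killedFor = null;

    // Enforce the wall-clock limit.
    const timer = setTimeout(() => {
      killedFor = "timeout";
      child.kill("SIGKILL");
    }, timeoutMs);

    child.stdout.on("data", (chunk) => {
      size += chunk.length;
      if (size > maxBytes) {
        // Enforce the output cap and drop the overflowing chunk.
        killedFor = "output cap";
        child.kill("SIGKILL");
        return;
      }
      chunks.push(chunk);
    });

    // Drain stderr so the child never blocks on a full pipe; it is discarded.
    child.stderr.resume();

    child.on("error", (err) => {
      clearTimeout(timer);
      reject(err);
    });
    child.on("close", (exitCode) => {
      clearTimeout(timer);
      resolve({
        stdout: Buffer.concat(chunks).toString("utf8"),
        exitCode,
        killedFor,
      });
    });
  });
}
```

A caller would do something like `await runSandboxed("python3", ["-c", "print(2 + 2)"])` and pass only `result.stdout` on; the raw file or page the script read never leaves the child process.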

Token Compression

3-stage pipeline: deterministic ANSI stripping, pattern-based matchers for 10 tool formats (jest, pytest, git log, cargo build, etc.), and session-aware relevance filtering. Passing tests collapse to counts. Compile steps summarize. Errors and failures are always preserved verbatim.
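The first two stages can be sketched as plain string transforms. This is a minimal illustration and not the project's compressor: `stage1`, `stage2`, and the `PASS`/`FAIL` line format are assumptions modeled loosely on jest-style output, and the learner-weighted third stage is omitted.

```javascript
// Stage 1 (deterministic): strip a leading BOM, ANSI CSI sequences,
// carriage-return overwrites, and runs of blank lines.
function stage1(text) {
  return text
    .replace(/^\uFEFF/, "")
    .replace(/\x1b\[[0-9;?]*[ -\/]*[@-~]/g, "")
    .split("\n")
    // Drop a trailing \r (CRLF), then keep only the text after the last \r,
    // which is what a terminal would show after a progress-bar overwrite.
    .map((line) => line.replace(/\r$/, "").split("\r").pop())
    .join("\n")
    .replace(/\n{3,}/g, "\n\n");
}

// Stage 2 (pattern-based): collapse passing entries to a count.
// Anything that is not a recognized passing line is kept verbatim,
// so failures and their error details survive untouched.
function stage2(text) {
  const kept = [];
  let passed = 0;
  for (const line of text.split("\n")) {
    if (/^\s*PASS\s/.test(line)) passed += 1;
    else kept.push(line);
  }
  if (passed > 0) kept.push(`[${passed} passing entries collapsed]`);
  return kept.join("\n");
}

const compress = (text) => stage2(stage1(text));
```

Running stage 1 before stage 2 matters: the matcher only has to recognize clean text, not every colorized variant of the same line.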

Self-Learning

A feedback loop tracks what compressed content Claude later retrieves. High retrieval rates raise retention; low rates increase compression. Learner accuracy, lifetime token savings, and estimated cost savings (Opus/Sonnet/Haiku) are all visible in ctx_stats.
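The architecture notes describe retention weights as `retrievalRate × 3` with a 5-minute cache. A minimal sketch of that arithmetic follows; the function names, the shape of the stats object, and the zero weight for categories with no history are assumptions made for illustration.

```javascript
const FIVE_MINUTES_MS = 5 * 60 * 1000;

// retrievalRate = share of compressed items that were later retrieved.
function retentionWeight({ compressed, retrieved }) {
  if (compressed === 0) return 0; // assumption: no history means no boost
  const retrievalRate = retrieved / compressed;
  return retrievalRate * 3;
}

// Wraps a stats loader so each category is recomputed at most once
// per five minutes. `loadStats` and `now` are injected to keep the
// sketch testable; both are hypothetical.
function createWeightCache(loadStats, now = () => Date.now()) {
  const cache = new Map(); // category -> { weight, at }
  return function weightFor(category) {
    const hit = cache.get(category);
    if (hit && now() - hit.at < FIVE_MINUTES_MS) return hit.weight;
    const weight = retentionWeight(loadStats(category));
    cache.set(category, { weight, at: now() });
    return weight;
  };
}
```

With 10 compressed items of which 5 were later retrieved, the retrieval rate is 0.5 and the weight is 1.5; how the pipeline turns that weight into a keep-or-compress decision is not specified here.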

Knowledge Base

Index documents into a local SQLite FTS5 database with BM25 ranking, trigram matching, and Reciprocal Rank Fusion. Search returns only relevant snippets. Fuzzy correction handles typos. 24-hour TTL cache for web pages.
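Reciprocal Rank Fusion merges several ranked lists by summing `1 / (K + rank)` for each document across the lists. The sketch below shows the standard formula with K=60, the value listed under Technical Details; the function name and the use of plain id arrays are assumptions, and how the project feeds its BM25 and trigram results into the fusion is not shown.

```javascript
// rankings: array of ranked id lists, best match first.
// Returns ids ordered by fused score, highest first.
function reciprocalRankFusion(rankings, k = 60) {
  const scores = new Map();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id, score]) => ({ id, score }));
}

// Hypothetical result lists from two retrievers.
const bm25Hits = ["a", "b", "c"];
const trigramHits = ["b", "c", "a"];
const fused = reciprocalRankFusion([bm25Hits, trigramHits]);
// fused order: b, a, c
```

Because only ranks are used, the two retrievers do not need comparable score scales, which is the usual reason to prefer RRF over adding raw scores.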

Session Continuity

Hooks capture file operations, git commands, errors, tasks, and decisions into SQLite. Before context compaction, a priority-tiered XML snapshot (under 2KB) preserves what matters. Claude resumes from exactly where it left off.
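One way to picture a priority-tiered snapshot under a byte budget: sort captured events by priority, then append XML elements until the next one would exceed 2KB. The element names, the event shape, and the convention that a lower number means higher priority are all assumptions for this sketch; the project's actual snapshot schema is not documented here.

```javascript
const BUDGET_BYTES = 2048;

const byteLength = (s) => new TextEncoder().encode(s).length;

const escapeXml = (s) =>
  s.replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");

// events: [{ type, priority, summary }], lower priority number = more important.
function buildSnapshot(events, budget = BUDGET_BYTES) {
  const open = "<session_snapshot>";
  const close = "</session_snapshot>";
  let used = byteLength(open) + byteLength(close);
  const parts = [];

  // Array.prototype.sort is stable, so capture order is kept within a tier.
  const ordered = [...events].sort((a, b) => a.priority - b.priority);
  for (const ev of ordered) {
    const el = `<event type="${escapeXml(ev.type)}" priority="${ev.priority}">${escapeXml(ev.summary)}</event>`;
    const cost = byteLength(el);
    if (used + cost > budget) continue; // skip what does not fit, keep trying smaller entries
    parts.push(el);
    used += cost;
  }
  return open + parts.join("") + close;
}
```

Measuring bytes rather than characters keeps the budget honest when summaries contain non-ASCII text.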

Automatic Tool Routing

PreToolUse hooks intercept Bash, Read, Grep, WebFetch, and Agent calls. 23 matchers cover git, npm, pytest, cargo, docker, make, curl/wget, and more. Large-output commands redirect through the compression pipeline automatically.
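A declarative routing table can be as simple as an ordered list of tool, pattern, and action entries, with the first match winning. The sketch below is illustrative only: the rule shape, the action names, and the regular expressions are assumptions, and it covers a handful of cases rather than the project's full rule set.

```javascript
// Hypothetical rule table; first matching rule wins.
const rules = [
  { tool: "Bash", pattern: /^\s*(curl|wget)\b/, action: "block" },
  { tool: "Bash", pattern: /^\s*git\s+(log|diff)\b/, action: "compress" },
  { tool: "Bash", pattern: /^\s*(pytest|npm\s+test|npx\s+(jest|vitest))\b/, action: "compress" },
  { tool: "WebFetch", pattern: /.*/, action: "deny", redirect: "ctx_fetch_and_index" },
];

// tool: the tool name the hook intercepted; input: the command or URL.
function route(tool, input) {
  const rule = rules.find((r) => r.tool === tool && r.pattern.test(input));
  if (!rule) return { action: "allow", redirect: null };
  return { action: rule.action, redirect: rule.redirect ?? null };
}
```

Here `route("Bash", "git log --oneline")` yields a compress decision, `route("WebFetch", "https://example.com")` is denied with a pointer to `ctx_fetch_and_index`, and an unmatched command such as `ls` is allowed through unchanged.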

Architecture

System Context (always-on)
|-- CLAUDE.md --> tool rules + routing instructions
|-- .claude/settings.json --> hook registrations + permissions
|
Cowork Session
|
|-- start.js bootstrapper (ensure-deps, MCP server spawn)
|
|-- SessionStart hook --> routing block + session guide injection
|-- UserPromptSubmit hook --> prompt preprocessing + context injection
|-- PreToolUse hook --> 18 declarative rules (routing-rules.js + engine)
|   |-- curl/wget --> block stdout floods; allow silent+file-output
|   |-- inline HTTP --> fetch()/requests.get() → ctx_execute sandbox
|   |-- gradle/maven --> build tools → ctx_execute sandbox
|   |-- git log/diff --> redirect unbounded through compressor
|   |-- npm test/jest/vitest --> redirect test runners through compressor
|   |-- pytest --> redirect through compressor
|   |-- npm/pip install --> redirect package managers through compressor
|   |-- cargo/docker/make --> redirect build tools through compressor
|   |-- Read/Grep --> once-per-session guidance nudge
|   |-- WebFetch --> deny; redirect → ctx_fetch_and_index
|   \-- Agent/Task --> inject routing block into subagent prompt
|-- PostToolUse hook --> event capture (13 categories, 4 priorities)
|-- SubagentStop hook --> subagent result summarization
|-- PreCompact hook --> snapshot builder (≤2KB XML)
|
|-- MCP Server (stdio)
|   |
|   |-- ctx_execute / ctx_execute_file
|   |   \-- PolyglotExecutor (subprocess isolation)
|   |       |-- 11 language runtimes
|   |       |-- 100MB stdout hard cap
|   |       |-- 30s timeout, background mode
|   |       \-- Windows taskkill / Unix kill -PGID
|   |
|   |-- ctx_index / ctx_search / ctx_fetch_and_index
|   |   \-- ContentStore (SQLite FTS5)
|   |       |-- Porter stemmer + trigram tokenizers
|   |       |-- BM25 (title 5x weight) + RRF fusion (K=60)
|   |       |-- Proximity reranking + Levenshtein fuzzy
|   |       \-- Smart snippets (windowed extraction)
|   |
|   |-- Compressor (3-stage pipeline)
|   |   |-- Stage 1: deterministic (ANSI, \r, BOM, blanks)
|   |   |-- Stage 2: pattern-based (10 tool matchers)
|   |   \-- Stage 3: session-aware (learner weights)
|   |
|   |-- Learner (self-improving feedback loop)
|   |   |-- compression_log (decisions + retrieval tracking)
|   |   |-- retention weights (retrievalRate × 3, 5min cache)
|   |   \-- signal files ← PostToolUse ctx_search detection
|   |
|   |-- ctx_batch_execute (multi-command + multi-query)
|   |-- ctx_stats (token savings + cost + learner dashboard)
|   |-- ctx_doctor / ctx_purge
|   |
|   \-- Session DB (SQLite)
|       |-- session_events (1000 max, FIFO eviction)
|       |-- session_meta (compact count tracking)
|       \-- session_resume (snapshot storage)
|
\-- Skills: /context-mode, /ctx-stats, /ctx-doctor, /ctx-purge
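The ContentStore entry above lists Levenshtein fuzzy matching. A minimal sketch of typo correction by edit distance follows, assuming a small in-memory vocabulary of indexed terms; `correctTerm` and the maximum distance of 2 are hypothetical choices, not values taken from the project.

```javascript
// Classic Levenshtein edit distance using a single rolling row.
function levenshtein(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // value of d[i-1][j-1]
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // value of d[i-1][j]
      const substitution = diagonal + (a[i - 1] === b[j - 1] ? 0 : 1);
      row[j] = Math.min(above + 1, row[j - 1] + 1, substitution);
      diagonal = above;
    }
  }
  return row[b.length];
}

// Returns the closest vocabulary word within maxDistance edits,
// or the original term when nothing is close enough.
function correctTerm(term, vocabulary, maxDistance = 2) {
  let best = null;
  let bestDistance = maxDistance + 1;
  for (const word of vocabulary) {
    const distance = levenshtein(term, word);
    if (distance < bestDistance) {
      best = word;
      bestDistance = distance;
    }
  }
  return best ?? term;
}
```

A misspelled query term such as `compresion` would be mapped to `compression` before the search runs, while a term with no near neighbor is passed through unchanged.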

MCP Tools

| Tool | Purpose |
| --- | --- |
| ctx_execute | Run code in a sandboxed subprocess (11 languages) |
| ctx_execute_file | Process files through the sandbox; raw content stays out of context |
| ctx_batch_execute | Multiple commands + searches in one call |
| ctx_index | Index text/markdown/JSON into the knowledge base |
| ctx_search | BM25 + trigram search with RRF fusion |
| ctx_fetch_and_index | Fetch URL, convert to markdown, index with 24h TTL cache |
| ctx_stats | Token savings, compression breakdown, cost estimates (Opus/Sonnet/Haiku), and learner accuracy |
| ctx_doctor | Plugin environment diagnostics |
| ctx_purge | Delete all indexed content |

Technical Details

| Spec | Value |
| --- | --- |
| Language | JavaScript (ES modules, Node.js ≥ 18) |
| Database | SQLite via better-sqlite3 with WAL mode |
| Search | FTS5 (Porter + trigram), BM25, RRF (K=60), proximity reranking |
| Sandbox | Subprocess isolation, 100MB cap, 30s timeout |
| Session | 1000 events max, SHA-256 dedup, FIFO eviction |
| Snapshot | Priority-tiered XML, ≤2KB budget |
| Cache | 24h TTL for fetched URLs |
| Throttling | Progressive: 2→1→blocked over 60s window |
| Compression | 3-stage pipeline: deterministic → pattern-based (10 matchers) → session-aware (learner weights) |
| Learner | Retrieval-based feedback loop, retention weights (5min cache), 7-day decay, signal files |
| Hooks | 6 events, 18 declarative routing rules (PreToolUse, PostToolUse, PreCompact, SessionStart, UserPromptSubmit, SubagentStop) |
| Platform | Windows, macOS, Linux (cross-platform .cmd wrapper) |
| Schema | PRAGMA user_version tracking, ordered migrations, automatic backup |
| Tests | 222 E2E (20 sections) + 58 adversarial + 76 compression + ~55 per-rule routing (vitest) = 410+ total |
| License | Elastic License 2.0 |
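The Session row combines three behaviors: a 1000-event cap, SHA-256 deduplication, and FIFO eviction. The in-memory sketch below shows how they can fit together. The real store is SQLite-backed; `createEventStore`, the event shape, and the choice to hash the event's type and data are assumptions made for illustration.

```javascript
// Hypothetical in-memory stand-in for the session_events table.
async function createEventStore(maxEvents = 1000) {
  const { createHash } = await import("node:crypto");
  const events = []; // oldest first
  const seen = new Set(); // SHA-256 keys of events currently stored

  // Assumption: two events are duplicates when type and data match.
  const keyOf = (ev) =>
    createHash("sha256").update(JSON.stringify([ev.type, ev.data])).digest("hex");

  return {
    // Returns true when stored, false when dropped as a duplicate.
    add(ev) {
      const key = keyOf(ev);
      if (seen.has(key)) return false;
      events.push({ ...ev, key });
      seen.add(key);
      if (events.length > maxEvents) {
        const evicted = events.shift(); // FIFO: drop the oldest event
        seen.delete(evicted.key);
      }
      return true;
    },
    size: () => events.length,
    list: () => events.map(({ key, ...ev }) => ev),
  };
}
```

Removing the evicted key from the `seen` set means an event that ages out can be recorded again later, which seems the more useful behavior for a rolling window; whether the project does the same is not stated.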

Resources