PatentForge runs a complete AI patent analysis pipeline -- feasibility research, prior art search, claim drafting, compliance checking, and USPTO-formatted application generation. Pick Local mode (Ollama + Gemma 4 on your hardware; free; private) or Cloud mode (Anthropic Claude; faster; pay-per-call from your own account). Switch modes any time in Settings.
Version 0.5.0 · Windows, macOS, and Linux · Lean (cloud-only) and Full (Ollama + Gemma 4 bundled) editions.
No Node.js, Python, or git required. Everything is included.
One product, two modes. Local mode runs Ollama + Gemma 4 on your hardware: private, free, with inference fully offline. Cloud mode uses your own Anthropic API key: faster, no GPU needed. Switch any time in Settings.
The AI pipeline restates your invention in patent terminology, searches for related art, identifies potential issues, and organizes findings into a structured report. Output streams live over Server-Sent Events (SSE).
USPTO PatentSearch API integration with stop-word filtering and title-weighted scoring. Lazy-load actual patent claims text on demand.
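As a rough illustration of stop-word filtering combined with title-weighted scoring, here is a minimal sketch. The stop-word list, the weight of 3.0, and the function names are assumptions made for this example; the source does not state the actual values or implementation.

```python
import re
from typing import List

# Hypothetical stop-word list and weight; the real values are not given in the source.
STOP_WORDS = {"a", "an", "and", "the", "of", "for", "to", "in", "with", "on", "by", "is"}
TITLE_WEIGHT = 3.0  # assumed: a title hit counts three times as much as an abstract hit


def tokenize(text: str) -> List[str]:
    """Lowercase, split on non-alphanumerics, and drop stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS]


def score(query: str, title: str, abstract: str) -> float:
    """Sum per-term hits, weighting title matches above abstract matches."""
    terms = set(tokenize(query))
    title_terms = set(tokenize(title))
    abstract_terms = set(tokenize(abstract))
    return sum(
        TITLE_WEIGHT * (t in title_terms) + 1.0 * (t in abstract_terms)
        for t in terms
    )
```

With these assumed weights, a query term found only in a result's title contributes 3.0 and a term found only in its abstract contributes 1.0, so results whose titles match the invention's key terms rank first.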
3-agent pipeline (Planner, Writer, Examiner) generates independent and dependent patent claims. Edit individual claims, view as dependency tree, export to Word.
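The dependency-tree view can be pictured as a parent-to-children mapping over claim numbers. The sketch below assumes each claim records the single claim it depends on (or `None` for an independent claim); the data shape and function names are hypothetical and not taken from the product.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def build_tree(claims: Dict[int, Optional[int]]) -> Dict[Optional[int], List[int]]:
    """Group claim numbers under the claim they depend on (None = independent)."""
    children: Dict[Optional[int], List[int]] = defaultdict(list)
    for number, parent in sorted(claims.items()):
        children[parent].append(number)
    return dict(children)


def render(children: Dict[Optional[int], List[int]],
           parent: Optional[int] = None, depth: int = 0) -> List[str]:
    """Return indented lines, one per claim, walking the tree depth-first."""
    lines: List[str] = []
    for number in children.get(parent, []):
        kind = "independent" if parent is None else f"depends on {parent}"
        lines.append("  " * depth + f"Claim {number} ({kind})")
        lines.extend(render(children, number, depth + 1))
    return lines
```

For example, `render(build_tree({1: None, 2: 1, 3: 2, 4: None}))` lists Claim 1 with Claims 2 and 3 nested beneath it, followed by Claim 4 at the top level.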
5-agent pipeline generates complete USPTO-formatted patent applications: background, summary, detailed description, abstract, and figure descriptions. Word export follows 37 CFR 1.52.
Four automated checks validate claims against 35 USC 112(a), 35 USC 112(b), 35 USC 101, and MPEP 608. Traffic-light PASS/FAIL/WARN results with MPEP citations and fix suggestions.
Watch the AI write its findings in real time. Stage progress indicators show exactly where you are in the pipeline. Re-run individual stages.
Download findings as styled HTML, Word (.docx with full formatting), or Markdown. CSV export for prior art. Bring structured research to your first attorney meeting.
Cloud-mode runs show an estimated cost up front, and you must approve it before any Anthropic API call is made. Local mode skips the confirmation modal because it is free. Per-stage and per-run cost displays adapt: "Free" in Local, dollars in Cloud.
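The cost gating described above might look roughly like the following. This is only a sketch: the per-million-token rates are passed in by the caller rather than hard-coded because the source does not list prices, and the `"CLOUD"` provider value and all function names are assumptions.

```python
def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      usd_per_mtok_in: float, usd_per_mtok_out: float) -> float:
    """Estimate a run's cost from token counts and per-million-token rates."""
    return (input_tokens * usd_per_mtok_in + output_tokens * usd_per_mtok_out) / 1_000_000


def cost_label(provider: str, cost_usd: float) -> str:
    """Show "Free" in Local mode and a dollar amount otherwise."""
    return "Free" if provider == "LOCAL" else f"${cost_usd:.2f}"


def needs_confirmation(provider: str) -> bool:
    """Only Cloud-mode runs are gated behind an approval step (assumed value "CLOUD")."""
    return provider == "CLOUD"
```

Keeping the estimate a pure function of token counts and rates means the same number can feed both the confirmation modal and the per-stage display.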
Pre-flight validation of hardware (RAM, disk, CPU), Ollama availability, model download status, and GPU detection. Skipped in Cloud mode and on Lean installs.
| | Cloud mode | Local mode |
|---|---|---|
| RAM | 4 GB | 16 GB min · 32 GB+ recommended |
| Disk | 1 GB free | 25 GB min · 50 GB+ recommended |
| CPU | 2 cores, 2018+ | 4 cores, 2018+ · 8+ recommended |
| GPU | Not required | Not required; accelerates inference if present |
| Network | Required (Anthropic API) | Optional (USPTO + web search only) |
| OS | Win 10+, macOS 12+, Ubuntu 22+ | Same |
Local mode auto-detects NVIDIA (CUDA), AMD (ROCm), and Apple Silicon (Metal) for GPU acceleration. Cloud mode is lightweight — the model runs on Anthropic, not your hardware.
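To make the pre-flight idea concrete, the sketch below grades measurements against the Local mode minimum and recommended values from the requirements table. The three-level PASS/WARN/FAIL grading, the dictionary keys, and the function names are assumptions for illustration; the Lean-install skip and GPU detection are not modeled.

```python
import os
import shutil
from typing import Dict, Tuple

# (minimum, recommended) pairs taken from the Local mode column of the table.
LOCAL_LIMITS: Dict[str, Tuple[float, float]] = {
    "ram_gb": (16, 32),
    "disk_gb": (25, 50),
    "cpu_cores": (4, 8),
}


def grade(value: float, minimum: float, recommended: float) -> str:
    """Assumed grading: FAIL below minimum, WARN below recommended, else PASS."""
    if value < minimum:
        return "FAIL"
    return "WARN" if value < recommended else "PASS"


def preflight(measured: Dict[str, float], provider: str = "LOCAL") -> Dict[str, str]:
    """Grade each measurement; outside Local mode the checks are skipped."""
    if provider != "LOCAL":
        return {}
    return {key: grade(measured[key], lo, hi) for key, (lo, hi) in LOCAL_LIMITS.items()}


def measure_disk_and_cpu(path: str = ".") -> Dict[str, float]:
    """Standard-library probes only; RAM needs a platform-specific call."""
    return {
        "disk_gb": shutil.disk_usage(path).free / 1024**3,
        "cpu_cores": os.cpu_count() or 1,
    }
```

A machine with 16 GB of RAM, 100 GB free, and 2 cores would grade as WARN, PASS, and FAIL respectively under these assumed rules.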
| | Local mode | Cloud mode |
|---|---|---|
| AI | Ollama + Gemma 4 (on this machine) | Anthropic Claude (cloud API) |
| Cost | Free (your electricity) | Pay-per-call to your Anthropic account (~$0.10–$2 per analysis) |
| Privacy | Inference never leaves your hardware | Calls go to Anthropic per their API terms |
| API Key | None | Your own (encrypted at rest) |
| Speed | Depends on your hardware | Frontier-fast on any machine |
| Models | Gemma 4 (e4b / 26B / etc.) | Claude Haiku 4.5 / Sonnet 4.6 / Opus 4.7 |
| Cost-confirm | No modal (it's free) | Modal before each run with estimated $ |
Default is Local mode on Full installs (and on upgrades from PatentForgeLocal). Cloud mode is opt-in via the first-run wizard or Settings.
Six-service federated architecture managed by a Go system tray app. LLM calls route through a per-service LLMClient boundary that dispatches to Ollama (Local mode) or Anthropic via LiteLLM (Cloud mode). The Full installer bundles Node SEA binaries, portable Python, and Ollama; the Lean installer skips Ollama for a smaller, cloud-only artifact.
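The LLMClient boundary can be thought of as one interface with a backend chosen by the provider setting. The sketch below uses stub backends that return tagged strings instead of calling Ollama or LiteLLM, so it shows only the dispatch shape; the `complete` method, the stub class names, and the `"CLOUD"` value are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Dict, Type


class LLMClient(ABC):
    """Interface each service talks to, regardless of provider."""

    @abstractmethod
    def complete(self, prompt: str) -> str:
        ...


class LocalStub(LLMClient):
    """Stand-in for the Ollama-backed client (no real inference here)."""

    def complete(self, prompt: str) -> str:
        return f"[local] {prompt}"


class CloudStub(LLMClient):
    """Stand-in for the Anthropic-via-LiteLLM client (no network call here)."""

    def complete(self, prompt: str) -> str:
        return f"[cloud] {prompt}"


BACKENDS: Dict[str, Type[LLMClient]] = {"LOCAL": LocalStub, "CLOUD": CloudStub}


def make_client(provider: str) -> LLMClient:
    """Pick the backend for the configured provider, rejecting unknown values."""
    try:
        return BACKENDS[provider]()
    except KeyError:
        raise ValueError(f"unknown provider: {provider!r}") from None
```

Because callers depend only on `LLMClient`, switching modes in Settings would change which backend `make_client` returns without touching pipeline code.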
Invention intake, streaming output, stage progress, report viewer, claims editor, Provider chooser + conditional reveals in Settings, CostConfirmModal for Cloud-mode runs.
Project CRUD, Settings (provider + installEdition + encrypted API keys), feasibility run tracking, claim draft orchestration, SSE event forwarding, prior art search via USPTO, Word export.
6-stage sequential AI analysis pipeline. Each stage uses a markdown prompt template through the LLMClient boundary. Streams tokens in real time via SSE.
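A sequential pipeline that streams tokens over SSE could be sketched as a generator of event frames. The event names, payload fields, and the choice to feed each stage's output into the next stage are all assumptions for this example; only the `event:`/`data:` framing with a blank-line terminator comes from the SSE format itself.

```python
import json
from typing import Callable, Iterable, Iterator, List, Tuple

Stage = Tuple[str, Callable[[str], Iterable[str]]]


def sse_frame(event: str, data: dict) -> str:
    """Format one SSE message; json.dumps keeps the data on a single line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def run_pipeline(stages: List[Stage], invention: str) -> Iterator[str]:
    """Run stages in order, yielding an SSE frame for each token as it arrives."""
    context = invention
    for index, (name, stage) in enumerate(stages, start=1):
        yield sse_frame("stage_start", {"stage": name, "index": index})
        output: List[str] = []
        for token in stage(context):
            output.append(token)
            yield sse_frame("token", {"stage": name, "text": token})
        # Assumption: the next stage receives this stage's full output.
        context = "".join(output)
        yield sse_frame("stage_complete", {"stage": name})
```

An HTTP handler would write each yielded frame to a response with the `text/event-stream` content type, which lets the UI update stage progress indicators as frames arrive.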
3-agent claim drafting pipeline: Planner (strategy), Writer (claims), Examiner (review). LangGraph state machine routed through LLMClient.
5-agent sequential LangGraph pipeline generates complete USPTO-formatted patent applications. Word export follows 37 CFR 1.52.
4 specialized checker agents validate claims against 35 USC 112(a), 35 USC 112(b), 35 USC 101, and MPEP 608. Per-check structured output through LLMClient.
Gemma 4 inference (Full edition; runs only when Provider=LOCAL)
Claude 4.x via LiteLLM with cost-confirm gating
Patent search + claims retrieval (both modes)
Optional research augmentation (both modes)
Ollama -- local LLM runtime
Google Gemma 4 -- open-weight language model (e4b dense / 26B MoE)
Anthropic Claude -- frontier cloud LLM (Haiku 4.5 / Sonnet 4.6 / Opus 4.7)
LiteLLM -- unified provider abstraction (Ollama + Anthropic via one client)
context-mode -- context window compression