Patent analysis your way —
run locally or in the cloud

PatentForge runs a complete AI patent analysis pipeline -- feasibility research, prior art search, claim drafting, compliance checking, and USPTO-formatted application generation. Pick Local mode (Ollama + Gemma 4 on your hardware; free; private) or Cloud mode (Anthropic Claude; faster; pay-per-call from your own account). Switch modes any time in Settings.

Download Latest Release · View on GitHub

Version 0.5.0 · Windows, macOS, and Linux · Lean (cloud-only) and Full (Ollama + Gemma 4 bundled) editions.
No Node.js, Python, or git required. Everything is included.

User Manual · README

Quick Start

  1. Pick an edition: Lean (no Ollama bundle, cloud-only) or Full (Ollama + Gemma 4 bundled; supports both modes).
  2. Download the installer from GitHub Releases.
  3. Run the installer and follow the first-launch wizard — Full installs ask "Local or Cloud?"; Lean installs skip straight to Cloud setup. Switch modes any time in Settings.

What PatentForge Does

Local or Cloud — Your Choice

One product, two modes. Local mode runs Ollama + Gemma 4 on your hardware: private, free, with inference fully offline. Cloud mode uses your own Anthropic API key: faster, no GPU needed. Switch any time in Settings.

6-Stage Feasibility Analysis

AI pipeline restates your invention in patent terminology, searches for related art, identifies potential issues, and organizes findings into a structured report. Streams live over SSE.

Prior Art Discovery

USPTO PatentSearch API integration with stop-word filtering and title-weighted scoring. The full claims text of each patent is lazy-loaded on demand.
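As a rough illustration of stop-word filtering combined with title-weighted scoring, the sketch below ranks patents by how many distinct query terms they contain, counting a title match more heavily than an abstract match. The stop-word list, the `TITLE_WEIGHT` value, and every function name here are assumptions made for illustration; they are not taken from PatentForge's code.

```python
import re

# Hypothetical stop-word list and weight; the values PatentForge uses are not stated here.
STOP_WORDS = {"a", "an", "the", "of", "for", "and", "or", "to", "in", "with", "method", "system"}
TITLE_WEIGHT = 3.0


def tokenize(text: str) -> list[str]:
    """Lowercase, split on non-alphanumerics, and drop stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS]


def score(query: str, title: str, abstract: str) -> float:
    """Count distinct query terms found in the title (weighted) and in the abstract."""
    terms = set(tokenize(query))
    title_hits = terms & set(tokenize(title))
    abstract_hits = terms & set(tokenize(abstract))
    return TITLE_WEIGHT * len(title_hits) + len(abstract_hits)


def rank(query: str, patents: list[dict]) -> list[dict]:
    """Sort patents (dicts with 'title' and 'abstract' keys) by descending score."""
    return sorted(patents, key=lambda p: score(query, p["title"], p["abstract"]), reverse=True)
```

With these assumed values, a patent whose title matches two query terms outranks one that only mentions the same two terms in its abstract.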

AI Claim Drafting

3-agent pipeline (Planner, Writer, Examiner) generates independent and dependent patent claims. Edit individual claims, view as dependency tree, export to Word.
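The dependency-tree view can be pictured with a small data structure: each claim records the claim it depends on, and independent claims have no parent. This is a minimal sketch under that assumption; the `Claim` class and both helper functions are hypothetical names, not PatentForge's actual model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Claim:
    number: int
    text: str
    depends_on: Optional[int] = None  # None marks an independent claim


def build_tree(claims: list[Claim]) -> dict[Optional[int], list[Claim]]:
    """Group claims by parent claim number; the key None holds independent claims."""
    children: dict[Optional[int], list[Claim]] = {}
    for claim in sorted(claims, key=lambda c: c.number):
        children.setdefault(claim.depends_on, []).append(claim)
    return children


def render(children: dict[Optional[int], list[Claim]],
           parent: Optional[int] = None, depth: int = 0) -> list[str]:
    """Depth-first walk that returns one indented line per claim."""
    lines: list[str] = []
    for claim in children.get(parent, []):
        lines.append("  " * depth + f"{claim.number}. {claim.text}")
        lines.extend(render(children, claim.number, depth + 1))
    return lines
```

Calling `render(build_tree(claims))` yields the independent claims at the left margin with each dependent claim indented beneath its parent.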

Application Generation

5-agent pipeline generates complete USPTO-formatted patent applications: background, summary, detailed description, abstract, and figure descriptions. Word export follows 37 CFR 1.52.

Compliance Checking

Four automated checks validate claims against 35 USC 112(a), 35 USC 112(b), MPEP 608, and 35 USC 101. Traffic-light PASS/FAIL/WARN results with MPEP citations and fix suggestions.

Real-Time Streaming

Watch the AI write its findings in real time. Stage progress indicators show exactly where you are in the pipeline. Re-run individual stages.

Attorney-Ready Exports

Download findings as styled HTML, Word (.docx with full formatting), or Markdown. CSV export for prior art. Bring structured research to your first attorney meeting.

Cost-Confirm Modal (Cloud Mode)

Cloud-mode runs show an estimated cost before they start, and no Anthropic API call is made until you approve. Local mode bypasses the modal (it's free). Per-stage and per-run cost displays adapt: "Free" in Local, dollars in Cloud.
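The gating logic can be sketched as a token-based estimate plus an approval check. Everything below is an assumption for illustration: the `Pricing` fields, the `"CLOUD"` provider value, and the example prices are placeholders, not PatentForge's estimator or Anthropic's actual rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical per-million-token prices in USD."""
    input_per_mtok: float
    output_per_mtok: float


def estimate_cost(input_tokens: int, output_tokens: int, pricing: Pricing) -> float:
    """Estimated run cost in USD from expected token counts."""
    total = input_tokens * pricing.input_per_mtok + output_tokens * pricing.output_per_mtok
    return total / 1_000_000


def cost_label(provider: str, cost_usd: float) -> str:
    """Display string: 'Free' for Local runs, dollars for Cloud runs."""
    return "Free" if provider == "LOCAL" else f"${cost_usd:.2f}"


def may_start_run(provider: str, approved: bool) -> bool:
    """Local runs skip confirmation; Cloud runs need explicit approval first."""
    return provider == "LOCAL" or approved
```

For example, with placeholder prices of $2 and $10 per million input and output tokens, a run expected to use 100,000 input and 20,000 output tokens would be shown as `$0.40` and would not start until approved.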

System Check (Local Mode)

Pre-flight validation of hardware (RAM, disk, CPU), Ollama availability, model download status, and GPU detection. Skipped in Cloud mode and on Lean installs.
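A pre-flight check of this kind can be approximated as a comparison against the Local-mode minimums listed under System Requirements (16 GB RAM, 25 GB free disk, 4 CPU cores). The sketch below is illustrative only: the function names are hypothetical, and RAM is passed in as a parameter because the Python standard library has no portable way to read it.

```python
import os
import shutil

# Local-mode minimums from the System Requirements section.
MIN_RAM_GB = 16
MIN_DISK_GB = 25
MIN_CPU_CORES = 4


def evaluate(ram_gb: float, free_disk_gb: float, cpu_cores: int) -> dict[str, bool]:
    """Return a pass/fail flag per resource against the Local-mode minimums."""
    return {
        "ram": ram_gb >= MIN_RAM_GB,
        "disk": free_disk_gb >= MIN_DISK_GB,
        "cpu": cpu_cores >= MIN_CPU_CORES,
    }


def probe_disk_and_cpu(path: str = ".") -> tuple[float, int]:
    """Read free disk space (GiB) for `path` and the logical CPU count."""
    free_gb = shutil.disk_usage(path).free / 1024**3
    return free_gb, os.cpu_count() or 1
```

Ollama availability, model download status, and GPU detection are left out of this sketch, since how PatentForge probes them is not described here.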

The 6-Stage Research Pipeline

  1. Technical Intake & Restatement
  2. Prior Art Research
  3. Patentability Assessment
  4. Deep Dive Analysis
  5. IP Landscape Assessment
  6. Consolidated Report

System Requirements

| | Cloud mode | Local mode |
| --- | --- | --- |
| RAM | 4 GB | 16 GB min · 32 GB+ recommended |
| Disk | 1 GB free | 25 GB min · 50 GB+ recommended |
| CPU | 2 cores, 2018+ | 4 cores, 2018+ · 8+ recommended |
| GPU | Not required | Not required; accelerates inference if present |
| Network | Required (Anthropic API) | Optional (USPTO + web search only) |
| OS | Win 10+, macOS 12+, Ubuntu 22+ | Same |

Local mode auto-detects NVIDIA (CUDA), AMD (ROCm), and Apple Silicon (Metal) for GPU acceleration. Cloud mode is lightweight — the model runs on Anthropic, not your hardware.

Local mode vs Cloud mode

| | Local mode | Cloud mode |
| --- | --- | --- |
| AI | Ollama + Gemma 4 (on this machine) | Anthropic Claude (cloud API) |
| Cost | Free (your electricity) | Pay-per-call to your Anthropic account (~$0.10–$2 per analysis) |
| Privacy | Inference never leaves your hardware | Calls go to Anthropic per their API terms |
| API Key | None | Your own (encrypted at rest) |
| Speed | Depends on your hardware | Frontier-fast on any machine |
| Models | Gemma 4 (e4b / 26B / etc.) | Claude Haiku 4.5 / Sonnet 4.6 / Opus 4.7 |
| Cost-confirm | No modal (it's free) | Modal before each run with estimated $ |

Default is Local mode on Full installs (and on upgrades from PatentForgeLocal). Cloud mode is opt-in via the first-run wizard or Settings.

PatentForge is a research tool, not a legal service. The author of this tool is not a lawyer. The AI systems that execute these prompts are not lawyers. No attorney-client relationship is created by using this tool. The output is intended for informational and educational purposes only -- to help you prepare for a consultation with a registered patent attorney, not to replace one.

AI-generated analysis may contain errors, omissions, or hallucinated references. Always consult qualified legal counsel before making patent filing decisions.

Legal guardrails built in: first-run clickwrap agreement, embedded disclaimers on every stage output, persistent disclaimer watermarks on every generated report, and (in Cloud mode) a cost-confirm modal before every Anthropic API call. See LEGAL_NOTICE.md for full details.

Architecture

Six-service federated architecture managed by a Go system tray app. LLM calls route through a per-service LLMClient boundary that dispatches to Ollama (Local mode) or Anthropic via LiteLLM (Cloud mode). The Full installer bundles Node SEA binaries, portable Python, and Ollama; the Lean installer skips Ollama for a smaller, cloud-only artifact.
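The idea of a client boundary that dispatches on the configured provider can be sketched as follows. This is not PatentForge's `LLMClient`: the `complete` method, the two stand-in classes, and the `"CLOUD"` provider value are assumptions, and the stand-ins return tagged strings instead of calling Ollama or LiteLLM so the example runs without either installed.

```python
from typing import Protocol


class LLMClient(Protocol):
    """Minimal interface each service codes against (hypothetical shape)."""

    def complete(self, prompt: str) -> str: ...


class OllamaClient:
    """Stand-in for Local mode; a real client would call the local Ollama server."""

    def complete(self, prompt: str) -> str:
        return f"[ollama] {prompt}"


class AnthropicClient:
    """Stand-in for Cloud mode; a real client would call Anthropic through LiteLLM."""

    def complete(self, prompt: str) -> str:
        return f"[anthropic] {prompt}"


def make_client(provider: str) -> LLMClient:
    """Pick the backend from the provider setting; reject unknown values."""
    clients = {"LOCAL": OllamaClient, "CLOUD": AnthropicClient}
    if provider not in clients:
        raise ValueError(f"unknown provider: {provider!r}")
    return clients[provider]()
```

Because callers depend only on the `LLMClient` interface, switching modes in Settings would change which backend `make_client` returns without touching pipeline code.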

React Frontend

:3000

Invention intake, streaming output, stage progress, report viewer, claims editor, Provider chooser + conditional reveals in Settings, CostConfirmModal for Cloud-mode runs.

React 18 • TypeScript • Vite • Tailwind CSS

NestJS Backend

:3001

Project CRUD, Settings (provider + installEdition + encrypted API keys), feasibility run tracking, claim draft orchestration, SSE event forwarding, prior art search via USPTO, Word export.

NestJS • Prisma ORM • SQLite

Feasibility Service

:3002

6-stage sequential AI analysis pipeline. Each stage uses a markdown prompt template through the LLMClient boundary. Streams tokens in real time via SSE.
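A sequential pipeline that streams over SSE can be sketched as a generator that emits one framed event per token. The stage names come from the pipeline list on this page; the event names, the payload fields, the `generate` callback, and the choice to append each stage's output to the context for later stages are all assumptions made for illustration.

```python
import json
from typing import Callable, Iterable, Iterator

STAGES = [
    "Technical Intake & Restatement",
    "Prior Art Research",
    "Patentability Assessment",
    "Deep Dive Analysis",
    "IP Landscape Assessment",
    "Consolidated Report",
]


def sse_event(event: str, data: dict) -> str:
    """Frame one Server-Sent Event: an event line, a data line, then a blank line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def run_pipeline(invention: str,
                 generate: Callable[[str, str], Iterable[str]]) -> Iterator[str]:
    """Run the stages in order, yielding SSE frames as tokens arrive.

    `generate(stage_name, context)` is a hypothetical hook that yields tokens
    from whichever LLM backend is active.
    """
    context = invention
    for index, stage in enumerate(STAGES, start=1):
        yield sse_event("stage_start", {"stage": index, "name": stage})
        output: list[str] = []
        for token in generate(stage, context):
            output.append(token)
            yield sse_event("token", {"stage": index, "text": token})
        # Assumption: later stages see the output of earlier ones.
        context += "\n\n" + "".join(output)
        yield sse_event("stage_complete", {"stage": index})
```

A web handler would write each yielded string to a response whose content type is `text/event-stream`, which is what lets the browser show tokens and stage progress as they arrive.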

Express • LLMClient (Ollama or LiteLLM/Anthropic) • 6 prompts (CC BY-SA 4.0)

Claim Drafter

:3003

3-agent claim drafting pipeline: Planner (strategy), Writer (claims), Examiner (review). LangGraph state machine routed through LLMClient.

Python • FastAPI • LangGraph • LiteLLM

Application Generator

:3004

5-agent sequential LangGraph pipeline generates complete USPTO-formatted patent applications. Word export follows 37 CFR 1.52.

Python • FastAPI • LangGraph • LiteLLM • python-docx

Compliance Checker

:3005

4 specialized checker agents validate claims against 35 USC 112(a), 35 USC 112(b), MPEP 608, and 35 USC 101. Per-check structured output through LLMClient.

Python • FastAPI • LangGraph • LiteLLM

Ollama (Local mode)

Gemma 4 inference (Full edition; runs only when Provider=LOCAL)

Anthropic (Cloud mode)

Claude 4.x via LiteLLM with cost-confirm gating

USPTO PatentSearch

Patent search + claims retrieval (both modes)

Ollama Cloud Web Search

Optional research augmentation (both modes)

Built With

Ollama -- local LLM runtime
Google Gemma 4 -- open-weight language model (e4b dense / 26B MoE)
Anthropic Claude -- frontier cloud LLM (Haiku 4.5 / Sonnet 4.6 / Opus 4.7)
LiteLLM -- unified provider abstraction (Ollama + Anthropic via one client)
context-mode -- context window compression