Keep up with your AI-generated code.

The old code-review model is out. Agents generate too much code, too quickly, for diff-by-diff review to keep pace. ponens is the review model that keeps up.

ponens makes each agent session reviewable at a higher level — a curated, verifiable reasoning trace of what was built and why. You can visualize it, grade it, and govern it with Computable Governance — so your quality bar and your governance keep pace with the code.

pip install ponens

See a trace → · Policy gallery →

Already 197 computable policies across 11 packs from 9 organizations.

From session to reasoning record

ponens reads the agent's raw session and distills it into a higher-level reasoning record — curated steps, the lineage of what changed, the gaps it left open, and a grade.

Agent session · ~/payment-service

You: make capture idempotent

Read: payment.py

Edit: payment.py

You: yes

Bash: pytest · ✓ 24 passed

You: ok now handle refunds too

Edit: payment.py

Bash: git commit -m "fix"

A reasoning record · Open demo →

Flow · 4 steps

m1 · Understand the capture path · done

m2 · Prove no double-charge · proved

m3 · Implement & test · done

m4 · Commit · 1 action

Lineage

assumes a stable idempotency key · unverified

Grade B · lineage 100% · policies 4/4 ✓

Governance you can compute

Best-practice rules — research before editing, tests before commit, proofs on high-stakes paths — as machine-checked formulas over the trace, not a checklist a human eyeballs.

The problem

A reasoning trace in prose can't be enforced. "I ran the tests before committing" is a sentence, not a guarantee. The practices you actually care about have no teeth when they live in a checklist someone skims under deadline.

The solution

ponens gives the trace a typed schema — atomic actions, artifacts, a lineage DAG, a residual surface — then encodes each policy as a formula in LTLf (linear temporal logic over finite traces), extended with scoped-past (P_chain, P_target) and structural operators. The CLI compiles each formula and evaluates it over the trace: a deterministic PASS / FAIL, run offline in CI. Same foundation as DECLARE / declarative process mining and runtime verification — pointed at AI agents.
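To make the idea of a typed trace concrete, here is a minimal sketch in Python of what such a schema could look like. Every class and field name below (Action, Residual, Trace, parents, and so on) is hypothetical and chosen for illustration; the real schema is defined by the ponens trace spec, not by this snippet. The severity given to the residual is likewise invented for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class Action:
    """One atomic action in the session (hypothetical fields)."""
    id: str
    kind: str                        # e.g. "ReadFile", "EditFile", "RunTests", "GitCommit"
    target: Optional[str] = None     # the artifact (file) the action touched, if any
    completed: bool = True
    parents: Tuple[str, ...] = ()    # lineage edges: ids of the actions this one builds on


@dataclass(frozen=True)
class Residual:
    """A gap the session left open (hypothetical fields)."""
    description: str
    severity: str                    # e.g. "Low", "Critical"
    status: str                      # e.g. "Open", "Resolved"


@dataclass
class Trace:
    actions: List[Action] = field(default_factory=list)
    residuals: List[Residual] = field(default_factory=list)

    def ancestors(self, action_id: str) -> Set[str]:
        """Ids reachable by walking lineage edges backwards from action_id."""
        by_id = {a.id: a for a in self.actions}
        seen: Set[str] = set()
        stack = list(by_id[action_id].parents)
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(by_id[current].parents)
        return seen


# The payment-service session from the demo, written against this sketch.
trace = Trace(
    actions=[
        Action("a1", "ReadFile", target="payment.py"),
        Action("a2", "EditFile", target="payment.py", parents=("a1",)),
        Action("a3", "RunTests", parents=("a2",)),
        Action("a4", "EditFile", target="payment.py", parents=("a3",)),
        Action("a5", "GitCommit", parents=("a4",)),
    ],
    # Severity "Low" is an illustrative choice, not taken from the demo.
    residuals=[Residual("assumes a stable idempotency key", "Low", "Open")],
)
```

With the session in this shape, a policy stops being a sentence and becomes a function of the data: it can only look at action kinds, targets, lineage edges, and residuals, which is what makes a deterministic PASS / FAIL possible.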

policies.ltl · evaluated over the trace · PASS / FAIL

tests_before_commit · pure temporal

G(GitCommit → P(RunTests ∧ completed))

every commit is preceded by a test run that completed

research_before_edit · scoped past

G(EditFile → P_target(ReadFile ∨ ReadDocumentation ∨ SearchCode ∨ AnalyzeCode))

no edit without first reading or searching the relevant code

reasoning_required_for_high_stakes · scoped past

G(EditFile ∧ high_stakes_path → P_chain(VerificationResult(proved ∨ sat) ∨ Decomposition))

on a high-stakes path, an edit must be backed by a proof, a SAT result, or a decomposition

no_open_critical_residuals · structural

¬∃ r ∈ residuals . r.severity = Critical ∧ r.status = Open

the trace can't ship with an unresolved critical gap

The full operator reference is in the Policy Language spec, and 40+ ready policies are in the gallery.
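The spec is the authority on what the scoped-past operators mean. Purely as an illustration, the sketch below assumes that P_target means "earlier in the trace, on the same target" and that P_chain means "somewhere among this action's lineage ancestors". Both readings are assumptions, as are the dictionary keys ("id", "kind", "target", "result", "parents") and the is_high_stakes predicate.

```python
RESEARCH_KINDS = {"ReadFile", "ReadDocumentation", "SearchCode", "AnalyzeCode"}


def research_before_edit(actions):
    """G(EditFile -> P_target(research)), assuming P_target = same target, earlier."""
    researched = set()  # targets that have been read or searched so far
    for action in actions:
        if action["kind"] in RESEARCH_KINDS:
            researched.add(action.get("target"))
        elif action["kind"] == "EditFile" and action.get("target") not in researched:
            return False
    return True


def ancestors(actions, action_id):
    """Ids reachable by walking 'parents' edges backwards from action_id."""
    by_id = {a["id"]: a for a in actions}
    seen, stack = set(), list(by_id[action_id].get("parents", ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(by_id[current].get("parents", ()))
    return seen


def reasoning_required_for_high_stakes(actions, is_high_stakes):
    """G(EditFile and high_stakes_path -> P_chain(...)), assuming P_chain = lineage ancestors."""
    by_id = {a["id"]: a for a in actions}
    for action in actions:
        if action["kind"] != "EditFile" or not is_high_stakes(action.get("target", "")):
            continue
        backed = any(
            by_id[i]["kind"] == "Decomposition"
            or (
                by_id[i]["kind"] == "VerificationResult"
                and by_id[i].get("result") in {"proved", "sat"}
            )
            for i in ancestors(actions, action["id"])
        )
        if not backed:
            return False
    return True


# A hypothetical trace: read, prove, then edit a high-stakes file.
actions = [
    {"id": "a1", "kind": "ReadFile", "target": "payment.py"},
    {"id": "a2", "kind": "VerificationResult", "result": "proved", "parents": ["a1"]},
    {"id": "a3", "kind": "EditFile", "target": "payment.py", "parents": ["a2"]},
]


def is_payment_path(path):
    # Hypothetical high-stakes rule for the example.
    return path.startswith("payment")
```

The difference from the pure temporal case is the scope of the lookback: an unrelated read or an unrelated proof elsewhere in the session does not satisfy either check under these assumed semantics.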

How review works with ponens

Developer + AI agent

writes the code

Emit the record

The agent turns its session into a reasoning trace.

Grade & govern

The quality grade and Computable Governance run in CI.

Open the PR

The record posts on the pull request, beside the diff.

AI auditor (LLM)

audits & certifies

Re-check the claims

An LLM re-derives each claim from the evidence in the trace.

Certify what holds

Marks the reasoning it can independently confirm.

Flag the rest

Escalates only what it can't certify to a human.

Reviewer

checks the work

Read the flagged gaps

Just what the audit couldn't certify — a short list.

Verify what matters

Check the consequential claims; skip the noise.

Approve & merge

Reasoning bound 1:1 to the commit — auditable later.

Quickstart

pip install ponens                              # Install

ponens agent                                    # The whole workflow, in one command
ponens emit -o trace.json                       # Capture the session as a trace
ponens trace meta set trace.json m3 --title "…" # Curate the narrative
ponens trace grade trace.json                   # The quality rubric
ponens trace check trace.json                   # Computable Governance — best-practice policies
ponens trace view trace.json                    # Read the reasoning, zoomable
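For a scripted gate, the grade and check steps above can be wrapped in a few lines of Python. This is a sketch under one loud assumption: that each command signals FAIL through a non-zero exit status. The quickstart does not state that, so confirm it against the CLI documentation before relying on it. The helper names gate_commands and run_gate are hypothetical; only the two ponens invocations come from the quickstart.

```python
import subprocess


def gate_commands(trace_path):
    """The two quickstart commands that grade and check a trace."""
    return [
        ["ponens", "trace", "grade", trace_path],
        ["ponens", "trace", "check", trace_path],
    ]


def run_gate(trace_path, run=subprocess.run):
    """Run each command in order; stop at the first non-zero exit status.

    ASSUMPTION: a non-zero return code means FAIL. The `run` parameter
    exists so the gate can be exercised without ponens installed.
    """
    for command in gate_commands(trace_path):
        result = run(command)
        if result.returncode != 0:
            return False
    return True
```

In a CI job this would typically end with sys.exit(0 if run_gate("trace.json") else 1), so the job status mirrors the gate.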

Practical guides

Short, copy-pasteable how-tos for real tasks. · All guides →

Author

Capture & curate a trace

Turn an agent session into a clean, honest reasoning record.

Reviewer

Review an AI-generated PR

Start from the declared gaps; verify what matters; skip the noise.

Maintainer

Add ponens to CI

Grade, gaps, and a viewer on every PR — plus a PASS/FAIL gate.

Maintainer

Govern a repo with policies

Best-practice rules checked over every trace, offline.

Open source — built in the open

ponens is MIT-licensed, and the moving parts grow with the community: the policy gallery, the reasoner registry, the agent adapters, and the trace & policy spec. Add a best-practice policy, register an automated-reasoning tool, write an adapter for your agent, or help shape the spec.

Star on GitHub · Read CONTRIBUTING →