Argus
An M&A drafting and review assistant. A track-changes Hub for generating, revising, and redlining contracts — plus a NotebookLM-style chat grounded in 11,266 M&A corpus chunks.
Live demo
Drafting & Review Hub
Generate a contract from a prompt, revise your draft, or review an existing agreement — with AI-powered issue spotting, missing-clause proposals, and per-change accept/reject controls.
The problem
M&A diligence is repetitive, high-stakes, and document-heavy. Junior associates spend hours issue-spotting NDAs and SPAs against playbooks that rarely change. Argus asks: what if a retrieval pipeline could do the first pass — flagging weak clauses, proposing missing provisions, and citing its sources — so lawyers spend their time on judgment, not pattern-matching?
What I built
Argus has two surfaces. The Drafting & Review Hub is the main product: a two-pane editor where you can generate a contract from a prompt, revise an uploaded draft, or review an existing agreement — with per-clause accept/reject/edit controls, anchored chat, and four downloadable artifacts (redline.docx, clean.docx, memo.docx, register.json). The Ask the Corpus surface is a NotebookLM-style chat grounded in 11,266 corpus chunks indexed in Neon pgvector: a curated 22-category M&A playbook plus ~7,067 spans from CUAD (the Contract Understanding Atticus Dataset, Hendrycks et al., NeurIPS 2021) and ~4,177 spans from MAUD (the Merger Agreement Understanding Dataset, Wang et al., NLLP @ EMNLP 2023).
The anonymizer runs first on every input across both surfaces. A Gemini
2.5 Flash-Lite call replaces named entities with stable placeholders
(PARTY_A, ORG_001, etc.), maintained in a
per-session two-way map that lives in process memory only (24h TTL,
never persisted to disk or any third-party store). Pseudonymized text
is what reaches Cohere, Vertex, and Supermemory; the user sees their
original entity names restored via rehydration on the way back. A
regex PII firewall re-screens content before each Supermemory write
as a second layer of defense.
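The pseudonymize/rehydrate round trip can be sketched in plain Python. This is a minimal illustration, not the Argus code: the class name `PseudonymMap`, its methods, and the `KIND_NNN` numbering scheme are hypothetical, and the entity list is passed in by hand where Argus would get it from the Gemini 2.5 Flash-Lite call.

```python
import re


class PseudonymMap:
    """Hypothetical two-way map between real names and stable placeholders."""

    # Matches tokens shaped like ORG_001 or PARTY_A.
    _TOKEN = re.compile(r"\b[A-Z]+_[A-Z0-9]+\b")

    def __init__(self):
        self._to_placeholder = {}  # real name -> placeholder
        self._to_original = {}     # placeholder -> real name
        self._counters = {}        # kind -> last number issued

    def placeholder_for(self, entity, kind="ORG"):
        # Reuse the existing placeholder so the mapping stays stable
        # for the whole session.
        if entity not in self._to_placeholder:
            number = self._counters.get(kind, 0) + 1
            self._counters[kind] = number
            token = f"{kind}_{number:03d}"
            self._to_placeholder[entity] = token
            self._to_original[token] = entity
        return self._to_placeholder[entity]

    def pseudonymize(self, text, entities):
        # entities: iterable of (name, kind) pairs. Longest names go first so
        # "Acme Holdings" is not partially replaced by a shorter "Acme".
        for name, kind in sorted(entities, key=lambda e: len(e[0]), reverse=True):
            token = self.placeholder_for(name, kind)
            text = re.sub(re.escape(name), lambda _m: token, text)
        return text

    def rehydrate(self, text):
        # Unknown tokens are left untouched rather than guessed at.
        return self._TOKEN.sub(
            lambda m: self._to_original.get(m.group(0), m.group(0)), text
        )
```

Only the output of `pseudonymize` would be sent to a third-party service; `rehydrate` runs on the response before it is shown to the user.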
Pipeline
Stack
Generation
- Vertex Gemini 2.5 Flash
- Gemini 2.5 Flash-Lite (anonymizer)
Retrieval
- Cohere Embed v4 (1024-dim)
- Cohere Rerank 3.5
- Neon pgvector (HNSW)
Memory
- Supermemory
- kind=chat_exchange / context / review_summary
- Session-scoped, anonymized writes only
Backend
- Flask + gunicorn
- Cloud Run (min-instances=1)
- Neon Postgres
Export
- python-docx
- docx-revisions (native <w:ins>/<w:del>)
Infra
- Google Cloud Platform
- Cloudflare DNS + Pages
- Secret Manager
What I learned
The hardest part was the anonymizer-first data flow. Every piece of text the user submits has to be pseudonymized before it reaches Cohere, the Gemini generation model, or Supermemory, yet the two-way pseudonym map must stay in process memory only. Persisting it anywhere would recreate the PII problem the anonymizer exists to solve. The map lives behind a per-session lock with a 24h in-memory TTL and is dropped when the worker recycles.
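One way to hold such a map is an in-process store that hands out a lock and a dict per session and evicts entries once their TTL has passed. The sketch below is an assumption about the shape, not the Argus implementation: `SessionMapStore`, the fixed (non-sliding) expiry, and the injectable `clock` parameter are all hypothetical.

```python
import threading
import time

SESSION_TTL_SECONDS = 24 * 60 * 60  # 24h, as described above


class SessionMapStore:
    """Hypothetical in-memory store: nothing here is written to disk."""

    def __init__(self, ttl=SESSION_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock              # injectable so expiry can be tested
        self._guard = threading.Lock()   # protects the session table itself
        self._sessions = {}              # session_id -> (expires_at, lock, mapping)

    def get(self, session_id):
        """Return (lock, mapping) for a session, creating it if needed."""
        now = self._clock()
        with self._guard:
            self._evict_expired(now)
            entry = self._sessions.get(session_id)
            if entry is None:
                entry = (now + self._ttl, threading.Lock(), {})
                self._sessions[session_id] = entry
            return entry[1], entry[2]

    def _evict_expired(self, now):
        # Caller must hold self._guard.
        expired = [sid for sid, (exp, _, _) in self._sessions.items() if exp <= now]
        for sid in expired:
            del self._sessions[sid]
```

Callers take the per-session lock before reading or writing the mapping, so two requests in the same session cannot issue conflicting placeholders. Because the table is an ordinary Python object, it vanishes with the process.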
Cohere Rerank 3.5 is genuinely better than vanilla vector search for legal text. The reranker understands that "indemnification cap" and "liability ceiling" are the same concept in a way that cosine distance often doesn't. I spent the most iteration time on the Hub side, where retrieval quality determines whether the spotter flags real issues or generic ones.
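The retrieve-then-rerank pattern behind that observation can be sketched without calling any service. Everything here is illustrative: `retrieve_then_rerank`, `recall_k`, and `top_n` are hypothetical names, the first stage is a brute-force cosine scan standing in for a pgvector HNSW query, and `rerank_score` is a plain callable standing in for a reranker such as Cohere Rerank 3.5.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_then_rerank(query, query_vec, chunks, rerank_score,
                         recall_k=50, top_n=5):
    """chunks is a list of (text, vector) pairs.

    Stage 1 keeps the recall_k nearest chunks by cosine similarity.
    Stage 2 reorders only those candidates with rerank_score(query, text),
    which is assumed to return a higher number for a better match.
    """
    candidates = sorted(
        chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True
    )[:recall_k]
    reranked = sorted(
        candidates, key=lambda c: rerank_score(query, c[0]), reverse=True
    )
    return [text for text, _ in reranked[:top_n]]
```

The design point is that the vector stage only has to be good enough to get the right chunk into the candidate set; the final ordering is decided by the second-stage scorer.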
Validation matters more than I expected. v1.1.1 caught four silent failures
in the Supermemory write path that all happened to mask one another:
the standalone chat surface was missing the write code entirely; the PII
heuristic substring-matched "ein" inside common English words; the
Supermemory SDK changed its API surface (client.memories.add →
client.add); and the test prompt I used didn't reference enough
entity names to exercise rehydration. Each one looked like a deploy
problem until logs proved otherwise.
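The "ein" failure is easy to reproduce. The sketch below contrasts a naive substring check with a word-boundary version; the function names and both patterns are hypothetical, and the actual fix in Argus may differ. The number pattern assumes the usual two-digit, hyphen, seven-digit EIN layout.

```python
import re


def naive_flags_ein(text):
    # Substring match: also fires on "herein", "being", "protein", ...
    return "ein" in text.lower()


# Match "EIN" only as a standalone word, or a number shaped like 12-3456789.
EIN_KEYWORD = re.compile(r"\bEIN\b", re.IGNORECASE)
EIN_NUMBER = re.compile(r"\b\d{2}-\d{7}\b")


def looks_like_ein(text):
    return bool(EIN_KEYWORD.search(text) or EIN_NUMBER.search(text))
```

With the naive check, ordinary contract language such as "as defined herein" is treated as PII and the write is blocked, which from the outside looks identical to a write that never happened.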