phishprobe

Phishing detection — URLs, domains, emails, headers

v1.3.2

Linux

Quick Start

Install via jcli (recommended)

jcli install phishprobe

If you don't have jcli yet, install it first with curl -fsSL https://cli.johlem.net/tools/jcli/install.sh | bash.

Score a URL

phishprobe --url https://paypal-secure-login.example.com
phishprobe --url https://example.com --output json     # pipe-clean JSON for SIEM
phishprobe --url 'http://203.0.113.45/login' -v        # verbose: per-rule contribution

What it does

phishprobe is a rule-based phishing scorer. Feed it a URL (domain, email address, and raw RFC 2822 header inputs are planned) and it returns a numeric PhishScore from 0 to 100 plus a RiskLevel (Clean / Low / Medium / High / Critical). It is pure Rust with no Python dependency, and it makes no network requests unless a future analyzer demands it. It also absorbs the retired openclaw tool as phishprobe takedown, so the score → draft → approve → send pipeline now lives in one binary.

URL analyzer (no network). Heuristics that don't require an upstream fetch: IP-as-host, percent-encoded characters in path, free TLDs (.tk/.ml/.ga/...), brand lookalikes (Levenshtein distance against an embedded top-200 brand list), userinfo in URL (http://victim.com@attacker.example.com, where the real host is the part after the @), suspicious keywords (login/secure/verify/account).
Embedded Levenshtein. Pure-Rust implementation in src/scoring/levenshtein.rs — no external crate. Brand list ships as data/brands.txt via include_str!.
Pipe-clean JSON. --output json emits a stable schema with the score, level, matched rules, and contributing factors — suitable for SIEM / Splunk / Elastic ingest.
Severity-gated exit codes. Exit 0 on Clean/Low, 1 on Medium+, for CI-style gates around link-extracted artefacts.
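
As an illustration of the brand-lookalike heuristic, here is a minimal sketch of a dependency-free Levenshtein distance and a lookalike check. It is not the code in src/scoring/levenshtein.rs. The `lookalike` helper, the `max_dist` threshold, and the sample brand slice are assumptions made for this example.

```rust
/// Two-row dynamic-programming edit distance over Unicode scalar values.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] holds the distance between the processed prefix of `a` and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr = vec![0usize; b.len() + 1];
    for (i, ca) in a.iter().enumerate() {
        curr[0] = i + 1;
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            curr[j + 1] = (prev[j] + cost) // substitution (or match)
                .min(prev[j + 1] + 1)      // deletion
                .min(curr[j] + 1);         // insertion
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

/// Hypothetical helper: return the brand that `label` imitates, if any.
/// A label counts as a lookalike when it is within `max_dist` edits of a
/// brand without being identical to it.
fn lookalike<'a>(label: &str, brands: &[&'a str], max_dist: usize) -> Option<&'a str> {
    let label = label.to_ascii_lowercase();
    brands.iter().copied().find(|brand| {
        let d = levenshtein(&label, brand);
        d > 0 && d <= max_dist
    })
}
```

With a threshold of 1, `lookalike("paypa1", &["google", "paypal"], 1)` flags `paypal`, while the exact label `paypal` is not flagged. A real analyzer would likely scale the threshold with brand length to limit false positives on short names.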

Analyzers

Analyzer	Status	What it does
`--url <URL>`	working	Score a single URL via the pure-Rust URL analyzer
`--domain <DOMAIN>`	stub	RDAP age, registrar, parking detection — needs RDAP client wiring
`--email <ADDR>`	stub	Syntax + disposable-domain check
`--headers-file <PATH>`	stub	RFC 2822 header parsing — Received-chain analysis, SPF/DKIM/DMARC result lookup
`-f, --file <PATH>`	stub	Bulk mode — one input per line, auto-detect type
`--output ioc`	stub	Export IOCs (URLs, domains, IPs) in a SIEM-ingestable format
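
To make the no-network URL heuristics concrete, the following is a rough sketch of how a few of them (userinfo in URL, IP-as-host, free TLD, suspicious keywords) could be checked with the standard library alone. The rule names, the hand-rolled authority parsing, and the function names are assumptions for illustration. phishprobe's actual rule set and parser may differ.

```rust
use std::net::IpAddr;

// Values taken from the heuristics listed in this README; the real lists are longer.
const FREE_TLDS: [&str; 3] = [".tk", ".ml", ".ga"];
const KEYWORDS: [&str; 4] = ["login", "secure", "verify", "account"];

/// Split the authority component into (userinfo, host), dropping any port.
/// Minimal parsing for illustration only: bracketed IPv6 literals and other
/// RFC 3986 corner cases are not handled.
fn split_authority(url: &str) -> Option<(Option<&str>, &str)> {
    let rest = url.split_once("://")?.1;
    let authority = rest
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()?;
    // Everything before the last '@' is userinfo; the real host comes after it.
    let (userinfo, hostport) = match authority.rsplit_once('@') {
        Some((user, host)) => (Some(user), host),
        None => (None, authority),
    };
    let host = hostport.split(':').next()?;
    if host.is_empty() {
        None
    } else {
        Some((userinfo, host))
    }
}

/// Return the (hypothetical) names of the rules that fire for `url`.
fn fired_rules(url: &str) -> Vec<&'static str> {
    let mut rules = Vec::new();
    let (userinfo, host) = match split_authority(url) {
        Some(parts) => parts,
        None => return rules,
    };
    if userinfo.is_some() {
        rules.push("userinfo-in-url");
    }
    if host.parse::<IpAddr>().is_ok() {
        rules.push("ip-as-host");
    }
    let host = host.to_ascii_lowercase();
    if FREE_TLDS.iter().any(|tld| host.ends_with(tld)) {
        rules.push("free-tld");
    }
    let lower = url.to_ascii_lowercase();
    if KEYWORDS.iter().any(|kw| lower.contains(kw)) {
        rules.push("suspicious-keyword");
    }
    rules
}
```

In this sketch, `fired_rules("http://203.0.113.45/login")` reports the IP-as-host and keyword rules, and `fired_rules("http://victim.com@attacker.example.tk/")` reports the userinfo and free-TLD rules.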

Subcommand: `phishprobe takedown`

The phishing-takedown pipeline (absorbed from openclaw). File-backed queue of abuse-mail drafts with a small state machine: pending → approved → sent (with failed as a recoverable side state). Storage: $PHISHPROBE_HOME/takedown/queue/<id>.json (chmod 600). Operators migrating from openclaw 1.x can keep $OPENCLAW_HOME set — it is honoured as a fallback.
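
The queue's state machine can be pictured with a small sketch like the one below. The type and event names are hypothetical. The rule that a failed item returns to approved on retry is an assumption about what "recoverable" means here, not documented behaviour.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    Pending,
    Approved,
    Sent,
    Failed,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Event {
    Approve,
    SendOk,
    SendErr,
    Retry,
}

/// Apply one event to a queue item's state, rejecting anything that is not
/// on the pending -> approved -> sent path or the failed side state.
fn transition(state: State, event: Event) -> Result<State, String> {
    use Event::*;
    use State::*;
    match (state, event) {
        (Pending, Approve) => Ok(Approved),
        (Approved, SendOk) => Ok(Sent),
        (Approved, SendErr) => Ok(Failed),
        // Assumption: a failed item is recovered by re-queuing it as approved.
        (Failed, Retry) => Ok(Approved),
        (s, e) => Err(format!("illegal transition: {:?} on {:?}", s, e)),
    }
}
```

Keeping the transitions in one `match` means an unapproved draft can never reach `Sent`: `transition(State::Pending, Event::SendOk)` returns an error.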

Command	What it does
`takedown draft --target <URL>`	Build an abuse-mail draft and write it to the queue in `pending` state. Uses a placeholder abuse contact (`abuse@<domain>`) until RDAP lookup lands.
`takedown approve --id <ID>`	Flip a pending item to `approved`.
`takedown queue [--id <ID>]`	List the queue, or show one item with `--id`.
`takedown send --id <ID>`	SMTP dispatch — stub. Will use `lettre` + the `[sender]` config block.
`takedown daemon`	Long-running queue watcher — stub.
`takedown detect --target <URL>`	Run urlrecon + phishprobe checks against a target — stub.

Every subcommand accepts --format terminal|json for pipe-clean output.

Scoring

The PhishScore is the sum of the weights of every rule that fired, on a 0–100 scale. It maps to a RiskLevel as follows:

Score	Level	Interpretation
0–9	Clean	No phishing indicators
10–24	Low	One mild heuristic (e.g. free TLD)
25–49	Medium	Multiple weak heuristics or one strong
50–79	High	e.g. brand lookalike plus suspicious path tokens
80–100	Critical	e.g. IP-as-host plus userinfo trick, or a full brand impersonation pattern
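
The mapping from score to level, and from level to exit code, can be sketched as follows. The band boundaries come from the table above and the exit codes from the severity-gated behaviour described earlier. The per-rule weights passed in and the clamp to 100 are assumptions, since this README does not state how sums above 100 are handled.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum RiskLevel {
    Clean,
    Low,
    Medium,
    High,
    Critical,
}

/// Sum the weights of the rules that fired.
/// Assumption: the total is clamped so it stays on the 0-100 scale.
fn phish_score(weights: &[u32]) -> u32 {
    weights.iter().sum::<u32>().min(100)
}

/// Map a score to its RiskLevel using the bands from the table.
fn risk_level(score: u32) -> RiskLevel {
    match score {
        0..=9 => RiskLevel::Clean,
        10..=24 => RiskLevel::Low,
        25..=49 => RiskLevel::Medium,
        50..=79 => RiskLevel::High,
        _ => RiskLevel::Critical,
    }
}

/// Exit 0 on Clean/Low and 1 on Medium or above.
fn exit_code(level: RiskLevel) -> i32 {
    if level >= RiskLevel::Medium {
        1
    } else {
        0
    }
}
```

For example, two rules with hypothetical weights 15 and 20 sum to 35, which falls in the Medium band and would produce exit code 1 in a CI gate.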

False positives are possible. phishprobe is intentionally aggressive on the URL analyzer side, so operators will see Medium/High on legitimate URLs that share surface features with phishing (e.g., a real login page that uses secure in its path). Treat the score as one signal in a triage pipeline, not a verdict.