Skill
archive-is
Bypass paywalls and look up web archives via archive.today. Hero command: find or create an archive for any URL with lookup-before-submit, Wayback Machine fallback, and agent-hints on stderr when called non-interactively.
When to use archive-is
Choose if
You want an agent to retrieve archived article text, especially behind paywalls, with one command that looks up existing snapshots first, submits a fresh capture only if needed, and falls back to the Wayback Machine when archive.today rate-limits the request or serves a CAPTCHA. The markdown output pipes cleanly into Claude/GPT prompts.
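The lookup-before-submit flow described above can be sketched roughly as follows. This is an illustration of the control flow only, not the CLI's actual implementation: `find_or_create`, `Blocked`, and the three callables (`lookup`, `submit`, `wayback`) are hypothetical names standing in for whatever the tool does internally.

```python
from typing import Callable, Optional


class Blocked(Exception):
    """Hypothetical signal: the primary archive rate-limited us or served a CAPTCHA."""


def find_or_create(
    url: str,
    lookup: Callable[[str], Optional[str]],
    submit: Callable[[str], str],
    wayback: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return a snapshot URL, preferring an existing capture over a new one."""
    try:
        existing = lookup(url)   # 1. look up an existing snapshot first
        if existing is not None:
            return existing
        return submit(url)       # 2. submit a fresh capture only if none exists
    except Blocked:
        return wayback(url)      # 3. fall back to the Wayback Machine when blocked
```

Passing the three steps in as callables keeps the sketch free of any assumed HTTP endpoints; in practice each would wrap a real network request.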
Avoid if
Your workflow needs a trustworthy archival source for legal evidence, academic citation, or anything else where snapshot integrity matters. The README explicitly disclaims this use case (Wikipedia blacklisted archive.today in Feb 2026 after evidence of snapshot tampering). Use the Wayback Machine directly or perma.cc instead.
Risk Flags
- HIGH scope README explicitly warns: "On February 21, 2026, Wikipedia formally blacklisted archive.today after evidence of DDoS activity and snapshot tampering. This CLI is intended for personal paywall reading. Do NOT use it for legal evidence, academic citation, or anything requiring a trustworthy archive." Agents must not cite archive.today snapshots as authoritative sources.
- MEDIUM rate_limit archive.today frequently serves a CAPTCHA for direct body fetches, in which case the CLI falls back to the Wayback Machine. It applies exponential backoff on HTTP 429 responses and, by default, leaves a 10-second gap between bulk requests (configurable via --delay).
- LOW scope Read-and-archive only. No way to modify or delete existing snapshots.
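To make the rate-limit behaviour above concrete, here is a minimal sketch of exponential backoff on 429 combined with a fixed gap between bulk requests. Only the 10-second default gap and the use of exponential backoff come from the listing; the base delay, growth factor, retry count, and all function names are assumptions chosen for illustration.

```python
import time
from typing import Callable, List, Sequence, Tuple

Response = Tuple[int, str]  # (HTTP status, body); a hypothetical shape


def backoff_delays(base: float = 1.0, factor: float = 2.0, retries: int = 4) -> List[float]:
    """Waits that grow geometrically. The defaults are assumed, not the CLI's."""
    return [base * factor ** i for i in range(retries)]


def fetch_with_backoff(
    fetch: Callable[[str], Response],
    url: str,
    delays: Sequence[float],
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Retry while the server answers 429, sleeping longer before each retry."""
    status, body = fetch(url)
    for wait in delays:
        if status != 429:
            break
        sleep(wait)
        status, body = fetch(url)
    return status, body


def fetch_all(
    fetch: Callable[[str], Response],
    urls: Sequence[str],
    gap: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Response]:
    """Fetch URLs in order, pausing `gap` seconds between consecutive requests."""
    results = []
    for i, url in enumerate(urls):
        if i > 0:
            sleep(gap)  # mirrors the default 10-second gap between bulk requests
        results.append(fetch_with_backoff(fetch, url, backoff_delays(), sleep))
    return results
```

The `sleep` parameter is injectable so the pacing can be checked without actually waiting.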
Cost
Type: Free
Install
Default
npx -y @mvanhorn/printing-press install archive-is
Estimated time to first success: ~5 min
Dependencies
Minimum runtime: Node.js 18+ (or Go 1.26.3+ for source install)
Distribution
- Repository: https://github.com/mvanhorn/printing-press-library
- License: MIT