Skill

arxiv

Public Atom API for searching and fetching arXiv e-print metadata.

Verified: 2026-05-13 (printing-press-ingest-2026-05-13+enrich-capability-skill)

When to use arxiv

Choose if

You want an agent to search arXiv by query expression or category and fetch paper metadata (title, authors, abstract, categories, versions) without writing an Atom-API parser. Useful inside research-assistant, literature-review, or paper-summarization agents where a stable public API is preferred over web scraping.
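
For context, the Atom handling this skill spares an agent can be sketched in Python. This is a minimal illustration against arXiv's public query endpoint, not the CLI's own implementation. The helper names build_query_url and parse_feed are hypothetical, and only the standard library is assumed.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Namespace used by Atom feeds (RFC 4287).
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}
# arXiv's public query endpoint.
API_URL = "http://export.arxiv.org/api/query"


def build_query_url(search_query, start=0, max_results=5):
    """Build a query URL, e.g. search_query='cat:cs.CL' for a category search."""
    params = {"search_query": search_query, "start": start, "max_results": max_results}
    return API_URL + "?" + urllib.parse.urlencode(params)


def _text(entry, tag):
    """Return the child element's text with whitespace collapsed."""
    raw = entry.findtext(tag, default="", namespaces=ATOM_NS)
    return " ".join(raw.split())


def parse_feed(atom_xml):
    """Turn an Atom response body into a list of plain metadata dicts."""
    root = ET.fromstring(atom_xml)
    papers = []
    for entry in root.findall("atom:entry", ATOM_NS):
        papers.append({
            "id": _text(entry, "atom:id"),
            "title": _text(entry, "atom:title"),
            "abstract": _text(entry, "atom:summary"),
            "authors": [_text(a, "atom:name") for a in entry.findall("atom:author", ATOM_NS)],
            "categories": [c.get("term") for c in entry.findall("atom:category", ATOM_NS)],
        })
    return papers
```

Fetching the built URL with any HTTP client and passing the response body to parse_feed yields one dict per paper. The skill exists so that an agent does not have to maintain this kind of parser itself.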

Avoid if

You need the actual PDF body text: this CLI is metadata-only. Pair it with a PDF fetcher and text extractor (e.g., /solve/pdf-text-extraction-mcp/) to get the full paper content. Also avoid it if you need preprint sources beyond arXiv (bioRxiv, medRxiv, SSRN).

Risk Flags

  • LOW rate_limit arXiv's public Atom API has a soft rate limit (about 1 request every 3 seconds is recommended). The CLI surfaces HTTP 429 responses as exit code 7, but the README discloses no concrete thresholds.
  • LOW scope Read-only: metadata and abstract only. Full PDF download is out of scope; agents need a separate fetcher.
  • LOW data_quality Live-first; depends on the upstream arXiv Atom API being reachable. No local mirror.
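
One way an agent might respect the roughly 3-second spacing noted above is a small minimum-interval throttle called before each request. This is a sketch under assumptions: the class name MinIntervalThrottle is hypothetical, the 3.0 second default simply mirrors the recommendation in the rate_limit flag, and how the caller reacts to exit code 7 (for example, waiting longer before a retry) is left open.

```python
import time


class MinIntervalThrottle:
    """Ensure at least `min_interval` seconds pass between consecutive calls."""

    def __init__(self, min_interval=3.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the throttle can be tested without waiting
        self._sleep = sleep
        self._last = None        # time of the previous call, None before the first one

    def wait(self):
        """Block until the next request is allowed; return the seconds slept."""
        now = self._clock()
        waited = 0.0
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
                now = self._clock()
        self._last = now
        return waited
```

Calling wait() immediately before each invocation keeps a batch of lookups under the recommended pace without hard-coding sleeps throughout the agent.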

Cost

Type: Free

Install

Default

npx -y @mvanhorn/printing-press install arxiv

Estimated time to first success: ~3 min

Dependencies

Minimum runtime: Node.js 18+ (or pre-built binaries for macOS/Linux)

Distribution

Repository
https://github.com/mvanhorn/printing-press-library
License
MIT