Find forgotten subscriptions on your Brazilian credit card: what to install, with the recipe
Answer
You’re a Brazilian user (or building for Brazilian users) and you want an AI agent — Claude, ChatGPT, Cursor, anything that speaks MCP — to look through your credit-card and checking-account transactions and tell you which recurring charges are still active, which are duplicates, which are forgotten free trials that converted to paid, and which can be cancelled without losing anything you actually use.
There’s exactly one production-ready way to do this today, and it landed on 2026-05-07: install Cumbuca’s Open Finance Data MCP, authorize one bank through Open Finance (CPF + biometric/PIN at your own bank — no Cumbuca account, no shared credentials), and run the deterministic recurring-charge detection recipe below. Cumbuca is a Bacen-licensed Payment Institution (ITP) acting as a regulated proxy; the MCP is HTTP-transport, free during MVP, ~5 queries/day per user.
claude mcp add --transport http cumbuca https://mcp.cumbuca.com/mcp
For Claude Desktop: Settings → Connectors → Add custom connector → paste https://mcp.cumbuca.com/mcp. For ChatGPT (Developer Mode): Settings → Connectors → Create → paste the same URL. Once connected, the first tool call triggers the Open Finance redirect through your bank for consent — biometric or banking PIN — and the agent gets read access to your statements + credit-card transactions.
The MCP gives you the data. The audit logic — clustering charges by merchant + cadence + amount stability — runs agent-side. That logic is the editorial value of this page: a deterministic recipe (no ML, no merchant dictionary, no cross-user signal) that operates on Bacen-spec normalized Open Finance transaction shape, so it works on any compliant data source today and any future one.
The recipe — deterministic recurring-charge detection
This is the agent prompt plus the algorithm, written as a data-source-agnostic technique: it operates on the Bacen Open Finance normalized transaction shape (merchant_name, counterparty_cpf_cnpj, mcc, transaction_type, booking_date, amount, currency), not on any specific MCP’s response envelope. If a second BR Open Finance MCP ships, swap the data source; the algorithm doesn’t change.
Step 1 — Merchant normalization
Open Finance returns merchant strings the way the bank’s mainframe stored them: uppercase, trailing whitespace padding, and prefixed with the transaction type. Normalize before clustering.
def normalize_merchant(name: str | None) -> str:
    if not name:
        return ""
    n = " ".join(name.upper().split())  # uppercase + collapse whitespace
    for prefix in ("PIX ENVIADO ", "PIX AGENDADO ", "TED ENVIADO ",
                   "DEBITO AUT. ", "DEBITO AUTOMATICO ", "COMPRA "):
        if n.startswith(prefix):
            n = n[len(prefix):]
            break  # strip at most one leading transaction-type prefix
    return n.strip()
That’s it — no dictionary lookup, no terminal-ID stripping, no fuzzy matching. The lift from raw "PIX ENVIADO SPOTIFY BRASIL " to "SPOTIFY BRASIL" is enough to cluster correctly in 90%+ of cases. Don’t over-engineer step 1.
Step 2 — Group key
Bucket transactions so each bucket represents one candidate recurring charge:
- Account transactions (checking, debit): (merchant_normalized, counterparty_cpf_cnpj)
- Credit-card transactions: (merchant_normalized, mcc)
CPF/CNPJ is the strongest signal when the bank returns it (PIX and TED almost always do; legacy débitos sometimes don’t). MCC supplements when CPF/CNPJ is missing on the credit-card side.
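The bucketing rule above could be sketched roughly as follows. The field names mirror the normalized shape listed earlier; representing each transaction as a dict, the is_card flag, and the merchant_normalized key (assumed to hold the output of step 1) are illustrative assumptions, not part of any MCP's response format.

```python
from collections import defaultdict


def group_key(tx: dict, is_card: bool) -> tuple:
    """Return the bucket key for one transaction (step 2).

    Assumes tx["merchant_normalized"] was already filled in by step 1.
    """
    if is_card:
        return (tx["merchant_normalized"], tx.get("mcc"))
    return (tx["merchant_normalized"], tx.get("counterparty_cpf_cnpj"))


def bucket(transactions: list[dict], is_card: bool) -> dict[tuple, list[dict]]:
    """Group transactions so each bucket is one candidate recurring charge."""
    buckets: dict[tuple, list[dict]] = defaultdict(list)
    for tx in transactions:
        buckets[group_key(tx, is_card)].append(tx)
    return dict(buckets)
```

A missing mcc or counterparty_cpf_cnpj becomes None in the key, so such rows still cluster by merchant name alone.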
Step 3 — Skip the obvious noise
Drop rows where transaction_type == "TARIFA". Bank fees aren’t subscriptions, and Brazilian banks generate enough of them (manutenção de conta, IOF, anuidade de cartão) to dominate a naïve clustering pass. If your data source uses a different label for fees, adjust accordingly — the principle is “exclude bank-imposed charges from subscription candidates.”
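As a minimal sketch of this filter, assuming transactions are dicts carrying a transaction_type field (the FEE_TYPES set is a hypothetical name; extend it if your source labels fees differently):

```python
# Assumption: fee rows carry transaction_type == "TARIFA". Add other labels
# here if your data source marks bank-imposed charges differently.
FEE_TYPES = {"TARIFA"}


def drop_bank_fees(transactions: list[dict]) -> list[dict]:
    """Exclude bank-imposed charges from subscription candidates (step 3)."""
    return [tx for tx in transactions if tx.get("transaction_type") not in FEE_TYPES]
```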
Step 4 — Minimum observations
Only consider buckets with ≥3 transactions. With fewer, you don’t have enough signal to distinguish a recurring charge from coincidence. For buckets that end up classified as irregular cadence (step 5), require ≥4 to dampen noise — irregular cadence with only three points is almost always a false positive.
Step 5 — Cadence detection by median interval
For each bucket, sort transactions by booking_date, compute consecutive intervals in days, take the median. Classify:
| Median interval (days) | Cadence |
|---|---|
| 6–8 | weekly |
| 13–16 | biweekly |
| 28–32 | monthly (the dominant case) |
| 88–95 | quarterly |
| 175–190 | semiannual |
| 350–380 | annual |
| anything else | irregular |
Median (not mean) is critical — one missed billing cycle skews the mean badly.
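Steps 4 and 5 together might look like the sketch below. The ranges are copied from the table; the function names and the choice to return "irregular" when there are no intervals are assumptions for illustration.

```python
from datetime import date
from statistics import median

# (low, high, label) ranges in days, taken from the cadence table above.
CADENCE_RANGES = [
    (6, 8, "weekly"),
    (13, 16, "biweekly"),
    (28, 32, "monthly"),
    (88, 95, "quarterly"),
    (175, 190, "semiannual"),
    (350, 380, "annual"),
]


def classify_cadence(dates: list[date]) -> str:
    """Classify a bucket by the median gap between consecutive bookings."""
    ordered = sorted(dates)
    intervals = [(b - a).days for a, b in zip(ordered, ordered[1:])]
    if not intervals:
        return "irregular"
    m = median(intervals)
    for low, high, label in CADENCE_RANGES:
        if low <= m <= high:
            return label
    return "irregular"


def enough_observations(n: int, cadence: str) -> bool:
    """Step 4: at least 3 observations, or 4 when the cadence is irregular."""
    return n >= (4 if cadence == "irregular" else 3)
```

For a bucket booked on the 12th of January, February, March, May, and June (April missed), the intervals are 31, 28, 61, and 31 days: the median is 31 (monthly), while the mean is 37.75 and would fall outside every range.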
Step 6 — Amount stability via coefficient of variation
Subscriptions charge a stable amount; ad-hoc spending at the same merchant doesn’t. Use coefficient of variation CV = stdev(amount) / median(amount):
| CV threshold | Observations | Confidence |
|---|---|---|
| CV ≤ 0.05 | ≥6 | high |
| CV ≤ 0.20 | ≥3 | medium |
| CV ≤ 0.50 | ≥3 | low |
| CV > 0.50 | any | reject (too noisy, not a subscription) |
The reject threshold is the workhorse. A merchant with CV > 0.5 is variable spending (a restaurant, a marketplace) showing up multiple times by coincidence — clustering only on cadence without amount stability gets you false positives like “you have a weekly subscription to iFood for R$48.31.”
Step 7 — Status classification by recency
For each kept bucket, compute days since last_seen (today minus most-recent observation):
| Days since last_seen | Status |
|---|---|
| ≤ 60 | active |
| 61–180 | paused (could be a service that pauses, or one whose last cycle is missing from your window) |
| > 180 | cancelled |
The 60-day boundary is the right call for monthly cadence (one missed cycle still counts as active; two missed cycles flips to paused). For quarterly or annual cadence, a savvy implementation extends the active window proportionally — but the simple rule above is correct in 95% of personal-finance cases and ships.
Step 8 — Output schema
Per detected subscription, emit:
{
  "name": "SPOTIFY BRASIL",
  "cadence": "monthly",
  "amount_brl": 21.90,
  "annual_cost_brl": 262.80,
  "status": "active",
  "confidence": "high",
  "first_seen": "2025-09-12",
  "last_seen": "2026-04-12",
  "evidence_count": 8,
  "counterparty_cpf_cnpj": "12345678000195",
  "mcc": 4899
}
The agent then sorts by annual_cost_brl descending, segments by status (active first, then paused, then cancelled-recently as cleanup candidates), and presents the list. If the user asks “cancel Spotify,” the agent has the merchant identity and the recurring charge proof; cancellation is still a separate flow at the merchant or via the user’s bank, but the audit is the discovery step.
Bank-side gotcha (general Open Finance caveat)
Some Brazilian banks return ~60-day windows on long-lookback transaction queries without paginating the rest. You ask for 365 days, you get back 60 days of data with no totalPages or nextPageUrl to follow, and the next 305 days silently disappear from your audit. This is a bank-side quirk, not specific to any Open Finance MCP — it shows up wherever the bank’s API is queried directly.
Workaround: don’t query a single 365-day window. Query 12 explicit monthly windows (fromBookingDate and toBookingDate set to each calendar month’s first and last day) and dedup by transaction id. Each monthly window fits in one page on every bank we’ve observed, and the dedup pass makes the overlap-safe boundary handling trivial:
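from datetime import date
from dateutil.relativedelta import relativedelta

def monthly_windows(months_back: int = 12) -> list[tuple[date, date]]:
    # Returns (first_day, last_day) for each of the last `months_back` complete
    # calendar months, newest first. The current, partial month is not
    # included; query it separately if you need the most recent charges.
    today = date.today()
    return [
        (
            (today - relativedelta(months=i + 1)).replace(day=1),
            (today - relativedelta(months=i)).replace(day=1) - relativedelta(days=1),
        )
        for i in range(months_back)
    ]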
If your subscription audit is missing months 2–12 with only the most recent ~60 days populated, this is almost certainly the cause. Window per month, dedup by transaction_id, and the algorithm above runs against the full 12-month history.
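The window-then-dedup pass could be sketched as follows. Here fetch_window is a hypothetical stand-in for whatever tool call or HTTP request returns the transactions booked between two dates; it is not a real Cumbuca or Open Finance function. The sketch also assumes each transaction is a dict with a transaction_id and an ISO-formatted booking_date string.

```python
from datetime import date
from typing import Callable, Iterable


def fetch_all(
    windows: Iterable[tuple[date, date]],
    fetch_window: Callable[[date, date], list[dict]],
) -> list[dict]:
    """Fetch every window and keep the first copy of each transaction id.

    fetch_window is a hypothetical callable supplied by the caller; it stands
    in for the actual data-source query.
    """
    seen: dict[str, dict] = {}
    for start, end in windows:
        for tx in fetch_window(start, end):
            # setdefault keeps the first occurrence, so a transaction returned
            # by two adjacent windows is counted once.
            seen.setdefault(tx["transaction_id"], tx)
    # ISO date strings sort chronologically.
    return sorted(seen.values(), key=lambda tx: tx["booking_date"])
```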
Cumbuca MVP limits — what this audit does NOT do today
Be honest with the user before they install:
- Single account per setup. If they have credit cards across Itaú, Nubank, and Inter, they can only audit one of them at a time today. Multi-account aggregation isn’t in Cumbuca’s MVP.
- ~5 queries/day per user. The audit is one-shot — designed to be run when the user asks “which subscriptions am I paying for?” and consumed in one session. It’s not a continuous-monitoring pipeline. Polling every hour will hit the rate cap immediately.
- Statements + credit-card transactions only. No investments, no SCR/credit-bureau, no payment initiation, no FX/exchanges. Don’t promise net worth, debt aggregation across institutions, or anything that requires data outside this surface.
- Brazilian banks only. Open Finance is Bacen-regulated; this won’t work for users outside Brazil.
These aren’t limits you can engineer around at the agent layer; they’re the MVP scope. Re-evaluate when Cumbuca extends the surface or when a second BR Open Finance MCP ships.
Sample agent prompt
If you’re wiring this into a Claude Code or ChatGPT session and want to drop in a working prompt, this is enough:
You have access to the Cumbuca Open Finance Data MCP. The user has authorized one BR bank’s credit-card and account data feed. Pull the most recent 12 months of credit-card transactions in 12 monthly windows (fromBookingDate per calendar month, dedup by transaction id). Normalize each merchant name (uppercase, collapse whitespace, strip leading PIX ENVIADO / PIX AGENDADO / TED ENVIADO / DEBITO AUT. / COMPRA prefixes). Group by (merchant_normalized, mcc) for card transactions and (merchant_normalized, counterparty_cpf_cnpj) for account transactions. Drop rows where transaction_type == "TARIFA". For each bucket with ≥3 transactions, compute the median interval (days) between consecutive transactions to classify cadence (28–32 → monthly is the dominant case), and compute coefficient of variation CV = stdev / median of amounts (CV > 0.5 → reject; CV ≤ 0.20 → medium confidence; CV ≤ 0.05 and ≥6 obs → high confidence). Classify status by recency: ≤60d active, 61–180d paused, >180d cancelled. Return a list sorted by annual cost descending, separated into active / paused / cancelled sections. Include name, cadence, amount, annual cost, evidence count, first/last seen, confidence.
That prompt plus the MCP install gives you a working subscription audit in one session.
Related personal-finance JTBDs (sketch — not all production-ready today)
Subscription audit is the wedge. The broader space of personal-finance JTBDs an AI agent could do once a user has authorized their BR bank data:
| JTBD | Doable today on Cumbuca MVP? | Why / why not |
|---|---|---|
| Subscription audit | ✅ yes | This page. Single account, transaction feed sufficient. |
| Account balance + recent activity | ✅ yes | Statement endpoint covers this. |
| Expense categorization + breakdown | ✅ yes | Transactions + agent-side categorization (MCC fallback rules + recurring-credit-payer detection cover most cases). |
| Income identification (salary, client payments) | ✅ yes | Recurring CREDITO entries from the same CPF/CNPJ across ≥2 calendar months. Same logic, different sign. |
| Transaction search / reconciliation | ✅ yes | Search the transaction feed. |
| Historical activity views (12 months) | ✅ yes (with monthly-window workaround) | Open Finance window is typically up to 12 months; bank-side pagination caps require windowing per the recipe above. |
| Multi-account net worth | ❌ not yet | MVP is single account. Wait for multi-account scope or a different MCP. |
| Investments analysis | ❌ not yet | Not in Cumbuca’s v1 surface. |
| Debt aggregation across institutions | ❌ not yet | Multi-account dependency + SCR/credit-bureau dependency. |
| Future planning / forecasting | ⚠️ partial | Doable client-side from the transaction history; quality limited by single-account scope. |
| Account reconciliation across bookkeeping | ⚠️ partial | Personal scope only today; business CNPJ accounts use a different OF surface. |
The ones marked ❌ aren’t blocked by bad design — they’re genuine MVP scope limits. If they matter for your use case, file the need; we’ll prioritize the next /solve/ page based on actual demand signal.
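As a rough illustration of the income-identification row, the rule "recurring CREDITO entries from the same CPF/CNPJ across ≥2 calendar months" might be sketched as below. The function name, the dict-based transaction shape, and the ISO-string booking_date are assumptions for illustration.

```python
from collections import defaultdict


def recurring_income_payers(transactions: list[dict], min_months: int = 2) -> dict[str, int]:
    """Return payers with CREDITO entries in at least min_months calendar months.

    Assumes transaction_type == "CREDITO" marks incoming credits and that
    booking_date is an ISO string such as "2026-04-05".
    """
    months_by_payer: dict[str, set[str]] = defaultdict(set)
    for tx in transactions:
        payer = tx.get("counterparty_cpf_cnpj")
        if tx.get("transaction_type") != "CREDITO" or not payer:
            continue
        months_by_payer[payer].add(tx["booking_date"][:7])  # "YYYY-MM"
    return {p: len(m) for p, m in months_by_payer.items() if len(m) >= min_months}
```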
Methodological caveats
- This is documented-characteristics ranking, not precision/recall. The recipe is deterministic and the data source is one (Cumbuca, the only production-ready BR Open Finance personal-data MCP today). A real eval (labeled merchant ground-truth across cadence classes, scoring precision/recall by confidence tier) is Phase-2 follow-up tracked in backlog.md. Today’s value is curated discoverability + a working recipe.
- Cumbuca’s MVP scope can change. Re-check last_verified: 2026-05-07 before relying on this for critical workflows. The rate limit, single-account constraint, and data surface are all explicitly MVP-stage.
- The recipe assumes BRL. All amount arithmetic above presumes a single currency. If a user has FX charges hitting BRL with daily exchange-rate variance, the CV check will flag the merchant as “rejected” even when the underlying foreign-currency amount is stable. Workaround: convert to a stable reference currency before clustering, if you have the FX rate data.
- Merchant normalization is intentionally minimal. No fuzzy matching, no merchant dictionary, no terminal-ID stripping. The simple whitespace collapse + prefix strip handles the dominant case; over-engineering normalization without a labeled corpus to evaluate against produces silent regressions. Add complexity only when the corpus eval ships.
Update cadence
Re-run this ranking when: (a) a second BR Open Finance MCP becomes publicly available (the ranking gains a real comparison; recipe stays), (b) Cumbuca extends MVP scope (multi-account, investments, higher rate limit), (c) a Phase-2 corpus eval ships and updates the scorecard with measured numbers, (d) 90 days after first publish (2026-08-05).
Related
- /mcp/cumbuca-of-data-mcp/: the data-source Capability detail page
- /solve/cnpj-enrichment-mcp/: adjacent Brazilian-data agent task (for business CNPJ enrichment, not personal bank data)
- /solve/nfs-e-extraction/ · /solve/pdf-text-extraction-mcp/: other agent-loop /solve/ wedges in the BR data corpus
- auxiliar-mcp: the MCP server exposing find_capability(jtbd: ["subscription-audit"]) to in-loop agents