Capital One breach (2019) — how SSRF leaked 100M+ records, and how to defend
In 2019, roughly 106 million records leaked from Capital One. A single SSRF flaw let the attacker reach the cloud metadata endpoint, steal the temporary credentials of an over-privileged IAM role, and copy data out of S3. This article reads the attack chain as a defense map and covers the fixes: destination allowlists, IMDSv2, and IAM least privilege.
We read real public incidents not as news reruns but through the question "how would you defend against this?". This article is based on public records (regulators, reporting, academic analysis, official statements), cited at the end.
- Target: Capital One (major US bank)
- Disclosed: July 29, 2019 (detected July 19; intrusion in March)
- Class: SSRF-initiated theft of cloud credentials and exfiltration
- Scale: ~106M people in US/CA (incl. ~140k SSNs, ~80k linked bank account numbers)
- Root cause: SSRF flaw + an unprotected metadata endpoint + an over-privileged IAM role
- Real fix: defense in depth (destination allowlist, IMDSv2, IAM least privilege)
What happened (plainly)
Capital One ran part of its application processing in the cloud. A component in front of it (running as a reverse proxy / WAF) had an SSRF flaw: it could be made to send a request, on the server's behalf, to a destination chosen from outside.
Cloud VMs have a special metadata endpoint reachable only from inside, which returns the temporary credentials assigned to that machine. Using SSRF, the attacker reached this internal endpoint and obtained the keys. The IAM role behind them (reported in public records as ISRM-WAF-Role) had broad permission to read storage, so the contents of ~700 storage buckets (S3) were copied out as-is.
'Only reachable from inside' becomes 'inside' when SSRF exists
The metadata endpoint was protected by the assumption "unreachable from outside". But SSRF makes the server itself proxy the access, breaking that assumption. "Internal-only, therefore safe" collapses with one SSRF at the entrance.
The attack chain is also a defense map
What matters is that this was a four-hop chain, and each hop had a place to stop it. Read it not as an attack recipe but as "where it could have been cut".
① Entry: SSRF
A flaw lets the server proxy a request to any destination.
⊘ stop: allowlist destinations
② Reach the internal metadata endpoint
An "internal-only" endpoint becomes reachable via SSRF.
⊘ stop: IMDSv2 (token required + hop limit)
③ Obtain IAM temporary credentials
The machine's role hands back temporary keys.
⊘ stop: least-privilege role (no read-all storage)
④ Bulk-copy storage (S3)
~700 buckets copied straight out.
⊘ stop: egress controls + anomalous-read detection
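Of the four stops, hop ② is the least intuitive, so here is a minimal sketch of the two-step IMDSv2 exchange on AWS. It only builds the requests and sends nothing; the helper names are ours, chosen for illustration. What it shows is that a caller must first issue a PUT with a TTL header and then present the returned token in a header on every GET. A typical SSRF flaw can only make the server issue a plain GET to a chosen URL, so it cannot complete step 1.

```python
import urllib.request

# Link-local address of the instance metadata service on AWS.
IMDS = "http://169.254.169.254"


def build_token_request(ttl_seconds=21600):
    """Step 1 of IMDSv2: a PUT asking for a session token.

    21600 seconds (6 hours) is the maximum token lifetime.
    """
    return urllib.request.Request(
        f"{IMDS}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_metadata_request(path, token):
    """Step 2: a GET that must carry the session token in a header."""
    return urllib.request.Request(
        f"{IMDS}/latest/meta-data/{path.lstrip('/')}",
        method="GET",
        headers={"X-aws-ec2-metadata-token": token},
    )


# Built but never sent: this only illustrates the shape of the exchange.
token_req = build_token_request(ttl_seconds=60)
creds_req = build_metadata_request("iam/security-credentials/", "TOKEN-FROM-STEP-1")
print(token_req.get_method(), token_req.full_url)
print(creds_req.get_method(), creds_req.full_url)
```

The hop limit adds a second barrier: with a limit of 1, the token response cannot travel beyond the instance itself, so a component that merely forwards traffic from elsewhere does not receive it.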
Disclosed timeline
- 2019-03: Unauthorized access to the cloud environment (determined later).
- 2019-07-17: An external tip (a post on GitHub) flags the anomaly.
- 2019-07-19: Capital One confirms the breach in internal investigation.
- 2019-07-29: Public disclosure; a former AWS engineer is arrested by the FBI.
- 2019-11: AWS announces IMDSv2 as defense in depth against SSRF.
- 2020-08: The OCC issues an ~$80M penalty, citing inadequate risk assessment before cloud migration.
The root cause isn't one mistake — it's layers giving way
Pin this on "SSRF" alone and it repeats. In truth three layers failed in sequence.
As it was (at the time)
- No destination validation at the entry (any proxied request possible)
- The metadata endpoint returned keys without a token (legacy mode)
- The machine's IAM role could read storage broadly (over-privileged)
As it should be (prevention)
- Allowlist destinations and deny proxied access to internal addresses
- Metadata via IMDSv2: a session token is required, and a hop limit blocks forwarded requests
- IAM role scoped to least privilege: only the buckets and actions actually needed
Where the 'shared responsibility' line falls
The cloud provider secures the foundation, but fixing SSRF, designing IAM permissions, and protecting metadata are the customer's responsibility. The penalty itself cited "inadequate risk assessment before cloud migration". Riding a convenient platform isn't the problem — whether you've fully designed your side of the line is.
Preventing it in your environment
Priority-ordered fixes that work at any scale. If you have even one feature that "takes a URL and fetches it" or "delivers a webhook/image", this is about you.
Allowlist outbound destinations (stop SSRF at the door)
Where a server fetches a user-supplied URL on the user's behalf, restrict it to allowed destinations only. Deny access to internal addresses (the metadata endpoint, private networks) by default. Our SSRF glossary entry covers the validation pitfalls people miss (redirect following, DNS rebinding).
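A minimal sketch of such a check, under stated assumptions: the host names in `ALLOWED_HOSTS` are hypothetical, and `check_destination`, `resolve_all`, and `is_public_ip` are names invented for this example. It combines an exact-match host allowlist with a second check that every resolved address is globally routable, so an allowlisted name that suddenly points at 169.254.169.254 or a private range is still refused.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

# Hypothetical allowlist for illustration; use your own verified hosts.
ALLOWED_HOSTS = frozenset({"images.example.com", "hooks.example.com"})


def resolve_all(host):
    """Return every IP address the host name currently resolves to."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def is_public_ip(addr):
    """True only for globally routable addresses."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        return False  # unparseable address: refuse
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so the IPv4 rules apply.
    mapped = getattr(ip, "ipv4_mapped", None)
    if mapped is not None:
        ip = mapped
    return ip.is_global  # False for private, loopback, link-local ranges


def check_destination(url, allowed_hosts=ALLOWED_HOSTS, resolver=resolve_all):
    """Deny by default; allow only http(s) URLs to allowlisted, public hosts."""
    try:
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
    except ValueError:
        return False
    if parts.scheme not in ("http", "https"):
        return False
    if host not in allowed_hosts:
        return False
    try:
        addrs = resolver(host)
    except OSError:
        return False
    return bool(addrs) and all(is_public_ip(a) for a in addrs)
```

This check alone does not close the two pitfalls named above. To resist DNS rebinding, the fetch should connect to the same address that was validated rather than resolving the name again, and every redirect target needs to pass through `check_destination` before it is followed.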
Enforce IMDSv2 for metadata
Restrict the cloud metadata endpoint to the token-required, hop-limited mode (IMDSv2 on AWS). This alone makes credential theft from SSRF far harder. The point is to disable the legacy mode.
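To find where legacy mode is still on, one option is to scan the output of the EC2 `describe_instances` call. The sketch below assumes that response shape (`Reservations`, `Instances`, `MetadataOptions`) and works on a plain dict, so it runs without AWS access; the function names and the sample instance IDs are illustrative.

```python
def instances_allowing_imdsv1(describe_instances_response):
    """Return IDs of instances whose metadata endpoint still answers without a token."""
    legacy = []
    for reservation in describe_instances_response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            options = instance.get("MetadataOptions", {})
            endpoint_on = options.get("HttpEndpoint", "enabled") == "enabled"
            tokens_required = options.get("HttpTokens") == "required"
            if endpoint_on and not tokens_required:
                legacy.append(instance["InstanceId"])
    return legacy


def enforce_imdsv2_kwargs(instance_id):
    """Arguments for boto3's ec2.modify_instance_metadata_options call."""
    return {
        "InstanceId": instance_id,
        "HttpTokens": "required",        # disables the token-less legacy mode
        "HttpPutResponseHopLimit": 1,    # token response stays on the instance
        "HttpEndpoint": "enabled",
    }


# Sample data in the assumed response shape (instance IDs are made up).
sample = {
    "Reservations": [
        {"Instances": [
            {"InstanceId": "i-aaa", "MetadataOptions": {"HttpTokens": "optional"}},
            {"InstanceId": "i-bbb", "MetadataOptions": {"HttpTokens": "required"}},
        ]}
    ]
}
for instance_id in instances_allowing_imdsv1(sample):
    print(enforce_imdsv2_kwargs(instance_id))
    # With boto3 installed and credentials configured, this would be:
    # ec2.modify_instance_metadata_options(**enforce_imdsv2_kwargs(instance_id))
```

A hop limit of 1 is the strictest choice; workloads that reach the metadata service from inside containers may need 2, so test before rolling it out broadly.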
Least-privilege IAM (shrink the blast radius)
So that leaked keys can do only limited damage, scope each machine's or service's permissions to the targets and actions it actually needs. "Just allow all storage" turns one breach into total exfiltration.
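As an illustration, here is what a narrowly scoped S3 read policy might look like, together with a crude check that flags wildcard grants. The bucket name and prefix are hypothetical, `broad_statements` is a name made up for this sketch, and neither policy is the one used in the incident. It is a lint for the most obvious mistakes, not a substitute for a real policy analyzer.

```python
# Hypothetical policy: read-only access to one prefix of one bucket.
SCOPED_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": ["arn:aws:s3:::example-app-uploads/incoming/*"],
        }
    ],
}

# The anti-pattern: every S3 action on every bucket.
BROAD_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "s3:*", "Resource": "*"}],
}


def _as_list(value):
    """IAM allows a single string or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def broad_statements(policy):
    """Return Allow statements that use a wildcard action or resource."""
    flagged = []
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action", []))
        resources = _as_list(stmt.get("Resource", []))
        wide_action = any(a in ("*", "s3:*") for a in actions)
        wide_resource = any(r in ("*", "arn:aws:s3:::*") for r in resources)
        if wide_action or wide_resource:
            flagged.append(stmt)
    return flagged


print(len(broad_statements(SCOPED_POLICY)))  # no statements flagged
print(len(broad_statements(BROAD_POLICY)))   # one statement flagged
```

With the scoped policy, stolen keys can read objects under one prefix; with the broad one, the same keys can list and copy every bucket in the account.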
Egress and anomaly detection
Restrict outbound traffic to needed destinations and detect anomalies like a burst of large reads. Even if the entry isn't stopped, keep a layer that notices the exfiltration.
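One simple form of "detect a burst of large reads" is a sliding-window byte counter per identity. The sketch below is an assumption-laden illustration: the class name, the 5-minute window, and the 5 GiB threshold are invented defaults, and it expects events for each principal to arrive in time order (for example, parsed from storage access logs).

```python
from collections import defaultdict, deque


class ReadBurstDetector:
    """Flags a principal whose bytes read within a time window exceed a threshold."""

    def __init__(self, window_seconds=300.0, max_bytes=5 * 1024**3):
        self.window_seconds = window_seconds
        self.max_bytes = max_bytes
        self._events = defaultdict(deque)  # principal -> deque of (timestamp, nbytes)
        self._totals = defaultdict(int)    # principal -> bytes currently in window

    def record(self, principal, timestamp, nbytes):
        """Record one read; return True if the principal is now over the threshold."""
        events = self._events[principal]
        events.append((timestamp, nbytes))
        self._totals[principal] += nbytes
        # Drop events that have aged out of the window.
        cutoff = timestamp - self.window_seconds
        while events and events[0][0] <= cutoff:
            _, old_bytes = events.popleft()
            self._totals[principal] -= old_bytes
        return self._totals[principal] > self.max_bytes


detector = ReadBurstDetector(window_seconds=60, max_bytes=1000)
for ts in (0, 10, 20):
    if detector.record("role/app-server", ts, 400):
        print(f"alert: read burst by role/app-server at t={ts}")
```

A fixed threshold is the bluntest option; in practice the limit would be tuned per role against its normal read volume, and the alert paired with egress rules that restrict where the data can go.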
Where this overlaps with ITD's design
ITD designs URL-handling features around verified domains only (SSRF-safe by design). This breach is the most eloquent case for why the principles we build on — validate the entry, least privilege, don't rely on "internal" assumptions — are necessary. Behind the convenience, you design your side of the line yourself.
Sources (public records)
The facts here are based on the following public information. We don't cover reproduction steps or payloads — only the defensive lessons.
- Krebs on Security, "What We Can Learn from the Capital One Hack" (2019) — krebsonsecurity.com
- ACM Transactions on Privacy and Security, "A Systematic Analysis of the Capital One Data Breach" — dl.acm.org
- CyberScoop, "US financial regulator fines Capital One $80 million over data breach" (2020) — cyberscoop.com
Read next
- Glossary: What is SSRF (the entry point, with validation pitfalls)
- Incident: Hit by a neglected CVSS 10.0
- Defense: Bake least privilege and machine monitoring into operations
FAQ
Q: What was the root cause of the Capital One breach?
Not a single cause but several defense layers failing in sequence. The entry was an SSRF flaw (a server could be made to proxy a request to any destination). From there the cloud metadata endpoint was reachable, and the IAM role behind it had broad permissions, so its temporary credentials let the attacker read storage wholesale.
Q: Does this matter for my small service?
Yes. SSRF arises from ordinary features that 'take a URL and fetch it' (previews, webhooks). On the cloud, credential theft from the metadata endpoint happens regardless of scale. The three fixes here (IMDSv2, IAM least privilege, destination allowlist) work for indie projects too.
Q: Would a WAF have prevented it?
No. In this breach the component acting as a WAF itself became the SSRF stepping stone. A WAF blocks some attacks, but a misconfigured one becomes a hole. "We have a WAF, so we're safe" is a misconception. The real defenses are input validation, least privilege, and metadata protection.