Security Company CloudFlare leaks sensitive customer information for tens of thousands of websites

Cloudflare: Cloudflare Reverse Proxies are Dumping Uninitialized Memory

(It took every ounce of strength not to call this issue "cloudbleed")

Corpus distillation is a procedure we use to optimize the fuzzing we do by analyzing publicly available datasets. We've spoken a bit about this publicly in the past, for example:

https://security.googleblog.com/2011/08/fuzzing-at-scale.html


http://taviso.decsystem.org/making_software_dumber.pdf#page=11

On February 17th 2017, I was working on a corpus distillation project, when I encountered some data that didn't match what I had been expecting. It's not unusual to find garbage, corrupt data, mislabeled data or just crazy non-conforming data...but the format of the data this time was confusing enough that I spent some time trying to debug what had gone wrong, wondering if it was a bug in my code. In fact, the data was bizarre enough that some colleagues around the Project Zero office even got intrigued.

It became clear after a while we were looking at chunks of uninitialized memory interspersed with valid data. The program that this uninitialized data was coming from just happened to have the data I wanted in memory at the time. That solved the mystery, but some of the nearby memory had strings and objects that really seemed like they could be from a reverse proxy operated by cloudflare - a major cdn service.

A while later, we figured out how to reproduce the problem. It looked like that if an html page hosted behind cloudflare had a specific combination of unbalanced tags, the proxy would intersperse pages of uninitialized memory into the output (kinda like heartbleed, but cloudflare specific and worse for reasons I'll explain later). My working theory was that this was related to their "ScrapeShield" feature which parses and obfuscates html - but because reverse proxies are shared between customers, it would affect *all* Cloudflare customers.

We fetched a few live samples, and we observed encryption keys, cookies, passwords, chunks of POST data and even HTTPS requests for other major cloudflare-hosted sites from other users. Once we understood what we were seeing and the implications, we immediately stopped and contacted cloudflare security.

This situation was unusual, PII was actively being downloaded by crawlers and users during normal usage, they just didn't understand what they were seeing. Seconds mattered here, emails to support on a friday evening were not going to cut it. I don't have any cloudflare contacts, so reached out for an urgent contact on twitter, and quickly reached the right people.



After I explained the situation, cloudflare quickly reproduced the problem, told me they had convened an  incident and had an initial mitigation in place within an hour.

"You definitely got the right people. We have killed the affected services"

Source: https://bugs.chromium.org/p/project-zero/issues/detail?id=1139