Cloud Storage

When it comes to public cloud storage, traditional storage concepts such as hard disks and RAID arrays have been replaced by new, much more flexible options. Data stored within a cloud platform has become virtually independent from its underlying hardware implementation, and it benefits from nearly limitless redundancy options, many even as a default configuration.

Storage terminology has also changed. A much-used storage concept at Amazon is the so-called bucket. It is easiest to see an Amazon bucket as an incredibly flexible, highly accessible and distributed folder. These buckets can be hosted in a region of choice if needed, and options such as logging and performance can be adjusted to match the requirements and budget of the customer.


This flexibility does not come without risks, however. Many cloud users knowingly or unknowingly allow public access to the buckets and their contents. In some cases this is a misconfiguration; in other cases, it is simply a lack of understanding of the relatively new technology. Whatever the underlying reasons are, unsecured buckets have already led to many data breaches and will likely continue to do so in the future. An Amazon S3 bucket access misconfiguration by web company LocalBlox, for instance, caused a major incident in February 2018. This company stored a 1.2 TB file containing 48 million records of users’ internet behavior linked to their IP addresses inside a publicly accessible S3 bucket. As soon as the company was notified of the issue, it closed the access down. It is hard to know with certainty, however, whether anyone else downloaded a copy of the sensitive (and to the company also valuable) user data before the access lockdown and where that copy could have ended up. Once data has been publicly accessible for any length of time, it becomes nearly impossible to guarantee that subsequently restricting the access has contained all the data. The proverbial “cat is out of the bag” by then.
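One way to spot this kind of misconfiguration is to look for access control list (ACL) grants that target the predefined "everyone" groups. The following is a minimal sketch, not a complete audit: it assumes the ACL has already been fetched and has the shape of a boto3 `get_bucket_acl` response, and the function name and sample data are hypothetical.

```python
# URIs of the predefined S3 groups that stand for "anyone on the internet"
# and "any authenticated AWS account".
PUBLIC_GROUP_URIS = {
    "http://acs.amazonaws.com/groups/global/AllUsers",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers",
}


def public_grants(acl):
    """Return (group_uri, permission) pairs that expose a bucket publicly.

    `acl` is assumed to look like a boto3 `get_bucket_acl` response.
    """
    findings = []
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        if grantee.get("Type") == "Group" and grantee.get("URI") in PUBLIC_GROUP_URIS:
            findings.append((grantee["URI"], grant.get("Permission")))
    return findings


# Hypothetical sample ACL, for illustration only.
sample_acl = {
    "Grants": [
        {
            "Grantee": {"Type": "CanonicalUser", "ID": "owner-id"},
            "Permission": "FULL_CONTROL",
        },
        {
            "Grantee": {
                "Type": "Group",
                "URI": "http://acs.amazonaws.com/groups/global/AllUsers",
            },
            "Permission": "READ",
        },
    ]
}

print(public_grants(sample_acl))
```

Note that this only inspects ACLs. A bucket can also be made public through its bucket policy, so an empty result here does not prove that a bucket is private.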

At the beginning of the year, some ransomware cases made the news as well, in which data hosted in publicly writable buckets was encrypted, or copied and then removed, and was subsequently held for ransom. It turns out these publicly writable buckets are surprisingly common. Security researcher “Random Robbie” wrote a script that scans accessible buckets and leaves a POC.txt file in the vulnerable ones. If you find one of these files amongst your data, you are advised to lock down the access controls of the relevant services immediately.

Bucket enumeration

The security issue around cloud data is not new. Over the years, many tools such as S3Scanner and AWSBucketDump have been developed and updated, which probe the cloud platforms for any publicly accessible buckets. Once such a bucket is found, most tools can even scan or dump the contents of the bucket, providing the interested party with an easy and automated way to access the data.

More and more of these tools have become available, and the latest trend is the use of certificate transparency logs for scanning efficiency. These tools no longer need to brute force all entries on a predefined wordlist. Instead, they build permutations of the domain names found in certificate transparency logs, which makes the process much more targeted and, with that, much quicker.
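To illustrate why this is more targeted than a generic wordlist, here is a minimal sketch of how candidate bucket names could be derived from a single hostname seen in a certificate. It is not taken from any of the tools mentioned: the function name, the affix list, and the naive handling of the top-level domain are all assumptions made for the example.

```python
def candidate_bucket_names(hostname, affixes=("backup", "dev", "assets")):
    """Derive candidate bucket names from one hostname found in a certificate.

    Assumption: the last label is the top-level domain and is dropped.
    Multi-part suffixes such as "co.uk" are not handled in this sketch.
    """
    hostname = hostname.lower()
    if hostname.startswith("*."):
        hostname = hostname[2:]  # strip the wildcard from "*.example.com"

    labels = hostname.split(".")[:-1]
    if not labels:
        return []

    # Use both the registered name and the full host as starting points.
    stems = {labels[-1], "-".join(labels)}

    candidates = set()
    for stem in stems:
        candidates.add(stem)
        for affix in affixes:
            candidates.add(f"{stem}-{affix}")
            candidates.add(f"{affix}-{stem}")
    return sorted(candidates)


print(candidate_bucket_names("shop.example.com", affixes=("backup",)))
```

Running the same function against your own domain names is a cheap way to see which bucket names an outsider would be likely to try first.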

Security Measures

As mentioned before, the security issues around these buckets are not new. Vendors have now started to accept some of the responsibility, however, and some interesting, more proactive measures have been made available to cloud users recently.

Of course, the traditional security controls and processes still apply, but as can be observed, they often fail due to human error or a lack of understanding of the platform. Access rights need to be properly set and reviewed on a regular basis. Proactive scans against the customer's own environment, using the enumeration scripts mentioned above or broader vulnerability scans, need to be performed and monitored. Nothing new there. These policies simply need to be in place already.


One of the more interesting recent developments was the release of Amazon Macie in August 2017. Amazon Macie promises to automatically discover and classify data stored inside Amazon S3 buckets using machine learning technology. This might very well be the future. It has become clear that human error cannot be reduced to zero, so putting near-real-time automated controls in place that can contain the risks once such an error inevitably occurs is a good approach. Another option is to enable Amazon's Default Encryption feature. This will automatically encrypt any file placed inside a bucket. Some other available features are Amazon's permission checks and alarms and the use of Access Control Lists.
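As an illustration of the Default Encryption option, the sketch below turns it on for one bucket through the S3 API's `put_bucket_encryption` call. The helper name and the bucket name in the usage comment are hypothetical, and the choice of SSE-S3 (`AES256`) rather than a KMS key is an assumption made to keep the example short.

```python
def enable_default_encryption(s3_client, bucket_name):
    """Ask S3 to encrypt objects newly written to `bucket_name` with AES-256.

    `s3_client` is assumed to be a boto3 S3 client (or an object that
    offers the same `put_bucket_encryption` method).
    """
    s3_client.put_bucket_encryption(
        Bucket=bucket_name,
        ServerSideEncryptionConfiguration={
            "Rules": [
                {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
            ]
        },
    )


# Example usage (requires boto3 and valid AWS credentials):
#   import boto3
#   enable_default_encryption(boto3.client("s3"), "example-bucket")
```

Default encryption applies to objects written after it is enabled. Objects that were already in the bucket are not re-encrypted by this setting, and encryption at rest does not by itself stop a publicly readable bucket from serving its contents.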

Of course, the monitoring of public access and API calls is also critical. Alerts should be set (and actioned) covering the dumping of large amounts of files or large files in general. A SIEM can assist in correlating the required security event data for these alerts via rules and set thresholds.
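The kind of threshold rule described above can be sketched in a few lines. This is only an illustration of the logic, not a SIEM rule in any particular product's syntax: the event format (a requester and the number of bytes sent per download), the function name, and the threshold values are all assumptions.

```python
from collections import defaultdict


def download_alerts(events, max_objects=100, max_bytes=1_000_000_000):
    """Flag requesters whose downloads exceed a count or volume threshold.

    `events` is an iterable of (requester, bytes_sent) pairs, assumed to
    cover object downloads within a single time window.
    """
    counts = defaultdict(int)
    volume = defaultdict(int)
    for requester, bytes_sent in events:
        counts[requester] += 1
        volume[requester] += bytes_sent

    # A requester is flagged if either threshold is exceeded.
    return sorted(
        requester
        for requester in counts
        if counts[requester] > max_objects or volume[requester] > max_bytes
    )


# Hypothetical events: one bulk downloader, one very large file, one normal user.
events = (
    [("10.0.0.1", 10)] * 101
    + [("10.0.0.2", 2_000_000_000)]
    + [("10.0.0.3", 5)]
)
print(download_alerts(events))
```

In practice the thresholds would need tuning per bucket, since a value that is suspicious for an archive bucket may be normal for one that serves website assets.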


Data breaches via cloud storage are a problem that will not go away. We have looked at the many reasons why this topic is still such an issue. We have also looked at the mitigation options and some promising recent developments in this space. On top of all this, Amazon has been proactively contacting its users with publicly accessible data as well. When all these options are combined, they make up a solid, holistic security suite that should be more than sufficient to address the ongoing concerns. It is up to the security professional to implement and maintain these policies, however. Until that happens, it might be best for cloud providers to take over some of the settings through secure default configurations. This seems to work to a degree.