Security Concerns Around Zombie Cloud Infrastructure
One of the most important benefits of cloud instances over traditional network configurations is that an instance can be set up within seconds, with the click of a few buttons. This ability has dramatically reduced deployment times for test, model, and production systems. It also allows for great flexibility from both a technical and a billing perspective.
Although it is very quick and easy to deploy new systems, it is not so simple to decommission existing ones. Within a large organization, which is usually more risk-averse, someone has to guarantee that the system serves no purpose, now or in the future, before the virtual plug can be pulled. Getting this guarantee on paper takes a lot of effort. No one wants to be responsible for turning off a perfectly functional system that, for instance, executes a critical monthly report or task.
This has created a relatively new issue that carries serious security concerns: Zombie Cloud Infrastructure, meaning systems that are still in place only because it is safest to “just leave it running.”
Why is this a security concern?
Maintenance on these systems might have stopped because they are no longer relevant and no longer in production. They are slowly “forgotten” about, and if they were only used for a very short period, their existence might not even be (fully) documented. This is any security professional’s worst nightmare. These systems could be servers, but they could also be virtual firewalls, switches or entire (model) environments containing all these asset types.
Patches and Updates
If these systems are no longer maintained, they are no longer patched against the latest security threats. If, for instance, the next Shellshock-type vulnerability is discovered and exploited, who makes sure the patches released to protect a server against it are deployed? Is anyone even aware that these systems are unpatched if they are not maintained? A regular internal vulnerability scan can help here. Scanning an entire IP range might reveal undocumented and unpatched systems. These systems can then be decommissioned, or at least patched adequately until their purpose has been discussed internally.
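The idea of comparing scan results against documentation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `find_undocumented`, the addresses, and the inventory list are all hypothetical, and the list of live hosts is assumed to come from whatever scanner the organization already runs.

```python
import ipaddress


def find_undocumented(discovered, inventory, cidr):
    """Return live addresses inside `cidr` that the inventory does not list."""
    network = ipaddress.ip_network(cidr)
    known = {ipaddress.ip_address(addr) for addr in inventory}
    live = {ipaddress.ip_address(addr) for addr in discovered}
    # Keep only hosts in the scanned range that nobody has documented.
    return sorted(addr for addr in live if addr in network and addr not in known)


# Hypothetical data: hosts that answered a scan, and the documented inventory.
discovered = ["10.0.0.5", "10.0.0.17", "10.0.0.42", "192.168.1.9"]
inventory = ["10.0.0.5", "10.0.0.42"]

for addr in find_undocumented(discovered, inventory, "10.0.0.0/24"):
    print(addr)  # candidates for a "what is this system?" conversation
```

Anything this comparison returns is not necessarily a zombie, but it is a system whose owner and purpose should be established.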
Not all of these systems will have a current security event feed into a monitoring platform. This is especially the case for test and model systems that were only meant to operate for a very short period. Unfortunately, an antivirus agent or a host-based Intrusion Detection System is usually not among the first applications to be installed in a test environment. This lack of visibility means that, once such a system is compromised, an attacker could use it as the perfect jump host or backdoor to pivot further into the network. The potential lack of patching mentioned earlier increases this risk even further. Network-based security devices should pick up some of the noise produced by an attack on these systems. A perimeter (Next-Gen) firewall or Network-Based Intrusion Detection System could, for instance, pick up suspicious traffic traversing the network to and from such a target, regardless of the lack of host-based security monitoring. This is definitely not a complete solution, though. It is only one part of the well-known Defense in Depth security principle.
Another issue these systems bring along is the potential existence of unused confidential data. What data is present on these systems, who has access to it, and who still uses it? These questions are not only critical to the decision to decommission a redundant system; they also determine how much security risk the system poses while it is still operational. There might, for instance, be Personally Identifiable Information (PII) on these systems. In a secure environment, all data and all access to that data should be accounted for.
Another example could be a file server migration where user data was moved from an old server to a new server. If the old server is never decommissioned, access permissions between the old and the new server data might become out-of-sync, creating a security issue if the old data is somehow still available.
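The out-of-sync permissions described above can be made concrete with a small comparison. The sketch below assumes the access lists of both servers have already been exported into dictionaries mapping a path to the set of accounts with access. The function name `stale_grants`, the paths, and the account names are hypothetical.

```python
def stale_grants(old_acl, new_acl):
    """Return, per path, the accounts that the old server still lets in
    but the new server does not."""
    drift = {}
    for path, old_users in old_acl.items():
        extra = set(old_users) - set(new_acl.get(path, ()))
        if extra:
            drift[path] = extra
    return drift


# Hypothetical exports: bob lost access on the new server after the
# migration, but the copy on the old server was never updated.
old_server = {"/hr/payroll": {"alice", "bob"}, "/eng/specs": {"carol"}}
new_server = {"/hr/payroll": {"alice"}, "/eng/specs": {"carol"}}

print(stale_grants(old_server, new_server))
```

Every entry in the result is access that was deliberately revoked on the new server but remains in force on the old one.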
Since the costs of setting up and operating a (dormant) cloud infrastructure are relatively low, system owners will “err on the side of caution” when they are faced with a decision to turn anything off. Availability often comes first in an organization, with security a distant second. If a system owner is 99% sure a system has become obsolete, that is still not enough to justify the risk of causing an outage, no matter how low its potential impact. Because it is very hard to quantify a complex and broad security risk in monetary terms, but very easy to simply accept the operational costs of the cloud infrastructure in question, a cost-based argument is hard to make, although it should not be impossible.
As discussed, there are a few options to limit the impact of existing Zombie Cloud Infrastructure within an organization. One approach is to scan the internal network (and, if needed, the perimeter) for unidentified systems and services. This is best practice and should already be done on a regular basis anyway, because it will also identify obsolete services and unpatched systems.
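One hedged way to attempt such a cost-based argument is the common annualized loss expectancy calculation from risk analysis: the estimated cost of a single incident multiplied by how often it is expected per year. The sketch below is illustrative only. The function names and every figure in it are assumptions, and estimating those inputs is exactly the hard part described above.

```python
def annualized_loss_expectancy(single_loss, years_between_incidents):
    """Expected yearly loss: cost of one incident spread over how often
    it is expected to occur (one incident every N years)."""
    return single_loss / years_between_incidents


def yearly_cost_of_keeping(monthly_hosting, single_loss, years_between_incidents):
    """Hosting bill plus the expected security loss of leaving it running."""
    return 12 * monthly_hosting + annualized_loss_expectancy(
        single_loss, years_between_incidents
    )


# Hypothetical figures: a dormant server costing 40 per month, and a breach
# through it estimated at 250,000, expected once every 50 years.
hosting_only = 12 * 40
with_risk = yearly_cost_of_keeping(40, 250_000, 50)
print(hosting_only, with_risk)  # 480 5480.0
```

With these assumed numbers the visible hosting bill is less than a tenth of the estimated yearly cost, which is why the hosting bill alone makes a zombie system look cheap.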
Another prevention method could be the use of a comprehensive set of processes and guidelines covering subjects such as deployment, change management, documentation and decommissioning.
Finally, a combination of perimeter and internal traffic analysis tools such as IDS and NGFW devices should be able to identify suspicious traffic regardless of its source and destination.
In the end, the remedies for this “cloud” issue are not new. They were nearly the same 20 years ago, when all infrastructure was purely physical. The reduction in deployment and operational costs and the elimination of aging hardware and service contracts have, however, only increased the need to adhere to these existing security best practices.