Before diving into some real-world cloud native application vulnerabilities, let’s recap the previous post, in which we presented cloud native application security challenges. Cloud native applications run in ever-changing environments; these are composed of a wide range of components gathered from multiple sources with varying security controls. In legacy monolithic apps, by contrast, every at-risk code path (from source to sink) exists within the same code base.
A vulnerable flow describes a path – a sequence of microservices – that results in potentially exploitable code. Cloud native applications span multiple layers, e.g., containers, clusters, and clouds. A vulnerable microservice presents a serious risk, but one atop a misconfigured container or cluster can be fatal.
Cloud native application security testing requires a fresh, unique approach – one that takes all of the above into account.
This post elaborates on how cloud native vulnerable flows differ from those within monolithic apps, and on how such flows differ from application layer vulnerabilities.
There are two fundamental differences. Both pertain to the sequence, order, and infrastructure in which a given line of code is executed.
Next we’ll look at some real-world examples in which the execution context or infrastructure layers either elevated the risk or enabled a vulnerability. (All vulnerabilities described have been previously reported and fixed.)
Google Cloud example (“Dropping a shell in cloud SQL”) – In this flow, a SQL injection on a DB service, running atop a publicly exposed and misconfigured container, led to remote code execution (RCE) on the host. At the time, Google had a Cloud SQL service that enabled execution of arbitrary queries within a MySQL database. A researcher initially found a SQL injection on that service and elevated it to an RCE on the container. But it didn’t stop there. The container shared the network with the host, which permitted full interference with the host’s traffic. Because the database ran in Google Compute Engine, the researcher was able to manipulate Google’s metadata server by spoofing requests; these eventually caused the host to create a user with a predefined SSH public key. By combining these steps, an attacker would have been able to execute commands as root on the host and access all its resources.
Shopify example (“SSRF in Exchange leads to ROOT access in all instances”) – Here a vulnerable microservice atop a Kubernetes cluster was publicly exposed. At that time, there was a bug in the screenshotting functionality of Shopify Exchange, which led to a server-side request forgery (SSRF). This vulnerability enabled the server-side app to make HTTP requests to an arbitrary domain of the attacker’s choosing. By exploiting the SSRF, the researcher then communicated with the GKE API to export Kubernetes environment attributes – including the Kubelet certificate and private key. Combined with manipulation of the kubectl “get secrets” parameter, this set permitted the execution of arbitrary commands in any container within the cluster.
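One common mitigation for this class of bug is to validate outbound URLs before a feature such as screenshotting fetches them. The following is a minimal sketch under stated assumptions, not Shopify’s actual fix: the function name `is_url_allowed` and the `BLOCKED_HOSTS` set are hypothetical, and the check deliberately skips DNS resolution, which a real deployment would also need.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical deny list of metadata hostnames; extend it for your environment.
BLOCKED_HOSTS = {"metadata.google.internal", "metadata"}


def is_url_allowed(url: str) -> bool:
    """Return False for URLs that clearly target internal or metadata endpoints."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL, e.g. broken IPv6 brackets
    if parts.scheme not in ("http", "https"):
        return False
    host = (parts.hostname or "").lower().rstrip(".")
    if not host or host in BLOCKED_HOSTS:
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. This sketch stops here; a real service should
        # resolve the name and re-check the resolved address as well.
        return True
    # Rejects loopback, private, and link-local ranges such as 169.254.169.254.
    return addr.is_global
```

A check like this only narrows the attack surface; restricting the workload’s network egress and metadata access at the cluster layer addresses the same flow from the infrastructure side.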
Slack example (“TURN server allows TCP and UDP proxying to internal network”) – This flow combines a vulnerable service with the cloud components beneath it. At the time of discovery, Slack used TURN (Traversal Using Relays around NAT) protocol servers for its WebRTC infrastructure. Each was vulnerable and could be abused to relay TCP and UDP packets. The effect is similar to SSRF, except that it isn’t limited to HTTP traffic. Deployed in a cloud native environment, such a server vulnerability could result in attackers connecting to the AWS metadata API and obtaining temporary IAM credentials, which could lead to an account takeover. Another option was to port-scan the Slack AWS infrastructure on the internal subnets, find server management applications, and possibly abuse such trusted services.
Starbucks example (“Hacking Starbucks and Accessing Nearly 100 Million Customer Records”) – To execute its website logic, Starbucks used an API endpoint on its external-facing service to process requests and transfer them internally to a designated service. The assumption might have been that the internal service wouldn’t be exposed to the internet and therefore didn’t require input validation. As a result, the internal microservice was vulnerable to directory traversal attacks, which allowed an external user to explore all endpoints. This led to the execution of arbitrary Microsoft Graph queries that included all accounts – nearly 100M customer records.
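To illustrate the kind of input validation an internal service could apply to a forwarded path, here is a minimal sketch. It is an assumption-laden example rather than a description of Starbucks’ code: the `ALLOWED_PREFIX` route and the `safe_internal_path` helper are hypothetical names chosen for illustration.

```python
from typing import Optional
from urllib.parse import unquote

ALLOWED_PREFIX = "/api/v1/"  # hypothetical route prefix for this service


def safe_internal_path(raw_path: str) -> Optional[str]:
    """Return the decoded path if it stays under ALLOWED_PREFIX, else None."""
    decoded = unquote(raw_path)
    # A leftover "%" hints at double encoding; backslashes and NUL bytes
    # are never expected in these routes, so reject them outright.
    if any(ch in decoded for ch in ("%", "\\", "\x00")):
        return None
    if not decoded.startswith(ALLOWED_PREFIX):
        return None
    # Refuse any dot segment instead of trying to normalize it away.
    if any(segment in ("..", ".") for segment in decoded.split("/")):
        return None
    return decoded
```

Rejecting suspicious paths outright, rather than normalizing them, keeps the proxy and the internal service from disagreeing about what a path means.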
Yahoo example (“How I found 2.9 RCE at Yahoo! Bug Bounty program”) – At Yahoo there was an RCE on an internal microservice that was called from behind a message queue. It could be reached even though the microservice didn’t directly expose any ports or services to the internet. What made this vulnerability high risk is that the message queue itself was exposed. Without a way to reach it via the internet, the same vulnerable code would have been much less exploitable and, therefore, much less dangerous. Inspected independently, each microservice had no vulnerabilities. Yet correlating them into one flow created a remote code execution vulnerability.
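A consumer behind a queue can treat every message as untrusted input, even when the queue is believed to be internal. The sketch below assumes a hypothetical JSON job format with `action` and `filename` fields; none of these names come from the Yahoo report.

```python
import json
import re
from typing import Optional

ALLOWED_ACTIONS = {"resize", "thumbnail"}  # hypothetical job types
FILENAME_RE = re.compile(r"[A-Za-z0-9_-]{1,64}\.(?:png|jpg)")


def parse_job(raw: bytes) -> Optional[dict]:
    """Validate a queued job; return a clean dict, or None if it is rejected."""
    try:
        msg = json.loads(raw)
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        return None
    if not isinstance(msg, dict):
        return None
    action = msg.get("action")
    filename = msg.get("filename")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        return None
    # fullmatch ensures nothing extra (shell metacharacters, paths) sneaks in.
    if not isinstance(filename, str) or not FILENAME_RE.fullmatch(filename):
        return None
    # Pass along only the fields that were validated.
    return {"action": action, "filename": filename}
```

Validating at the consumer means the service stays safe even if the queue is later exposed by a configuration change elsewhere.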
As shown in these examples, cloud native application vulnerabilities are not singular events, but rather are complex flows. They usually begin at a publicly open, external API or interface. The flow continues through queues, shared storage, and other ancillary services to internal microservices. The latter don't have the same security measures as do external microservices.
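The idea of a vulnerable flow can be sketched as a path search over a service call graph: starting from a public entry point, is there a sequence of calls that reaches a vulnerable internal service? The graph shape and the helper name `shortest_flow` below are assumptions made for illustration, not a description of any particular tool.

```python
from collections import deque
from typing import Dict, List, Optional


def shortest_flow(calls: Dict[str, List[str]], start: str,
                  target: str) -> Optional[List[str]]:
    """Breadth-first search for the shortest call path from start to target."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for nxt in calls.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # the target is not reachable from this entry point


# Hypothetical topology: a public gateway feeds a queue consumed by a worker.
calls = {"api-gateway": ["queue"], "queue": ["worker"], "worker": []}
flow = shortest_flow(calls, "api-gateway", "worker")
```

Here `flow` is the list of services an attacker’s input would traverse, which is the unit of analysis that matters more than any single service in isolation.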
Where should an R&D team implement input validation? The answer is on each microservice – or perhaps at every endpoint implementation. It’s no longer a matter of technology, but rather of people and processes – this is where things break. An additional point is that the combination of layers – a misconfigured container, cluster, or cloud, combined with an application layer vulnerability – can create a much riskier and more dangerous vulnerability, as you’ve now seen in the multi-layer examples.
At Oxeye, we understand that the application layer vulnerability landscape is changing. It is growing more complex, it shifts constantly, and it requires a novel approach to security testing. Stay tuned for more cloud native application security testing insights.