Nov 15, 2022

Remote Code Execution in Spotify’s Backstage via vm2 Sandbox Escape (CVSS Score of 9.8)

Gal Goldstein
Security Researcher
Yuval Ostrovsky
Architect
Daniel Abeles
Head of Research

The Oxeye research team gained remote code execution in Backstage, Spotify’s open source, CNCF-incubated project, by exploiting a sandbox escape in the third-party vm2 library. We reported this RCE vulnerability through Spotify’s bug bounty program. The Backstage team responded rapidly, patching it in version 1.5.1 and rating the vulnerability with a CVSS score of 9.8.

Potential vulnerability impact – An unauthenticated threat actor can execute arbitrary system commands on a Backstage application by exploiting a vm2 sandbox escape in the Scaffolder core plugin.

What is Backstage?

With more than 19,000 stars on GitHub, Backstage, a CNCF-incubated project by Spotify, is one of the most popular open source platforms for building developer portals. It restores order to your microservices and infrastructure, enabling your product teams to ship high-quality code quickly without compromising autonomy. Backstage unifies all of your infrastructure tooling, services, and documentation to create a streamlined development environment from end to end. In addition to Spotify, it’s used by a variety of organizations, including American Airlines, Netflix, Splunk, Fidelity Investments, and Epic Games.

Backstage can hold integration details for many of an organization’s systems, such as Prometheus, Jira, Elasticsearch, and others. Successful exploitation therefore has critical implications for any affected organization, since it can compromise those services and the data they hold.

Out of the box, Backstage includes:

  • Backstage Software Catalog for managing all your software (microservices, libraries, data pipelines, websites, ML models, etc.)
  • Backstage Software Templates for quickly spinning up new projects and standardizing your tooling with your organization’s best practices
  • Backstage TechDocs for making it easy to create, maintain, find, and use technical documentation, using a "docs like code" approach
  • A growing ecosystem of open source plugins that further expand its customizability and functionality

Every research project we spin up starts with mapping potential inputs to an application. A key Backstage feature that caught our attention is its software templates. 

Backstage templates

Backstage has three parts:

Core – Base functionality built by core developers in the open source project

App – A Backstage app instance that is deployed and tweaked. It ties together core functionality with additional plugins. It’s built and maintained by app developers, usually by a productivity team at a company.

Plugins – Additional functionality to make your Backstage app more useful. Plugins can be specific to a company or open sourced/reusable.

Software templates let an application owner quickly spin up new projects, components, and plugins in their organization according to its best practices and guidelines. 

Each template is defined by a YAML file that resembles a Kubernetes resource. It contains various fields that define how the component should behave. For example:

By design, the data provided to the message parameter is a template rendered by Nunjucks, Backstage’s template rendering engine. Nunjucks is a templating engine for JavaScript applications, inspired by Jinja2, the Python template engine.

Evaluating user-provided strings in a template engine can be dangerous since it exposes the application to template-based attacks. The severity of such an attack depends on the features the templating engine offers. Here, Nunjucks is extremely powerful but has been notorious for being unsafe. For example, this 2016 blog post describes popular techniques to bypass restrictions that Nunjucks imposes.

Abusing the template engine

In reviewing how this risk is confined, we noted that user-controlled templates rendered by Nunjucks outside of an isolated environment can be manipulated to run shell commands. To mitigate this risk, Backstage started using the vm2 JavaScript sandbox library.

In an earlier research paper, Oxeye found a vm2 sandbox escape vulnerability that results in remote code execution (RCE) on the hosting machine. Once we had sorted out that payload, we wondered: could we exploit it in Backstage?

Exploiting the vm2 sandbox escape

When attempting this exploit in our previous blog post, we sought to control properties outside of the sandbox context. When our basic payload executed, we invoked the getThis method on the CallSite objects in the array passed to our prepareStackTrace implementation. (CallSite objects describe the frames of the call stack captured when an error is raised.)

This time we initially tried a simple approach: storing the vm2 sandbox escape payload in a Backstage template YAML and then trying to trigger it. But calling the getThis method returned undefined and caused our exploit to fail.

To understand why our exploit failed, we found the following in the V8 documentation:

To maintain restrictions imposed on strict mode functions, frames that have a strict mode function and all frames below (its caller, etc.) are not allowed to access their receiver and function objects. For those frames, getFunction() and getThis() returns undefined

 https://v8.dev/docs/stack-trace-api#customizing-stack-traces

To understand which of the functions in our call stack run in strict mode, we added the following code snippet to the functions that appeared in the call stack:
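
The original snippet is not reproduced in this text. One common way to perform such a check, shown here as an assumption rather than the code used in the research, is to look at the value of this inside a plain function call: it is undefined in strict mode and the global object otherwise. The sketch builds its two sample functions with the Function constructor so that their strictness does not depend on the file they are pasted into; probe, sloppyFn, and strictFn are hypothetical names.

```javascript
// The probe is meant to be pasted inline: a nested function inherits the
// strictness of the function body that contains it.
const probe = 'return (function () { return this === undefined; })();';

// Function-constructor bodies are sloppy unless they opt in with "use strict".
const sloppyFn = new Function(probe);
const strictFn = new Function('"use strict"; ' + probe);

console.log('sloppyFn runs in strict mode?', sloppyFn());
console.log('strictFn runs in strict mode?', strictFn());
```

Pasting the same inline probe into each function on a call stack reports, per function, whether that frame is strict.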

When we ran our payload again with this snippet in place, we could see that the earliest function executed in strict mode within the vm2 sandbox was the “renderString2” function.

The following screenshot shows the “renderString2” function executed in strict mode:

To overcome this limitation, we decided to override the “renderString2” function with our own implementation in an attempt to force it to run in non-strict mode.

Another behavior helped us gain code execution: when an error was raised while rendering the template, Backstage called the renderString2 function a second time. This let us split our payload into two stages. First, we overwrote the renderString2 function with our own implementation. Then we triggered an error, causing Backstage to call the function again, this time invoking our overridden implementation.
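
The two-stage idea can be modeled with a toy example. This is a sketch of the control flow only, not Backstage code: renderTwiceOnError, toyEnv, and the strings involved are hypothetical, and the first stage is simulated directly instead of coming from a rendered template.

```javascript
// Hypothetical stand-in for a caller that renders again after a failure.
function renderTwiceOnError(env, template) {
  try {
    return env.renderString(template);
  } catch {
    // Second attempt: env.renderString is looked up again, so any
    // replacement installed during the first attempt is the one that runs.
    return env.renderString(template);
  }
}

const callLog = [];
const toyEnv = {
  renderString(template) {
    callLog.push('original');
    // Stage 1 (simulated): swap in a replacement, then fail on purpose.
    toyEnv.renderString = (t) => {
      callLog.push('replacement');
      return `rendered by replacement: ${t}`;
    };
    throw new ReferenceError('triggerException is not defined');
  },
};

const output = renderTwiceOnError(toyEnv, 'hello');
console.log(callLog, output);
```

The point of the model is that the retry does not reuse a saved reference to the original function, so whatever was installed before the error is what runs the second time.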

This next screenshot shows Backstage calling the renderTemplate function (that calls renderString2) twice in the event of an error:

Our final payload was:

Here is an explanation:

  1. Access the constructor property of the range function, which provides access to the sandboxed Function constructor. We use that to create an immediately invoked function expression (IIFE) containing our exploit code (within Nunjucks’s rendering engine context).
  2. Override the renderString function with our own implementation so that we run in the context of the VM instead of Nunjucks. This is essential to exploiting the vulnerability (here, env is the Nunjucks reference).
  3. Trigger an error by invoking an undefined function (triggerException). This causes NunjucksWorkflowRunner.render to call the SecureTemplater.render function a second time, as seen in the following code: https://github.com/backstage/backstage/blob/2c54d6446fefe30e8e6d81ce7029db74e594b9cb/plugins/scaffolder-backend/src/scaffolder/tasks/NunjucksWorkflowRunner.ts#L147.
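
Step 1 relies on a general JavaScript property: the constructor property of any function points to Function, which compiles a string into code. The harmless sketch below shows only that property in plain Node.js, outside Nunjucks and vm2. The range defined here is a hypothetical stand-in for the helper of the same name, not the Nunjucks implementation.

```javascript
// Hypothetical stand-in for a helper function that a template engine exposes.
function range(stop) {
  return Array.from({ length: stop }, (_, i) => i);
}

// Any function leads back to the Function constructor of its realm.
const FunctionCtor = range.constructor;
console.log(FunctionCtor === Function);

// Function compiles source text; the IIFE inside runs as soon as it is called.
const compiled = FunctionCtor('return (function () { return 6 * 7; })();');
const answer = compiled();
console.log(answer);
```

This is why exposing even one ordinary function to user-controlled template expressions is enough to reach arbitrary JavaScript inside the rendering context.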

From here, the exploit runs in the context of the vm2 sandbox.

  1. Save a copy of the original Error class.
  2. Override the original Error class with a new empty class.
  3. Implement the prepareStackTrace function under the newly created Error class.
  4. Create an instance of the saved Error class and access its stack property. This triggers the built-in Node error module to call our implementation of the prepareStackTrace function.
    https://github.com/nodejs/node/blob/main/lib/internal/errors.js#L140
  5. Access the CallSite object supplied in the traces array, on which we invoke the getThis function. This gets us an object created outside the sandbox, allowing us to execute an arbitrary system command.
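
Steps 3 and 4 build on V8’s stack trace API: when an error’s stack is formatted, V8 calls Error.prepareStackTrace, if defined, with the error and an array of CallSite objects. The benign sketch below shows only that hook in plain Node.js, with no sandbox involved. captureCallSites, inner, and outer are hypothetical names, and this is not the payload.

```javascript
// Ask V8 for structured CallSite objects instead of the formatted stack string.
function captureCallSites() {
  const previous = Error.prepareStackTrace;
  // Whatever the hook returns becomes the value of error.stack.
  Error.prepareStackTrace = (_error, callSites) => callSites;
  try {
    return new Error('probe').stack;
  } finally {
    // Restore the previous hook so later errors format normally.
    Error.prepareStackTrace = previous;
  }
}

function inner() {
  return captureCallSites();
}

function outer() {
  return inner();
}

const frames = outer();
// Each CallSite describes one frame, innermost first.
const names = frames.slice(0, 3).map((site) => site.getFunctionName());
console.log(names);
```

Each element of the array describes one frame, which is why the strictness of the functions on the stack determines what a CallSite is allowed to reveal.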

When we looked at the call stack as we ran our payload, none of the functions up to our overridden renderString2 function was declared strict. When we accessed index 2 of the trace array, we got a CallSite object created outside the sandbox. This enabled us to execute arbitrary code.

Exploiting Backstage instances in the wild

Once we had successfully executed our payload locally, we attempted to assess the potential impact of such a vulnerability if exploited in the wild.

We started by running a simple query for the Backstage favicon hash in Shodan, which returned more than 500 Backstage instances exposed to the internet. We then tried to assess how they could be exploited remotely without authenticating to the target Backstage instance.

The first thing we noticed was that Backstage is deployed by default without an authentication mechanism or an authorization mechanism, which allows guest access. Some of the public Backstage servers accessible to the internet did not require any authentication. 

Additionally, this warning message caught our attention when we reviewed Backstage authentication documentation:

With this in mind, we next tried setting up a local Backstage instance that requires authentication, following tutorial guidelines originally maintained by Backstage. The result was authentication enforced only on the client side; requests flowing to the backend API were not verified for authentication or for authorization.

When trying to send requests directly to the backend API server of some of the internet-exposed instances, we found a handful did not require any form of authentication or authorization. Thus we concluded the vulnerability could be exploited without authentication on many instances. 

Disclosure timeline

Takeaways

If you’re using Backstage in your organization, we strongly recommend updating it to the latest version to defend against this vulnerability as soon as possible. 

Moreover, if you’re using a template engine in your application, choose it with security in mind. Powerful template engines are extremely useful but might pose a risk to your organization.

The root of any template-based VM escape is gaining JavaScript execution rights within the template. By using "logic-less" template engines such as Mustache, you can avoid introducing server-side template injection vulnerabilities. Separating the logic from the presentation as much as possible can greatly reduce your exposure to the most dangerous template-based attacks. For more information about mitigating template-based vulnerabilities, see PortSwigger’s technical advisory.
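
To illustrate the difference, a logic-less renderer only substitutes values and never evaluates expressions. The sketch below is a deliberately tiny Mustache-like function written for this example (renderLogicLess is a hypothetical name, not the Mustache library’s API, and it omits features such as sections and HTML escaping). Anything that is not a plain key of the supplied data is either left as literal text or rendered empty.

```javascript
function renderLogicLess(template, data) {
  // Only bare identifiers such as {{name}} are recognized; no expressions.
  return template.replace(/\{\{\s*([A-Za-z_]\w*)\s*\}\}/g, (_match, key) => {
    // Own-property lookup only, so inherited members such as `constructor`
    // are never reached.
    return Object.prototype.hasOwnProperty.call(data, key) ? String(data[key]) : '';
  });
}

const greeting = renderLogicLess('Hello, {{ name }}!', { name: 'Backstage' });
const hostile = renderLogicLess('{{ range.constructor("return 1")() }} {{constructor}}', {});
console.log(greeting);
console.log(JSON.stringify(hostile));
```

Because the renderer has no expression evaluator, the hostile input is treated as inert text instead of code.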

And if you use Backstage with authentication, enable it for both the frontend and the backend.

About Oxeye

Oxeye provides a cloud-native application security solution designed specifically for modern architectures. We enable you to quickly identify and resolve all application-layer risks as an integral part of the software development lifecycle (SDLC). Our seamless, comprehensive, and effective solution ensures touchless assessment, focuses on exploitable risks, and provides actionable remediation guidance. Built for DevOps and AppSec teams, Oxeye helps shift security left while accelerating development cycles, reducing friction, and eliminating risks. Visit www.oxeye.io to learn more.