Dec 22, 2021

How insecure application tracing and telemetry may lead to sensitive data and PII leakage

Ron Vider
CTO & Co-Founder

Cloud-Native Tracing Overview 

Nowadays, applications are more complex than ever. Development pace is high, and the need for agility is enormous. Production releases ship several times a day, and different teams work on them simultaneously. By leveraging a microservices architecture, developers often break up complex applications into smaller, loosely coupled components, which are deployed in a distributed manner, oftentimes as orchestrated containers.

While distributed applications provide many benefits, they also pose tough challenges, especially when attempting to debug code and figure out how the entire application operates. 

One of the common techniques used by development teams for debugging distributed applications is tracing and collecting telemetry data across the different microservices. Several popular open-source solutions help solve this problem, including OpenTelemetry, Pixie, Jaeger, and OpenZipkin.

In many cases, trace information contains metadata about internal communication within the application, such as API endpoints, URLs, message queues, and third-party services. Normally, PII and other sensitive data should not be logged. However, insecure implementation and use of telemetry and tracing may introduce significant risk to the organizations that rely on such solutions.

When Tracing Backfires

Recently, Oxeye’s research team discovered several scenarios in which sensitive data was leaked through tracing and telemetry collection within cloud-native applications. To determine whether this was an issue unique to specific applications or a widespread phenomenon, the team went on to research and analyze the prevalence of such data leakage in the wild.

Our research revealed that many organizations that use popular application tracing platforms leave these platforms open to the public Internet, with no authentication mechanism to prevent unauthorized access. To make matters worse, in many cases these publicly accessible tracing platforms contained sensitive application and user data.

Among the many deployments we found open to the Internet was one that belonged to a leading online payment services company. Deeper investigation demonstrated that, by leveraging the tracing and telemetry data, a malicious user could bypass the authentication mechanism, log in to the online payment service platform as other legitimate users, access their personal data, and perform actions on their behalf.

Technical Information

During the investigation of the Jaeger deployment belonging to the online payment services company mentioned above, our team discovered that malicious users were able to access external and internal microservice definitions, API endpoint data, and information about connections between microservices and SQL database statements.

The discovered tracing information exposed a service with several endpoints, one of which was responsible for communication with the graph database and contained sensitive technical information. The application used the data to verify the refresh token (a unique token used to obtain a renewed access token) for authenticated users.

Further investigation revealed that an attacker may create a random user account, delete all of its session tokens (implemented as HTTP cookies), harvest other legitimate users’ refresh tokens from the tracing data, and use them to generate session tokens and authenticate as other legitimate users.  
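To make the risk concrete from a defender's point of view, the sketch below audits a batch of exported spans for attributes that look like they carry secrets. It is a minimal illustration, not a description of Oxeye's tooling or of Jaeger's data model: the span layout (plain dictionaries with `name` and `attributes`), the key fragments, the regular expression, and the sample spans are all assumptions made for this example.

```python
import re

# Hypothetical attribute-name fragments that suggest a secret; tune per application.
SENSITIVE_KEY_PARTS = ("token", "password", "secret", "authorization", "cookie")

# Hypothetical pattern for secrets embedded in a value, e.g. "refresh_token: '...'".
SECRET_IN_VALUE = re.compile(r"(?i)(token|password|secret)\w*\s*[:=]")


def find_leaks(spans):
    """Return (span name, attribute key) pairs that look like leaked secrets."""
    findings = []
    for span in spans:
        for key, value in span.get("attributes", {}).items():
            # Flag by attribute name, e.g. a captured Cookie header.
            key_hit = any(part in key.lower() for part in SENSITIVE_KEY_PARTS)
            # Flag by content, e.g. a database statement that embeds a token.
            value_hit = isinstance(value, str) and bool(SECRET_IN_VALUE.search(value))
            if key_hit or value_hit:
                findings.append((span.get("name", "<unnamed>"), key))
    return findings


# Made-up spans, loosely modeled on the scenario described above.
example_spans = [
    {"name": "GET /health", "attributes": {"http.status_code": 200}},
    {
        "name": "verify-refresh-token",
        "attributes": {
            "db.statement": "MATCH (s:Session {refresh_token: 'abc123'}) RETURN s",
            "http.request.header.cookie": "session=deadbeef",
        },
    },
]

for span_name, attribute_key in find_leaks(example_spans):
    print(f"possible secret in span {span_name!r}: attribute {attribute_key!r}")
```

A check like this is only a heuristic: it will miss secrets stored under unexpected names and may flag harmless attributes, so it complements, rather than replaces, restricting access to the tracing backend.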

After we reported our findings, the payment provider blocked unauthenticated public access to the Jaeger dashboard, resolving the root cause of this issue.

Oxeye Pro Tips for Mitigation

  1. Avoid exposing application tracing and telemetry management interfaces to the public Internet.
    Important note:
    Oxeye's research team noticed that Jaeger's official GitHub repository contains an all-in-one Kubernetes deployment example, which creates an external load balancer. In some scenarios, the example’s load balancer exposes the UI to the public Internet. Further investigation showed an enormous number of real-world instances that follow this bad practice.
  2. Always use strong authentication for access to application tracing and telemetry management interfaces. When available, use multi-factor authentication.
  3. Map the data sent to tracing and telemetry platforms, and scrub sensitive data from logs, application traces, and telemetry. For example, organizations that use OpenTelemetry should use the AttributeProcessor capability to scrub sensitive data. More information on why and how to use OpenTelemetry processors can be found in a blog post from Splunk.
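The scrubbing idea in the last tip can be sketched in a few lines. The example below is not the OpenTelemetry API or Collector configuration. It is a plain-Python illustration of the delete-or-hash approach that an attribute-scrubbing step typically takes, and the key names in the policy are hypothetical choices for this example.

```python
import hashlib

# Hypothetical policy: attribute keys to drop entirely and keys to hash.
DELETE_KEYS = {"http.request.header.cookie", "http.request.header.authorization"}
HASH_KEYS = {"enduser.id"}


def scrub_attributes(attributes):
    """Return a copy of the attributes with sensitive keys deleted or hashed."""
    cleaned = {}
    for key, value in attributes.items():
        if key in DELETE_KEYS:
            continue  # drop the attribute so it never reaches the backend
        if key in HASH_KEYS:
            # Hashing keeps values correlatable across spans without storing them raw.
            cleaned[key] = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
        else:
            cleaned[key] = value
    return cleaned


# Made-up span attributes for illustration.
raw = {
    "http.method": "POST",
    "http.request.header.cookie": "session=deadbeef",
    "enduser.id": "alice@example.com",
}
print(scrub_attributes(raw))
```

Deleting is the safer default for credentials such as cookies and tokens. Hashing is better suited to identifiers that still need to be correlated across spans, with the caveat that an unsalted hash of a guessable value (such as an email address) can be reversed by brute force.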

In the meantime, I welcome you to see Oxeye in action - https://www.oxeye.io/get-a-demo