Gophers & Bees - parsing Golang structures in memory with eBPF

What is OpenTelemetry?

OpenTelemetry is a collection of APIs, libraries, agents, and instrumentation that standardize the collection and transfer of telemetry data (metrics, logs, and traces) from your services and applications. It is an incubating project of the Cloud Native Computing Foundation (CNCF) that aims to make robust, portable telemetry a built-in feature of cloud-native software.

With OpenTelemetry, you can capture distributed traces and metrics from your application using a single set of APIs, libraries, and conventions. These can then be analyzed using observability tools like Prometheus, Jaeger, and others.

The Backstory

Oxeye is a cloud-native application security solution that detects all application risks across modern, distributed cloud-native microservices architectures. Our solution inspects the whole environment - code, infrastructure, and runtime. By gathering all aspects of a living application, we provide a cross-dimensional analysis of all application security threats.

We harness OpenTelemetry to understand the connections between services within a Kubernetes cluster. Since the environments we inspect are as close to production as possible, we can only inspect the final build artifact, which for Golang apps means dealing with compiled binaries.

While looking for a way to instrument Golang applications at runtime, we found a repository maintained by keyval.dev that does precisely this: it instruments compiled Go binaries. OpenTelemetry later adopted this project as the de facto standard for auto-instrumentation of Golang applications.

However, the Keyval implementation lacked several key features that we needed, such as connecting OpenTelemetry spans across different processes (aka context propagation) and supporting HTTP client instrumentation.

Thus we decided to implement these features ourselves and contribute our implementation to the OpenTelemetry community.

Context Propagation 101

To properly understand the connection between different processes (or services, for that matter), we need the ability to “stitch” individual events (aka spans) together to build a chain. This chain of events is what OpenTelemetry calls a trace.

For example, in HTTP spans, context propagation is achieved by sending the trace context in a request header called “traceparent”, as defined by the W3C Trace Context standard. This way, every span in the chain identifies the span that created it, resulting in a directed acyclic graph of events.
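To make the header format concrete, here is a minimal Go sketch that parses and re-renders a version-00 “traceparent” value (version, 16-byte trace ID, 8-byte parent span ID, and 1-byte flags, all hex-encoded and separated by dashes). The `TraceContext` type and the `ParseTraceparent` function are hypothetical names chosen for illustration; they are not part of the OpenTelemetry SDK, and the sketch skips some checks a full implementation would need, such as rejecting all-zero IDs.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// TraceContext holds the fields carried by a version-00 traceparent header.
// Hypothetical type, for illustration only.
type TraceContext struct {
	TraceID [16]byte
	SpanID  [8]byte
	Flags   byte
}

// Format renders the context as "00-<trace-id>-<span-id>-<flags>".
func (tc TraceContext) Format() string {
	return fmt.Sprintf("00-%s-%s-%02x",
		hex.EncodeToString(tc.TraceID[:]),
		hex.EncodeToString(tc.SpanID[:]),
		tc.Flags)
}

// ParseTraceparent decodes a version-00 traceparent value.
func ParseTraceparent(v string) (TraceContext, error) {
	var tc TraceContext
	parts := strings.Split(v, "-")
	if len(parts) != 4 || parts[0] != "00" {
		return tc, errors.New("unsupported traceparent format")
	}
	if len(parts[1]) != 32 || len(parts[2]) != 16 || len(parts[3]) != 2 {
		return tc, errors.New("unexpected field length")
	}
	if _, err := hex.Decode(tc.TraceID[:], []byte(parts[1])); err != nil {
		return tc, err
	}
	if _, err := hex.Decode(tc.SpanID[:], []byte(parts[2])); err != nil {
		return tc, err
	}
	flags, err := hex.DecodeString(parts[3])
	if err != nil {
		return tc, err
	}
	tc.Flags = flags[0]
	return tc, nil
}

func main() {
	// Example value taken from the W3C Trace Context specification.
	const header = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
	tc, err := ParseTraceparent(header)
	if err != nil {
		panic(err)
	}
	fmt.Println("round trip ok:", tc.Format() == header)
}
```

A child span keeps the trace ID, replaces the span ID with its own, and sends the result downstream, which is how the chain of spans stays connected.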

In OpenTelemetry’s instrumentations, this header is inserted into each outgoing request. While adding this header is relatively trivial in dynamic languages such as Python, it is quite challenging with compiled binaries. So how can we add this header at runtime and, more importantly, how can we do it without breaking the HTTP request?

Keyval operates in the context of an eBPF program that attaches user-space probes (uprobes) to the relevant functions so that it is notified when a new HTTP request is made. From there, we can read and write to the process memory.

Since Golang is open source, we dove into the documentation and learned that the request headers are represented in memory as Golang maps.

By understanding how the request headers are represented in memory and using the right tools, we can edit them to append the “traceparent” header. Sounds easy, right?

Diving into Golang Maps

Maps

Golang maps resemble the traditional key-value pair storage. Every map is defined by an underlying structure called “hmap” that holds the map metadata.

For the sake of simplicity, we’ll focus on the following metadata components:

Buckets

A map bucket is a structure (called “bmap”) that acts as the data container - it holds both the keys and values. Each bucket can hold up to 8 key/value pairs:

In memory example

The following screenshot shows a breakpoint on Go’s HTTP client request send method. At offset 56 from the request pointer (req), we can find a pointer to the map holding the HTTP headers of the outgoing request:

When we closely inspect the nearby memory of the headers map pointer (at 0xc00011ebd0), we can observe the “hmap” structure as follows:

Now let’s take a closer look at the pointer to the first bucket (at 0xc00014c580):

The following screenshot shows the first 16 characters stored at the first key’s pointer. In this example, we can see the “X-Requested-With” header key:

Once we better understood how the map data structure works under the hood, we started implementing the code for both the HTTP server and the HTTP client instrumentations.

HTTP Server Instrumentation

In the HTTP server eBPF program (the code that implements our logic and is loaded into the kernel), we added a utility function that receives a pointer to the request headers and returns the location of the “traceparent” header if it is present.

Knowing where the different parts of the map are placed, we use standard eBPF helper functions such as “bpf_probe_read” to iterate over all the request headers and look for the “traceparent” header. Inspecting this header is crucial for propagating the context between HTTP services (according to the W3C Trace Context specification).
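The search itself can be sketched in plain Go. This is a user-space simulation under stated assumptions, not the eBPF code: the `bucket` type is a simplified model of a `map[string][]string` bucket, `findHeader` is a hypothetical helper, and `minTopHash` reflects the pre-1.24 runtime convention that tophash values below 5 mark empty or evacuated slots. In the real program, each field access would be a “bpf_probe_read” call against the traced process.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	bucketCnt  = 8 // slots per bucket
	minTopHash = 5 // smaller tophash values mark empty or evacuated slots
)

// bucket is a simplified model of one map[string][]string bucket.
type bucket struct {
	tophash  [bucketCnt]uint8
	keys     [bucketCnt]string
	values   [bucketCnt][]string
	overflow *bucket
}

// findHeader scans every slot of every bucket, following overflow chains,
// and returns the first value stored under name. The scan is linear, so it
// never needs to compute the map's hash function.
func findHeader(buckets []bucket, name string) (string, bool) {
	for i := range buckets {
		for b := &buckets[i]; b != nil; b = b.overflow {
			for slot := 0; slot < bucketCnt; slot++ {
				if b.tophash[slot] < minTopHash {
					continue // nothing stored in this slot
				}
				if strings.EqualFold(b.keys[slot], name) && len(b.values[slot]) > 0 {
					return b.values[slot][0], true
				}
			}
		}
	}
	return "", false
}

func main() {
	buckets := make([]bucket, 1)
	buckets[0].tophash[0] = 0x7a // arbitrary non-empty marker
	buckets[0].keys[0] = "Traceparent"
	buckets[0].values[0] = []string{"00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"}

	value, ok := findHeader(buckets, "traceparent")
	fmt.Println(ok, value)
}
```

The case-insensitive comparison is a choice made for this sketch: Go's HTTP server stores header names in canonical form, so an incoming “traceparent” header typically appears in the map as “Traceparent”.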

Once the “traceparent” header is found in the incoming request headers, we use it to create a span context with the same trace id and thus associate the newly created span with the same trace.

HTTP Client Instrumentation

The HTTP client eBPF program was not previously implemented by the Keyval project, so we had to implement it from scratch. This blog post will only focus on the context propagation part. We inject a new header into the outgoing request to propagate the context in the HTTP client. You might have guessed it by now, it is the “traceparent” header. We use the same knowledge of the Go map implementation to do the following steps:

1. Find a bucket that contains fewer than 8 elements by iterating over the “tophash” part of the buckets.
2. Look for a “tophash” byte with a null value, which indicates that the cell is empty.
3. Once an empty cell is found, write a fake top hash value (“0xEE”) to that cell to represent the newly injected key.
   a. Calculating the correct top hash value is hard to achieve within the eBPF program context.
   b. This is mainly because it requires invoking the “aeshash” algorithm used by Golang maps.
4. Inject the new key and value into the appropriate location in the bucket.
5. Increment the total number of elements in the map by one, using the eBPF bpf_probe_write_user helper function and specifying the address where the number of elements is stored.
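The steps above can be simulated in user space with a short Go sketch. It is a model under stated assumptions rather than the eBPF implementation: `headerMap` and `bucket` are simplified, hypothetical stand-ins for the runtime structures, overflow buckets are ignored, and plain assignments take the place of the “bpf_probe_write_user” calls.

```go
package main

import "fmt"

const (
	bucketCnt   = 8    // slots per bucket
	emptySlot   = 0    // null tophash byte: the slot is free
	fakeTopHash = 0xEE // placeholder written instead of the real top hash
)

// bucket is a simplified model of one map[string][]string bucket.
type bucket struct {
	tophash [bucketCnt]uint8
	keys    [bucketCnt]string
	values  [bucketCnt][]string
}

// headerMap models the two hmap fields this technique touches.
type headerMap struct {
	count   int
	buckets []bucket
}

// inject writes key/value into the first free slot it finds, marks the slot
// with the placeholder top hash, and bumps the element count. It reports
// false when every slot in every bucket is already taken.
func (m *headerMap) inject(key, value string) bool {
	for i := range m.buckets {
		b := &m.buckets[i]
		for slot := 0; slot < bucketCnt; slot++ {
			if b.tophash[slot] != emptySlot {
				continue // slot already holds an entry
			}
			b.tophash[slot] = fakeTopHash
			b.keys[slot] = key
			b.values[slot] = []string{value}
			m.count++
			return true
		}
	}
	return false
}

func main() {
	m := &headerMap{count: 1, buckets: make([]bucket, 1)}
	m.buckets[0].tophash[0] = 0x7a // arbitrary marker for an existing entry
	m.buckets[0].keys[0] = "X-Requested-With"
	m.buckets[0].values[0] = []string{"XMLHttpRequest"}

	ok := m.inject("traceparent", "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	fmt.Println("injected:", ok, "count:", m.count)
}
```

An entry stored this way would likely not be found by a keyed lookup, since neither its bucket nor its top hash is derived from the real hash of the key. The sketch only demonstrates the slot selection and the bookkeeping described in the steps.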

Once this implementation was ready, we created a test application instrumented with the above-described logic. Our test environment consisted of two transactions:

From a Python application to the first Go HTTP server
From the first Go HTTP server to the second Go HTTP server

Without our additions to the Golang OpenTelemetry instrumentation, these transactions would appear as separate traces. After applying our changes, these transactions produced the following unified OpenTelemetry trace:

Contribution process

Resources

The pull requests we submitted to OpenTelemetry:
- https://github.com/open-telemetry/opentelemetry-go-instrumentation/pull/91
- https://github.com/open-telemetry/opentelemetry-go-instrumentation/pull/92

Oxeye’s blog posts on OpenTelemetry:
- https://www.oxeye.io/resources/diving-into-opentelemetrys-specs
- https://www.oxeye.io/resources/contributing-open-source-code-to-opentelemetry-cncf
