eBPF Rootkit or EDR
Introduction
Invary validates the Runtime Integrity of systems. Runtime Integrity is the assurance that your system, application, or workload is operating as intended, without inadvertent modifications or unauthorized tampering or corruption. It ensures that the code being executed and the data in use remain in their original, trusted state during runtime. When a system lacks Runtime Integrity it isn’t working as designed, so any applications or workloads running on that system can’t be trusted to have data integrity or confidentiality.
Invary’s Runtime Integrity solution starts with validation of the operating system. Lack of Runtime Integrity at this base layer that sits between hardware and running applications can indicate that a threat actor has control over it via a rootkit or other kernel level malware, altering the operating system in memory at runtime to hide their activity. Invary validates operating system integrity by taking detailed measurements of running kernel code and data from memory, then appraising those measurements against a known-good baseline.
But how can Runtime Integrity be appraised when an operating system’s behavior has been customized for a particular environment? Different computers running the same kernel may behave differently because they have different hardware peripherals installed or operating system features enabled that require dynamic kernel modules or drivers to be loaded. Invary’s Kernel Runtime Integrity solution already handles this common type of customization. A newer and more complicated form of customization is used by applications that run part of their code directly in the kernel with a technology called eBPF.
We’re enhancing our Runtime Integrity solution beyond validation of the operating system kernel to also validate eBPF programs running at the intersection of kernel and user space. We’ll examine eBPF, cover a bit about how it works and who’s using it, then take a deeper dive into two examples of systems that lack Runtime Integrity due to the use of eBPF: one running a modern endpoint detection and response (EDR) agent, and another infected with a rootkit built with eBPF.
Overview of eBPF
Initially created as a mechanism for efficient packet filtering in Linux, eBPF has evolved into a cross-platform, general-purpose framework to safely run parts of an application inside the operating system kernel, allowing applications to react to low-level system events with high efficiency and low latency. This capability makes eBPF particularly useful for applications that provide observability, security enforcement, performance monitoring, network packet processing, or application tracing.
With strong community backing and adoption by major tech companies and software vendors, eBPF is being ported to many different operating systems and execution platforms and is quickly becoming a cornerstone technology in cloud-based environments.
How it works
Prior to eBPF, if a developer wanted to run custom code in the kernel, they would develop a kernel module or driver. Kernel modules are difficult to create and maintain and must be compiled against a specific version of the kernel. Software developers find eBPF appealing because eBPF programs are much easier to create and, if written properly, can be compiled once and run in many different versions of the kernel. Because eBPF programs are precompiled and packaged with their user space applications, systems administrators find eBPF programs easier to deploy and maintain than kernel modules.
When creating an application that makes use of eBPF, a developer identifies the low-level system events that their application needs to react to, and then writes one or more eBPF programs that can be attached to the hooks in the kernel that are called when those events occur. eBPF programs are compiled independently of the main application, into eBPF bytecode. When a user space application runs, it makes system calls to load these eBPF programs into a secure virtual machine in the kernel, which verifies them for memory safety, control flow integrity, and compliance with kernel safety rules. The virtual machine often uses just-in-time compilation to convert the loaded eBPF bytecode into efficient native instructions. eBPF programs share data with user space applications and other eBPF programs via data structures called maps, which are constructed when an eBPF program is loaded.
After loading an eBPF program, an application makes another system call to attach the program to a specific kernel hook or event source. After attachment, when a callback or event is generated by that kernel hook, the eBPF program is executed. During execution, an eBPF program processes its triggering event, often invoking eBPF helper functions (predefined functions provided by the kernel that allow safe data manipulation and interaction with kernel features), and then reads or writes data to one or more maps. When an application is done with an eBPF program, it makes another system call to detach that eBPF program from its hook. When there are no more users of an eBPF object (program or map), the kernel will discard that object from memory.
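The lifecycle described above (load and verify, attach, detach, discard) can be pictured with a small toy model. This is only a sketch: `ToyKernel` and its methods are hypothetical names invented for illustration, the "verifier" is a trivial stand-in, and real kernels track references through file descriptors, links, and pins in ways this model does not capture.

```python
from dataclasses import dataclass, field


@dataclass
class ToyKernel:
    """Toy model of the eBPF object lifecycle; not a real kernel interface."""

    programs: dict = field(default_factory=dict)  # prog_id -> reference count
    hooks: dict = field(default_factory=dict)     # hook name -> attached prog_ids
    next_id: int = 1

    def load(self, bytecode: bytes) -> int:
        # Stand-in for the verifier. The only check here is that the bytecode
        # is a non-empty sequence of 8-byte instructions; real verification
        # is far more involved.
        if not bytecode or len(bytecode) % 8 != 0:
            raise ValueError("rejected by verifier")
        prog_id = self.next_id
        self.next_id += 1
        self.programs[prog_id] = 1  # the loading application holds a reference
        return prog_id

    def attach(self, prog_id: int, hook: str) -> None:
        self.hooks.setdefault(hook, []).append(prog_id)
        self.programs[prog_id] += 1  # the attachment holds a reference

    def detach(self, prog_id: int, hook: str) -> None:
        self.hooks[hook].remove(prog_id)
        self._put(prog_id)

    def close(self, prog_id: int) -> None:
        # The loading application drops its own reference.
        self._put(prog_id)

    def _put(self, prog_id: int) -> None:
        self.programs[prog_id] -= 1
        if self.programs[prog_id] == 0:
            # No users remain, so the object is discarded.
            del self.programs[prog_id]
```

In this model a program survives the loader dropping its reference for as long as it stays attached, and disappears once the last reference is released.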
Who’s using eBPF?
eBPF is used by commercial and open source applications and is leveraged by large platform and service providers for dynamic network services and monitoring:
- Common endpoint protection solutions deploy agents that use eBPF for security monitoring and enforcement on endpoints
- Monitoring platforms like Datadog and New Relic deploy agents that use eBPF for infrastructure, application performance, and security monitoring
- Kubernetes, a popular containerized application management system, is commonly used with eBPF-enabled plugins for software-defined networking, container security, monitoring, and traceability (via Cilium, Istio, etc.)
- Web sites and services like LinkedIn, Netflix, and DoorDash use eBPF for observability and monitoring of their platforms
- Large web sites like Walmart, Yahoo, Alibaba, and the New York Times use eBPF for load balancing
- DDoS protection services like Cloudflare use eBPF for load balancing, packet inspection, and modification
- Public cloud platforms like Google Cloud, Amazon Web Services, and Oracle Cloud use eBPF internally to network, monitor, and secure their infrastructure
- Open source and commercial application tracing tools based on eBPF are used by software development teams to debug and improve the performance of their applications
- A number of Linux subsystems and facilities allow extensibility or customization with eBPF. For example, some Linux distributions, including Debian and Ubuntu, ship with the systemd service manager which, by default, loads cgroup device control eBPF programs.
- eBPF has even made its way onto Android phones
You may be running services with access to eBPF in production if you:
- protect your servers with endpoint or workload security
- monitor infrastructure or applications
- make use of container-based application frameworks like Kubernetes
eBPF is an appealing target for threat actors
The rapid adoption of eBPF in critical systems and applications increases its attractiveness as a target. The bpf() system call has been part of the mainline Linux kernel since version 3.18, released in 2014, and most standard distributions ship with eBPF enabled.
eBPF programs run directly in the kernel, granting them access to low-level operations and data. While eBPF programs are sandboxed and can’t call arbitrary kernel functions, they’re given access to protected eBPF helper functions. Some of these helper functions, like bpf_probe_write_user and bpf_override_return, are quite powerful and have legitimate value when implementing use cases like runtime process protection and security policy enforcement, but can also be used by threat actors to obscure their activity by hiding malicious files, modifying log entries and otherwise concealing themselves from detection by user space security monitoring programs.
eBPF provides significant capabilities for customizing the network stack by allowing applications to inject logic into different parts of the packet processing pipeline in order to implement solutions like low-latency network proxy and DDoS protection. However, these same capabilities can be used to create hidden command and control facilities (using XDP programs to monitor ingress traffic) or to stealthily exfiltrate data (using tc programs on egress).
Because eBPF programs are small, each focused on a single type of low-level event, many applications that use eBPF load a large number of eBPF programs. A typical eBPF-based EDR agent can require 50-100 programs. Systems administrators and security analysts can inspect the list of running eBPF programs using utilities like bpftool, but it’s not easy to determine which running eBPF programs are legitimate and which may not be. Additionally, bpftool depends on a system call that can be hijacked by a threat actor (using the bpf_override_return technique described above) to omit specific programs from the listing, so it’s quite possible for a system to be running malicious eBPF programs without the knowledge of its operators.
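As a rough illustration of the inspection step, the sketch below summarizes the JSON that `bpftool -j prog show` emits, counting loaded programs by type. The sample data is hypothetical: the names are borrowed from the rootkit example later in this article, while the ids and tags are made up. As noted above, a listing produced this way can be tampered with, so a clean summary is not proof of a clean system.

```python
import json
from collections import Counter


def summarize_programs(bpftool_json: str) -> Counter:
    """Count loaded eBPF programs by type from `bpftool -j prog show` output."""
    programs = json.loads(bpftool_json)
    return Counter(prog.get("type", "unknown") for prog in programs)


# Hypothetical sample shaped like bpftool's JSON output (most fields omitted).
SAMPLE = """[
  {"id": 101, "type": "kprobe", "name": "kretprobe__64_s", "tag": "0123456789abcdef"},
  {"id": 102, "type": "kprobe", "name": "sql_db_query_co", "tag": "fedcba9876543210"},
  {"id": 103, "type": "xdp", "name": "xdp_ingress_add", "tag": "0011223344556677"},
  {"id": 104, "type": "sched_cls", "name": "egress", "tag": "8899aabbccddeeff"}
]"""

if __name__ == "__main__":
    for prog_type, count in summarize_programs(SAMPLE).most_common():
        print(f"{prog_type}: {count}")
```

On a live system, the same function could be fed the captured output of `bpftool -j prog show` run as root.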
The wide adoption of eBPF, its low-level access and powerful helper functions, along with the difficulty systems administrators face when trying to monitor dynamically loaded programs and understand their impact on the system all make eBPF an attractive target for threat actors who want to create stealthy malware.
Runtime Integrity for eBPF
When validating the Runtime Integrity of an operating system kernel, we measure millions of data points from kernel data structures, objects, function pointers, code sequences and their relationships, then appraise that measurement against a known-good baseline. If an appraisal fails, indicating that the system lacks Runtime Integrity, the generated appraisal report highlights the specific parts of the kernel (including memory locations) that have been modified. For an in-depth view of how we do that, read Runtime Integrity Measurement Overview by Invary CTO Wesley Peck.
Unlike the kernel itself, eBPF programs are dynamically loadable (and can even be created and compiled at runtime). Though they execute in kernel space and can change its behavior, eBPF programs are essentially specially constructed parts of user space applications. As such, validation of eBPF Runtime Integrity is a bit more complicated because there’s no strict baseline. Instead, we break down eBPF Runtime Integrity into the following:
- Validation of the underlying eBPF function pointers, data structures, and code segments in the kernel that make up the eBPF virtual machine, verifier, etc. – at measurement time
- Validation of the code and data integrity of running eBPF programs and maps to guarantee that they haven’t been tampered with after loading
- Validation of the helpers and functions used by running eBPF programs to determine if they change the intended operation of the kernel – whenever an eBPF program is loaded
In the examples below, we’ll look mostly at the last type of validation.
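The second type of validation, checking that a running program has not changed since it was loaded, comes down to comparing hashes. The sketch below is one way to picture it, not a description of Invary's implementation. It assumes the instruction bytes of each program can be captured both at load time and at measurement time, and the function names are hypothetical.

```python
import hashlib


def digest(instructions: bytes) -> str:
    """Return the SHA-256 hex digest of a program's instruction bytes."""
    return hashlib.sha256(instructions).hexdigest()


def record_baseline(loaded: dict) -> dict:
    """Map each program name to the hash of its instructions at load time."""
    return {name: digest(insns) for name, insns in loaded.items()}


def tampered_programs(baseline: dict, current: dict) -> list:
    """Return names of programs whose current instruction hash differs from
    the hash recorded when they were loaded."""
    return sorted(
        name
        for name, insns in current.items()
        if name in baseline and digest(insns) != baseline[name]
    )
```

Programs that appear in `current` but not in `baseline` are deliberately ignored here. In this sketch, a newly loaded program is a separate question from a modified one.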
eBPF rootkit or EDR?
In an earlier article, Rootkit or EDR, Invary CEO Jason Rogers examined two systems that lack Runtime Integrity. One was impacted by a kernel rootkit, the other was running a popular EDR agent. Both the rootkit and the EDR software modified the intended operation of the kernel, resulting in failed appraisals. In this article, we will explore a similar comparison for eBPF, analyzing eBPF-based malware and eBPF programs loaded by a widely used endpoint security solution.
The rootkit
The rootkit shown below is an eBPF-based rootkit that we’ve modified for testing purposes.
This rootkit has the following capabilities:
- Hide its user space process and binary on disk
- Hide its eBPF programs from the bpf syscall (and therefore from bpftool)
- Read and write protected operating system files like /etc/passwd
- Passively map network connections to and from the system and probe the network to discover other running services
- Run a stealthy command and control server that responds to obfuscated HTTPS requests addressed to a legitimate web service running on the system
- Exfiltrate data (file contents, in-memory data, etc.) in obscured HTTPS responses
- Attempt to recover credentials for some running services like PostgreSQL and Docker
The appraisal report generated just after the rootkit is implanted shows that 72 new eBPF programs were loaded. When an eBPF program is written, its developer has to specify which of 30+ types of program it is. The declared type of an eBPF program limits which helpers it can call and where it can be attached in the kernel. The rootkit loaded kprobe, tracepoint, xdp, and sched_cls type programs. The first two types are used to observe or modify low-level kernel events, while the latter two types allow customization of the networking stack.
Overall, the appraisal report shows that the system lacks Runtime Integrity due to 25 findings. Some of these findings indicate the use of certain helper functions that have changed the behavior of the operating system:
eBPF program 'kretprobe__64_s' (kprobe) uses the helper 'override_return' which violates Runtime Integrity by allowing system calls to be intercepted and their return values overwritten, changing the operation of the kernel.
and
eBPF program 'sql_db_query_co' (kprobe) uses the helper 'probe_write_user' which violates Runtime Integrity by allowing memory of a user space process to be overwritten - this may corrupt user memory.
Other findings indicate that the system is now running eBPF programs that customize the network stack:
eBPF program 'xdp_ingress_add' (xdp) is an eXpressDataPath program that violates Runtime Integrity by modifying the operation of the network stack.
and
eBPF program 'egress' (sched_cls) is a Traffic Control classifier program that violates Runtime Integrity by modifying the operation of the network stack.
Each of these findings can be expanded to show additional details that allow you to understand more about the program, how it was loaded, and what it does (we’ll cover that in more detail in the next section when we examine the EDR agent).
In this case, we know that the kretprobe__64_s program is involved in hiding the rootkit, that sql_db_query_co is involved in snooping credentials, and that the xdp_ingress_add and egress programs are involved in command and control and data exfiltration.
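To make the shape of these findings concrete, here is a minimal sketch of a rule-based check that flags programs by the helpers they call and by their declared type. The rule tables contain only the helpers and program types named in the findings above. The class and function names are hypothetical, and this is not presented as how the appraisal is actually implemented.

```python
from dataclasses import dataclass

# Illustrative rule tables, limited to what the findings above mention.
FLAGGED_HELPERS = {
    "override_return": "allows system call return values to be overwritten",
    "probe_write_user": "allows memory of a user space process to be overwritten",
}
FLAGGED_TYPES = {
    "xdp": "modifies the operation of the network stack",
    "sched_cls": "modifies the operation of the network stack",
}


@dataclass(frozen=True)
class Program:
    name: str
    prog_type: str
    helpers: tuple = ()


def findings(programs) -> list:
    """Return one human-readable finding per flagged helper or program type."""
    results = []
    for prog in programs:
        for helper in prog.helpers:
            if helper in FLAGGED_HELPERS:
                results.append(
                    f"{prog.name} ({prog.prog_type}) uses helper "
                    f"'{helper}': {FLAGGED_HELPERS[helper]}"
                )
        if prog.prog_type in FLAGGED_TYPES:
            results.append(
                f"{prog.name} ({prog.prog_type}): {FLAGGED_TYPES[prog.prog_type]}"
            )
    return results
```

Applied to the four rootkit programs discussed above, a check like this would produce one finding each, while a tracepoint program that only reads maps would produce none.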
The EDR agent
The EDR agent starts multiple processes, several of which load eBPF programs (we’ve slightly obfuscated program names below). The appraisal report, generated after the agent was started, shows that 104 new kprobe, tracepoint, raw_tracepoint, and lsm programs were loaded.
Like the rootkit, the EDR agent loads a large number of programs to monitor low-level system events (various kernel probes and tracepoints). It also loads a Linux Security Module (lsm) program. It does not load anything that is allowed to modify the kernel’s network packet processing pipeline.
The EDR agent also initially failed the Runtime Integrity appraisal, this time due to 24 findings, all of which look like:
eBPF program 'sys_kill_prog' (kprobe) uses the helper 'override_return' which violates Runtime Integrity by allowing system calls to be intercepted and their return values overwritten, changing the operation of the kernel.
When these findings are expanded, the appraisal report shows more context:
- Details about the eBPF program and any maps it uses
  - Program name
  - Program type
  - Loaded, translated and jited instruction counts
  - Helpers used
  - BPF id and tag, subprogram tags
  - Map ids, names, and types used
  - sha256 hashes of loaded, translated and jited instructions
- Details about the user space application that loaded it
  - Full path of on-disk binary
  - File size
  - sha256 hash
- Details about the process that loaded it
  - Command executed
  - Full command line
  - Login user and effective user ids
  - Parent process details
This information is helpful when determining whether or not you intend to allow the behavior on your system. With it, we can see that the name of the user space application matches the name of our EDR software agent. The process that loaded the eBPF program was executed from the correct path, with the correct command line, by the expected user. Finally, we can verify that its on-disk sha256 hash matches the one provided by our endpoint security provider.
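The last check, comparing the on-disk hash of the loading binary with the value published by the vendor, can be reproduced independently. The sketch below assumes the vendor publishes a hex-encoded SHA-256 digest. The function names are hypothetical, and the file is read in chunks so that large binaries are not loaded into memory at once.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of the file at `path`."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_published_hash(path: str, published_hex: str) -> bool:
    """Compare a file's digest with a published value, ignoring case and
    surrounding whitespace in the published string."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)
```

A hash computed on a compromised system is only as trustworthy as the kernel serving the file reads, which is one reason verifying the kernel's own integrity first matters.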
Looking into the 24 findings, we can verify that in each case the eBPF program was properly part of the EDR agent software. With deeper inspection, it appears that the EDR agent’s use of the potentially dangerous override_return helper is primarily about protecting its own processes from being tampered with or bypassed. Our confidence in these findings is heightened by the fact that we have also verified the integrity of the kernel itself.
If you decide that one or more eBPF programs that have violated integrity are in fact dependable, you can update the configuration of the Invary Runtime Integrity agent to allow the eBPF program to be considered part of the intended behavior of your system. In this case, we would recommend adding a rule which will allow the use of this helper in any eBPF program loaded from a user space application with this path and on-disk hash.
After updating the configuration and restarting the agent, future appraisals will pass.
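The article does not show the agent's configuration format, so the following is only a sketch of the logic such a rule implies: a finding is accepted when the helper, the loader's path, and the loader's on-disk hash all match an allow rule. Every name here (`Finding`, `AllowRule`, `unresolved`) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    program: str        # name of the eBPF program
    helper: str         # flagged helper it uses
    loader_path: str    # full path of the binary that loaded it
    loader_sha256: str  # on-disk hash of that binary


@dataclass(frozen=True)
class AllowRule:
    helper: str
    loader_path: str
    loader_sha256: str

    def permits(self, finding: Finding) -> bool:
        # All three attributes must match; a matching path alone is not enough.
        return (
            finding.helper == self.helper
            and finding.loader_path == self.loader_path
            and finding.loader_sha256 == self.loader_sha256
        )


def unresolved(findings, rules) -> list:
    """Return the findings that no allow rule permits."""
    return [f for f in findings if not any(r.permits(f) for r in rules)]
```

Keying the rule on the hash as well as the path means that a replaced or modified agent binary at the same location would no longer be covered, and its programs would fail appraisal again.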
Key takeaways
Runtime Integrity is critical for ensuring that systems, applications, and workloads operate as intended without unauthorized tampering or corruption. Without it, the integrity and confidentiality of data and processes cannot be trusted. Invary's Runtime Integrity solution addresses this issue by validating the operating system kernel and extending its capabilities to assess eBPF programs, a growing area of customization within modern systems.
This article highlights the double-edged nature of eBPF: while it enables efficient, high-performance customization for observability, security, and performance monitoring, it also poses a significant security risk due to its access to kernel-level operations. Through case studies involving an eBPF-based rootkit and a widely used EDR agent, it demonstrates how Runtime Integrity violations can occur and how Invary's solution identifies and appraises these instances. By providing detailed appraisals, Invary enables organizations to distinguish legitimate from malicious eBPF programs, reinforcing security without compromising operational functionality.