Emulators offer transparency and control over the emulated subject and enable precise observation (O_{t_i}) of its internal operations along many dimensions. Furthermore, multiple instances of an emulator can be created easily, enabling horizontal scaling of the fuzzing process.

However, running the firmware of embedded devices in an emulator presents several challenges, which are described in detail by Wright et al. (2021). Most notable for fuzzing are the fidelity of the emulation and the effort needed to adapt an emulator to a specific target.

Figure 2 shows an architecture model for embedded systems. While the application logic is contained in the application layer, potential operating systems are located within the system software layer. However, there are embedded systems without a dedicated operating system, often referred to as bare-metal systems. In that case, the system software layer may contain a bootloader, drivers, and Hardware Abstraction Layer (HAL) modules. Executing the application within an emulator can be realized either by replacing the hardware layer with a system emulator or by moving only the application into a user-mode emulator.

Fig. 2 Embedded systems architecture model according to Noergaard (2012)

In this section, the most notable approaches are presented that enable embedded fuzzing in an emulator.

User mode emulation fuzzing

User applications that are built for running in an operating system can potentially be executed very easily in an emulator, because of the well-defined operating system interfaces at the application layer. User mode emulation enables fuzzing of binary-only applications with coverage guidance.

It is also possible to transfer user applications from embedded systems (in particular Linux-based ones) into a user-mode emulator such as QEMU to perform coverage-guided fuzzing, independently of the instruction set architecture. However, accesses to the hardware that embedded applications normally rely on must be handled adequately by the emulator.

All investigated fuzzing frameworks in this category use a custom kernel for this purpose, also depicted in Fig. 3. The thick boxes depict the parts that originate from the actual target.

Fig. 3 Scheme of fuzzing applications in a user-mode emulator

Chen et al. (2016) developed the Firmadyne framework, which allows for automated dynamic analysis of Linux-based embedded firmware images. It extracts the root filesystem from a binary firmware image and utilizes a custom kernel to run the image within the QEMU full-system emulator. With this setup, dynamic analysis of the user applications in the firmware can be performed, which is demonstrated by providing a set of known exploits that can be tried on the emulated device. Even though the full-system mode of QEMU is used, Firmadyne should be considered to enter at the application layer, because it deploys its own customized kernel and only the user space applications from the firmware are executed. The custom kernel partially compensates for missing hardware emulation, for example, by providing an emulated NVRAM that embedded devices often use.

The Firmadyne framework is enhanced by Kim et al. in FirmAE (Kim et al. 2020). They claim that the Firmadyne framework could only get 16.28% of their tested set of firmware images up and running for dynamic analysis. To solve this problem, they introduced heuristics to configure boot parameters, kernel parameters, network interfaces, and file systems correctly. With these modifications, they were able to automatically run 79.36% of the aforementioned set of firmware images within QEMU.

FirmFuzz (Srivastava et al. 2019) is an automated introspection and analysis framework for IoT firmware. It is designed for embedded devices that offer user interfaces through a webpage and are based on Linux. The QEMU system emulator is set up with a customized kernel in conjunction with fake peripheral drivers to compensate for potential missing hardware emulation. A headless browser is used to communicate with the device automatically through a virtual network interface to find user interfaces. After the static analysis of the firmware, a generation-based fuzzer is set up. Seed input data is generated, using the contextual information that is gathered from the firmware image. The target is monitored for faults by the modified Linux kernel within the emulator.

FIRM-AFL (Zheng et al. 2019) is based on AFL and Firmadyne. The idea is to speed up fuzzing within QEMU by letting the target user process run in user mode for as long as possible. Only when necessary is the user process transferred to the full-system emulator of the appropriate device hardware. As a result, the overhead of full-system emulation is largely avoided. The authors state that this mechanism speeds up the fuzzing process by a factor of ten. However, the target device must run a POSIX-compatible operating system, and its hardware must be emulatable by QEMU.

Transferring embedded applications from Linux-based devices into an emulator by providing a customized kernel can be successful in some cases, in particular when the target application does not rely on special hardware peripherals. Nevertheless, there remain many embedded systems to which this does not apply, and which demand a different approach for emulation-based fuzzing.

Full-system emulation fuzzing

Once an embedded system can be emulated adequately, code coverage, fault states, and other meta information of the execution can be obtained easily. The next section is about methods that enable full-system emulation of embedded devices. For a correct emulation of embedded firmware, all hardware peripheral accesses must be treated in the emulator.

Peripheral emulation

A hardware access manifests itself in read and write operations on the hardware address space. Additionally, hardware interrupts are a mechanism to let hardware peripherals trigger code areas from the firmware. Implementing software equivalents of hardware peripherals and providing them on their expected locations in the hardware address space is a way to enable emulation. When all peripherals from a target device can be emulated, an unmodified firmware image can be executed and fuzzing can be enabled with little effort, as depicted in Fig. 4.
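
The idea of providing software equivalents of peripherals at their expected locations in the hardware address space can be illustrated with a minimal sketch. It is not taken from any of the cited tools: the `ToyUart` register layout, the base address, and all class names are assumptions chosen for illustration only.

```python
from typing import List, Tuple


class Peripheral:
    """Base class for a software model of a memory-mapped peripheral."""

    def read(self, offset: int) -> int:
        return 0

    def write(self, offset: int, value: int) -> None:
        pass


class ToyUart(Peripheral):
    """Hypothetical UART: offset 0x0 is the data register, 0x4 the status register."""

    DATA, STATUS = 0x0, 0x4
    RX_READY = 0x1

    def __init__(self, rx_bytes: bytes) -> None:
        self.rx = list(rx_bytes)  # bytes waiting to be read by the firmware
        self.tx: List[int] = []   # bytes the firmware has written

    def read(self, offset: int) -> int:
        if offset == self.STATUS:
            return self.RX_READY if self.rx else 0
        if offset == self.DATA and self.rx:
            return self.rx.pop(0)
        return 0

    def write(self, offset: int, value: int) -> None:
        if offset == self.DATA:
            self.tx.append(value & 0xFF)


class MmioBus:
    """Routes accesses in the hardware address space to the mapped model."""

    def __init__(self) -> None:
        self.regions: List[Tuple[int, int, Peripheral]] = []

    def map(self, base: int, size: int, dev: Peripheral) -> None:
        self.regions.append((base, size, dev))

    def _find(self, addr: int) -> Tuple[int, Peripheral]:
        for base, size, dev in self.regions:
            if base <= addr < base + size:
                return addr - base, dev
        raise LookupError(f"unmapped MMIO access at {addr:#x}")

    def read(self, addr: int) -> int:
        offset, dev = self._find(addr)
        return dev.read(offset)

    def write(self, addr: int, value: int) -> None:
        offset, dev = self._find(addr)
        dev.write(offset, value)


UART_BASE = 0x4000_0000  # assumed base address, not from a real data sheet
bus = MmioBus()
uart = ToyUart(b"hi")
bus.map(UART_BASE, 0x100, uart)

# What a firmware polling loop would do, expressed as bus accesses.
received = []
while bus.read(UART_BASE + ToyUart.STATUS) & ToyUart.RX_READY:
    received.append(bus.read(UART_BASE + ToyUart.DATA))
bus.write(UART_BASE + ToyUart.DATA, ord("!"))
```

In a fuzzing setup, the bytes handed to `ToyUart` would come from the fuzzer, and an access outside every mapped region points to a peripheral that still lacks a model.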

Fig. 4 Scheme of fuzzing embedded applications in a full-system emulator

The QEMU system mode is a popular full-system emulator, which already provides configurations for several microcontrollers and peripherals and supports a large variety of architectures. TriforceAFL (Hertz and Newsham 2021) combines AFL with QEMU and enables emulation-based coverage-guided fuzzing for targets that can be emulated with QEMU. If the desired target device is not supported, implementing and configuring it can be very laborious and requires deep knowledge of the hardware.

Herdt et al. (2020) present a different solution for emulating the whole hardware of an embedded system. They apply libFuzzer to a SystemC virtual prototype. SystemC is defined as IEEE-1666 standard (Group S-SCSW 2011) and provides a set of C++ libraries to define virtual prototypes. Virtual prototypes are models of the entire hardware system and allow an accurate simulation. They are an established way of testing systems during their development in the industry. Fuzzing is performed on the virtual hardware by using a fully booted state of the system, which is preserved by a fork-server mechanism. However, the complete system must be described in SystemC, which requires deep insights into the SUT and can again require a lot of manual work.

Clements et al. (2020) present HALucinator to address the problem of emulating peripherals by using the HAL as an entry point. First, it locates HAL functions in the firmware through binary analysis. Second, it intercepts the execution of these HAL functions and mimics their expected behavior instead. A handler for each HAL function must be implemented manually once. Besides enabling correct emulation, HALucinator can intercept functions that provide random values and replace them with deterministic functions, which can render fuzzing more efficient.
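
The interception step can be sketched as a table that maps function addresses to replacement handlers. This is a simplified illustration, not HALucinator's actual interface: `ToyCpu`, the handler names, and the addresses are hypothetical, and locating the HAL functions through binary analysis is assumed to have happened already.

```python
from typing import Callable, Dict, List, Tuple


class ToyCpu:
    """Stand-in for emulator state: argument registers, return value, RAM."""

    def __init__(self) -> None:
        self.args: List[int] = [0, 0, 0, 0]
        self.ret: int = 0
        self.mem = bytearray(64)


Handler = Callable[[ToyCpu], None]


class HalInterceptor:
    """Replaces calls to known HAL functions with high-level handlers."""

    def __init__(self) -> None:
        self.hooks: Dict[int, Tuple[str, Handler]] = {}
        self.calls: List[str] = []

    def hook(self, addr: int, name: str, handler: Handler) -> None:
        self.hooks[addr] = (name, handler)

    def on_call(self, cpu: ToyCpu, target: int) -> bool:
        """Return True if the call was handled and the original must be skipped."""
        entry = self.hooks.get(target)
        if entry is None:
            return False
        name, handler = entry
        handler(cpu)
        self.calls.append(name)
        return True


def make_uart_receive(data: bytes) -> Handler:
    """Handler that copies pending input into the buffer named by the arguments."""
    pending = bytearray(data)

    def handler(cpu: ToyCpu) -> None:
        buf, length = cpu.args[0], cpu.args[1]
        chunk = bytes(pending[:length])
        del pending[:length]
        cpu.mem[buf:buf + len(chunk)] = chunk
        cpu.ret = len(chunk)

    return handler


def make_deterministic_rng(seed: int) -> Handler:
    """Handler that replaces a hardware random source with a fixed sequence."""
    state = [seed]

    def handler(cpu: ToyCpu) -> None:
        state[0] = (state[0] * 1103515245 + 12345) & 0x7FFFFFFF
        cpu.ret = state[0]

    return handler


hal = HalInterceptor()
# Addresses and names are made up; a real setup would take them from analysis.
hal.hook(0x0800_1000, "hal_uart_receive", make_uart_receive(b"fuzz"))
hal.hook(0x0800_2000, "hal_rng_get", make_deterministic_rng(seed=1))

cpu = ToyCpu()
cpu.args[0], cpu.args[1] = 8, 4  # buffer address and length
intercepted = hal.on_call(cpu, 0x0800_1000)
```

Because the random source is replaced by a seeded sequence, the same fuzz input leads to the same execution, which keeps coverage feedback stable across runs.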

Kim et al. (2019) proposed RVFuzzer for detecting input validation bugs in robotic vehicles. Robotic vehicles are cyber-physical systems managed in real time by a microcontroller, which needs to control actuators, process sensor data, and react to control commands. A careful validation of incoming control commands is therefore required, especially if they are received over an unencrypted broadcast medium. RVFuzzer tries to detect (sequences of) control commands that bring the robotic vehicle into an unstable state. To this end, the control program is connected to a physical simulation of the robotic vehicle, and input commands as well as environment parameters are mutated. Instabilities are detected by observing whether the presumed state in the control program deviates too much from the state in the simulation.
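
The deviation check at the end of this description can be sketched as follows. The threshold, the window of consecutive samples, and the function name are assumptions for illustration; the concrete criterion used by RVFuzzer is not given here.

```python
from typing import Sequence


def is_unstable(presumed: Sequence[float], simulated: Sequence[float],
                threshold: float, window: int) -> bool:
    """Flag instability when the presumed and the simulated state differ by
    more than `threshold` for `window` consecutive samples."""
    run = 0
    for p, s in zip(presumed, simulated):
        run = run + 1 if abs(p - s) > threshold else 0
        if run >= window:
            return True
    return False
```

Requiring several consecutive samples is one way to avoid flagging short transients that a controller would normally recover from.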

Peripheral proxying

When deep knowledge about the SUT is missing, hardware accesses of the firmware must be treated differently. An alternative solution is to forward each hardware access to the real device. To this end, a proxy application is introduced that routes register values and triggered interrupts between the actual hardware and the emulation, as shown in Fig. 5.

Fig. 5 Scheme of embedded fuzzing with peripheral proxying

PROSPECT (Kammerstetter et al. 2014) uses a TCP/IP connection to forward hardware accesses, Avatar (Zaddach et al. 2014) uses a debugging connection, and SURROGATES (Koscher et al. 2015) routes hardware accesses through a dedicated FPGA to the actual hardware.

Regarding mobile system drivers, Talebi et al. (2018) developed Charm, which enables fuzzing of device drivers by forwarding hardware peripheral accesses through a USB-based connection. Since the drivers need to be modified for this method, Charm works only with open-source drivers.

Avatar has a successor, Avatar² (Muench et al. 2018), which is not only intended for hardware access rerouting, but more for orchestrating different frameworks to enable dynamic analysis. Its flexibility is proven by Muench et al. (2018).

They enable coverage-guided fuzzing on a wide variety of devices by using PANDA (Dolan-Gavitt et al. 2015) as the emulator, Avatar² (Muench et al. 2018) for forwarding non-emulatable hardware accesses, and Boofuzz (Pereyda 2017) as the fuzzer. Furthermore, they uncover the issue of silent memory corruptions that can occur in embedded devices without Memory Management Units (MMUs) or operating systems that take care of memory accesses. These are memory corruptions that do not result in a crash of the device upon occurrence and are therefore not easily observable. To detect silent memory corruptions, they present heuristics that can be applied to an emulator, regardless of the manner of hardware access treatment. When using these heuristics, all occurring memory corruptions of a device can be discovered.

Peripheral proxying offers a solution for emulating an embedded device without excessive implementation effort. However, the forwarding of peripheral accesses to the real hardware can present a bottleneck, depending on the number of requests to the hardware. Additionally, manual configuration and setup of the proxying mechanism is required.
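
The forwarding mechanism and its cost can be sketched with a memory layer that serves RAM accesses locally and sends every peripheral access over a link. The message format, `FakeDeviceLink`, and the address range are assumptions for illustration; in a real setup the link would be a TCP/IP, debugger, or FPGA channel to the physical device.

```python
import struct
from typing import Dict

OP_READ, OP_WRITE = 0, 1
REQUEST = "<BII"  # opcode, address, value (hypothetical wire format)


class FakeDeviceLink:
    """In-process stand-in for the channel to the real hardware."""

    def __init__(self) -> None:
        self.registers: Dict[int, int] = {}

    def transact(self, request: bytes) -> bytes:
        op, addr, value = struct.unpack(REQUEST, request)
        if op == OP_WRITE:
            self.registers[addr] = value
            return struct.pack("<I", 0)
        return struct.pack("<I", self.registers.get(addr, 0))


class ProxyingMemory:
    """Serves RAM locally and forwards peripheral accesses over the link."""

    def __init__(self, link: FakeDeviceLink, periph_base: int, periph_size: int) -> None:
        self.link = link
        self.periph = range(periph_base, periph_base + periph_size)
        self.local: Dict[int, int] = {}
        self.round_trips = 0  # each forwarded access costs one round trip

    def read(self, addr: int) -> int:
        if addr in self.periph:
            self.round_trips += 1
            reply = self.link.transact(struct.pack(REQUEST, OP_READ, addr, 0))
            return struct.unpack("<I", reply)[0]
        return self.local.get(addr, 0)

    def write(self, addr: int, value: int) -> None:
        if addr in self.periph:
            self.round_trips += 1
            self.link.transact(struct.pack(REQUEST, OP_WRITE, addr, value & 0xFFFFFFFF))
        else:
            self.local[addr] = value


mem = ProxyingMemory(FakeDeviceLink(), 0x4000_0000, 0x1000)
mem.write(0x4000_0004, 0xABCD)  # forwarded to the (fake) device
mem.write(0x2000_0000, 7)       # handled locally
```

The `round_trips` counter makes the bottleneck visible: firmware that polls a status register in a tight loop pays the link latency on every iteration.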

Peripheral modeling

Where implementing virtual hardware requires too much effort and peripheral proxying is too slow for fuzzing, automated hardware modeling can be a solution. The idea is to learn how to respond to hardware accesses such that the firmware continues its execution. The peripheral model is thereby directly connected to the MMIO address space and can be supported by the fuzzer, as depicted in Fig. 6.

Fig. 6 Scheme of embedded fuzzing with peripheral modeling

Gustafson et al. (2019) present a semi-automated re-hosting framework called PRETENDER. They solve the modeling of hardware peripherals by first observing and recording the behavior of the real device with Avatar². As a result, not only accesses to the hardware are recorded, but also the timing and order of interrupts. Next, a rather complex step follows in which the MMIO registers are categorized and a State Approximation model is initialized. This should allow for smart responses to hardware accesses of the firmware. Finally, human interaction is needed to define the entry point of the fuzzing data. The authors state that PRETENDER allows for a survivable execution, which can be just sufficient for a dynamic analysis of the device.

Spensky et al. (2021) refined this approach with Conware, which can also learn hardware peripheral behavior by first recording interactions between the firmware and the real hardware peripheral and subsequently extracting models for each of them. The extracted models can then be used for a full-system emulation. In contrast to PRETENDER, Conware claims to be more generic and can even model peripheral behavior that has not been recorded directly.

Another hardware-agnostic approach for embedded fuzzing is presented by Feng et al. (2020). Their framework P²IM responds to each peripheral access (a read from the MMIO address space) with input data from the fuzzer. To this end, the MMIO registers are categorized into Control Registers, Status Registers, Data Registers, and Control-Status Registers by observing how the firmware accesses them. Depending on the category, interaction with a register is treated differently. Most important is the treatment of Data Registers, where P²IM directly injects input data from the fuzzer. The fuzzer itself thereby models all peripheral input generically, which removes the need to find and choose the correct input vector for the target. Interrupt emulation is implemented pragmatically by sequentially firing one interrupt per 1000 executed basic blocks. When the initially supplied fuzz input buffer is exhausted, the execution is terminated and the code coverage is fed back to the fuzzer. The explorative nature of the fuzzer is used to successively improve the hardware peripheral modeling. The framework allows existing fuzzers to be added as a drop-in component, offering AFL as the default. However, peripherals that use DMA are not modeled by P²IM, as this would require insights into the internal design of the target device.
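
The three mechanisms described above (fuzzer bytes on Data Register reads, one interrupt per 1000 basic blocks, termination on exhausted input) can be sketched as follows. This is an illustration under assumptions, not the framework's code: the register categories are supplied by hand instead of being inferred from access patterns, and letting all non-data registers return the last written value is a simplification made for this sketch.

```python
from enum import Enum, auto
from typing import Dict, List


class RegKind(Enum):
    CONTROL = auto()
    STATUS = auto()
    DATA = auto()
    CONTROL_STATUS = auto()


class InputExhausted(Exception):
    """Raised when the fuzz input is used up; the run ends at this point."""


class FuzzedPeripherals:
    BLOCKS_PER_IRQ = 1000  # one interrupt per 1000 executed basic blocks

    def __init__(self, fuzz_input: bytes, kinds: Dict[int, RegKind],
                 irqs: List[int]) -> None:
        self.data = fuzz_input
        self.pos = 0
        self.kinds = kinds        # register address -> category (given here)
        self.irqs = irqs          # interrupt numbers fired in turn
        self.last_written: Dict[int, int] = {}
        self.blocks = 0
        self.next_irq = 0
        self.fired: List[int] = []

    def mmio_read(self, addr: int) -> int:
        if self.kinds.get(addr) is RegKind.DATA:
            if self.pos >= len(self.data):
                raise InputExhausted
            value = self.data[self.pos]
            self.pos += 1
            return value
        # Simplification of this sketch: echo the last write, default 0.
        return self.last_written.get(addr, 0)

    def mmio_write(self, addr: int, value: int) -> None:
        self.last_written[addr] = value

    def on_basic_block(self) -> None:
        self.blocks += 1
        if self.irqs and self.blocks % self.BLOCKS_PER_IRQ == 0:
            irq = self.irqs[self.next_irq % len(self.irqs)]
            self.next_irq += 1
            self.fired.append(irq)
```

An emulator would call `mmio_read` and `mmio_write` from its memory hooks and `on_basic_block` from its block hook, then report coverage to the fuzzer once `InputExhausted` is raised.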

For automatic emulation of DMA input channels in P²IM, Mera et al. (2020) present the drop-in solution DICE. It observes the behavior of running firmware in the emulator and recognizes candidates for DMA input channels heuristically. In principle, it searches for pointers to the internal RAM that are written to memory-mapped I/O registers. The authors claim that, during their tests, DICE did not produce any false positive categorization and successfully detected 21 out of 22 actively used DMA input channels. With negligible overhead, it enables fuzzing of firmware that processes DMA input without further hardware knowledge.
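
The core check described above, a RAM pointer being written to a memory-mapped register, can be sketched in a few lines. The address ranges are assumptions (they follow a common microcontroller memory layout), and the sketch covers only this single check, not the full set of heuristics in DICE.

```python
from typing import Dict, Iterable, Tuple

RAM = range(0x2000_0000, 0x2002_0000)   # assumed internal RAM range
MMIO = range(0x4000_0000, 0x6000_0000)  # assumed peripheral register range


def dma_pointer_candidates(writes: Iterable[Tuple[int, int]]) -> Dict[int, int]:
    """Map each MMIO register to the last RAM pointer written to it.

    `writes` holds (address, value) pairs observed while the firmware runs.
    """
    candidates: Dict[int, int] = {}
    for addr, value in writes:
        if addr in MMIO and value in RAM:
            candidates[addr] = value
    return candidates


observed = [
    (0x4002_0010, 0x2000_0100),  # RAM pointer written to a register: candidate
    (0x4002_0014, 0x10),         # small constant, for example a transfer length
    (0x2000_0000, 0x2000_0004),  # pointer stored in RAM: not an MMIO write
]
found = dma_pointer_candidates(observed)
```

A buffer identified this way could then be filled with fuzzer data, so that firmware reading from the DMA target region consumes fuzz input.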

Johnson et al. (2021) present a more targeted peripheral modeling approach with Jetset. In this case, an analyst manually defines a goal address in the firmware that should be reached, and Jetset tries to derive the necessary hardware peripheral responses to reach this address with symbolic execution. For instance, the transition from kernel space to user space can be used as such a goal address. The explicit goal address allows Jetset to mitigate path explosion during symbolic execution.

Zhou et al. (2021) enable peripheral modeling in their tool µEmu by mixing symbolic and concrete execution to calculate appropriate responses to hardware accesses. First, all hardware peripheral dependent inputs are treated symbolically. To avoid path explosion, symbolically calculated values are cached and reused during concrete execution. When invalid execution states are reached, the responsible cached values and the state itself are marked as invalid and different paths are taken by future symbolic executions. This way, the hardware peripherals are enhanced iteratively.

Scharnowski et al. (2020) refine the mechanism of P²IM. Instead of putting a memory-mapped register into a category, their framework Fuzzware handles each individual access to a memory-mapped register by additionally considering the program counter on each access. On the first occurrence of an access, the emulator is reset to the instruction right before accessing the memory-mapped register and Dynamic Symbolic Execution (DSE) is used to determine whether and how the value affects the further execution. Accordingly, the individual memory-mapped register access is assigned just enough random input bits to ensure that all dependent branches can be reached. This leads to a minimal consumption of input bits from the fuzzer while fuzzing the whole peripheral interaction. The authors claim that DMA could also be modeled with further effort, but this is considered out of scope of their work.
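
The effect of per-access models on input consumption can be sketched as follows. The models are written by hand here, whereas the framework derives them with DSE; `AccessModel`, its fields, and the fallback to a full 32-bit read are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


class BitStream:
    """Hands out fuzz input bit by bit, least significant bit first."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.bitpos = 0

    def take(self, nbits: int) -> int:
        if self.bitpos + nbits > 8 * len(self.data):
            raise EOFError("fuzz input exhausted")
        value = 0
        for i in range(nbits):
            byte = self.data[self.bitpos // 8]
            value |= ((byte >> (self.bitpos % 8)) & 1) << i
            self.bitpos += 1
        return value


@dataclass(frozen=True)
class AccessModel:
    bits: int       # number of input bits the firmware actually depends on
    shift: int = 0  # position of those bits within the register value


# Keyed by (program counter, register address), not by register alone.
Models = Dict[Tuple[int, int], AccessModel]


def mmio_read(models: Models, stream: BitStream, pc: int, addr: int,
              width_bits: int = 32) -> int:
    model = models.get((pc, addr))
    if model is None:
        return stream.take(width_bits)  # no model yet: consume a full value
    return stream.take(model.bits) << model.shift


REG = 0x4000_0000  # hypothetical register address
models: Models = {
    (0x100, REG): AccessModel(bits=1, shift=5),  # only bit 5 is ever tested
    (0x200, REG): AccessModel(bits=0),           # value is never used
}
stream = BitStream(b"\x01")
first = mmio_read(models, stream, 0x100, REG)
second = mmio_read(models, stream, 0x200, REG)
```

In this example the same register costs one input bit at one code location and none at another, which is the saving that a per-register categorization cannot express.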

Sandbox emulation fuzzing

In cases where a full-system emulation is not feasible, lightweight sandbox emulation can be a solution. Thereby, the binary code is executed from a manually chosen point with a manually created context. The idea is to fuzz functions that do not communicate with peripherals at all, meaning that the hardware peripherals do not need to be emulated. This technique is almost hardware-independent since only a simulator for the respective instruction set is required. Fuzzing a function from a binary firmware file within a sandbox can be realized as shown in Fig. 7.

Fig. 7 Scheme of embedded fuzzing through sandbox emulation

Miasm is a reverse engineering tool to analyze, modify, and partially emulate binary programs. It offers features such as assembling and disassembling for various architectures, emulation with Just-In-Time (JIT) compilation, and symbolic execution. In combination with Python-AFL, Miasm can be used to perform fuzzing (Guedou 2017). To this end, a sandbox is created with Miasm, the input data is mapped to appropriate memory addresses, and the registers are initialized correctly. This technique is mainly interesting for penetration testers who reverse engineer binaries and want to fuzz interesting functions in this way. If the source code is available, it is easier to fuzz hardware-independent functions by compiling them into a user application and using a general-purpose fuzzer.

The Unicorn CPU Simulator (Nguyen and Dang 2015) was used by Voss (2021) in a similar way.

Maier et al. (2020) present BaseSAFE, where they also used the Unicorn CPU Simulator to fuzz different layers of a smartphone baseband chip on manually selected target functions and manually created memory contexts. The downside of these sandbox emulation fuzzing approaches is the constrained, manual selection of the target function and manual creation of the execution context.

A semi-automated approach to supplying an execution context to the target code is presented by Harrison et al. (2020) with their tool PartEMU. They present the steps that allow experts to set up and configure an emulator to enable dynamic analysis of TrustZones from embedded systems. To this end, they explain when hardware and software components should be emulated or reused, and how specific emulation stubs can be implemented. Nevertheless, developing such an emulation-based execution context can involve huge manual effort and requires expert knowledge.

Ruge et al. (2020) present Frankenstein, a highly specialized framework for fuzzing wireless modem firmware in an emulated environment. They run the firmware of a Broadcom Bluetooth chip within QEMU user mode. Through sophisticated reverse engineering, about 100 locations in the code have been determined, where the execution needs to be redirected and substituted manually. This hooking is required to ensure correct emulation of the firmware. With this setup, they were able to fuzz the Bluetooth modems of popular mobile phones from Apple and Samsung and unveiled several security problems. However, the setup is highly customized and requires a lot of manual effort to adapt it to other embedded firmware.

An automated sandbox-based fuzzing tool for IoT Firmware is presented by Gui et al. (2020) with FIRMCORN. First, the firmware image is disassembled and detected functions are rated based on the memory operations they contain and the use of predetermined sensitive functions, such as read, strcpy, and execve. For high rated functions, a context dump (memory and register values) at the starting point of the function is gathered from the actual device. This allows specific fuzzing of potential vulnerable functions within the CPU emulator Unicorn. An automated mechanism detects crashes of the emulator, which result from missing emulated hardware, and skips these crashing functions during further virtual execution. They state that the tool is developed for Linux-based devices only, but it should be possible to extend it to further platforms.
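
The rating step can be sketched as a score over per-function statistics. The weights, the `FunctionInfo` record, and the linear formula are assumptions for illustration; only the two inputs (memory operations and calls to sensitive functions such as read, strcpy, and execve) come from the description above.

```python
from dataclasses import dataclass, field
from typing import List

SENSITIVE = {"read", "strcpy", "execve"}  # examples named in the text


@dataclass
class FunctionInfo:
    name: str
    memory_ops: int                       # memory operations found by disassembly
    callees: List[str] = field(default_factory=list)


def score(fn: FunctionInfo, mem_weight: float = 1.0, call_weight: float = 5.0) -> float:
    """Hypothetical rating: weighted memory operations plus sensitive calls."""
    sensitive_calls = sum(1 for callee in fn.callees if callee in SENSITIVE)
    return mem_weight * fn.memory_ops + call_weight * sensitive_calls


def rank(functions: List[FunctionInfo], top_n: int) -> List[FunctionInfo]:
    """Return the `top_n` highest rated functions as fuzzing targets."""
    return sorted(functions, key=score, reverse=True)[:top_n]


functions = [
    FunctionInfo("parse_request", 10, ["strcpy", "read"]),
    FunctionInfo("write_log", 12, []),
    FunctionInfo("spawn_helper", 1, ["execve"]),
]
targets = rank(functions, top_n=2)
```

For the selected functions, a context dump from the real device would then provide the memory and register values needed to start execution in the CPU emulator.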

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

