Investigate Complex Firmware: Techniques for Analyzing Hard-to-Analyze Systems

Date

2025

Publisher

University of Georgia

Abstract

Embedded systems play a critical role in modern technology: they perform specific tasks and communicate over networks to achieve broader goals. They nevertheless remain highly vulnerable because of insufficient defenses and the widespread use of open-source software. Many vendors strip application binaries before embedding them in firmware, but current IoT exploits show that this measure is not enough. External firmware analysis is critical because vendors often neglect comprehensive security testing, yet stripped binaries pose significant challenges for vulnerability detection techniques such as emulation fuzzing. This dissertation presents cmdFuzz, a novel approach to enhancing command-line argument emulation fuzzing on stripped binaries. Its design integrates three key modules: (1) determining the entry point of a stripped binary for the emulator fuzzer, (2) identifying the memory that holds command-line arguments so that it can serve as fuzzing input, and (3) constructing a command-line argument grammar to guide the fuzzing mutator. Implemented using the Qiling framework with unicorn-AFL integration, cmdFuzz employs code pattern recognition in main() for argument validation and an algorithm that floods memory with repetitive patterns to locate command-line argument storage. The approach also leverages large language models (LLMs) such as GPT-3.5, drawing on their extensive knowledge of open-source software, to generate complex argument grammars. Evaluations on stripped firmware demonstrate the effectiveness of cmdFuzz: it enables fuzzing across various test cases, and the results validate the efficacy of LLM-generated grammars.
Furthermore, this dissertation explores the impact of code obfuscation on the effectiveness of fuzz testing. While obfuscation can safeguard proprietary software from reverse engineering, it alters control and data flow, which significantly affects the outcomes of fuzz testing. We present obfFuzz, the first empirical study to systematically investigate the challenges of fuzz testing obfuscated software. Using AFL++, we compare fuzzing performance on unobfuscated and obfuscated binaries of real-world programs, including pdfinfo, exif, tiffinfo, and md2roff. Our study reveals that (1) control flow obfuscation increases complexity, reducing code coverage by 60%, (2) data flow obfuscation leads to inefficient mutations, extending crash discovery time by 70%, (3) obfuscation expands binary size and slows execution, impairing fuzzing efficiency, and (4) obfuscation transformations introduce excessive complexity and potential bugs, further degrading fuzzing effectiveness.
This dissertation advances the state of the art in vulnerability detection within embedded systems by tackling fuzz testing constraints in stripped binaries and exploring the challenges associated with fuzzing obfuscated binaries. The results highlight the need for tailored fuzz testing strategies that address obfuscation complexities while ensuring that security assessments remain valid for complex binaries.
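
The memory-flooding idea mentioned for cmdFuzz (filling the command-line arguments with repetitive patterns and then searching memory for them) can be illustrated with a small, self-contained sketch. This is not the dissertation's implementation: the function names, the marker length, the base address, and the simulated memory region are all hypothetical, and a real setup would read the region from the emulator rather than build it by hand.

```python
"""Sketch: locate command-line argument storage by flooding it with patterns."""

MARKER_LENGTH = 16  # assumed length; long enough to be unlikely by chance


def make_marker(index: int, length: int = MARKER_LENGTH) -> bytes:
    """Return a repetitive marker for argument `index` (0 -> b'AAAA...')."""
    if not 0 <= index < 26:
        raise ValueError("this sketch supports at most 26 arguments")
    return bytes([ord("A") + index]) * length


def find_all(memory: bytes, needle: bytes) -> list[int]:
    """Return every non-overlapping offset at which `needle` occurs."""
    offsets = []
    position = memory.find(needle)
    while position != -1:
        offsets.append(position)
        position = memory.find(needle, position + len(needle))
    return offsets


def locate_arguments(memory: bytes, base_address: int, count: int) -> dict[int, list[int]]:
    """Map each argument index to the addresses where its marker was found."""
    return {
        index: [base_address + off for off in find_all(memory, make_marker(index))]
        for index in range(count)
    }


def build_demo_memory(count: int) -> bytes:
    """Simulate a stack region holding a program name and flooded arguments."""
    region = bytearray(256)
    strings = b"prog\x00" + b"".join(make_marker(i) + b"\x00" for i in range(count))
    region[0x80:0x80 + len(strings)] = strings
    return bytes(region)


if __name__ == "__main__":
    base = 0x7FFF0000  # hypothetical base address of the dumped region
    found = locate_arguments(build_demo_memory(2), base, 2)
    for index, addresses in found.items():
        print(index, [hex(a) for a in addresses])
```

Giving each argument its own repeated byte makes the markers easy to tell apart, so the addresses found for each index point to the buffer a fuzzer would overwrite with mutated input. Whether cmdFuzz uses per-argument markers of this kind is not stated in the abstract.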

Keywords

Binary Analysis, Stripped Binary, Fuzzing, Obfuscated Binary, Emulation

Copyright owned by the Saudi Digital Library (SDL) © 2026