Auburn Program Understanding Lab, Auburn University

Advancing Cybersecurity through Binary Program Analysis, Harnessing the Power of Reverse Engineering, Deep Learning, and Virtual Reality.

Projects

Binary Program Analysis

The binary program analysis team has two primary efforts. The first aims to establish ground truth for program disassembly, control flow graph representation, and pointer analysis. The second uses machine learning models to perform type inference on disassembly representations. Both efforts build very low-level program understanding from the ground up.

Oxide – A Modular Binary Analysis Framework

The Oxide binary analysis framework was initially developed by Dr. Mulder at Sandia National Labs and is now open source. This framework is designed to allow scientists with different sets of domain expertise to more easily collaborate and focus on rapid prototyping and research in the area of binary program analysis.

Immersive Analytics for Program Understanding

We are developing virtual reality tools to aid reverse engineering and program understanding. The first research area is improving comprehension by making it easier to track connections between different representations. The second is visualizing program transformations as they happen in compiler optimization passes. Both focus on helping human experts grapple with the complex data needed to ask high-level questions about a program, form hypotheses, and develop a richer understanding of program structure.

Supply Chain Triage

The supply chain triage research effort is focused on helping analysts rapidly assess the software present on a device of interest in order to identify supply chain attacks. Many modern devices contain entire computing platforms, often with a general purpose OS like Linux, and hundreds of programs. The triage effort tries to eliminate as many programs as possible from consideration before passing a small number of “programs of interest” to a human analyst for final evaluation.
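
As a rough illustration of the elimination step, the sketch below filters a device's programs against a set of known-good hashes and returns only the remainder for human review. The use of SHA-256 allow-listing, and the names `sha256_hex`, `triage`, and `known_good_hashes`, are assumptions made for this example; the actual triage criteria used in this effort are not described here.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a program image as a hex string."""
    return hashlib.sha256(data).hexdigest()


def triage(programs: dict[str, bytes], known_good_hashes: set[str]) -> list[str]:
    """Return the paths of programs whose hash is NOT in the known-good set.

    `programs` maps a file path to the raw bytes of that program.
    Hash allow-listing is a hypothetical stand-in for the real filters.
    """
    return sorted(
        path
        for path, data in programs.items()
        if sha256_hex(data) not in known_good_hashes
    )


if __name__ == "__main__":
    # Toy firmware image: two files match the allow-list, one does not.
    firmware = {
        "/bin/ls": b"stock ls build",
        "/bin/sh": b"stock sh build",
        "/usr/bin/updater": b"unrecognized build",
    }
    allow_list = {sha256_hex(b"stock ls build"), sha256_hex(b"stock sh build")}
    for path in triage(firmware, allow_list):
        print("program of interest:", path)
```

In practice a single filter like this would be one stage among several; the point is only that each stage shrinks the set of programs an analyst has to examine.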

Ground Truth in Disassembly

Measuring the quality and effectiveness of binary analysis tools comes with numerous challenges, particularly regarding the accuracy of the initial disassembly used for further analysis. The task of generating a correct disassembly is recognized as a difficult problem, as various common code structures like jump tables, object hierarchies, and pointer manipulation can pose difficulties for these tools. Moreover, malware creators employ obfuscation techniques, packing, and code-rewriting to thwart analysis tool chains. Several tool chains, such as IDA Pro, Ghidra, angr, Radare, Ddisasm, and others, attempt to address these challenges with varying degrees of success.

Comparing the effectiveness of these tools presents real difficulties. Historically, such comparisons have been done by examining the instructions produced by each tool. However, this approach has limitations due to the legitimate choices a tool may make in determining which instructions to include in the disassembly. We propose a solution in the form of truth boundaries, termed "min-truth" and "max-truth," to establish a baseline for measuring disassembly accuracy and further discuss the strengths and limitations of this approach.
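
One way to picture how such boundaries could be used is sketched below. It assumes, purely for illustration, that min-truth is the set of instruction addresses every correct disassembly must contain and that max-truth is the set of addresses a disassembly may legitimately contain, with min-truth a subset of max-truth. These interpretations and the function name `score_disassembly` are assumptions of this example, not definitions taken from the work described above.

```python
def score_disassembly(tool_addrs, min_truth, max_truth):
    """Compare a tool's instruction addresses against assumed truth bounds.

    missing  : addresses in min_truth that the tool did not report
    spurious : addresses the tool reported that fall outside max_truth
    """
    tool = set(tool_addrs)
    missing = set(min_truth) - tool
    spurious = tool - set(max_truth)
    return {
        "missing": sorted(missing),
        "spurious": sorted(spurious),
        "within_bounds": not missing and not spurious,
    }


if __name__ == "__main__":
    # Hypothetical addresses: 0x1005 is optional (for example, padding that
    # a tool may or may not decode), so it is in max_truth only.
    min_truth = {0x1000, 0x1003}
    max_truth = {0x1000, 0x1003, 0x1005}

    print(score_disassembly([0x1000, 0x1003, 0x1005], min_truth, max_truth))
    print(score_disassembly([0x1000, 0x1008], min_truth, max_truth))
```

Under these assumptions, two tools that disagree only on optional instructions both score as within bounds, which is the situation a plain instruction-by-instruction comparison handles poorly.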

Areas Of Interest

Program Analysis

Reverse Engineering

Virtual Reality

Machine Learning

Cybersecurity

Adversarial Analysis