A Comparative Study on the Accuracy and the Speed of Static and Dynamic Program Classifiers (CC 2025 - Main Conference)

Who

Anderson Faustino da Silva, Jeronimo Castrillon, Fernando Magno Quintão Pereira

Track

CC 2025 Main Conference

When

Sat 1 Mar 2025 16:00 - 16:30 (GMT-08:00, Pacific Time) at Acacia A - Binary Analysis and Hardware I. Chair(s): Sara Achour

Abstract

Classifying programs based on their tasks is essential in fields such as plagiarism detection, malware analysis, and software auditing. Traditionally, two approaches are employed for this classification: static classifiers analyze program syntax, while dynamic classifiers observe program execution. Although dynamic analysis is commonly regarded as more precise, it is often considered impractical due to its high overhead, leading the research community to largely dismiss it. In this paper, we revisit these assumptions by comparing static and dynamic analyses using the same classification representation: opcode histograms. We show that dynamic histograms, generated from the instructions actually executed, are only marginally (4-5%) more accurate than static histograms in non-adversarial settings. However, if an adversary is allowed to obfuscate programs, the accuracy of the dynamic classifier is twice that of the static classifier, because the dynamic classifier never observes dead code. Obtaining dynamic histograms with a state-of-the-art Valgrind-based tool incurs an 85x slowdown; however, once we account for the time needed to produce the representations for static analysis of executables, the overall slowdown drops to 4x, a result significantly lower than previously reported in the literature.
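
The difference between the two kinds of opcode histogram can be illustrated with a minimal Python sketch. This is not the authors' tooling: the toy program, its opcode names, and the per-instruction execution counts are all hypothetical, chosen only to show why dead code appears in a static histogram but not in a dynamic one.

```python
from collections import Counter

# Hypothetical toy program: (opcode, number of times executed at runtime).
# A count of 0 marks dead code, such as instructions an obfuscator might insert.
PROGRAM = [
    ("load", 1),
    ("add", 10),   # loop body, executed 10 times
    ("cmp", 10),
    ("jmp", 10),
    ("xor", 0),    # dead code
    ("mul", 0),    # dead code
    ("ret", 1),
]


def static_histogram(program):
    """Count every opcode present in the program text, executed or not."""
    return Counter(opcode for opcode, _ in program)


def dynamic_histogram(program):
    """Count opcodes weighted by how often each instruction actually ran."""
    hist = Counter()
    for opcode, executions in program:
        if executions > 0:
            hist[opcode] += executions
    return hist


if __name__ == "__main__":
    print("static: ", dict(static_histogram(PROGRAM)))
    print("dynamic:", dict(dynamic_histogram(PROGRAM)))
```

In this sketch the static histogram includes the dead `xor` and `mul` instructions, while the dynamic histogram omits them and weights the loop body by its execution count.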

Anderson Faustino da Silva

State University of Maringá

Brazil

Jeronimo Castrillon

TU Dresden

Germany

Fernando Magno Quintão Pereira

Federal University of Minas Gerais