When analyzing any corpus of programs, care must be taken to ensure that the corpus is truly representative of the entire ecosystem, otherwise the observed features might be far from reality. A naive approach is to increase the size of the dataset, thus diminishing the chance that an interesting feature will be left out. However, such approach may easily lead to overemphasis on features that are mostly present, but not frequently executed.
To tackle this issue, the code duplication patterns in the corpus and the ecosystem must be understood and correlated to the actual frequency of the code in the wild.
Wed 18 JulDisplayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna change
14:00 - 15:30
Saam Barati AppleFile Attached
|Building a Node.js Benchmark: Initial Steps|
Petr Maj Czech Technical University, François Gauthier Oracle Labs, Celeste Hollenbeck Northeastern University, USA, Jan Vitek Northeastern University, Cristina Cifuentes Oracle LabsFile Attached
|A Micro-Benchmark for Dynamic Program Behaviour|