MSR 2023
Dates to be announced Melbourne, Australia
co-located with ICSE 2023
Tue 16 May 2023 11:50 - 12:02 at Meeting Room 109 - Development Tools & Practices II Chair(s): Banani Roy

An embedded system is a system consisting of software code, controller hardware, and input/output hardware that performs a specific task. There are several challenges when developing an embedded system. First, the code often involves hardware configurations that require domain-specific knowledge. Second, the hardware may have code usage patterns that should be followed. To overcome such challenges, we propose a framework called ArduinoProg toward the automatic generation of Arduino applications. ArduinoProg takes natural language queries as input, then outputs hardware configurations and code usage patterns of the hardware for the query. Motivated by our findings on the characteristics of real-world queries posted in the official Arduino forum, we formulate ArduinoProg as three components, i.e., Library Retriever, Hardware Classifier, and API Generator. First, Library Retriever preprocesses the input query and retrieves a set of relevant library names using either lexical matching or vector-based similarity. Second, given Library Retriever’s output, Hardware Classifier infers the hardware configuration by classifying the method definitions from the implementation files of a library into certain communication protocol classes. Third, API Generator leverages a sequence-to-sequence model to generate the code usage patterns, also based on Library Retriever’s output. Having instantiated each component of ArduinoProg with various machine learning models, we have evaluated ArduinoProg on real-world queries. The performance of Library Retriever ranges from 44.0% to 97.1% in terms of Precision@K; Hardware Classifier achieves 0.79 to 0.92 in terms of the area under the Receiver Operating Characteristic curve (AUC); API Generator yields 0.45 to 0.73 in terms of Normalized Discounted Cumulative Gain (NDCG)@K. Demo: https://youtu.be/d8E4Zjrs_KQ
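For readers unfamiliar with the ranking metrics named in the abstract, the following is a minimal sketch of Precision@K and NDCG@K using their common textbook definitions with binary relevance. The exact formulation used in the paper may differ, and the library names and ranking below are purely illustrative assumptions, not results from ArduinoProg.

```python
import math


def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that are relevant (k must be > 0)."""
    top_k = ranked[:k]
    return sum(1 for item in top_k if item in relevant) / k


def ndcg_at_k(ranked, relevant, k):
    """NDCG@K with binary relevance: DCG of the ranking divided by the ideal DCG."""
    # Each relevant item at 0-based position i contributes 1 / log2(i + 2).
    dcg = sum(
        1.0 / math.log2(i + 2)
        for i, item in enumerate(ranked[:k])
        if item in relevant
    )
    # The ideal ranking places all relevant items first.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical retrieval result for a query; names are illustrative only.
ranked = ["Servo", "Stepper", "DHT sensor library", "Wire"]
relevant = {"Servo", "Wire"}

print(precision_at_k(ranked, relevant, 2))  # 0.5
print(ndcg_at_k(ranked, relevant, 4))
```

Precision@K rewards any relevant item in the top K equally, while NDCG@K additionally rewards placing relevant items earlier in the ranking.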

Tue 16 May

Displayed time zone: Hobart

11:50 - 12:35
Development Tools & Practices II
Data and Tool Showcase Track / Industry Track / Technical Papers / Registered Reports at Meeting Room 109
Chair(s): Banani Roy University of Saskatchewan
11:50
12m
Talk
Automating Arduino Programming: From Hardware Setups to Sample Source Code Generation
Technical Papers
Imam Nur Bani Yusuf Singapore Management University, Singapore, Diyanah Binte Abdul Jamal Singapore Management University, Lingxiao Jiang Singapore Management University
Pre-print
12:02
6m
Talk
A Dataset of Bot and Human Activities in GitHub
Data and Tool Showcase Track
Natarajan Chidambaram University of Mons, Alexandre Decan University of Mons; F.R.S.-FNRS, Tom Mens University of Mons
12:08
6m
Talk
Mining the Characteristics of Jupyter Notebooks in Data Science Projects
Registered Reports
Morakot Choetkiertikul Mahidol University, Thailand, Apirak Hoonlor Mahidol University, Chaiyong Ragkhitwetsagul Mahidol University, Thailand, Siripen Pongpaichet Mahidol University, Thanwadee Sunetnanta Mahidol University, Tasha Settewong Mahidol University, Raula Gaikovina Kula Nara Institute of Science and Technology
12:14
6m
Talk
Optimizing Duplicate Size Thresholds in IDEs
Industry Track
Konstantin Grotov JetBrains Research, Constructor University, Sergey Titov JetBrains Research, Alexandr Suhinin JetBrains, Yaroslav Golubev JetBrains Research, Timofey Bryksin JetBrains Research
Pre-print
12:20
12m
Talk
Boosting Just-in-Time Defect Prediction with Specific Features of C/C++ Programming Languages in Code Changes
Technical Papers
Chao Ni Zhejiang University, Xiaodan Xu College of Computer Science and Technology, Zhejiang University, Kaiwen Yang Zhejiang University, David Lo Singapore Management University