EASE 2021
Mon 21 - Thu 24 June 2021
Wed 23 Jun 2021 10:30 - 10:52 at Zoom - Artificial intelligence in software engineering Chair(s): Torgeir Dingsøyr

Decisions pervade the entire software development and maintenance process. Explicitly documenting these decisions helps to organize development knowledge and reduce its vaporization, which in turn helps control the development process and maintenance costs. It can also support knowledge acquisition for the project's stakeholders, and it lets developers (e.g., architects) and managers draw on past decisions to solve problems in their current projects. However, manually identifying decisions in massive textual artifacts requires considerable human effort, time, and cost, and is usually unaffordable given limited resources. To address this problem, we conducted an experiment to automatically identify decisions from textual artifacts using machine learning techniques. We created a dataset of 1,300 labelled sentences from the Hibernate developer mailing list, containing 650 decision sentences and 650 non-decision sentences, and trained machine learning models using 160 configurations of text preprocessing, feature extraction, and classification algorithms. The results show that (1) the text preprocessing method with Including Stop Words, No Stemming and Lemmatization, and No Filtering Out Sentences performs best when preprocessing posts to identify decisions; (2) the simple Bag-of-Words (BoW) model works best when extracting features to identify decisions; (3) the Support Vector Machine (SVM) algorithm gets the best result when training classifiers to identify decisions; and (4) the SVM algorithm with Including Stop Words (ISW), No Stemming and Lemmatization (NSaL), Filtering Out Sentences by Length (FOSbL), and BoW achieves the best performance of all configurations when identifying decisions from the mailing list, with a precision of 0.640, a recall of 0.932, and an F1-score of 0.759.
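As a rough illustration of two ideas in the abstract, the sketch below builds Bag-of-Words count vectors with stop words kept and no stemming or lemmatization, and checks that the reported F1-score follows from the reported precision and recall. It is not the authors' implementation: the function names and the two example sentences are hypothetical, whitespace tokenization is an assumption, and the SVM classifier itself is omitted.

```python
from collections import Counter


def bag_of_words(sentences):
    """Return (vocabulary, count vectors) for a list of sentences.

    Assumption for illustration: lowercase whitespace tokenization,
    stop words kept, no stemming or lemmatization.
    """
    tokenized = [s.lower().split() for s in sentences]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts.get(tok, 0) for tok in vocab])
    return vocab, vectors


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical sentences, not taken from the Hibernate dataset.
sentences = [
    "we will use the second level cache",
    "the cache is slow",
]
vocab, vectors = bag_of_words(sentences)

# Precision and recall as reported in the abstract.
reported_f1 = round(f1_score(0.640, 0.932), 3)
print(vocab)
print(vectors)
print(reported_f1)
```

The count vectors are the kind of input a classifier such as an SVM would be trained on. The gap between the high recall and the lower precision means the best configuration finds most decision sentences but also flags a fair number of non-decision sentences.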

Wed 23 Jun

Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

10:30 - 12:00
Artificial intelligence in software engineering EASE 2020 at Zoom
Chair(s): Torgeir Dingsøyr Norwegian University of Science and Technology
10:30
22m
Full-paper
Automatic Identification of Decisions from the Hibernate Developer Mailing List
EASE 2020
Xueying Li Wuhan University, Peng Liang Wuhan University, Zengyang Li Central China Normal University
Pre-print Media Attached
10:52
22m
Full-paper
A Bigram-based Inference Model for Retrieving Abbreviated Phrases in Source Code
EASE 2020
Abdulrahman Alatawi, Weifeng Xu University of Baltimore, Dianxiang Xu University of Missouri
11:15
22m
Full-paper
A Multinomial Naive Bayesian (MNB) network to automatically recommend topics for GitHub repositories
EASE 2020
Claudio Di Sipio University of L'Aquila, Riccardo Rubei University of L'Aquila, Davide Di Ruscio University of L'Aquila, Phuong T. Nguyen University of L'Aquila
Pre-print
11:37
22m
Other
MLCQ: Industry-relevant Code Smell Data Set
EASE 2020
Lech Madeyski, Tomasz Lewowski Wrocław University of Science and Technology
Pre-print