IASESE Advanced School, ESEIW 2023
TITLE: Analytical Observational Studies in Software Engineering
DURATION: Full day
DESCRIPTION:
Running Mining Software Repositories (MSR) studies has become popular over the years. However, because MSR studies use observational data, they lack the level of control required to identify causality. Consequently, papers report correlational results at best and state that causality cannot be revealed. This is a major limitation of MSR studies. Other disciplines have addressed this issue by developing strategies that, by replacing control with choice, come closer to revealing causality in observational studies. The objective of the school is to introduce the participants to Analytical Observational Studies (AOS), a type of study that can also be applied in the context of MSR. After the school, the participants should be able to conduct an AOS following the steps and techniques taught.
The school will explore the concepts of correlation and causation, along with the requirements that a study should meet to be able to identify causality. Special attention will be given to the concept of extraneous variables, as it is central to running an AOS. The main content of the school is how to design and analyze an AOS.
The school consists of lectures and an illustrative example that run in parallel. The lectures introduce the concepts; in the illustrative example, the participants are divided into smaller groups and apply the introduced concepts to a real MSR study. The goal is to give the participants the ability to apply AOS in their own research.
Aims and Objectives
The school aims to bring the study of cause-effect relationships into the Mining Software Repositories (MSR) field by teaching participants to run Analytical Observational Studies (AOS). This aim will be achieved by means of three objectives:
- O1. Understand the difference between correlation and causation, and its implications.
- O2. Learn how AOS can be used in MSR.
- O3. Apply AOS methods to MSR studies.
The school will have an important practical component.
Outline of the covered topics
The school will be organized in the following way:
1. Introduction
Correlation is not causation. The school starts by exploring the differences between them and discussing the principles that must be met to establish a causal relationship (temporal precedence, association, and non-spuriousness). It then recalls what controlled experiments in SE are, their distinctive features (control and causality), and the mechanisms they use to obtain causality: independent variable manipulation, local control, randomization, and data replication (several data points are needed, since a single data point is not generalizable).
Finally, some motivational examples are presented that compare MSR studies with controlled experiments and illustrate the current problems when studying causality in MSR studies. These examples will be used to present the aim of the school, along with its learning goals.
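The non-spuriousness principle can be illustrated with a small simulation. The following Python sketch is a hypothetical example and is not taken from the school's material: project size (an assumed extraneous variable) drives both the number of code smells and the number of defects, so the two are strongly correlated overall even though smells have no effect on defects in the simulated data. All variable names and numbers are invented for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Simulated (hypothetical) projects: size influences both smells and defects,
# but smells do NOT influence defects.
rows = []
for _ in range(2000):
    large = rng.random() < 0.5
    smells = 10 * large + rng.gauss(0, 1)
    defects = 10 * large + rng.gauss(0, 1)
    rows.append((large, smells, defects))

# Association ignoring the extraneous variable.
overall = pearson([r[1] for r in rows], [r[2] for r in rows])

# Association within each level of the extraneous variable.
within = {
    size: pearson(
        [r[1] for r in rows if r[0] == size],
        [r[2] for r in rows if r[0] == size],
    )
    for size in (False, True)
}

print("overall correlation:", round(overall, 2))
print("within small projects:", round(within[False], 2))
print("within large projects:", round(within[True], 2))
```

In this simulation the overall correlation is high, while the correlation within each size group is close to zero. The association between smells and defects is therefore spurious: it is produced entirely by project size.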
2. Observational Studies
Other empirical disciplines have developed methods to study causal relationships from observational data. We will show the different types of studies used in medicine and introduce the example of epidemiology, which uses a type of observational study called analytical when looking for cause-effect relationships in situations where controlled experiments cannot be run.
Next, we will introduce a well-known success story of the use of AOS in epidemiology: the identification of smoking as a cause of lung cancer. Experiments were never used to identify this causal relationship, given the ethical impossibility of carrying them out (it would imply asking a large group of non-smokers to start smoking).
3. Illustrative Example
A real example is presented. Participants will be asked to plan an AOS in MSR following the given steps. The example will be run in parallel with the description of the steps.
4. Methods in AOS for SE
We will start by recalling the steps to conduct a controlled experiment in SE. Then, for each step in the experimental process, we will give an overview of the similarities and differences between experiments and AOS, explore the differences in detail, and discuss the methods used by AOS.
4.1 Hypothesis Formulation
Both experiments and AOS examine broad theories in narrow, focused, controlled circumstances. However, unlike in controlled experiments, the step from association to causation in AOS needs to be clarified by making theories elaborate.
4.2 Variables Selection and Instrumentation
The types of variables involved in an AOS are the same as in a controlled experiment: independent, dependent, and extraneous. Special attention will be paid to extraneous variables, as they particularly affect AOS. Extraneous variables (or third variables) are variables that the researcher is not investigating but that can potentially affect the response variable of a study. We will discuss their importance and the different types that can be found. Finally, the data collection and measurement procedures need to be explained for each variable of interest.
4.3 Context and Subject Selection
We will introduce the aspects needed to describe the context of an AOS (setting, locations, and relevant dates). Next, we will focus on describing the different populations of interest. Finally, we will discuss the steps that have to be followed for selecting study subjects.
4.4 Design: Avoiding Extraneous Variables
Researchers might want to rule out some extraneous variables by counteracting their effect. This implies that it will not be possible to assess their effect during analysis. There are two techniques to do this: restriction and matching. We will explain how to perform them and when to choose each one.
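As a rough illustration of restriction and matching, the following Python sketch applies both to a toy dataset. It is a minimal sketch under stated assumptions, not the procedure taught in the school: the project records, the exposure `uses_ci`, the extraneous variables `language` and `size`, and the helper functions `restrict` and `match_exact` are all hypothetical, and the matching shown is the simplest variant (1:1 exact matching without replacement).

```python
from collections import defaultdict


def restrict(units, variable, allowed):
    """Restriction: keep only units whose extraneous variable has an allowed value."""
    return [u for u in units if u[variable] in allowed]


def match_exact(units, exposure, variables):
    """1:1 exact matching without replacement.

    Each exposed unit is paired with one unexposed unit that has identical
    values for all the given extraneous variables. Exposed units with no
    available partner are left unmatched and dropped.
    """
    pool = defaultdict(list)
    for u in units:
        if not u[exposure]:
            pool[tuple(u[v] for v in variables)].append(u)

    pairs = []
    for u in units:
        if u[exposure]:
            candidates = pool[tuple(u[v] for v in variables)]
            if candidates:
                pairs.append((u, candidates.pop(0)))
    return pairs


# Hypothetical projects mined from a repository hosting service.
projects = [
    {"name": "p1", "language": "Java", "size": "large", "uses_ci": True},
    {"name": "p2", "language": "Java", "size": "large", "uses_ci": False},
    {"name": "p3", "language": "Java", "size": "small", "uses_ci": True},
    {"name": "p4", "language": "Python", "size": "small", "uses_ci": False},
    {"name": "p5", "language": "Java", "size": "small", "uses_ci": False},
    {"name": "p6", "language": "Python", "size": "large", "uses_ci": True},
]

# Restriction: study Java projects only, so language cannot vary.
java_only = restrict(projects, "language", {"Java"})

# Matching: compare CI and non-CI projects with the same language and size.
pairs = match_exact(projects, "uses_ci", ["language", "size"])

print([u["name"] for u in java_only])
print([(a["name"], b["name"]) for a, b in pairs])
```

Restriction removes the variation of the extraneous variable from the sample altogether, while matching keeps the variation but balances it between the compared groups. In both cases the effect of that variable can no longer be estimated from the resulting data.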
4.5 Analysis: Identifying Extraneous Variables
While the extraneous variables that have been restricted or matched do not have to be incorporated into the analysis, the remaining ones do. We will explain how this should be done. Additionally, AOS require examining the influence that potential unmeasured extraneous variables (mainly unknown ones, but possibly also known ones that were impossible to measure) have on the causal conclusions. This is called sensitivity analysis. We will discuss how to perform it.
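The two analysis steps described above can be sketched with techniques that are common in epidemiology. This is an assumption about suitable methods, not a statement of what the school teaches: the sketch adjusts for one measured extraneous variable with a Mantel-Haenszel pooled risk ratio, and then computes an E-value as a simple sensitivity analysis for unmeasured extraneous variables. The strata, counts, and function names are hypothetical.

```python
import math


def mantel_haenszel_rr(strata):
    """Pooled risk ratio across strata of a measured extraneous variable.

    Each stratum is a tuple:
    (exposed_cases, exposed_total, unexposed_cases, unexposed_total).
    """
    numerator = 0.0
    denominator = 0.0
    for a, n1, c, n0 in strata:
        total = n1 + n0
        numerator += a * n0 / total
        denominator += c * n1 / total
    return numerator / denominator


def e_value(rr):
    """E-value for a risk ratio.

    Minimum strength of association (on the risk ratio scale) that an
    unmeasured extraneous variable would need with both the exposure and
    the outcome to fully explain away the observed risk ratio.
    """
    if rr < 1:
        rr = 1 / rr  # protective effects are handled via the inverse
    return rr + math.sqrt(rr * (rr - 1))


# Hypothetical counts: files with a code smell (exposed) vs. without,
# and how many of them had a defect, stratified by project size.
strata = [
    (20, 100, 10, 100),  # small projects
    (40, 100, 20, 100),  # large projects
]

adjusted_rr = mantel_haenszel_rr(strata)
print("size-adjusted risk ratio:", round(adjusted_rr, 2))
print("E-value:", round(e_value(adjusted_rr), 2))
```

A large E-value suggests that only a strong unmeasured extraneous variable could account for the observed association, whereas an E-value close to 1 means that even a weak one could.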
4.6 Interpretation
Unlike in experiments, criticism must be exercised during this stage, and competing theories need to be evaluated. This is of great importance.
4.7 Validity Evaluation
We will present the list of potential validity threats defined in epidemiology for this type of study, adapted to the SE context.
5. Wrap-up
The main issues raised during the school will be briefly highlighted, and additional questions and concerns from the participants will be answered.
Wed 25 Oct (displayed time zone: Central Time, US & Canada)
08:30 - 10:00
08:30 | 45m | Introduction | Sira Vegas (Universidad Politecnica de Madrid), Davide Taibi (University of Oulu and Tampere University)
09:15 | 45m | Observational Studies
10:30 - 12:00 | Illustrative example and Methods in AOS: hypothesis formulation, variables selection, instrumentation (at Oak Alley)
10:30 | 45m | Illustrative Example
11:15 | 45m | Methods in AOS: hypothesis formulation, variables selection, instrumentation | Sira Vegas (Universidad Politecnica de Madrid), Nyyti Saarimäki (Tampere University), Davide Taibi (University of Oulu and Tampere University)
13:30 - 15:00
13:30 | 90m | Methods in AOS: context, subject selection, design | Valentina Lenarduzzi (University of Oulu), Sira Vegas (Universidad Politecnica de Madrid), Nyyti Saarimäki (Tampere University)
15:30 - 16:30
15:30 | 60m | Methods in AOS: analysis, interpretation, validity evaluation
16:30 - 17:00
16:30 | 30m | Wrap-up | Valentina Lenarduzzi (University of Oulu), Nyyti Saarimäki (Tampere University), Davide Taibi (University of Oulu and Tampere University), Sira Vegas (Universidad Politecnica de Madrid)