Natalia Juristo – Are we Getting Reliable Evidence? Methodology is Critical in Empirical Studies
Abstract: Empirical studies make it possible to generate evidence about software development by identifying relationships between variables, thus improving our understanding of how software is built. Each type of empirical study (experiments, observational studies such as MSR studies, case studies, etc.) is conducted following a methodology, i.e., a procedure. Methodological rigor in carrying out empirical studies is essential to obtain trustworthy findings.
Being aware of the importance of sticking to the methodology when running an empirical study is critical for advancing the generation of evidence about software development. Proper adherence to methodology guarantees the correctness of the results. Evidence can only be built on trustworthy findings; untrustworthy findings are useless or even misleading.
Every time my research group has faced conducting a new type of empirical study, or a new activity related to empirical studies (replicating, aggregating, etc.), we have carried out a literature review on how that type of study or activity was being performed in the software engineering community, and then studied guides from more empirically advanced domains (such as medicine or psychology) on how to perform it. Over the years, we have accumulated a set of “photographs” of the state of methodological practice in empirical software engineering. Unfortunately, in all cases, the picture is somewhat ugly. It is evident that we, the SE community, despise methodological rigor. Such an attitude makes it difficult to advance and guarantees that we keep stumbling over the same stone.
During the talk I will discuss the methodological status of two types of empirical studies (experiments and observational studies), some empirical activities (data analysis and aggregation, among others), and several empirical “domains” (such as the laboratory, MSR, or deep neural networks). More specifically:
- Experiments:
  - Are Crossover experiments being properly analyzed?
    - S. Vegas, C. Apa, N. Juristo. Crossover Designs in Software Engineering Experiments: Benefits and Perils. TSE 2016.
  - Are families of experiments being properly aggregated?
    - A. Santos, O. S. Gómez, N. Juristo. Analyzing Families of Experiments in SE: A Systematic Mapping Study. TSE 2020.
  - Are experiments on deep neural networks (to support SE tasks) being properly conducted?
    - S. Vegas, S. Elbaum. Pitfalls in Experiments with DNN4SE: An Analysis of the State of the Practice. ESEC/FSE 2023.
- MSR:
  - Are MSR researchers aware of the type of empirical study they are conducting and the type of evidence they get?
    - C. Ayala, B. Turhan, X. Franch, N. Juristo. Use and Misuse of the Term Experiment in Mining Software Repositories Research. TSE 2021.
  - Are analytical observational studies properly conducted?
    - N. Saarimäki, V. Lenarduzzi, S. Vegas, N. Juristo, D. Taibi. Cohort Studies in Software Engineering: A Vision of the Future. ESEM 2020.
Dr. Natalia Juristo is a full professor of software engineering at the Computing School of the Technical University of Madrid (UPM).
She began her career as a software engineer at the European Space Agency (Rome) and the European Center for Nuclear Research (Geneva). In 1992, she was a postdoctoral visitor at the Software Engineering Institute at Carnegie Mellon University (USA).
Natalia held a Finland Distinguished Professorship at the University of Oulu from January 2013 to June 2018. She was the Director of the UPM MSc in Software Engineering from 1992 to 2002 and the coordinator of the Erasmus Mundus European Master on SE (with the participation of the University of Bolzano, the University of Kaiserslautern and the University of Blekinge) from 2007 to 2012.
She has been a member of several journal editorial boards, including IEEE Transactions on SE, the Journal of Empirical SE, Software Testing, Verification and Reliability, and IEEE Software. Juristo has served on several conference program committees (ICSE, RE, REFSQ, ESEM, etc.) and has been General Chair of ICSE 2021 and ESEM 2007, among others.
Natalia has written several books, among which Basics of Software Engineering Experimentation (Kluwer, 2001) stands out. She was ranked the 10th most active experienced SE researcher publishing in top-quality journals in the period 2010-2017 (A bibliometric assessment of SE scholars and institutions, JSS, 2019).
In appreciation of her contributions to the research field, Natalia was awarded an honorary doctorate by the Blekinge Institute of Technology (Sweden) in 2009.
Lionel Briand – Applications of Language Models to Software Engineering: An Empirical Perspective
Abstract: Language models are increasingly applied to address software engineering problems such as program repair, test generation, or requirements analysis. Such applications require thorough empirical evaluations that present specific challenges. I will report on my experience and insights regarding the empirical aspects of such studies.
Lionel C. Briand is a professor of software engineering with shared appointments between (1) the School of Electrical Engineering and Computer Science, University of Ottawa, Canada, and (2) the Lero SFI Research Centre for Software, University of Limerick, Ireland. He holds a Canada Research Chair in Intelligent Software Dependability and Compliance (Tier 1) and is the director of Lero. He has conducted applied research in collaboration with industry for more than 25 years, including projects in the automotive, aerospace, manufacturing, financial, and energy domains. He is a fellow of the ACM, IEEE, and Royal Society of Canada. He has also been granted the IEEE Computer Society Harlan Mills Award (2012), the IEEE Reliability Society Engineer of the Year Award (2013), and the ACM SIGSOFT Outstanding Research Award (2022) for his work on software testing and verification. His research interests include software testing and verification (including security aspects), trustworthy AI, applications of AI in software engineering, model-driven software development, requirements engineering, and empirical software engineering.