1. The Upper Bound of Information Diffusion in Code Review

Authors: Michael Dorner, Daniel Mendez, Krzysztof Wnuk, Ehsan Zabardast, Jacek Czerwonka

Abstract: Background: Code review, the discussion around a code change among humans, forms a communication network that enables its participants to exchange and spread information. Although reported by qualitative studies, our understanding of the capability of code review as a communication network is still limited. Objective: In this article, we report on a first step towards evaluating the capability of code review as a communication network by quantifying how fast and how far information can spread through code review: the upper bound of information diffusion in code review. Method: In an in-silico experiment, we simulate an artificial information diffusion within large (Microsoft), mid-sized (Spotify), and small (Trivago) code review systems modelled as communication networks. We then measure the minimal topological and temporal distances between the participants to quantify how far and how fast information can spread in code review. Results: An average code review participant in the small and mid-sized code review systems can spread information to between 72% and 85% of all code review participants within four weeks, independently of network size and tooling; for the large code review system, we found an absolute boundary of about 11,000 reachable participants. On average (median), information can spread between two participants in code review in less than five hops and less than five days. Conclusion: We found evidence that the communication network emerging from code review scales well and spreads information fast and broadly, corroborating the findings of prior qualitative work. The study lays the foundation for understanding and improving code review as a communication network.
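The measurement described in the Method, how far and how fast information can reach other participants over time-respecting paths in the review network, can be illustrated with a short simulation. The sketch below is not the authors' tooling; the event format (pairs of participants who interact in a review discussion on a given day), the function name diffusion_from, and the four-week default horizon are assumptions made for illustration.

```python
# Minimal sketch (assumed data format, not the authors' implementation):
# simulate information diffusion over a time-varying code review network.
from collections import defaultdict

def diffusion_from(seed, events, horizon_days=28):
    """Forward sweep over time-respecting paths.

    events: iterable of (participant_a, participant_b, day) co-participation
    records. Returns {participant: (arrival_day, hops)} for everyone the seed
    can reach within the horizon, assuming information is passed on in any
    review discussion held on or after the day it was received.
    """
    by_day = defaultdict(list)
    for a, b, day in events:
        by_day[day].append((a, b))

    reached = {seed: (0, 0)}  # participant -> (arrival day, hop distance)
    for day in sorted(by_day):
        if day > horizon_days:
            break
        changed = True
        while changed:  # allow multi-hop spreading within the same day
            changed = False
            for a, b in by_day[day]:
                for src, dst in ((a, b), (b, a)):
                    if src in reached and reached[src][0] <= day and dst not in reached:
                        reached[dst] = (day, reached[src][1] + 1)
                        changed = True
    return reached

# Toy usage: three participants, two review discussions.
events = [("alice", "bob", 1), ("bob", "carol", 3)]
print(diffusion_from("alice", events))  # carol is reached on day 3 in 2 hops
```

In this reading, the fraction of participants in the returned map corresponds to how far information spreads, and the stored hop counts and arrival days correspond to the topological and temporal distances reported in the abstract.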

2. MuRS: Mutant Ranking and Suppression using Identifier Templates

Authors: Zimin Chen, Malgorzata Salawa, Manushree Vijayvergiya, Goran Petrovic, Marko Ivankovic, Rene Just

Abstract: Diff-based mutation testing is a mutation testing approach that only mutates lines affected by a code change under review. Google's mutation testing service integrates diff-based mutation testing into the code review process and continuously gathers developer feedback on mutants surfaced during code review. To enhance the developer experience, the mutation testing service implements a number of suppression rules, which target not-useful mutants, that is, mutants that have consistently received negative developer feedback. However, while effective, manually implementing suppression rules requires significant engineering time. An automatic system to rank and suppress mutants would facilitate the maintenance of the mutation testing service. This paper proposes and evaluates MuRS, an automated approach that groups mutants by patterns in the source code under test and uses these patterns to rank and suppress future mutants based on historical developer feedback on mutants in the same group. To evaluate MuRS, we conducted an A/B testing study, comparing MuRS to the existing mutation testing service. Despite the strong baseline, which uses manually developed suppression rules, the results show a statistically significantly lower negative feedback ratio of 11.45% for MuRS versus 12.41% for the baseline. The results also show that MuRS is able to recover existing suppression rules implemented in the baseline. Finally, the results show that statement-deletion mutant groups received both the most positive and the most negative developer feedback, suggesting a need for additional context that can distinguish between useful and not-useful mutants in these groups. Overall, MuRS has the potential to substantially reduce the development and maintenance cost of an effective mutation testing service by automatically learning suppression rules.
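The grouping-and-ranking idea, abstracting the mutated code into an identifier-agnostic template and suppressing new mutants whose group has historically drawn negative feedback, can be sketched briefly. This is an illustrative toy, not Google's service: the template function, the thresholds suppress_threshold and min_votes, and the line-based feedback format are hypothetical stand-ins for the identifier templates and feedback signals described in the abstract.

```python
# Illustrative sketch of the MuRS idea (not the actual service):
# group mutants by a template of the mutated line and suppress new mutants
# whose group has consistently received negative developer feedback.
import re
from collections import defaultdict

def template(mutated_line: str) -> str:
    """Abstract away identifiers and numbers so similar mutants share a key."""
    line = re.sub(r"\b[A-Za-z_][A-Za-z0-9_]*\b", "<ID>", mutated_line)
    line = re.sub(r"\b\d+(\.\d+)?\b", "<NUM>", line)
    return " ".join(line.split())

class FeedbackRanker:
    def __init__(self, suppress_threshold=0.8, min_votes=5):
        self.votes = defaultdict(lambda: [0, 0])  # template -> [negative, total]
        self.suppress_threshold = suppress_threshold
        self.min_votes = min_votes

    def record(self, mutated_line: str, negative: bool):
        key = template(mutated_line)
        self.votes[key][0] += int(negative)
        self.votes[key][1] += 1

    def negative_ratio(self, mutated_line: str) -> float:
        neg, total = self.votes[template(mutated_line)]
        return neg / total if total else 0.0  # unseen templates are not penalized

    def should_suppress(self, mutated_line: str) -> bool:
        neg, total = self.votes[template(mutated_line)]
        return total >= self.min_votes and neg / total >= self.suppress_threshold

# Toy usage: five negative votes on "delete a logging call" mutants suppress
# a new mutant of the same shape, even though its identifiers differ.
ranker = FeedbackRanker()
for _ in range(5):
    ranker.record('log.info("saved %s", user_id)', negative=True)
print(ranker.should_suppress('log.debug("retry %d", attempt)'))  # True
```

Ranking falls out of the same bookkeeping: surfaced mutants can be ordered by negative_ratio so that groups with the least negative history are shown to reviewers first.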

3. Fix Fairness, Don't Ruin Accuracy: Performance Aware Fairness Repair using AutoML

Authors: Giang Nguyen, Sumon Biswas, Hridesh Rajan

Abstract: Machine learning (ML) is increasingly being used in critical decision-making software, but incidents have raised questions about the fairness of ML predictions. To address this issue, new tools and methods are needed to mitigate bias in ML-based software. Previous studies have proposed bias mitigation algorithms that only work in specific situations and often result in a loss of accuracy. Our proposed solution is a novel approach that utilizes automated machine learning (AutoML) techniques to mitigate bias. Our approach includes two key innovations: a novel optimization function and a fairness-aware search space. By improving the default optimization function of AutoML and incorporating fairness objectives, we are able to mitigate bias with little to no loss of accuracy. Additionally, we propose a fairness-aware search space pruning method for AutoML to reduce computational cost and repair time. Our approach, built on the state-of-the-art Auto-Sklearn tool, is designed to reduce bias in real-world scenarios. To demonstrate its effectiveness, we evaluated our approach on four fairness problems and 16 different ML models; the results show a significant improvement over the baseline and existing bias mitigation techniques. Our approach, Fair-AutoML, successfully repaired 60 out of 64 buggy cases, while existing bias mitigation techniques repaired at most 44 out of 64 cases.
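The kind of combined objective the abstract describes, accuracy traded off against a fairness violation so that the AutoML search does not ruin predictive performance while repairing bias, can be sketched in a few lines. This is a minimal illustration, not the Fair-AutoML implementation: the function names, the choice of statistical parity difference as the fairness metric, and the trade-off weight alpha are assumptions; Fair-AutoML's actual optimization function and search-space pruning are defined in the paper.

```python
# Minimal sketch (assumed metric and weighting, not Fair-AutoML itself):
# a score an AutoML search could maximize, rewarding accuracy and penalizing
# group unfairness measured on a binary sensitive attribute.
import numpy as np

def statistical_parity_difference(y_pred, sensitive):
    """|P(y_hat = 1 | group 1) - P(y_hat = 1 | group 0)|."""
    y_pred, sensitive = np.asarray(y_pred), np.asarray(sensitive)
    rate_1 = y_pred[sensitive == 1].mean()
    rate_0 = y_pred[sensitive == 0].mean()
    return abs(rate_1 - rate_0)

def fairness_aware_score(y_true, y_pred, sensitive, alpha=0.5):
    """Higher is better: weighted accuracy minus weighted fairness violation."""
    accuracy = (np.asarray(y_true) == np.asarray(y_pred)).mean()
    violation = statistical_parity_difference(y_pred, sensitive)
    return (1 - alpha) * accuracy - alpha * violation

# Toy usage: a fairly accurate classifier that favors group 1.
y_true = [1, 0, 1, 0]
y_pred = [1, 1, 1, 0]
group  = [1, 1, 0, 0]
print(fairness_aware_score(y_true, y_pred, group, alpha=0.5))  # 0.5*0.75 - 0.5*0.5 = 0.125
```

A scorer of this shape can be handed to an AutoML framework in place of plain accuracy, which is the spirit of replacing the default optimization function with a fairness-aware one.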