2.5. Data Analysis
To answer the three RQs formulated in Section 2, we analyzed the data using Open Coding and Constant Comparison, two Grounded Theory techniques that are widely employed in qualitative data analysis (Stol et al., 2016). Open Coding is not confined by pre-existing theoretical frameworks; instead, it encourages researchers to generate codes based on the actual content of the data. These codes are descriptive summarizations of the data that aim to capture its underlying themes. In Constant Comparison, researchers continuously compare the coded data, dynamically refining and adjusting the categories based on their similarities and differences.
The specific process of data analysis includes four steps: 1) The first author meticulously reviewed the collected data and assigned each item a descriptive code that succinctly encapsulates its core theme. For instance, the issue in Discussion #10598 was coded as “Stopped Giving Inline Suggestions”; it was reported by a user who noticed that a previously functioning Copilot had suddenly stopped providing code suggestions in VSCode. 2) The first author compared the codes to identify patterns, commonalities, and distinctions among them. Through this iterative comparison process, similar codes were merged into higher-level types and categories. For example, the code of Discussion #10598, along with other akin codes, was merged into the type FUNCTIONALITY FAILURE, which in turn belongs to the category Operation Issue. 3) Whenever uncertainties arose, the first author discussed them with the second and third authors to reach a consensus. It should be noted that, due to the nature of Constant Comparison, both the types and the categories underwent several rounds of refinement before reaching their final form. 4) The initial version of the analysis results was further verified by the second and third authors, and the negotiated agreement approach (Campbell et al., 2013) was employed to resolve conflicts. The final results are presented in Section 3.
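To make the resulting coding hierarchy concrete, the following is a minimal, hypothetical Python sketch of how a coded item flows from Open Coding into the types and categories produced by Constant Comparison. It is purely an illustration of the code → type → category structure described above, populated with the Discussion #10598 example from the text; all class, field, and function names are our own assumptions, not tooling used by the authors.

```python
from dataclasses import dataclass, field

# Illustrative data model (hypothetical, not the authors' tooling):
# a descriptive code from Open Coding is attached to a source item,
# then merged into a higher-level type within a category.

@dataclass
class CodedItem:
    source: str  # e.g., a GitHub Discussion identifier
    code: str    # descriptive code assigned during Open Coding

@dataclass
class Type:
    name: str
    items: list[CodedItem] = field(default_factory=list)

@dataclass
class Category:
    name: str
    types: dict[str, Type] = field(default_factory=dict)

    def merge(self, item: CodedItem, type_name: str) -> None:
        """Merge a coded item into a type, creating the type if needed,
        mirroring how akin codes were merged into higher-level types
        during Constant Comparison."""
        self.types.setdefault(type_name, Type(type_name)).items.append(item)

# Worked example from the text: Discussion #10598 was coded as
# "Stopped Giving Inline Suggestions", merged into the type
# FUNCTIONALITY FAILURE under the category Operation Issue.
operation_issue = Category("Operation Issue")
operation_issue.merge(
    CodedItem(source="Discussion #10598",
              code="Stopped Giving Inline Suggestions"),
    type_name="FUNCTIONALITY FAILURE",
)

print(operation_issue)
```

Because Constant Comparison refines the hierarchy over several rounds, a structure like this would be revised repeatedly (items re-assigned, types renamed or merged) rather than built once.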
Authors:
(1) Xiyu Zhou, School of Computer Science, Wuhan University, Wuhan, China ([email protected]);
(2) Peng Liang (Corresponding Author), School of Computer Science, Wuhan University, Wuhan, China ([email protected]);
(3) Beiqi Zhang, School of Computer Science, Wuhan University, Wuhan, China ([email protected]);
(4) Zengyang Li, School of Computer Science, Central China Normal University, Wuhan, China ([email protected]);
(5) Aakash Ahmad, School of Computing and Communications, Lancaster University Leipzig, Leipzig, Germany ([email protected]);
(6) Mojtaba Shahin, School of Computing Technologies, RMIT University, Melbourne, Australia ([email protected]);
(7) Muhammad Waseem, Faculty of Information Technology, University of Jyväskylä, Jyväskylä, Finland ([email protected]).