International Journal on Advanced Science, Engineering and Information Technology, Vol. 12 (2022) No. 4, pages: 1410-1420, DOI:10.18517/ijaseit.12.4.15195

Software Traceability in Agile Development Using Topic Modeling

Nuraisa Novia Hidayati, Siti Rochimah, Agus Budi Raharjo


Tracing how requirements are implemented helps build better software: it shows whether the application fulfils users' needs, how far development has progressed, which areas are problematic during testing, and how far the requirements are reflected in the source code. In this paper, we studied an agile development method, Extreme Programming (XP). The artifacts considered vital in the agile approach include requirement documents, test documents, and source code. We used topic modeling to map content similarities among those documents and create trace links. We compared three topic modeling methods: Latent Semantic Analysis (LSA), Latent Dirichlet Allocation (LDA), and Non-negative Matrix Factorization (NMF). NMF proved the most stable, with an accuracy of 67% for requirements, 59% for testing, and 48% for defect lists. Results for the second application were more accurate, at 70%, 79%, and 54%. Although NMF was outperformed by LSA in the second application (LSA achieved accuracies of 79%, 84%, and 56%), their precision and recall values were similar. We successfully found links in the source code based on keywords extracted from each topic. This research also suggests ways of writing requirements in detail that simplify tracing, such as using terms consistently, including technical details, and mentioning all the variables involved. In the future, sentence structure and synonyms should be handled during pre-processing to build better trace links.
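
As an illustration of how accuracy, precision, and recall figures like those above could be computed for a set of trace links, the following is a minimal sketch and not the authors' evaluation procedure. It assumes that a trace link is a (source, target) pair, that a hand-made reference set of correct links exists, and that accuracy is taken over all candidate pairs. The function name `evaluate_trace_links` and the toy identifiers (`R1`, `T1`, and so on) are hypothetical.

```python
from itertools import product


def evaluate_trace_links(predicted, expected, sources, targets):
    """Score predicted trace links against a reference set.

    Assumption: every (source, target) pair is a candidate link, so
    accuracy also counts pairs correctly left unlinked.
    """
    candidates = set(product(sources, targets))
    predicted = set(predicted) & candidates
    expected = set(expected) & candidates

    tp = len(predicted & expected)       # correct links found
    fp = len(predicted - expected)       # links reported but wrong
    fn = len(expected - predicted)       # correct links missed
    tn = len(candidates) - tp - fp - fn  # pairs correctly left unlinked

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(candidates) if candidates else 0.0
    return {"precision": precision, "recall": recall, "accuracy": accuracy}


# Hypothetical toy data: three requirements traced to two test documents.
requirements = ["R1", "R2", "R3"]
tests = ["T1", "T2"]
expected = {("R1", "T1"), ("R2", "T2"), ("R3", "T2")}
predicted = {("R1", "T1"), ("R2", "T1"), ("R3", "T2")}

scores = evaluate_trace_links(predicted, expected, requirements, tests)
print(scores)
```

In practice, the predicted links would come from whichever topic model is being assessed, for example by linking each requirement to the test document with the most similar topic distribution.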


Software traceability; agile; topic modeling; latent semantic analysis; latent Dirichlet allocation; non-negative matrix factorization.

