Automatic Source Code Similarity Detection

Science Academic Year 2023 Rejected Computer Science

Academic dishonesty in computer programming assignments is present in courses that require students to submit computational solutions written in a programming language. Although platforms for checking the similarity of code submissions exist, their interfaces and functionality need to be updated with more recent similarity detection techniques.

This project considers the problem of finding similarities between source code files. Such a comparison requires transforming the source code into a representation of logic and behavior that is independent of the programmer’s fingerprint (e.g., comments, variable and function names, syntax, and control structures). This transformation is achievable by following compiler design principles and techniques. Additionally, given the number of students enrolled in programming courses, the project must consider efficient similarity detection algorithms and parallel design.

The solution has applications in Computer Science and Programming Education, Source Code Optimization, and good practices in Software Development. The results of this project will be of immediate application in selected core courses in Computer Science.

Andres Mauricio M Bejarano Posada

The selected students are expected to conduct independent work every week. That includes consulting material (e.g., research papers, websites, textbooks, tutorials), proposing ideas, implementing solutions, and testing them thoroughly. There will be opportunities for the students to present material to diverse audiences (e.g., fellow group members, teaching assistants, professors, and a general audience). By the end of the project, each student should have contributed to the source code similarity detection tool/platform by developing one of its components.

Having passed a course in Data Structures and Algorithms (CS251 or ECE 368) is required. Having passed CS381 is preferred but not required. Knowledge of Java, C/C++, and Python is preferred.

3 6 (estimated)
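
As a rough illustration of the idea in the project description (comparing code after removing the programmer's fingerprint), the following Python sketch normalizes Python source with the standard-library tokenize module and compares the sets of token k-grams with a Jaccard index. The function names (normalize, kgrams, similarity), the placeholder tokens ID/NUM/STR, and the choice k=4 are assumptions made for this example only; they are not the project's actual design. The sketch covers comments, names, and literals, and does not attempt to normalize syntax or control structures.

```python
import io
import keyword
import tokenize

# Token types that carry only layout or comments and are dropped entirely.
_SKIPPED = {
    tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
    tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
}


def normalize(source: str) -> list[str]:
    """Return the token stream with comments, layout, names, and literals abstracted."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIPPED:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            tokens.append("ID")    # every identifier becomes the same placeholder
        elif tok.type == tokenize.NUMBER:
            tokens.append("NUM")   # numeric literals are abstracted
        elif tok.type == tokenize.STRING:
            tokens.append("STR")   # string literals are abstracted
        else:
            tokens.append(tok.string)  # keywords and operators are kept as-is
    return tokens


def kgrams(tokens: list[str], k: int = 4) -> set[tuple[str, ...]]:
    """Return the set of all runs of k consecutive tokens."""
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def similarity(a: str, b: str, k: int = 4) -> float:
    """Jaccard index of the k-gram sets of two normalized sources."""
    ga, gb = kgrams(normalize(a), k), kgrams(normalize(b), k)
    if not ga and not gb:
        return 1.0  # two inputs with no k-grams are treated as identical
    return len(ga & gb) / len(ga | gb)


if __name__ == "__main__":
    first = "def total(xs):\n    # sum the list\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
    second = "def add_all(values):\n    acc = 0\n    for v in values:\n        acc += v  # accumulate\n    return acc\n"
    print(similarity(first, second))
```

In this sketch, two submissions that differ only in comments and identifier names produce identical token streams. A real tool for the languages named in the requirements would need a language-specific lexer or parser, and the pairwise comparisons are independent of each other, which is where parallel design could apply.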

This project is not currently accepting applications.