JADE 6th edition | Page 104

104 | JADE ED DE QUINCEY ET AL

Running the simple K-means algorithm on this set of data revealed the two most prominent classes of students, with the following centroids (average values of the attributes considered):

Attribute           Full Data        Cluster 0        Cluster 1
                    (66 students)    (40 students)    (26 students)
programmeID         P11361           P11361           P03657
CW Mark             48%              34%              70%
Attendance          61%              55%              70%
Total File Views    40               24               64
Tutorial Views      24               15               37
Lecture Views       13               6                22
CW Spec. Views      2                1                3

Table 2: Returned clusters from the K-means algorithm

The two cluster descriptors above show a clear distinction between the performance of the students within each cluster (according to their coursework mark). The better-performing students in Cluster 1 (i.e. those with a 70% average coursework mark) attended lectures and tutorials more regularly (70% attendance against 55%) and accessed all types of material on the CMS intranet more frequently than the students in Cluster 0.

Of greater interest, however, are “the exceptions” to the above inferences. The figure below shows the distribution of student marks plotted against their degree programme (represented by the “P” code on the x-axis). Each point, representing a student, has been assigned a colour that relates to one of the two clusters detailed in Table 2 above.
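For readers unfamiliar with the clustering step that produced the centroids in Table 2, the following is a minimal sketch of K-means in Python. It is an illustration only, not the tool or configuration used in the study. The function name `kmeans`, the choice of the first k points as starting centroids, and the six student records are all assumptions made for this example; the records are invented and are not the study data. The nominal attribute programmeID is left out, and the numeric attributes are used unscaled, which a real analysis might handle differently.

```python
from math import dist
from statistics import mean


def kmeans(points, k, max_iter=100):
    """Plain K-means on numeric tuples; returns (centroids, labels)."""
    # Assumption for this sketch: start from the first k points so the
    # result is deterministic. Real tools usually pick random seeds.
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid
        # (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments no longer change: converged
        labels = new_labels
        # Update step: each centroid becomes the attribute-wise mean
        # of the points currently assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(mean(col) for col in zip(*members))
    return centroids, labels


# Invented records: (CW mark %, attendance %, total file views).
students = [
    (30, 50, 20),
    (72, 75, 60),
    (36, 55, 25),
    (68, 70, 66),
    (39, 60, 30),
    (70, 65, 63),
]

centroids, labels = kmeans(students, k=2)
for cluster_id, centroid in enumerate(centroids):
    size = labels.count(cluster_id)
    print(f"Cluster {cluster_id} ({size} students): {centroid}")
```

Each returned centroid is the attribute-wise average of the students assigned to that cluster, which is how the numeric columns of Table 2 should be read. Colouring each student by the entry in `labels` would give the kind of per-student cluster marking used in the figure.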