CBM for Text Classification


Project information

Because case-based classification techniques are local learners, they are particularly susceptible to noisy training data. Previous work by this group has shown that in certain domains, particularly spam filtering, standard case-base editing techniques are unsuitable. This project aims to investigate why the newly developed blame-based noise reduction technique is particularly successful in the spam domain and whether its success can be transferred to the broader, but related, domains of text classification and fraud detection. Furthermore, the project will investigate whether feature-free learning and active learning can be applied to the noise reduction problem.

Project funding

The project is funded by the SFI Research Frontiers Programme (RFP).


Research Students

Rong Hu

Gurpreet Singh