We live in a data-driven world, which makes extracting and analyzing knowledge from large volumes of data a challenge. Intrusion detection is one example of this challenge. Intrusion detection data sets are characterized by huge volumes, which hampers the learning of the classifier, so the training sets need to be reduced in size. Fortunately, inspection and analysis of available intrusion detection data sets showed that many instances are very similar and do not provide relevant information to the classification process. This prompted us to look for a fast algorithm that removes as many similar instances as possible from intrusion detection data sets while preserving the detection rate. In this work, a new fast instance reduction algorithm is presented. The proposed algorithm makes the training stage more efficient without significantly affecting efficacy in the intrusion detection task.
Bibliographical note: Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
- Data mining
- Data preprocessing
- Data reduction
- Instance selection
- Intrusion detection
- Supervised classification