Applying Machine Learning Techniques In Diagnosing Bacterial Vaginosis
Abstract
Bacterial vaginosis (BV) is the most common vaginal infection diagnosed among women of childbearing age, yet there is very little insight into how it occurs. Many criteria can be taken into consideration when determining the presence of BV. The purpose of this thesis is twofold: first, to identify the features most significant for diagnosing the infection, and second, to apply various classification algorithms to the selected features. To this end, we conducted an array of experiments on the data. We tested the full set of raw data, the raw data with the time series features removed, and the medical and clinical features in isolation; we then cleaned the data and performed the same experiments on the clean full, clean clinical, and clean medical datasets. We compared the accuracy, precision, recall, F-measure, and elapsed time for each pairing of feature selection and classification algorithm. We observed that certain feature selection algorithms returned only a few features, yet the classification results were as good as those obtained with a large number of features. Across all of the experiments, the algorithms performed best on the raw full and clean full datasets, with the raw full dataset returning the better overall results.
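For readers unfamiliar with the evaluation measures named above, the following is a minimal sketch of how accuracy, precision, recall, and F-measure can be computed for a binary classifier. The function name `classification_metrics`, the label encoding (1 for BV-positive, 0 for BV-negative), and the toy labels are assumptions made for illustration only; they are not taken from the thesis data or its implementation.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F-measure for binary labels.

    Hypothetical helper for illustration; `positive` marks the label
    treated as the positive (e.g. BV-positive) class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    tn = sum(1 for t, p in pairs if t != positive and p != positive)  # true negatives

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure here is the harmonic mean of precision and recall (F1).
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Toy labels (made up for this example, not from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Reporting all four measures together, rather than accuracy alone, helps when the positive and negative classes are unevenly represented, since a classifier can reach high accuracy while missing many positive cases.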