Abstract |
: |
Missing data is a common problem for many data mining techniques. A missing value is an attribute or feature in a dataset that has no associated data value. Correct treatment of missing values is crucial, because they adversely affect both the results of data mining processes and their interpretation. Missing value handling techniques can be grouped into four categories: complete case analysis, imputation methods, maximum likelihood methods, and machine learning methods. Of these, imputation methods are the most widely used solution for handling missing values. However, there are situations in which imputation methods may not work correctly. This study analyzes the performance of two classification algorithms on missing data, one based on imputation and one that classifies without imputation. |
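To make two of the four categories concrete, the following minimal Python sketch contrasts complete case analysis with a simple form of imputation (mean imputation). The toy dataset, the use of `None` as the missing-value marker, and the helper names `complete_cases` and `mean_impute` are assumptions made for illustration only; they are not the algorithms evaluated in this study.

```python
from statistics import mean

# Hypothetical toy dataset: None marks a missing attribute value.
rows = [
    {"age": 25, "income": 40000},
    {"age": None, "income": 52000},
    {"age": 31, "income": None},
    {"age": 40, "income": 61000},
]


def complete_cases(records):
    """Complete case analysis: keep only records with no missing values."""
    return [r for r in records if all(v is not None for v in r.values())]


def mean_impute(records):
    """Mean imputation: fill each missing value with the mean of the
    observed values of the same attribute."""
    columns = list(records[0].keys())
    col_means = {
        c: mean(r[c] for r in records if r[c] is not None) for c in columns
    }
    return [
        {c: (r[c] if r[c] is not None else col_means[c]) for c in columns}
        for r in records
    ]


kept = complete_cases(rows)   # discards every record with a gap
filled = mean_impute(rows)    # keeps all records, fills the gaps
print(len(kept), "of", len(rows), "records survive complete case analysis")
print(filled)
```

Complete case analysis discards every incomplete record, which can shrink a small dataset considerably, whereas imputation keeps all records at the cost of inserting estimated values. This trade-off is one reason imputation may not always work correctly.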