|
ABSTRACT
ISSN: 0975-4024
Title |
: |
Unmasking Outliers in Large Distributed Databases Using Cluster Based Approach: CluBSOLD |
Authors |
: |
A. Rama Satish, P. Bala Krishna Prasad, D. Naga Raju, Ravi Kumar Saidala |
Keywords |
: |
Knowledge Discovery, Outliers detection, Clustering, Distributed databases |
Issue Date |
: |
Apr-May 2016 |
Abstract |
: |
Outliers are data objects that are dissimilar to or inconsistent with the remaining objects in a data set, or that lie far from their cluster centroids. Detecting outliers is an important step in the knowledge discovery process for finding hidden knowledge. The task has been studied in many research areas, such as financial data analysis, large distributed systems, biological data analysis, data mining, scientific applications, and health monitoring. Existing research on outlier detection shows that density-based techniques are robust. Identifying outliers in a distributed environment is not simple, because processing a distributed database raises two major issues. The first is rendering the massive data generated from different databases. The second is data integration, which may cause data security violations and leakage of sensitive information. In this paper, we present a cluster-based outlier detection technique to spot outliers in large, dynamically updated distributed databases; it uses cell-density-based centralized detection to address both the massive data rendering problem and the data integration problem. Experiments are conducted on various datasets, and the results clearly show the robustness of the proposed technique for finding outliers in large distributed databases. |
Page(s) |
: |
1212-1222 |
ISSN |
: |
0975-4024 |
Source |
: |
Vol. 8, No. 2 |
|
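The abstract names cell-density-based centralized detection but gives no algorithmic detail, so the following is only an illustrative sketch of the general idea, not the CluBSOLD method itself. It assumes a fixed-width grid, a hypothetical `min_count` density threshold, and that each site shares only per-cell counts (never raw records) with a coordinator. All function names and parameters are assumptions made for illustration.

```python
from collections import Counter
from math import floor

def cell_of(point, width):
    """Map a point to the grid cell that contains it (assumed fixed cell width)."""
    return tuple(floor(x / width) for x in point)

def local_cell_counts(points, width):
    """Run at each site: summarize local data as per-cell counts."""
    return Counter(cell_of(p, width) for p in points)

def sparse_cells(site_counts, min_count):
    """Run at the coordinator: merge counts and return low-density cells."""
    total = Counter()
    for counts in site_counts:
        total.update(counts)  # adds counts cell by cell
    return {cell for cell, n in total.items() if n < min_count}

def local_outliers(points, width, sparse):
    """Run at each site: flag local points that fall in globally sparse cells."""
    return [p for p in points if cell_of(p, width) in sparse]

# Two hypothetical sites holding 2-D records.
site_a = [(0.1, 0.2), (0.3, 0.1), (0.2, 0.4), (5.5, 5.5)]
site_b = [(0.4, 0.3), (0.2, 0.2), (9.1, 0.3)]
width, min_count = 1.0, 2  # hypothetical parameters

counts = [local_cell_counts(s, width) for s in (site_a, site_b)]
sparse = sparse_cells(counts, min_count)
outliers_a = local_outliers(site_a, width, sparse)
outliers_b = local_outliers(site_b, width, sparse)
```

In this sketch only the compact cell counts cross site boundaries, which is one way the two issues raised in the abstract (massive data transfer and exposure of sensitive records during integration) could be reduced.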