A Method of Data Clustering for Detecting Outlier from K-Means Clusters


**Muhammad Shaheen¹* and Abdullah²**


ABSTRACT

Classification is one of the major functionalities of data mining. It is performed either by predicting the value of unknown class labels on the basis of previously labeled data, or by grouping the dataset on the basis of some implicit similarity measure. Clustering works on unlabeled datasets and partitions them into groups using a measure such as the Euclidean distance in K-Means clustering. The performance of K-Means can be significantly affected by outliers, yet the K-Means algorithm does not handle them. This paper proposes a change to the K-Means algorithm that adds outlier detection based on a threshold value. The threshold, named clus_span, is computed by taking the distance of each point from every other point and dividing it by the total number of points. All points of a dataset that do not satisfy the minimum threshold are considered outliers. The modified K-Means is tested on a benchmark dataset for identification of outliers and compared with the existing K-Means algorithm in terms of accuracy. An improvement in performance is evident.
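The threshold idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the exact normalization or the comparison rule, so the sketch assumes that clus_span is the mean pairwise Euclidean distance within one cluster (sum of pairwise distances divided by the number of pairs, which may differ from the paper's division by the number of points), and that a point is an outlier when its distance to the cluster centroid exceeds clus_span. The helper names `centroid` and `find_outliers` and the sample data are hypothetical.

```python
import math
from itertools import combinations


def clus_span(points):
    """Threshold for one cluster.

    Assumption: mean Euclidean distance over all unordered pairs of points.
    The abstract only says the pairwise distances are divided by the number
    of points, so the normalization here is a guess.
    """
    pairs = list(combinations(points, 2))
    if not pairs:
        return 0.0
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)


def centroid(points):
    """Coordinate-wise mean of the points (hypothetical helper)."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def find_outliers(points):
    """Return the points farther from the centroid than clus_span.

    Assumption: the comparison is distance-to-centroid versus clus_span;
    the abstract does not state which quantity is tested.
    """
    threshold = clus_span(points)
    center = centroid(points)
    return [p for p in points if math.dist(p, center) > threshold]


# Made-up cluster: four points on a unit square plus one distant point.
cluster = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
outliers = find_outliers(cluster)
print(outliers)
```

In a full pipeline this check would presumably run on each cluster after K-Means has assigned the points, with flagged points removed before the centroids are recomputed; that ordering is also an assumption here.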


This Issue

December 2021

Vol. 40, Iss. 2, pp. 91-134

Journal of Engineering and Applied Sciences
