An enhanced K-means clustering algorithm for bigdata in cloud

I. Singh; P. Dwivedi; T. Gupta; Shynu P-G

Background and Objective: This paper addresses the problem that arises when the K-means clustering algorithm is applied to big data: it becomes hard for the system to keep track of all the records and to search them. Cloud computing on big data is an open-source service that lets us store, manage and process data on demand, creating a virtual group. K-means is a clustering algorithm that aims to minimize the distance between each data point and its cluster centre. K-means is an approach that can run on the Hadoop system. Hadoop is a free, open-source framework available from the Apache Software Foundation. Materials and Methods: The algorithm is not efficient for big data because searching the huge index table is difficult; we therefore introduce k-map with a hashing algorithm to make searching more efficient. Hashing is a process that generates hash keys with a specific algorithm, and the search is then carried out on the basis of those hash keys. A hash function is any function that can be used to map data keys to hashes. Results: The values returned by the hash function (hash values) were used to compute the table index from the key, and we obtained a significant improvement in the results when hashing was used with the K-means algorithm. Conclusion: Hashing is a very powerful technique that can be applied to the K-means clustering algorithm for more effective and efficient searching of big data in the cloud. © 2016, International Journal of Pharmacy and Technology. All rights reserved.
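The combination described in the abstract, clustering records with K-means and then locating them through a hash-computed table index, can be illustrated with a minimal Python sketch. The abstract does not specify the k-map structure or its hash function, so everything named here is an assumption made for illustration: the `HashIndex` class, its bucket count, the record keys `r1` to `r4`, and the sample points are hypothetical, Python's built-in `hash` stands in for the paper's hash function, and the clustering is a plain in-memory K-means rather than a Hadoop job.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centres):
    """Index of the centre closest to the given point."""
    return min(range(len(centres)), key=lambda c: sq_dist(point, centres[c]))


def kmeans(points, k, iters=20, seed=0):
    """Plain K-means: assign points to the nearest centre, then recompute centres."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # initial centres picked from the data
    for _ in range(iters):
        labels = [nearest(p, centres) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster is empty
                centres[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centres, [nearest(p, centres) for p in points]


class HashIndex:
    """Hypothetical hash-indexed table: the table index is computed from the key."""

    def __init__(self, n_buckets=8):
        self.buckets = [[] for _ in range(n_buckets)]

    def _slot(self, key):
        # The hash value of the key determines which bucket is searched.
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._slot(key)]
        for i, (existing, _) in enumerate(bucket):
            if existing == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))

    def get(self, key):
        # Only one bucket is scanned, not the whole table.
        for existing, value in self.buckets[self._slot(key)]:
            if existing == key:
                return value
        raise KeyError(key)


# Hypothetical sample records: key -> feature vector.
records = {"r1": (1.0, 1.0), "r2": (1.2, 0.8), "r3": (9.0, 9.0), "r4": (8.8, 9.3)}
keys = list(records)
centres, labels = kmeans([records[key] for key in keys], k=2)

# Store each record's cluster label under its hashed key.
index = HashIndex()
for key, label in zip(keys, labels):
    index.put(key, label)

print({key: index.get(key) for key in keys})
```

With this layout, finding the cluster of a record costs one hash computation plus a scan of a single bucket, instead of a scan over every record. The gain depends on the bucket count staying proportional to the number of records, which this sketch does not handle.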

Journal	International Journal of Pharmacy and Technology
Publisher	International Journal of Pharmacy and Technology
ISSN	0975-766X