Although cloud computing offers benefits such as disaster recovery, remote access, large resource pools and increased collaboration, it brings its own problems, such as managing and monitoring virtual machines (VMs) during consolidation. Cloud service providers run many servers simultaneously to meet the requirements of their clients. Because these servers have a static energy consumption, they consume a lot of energy relative to their utilization. Providers therefore run multiple VMs in parallel on each server, distributing the available resources (number of CPU cores, memory, HDD storage, CPU execution time, etc.) among multiple client applications so that the resources are fully used. Continuous monitoring of these VMs generates huge amounts of log data, and analyzing that data becomes computationally expensive. We propose a method that reduces the amount of data involved in the analysis, thereby reducing computation and storage costs. In this paper, we focus on how well the method groups VMs from real traces into two known clusters, namely fast and RND, and on how accurate the results are with respect to these two clusters, based on characteristics including CPU cores, CPU capacity provisioned, CPU usage, memory provisioned and memory usage. The proposed methodology uses autoencoders to select the most pertinent information for grouping similar VMs. We compare it with the existing methodology, PCA-based components, using clustering techniques including k-means, hierarchical clustering and a probabilistic approach, the Gaussian mixture model. Autoencoder-based components perform far better than PCA-based components at predicting which cluster a VM belongs to. © 2021, Springer Nature Singapore Pte Ltd.
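Because clustering algorithms return arbitrary cluster ids rather than the trace names, scoring results against the two known clusters requires first matching ids to labels. The following is a minimal sketch of one way this could be done, not the evaluation procedure used in the paper: the function name `clustering_accuracy` and the toy label lists are hypothetical, and the best-mapping accuracy shown here is an assumed metric.

```python
from itertools import permutations


def clustering_accuracy(true_labels, predicted_ids):
    """Fraction of VMs placed correctly under the best id-to-label mapping.

    Illustrative helper (not from the paper): tries every assignment of
    cluster ids to known trace labels and keeps the highest hit count.
    """
    if len(true_labels) != len(predicted_ids):
        raise ValueError("label sequences must have equal length")
    if not true_labels:
        raise ValueError("label sequences must be non-empty")

    names = sorted(set(true_labels))   # known trace labels, e.g. fast / RND
    ids = sorted(set(predicted_ids))   # arbitrary ids from the clusterer
    if len(ids) > len(names):
        raise ValueError("more clusters than known labels")

    best_hits = 0
    for assignment in permutations(names, len(ids)):
        mapping = dict(zip(ids, assignment))
        hits = sum(mapping[p] == t for t, p in zip(true_labels, predicted_ids))
        best_hits = max(best_hits, hits)
    return best_hits / len(true_labels)


# Toy data (made up for illustration): six VMs, one assigned to the wrong cluster.
truth = ["fast", "fast", "fast", "RND", "RND", "RND"]
predicted = [1, 1, 0, 0, 0, 0]
score = clustering_accuracy(truth, predicted)
print(f"best-mapping accuracy: {score:.3f}")
```

Exhaustive search over mappings is adequate for two clusters; with many clusters, an assignment solver would be the more practical choice.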