Energy has become one of the most important factors in almost every industry. A country's energy reserves largely influence its growth rate. This is particularly relevant to the textile industry, which contributes 2% to world GDP. This paper demonstrates, through experimentation, how reinforcement learning, a branch of machine learning, can be adopted to help manage and optimize energy usage in a cotton textile mill. The models explored in this paper are formulated as nonlinear programming models, and the experimental results compare the performance of penalty-based reinforcement learning algorithms against the per-annum results obtained in [1]. The calculations show a reduction in energy consumption of approximately 1.3% using the Thompson Sampling and Upper Confidence Bound (UCB) algorithms, and of 1.26% using random selection. © 2020 IEEE.