This study describes a Fast Fourier Transform (FFT) implementation that uses fused floating-point operations executed in parallel. FFT processors use butterfly units to perform computations on complex data; each complex butterfly operation consists of multiplications, additions and subtractions. The main contribution of this research is a radix-8 butterfly unit that is more efficient and faster than the conventional butterfly unit. The proposed floating-point FFT butterfly unit also requires less area than the conventional butterfly unit. The complete architecture is synthesized and simulated using Xilinx ISE software. Compared with a similar FFT architecture using radix-4, the proposed method shows a reduction of about 26.36% in area and about 50.22% in overall power consumption. © Medwell Journals, 2016.
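
To make the arithmetic concrete, the following is a minimal software sketch of what a radix-8 butterfly computes: an 8-point DFT built only from complex multiplications, additions and subtractions. It is not the hardware architecture described above and does not model fused floating-point units, area or power. It uses a standard decimation-in-frequency decomposition, and the names `radix8_butterfly`, `dft4`, `dft_reference` and `W8` are hypothetical, chosen for illustration.

```python
import cmath

# Twiddle factors W8^k = exp(-2*pi*j*k/8) for k = 0..3 (illustrative table).
W8 = [cmath.exp(-2j * cmath.pi * k / 8) for k in range(4)]


def dft4(y):
    """4-point DFT using only additions, subtractions and a multiply by -j."""
    c0, c1 = y[0] + y[2], y[1] + y[3]
    d0, d1 = y[0] - y[2], (y[1] - y[3]) * -1j
    return [c0 + c1, d0 + d1, c0 - c1, d0 - d1]


def radix8_butterfly(x):
    """8-point DFT of x by decimation in frequency (software sketch)."""
    assert len(x) == 8
    # First stage: sums feed the even outputs, twiddled differences the odd ones.
    a = [x[k] + x[k + 4] for k in range(4)]
    b = [(x[k] - x[k + 4]) * W8[k] for k in range(4)]
    out = [0j] * 8
    out[0::2] = dft4(a)  # X[0], X[2], X[4], X[6]
    out[1::2] = dft4(b)  # X[1], X[3], X[5], X[7]
    return out


def dft_reference(x):
    """Direct O(n^2) DFT, used only to check the butterfly."""
    n = len(x)
    return [
        sum(x[i] * cmath.exp(-2j * cmath.pi * i * k / n) for i in range(n))
        for k in range(n)
    ]
```

In the first stage, each twiddled difference `(x[k] - x[k + 4]) * W8[k]` expands into real sums of products of the form a·b ± c·d, which is the kind of operation a fused floating-point unit would typically evaluate with a single rounding. This sketch leaves those as ordinary Python complex arithmetic.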