High-speed video coding and compression using the three-dimensional discrete cosine transform (3D-DCT) are used extensively in many IoT applications to optimize data usage and resolution. We propose an efficient hardware implementation of a high-speed vector-radix decimation-in-frequency (VR-DIF) 3D-DCT with optimized area and power consumption. In the previous implementation, the data path arithmetic units used a fixed word length (16, 18, or 21 bits), whereas the proposed architecture uses word lengths ranging from 11 bits (1-bit sign, 1-bit integer, and 9-bit fraction) to 20 bits (1-bit sign, 10-bit integer, and 9-bit fraction) to achieve lower silicon area and power consumption. The architecture is optimally pipelined to achieve a high processing speed (above 3 gigasamples/s). To test the proposed architecture, an 8 × 8 × 8 video cube with a pixel depth of 8 bits is considered. The arithmetic functional units required to implement the 8 × 8 × 8 3D-DCT/IDCT processor, such as the signed adder/subtractor and the cosine coefficient multipliers, are designed with the proposed variable word length. The core of the VR-DIF 3D-DCT/IDCT with variable word length is implemented using the TSMC 90 nm technology library. The proposed architecture consumes 26.5% less area and 23.2% less power than the existing fixed-word-length 3D-DCT-II implementation, tested at a maximum frequency of 653 MHz. © 2020, Springer Science+Business Media, LLC, part of Springer Nature.
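
To make the two word-length endpoints concrete, the following minimal Python sketch models a signed fixed-point format with 1 sign bit, a given number of integer bits, and a given number of fraction bits. It is illustrative only and not taken from the paper: the class name `FixedFormat`, the two's-complement interpretation, the round-to-nearest quantization, and the saturation on overflow are all assumptions, and the choice of cos(π/16) as a sample coefficient is likewise hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FixedFormat:
    """Hypothetical signed fixed-point format: 1 sign bit + int_bits + frac_bits."""
    int_bits: int
    frac_bits: int

    @property
    def word_length(self) -> int:
        # Total width including the sign bit.
        return 1 + self.int_bits + self.frac_bits

    @property
    def step(self) -> float:
        # Value of one least-significant bit.
        return 2.0 ** -self.frac_bits

    @property
    def min_value(self) -> float:
        # Most negative value, assuming two's complement (an assumption).
        return -(2.0 ** self.int_bits)

    @property
    def max_value(self) -> float:
        # Most positive value, assuming two's complement (an assumption).
        return 2.0 ** self.int_bits - self.step

    def quantize(self, x: float) -> float:
        """Round x to the nearest representable value, saturating on overflow."""
        magnitude_bits = self.int_bits + self.frac_bits
        lowest_code = -(1 << magnitude_bits)
        highest_code = (1 << magnitude_bits) - 1
        code = round(x / self.step)
        code = max(lowest_code, min(highest_code, code))
        return code * self.step


# The two endpoints of the word-length range stated in the abstract.
NARROW = FixedFormat(int_bits=1, frac_bits=9)    # 11-bit word
WIDE = FixedFormat(int_bits=10, frac_bits=9)     # 20-bit word

if __name__ == "__main__":
    for name, fmt in (("narrow", NARROW), ("wide", WIDE)):
        print(name, fmt.word_length, fmt.min_value, fmt.max_value, fmt.step)
    # Sample cosine value quantized to the narrow format (illustrative choice).
    print(NARROW.quantize(math.cos(math.pi / 16)))
```

Under these assumptions both formats share the same resolution of 2^-9, and only the representable range differs, which is consistent with the abstract's description of a constant 9-bit fraction and an integer part that grows from 1 to 10 bits.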