Effect of neural network structure in accelerating performance and accuracy of a convolutional neural network with GPU/TPU for image analytics
In deep learning, the most significant breakthroughs in image recognition, object detection, and language processing have come from the convolutional neural network (CNN). With the rapid growth of data and neural networks, the performance of deep neural network (DNN) algorithms depends on the computational power and storage capacity of the devices that run them.
In this paper, convolutional neural networks used for various image applications are studied, and their acceleration on CPU, GPU, and TPU platforms is evaluated. The neural network structure and the computing power and characteristics of the GPU and TPU are analyzed and summarized, and their effect on accelerating the tasks is explained. A cross-platform comparison of the CNN is carried out using three image applications: face mask detection (object detection/computer vision), virus detection in plants (image classification, agriculture sector), and pneumonia detection from X-ray images (image classification, medical field).
The CNN is implemented and comprehensively compared across the platforms to identify performance, throughput, bottlenecks, and training time. The execution of the CNN on the GPU and TPU is explained through layer-wise analysis, and the impact of the fully connected and convolutional layers on the network is analyzed. The challenges faced during acceleration are discussed, and future work is identified.