Header menu link for other important links
X
SIMD acceleration of SpMV Kernel on multi-core CPU architecture
Published in International Institute for General Systems Studies
2015
Volume: 15
   
Issue: 3
Pages: 273 - 293
Abstract
As the field of Sparse Matrix vector multiplication (SpMV) matures and its breadth of application increases, the need for parallel implementation becomes necessary. SpMV is proved to be a bottleneck due to its irregular access patterns. Various storage formats for sparse matrices have been proposed to solve this issue. The Quad tree-Compressed Row Storage (QCSR) format shows good performance improvement over other formats such as Compressed Row Storage (CSR) and Blocked CSR (BCSR) for SpMV. This paper extends QCSR format to exploit the Single Instruction Multiple Data (SIMD) registers which are available in current processors. Programming with SIMD registers in a single core processor achieves parallelism with reduced power and without any additional hardware requirement, as in the case of Graphics Processing Unit (GPU) computing. To program effectively in SIMD units Intels Streaming SIMD Extension (SSE) instructions are used. In this paper computational performance of QCSR-SpMV is determined for over a collection of 10 benchmark matrices on SIMD units of X86 architecture. Experimental results demonstrate QCSR-SIMD achieves significant average speedup of 2.0 x compared to CSR-SIMD. © 2015, International Institute for General Systems Studies. All rights reserved.
About the journal
JournalAdvances in Systems Science and Applications
PublisherInternational Institute for General Systems Studies
ISSN10786236