Medium 9781601323262

Design of an FPGA-Based FDTD Accelerator Using OpenCL

Hamid R. Arabnia, Lou D'Alotto, Hiroshi Ishii, Minoru Ito, Kazuki Joe, Hiroaki Nishikawa, Georgios Sirakoulis, William Spataro, Giuseppe A. Trunfio, George A. Gravvanis, George Jandieri, Ashu M. G. Solo, Fernando G. Tinetti CSREA Press PDF

Int'l Conf. Par. and Dist. Proc. Tech. and Appl. | PDPTA'14 |

Yasuhiro Takei, Hasitha Muthumala Waidyasooriya, Masanori Hariyama and Michitaka Kameyama

Graduate School of Information Sciences, Tohoku University

Aoba 6-6-05, Aramaki, Aoba, Sendai, Miyagi, 980-8579, Japan

Email: {takei, hasitha, hariyama, kameyama}@ecei.tohoku.ac.jp

Abstract— High-performance computing systems with dedicated hardware on FPGAs can achieve more power-efficient computation than CPUs and GPUs. However, hardware design on FPGAs takes more time than software design on CPUs and GPUs. In this paper, we design an FDTD hardware accelerator using the OpenCL compiler for FPGAs. Since the hardware can be generated automatically from OpenCL code, applications can be implemented on FPGAs in a shorter time than with a hardware description language. According to the implementation results, the FDTD accelerator on the FPGA processes faster than a CPU. Moreover, its power consumption is about one-tenth that of a GPU.

Session - Parallel Computing and Algorithms + Multi-Core and Energy-Aware Computing

Session - Communication Systems and Input Output Systems + Interconnection Networks and Topologies + Routing Methods + New Protocols + VANET + Peer-to-Peer Networks + Sensor Networks and Applications

Processing Hard Sphere Collisions on a GPU Using OpenCL

Zachary Langbert and Mark C. Lewis

Computer Science, Trinity University, San Antonio, TX, USA

Abstract— Physically accurate hard sphere collisions are inherently sequential, as the order in which collisions occur can have a significant impact on the resulting system. This makes processing hard sphere collisions on parallel hardware challenging. We present an approach to this problem that can be implemented in OpenCL and runs on current hardware. This approach makes significant use of atomic operations to prevent race conditions, even across thread groups. We find that an unoptimized implementation of the approach provides speed on modest GPUs that is on par with our earlier OpenMP parallel CPU approach, and that the OpenCL code running on a CPU is faster than the OpenMP code. Full timing results using commodity GPUs and using OpenCL on multi-core chips are presented.

Keywords: Simulation, collisions, parallel, discrete-event, GPU

Multi-Gbps Fano Decoding Algorithm on GPGPU

Ozgur Ates, Selcuk Keskin and Taskin Kocak

Department of Computer Engineering, Bahcesehir University, Istanbul 34353, Turkey

Abstract— The bandwidth requirements of next-generation wireless applications are increasing. The newest standards, such as WirelessHD, aim to transmit signals at multi-gigabit-per-second (Gbps) rates. At these rates, processing the baseband signals becomes challenging. In this paper, we propose using a GPGPU for parallel processing to provide multi-Gbps throughput for a sequential convolutional decoding algorithm, namely the Fano algorithm. NVIDIA’s latest Kepler-architecture K20c GPU and its CUDA programming platform are used. Several algorithmic and CUDA-based optimizations are developed to achieve a throughput of 4.6 Gbps.

Keywords: Fano algorithm, CUDA, high throughput decoding

1. Introduction

Convolutional coding can be said to have started with P. Elias [1]. Inspired by Shannon’s mathematical theory of communication [2] and Hamming’s paper on error-correcting codes [3], Elias introduced the concept of convolutional coding. This well-known telecommunication technique is used to transmit signals over a noisy channel. Along with the fast Fourier transform (FFT), decoding a signal is one of the most time-consuming tasks performed at the baseband.
