Cite Article

Design and Evaluation of a Scalable Engine for 3D-FFT Computation in an FPGA Cluster

BibTeX

@article{IJASEIT8308,
   author = {Roberto Ammendola and Pierpaolo Loreti},
   title = {Design and Evaluation of a Scalable Engine for 3D-FFT Computation in an FPGA Cluster},
   journal = {International Journal on Advanced Science, Engineering and Information Technology},
   volume = {9},
   number = {2},
   year = {2019},
   pages = {677--684},
   keywords = {3D-FFT; FPGA; high-performance computing; cluster.},
   abstract = {The Three-Dimensional Fast Fourier Transform (3D-FFT) is commonly used to solve the partial differential equations describing the system evolution in several physical phenomena, such as the motion of viscous fluids described by the Navier–Stokes equations. Simulation of such problems requires the use of a parallel High-Performance Computing architecture since the size of the problem grows with the cube of the FFT size, and the representation of the single point comprises several double-precision floating-point complex numbers. Modern High-Performance Computing (HPC) systems are considering the inclusion of FPGAs as components of this computing architecture because they can combine effective hardware acceleration capabilities and dedicated communication facilities. Furthermore, the network topology can be optimized for the specific calculation that the cluster must perform, especially in the case of algorithms limited by the data exchange delay between the processors. In this paper, we explore an HPC design that uses FPGA accelerators to compute the 3D-FFT. We devise a scalable FFT engine based on a custom radix-2 double-precision core that is used to implement the Decimation in Frequency version of the Cooley–Tukey FFT algorithm. The FFT engine can be adapted to different technology constraints and networking topologies by adjusting the number of cores and configuration parameters in order to minimize the overall calculation time. We compare the various possible configurations with the technological limits of available hardware. Finally, we evaluate the bandwidth required for continuous FFT execution in the APEnet toroidal mesh network.},
   issn = {2088-5334},
   publisher = {INSIGHT - Indonesian Society for Knowledge and Human Development},
   url = {http://ijaseit.insightsociety.org/index.php?option=com_content&view=article&id=9&Itemid=1&article_id=8308},
   doi = {10.18517/ijaseit.9.2.8308}
}

EndNote

%A Ammendola, Roberto
%A Loreti, Pierpaolo
%D 2019
%T Design and Evaluation of a Scalable Engine for 3D-FFT Computation in an FPGA Cluster
%B 2019
%9 3D-FFT; FPGA; high-performance computing; cluster.
%! Design and Evaluation of a Scalable Engine for 3D-FFT Computation in an FPGA Cluster
%K 3D-FFT; FPGA; high-performance computing; cluster.
%X The Three-Dimensional Fast Fourier Transform (3D-FFT) is commonly used to solve the partial differential equations describing the system evolution in several physical phenomena, such as the motion of viscous fluids described by the Navier–Stokes equations. Simulation of such problems requires the use of a parallel High-Performance Computing architecture since the size of the problem grows with the cube of the FFT size, and the representation of the single point comprises several double-precision floating-point complex numbers. Modern High-Performance Computing (HPC) systems are considering the inclusion of FPGAs as components of this computing architecture because they can combine effective hardware acceleration capabilities and dedicated communication facilities. Furthermore, the network topology can be optimized for the specific calculation that the cluster must perform, especially in the case of algorithms limited by the data exchange delay between the processors. In this paper, we explore an HPC design that uses FPGA accelerators to compute the 3D-FFT. We devise a scalable FFT engine based on a custom radix-2 double-precision core that is used to implement the Decimation in Frequency version of the Cooley–Tukey FFT algorithm. The FFT engine can be adapted to different technology constraints and networking topologies by adjusting the number of cores and configuration parameters in order to minimize the overall calculation time. We compare the various possible configurations with the technological limits of available hardware. Finally, we evaluate the bandwidth required for continuous FFT execution in the APEnet toroidal mesh network.

%U http://ijaseit.insightsociety.org/index.php?option=com_content&view=article&id=9&Itemid=1&article_id=8308
%R doi:10.18517/ijaseit.9.2.8308
%J International Journal on Advanced Science, Engineering and Information Technology
%V 9
%N 2
%@ 2088-5334

IEEE

Roberto Ammendola and Pierpaolo Loreti, "Design and Evaluation of a Scalable Engine for 3D-FFT Computation in an FPGA Cluster," International Journal on Advanced Science, Engineering and Information Technology, vol. 9, no. 2, pp. 677-684, 2019. [Online]. Available: http://dx.doi.org/10.18517/ijaseit.9.2.8308

RefMan/ProCite (RIS)

TY  - JOUR
AU  - Ammendola, Roberto
AU  - Loreti, Pierpaolo
PY  - 2019
TI  - Design and Evaluation of a Scalable Engine for 3D-FFT Computation in an FPGA Cluster
JF  - International Journal on Advanced Science, Engineering and Information Technology; Vol. 9 (2019) No. 2
Y2  - 2019
SP  - 677
EP  - 684
SN  - 2088-5334
PB  - INSIGHT - Indonesian Society for Knowledge and Human Development
KW  - 3D-FFT; FPGA; high-performance computing; cluster.
N2  - The Three-Dimensional Fast Fourier Transform (3D-FFT) is commonly used to solve the partial differential equations describing the system evolution in several physical phenomena, such as the motion of viscous fluids described by the Navier–Stokes equations. Simulation of such problems requires the use of a parallel High-Performance Computing architecture since the size of the problem grows with the cube of the FFT size, and the representation of the single point comprises several double-precision floating-point complex numbers. Modern High-Performance Computing (HPC) systems are considering the inclusion of FPGAs as components of this computing architecture because they can combine effective hardware acceleration capabilities and dedicated communication facilities. Furthermore, the network topology can be optimized for the specific calculation that the cluster must perform, especially in the case of algorithms limited by the data exchange delay between the processors. In this paper, we explore an HPC design that uses FPGA accelerators to compute the 3D-FFT. We devise a scalable FFT engine based on a custom radix-2 double-precision core that is used to implement the Decimation in Frequency version of the Cooley–Tukey FFT algorithm. The FFT engine can be adapted to different technology constraints and networking topologies by adjusting the number of cores and configuration parameters in order to minimize the overall calculation time. We compare the various possible configurations with the technological limits of available hardware. Finally, we evaluate the bandwidth required for continuous FFT execution in the APEnet toroidal mesh network.

UR  - http://ijaseit.insightsociety.org/index.php?option=com_content&view=article&id=9&Itemid=1&article_id=8308
DO  - 10.18517/ijaseit.9.2.8308

RefWorks

RT Journal Article
ID 8308
A1 Ammendola, Roberto
A1 Loreti, Pierpaolo
T1 Design and Evaluation of a Scalable Engine for 3D-FFT Computation in an FPGA Cluster
JF International Journal on Advanced Science, Engineering and Information Technology
VO 9
IS 2
YR 2019
SP 677
OP 684
SN 2088-5334
PB INSIGHT - Indonesian Society for Knowledge and Human Development
K1 3D-FFT; FPGA; high-performance computing; cluster.
AB The Three-Dimensional Fast Fourier Transform (3D-FFT) is commonly used to solve the partial differential equations describing the system evolution in several physical phenomena, such as the motion of viscous fluids described by the Navier–Stokes equations. Simulation of such problems requires the use of a parallel High-Performance Computing architecture since the size of the problem grows with the cube of the FFT size, and the representation of the single point comprises several double-precision floating-point complex numbers. Modern High-Performance Computing (HPC) systems are considering the inclusion of FPGAs as components of this computing architecture because they can combine effective hardware acceleration capabilities and dedicated communication facilities. Furthermore, the network topology can be optimized for the specific calculation that the cluster must perform, especially in the case of algorithms limited by the data exchange delay between the processors. In this paper, we explore an HPC design that uses FPGA accelerators to compute the 3D-FFT. We devise a scalable FFT engine based on a custom radix-2 double-precision core that is used to implement the Decimation in Frequency version of the Cooley–Tukey FFT algorithm. The FFT engine can be adapted to different technology constraints and networking topologies by adjusting the number of cores and configuration parameters in order to minimize the overall calculation time. We compare the various possible configurations with the technological limits of available hardware. Finally, we evaluate the bandwidth required for continuous FFT execution in the APEnet toroidal mesh network.

LK http://ijaseit.insightsociety.org/index.php?option=com_content&view=article&id=9&Itemid=1&article_id=8308
DO 10.18517/ijaseit.9.2.8308