Introduction
This repository includes a pure Vitis HLS implementation of matrix-matrix multiplication (A*B=C) for Xilinx FPGAs, using Xilinx Vitis to instantiate memory and PCIe controllers and interface with the host.
Experiments run on a VCU1525 achieved 462 GFLOP/s, 301 GFLOP/s and 132 GFLOP/s for half, single, and double precision, respectively, with routing across the three SLRs being the primary bottleneck preventing further scaling. The code is not device-specific, and can be configured for any Xilinx FPGA supported by the Xilinx OpenCL runtime. Kernels have also been verified to execute on TUL KU115, Alveo U250, and Alveo U280 boards with similar results.
The implementation uses a systolic array approach, where linearly connected processing elements compute distinct contributions to the outer product of tiles of the output matrix.
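To make the outer-product formulation concrete, the following is a minimal software sketch in plain C++, not the repository's HLS kernel. It assumes row-major storage and emulates the processing elements sequentially: each hypothetical PE owns one row of an output tile, and for every step along the shared dimension it multiplies its own element of A by the row of B that is passed down the chain. The names `SystolicStyleMatmul`, `kNumPEs`, and `kTileCols` are illustrative assumptions and do not come from the source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative sketch only: names and tile sizes are assumptions,
// not taken from the repository.
constexpr std::size_t kNumPEs = 4;    // rows of one output tile (one per PE)
constexpr std::size_t kTileCols = 8;  // columns of one output tile

// Computes C = A * B for row-major A (n x k) and B (k x m).
// Each output tile is accumulated as a sum of k outer products.
std::vector<float> SystolicStyleMatmul(const std::vector<float>& a,
                                       const std::vector<float>& b,
                                       std::size_t n, std::size_t k,
                                       std::size_t m) {
  assert(a.size() == n * k && b.size() == k * m);
  std::vector<float> c(n * m, 0.0f);
  for (std::size_t i0 = 0; i0 < n; i0 += kNumPEs) {
    for (std::size_t j0 = 0; j0 < m; j0 += kTileCols) {
      // Per-PE accumulators for the current output tile.
      float acc[kNumPEs][kTileCols] = {};
      for (std::size_t kk = 0; kk < k; ++kk) {
        // One outer-product step: column kk of the A tile times
        // row kk of the B tile. In hardware the PEs work in parallel
        // and the B row is forwarded from one PE to the next.
        for (std::size_t pe = 0; pe < kNumPEs && i0 + pe < n; ++pe) {
          const float a_val = a[(i0 + pe) * k + kk];
          for (std::size_t j = 0; j < kTileCols && j0 + j < m; ++j) {
            acc[pe][j] += a_val * b[kk * m + (j0 + j)];
          }
        }
      }
      // Write the finished tile back, skipping out-of-range padding.
      for (std::size_t pe = 0; pe < kNumPEs && i0 + pe < n; ++pe) {
        for (std::size_t j = 0; j < kTileCols && j0 + j < m; ++j) {
          c[(i0 + pe) * m + (j0 + j)] = acc[pe][j];
        }
      }
    }
  }
  return c;
}
```

In this arrangement each element of B loaded for a tile is reused by every PE, and each element of A is reused across the full tile width, which is the data-reuse property that makes the outer-product ordering attractive for a linear chain of processing elements.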
The approach used to implement this kernel was presented at FPGA'20 [1]. For a general description of the optimization techniques that we apply, we refer to our article on HLS transformations [2]. We also gave a tutorial on HLS for HPC at SC'21, ISC'21, SC'20, HiPEAC'20, SC'19, SC'18, and PPoPP'18.
Package: xapp1170_floating_point_matrix_multiplication-main.zip file list
xapp1170_floating_point_matrix_multiplication-main/
xapp1170_floating_point_matrix_multiplication-main/README.md
xapp1170_floating_point_matrix_multiplication-main/block_design.PNG
xapp1170_floating_point_matrix_multiplication-main/fp_mmult.ipynb
xapp1170_floating_point_matrix_multiplication-main/hls/
xapp1170_floating_point_matrix_multiplication-main/hls/mmult.h
xapp1170_floating_point_matrix_multiplication-main/hls/mmult_accel.cpp
xapp1170_floating_point_matrix_multiplication-main/hls/mmult_test.cpp
xapp1170_floating_point_matrix_multiplication-main/hls/run_hls_script.tcl
xapp1170_floating_point_matrix_multiplication-main/vivado/
xapp1170_floating_point_matrix_multiplication-main/vivado/fp_mmult.bit
xapp1170_floating_point_matrix_multiplication-main/vivado/fp_mmult.hwh