High-Performance Computing
Designing scalable acceleration strategies for compute-intensive workloads on supercomputers and GPU clusters.
I explore hardware-aware acceleration for compute-intensive workloads, spanning FPGA-based machine learning, sparse neural network architectures, and high-performance computing. Recent papers investigate how supercomputers and large-scale GPU clusters shorten time-to-solution for complex numerical workloads, and how reconfigurable processors orchestrate inference and training pipelines. A major thrust is leveraging sparsity-aware dataflows and memory-light data paths on FPGA and CGRA platforms to accelerate neural operators with high fidelity. By co-designing the software stack and the hardware implementation, I aim to transition these AI acceleration capabilities into resilient, production-ready systems.
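The core idea behind sparsity-aware dataflows is to skip the arithmetic and memory traffic associated with zero weights. A minimal software sketch of this principle (not the accelerator's actual dataflow) is a sparse matrix-vector product over a CSR-encoded weight matrix, where only stored nonzeros are multiplied:

```python
import numpy as np

def spmv_csr(values, col_idx, row_ptr, x):
    """Sparse matrix-vector product in CSR form.

    Only the stored nonzeros are touched, which is the
    work-skipping principle behind sparsity-aware accelerators.
    """
    y = np.zeros(len(row_ptr) - 1)
    for row in range(len(row_ptr) - 1):
        # Iterate over the nonzeros of this row only.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            y[row] += values[k] * x[col_idx[k]]
    return y

# Dense equivalent (3x3 matrix with 4 nonzeros):
# [[2, 0, 0],
#  [0, 0, 3],
#  [1, 4, 0]]
values  = np.array([2.0, 3.0, 1.0, 4.0])
col_idx = np.array([0, 2, 0, 1])
row_ptr = np.array([0, 1, 2, 4])
x = np.array([1.0, 1.0, 1.0])
print(spmv_csr(values, col_idx, row_ptr, x))  # 4 multiplies instead of 9
```

In hardware, the same index structure drives which operands are fetched and routed to the compute units, trading regular dense access patterns for reduced arithmetic and memory bandwidth.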
Leading the Sparsity-aware Coarse-grained Reconfigurable Accelerator project with support from the Google Silicon Research Grant (FY2024–2025).
Designing FPGA-oriented architectures that balance agility and throughput for data-intensive applications.
Building sparse neural network accelerators and near-memory computing fabrics for deep learning workloads.
Publications spanning FPGA training accelerators, sparse CNN deployment, and high-performance computing.
Regular contributions to FPT, FPGA, FCCM, and related venues on reconfigurable and AI accelerators.
Active projects backed by Google, JSPS, and industrial partners on sparse computing architectures.
For complete publication, award, and grant records, please visit the English publications page.