Matrix Multiplication Flowchart in Python

tritonBLAS: A Lightweight Triton-based General Matrix Multiplication (GEMM) Library

This project is intended for research purposes only. Use it at your own risk and discretion. Triton is a language and compiler for writing highly efficient ML primitives, one of the most common ...

IEEE

DenSparSA: A Balanced Systolic Array Approach for Dense and Sparse Matrix Multiplication

Abstract: Numerous studies have proposed hardware architectures to accelerate sparse matrix multiplication, but these approaches often incur substantial area and power overhead, significantly ...

GitHub

KAMI: Communication-Avoiding General Matrix Multiplication within a Single GPU

This repository contains the artifact for the SC '25 paper submission "KAMI: Communication-Avoiding General Matrix Multiplication within a Single GPU." The NVIDIA GH200 is installed with Ubuntu 22.04 ...

IEEE

MH-SpGEMM: Efficient Sparse General Matrix-Matrix Multiplication on Modern GPUs via Masking and Hashing Cooperative Optimization

Abstract: Sparse General Matrix-Matrix Multiplication (SpGEMM) is a core operation in high-performance computing applications such as algebraic multigrid solvers, machine learning, and graph ...

Some results have been hidden because they may be inaccessible to you

Show inaccessible results