Work in progress: this is a wiki post, everyone is welcome to modify it directly
See the previous published edition.
Welcome to the tenth issue of the MLIR (bi)Weekly, a newsletter (published on Friday) covering developments in MLIR and related projects in the ecosystem. MLIR (bi)Weekly is brought to you by a collective effort of contributors, and we welcome your contributions!
Optimizations and Code Generation
In the Ecosystem
Flang, the LLVM Fortran Compiler
IREE: An Experimental MLIR Execution Environment
mlir-npcomp: Prototype for compiling numpy programs
In this paper, we describe a polyhedral approach to generate efficient CUDA kernels for matrix multiplication using inline assembly instructions for programming tensor cores on NVIDIA Volta GPUs. Furthermore, we build on this approach to generate fused kernels for computation sequences involving matrix multiplication and pointwise operations such as bias addition, ReLU activation, etc. Experimental evaluation of these techniques shows that automatically generated kernels can provide significantly better performance than manually tuned library implementations, with speedups ranging up to 2.55×, especially through kernel fusion, which reduces the overhead of data transfer through global memory.
MLIR is an ongoing project which aims to unify the compiler infrastructure for machine learning by providing the ability to embed multiple IR dialects in it, e.g., a linear algebra dialect or an affine dialect, with a progressive lowering and transformation of IR dialects. Overall, we believe our work is complementary and could be integrated with many of these frameworks as a library for targeting tensor cores.
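To make the kernel fusion idea in the abstract above more concrete, here is a minimal pure-Python sketch comparing an unfused and a fused version of matrix multiplication followed by bias addition and ReLU. It is only an illustration of the data-movement argument, not the paper's polyhedral CUDA code generator and not tensor core code; the function names `matmul_bias_relu_unfused` and `matmul_bias_relu_fused` are hypothetical and chosen for this example.

```python
def matmul_bias_relu_unfused(a, b, bias):
    """Three separate passes; each one materializes a full intermediate matrix.

    On a GPU, each pass would correspond to its own kernel, and the
    intermediates would round-trip through global memory.
    """
    m, k, n = len(a), len(b), len(b[0])
    # Pass 1: matrix multiplication, result stored in full.
    c = [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
         for i in range(m)]
    # Pass 2: bias addition re-reads and rewrites the whole intermediate.
    c = [[c[i][j] + bias[j] for j in range(n)] for i in range(m)]
    # Pass 3: ReLU re-reads and rewrites it once more.
    return [[max(0.0, x) for x in row] for row in c]


def matmul_bias_relu_fused(a, b, bias):
    """Single pass: bias and ReLU are applied while the accumulator is live.

    Each output element is written exactly once, so no intermediate
    matrix is stored or re-read.
    """
    m, k, n = len(a), len(b), len(b[0])
    out = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            # Pointwise operations fused into the matmul epilogue.
            out[i][j] = max(0.0, acc + bias[j])
    return out


if __name__ == "__main__":
    a = [[1.0, -2.0], [3.0, 4.0]]
    b = [[1.0, 0.0], [0.0, 1.0]]
    bias = [0.5, -10.0]
    print(matmul_bias_relu_unfused(a, b, bias))
    print(matmul_bias_relu_fused(a, b, bias))
```

Both functions compute the same result; the fused one simply avoids writing and re-reading the intermediate matrices, which is the source of the savings the abstract attributes to fusion.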