MLIR News, 30th edition (3/19 - 4/2/2021)

See the previous published edition.
Welcome to the thirtieth issue of the MLIR (bi)Weekly, a newsletter covering developments in MLIR and related projects in the ecosystem. MLIR (bi)Weekly is brought to you by a collective effort of contributors; we welcome your contributions!

MLIR Core

Infrastructure

Codegen

  • The website documentation was updated with instructions for running the integration tests, in particular how to use an Intel emulator to exercise AVX-512, AMX, and other vector extensions without needing the most recent CPUs.
  • Progress continues on the sparse compiler.

Other

  • Progress continues on TOSA support, in particular with more lowerings to Linalg.
  • Support for registering runtime functions (or callbacks) with the JIT was added to the C API, and an example shows how to use this from Python; see the sketch below.
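
For illustration, here is a minimal sketch of the callback mechanism, modeled on the upstream Python test (mlir/test/python/execution_engine.py). The helper names (register_runtime, invoke), the func syntax, and the convert-std-to-llvm pipeline are assumptions reflecting the bindings as of this edition and may change:

    import ctypes

    from mlir.ir import Context, Module
    from mlir.execution_engine import ExecutionEngine
    from mlir.passmanager import PassManager
    import mlir.conversions  # registers the conversion passes

    with Context():
        # The jitted function forwards its argument to an external symbol,
        # which is backed by a Python callback at runtime.
        module = Module.parse(r"""
          func @forward(%arg0: f32) attributes { llvm.emit_c_interface } {
            call @some_callback_into_python(%arg0) : (f32) -> ()
            return
          }
          func private @some_callback_into_python(f32) -> ()
              attributes { llvm.emit_c_interface }
        """)
        # Lower to the LLVM dialect so the ExecutionEngine can JIT-compile it.
        PassManager.parse("convert-std-to-llvm").run(module)

        # A ctypes-wrapped Python function with the matching signature.
        @ctypes.CFUNCTYPE(None, ctypes.c_float)
        def callback(value):
            print("Python callback received:", value)

        engine = ExecutionEngine(module)
        # Bind the declared symbol to the Python callback.
        engine.register_runtime("some_callback_into_python", callback)
        # Arguments to invoke() are passed by pointer.
        engine.invoke("forward", ctypes.pointer(ctypes.c_float(42.0)))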

In the Ecosystem

IREE: An Experimental MLIR Execution Environment

  • IREE’s CPU and GPU backends can now use the Linalg-on-tensors based lowering to compile MobileBERT and MobileNetV2.
    • Performance is on par with the previous Linalg-on-buffers based lowering.
  • A PR is out to enable Linalg named ops (like matmul and conv variants) to be fused with consumer elementwise operations (covering bias add, sigmoid, etc.). With this, MobileNetV2 performance is now better on the Linalg-on-tensors path; updated MobileBERT numbers will be available once this lands.
  • IREE’s CUDA backend now vectorizes all elementwise operations, which essentially reaches peak performance for this class of operations. Basic tiling and vectorization (not yet tuned) is enabled for matmul ops.
  • Vectorization has been turned on by default for all GPU backends.

TensorFlow / MLIR-HLO

XLA/GPU is able to take a pure LMHLO module (with a few MLIR attributes specific to XLA) and run it end to end using the existing infrastructure. All of XLA/GPU production now goes through LMHLO; individual debugging tools still depend on XLA HLO, though.

Kernel Generator improved support for fusing ops with dynamic shapes in the presence of dynamic broadcasts, and improved rank specialization for ops with arity greater than 2.

Recent Talks

  • 2021-04-01: Discussion about MLIR Bindings (C API, Python Bindings, other languages) status; slides - recording

Recent Publications

Phism: Polyhedral High-Level Synthesis in MLIR

Polyhedral optimisation, a methodology that views nested loops as polyhedra and searches for their optimal transformation with respect to specific objectives (parallelism, locality, etc.), is promising for mitigating the difficulties of automatically optimising hardware designs described by high-level synthesis (HLS), which are typically software programs with nested loops. Nevertheless, existing polyhedral tools cannot meet HLS developers' requirements for platform-specific customisation and software/hardware co-optimisation. This paper proposes Phism, a polyhedral HLS framework built on MLIR, to address these challenges by progressively lowering multi-level intermediate representations (IRs) from polyhedra to HLS designs.