MLIR News, 31st edition (4/3 - 4/16/2021)

See the previous published edition.
Welcome to the thirty-first issue of the MLIR (bi)Weekly, a newsletter covering developments in MLIR and related projects in the ecosystem. MLIR (bi)Weekly is brought to you by a collective effort of contributors; we welcome your contributions!


  • The builtin tensor type has a new member: an opaque “encoding” attribute.
  • The mlir-npcomp project reached an important end-to-end milestone with the ability to compile and execute simple PyTorch examples (see below for the specifics).
  • A new publication about Tensor Processing Primitives was just posted, and it will be presented next week during the Open Meeting.



SPIR-V



  • SPIR-V conversion now allows explicitly controlling emulation for bitwidths unsupported in the target environment.
  • A few fixes landed in SPIR-V conversion to better handle dynamically ranked memrefs.
  • A few utility functions were added in SPIR-V conversion for creating push constant blocks.
  • Boolean memrefs are now properly handled when converting to SPIR-V.

In the Ecosystem

IREE: An Experimental MLIR Execution Environment

  • IREE has now moved to the Linalg-on-tensors-based compilation flow by default. In the coming weeks, any potential regressions will be addressed and the legacy path will be deprecated.
  • The CUDA backend now enables promotion of operands to shared memory on NVIDIA GPUs. This is one step closer to getting the CUDA backend on par with the SPIR-V backend (with the goal of targeting MMA intrinsics in NVVM). Eventually the goal is to have the CUDA backend generate code similar to CUTLASS.
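
To give a feel for what "promotion of operands to shared memory" means, here is a conceptual sketch in plain Python (not IREE code, and not GPU code): in a tiled matmul, each tile of the inputs is first copied into a small fast buffer — standing in for GPU shared memory — and the inner computation reads only from those copies. The tile size and function name are illustrative.

```python
# Conceptual sketch of operand promotion in a tiled matmul.
# "Promotion" = copying the operand tiles into fast local buffers
# (the analogue of GPU shared memory) before computing on them.

T = 2  # hypothetical tile size


def tiled_matmul_with_promotion(A, B):
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, T):
        for j0 in range(0, n, T):
            for k0 in range(0, n, T):
                # "Promote" the current tiles into local buffers.
                a_tile = [row[k0:k0 + T] for row in A[i0:i0 + T]]
                b_tile = [row[j0:j0 + T] for row in B[k0:k0 + T]]
                # Compute only on the promoted copies.
                for i in range(T):
                    for j in range(T):
                        C[i0 + i][j0 + j] += sum(
                            a_tile[i][k] * b_tile[k][j] for k in range(T))
    return C
```

On a real GPU the payoff is that each tile is loaded from global memory once and then reused by many threads; the pure-Python version only illustrates the data movement structure.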

mlir-npcomp: Prototype for compiling numpy programs

  • Basic infrastructure for annotating shapes and dtypes on arguments PR
  • MILESTONE: TorchScript unary tanh runs on reference backend PR, PR

TensorFlow / MLIR-HLO

Kernel Generator project:

  • We added support for select and are landing kernels that require it. We are also expanding support for complex numbers in code generation.
  • The next goal is to complete the support for unsigned integers in HLO-based code generation.

TFRT: A New TensorFlow Runtime

TFRT JIT compilation can now specialize compiled kernels to operand shapes, which makes it possible to eliminate broadcasts at runtime and improve performance (github commit). The longer-term plan is to specialize to shape constraints, to support partial dynamism at runtime without recompilation.

CIRCT: Circuit IR Compilers and Tools, aka ‘MLIR for hardware’

Recent Talks

Recent Publications

TPPs define a compact, yet versatile set of 2D-tensor operators (or a virtual Tensor ISA), which subsequently can be utilized as building-blocks to construct complex operators on high-dimensional tensors. The TPP specification is platform-agnostic, thus code expressed via TPPs is portable, whereas the TPP implementation is highly-optimized and platform-specific.
[…] TPPs fit in the MLIR ecosystem/stack as a lowering dialect, and in this way the TPP back-end could be leveraged by multiple TC frameworks.
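
The "2D primitives as building blocks" idea can be sketched in plain Python (this is a rough illustration of the concept, not code from the paper; the primitive and function names are made up): a single 2D matmul "microkernel" is the only compute primitive, and a higher-dimensional operator — here a batched matmul — is expressed purely in terms of it.

```python
# Rough sketch of the TPP idea: one 2D-tensor primitive, reused as a
# building block for an operator on a higher-dimensional tensor.


def tpp_gemm(a, b):
    """2D primitive: plain matmul on two 2D lists (the 'microkernel')."""
    m, k, n = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]


def batched_matmul(a3, b3):
    """3D operator built purely from the 2D primitive."""
    return [tpp_gemm(a2, b2) for a2, b2 in zip(a3, b3)]
```

The portability claim then falls out of the structure: only the 2D primitive needs a platform-specific, highly optimized implementation, while everything built on top of it stays unchanged.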

This line of research proposes a compiler-based approach for optimizing the accelerator memories on top of traditional HLS. The main idea is to use domain-specific annotations to pass useful information to the compiler, transform the intermediate representations, and interface directly with modern HLS tools.
[…] We target novel multi-level representations, like MLIR [6], to include more hardware-related information early in the compilation flow to make progressive refinements of the architecture at proper levels of abstraction.

