MLIR News, 43rd edition (9/18 - 10/1/2021)

mehdi_amini · September 24, 2021, 2:53am

See the previous published edition
Welcome to the forty-third issue of the MLIR (bi)Weekly, a newsletter covering developments in MLIR and related projects in the ecosystem. MLIR (bi)Weekly is brought to you by a collective effort of contributors; we welcome your contributions!

Highlights

A proposal has been sent on the PyTorch forums about Torch-MLIR!

MLIR Core

Infrastructure

The current OpConversionPattern::matchAndRewrite overloads are deprecated and are being removed in favor of the OpAdaptor overloads.

Codegen

Sparse compiler progress:
- Sparse constants no longer “expand” into a dense iteration space, but are directly converted to sparse tensor storage at runtime (courtesy Bixia)
- Generalized support for reductions beyond just SUM
- Fixed ABI issue on ARM64 in support lib (thanks to Javier for debug help)
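
To illustrate why converting a sparse constant directly to sparse storage matters, here is a minimal Python sketch that builds CSR arrays from a list of nonzero entries without ever visiting the dense iteration space. This is not the sparse runtime support library's code; the function name `coo_to_csr` and the choice of CSR as the target format are assumptions made purely for illustration.

```python
def coo_to_csr(num_rows, entries):
    """Build CSR arrays from (row, col, value) triples.

    Work is proportional to the number of nonzeros plus the number of rows;
    the dense rows x cols space is never materialized.
    """
    entries = sorted(entries)  # order by row, then by column
    row_ptr = [0] * (num_rows + 1)
    col_idx, values = [], []
    for r, c, v in entries:
        row_ptr[r + 1] += 1  # count the nonzeros in each row
        col_idx.append(c)
        values.append(v)
    for r in range(num_rows):  # prefix sum turns counts into row offsets
        row_ptr[r + 1] += row_ptr[r]
    return row_ptr, col_idx, values


# A 3x4 constant with only three nonzero entries.
row_ptr, col_idx, values = coo_to_csr(3, [(2, 1, 5.0), (0, 3, 1.0), (0, 0, 2.0)])
```

Row `i` of the result owns the slice `col_idx[row_ptr[i]:row_ptr[i + 1]]`, so the empty middle row costs nothing beyond one offset.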
Min and max ops have been added to std; they capture FP/NaN semantics better.
Reduction detection has been refactored and improved across dialects. Linalg vectorization now supports more cases including min/max.
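
As a rough illustration of why dedicated min/max ops help with NaN handling, the Python sketch below contrasts a compare-and-select maximum (which is what Python's built-in `max` does) with a maximum that always propagates NaN, and then uses the latter in a reduction. It assumes NaN-propagating semantics for the new ops; the helper names `max_propagating_nan` and `reduce_max` are hypothetical and are not MLIR APIs.

```python
import math
from functools import reduce


def max_propagating_nan(a, b):
    """Maximum that returns NaN whenever either operand is NaN."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    return a if a > b else b


def reduce_max(xs):
    """Max-reduction built on the NaN-propagating binary maximum."""
    return reduce(max_propagating_nan, xs, -math.inf)


nan = math.nan
# The built-in max is compare-and-select, so its result depends on operand order.
builtin_results = (max(1.0, nan), max(nan, 1.0))
# The NaN-propagating version gives the same answer for either order.
propagating_results = (max_propagating_nan(1.0, nan), max_propagating_nan(nan, 1.0))
```

With compare-and-select, the result of a reduction over data containing NaN depends on evaluation order, which is a problem once the reduction is reordered or vectorized. An op with fixed NaN behavior avoids that.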
Linalg.pad_tensor gains a “nofold” attribute, which prevents folding and keeps the op around to enable packing even when sizes divide evenly.
Various improvements in progress to Linalg comprehensive bufferization and refactorings to allow better interop with external projects such as IREE.
Older C++-only ops are being retired in favor of their OpDSL equivalents.
Codegen strategy refactored to make better use of the pass infrastructure and become more usable.
New foldings of vector.transfer and tensor.insert/extract_slice have been added.

TOSA

Some general improvements to quantization
- Relaxing quantized tensor type requirements
- Ranked constraints fixed on quantization builders
Type verification expanded for basic shape manipulations
- the missing checks were causing crashes during shape inference.

In the Ecosystem

IREE : An Experimental MLIR Execution Environment

Making progress towards fixing some gaps in the codegen backends
- All ops that are to be executed on the device need to be tiled and distributed. A couple of ops remain, after which all ops will be parallelized by default on CPU and GPU
- Looking into using the newly added fusion transformations in MLIR core for doing fusion at vector level, which will allo
CUDA Backend
- Added option to control tile and workgroup size from IR to enable search for CUDA using the same mechanism as CPU
- Integration of IREE in mmperf (https://github.com/mmperf/mmperf) to allow continuous comparison of IREE GEMM with cuBLAS and TVM
- Misc bug fixes and configuration tweaks for BERT learning
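
For readers unfamiliar with the "tiled and distributed" terminology and the tile/workgroup size knobs mentioned above, here is a toy Python model. It only sketches the general idea: split a 2-D iteration space into tiles and hand the tiles out to workgroups. The round-robin assignment and the names `tile_ranges` and `distribute` are assumptions for illustration and do not reflect how IREE's backends actually distribute work.

```python
def tile_ranges(size, tile):
    """Split [0, size) into half-open ranges of at most `tile` elements."""
    return [(start, min(start + tile, size)) for start in range(0, size, tile)]


def distribute(m, n, tile_m, tile_n, num_workgroups):
    """Assign each (row range, column range) tile to a workgroup, round-robin."""
    tiles = [(rows, cols)
             for rows in tile_ranges(m, tile_m)
             for cols in tile_ranges(n, tile_n)]
    assignment = {wg: [] for wg in range(num_workgroups)}
    for i, tile in enumerate(tiles):
        assignment[i % num_workgroups].append(tile)
    return assignment


# A 4x4 iteration space, 2x2 tiles, two workgroups.
plan = distribute(4, 4, 2, 2, 2)
```

Changing `tile_m`, `tile_n`, or `num_workgroups` changes how much work each workgroup receives, which is the kind of parameter a search over configurations would explore.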

TensorFlow / MLIR-HLO

Kernel Generator

Implementation of the kernel generator JIT mode is complete. We are now completing the final steps for launch approval in the next TF release.
We have further optimized the calling convention for unranked results. The memref descriptor is now allocated on the stack of the caller, avoiding heap allocations in the call.

CIRCT : Circuit IR Compilers and Tools aka ‘MLIR for hardware’

A lot of progress has been made on lowering the SCF dialect to the Calyx dialect. Woo! Since the Calyx compiler is not completely fleshed out in CIRCT, there is also an emitter to the native Calyx compiler IR (documentation for this is found here). That means we can currently lower:
SCF dialect → Calyx dialect → Calyx native compiler (with spunky optimizations) → SystemVerilog.
