See the previous published edition.
Welcome to the tenth issue of MLIR (bi)Weekly, a newsletter (published on Fridays) covering developments in MLIR and related projects in the ecosystem. MLIR (bi)Weekly is brought to you by a collective effort of contributors; we welcome your contributions!
Highlights / General
- Multiple interesting RFCs have been shared on Discourse over the last two weeks:
- There is a new git pre-push hook and script to sanitize your commit messages: just run
- Interface internal storage has been revamped: this will speed up interface lookups, and it provides a generalization on which we’ll build Interface support for Types and Attributes.
- RewritePatterns are no longer required to provide a specific root operation, and may omit the root to match any operation type: this enables more generic patterns.
- The way that dialect conversion converts block argument types has been refactored. Patterns are now responsible for converting region block arguments via
- The `generate-test-checks.py` utility now supports attaching CHECK lines directly to a source file.
- `gen-op-decls` now supports filtering which operations to generate via
- The `HasParent` trait can now be used to specify multiple possible parent operations.
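As a sketch of what this enables at the ODS level (the op and dialect names below are hypothetical, and the exact spelling of the multi-parent helper is an assumption — recent MLIR trees spell it `ParentOneOf`, so the spelling may differ by version):

```tablegen
// A hypothetical terminator that may only appear inside either of two
// parent ops. Previously, a trait could constrain to a single parent only.
def MyDialect_YieldOp : MyDialect_Op<"yield", [
    Terminator,
    ParentOneOf<["MyForOp", "MyIfOp"]>
  ]> {
  let summary = "yield values to the enclosing for/if op";
}
```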
- `Op::OperandAdaptor` classes have been replaced by more general
Optimizations and Code Generation
- The lowering to the LLVM dialect now supports returning unranked memrefs.
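For illustration, a minimal sketch of a function that this change makes lowerable: its return type is an unranked memref, i.e. the rank is unknown at compile time (the lowering represents such values as a rank-plus-pointer descriptor):

```mlir
// Identity function over an unranked memref: only the element type of
// %arg0 is fixed; its rank is a runtime property.
func @pass_through(%arg0: memref<*xf32>) -> memref<*xf32> {
  return %arg0 : memref<*xf32>
}
```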
- Buffer allocation has grown support for region-based control flow, which allows buffer allocation for the SCF dialect. Next up is support for operations that return an alias of their inputs.
- Speedups on x86-avx2 go up to 24x for `vector.create_mask`, using a “SIMD” form in the 1-D case. Even though the former always results in an LLVM constant, rather lengthy IR occurred as an intermediate step, sometimes crashing the compiler for long vectors.
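A 1-D example of the op in question (sizes chosen arbitrarily; syntax follows the vector dialect of this era):

```mlir
// Create an 8-lane mask with the first %c3 lanes set. The new "SIMD"
// lowering of the 1-D case avoids the lengthy intermediate IR.
%c3 = constant 3 : index
%mask = vector.create_mask %c3 : vector<8xi1>
```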
- Floating-point “horizontal” vector reductions now go through the X86 backend when the `reassoc` fast-math flag is set on the LLVM vector reduction intrinsic. Speedups on x86-avx2 range from 8x to over 20x compared to strict-order scalar reductions (this “super-linear” behavior is due to much cleaner, spill-free SIMD code).
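The kind of reduction being lowered looks roughly like this (syntax follows the vector dialect of this era; note the `reassoc` flag is attached to the LLVM intrinsic this lowers to, not written in the vector dialect itself):

```mlir
// Horizontal add-reduction of an 8-lane float vector into a scalar.
// With reassociation allowed, the backend can emit tree-shaped SIMD
// code instead of a strict-order scalar chain.
%sum = vector.reduction "add", %v : vector<8xf32> into f32
```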
- More support for SPIR-V matrix types landed. @hazem added op definitions for `spv.Transpose`, and improved `spv.AccessChain` index type handling.
- A new pattern by Denis Khalikov to rewrite sequential chains of
- The SPIR-V to LLVM conversion GSoC project is making good progress. @george added conversions covering more logical/cast ops, and more bitwise and bitfield ops. `spv.func` and `spv.module` can now also be translated.
- The vulkan runner supports more memref element type bitwidths and is fixed to use staging memory and GPU local memory.
- The shape dialect is evolving very rapidly and gaining multiple lowering abilities through
- `mlir-vulkan-runner` gained the ability to use GPU device memory.
In the Ecosystem
IREE: An Experimental MLIR Execution Environment
- Cross-compilation towards Android via CMake has landed. This supports the IREE core runtime (both VMLA and Vulkan HAL drivers) at the moment. Smoke tests for both VMLA and Vulkan pass on Android 10.
- A new table to summarize IREE’s TensorFlow end-to-end test case coverage is online.
- The plan for using MLIR code generation for XLA GPU was shared on the TensorFlow MLIR mailing list.