Welcome to the first issue of MLIR (bi)Weekly, a newsletter (published on Fridays) covering developments in MLIR and related projects in the ecosystem. MLIR (bi)Weekly is brought to you by a collective effort of contributors; we welcome your contributions!
The mlir-vulkan-runner has landed! It makes it possible to execute MLIR snippets on actual Vulkan devices, giving us a way, like the other runners, to run integration tests for the SPIR-V/Vulkan CodeGen path from higher-level abstractions.
- The Toy tutorial was updated to discuss custom assembly formats (see the next section).
- A new TypeRange class was added that functions similarly to ValueRange.
- The verifyConstructionInvariants methods were updated so that they now always emit errors.
- TypeConverter was refactored to use composition instead of inheritance for registering type conversions.
- DenseElementsAttr now uses hex when there are a large number of elements, greatly improving the performance of the parser/printer for large tensors.
- Symbols and symbol tables are now documented online, and their functionality has expanded.
- Logging in DialectConversion was improved to better reflect the structure of the conversion process.
- ConversionTargets may now mark “unknown” operations as dynamically legal.
- cmake -DBUILD_SHARED_LIBS=on now works.
- ODS operations can now declaratively define their assembly format instead of writing C++ for custom parser/printer.
- Terminator successors can now be defined in ODS.
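The two ODS features above can be sketched together. The following is a hypothetical op definition (the dialect, op, and operand names are made up for illustration), not code from the repository:

```tablegen
// Hypothetical example: a conditional-branch-like terminator that declares
// both its successors and its textual form declaratively in ODS.
def MyDialect_CondBrOp : Op<MyDialect, "cond_br", [Terminator]> {
  let arguments = (ins I1:$condition);
  // Successors can now be declared in ODS instead of C++.
  let successors = (successor AnySuccessor:$trueDest, AnySuccessor:$falseDest);
  // The declarative assembly format replaces a hand-written C++ parser/printer.
  let assemblyFormat = "$condition `,` $trueDest `,` $falseDest attr-dict";
}
```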
- (Standard) Implemented basic optimizations for indexCast.
- (LLVM/GPU) Modified the default calling convention for MemRefs to avoid stack exhaustion and performance issues on GPUs.
- (GPU) Landed the initial attribute-based mapping from parallel loops to GPU kernels.
- (Loops) Implemented simple fusion of parallel loops.
- (LLVM) Cleanups in LLVM IR dialect and target, intrinsic generator simplification pending.
- (Vector) Implemented vector reduction operations; progressive lowering of vector contractions through them is in progress.
- (Vector) Added support for progressive lowering of fused multiple-adds on Vectors down to LLVM intrinsics.
- (Linalg) Implemented fusion of generic Linalg operations on tensors.
- (Linalg) Added support for fusing 3+ Linalg ops.
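As an illustration of the vector work, here is a minimal sketch (with made-up SSA names) of a fused multiply-add on vectors:

```mlir
// Sketch: element-wise a * b + c on 8-wide f32 vectors, lowered
// progressively down to an LLVM fmuladd intrinsic.
%d = vector.fma %a, %b, %c : vector<8xf32>
```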
- Added resource limits to SPIR-V target environment. This will be used in the future for guiding CodeGen.
- Introduced `spv.func` as a better modelling for functions.
- Fleshed out lots of `spv.GroupNonUniform*` ops in the SPIR-V dialect.
- Introduced a pattern to convert Linalg reduction to `spv.GroupNonUniform*` ops, which requires special capabilities to be available.
- Introduced a dialect-specific attribute, `#spv.target_env`, for expressing the target environment.
- Progressing on improving SPIR-V lowering patterns/passes composability and reusability; landed a few patches, still work to do.
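A minimal sketch of what the new spv.func modelling looks like (illustrative only; the function itself is made up):

```mlir
// Sketch: spv.func carries SPIR-V function control ("None" here) and lives
// inside a spv.module, instead of reusing the standard func op.
spv.module Logical GLSL450 {
  spv.func @scale(%arg0 : f32) -> f32 "None" {
    %0 = spv.FAdd %arg0, %arg0 : f32
    spv.ReturnValue %0 : f32
  }
}
```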
In the Ecosystem
- Preparatory work has landed ahead of the FIR dialect landing in the master branch.
- Richard Barton summarized the current plan to land Flang in the monorepo on llvm-dev@.
- [TF Support] Prototyping TensorList support in progress: added a `tf_tensorlist` dialect and its companion IREE `tensorlist` dialect, and introduced a VM custom module for it.
- [TF Support] Prototyping strings support in progress: added a dialect for TF strings.
- [GPU CodeGen] Landed the pipeline that goes from HLO to Linalg to Loops to GPU to SPIR-V with correctness tests for pointwise ops, and working on expanding op coverage at different lowering steps.
- [HAL CPU Backend] Working on bringing up an LLVM JIT as an IREE HAL backend.
- [HAL Interpreter] Working on bringing up VMLA as the new HLO-level interpreter.
The work is divided across three areas:
- TensorFlow to TensorFlow Lite converter: there has been significant progress recently, and the tool is getting close to release. Interesting work on quantization is happening, and a more complete doc is coming.
- TF/XLA bridge: with TPU as the first target, the sequence of passes is almost complete. The tests are a good way to exercise and play with individual passes.
- General infrastructure development to prepare for the “after GraphDef” era as an optimization and runtime format. The recent layout optimization pass is an example of the direction: using MLIR for the core of TensorFlow rewrites.
A functional pattern-based language in MLIR
AccML 2020: Accelerated Machine Learning
Martin Lücke, Michel Steuwer, and Aaron Smith