Welcome to the third issue of the MLIR (bi)Weekly, a newsletter (published on Friday) covering developments in MLIR and related projects in the ecosystem. MLIR (bi)Weekly is brought to you by a collective effort of contributors; we welcome your contributions!
See the previous published edition.
Highlights
- A new dialect to model AVX512 intrinsics has been added.
- The initial MLIR Fortran IR (FIR) has landed!
- TensorFlow has a new runtime: TFRT! We had a deep dive at the last Open Design meeting (slides - recording)
- The hasNoSideEffect trait on operations has been removed in favor of a richer interface.
MLIR Core
Infrastructure
- PatternMatchResult has been replaced with LogicalResult, as PatternMatchState has been deprecated and removed.
- Patterns can now notify the rewriter why a match fails via rewriter.notifyMatchFailure. The output is visible in DialectConversion via -debug-only=dialect-conversion
- The inliner now detects and erases single-use callables.
- The MemoryEffectOpInterface has replaced HasNoSideEffect, making DCE/CSE more powerful. Documentation for this feature should come soon!
- Attribute names are no longer required to be identifiers.
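To illustrate the new failure-reporting hook, here is a minimal sketch of a rewrite pattern using rewriter.notifyMatchFailure. The op and pattern names (MyOp, LowerMyOp) are hypothetical placeholders, and the snippet assumes MLIR's C++ pattern API:

```cpp
#include "mlir/IR/PatternMatch.h"

using namespace mlir;

// Hypothetical pattern; MyOp is a placeholder, not a real MLIR op.
struct LowerMyOp : public OpRewritePattern<MyOp> {
  using OpRewritePattern<MyOp>::OpRewritePattern;

  LogicalResult matchAndRewrite(MyOp op,
                                PatternRewriter &rewriter) const override {
    if (!op.getType().isa<VectorType>())
      // Returns failure() and records the reason, which is visible in
      // DialectConversion with -debug-only=dialect-conversion.
      return rewriter.notifyMatchFailure(op, "expected vector result type");
    // ... actual rewrite logic would go here ...
    return success();
  }
};
```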
Table-driven Infrastructure
- The declaration of dialect classes can now be generated by ODS via gen-dialect-decls
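As a sketch, the generation is driven through mlir-tblgen; the file name and include path below are placeholders:

```shell
# Placeholder paths; adjust to your checkout.
mlir-tblgen -gen-dialect-decls MyDialect.td \
    -I /path/to/llvm-project/mlir/include
```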
Code Generation
- “Named ops” on tensors and buffers: RFC and sketch of Linalg “named ops”
- Vector dialects: vector.contract lowering to llvm.intr.matrix_multiply.
- Vector dialects: work has started on llvm.intr.masked.load/store and vector.transpose
- Vector dialects: AVX512 dialect RFC, dialect landed in core
- ModelBuilder for E2E runs on CPU: llvm.intr.matrix_multiply benchmarking and comparison with a naive vanilla MLIR lowering (~8x benefit measured).
- Collaborations: @fhahn started exposing row-major LLVM vector intrinsics.
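As context for the vector.contract lowering mentioned above, here is a sketch of a matmul-flavored vector.contract; the shapes are chosen arbitrarily for illustration:

```mlir
// C += A * B expressed as a vector.contract; shapes are arbitrary.
#map_a = affine_map<(m, n, k) -> (m, k)>
#map_b = affine_map<(m, n, k) -> (k, n)>
#map_c = affine_map<(m, n, k) -> (m, n)>
%res = vector.contract {indexing_maps = [#map_a, #map_b, #map_c],
                        iterator_types = ["parallel", "parallel", "reduction"]}
       %a, %b, %c : vector<4x8xf32>, vector<8x16xf32> into vector<4x16xf32>
```

Contractions of this form are what the new lowering maps to llvm.intr.matrix_multiply.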
SPIR-V
- SPIR-V type hierarchy gains the ability to query required capabilities/extensions.
- A new attribute, spv.vce, is introduced to contain the (version, capabilities, extensions) triple; spv.module op is revamped to take advantage of this new attribute.
- Conversions towards SPIR-V are now generally target environment aware. This includes both ops and types: types not available on the target will be rewritten; ops not available will not be generated.
- A pass is added to deduce a spv.module’s (version, capabilities, extensions).
- The Vulkan runner now uses a C wrapper for calling into the Vulkan runtime.
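For illustration, here is a sketch of a spv.module carrying the new attribute; the version, capability, and extension values are placeholders:

```mlir
// The #spv.vce attribute encodes the (version, capabilities, extensions)
// triple; the values here are placeholders.
spv.module Logical GLSL450 requires #spv.vce<v1.0, [Shader], []> {
  // ... SPIR-V ops ...
}
```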
In the Ecosystem
Flang, the LLVM Fortran Compiler
The initial MLIR Fortran IR (FIR) has landed!
IREE: An Experimental MLIR Execution Environment
- Codegen and ops (all correctness focused right now):
- Enabled many misc op integration tests to use the new Linalg-based codegen on CPU through HLO->Linalg conversions
- Switched from a simple hand-coded conv shader to a Linalg-based conv for CPU/GPU (test case)
- Enabled GEMM through the Linalg path on GPU (commit)
- Handled multiple reduction dimensions in the Linalg lowering of HLO reduce (commit)
- Simple lowering of HLO reshape to linalg.reshape + linalg.copy
- Integrations:
- Initial lowerings of TensorFlow string type and ops (commit)
- Compiler entry points for compiling from HLO protos (for JAX demo)
- Started work on JNI bindings
- Build:
- MSVC compatibility and Vulkan integration demos working via CMake on Windows
- Python bindings can set LLVM options and dump nice stack traces
- Misc:
- Updated workload calculations to be dynamic-shape friendly and more ready for codegen of CPU kernel concurrency
- Work in progress on plumbing (ranked) dynamic shapes from frontend to codegen backends (ETA of ~1 more week for example dynamic shape test cases, including dynamic LSTM, batch MLP, etc.)
Teckyl: An MLIR frontend for Tensor Operations
Reusing the TensorComprehension frontend, this project makes it possible to take input like:
def mm(float(M,K) A, float(K,N) B) -> (float(M,N) C) {
  C(i,j) +=! A(i,k) * B(k,j) where i in 0:M, k in 0:K, j in 0:N
}
and produce MLIR Linalg from it!
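For illustration, the emitted IR could look roughly like the following Linalg matmul on buffers (a hypothetical sketch; the actual Teckyl output may differ):

```mlir
// Hypothetical sketch: mm lowered to a Linalg matmul on memrefs.
func @mm(%A: memref<?x?xf32>, %B: memref<?x?xf32>, %C: memref<?x?xf32>) {
  linalg.matmul(%A, %B, %C)
      : memref<?x?xf32>, memref<?x?xf32>, memref<?x?xf32>
  return
}
```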
TensorFlow
- Some information about how to play with TensorFlow MLIR can be found in this mailing-list thread.
- First uses of shape inference for code generation have landed. The XLA buffer allocation works on dynamic shapes in a more principled way based on the InferShapedTypeOpInterface from the shape inference dialect. Buffer allocation will next be refactored to work on other dialects, as well.
- HLO reduce can now be lowered to parallel loops
PlaidML
- Added some search / auto-tuning for finding the best tiles in the “affine-stencil” pass.
Recent Talks
- The MLIR Shape Dialect work was presented at the MLIR Open Design meeting (slides - recording)
- MLIR: Accelerating TF with compilers was presented at the TF Dev Summit 2020 (recording)
- Google announced a new runtime for TensorFlow! We had a deep dive on TFRT at the last Open Design meeting (slides - recording)