It doesn’t always have to be micro-ops in descendant dialects but indeed, progressive lowering + transformations that break an op up into smaller ops seem to be a reasonable way to get started. You could look at how `vector.contract` gets progressively lowered to either:

- `vector.matrix_multiply` → `llvm.matrix_multiply`
- `vector.outerproduct` + `vector.transpose` → insert/extract/fma
- directly to insert/extract/fma

From any of these levels it would make sense to go to a HW-specific dialect, e.g. GPU.
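To make the second path concrete, here is a rough sketch (shapes are made up, and the exact textual syntax varies across MLIR versions) of a small matmul-like `vector.contract` and what the outerproduct-based lowering unrolls it into:

```mlir
// C += A * B expressed as a single contraction op.
#map_a = affine_map<(i, j, k) -> (i, k)>
#map_b = affine_map<(i, j, k) -> (k, j)>
#map_c = affine_map<(i, j, k) -> (i, j)>

func.func @contract(%A: vector<4x8xf32>, %B: vector<8x4xf32>,
                    %C: vector<4x4xf32>) -> vector<4x4xf32> {
  %0 = vector.contract {indexing_maps = [#map_a, #map_b, #map_c],
                        iterator_types = ["parallel", "parallel", "reduction"]}
       %A, %B, %C : vector<4x8xf32>, vector<8x4xf32> into vector<4x4xf32>
  return %0 : vector<4x4xf32>
}

// Sketch of the outerproduct lowering: transpose A so the reduction
// dimension is outermost, then unroll it into a chain of outerproducts.
// %At = vector.transpose %A, [1, 0] : vector<4x8xf32> to vector<8x4xf32>
// %a0 = vector.extract %At[0] ...   // k-th column of A
// %b0 = vector.extract %B[0] ...    // k-th row of B
// %p0 = vector.outerproduct %a0, %b0, %C : vector<4xf32>, vector<4xf32>
// ... repeated for k = 1..7, threading the accumulator through ...
```

Each step of this chain is still a legal vector-level program, which is what makes it a natural point to branch off to a HW-specific dialect.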
However, there are also implications for chains of ops with sources / sinks to memory (e.g. `load`/`store` and `vector.transfer`); see e.g. @ThomasRaoux’s commits to IREE to see a bit of the gradient here.
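A hedged sketch of such a chain (names and shapes are assumptions, not taken from any particular commit): the vector computation is bracketed by `vector.transfer_read`/`vector.transfer_write` ops that connect it to memory, and any lowering of the compute op has to compose with these transfers:

```mlir
func.func @chain(%memA: memref<4x8xf32>, %memC: memref<4x4xf32>,
                 %C: vector<4x4xf32>) {
  %c0 = arith.constant 0 : index
  %f0 = arith.constant 0.0 : f32
  // Source: materialize a vector from memory (with a padding value).
  %A = vector.transfer_read %memA[%c0, %c0], %f0
       : memref<4x8xf32>, vector<4x8xf32>
  // ... vector computation here, e.g. a lowered vector.contract ...
  // Sink: write the accumulator back to memory.
  vector.transfer_write %C, %memC[%c0, %c0]
       : vector<4x4xf32>, memref<4x4xf32>
  return
}
```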
There is also some WIP that adds a lowering to `vector.reduce`.
A lot more work is needed to get a meaningful set of useful primitives such as those described by @Lichtso.
For example, Linalg supports a primitive lowering of semantically named ops to library calls. This needs to be extended (e.g. as discussed here), but it shows that, starting from high-level, semantically charged ops, we can mix transformations, codegen, and library calls. This is also why I have been talking about ops whose semantics are captured by attributes: this allows building generic mechanisms to mix codegen + library calls.
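For instance (a sketch with assumed shapes), a single semantically named op like `linalg.matmul` carries enough meaning in its name and attributes that the compiler can either generate loops for it or rewrite it into a call to a matching external library routine:

```mlir
func.func @matmul(%A: memref<4x8xf32>, %B: memref<8x4xf32>,
                  %C: memref<4x4xf32>) {
  // One semantically charged op; a pass may lower it to loop nests
  // (codegen) or replace it with a call to a library entry point,
  // without any analysis of the op's body.
  linalg.matmul ins(%A, %B : memref<4x8xf32>, memref<8x4xf32>)
                outs(%C : memref<4x4xf32>)
  return
}
```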
I imagine similar mechanisms can be generalized and serve the same purpose. However, it is unclear at this point whether all/most of the ops discussed in the meeting have such attribute-based representations; in any case, the less the compiler / analyses and transformations need to know about e.g. `my_special_scan_and_conv_op`, the better.